So, I’ve been building projects using NoSQL databases (like MongoDB and Apache CouchDB) for several years now, and there are a lot of things I really like about them. But one need I’m always bumping up against is the ability to summarize tons of heterogeneous documents in a straightforward way. For example, I’m producing a list of users and I have a hierarchy of preferred display strings I want to generate, based on what information is available. Here’s a sample document that includes all of the possible fields I might want to show in a summary string:
{
  "_id": "User_admin",
  "_rev": "6-3361423ee660182bc365fbe1439ea1a4",
  "UserName": "admin",
  "Email": "admin@localhost",
  "CompanyName": "Absion Technologies LC",
  "FirstName": "Scott",
  "LastName": "Means",
  "ObjectType": "User"
}
So, in this case I want to see the user’s company and full name, like this:
Absion Technologies LC: Scott Means
But if the company name isn’t there, I want to see:
Scott Means
If there’s no CompanyName or LastName, I want to see the Email field. These rules aren’t complex, but when you’re doing this for a lot of different types of documents, it gets repetitive. So I thought about it for a while. I felt like I could almost express my desires using a simple JSON data structure. I could use arrays to convey order, nesting to convey composition, etc. So after some fiddling around, I came up with a very simple JSON template language that lets me express some pretty sophisticated string composition rules.
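For comparison, here is roughly what those display rules look like when hand-coded for this one document type. This is only a sketch: the function name userSummary is made up for illustration, and treating FirstName as optional is my assumption, since the rules above don't say. Every new document type would need another function like this, which is the repetition the template language is meant to remove.

```javascript
// Hypothetical hand-written version of the display rules for User documents.
function userSummary(doc) {
  // Assumption: a full name needs LastName; FirstName is optional.
  const fullName = doc.LastName
    ? (doc.FirstName ? doc.FirstName + ' ' : '') + doc.LastName
    : undefined;

  if (doc.CompanyName && fullName) {
    return doc.CompanyName + ': ' + fullName;
  }
  if (fullName) {
    return fullName;
  }
  // Cases the rules don't spell out fall through to Email here.
  return doc.Email;
}

const sampleUser = {
  _id: 'User_admin',
  UserName: 'admin',
  Email: 'admin@localhost',
  CompanyName: 'Absion Technologies LC',
  FirstName: 'Scott',
  LastName: 'Means',
  ObjectType: 'User'
};

console.log(userSummary(sampleUser)); // Absion Technologies LC: Scott Means
```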
The engine expects a JSON array and a JSON object as its inputs. I could explain all of the rules here, but it’s actually a little easier to see some examples with an explanation:
["FirstName"] => "Scott"
Any array with only one element is inferred to be a field in the JSON object, and the corresponding value is returned. If this value doesn’t exist, it returns undefined:
["MissingField"] => undefined
If the array has multiple elements, the evalJsonTemplate() function is executed recursively on each element, with the results being inserted into a result array:
[["FirstName"], " ", ["LastName"]] => ["Scott", " ", "Means"]
Then, the elements of the result array are concatenated and the result is returned to the caller. But what if one of the values is missing? Take the following example:
[["Title"], " ", ["LastName"]] => result array: [undefined, " ", "Means"]
By default, if any of the elements of the result array are undefined, the engine returns an empty string (''). This is the implicit $all behavior, but it is possible to pass an explicit operator as the first element of any template array, like so:
["$all", ["Title"], " ", ["LastName"]] => ''
But there are other operators available. The $any operator will concatenate whichever elements are defined and skip the rest. So, the preceding template could be rewritten as:
["$any", ["Title"], " ", ["LastName"]] => ' Means'
Note the blank preceding my last name. That’s not very tidy. But using the rules we already know, it’s easy to collapse the blank by nesting the Title rule in another array:
[[["Title"], " "], ["LastName"]] => 'Means'
This also makes it possible to drop the $any operator, because the [["Title"], " "] rule returns an empty string, which is simply prepended to the last name.
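The difference between the two operators comes down to how the result array is combined. Here is a minimal sketch using plain arrays in place of the engine; combineAll and combineAny are hypothetical helper names, not part of the template function itself.

```javascript
// Hypothetical helpers that mirror how the engine combines a result array.

// $all: if any element is missing, the whole result collapses to ''.
function combineAll(results) {
  return results.some(v => v == undefined) ? '' : results.join('');
}

// $any: Array.prototype.join() renders undefined and null as empty strings,
// so missing elements simply drop out.
function combineAny(results) {
  return results.join('');
}

// Result array for [["Title"], " ", ["LastName"]] when Title is missing.
const flat = [undefined, ' ', 'Means'];
console.log(JSON.stringify(combineAll(flat))); // ""
console.log(JSON.stringify(combineAny(flat))); // " Means"

// Nested form: [["Title"], " "] is combined first, then joined with LastName.
const inner = combineAll([undefined, ' ']);
console.log(JSON.stringify(combineAll([inner, 'Means']))); // "Means"
```

Because the inner pair is evaluated as a unit, the separator disappears together with the missing title.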
Now, there is only one more operator needed to express all of the cases that I need to cover: the $first operator. It indicates that the first valid element of the result array (one that is neither undefined nor an empty string) should be returned. This allows a list of possible formats to be specified in order, and the first one that produces output will be chosen.
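To see why $first combines well with the default rule, here is a small sketch of the selection step on its own. The helper name pickFirst is made up for illustration; the reduction is the same one the template function uses. Since || treats both undefined and '' as falsy, an alternative that collapsed to an empty string is skipped just like a missing field.

```javascript
// Hypothetical helper mirroring the $first reduction in the engine.
const pickFirst = results => results.reduce((a, c) => a || c);

// '' stands for a nested template that collapsed because a field was missing;
// undefined stands for a missing field.
const candidates = ['', undefined, 'admin@localhost', 'admin', 'User_admin'];

console.log(pickFirst(candidates)); // admin@localhost
```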
As a complete example, the following JSON string conveys all of the preferred display strings that I want for my sample document above. It’s guaranteed to always produce something, thanks to the final ["_id"] element:
["$first",
  [[["CompanyName"], ": "], [["FirstName"], " "], ["LastName"]],
  ["CompanyName"],
  ["Email"],
  ["UserName"],
  ["_id"]
]
And, as an added bonus, the entire template function is only 42 lines of code. It’s the answer to life, the universe, and everything. Or at least to generating summary strings without too much hassle. Here is the full source code for the template:
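function evalJsonTemplate(t, o) {
  // Anything that isn't an array is a literal and is returned as-is.
  if (t.constructor !== Array) {
    return t;
  }

  if (t.length < 1) {
    return undefined;
  }

  // A one-element array is a field lookup in the object.
  if (t.length == 1) {
    return o[t[0]];
  }

  // An optional leading operator overrides the default ($all).
  var op = '$all';

  if (t[0].indexOf('$') == 0) {
    op = t[0];
    t = t.slice(1);
  }

  // Evaluate every element recursively into the result array.
  var ra = [];

  t.forEach(v => {
    ra.push(evalJsonTemplate(v, o));
  });

  switch (op) {
    case '$first': {
      // First truthy result wins.
      return ra.reduce((a, c) => a || c);
    } break;
    case '$any': {
      // join() renders undefined elements as empty strings.
      return ra.join('');
    }
    default:
    case '$all': {
      // Any undefined element collapses the whole result to ''.
      var r = ra.reduce((a, c) => (a == undefined || c == undefined) ? undefined : a + c, '');
      return r == undefined ? '' : r;
    } break;
  }
}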
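To try the complete template against a few variants of the sample document, something like the following should work. The function is repeated in condensed form only so that the snippet runs on its own. The variable names (template, fullDoc and so on) and the cut-down documents are made up for illustration, and the template uses the field names exactly as they appear in the sample document (CompanyName, UserName).

```javascript
// Condensed copy of evalJsonTemplate() so this snippet is self-contained.
function evalJsonTemplate(t, o) {
  if (t.constructor !== Array) { return t; }   // literal
  if (t.length < 1) { return undefined; }
  if (t.length == 1) { return o[t[0]]; }       // field lookup
  var op = '$all';
  if (t[0].indexOf('$') == 0) { op = t[0]; t = t.slice(1); }
  var ra = t.map(v => evalJsonTemplate(v, o));
  switch (op) {
    case '$first': return ra.reduce((a, c) => a || c);
    case '$any': return ra.join('');
    default: {
      var r = ra.reduce((a, c) => (a == undefined || c == undefined) ? undefined : a + c, '');
      return r == undefined ? '' : r;
    }
  }
}

const template = ['$first',
  [[['CompanyName'], ': '], [['FirstName'], ' '], ['LastName']],
  ['CompanyName'], ['Email'], ['UserName'], ['_id']];

// The sample document, plus illustrative variants with fields removed.
const fullDoc = {
  _id: 'User_admin', UserName: 'admin', Email: 'admin@localhost',
  CompanyName: 'Absion Technologies LC', FirstName: 'Scott', LastName: 'Means'
};
const noCompany = {
  _id: 'User_admin', UserName: 'admin', Email: 'admin@localhost',
  FirstName: 'Scott', LastName: 'Means'
};
const noCompanyOrLastName = {
  _id: 'User_admin', UserName: 'admin', Email: 'admin@localhost', FirstName: 'Scott'
};
const idOnly = { _id: 'User_admin' };

console.log(evalJsonTemplate(template, fullDoc));             // Absion Technologies LC: Scott Means
console.log(evalJsonTemplate(template, noCompany));           // Scott Means
console.log(evalJsonTemplate(template, noCompanyOrLastName)); // admin@localhost
console.log(evalJsonTemplate(template, idOnly));              // User_admin
```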
You can also see the code in action in this JS fiddle. I’ll try to write a future blog post that goes through this code and explains how it works, since it illustrates some useful concepts related to recursion and array manipulation.