Supercharging operations with Common Expression Language in the browser
Searching and filtering is a fairly common task, particularly as our data needs have continued to grow.
In one particular application I was building, users found themselves with a diverse collection of items and poor tools to make sense of them. To make it easier, I added a search system. I also added specific logic for tags that each item may have:
While this filter system was responsive, running entirely locally in the browser, it was fairly limited. Each of these entries have more than just a label and tags. There are about half a dozen different pieces of metadata that could apply. If I tried to copy that logic for each attribute, it wouldn’t work. Some attributes are lists. Some are numbers.
The filter system works on an OR basis, matching your input against any one of the attributes. In this way, including more attributes makes the filtering less precise. There is no way to create more complex queries.
Finding a query language
I could’ve tried to write one of these myself. I don’t think that would’ve been a good idea.
Aside from being a lot of work, it would also create a new standard for writing complex queries. Rather than improving the user experience, the users would have to learn a new format unlike what they use otherwise.
I resisted efforts to add this feature until I came across the Common Expression Language. This is a spec from another Google team which defines expression evaluations in a relatively safe way using strings. The entire spec has a lot of bells and whistles, but is intentionally not turing complete.
From the entirety of the spec, I mainly just needed a way to evaluate a string as a self-contained piece of logic. As these would be entered by the user, I wanted to ensure that it wouldn’t cause the application to crash but would be flexible enough to do everything they wanted.
"1 == 1"
"999u == x"
"true ? 1 : 2"
"false && 32"
"3 in [5, 4, 3, 2, 1]"
As shown in these examples, there is support for basically equality, ternary operators, lists, variables, and advanced logic. The entire spec has even more that was unneeded.
Until I made one
I installed the Myna Parser, a TypeScript library for defining custom grammars and parsing logic. Starting from the top of my grammar’s definition, I start to define a variety of parser types. They start out fairly simple, laying out how various number formats are represented in a string.
As you start to look for more complicated operations, you need to begin representing grammar types as combinations of previously defined grammar types. In order to define
1 < x && x < 10 , you need to first say what
1 < x is. This means you need to define what
x represent. Then you bundle two of them together with
You do need to consider the many different possible permutations. You may want to compare two numbers, a number to a variable, or two variables. These are strings, with people adding spaces in-between or not. I have to take into account
x && y , or maybe
x&& y .
After close to four hundred lines of grammar definitions, defining just the bare minimum of what I need for myself, I can finally close out the function and register it with the Myna Parser system.
Note how I just take all of my defined grammars and stick them into this
m.choice. The parser will take the input string and look for any of the following grammars. The order is in most complex to least, since the parser will go in order on what to match.
How do I actually make any sense of this grammar? First I need to create a text formatter. Basically, a concrete definition of how the grammar can is converted into an output.
In a programming language, what we’ve constructed is the syntax and now we need to have a way of converting that into assembly.
This can be done with the TextFormatter. It is an object which is constructed with
options are empty right now, but in theory I could put in here additional settings on specifically how to interpret the grammar. For example a ‘strict mode’ might throw errors in more situations.
bindings are related to variables. As I will want to filter entries based on attributes, I will need to have a way to represent each attribute in the evaluation. For instance, the entry’s
label needs to be passed into the formatter so that the parser can run operations on it.
Note the way this formatter works. It recursively parses the string and converts each one into this protobuf-like object. This is specifically to align with the spec. For those specific grammar matches that are considered invalid, at the bottom, I throw a
message rather than a specific object interpretation.
To run a comparison, the formatter will process each parameter and then return an object that only returns a
bool_value. If this value is
true, then the query is successful for a given entry.
How do I know this works?
Wow this is quite complicated. Worse, it looks fragile. A simple grammar change could result in cascading breakages.
This is why I’ve written a bunch of unit tests, many based on the original spec. Using the Jest testing framework, all of these grammars are evaluated and shown to be correctly parsing with each commit.
I also added more complicated tests as well based on Pokémon that show how the bindings can be used to match these expressions.
GitHub - Fleker/cel-js: A port of Common Expression Language that can run in a Node or browser JS…
A port of Common Expression Language that can run in a Node or browser JS environment - GitHub - Fleker/cel-js: A port…
The project, aptly named cel-js, is open-sourced and ready for you to use. It was a good experience learning how syntax parsers are designed and being able to align with a known spec for cross-platform consistency.
I managed to successfully incorporate this into my application as an optional filtering tool. But for more on that, check out the next article.