The Lexer and Parser of InfluxDB

Database people are familiar with lexers and parsers, since SQL is widely supported in this area, and it is common practice to write the SQL lexer and parser with yacc/lex or similar tools. Evidence can be found in many databases, such as PostgreSQL and MySQL, and in many big data query engines. In fact, a lexer is not that complicated to write by hand, while a parser is much more tedious.

InfluxDB is an exception among these databases: it uses both a handwritten lexer and a handwritten parser. This link1 clearly describes the reasons behind the decision. If the language only supports a simple syntax, as in its early days2, the maintenance burden is trivial. But if it needs to support the rich feature set of standard SQL, then yacc/lex-like tools may be a better choice.
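To give a feel for what "handwritten" means here, the following is a minimal sketch in Go of a lexer plus a recursive-descent parser. It is not InfluxDB's actual code: the names `Lex`, `ParseSelect`, `Token` and `SelectStmt` are hypothetical, and the grammar (`SELECT ident {, ident} FROM ident`) is a tiny subset assumed purely for illustration, far smaller than real InfluxQL.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// TokenKind labels the few token types this sketch understands.
type TokenKind int

const (
	EOF TokenKind = iota
	Ident
	Keyword
	Comma
	Illegal
)

// Token is one lexical unit together with its source text.
type Token struct {
	Kind TokenKind
	Text string
}

// Lex splits the input into tokens, skipping whitespace.
// Keywords are matched case-insensitively and stored in upper case.
func Lex(input string) []Token {
	var toks []Token
	rs := []rune(input)
	for i := 0; i < len(rs); {
		r := rs[i]
		switch {
		case unicode.IsSpace(r):
			i++
		case r == ',':
			toks = append(toks, Token{Comma, ","})
			i++
		case unicode.IsLetter(r) || r == '_':
			start := i
			for i < len(rs) && (unicode.IsLetter(rs[i]) || unicode.IsDigit(rs[i]) || rs[i] == '_') {
				i++
			}
			word := string(rs[start:i])
			if up := strings.ToUpper(word); up == "SELECT" || up == "FROM" {
				toks = append(toks, Token{Keyword, up})
			} else {
				toks = append(toks, Token{Ident, word})
			}
		default:
			toks = append(toks, Token{Illegal, string(r)})
			i++
		}
	}
	return append(toks, Token{EOF, ""})
}

// SelectStmt is the tiny AST produced by ParseSelect.
type SelectStmt struct {
	Fields      []string
	Measurement string
}

// ParseSelect parses the assumed grammar: SELECT ident {, ident} FROM ident
func ParseSelect(input string) (*SelectStmt, error) {
	toks := Lex(input)
	pos := 0
	// expect consumes one token and checks its kind (and text, if given).
	// The cursor never moves past the trailing EOF token.
	expect := func(kind TokenKind, text string) (Token, error) {
		t := toks[pos]
		if pos < len(toks)-1 {
			pos++
		}
		if t.Kind != kind || (text != "" && t.Text != text) {
			return t, fmt.Errorf("unexpected token %q", t.Text)
		}
		return t, nil
	}
	if _, err := expect(Keyword, "SELECT"); err != nil {
		return nil, err
	}
	stmt := &SelectStmt{}
	for {
		f, err := expect(Ident, "")
		if err != nil {
			return nil, err
		}
		stmt.Fields = append(stmt.Fields, f.Text)
		if toks[pos].Kind != Comma {
			break
		}
		pos++ // skip the comma and read another field
	}
	if _, err := expect(Keyword, "FROM"); err != nil {
		return nil, err
	}
	m, err := expect(Ident, "")
	if err != nil {
		return nil, err
	}
	stmt.Measurement = m.Text
	if _, err := expect(EOF, ""); err != nil {
		return nil, err
	}
	return stmt, nil
}

func main() {
	stmt, err := ParseSelect("SELECT value, host FROM cpu")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(stmt.Fields, stmt.Measurement)
}
```

Even for this toy grammar the parser is already longer than the lexer, and every new clause (WHERE, GROUP BY, expressions with precedence) adds more hand-maintained code, which is roughly the trade-off discussed above.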

To help understand its history, we can dig into the commit logs. The query language of InfluxDB is not standard SQL, but a simpler SQL-like language named InfluxQL, which is described in the influxql/InfluxQL.md file. The commit3 on 2016-02-04 renamed the readme file from InfluxQL.md to README.md, while the commit4 on 2017-10-30 changed the directory and moved the parser into its own repository5. The later change history can be followed in that repository.

In 2020, InfluxDB announced that its future version will be influxdb-iox6, a newly designed system written in Rust. The reasons behind the decision are clearly explained in the blog7, and the new system looks more like a classical columnar OLAP database than a purpose-built time-series database. It claims to be fully compatible with the previous version, but it looks like InfluxQL is not supported yet. Instead, it uses Apache DataFusion8 and the underlying Apache Arrow engine to quickly support SQL. Of course, both DataFusion and the Arrow implementation it builds on are written in Rust.

Footnotes

1 https://blog.gopheracademy.com/advent-2014/parsers-lexers/

2 https://github.com/influxdata/influxdb/blob/776e9f2ec254225686eb49303c117bb9761d9414/influxql/INFLUXQL.md

3 https://github.com/influxdata/influxdb/commit/e2a24c26fda26c247a5d22b87a3ccd445bb5128e

4 https://github.com/influxdata/influxdb/commit/f3d45ba301d65eb63d26f85a0ae5b5bd0f4b18d3

5 https://github.com/influxdata/influxql.git

6 https://github.com/influxdata/influxdb_iox/

7 https://www.influxdata.com/blog/announcing-influxdb-iox/

8 https://github.com/apache/arrow-datafusion

Written on November 24, 2021