· 7 min read ·

SQLite's Tooling Gap and What Closing It Properly Requires

Source: lobsters

SQLite runs on roughly a trillion devices. It ships inside every iOS and Android phone, every browser, most desktop applications, and a growing number of server-side applications that have decided they don’t need the operational overhead of a separate database process. By any deployment metric, it is the most successful database ever built. Its developer tooling, however, tells a different story.

Open a .sqlite file in any major editor and you will get generic SQL syntax highlighting at best. The autocompletion, if it exists at all, is borrowed from PostgreSQL or ANSI SQL, which means it will suggest functions that don’t exist in SQLite, miss functions that do, and treat PRAGMA statements as syntax errors. This is the gap that syntaqlite is trying to close.

The Dialect Problem

The root of the tooling problem is that “SQL” is not a single language. PostgreSQL, MySQL, SQLite, and MS SQL Server all implement a different dialect, and the differences go well beyond surface-level keyword variations. They reflect genuinely different architectural choices.

SQLite does not have a SHOW TABLES command. You query sqlite_master (or its alias sqlite_schema since version 3.33.0) directly:

SELECT name, type, sql
FROM sqlite_master
WHERE type IN ('table', 'view')
ORDER BY name;

SQLite has no BOOLEAN type. It stores booleans as integers, and the type affinity system means a column declared as BOOLEAN has NUMERIC affinity in practice. A linter that doesn’t know this will either complain about missing boolean support or silently accept code that behaves differently than the developer expects.

SQLite’s INTEGER PRIMARY KEY is special. Unlike every other database where a primary key is just a constraint, in SQLite an INTEGER PRIMARY KEY column is an alias for the internal rowid. The distinction has real performance implications and behavioral differences that don’t exist in other databases.

Then there are the SQLite-specific extensions that any genuinely useful tool needs to understand: PRAGMA statements for runtime configuration, the json_extract and json_each families of JSON functions (added in 3.9.0, 2015), window functions (3.25.0, 2018), the RETURNING clause (3.35.0, 2021), STRICT tables (3.37.0, 2021), and the full-text search virtual table modules fts4 and fts5.

A generic SQL tool that doesn’t know about virtual tables will fail immediately when it encounters:

CREATE VIRTUAL TABLE email_idx USING fts5(
  subject,
  body,
  sender UNINDEXED
);

SELECT * FROM email_idx WHERE email_idx MATCH 'quarterly report';

This is not exotic syntax. It is the primary way to do full-text search in SQLite, and it has its own query language embedded inside the string literal passed to MATCH. High-fidelity tooling needs to understand not just that this is valid SQL, but what USING fts5 implies about the table’s columns and the queries it supports.

The LSP Landscape

The Language Server Protocol has become the standard mechanism for providing editor-agnostic intelligence for programming languages. For SQL, the options are sparse and mostly inadequate for SQLite specifically.

sqls is the most commonly referenced SQL language server. It supports multiple databases including SQLite, but the SQLite support is thin: it can read the schema to provide table and column completions, but it doesn’t understand SQLite’s type system, doesn’t handle PRAGMA statements, and treats many valid SQLite expressions as errors because it validates against a more generic SQL grammar.

sql-language-server is similar. It provides completions and some linting, but the dialect awareness is shallow. The project treats databases as interchangeable with a thin adapter layer, which is the wrong abstraction for a language where the dialects differ as fundamentally as SQLite’s does from PostgreSQL’s.

Postgres has fared better. The postgres-lsp project from Supabase has invested heavily in building a proper parser that understands PostgreSQL’s full syntax, including its extension model. The result is a much richer experience: diagnostics that understand PostgreSQL semantics, not just syntax; completions that reflect the actual catalog; and hover documentation that links to the right version of the PostgreSQL docs. SQLite deserved the same treatment and hadn’t gotten it.

What High-Fidelity Actually Means

The phrase “high-fidelity” in the syntaqlite title is doing real work. It points at a specific problem in how most SQL tools are built.

Most SQL linters and completers operate on a two-stage model: parse the SQL into an abstract syntax tree using a grammar that approximates the target dialect, then validate or complete based on that tree. The approximation is where they fail. A grammar that says “an expression can be a function call” and lists known function names will reject json_tree, generate_series (available as a loadable extension), or any user-defined function. It will fail to complete the first argument to sqlite3_create_function_v2 because it doesn’t know that SQLite supports runtime function registration.

A high-fidelity tool starts from SQLite’s actual grammar, not an approximation of ANSI SQL. SQLite publishes a formal grammar in BNF notation, and building a parser from that grammar is the only way to correctly handle edge cases like the interaction between WITH clauses and INSERT OR REPLACE, or the precise rules for when a parenthesized expression is a subquery versus a row value.

The second dimension of fidelity is schema awareness. SQLite stores its schema in sqlite_master, and a serious devtools project needs to read that schema, understand the virtual table modules in use, and reflect that understanding back to the developer. When you type SELECT inside a query against an FTS5 table, the completions should include the hidden columns that FTS5 exposes: rank, the table name itself as a special column, and the column list from the CREATE VIRTUAL TABLE statement.

The Virtual Table Challenge

Virtual tables are where the gap between SQLite’s capabilities and its tooling gets widest. The virtual table mechanism is SQLite’s extension model: it lets C code register a module that looks like a table to the SQL engine. FTS5, R*Tree for spatial indexing, and the dbstat module for storage analysis are all virtual tables built into SQLite. Third-party loaders extend this further with modules for math functions, string processing, cryptography, and UUID generation.

From a devtools perspective, virtual tables present a hard problem. The schema of a virtual table is not stored in the conventional sense: CREATE VIRTUAL TABLE t USING fts5(a, b) records the full CREATE statement in sqlite_master, but the columns available for queries include hidden columns that the module synthesizes at runtime. Knowing what those columns are requires understanding the specific module’s semantics, not just reading the stored DDL.

For built-in modules like FTS5 and R*Tree, a devtools tool can hardcode this knowledge. For third-party extensions, it requires some form of extension introspection or a plugin model. This is not a theoretical problem: the sqlite-utils library from Simon Willison, which has become a standard part of the SQLite power-user toolkit, regularly deals with schemas involving multiple virtual table types.

Why This Matters Now

The timing of projects like syntaqlite reflects a shift in how SQLite gets used. Ten years ago, SQLite was primarily an embedded database: your iOS app used it, your browser used it, but developers rarely wrote SQLite queries by hand in ways that would benefit from language server support.

That has changed. The local-first software movement has pushed SQLite into application development in ways that require developers to write real, complex SQL. Projects like Datasette, Litestream, LiteFS, and libSQL have made SQLite a serious option for server-side applications. The CR-SQLite extension adds CRDT support for multi-writer synchronization. These use cases involve developers spending real time in SQL files, and the tooling gap becomes visible quickly.

The Cloudflare D1 service runs SQLite at the edge. Turso offers a distributed SQLite service. When SQLite becomes the database you reach for when building a new product, the quality of your editor experience starts to matter in the same way it matters for any other language.

The Parsing Problem at the Bottom

Building a correct SQLite parser is harder than it looks. SQLite’s grammar has a number of context-sensitive rules that make it difficult to parse with standard tools. The most notorious is the ambiguity between table-valued functions and regular function calls in certain positions, and the interaction between INSERT INTO and SELECT in the UPSERT syntax.

SQLite’s source code includes a Lemon parser generated grammar that handles these ambiguities, but Lemon is SQLite’s own parser generator, not a standard tool. Replicating this grammar in a form suitable for a language server, where incremental parsing and error recovery are requirements rather than luxuries, is a substantial engineering effort. Tree-sitter has a SQLite grammar that is often used as a foundation, though its error recovery in partial-edit scenarios (the common case in an editor) is imperfect.

The combination of a correct grammar, schema introspection that handles virtual tables, and PRAGMA awareness is what separates a tool that handles 80% of real SQLite usage from one that handles the full picture. Getting that last 20% right is where the engineering investment lives, and it is what makes high-fidelity tooling meaningfully different from the approximations that exist today.

SQLite earned better tooling a long time ago. Projects building toward that fidelity are filling a gap that has been visible to anyone who has spent serious time writing SQLite queries in an editor and found themselves constantly context-switching to the documentation to check whether a function exists, what a PRAGMA does, or whether a particular syntax is valid in the version they are targeting.

Was this interesting?