SQLite's Tooling Gap Lives in Its Virtual Table Layer

SQLite’s virtual table interface is where its power and its tooling gap converge. Most discussions of the SQLite devtools problem focus on type affinity, PRAGMA statements, and dialect-specific syntax like UPSERT and RETURNING. These are real gaps, but they are tractable: a parser that knows SQLite’s grammar can handle them. The harder problem is that SQLite’s most important extension mechanism defines its schemas entirely outside the DDL that a static parser can read.

Syntaqlite uses the framing “high-fidelity devtools” to describe what SQLite has been missing. The fidelity problem is partly about dialect accuracy, but it is fundamentally about schema resolution, and the virtual table layer is where schema resolution becomes genuinely difficult.

What Virtual Tables Actually Are

SQLite’s virtual table interface lets a C module present itself as a database table, implementing custom storage, custom filtering, and custom column definitions. The module implements a set of callbacks: xCreate, xConnect, xBestIndex, xOpen, xColumn, and others. From SQL’s perspective, a virtual table looks identical to a real table: you can select from it, join it, and use it in CTEs. The difference is that its data and schema come from C code rather than SQLite’s B-tree storage.

The creation syntax is distinctive:

CREATE VIRTUAL TABLE docs USING fts5(title, body, tokenize = 'porter ascii');
CREATE VIRTUAL TABLE geo USING rtree(id, minX, maxX, minY, maxY);
CREATE VIRTUAL TABLE fs USING dbstat;

Built-in virtual table modules include FTS5 for full-text search, rtree for spatial indexing, dbstat for storage statistics, and json_each and json_tree for JSON traversal. Loadable extensions can register additional modules, and the application embedding SQLite can register its own. The CREATE VIRTUAL TABLE statement is stored in sqlite_schema like any other DDL. But what it declares tells only part of the story.

The Schema Resolution Gap

When you query sqlite_schema to build a schema model for completions, you get the original DDL text. For a regular table, parsing that text gives you everything: column names, declared types, constraints, indexes. For a virtual table, parsing the DDL gives you the module name and module-specific options. The actual column schema exposed by the table is defined in C.

FTS5 is the clearest example. A table created as:

CREATE VIRTUAL TABLE articles USING fts5(title, body);

exposes not just title and body, but also rank, a hidden column bearing the table’s own name for MATCH queries, and several internal state columns. None of these appear in the DDL. The way you discover them is through PRAGMA table_xinfo, added in SQLite 3.26.0, which reports both visible and hidden columns including those defined by the module:

PRAGMA table_xinfo('articles');
-- Returns: cid, name, type, notnull, dflt_value, pk, hidden
-- Includes rank and FTS5 hidden columns not present in DDL

The hidden column in table_xinfo output distinguishes user-declared columns from module-defined ones. But running PRAGMA table_xinfo requires a live connection to a database where the virtual table module is loaded. A parser working from DDL text alone cannot derive this information.

The json_each and json_tree table-valued functions expose eight columns each: key, value, type, atom, id, parent, fullkey, and path. These are entirely defined in the JSON extension’s C source. No DDL describes them because these are table-valued functions registered through SQLite’s function interface rather than virtual tables created with CREATE VIRTUAL TABLE. A completion engine that does not know about these columns will fail to complete anything inside a json_each JOIN, which is one of the more common patterns for working with stored JSON:

SELECT j.key, j.value
FROM log_entries, json_each(log_entries.data) AS j
WHERE j.type = 'integer';

The columns key, value, and type here come from json_each’s C implementation. A tool that only parsed schema DDL would have nothing to offer when the cursor lands inside that alias.

The Perfetto Case as an Extreme

Lalit Maganti, who built syntaqlite, works on Perfetto, Google’s open-source system profiling infrastructure. Perfetto’s trace processor uses SQLite as its query interface, exposing trace data through a large collection of virtual tables implemented in C++: slice, thread, process, sched, counter, ftrace_event, and hundreds more depending on which data sources were active during the trace.

Each of these tables has a well-defined schema specified in C++. But from SQLite’s perspective, they are registered virtual table modules with no DDL whatsoever. A generic SQL tool pointed at a Perfetto trace file sees none of these tables in a standard sqlite_schema query, because they are not created with CREATE TABLE statements. They are registered by the trace processor’s C++ startup code before any query runs.

Writing analytical SQL against Perfetto trace data is genuinely complex work. Queries join across process and thread hierarchies, aggregate slice durations over time ranges, correlate CPU scheduling events with application-level operations. Doing this without schema-aware completion or inline diagnostics means either memorizing the virtual table schema or running queries repeatedly to discover column names. That daily friction is what makes the case for high-fidelity tooling concrete rather than theoretical.

The Query Planner’s Statistics Layer

Virtual tables have a second devtools implication beyond schema resolution. The xBestIndex callback in the virtual table interface is how a module communicates query planner costs to SQLite. FTS5’s MATCH constraint is handled this way: when SQLite evaluates a query against an FTS5 table, it calls the module’s xBestIndex with the constraint, and FTS5 reports that it can satisfy this filter efficiently using its inverted index. SQLite then chooses the FTS5-specific access path over a full table scan.

A devtools tool that shows query plans needs to surface when FTS5’s MATCH constraint is being used as an efficient index operation versus when a WHERE clause is degrading to a scan. The EXPLAIN QUERY PLAN output since SQLite 3.31.0 provides structured tree output using id and parent columns that captures this, but interpreting it correctly requires understanding which nodes represent virtual table operations versus B-tree operations.

The statistics picture is further complicated by ANALYZE and the sqlite_stat tables. SQLite’s query planner uses sqlite_stat1, which stores row counts and average column values per index, and sqlite_stat4, which stores column value histograms for more precise selectivity estimates. A database where ANALYZE has never been run will make different, often worse, query plan decisions than one with current statistics. A devtools layer that visualizes query plans without surfacing whether ANALYZE has been run recently is showing a potentially misleading picture of how the query actually executes against real data distributions.

What High-Fidelity Requires in Practice

The consequence of all this is that building genuinely high-fidelity SQLite devtools requires a live connection to an actual database, not just a parser over DDL files. The connection needs to be established with the same set of loaded extensions and registered virtual table modules that the application uses. Without that, the tool’s schema model is incomplete for any database that uses virtual tables, and the incompleteness is hard to discover because the DDL appears to parse correctly.

This is a different design constraint than most language server implementations face. For languages like Rust or Go, a language server reads source files and a build manifest. For PostgreSQL, a language server like postgres_lsp establishes a connection and queries the system catalog. For SQLite, the connection model is more complex because extension loading is part of application initialization code, not a static configuration file.

Syntaqlite takes the approach of building on the actual SQLite C library rather than re-implementing the grammar, which is the right foundation. A parser built on the SQLite amalgamation inherits the real parser’s correctness across all of SQLite’s syntax, including virtual table module options that are module-specific strings rather than SQL syntax. But the harder part of making the tool genuinely useful for virtual-table-heavy codebases is the connection model: loading the right extensions so that PRAGMA table_xinfo returns accurate results for every table in the schema.

For the common case this is manageable. FTS5 and rtree are built into SQLite, dbstat and JSON functions require no external loading, and a tool that handles these correctly covers the majority of real-world SQLite usage. The Perfetto case represents heavier extension use than most applications, but it illustrates precisely where DDL-only analysis hits its limit.

The growing production SQLite ecosystem sharpens this further. Turso’s libSQL adds a vector search extension. Cloudflare D1 runs SQLite at the edge with its own extension set. Electric SQL uses SQLite as the local store in a sync-based architecture. Each of these involves virtual table or extension use beyond the SQLite standard library. Devtools that cannot resolve virtual table schemas become less useful precisely as SQLite’s production role expands.

The “high-fidelity” framing is the right one to aim at. The fidelity gap in existing SQLite tooling is real, it compounds with schema complexity, and the virtual table layer is where generic approximations fail most completely and most silently.