How Emacs Packs Type Dispatch and GC Metadata Into a Single Header Field
Source: lobsters
When you read the walkthrough of Emacs’s tagged pointer scheme, the natural focus lands on type dispatch: how three low-order bits sort Lisp values into categories, how a secondary tag inside the object header narrows a vectorlike object down to a buffer or a window or a hash table. That framing is accurate, but it treats type identification as the primary purpose of the design. The more important purpose is garbage collection, and understanding that changes how the whole architecture reads.
What the GC Actually Needs
Emacs’s garbage collector is precise, not conservative. A conservative collector scans memory looking for anything that resembles a valid heap address and treats possible pointers as definite ones. It cannot move objects, because an integer that coincidentally equals a heap address would become a dangling reference after relocation. A precise collector knows exactly which words in each object are live Lisp_Object references and which are raw data. It traces only real references and can support compaction if needed.
Making a collector precise requires answering one question for every heap object: which fields contain live references? For a homogeneous structure like a cons cell, the answer is static and obvious. For a buffer, which has around seventy Lisp_Object fields followed by C pointers, integers, and character arrays, the answer is more involved.
The header.size Encoding
Every pseudovector in Emacs begins with a vectorlike_header:
struct vectorlike_header {
ptrdiff_t size;
};
This single field carries three pieces of information packed into distinct bit ranges within a 64-bit integer:
- A high-bit flag (
PSEUDOVECTOR_FLAG) marking the object as a pseudovector rather than a plain vector. - A type code (
pvec_type) encoded in the bits immediately below that flag, shifted up byPSEUDOVECTOR_AREA_BITSpositions. - A slot count in the lower
PSEUDOVECTOR_AREA_BITSbits, recording how manyLisp_Objectfields precede any non-Lisp data.
Type dispatch reads the flag and type code:
#define PSEUDOVECTORP(x, code) \
(VECTORLIKEP(x) \
&& (((XVECTOR(x)->header.size \
& (PSEUDOVECTOR_FLAG | PVEC_TYPE_MASK)) \
== (PSEUDOVECTOR_FLAG | ((code) << PSEUDOVECTOR_AREA_BITS)))))
GC traversal reads only the slot count:
ptrdiff_t nlisp = header.size & PSEUDOVECTOR_SIZE_MASK;
for (ptrdiff_t i = 0; i < nlisp; i++) {
mark_object(XVECTOR(obj)->contents[i]);
}
The two operations touch different bit fields of the same word and do not interfere. Type dispatch and GC traversal share a data source but are entirely independent.
The Layout Rule That Cannot Break
This GC traversal is only correct if the first nlisp consecutive fields in every pseudovector are in fact Lisp_Object values. That requirement is a layout convention rather than a type system rule. Every pseudovector struct must declare all its Lisp_Object fields first, before any raw C data:
struct buffer {
union vectorlike_header header;
Lisp_Object name_;
Lisp_Object filename_;
Lisp_Object directory_;
/* ... roughly 70 more Lisp_Object fields ... */
/* raw C data below this line */
struct buffer_text *text;
ptrdiff_t pt;
ptrdiff_t pt_byte;
/* ... */
};
The slot count in header.size covers the Lisp_Object block. The GC stops at that boundary and never reads the raw C data that follows. A buffer, a window, a hash table, and a tree-sitter parser node all follow the same rule, and the GC handles all of them identically through the same loop.
If a developer adds a new Lisp_Object field after a raw C pointer, the slot count becomes wrong. The GC will miss a live reference and the next collection may free an object that is still in use. There is no compiler check for this ordering requirement. It is documented in comments near the relevant macros in src/lisp.h and enforced through code review.
The Python Comparison
CPython solves the same problem differently. Every Python type provides a tp_traverse function in its PyTypeObject:
static int
list_traverse(PyListObject *o, visitproc visit, void *arg)
{
for (Py_ssize_t i = Py_SIZE(o); --i >= 0; )
Py_VISIT(o->ob_item[i]);
return 0;
}
The GC calls ob_type->tp_traverse(obj, visit, arg) for each live object. Reaching the traversal function costs one pointer dereference through ob_type. Each call to Py_VISIT for an individual reference is itself an indirect call through the visit function pointer. The CPython documentation describes tp_traverse as mandatory for any container type.
Emacs uses no function pointer, no per-type dispatch table, and no traversal callback for pseudovector traversal. Reading the slot count and iterating over a contiguous array is the entire traversal implementation for any vectorlike object. The flexibility that tp_traverse provides is real: C extension types can store Python references at arbitrary offsets, in linked structures, or inside external data, as long as the traversal function finds them. Emacs forgoes that flexibility and gains uniform, non-dispatching traversal across every type it defines.
The constraint that makes the optimization possible is the same one that would make it unsuitable for a general-purpose object system. It works here because every pseudovector type is internal to the Emacs codebase and subject to the layout review process.
SBCL’s Similar but Different Approach
SBCL’s garbage collector, built on lowtags and widetags, is architecturally closer to Emacs than Python is. For simple vectors, SBCL also scans a contiguous array directly. For complex types, it generates specialized scavenge functions from type metadata at build time rather than relying on a runtime-maintained slot count.
The generated code approach gives SBCL more flexibility: it can handle object layouts where the pointer-containing slots are not contiguous, or where the count depends on a runtime field. For the kinds of types Emacs defines, the uniform loop with a header-encoded count is sufficient and eliminates the code-generation machinery entirely. Both designs reflect the same insight that per-object type dispatch in the inner loop of a GC is expensive, but they reach different solutions given different constraints.
Adding a New Type Is a Multi-Site Operation
When Emacs 29 added tree-sitter integration, defining the structs was not the only step. PVEC_TS_PARSER and PVEC_TS_NODE were added to the pvec_type enum. The new structs were defined with their Lisp_Object fields ordered before any external C pointers. Slot counts were reviewed. Every switch statement handling the full set of pseudovector types needed updating.
The mark phase needed no new cases. The generic loop handled the new types immediately because the slot counts were correct and the layout convention held. The sweep phase did need additions: tree-sitter parsers and nodes hold references to external C objects that require cleanup when the Lisp wrapper is collected. That cleanup is type-specific knowledge the header encoding cannot provide.
That asymmetry is instructive. Mark is generic because the layout invariant is maintained. Sweep is per-type because finalizing external resources requires type-specific logic. The design correctly separates the two phases.
What the Encoding Reflects
The description of this pattern as “poor man’s inheritance” in the source article is accurate for the type dispatch angle, but it undersells the design’s purpose. The struct embedding, the slots-first convention, the slot count in the header, and the uniform GC loop were not invented to compensate for C lacking class inheritance. They exist because a precise GC requires knowing which fields in each object are live references, and encoding the count directly in the object header is a minimal-cost way to provide that information.
The primary tag in the Lisp_Object word tells you the word-level category. The pvec_type in the header tells you the specific type. The slot count in the same header field is what makes precise traversal tractable across the full range of types the editor defines. All three pieces of information coexist in two words, the Lisp_Object value and its header, and they have served a growing type hierarchy across four decades because the encoding left enough room to keep adding types without revisiting the fundamental layout.