· 6 min read ·

The Round-Trip Constraint: Why V8's HashDoS Fix Required an Invertible Hash

Source: nodejs

HashDoS attacks have a clear origin point. In December 2011, Alexander Klink and Julian Wälde presented at the 28C3 conference a demonstration that most major web runtimes were vulnerable to hash flooding: by crafting HTTP POST parameters whose keys all land in the same hash bucket, an attacker could turn O(1) average-case lookups into O(n²) worst-case behavior, hanging servers on small payloads. The CERT/CC advisory VU#903934 followed, and PHP, Java, Python, Ruby, and ASP.NET all patched within months. V8 introduced a seeded hash for string objects around that same time.

But V8 left one class of string unprotected: array index strings. Numeric strings whose values fit in a 32-bit unsigned integer receive special treatment in V8’s string table. Rather than computing a hash from their characters, V8 encodes the integer value directly into the 32-bit hash field:

Bits 31–26: string length (6 bits)
Bits 25–2:  raw numeric value (24 bits)
Bits 1–0:   00 (array index marker)

This encoding is a significant optimization: extracting the integer value from an array index string becomes a bitfield read rather than a character-by-character parse. The problem is that it also made the hash fully deterministic and predictable. CVE-2026-21717, patched in the March 2026 Node.js security release, closes that gap with a seeded bijective permutation. The solution is neat, but the constraint that shaped it, namely that the hash must be reversible, is what makes this fix technically interesting compared to how other runtimes handled the same class of problem.

The Attack Mechanics

V8’s hash tables use open addressing with quadratic probing. The i-th probe position for a key with hash h in a table of capacity c is (h + i*(i+1)/2) & (c-1). Because capacity is always a power of two, an attacker who knows h can compute the entire probe sequence in advance. Constructing a chain of n values that all collide requires knowing only the initial slot, which is a function of the hash.

With a deterministic hash, that computation is trivial. The proof-of-concept in the advisory constructs a 2 MB JSON payload of adversarially chosen numeric strings that causes roughly 30 seconds of hang during JSON.parse and JSON.stringify on a standard laptop. The mechanism: V8’s string table is queried during interning, and each lookup for the target string must walk the entire probe chain of colliding keys inserted before it.

This affects not just Node.js but any runtime embedding V8, which is why the March 2026 release notes mention coordinated disclosure to Deno and Cloudflare Workers as well.

Why the Fix Could Not Simply Use SipHash

When Python closed its own HashDoS gap in 3.4, it switched to SipHash-2-4 as the default string hash. SipHash is a keyed pseudorandom function: given a random per-process key and any input, it produces output that is computationally indistinguishable from random without the key. It provides strong collision resistance.

But SipHash is not invertible. You cannot recover the input from the output, and that rules it out for V8’s array index path. The encoding described above is not just a hash stored alongside the string; it is the canonical representation of the string’s integer value inside the runtime. Typed array operations, Array.prototype methods, and JIT-compiled fast paths in Turbofan and Maglev all need to extract integer values from array index strings on extremely hot paths. If that value is destroyed by a one-way hash function, the only fallback is re-parsing the string character by character on every access, which is a meaningful regression on array-heavy code.

Java’s approach, introduced in Java 8, was structural rather than cryptographic: HashMap switches from linked-list chaining to a red-black tree for any bucket exceeding 8 entries, capping worst-case behavior at O(n log n) without changing the hash function at all. That does not apply here either, because V8’s string table is a flat open-addressed structure where the probe sequence must be traversed linearly.

The fix needed a construction that was simultaneously unpredictable without a secret and reversible given that secret.

The Xorshift-Multiply Bijection

The solution is a three-round xorshift-multiply permutation operating entirely within the 24-bit value space. The forward pass:

const uint32_t kMask = (1 << 24) - 1;
const uint32_t kShift = 12;  // half of 24

uint32_t SeedArrayIndexValue(uint32_t value, uint32_t m[3]) {
  uint32_t x = value;
  x ^= x >> kShift; x = (x * m[0]) & kMask;
  x ^= x >> kShift; x = (x * m[1]) & kMask;
  x ^= x >> kShift; x = (x * m[2]) & kMask;
  x ^= x >> kShift;
  return x;
}

And the inverse:

uint32_t UnseedArrayIndexValue(uint32_t hash, uint32_t m_inv[3]) {
  uint32_t x = hash;
  x ^= x >> kShift; x = (x * m_inv[2]) & kMask;
  x ^= x >> kShift; x = (x * m_inv[1]) & kMask;
  x ^= x >> kShift; x = (x * m_inv[0]) & kMask;
  x ^= x >> kShift;
  return x;
}

Each component is a bijection on the 24-bit domain. An XOR-shift (x ^= x >> k) is self-inverse: applying it twice returns the original value. It also propagates high bits downward into lower bit positions. Multiplication by an odd constant is invertible modulo any power of two, because odd integers are coprime with 2^n; the modular inverse m_inv satisfying (m * m_inv) & kMask == 1 is precomputed at startup using Newton’s method for modular inversion. Multiplication propagates information upward through carry. Alternating the two operations achieves diffusion in both directions across all 24 bits, which neither operation achieves alone.

The multipliers m[0], m[1], m[2] are derived from V8’s existing rapidhash secret: take the low 24 bits of each 64-bit secret value and force the least significant bit to 1 to ensure oddness. They are regenerated on every process start, so any precomputed collision list from a previous session is useless.

How Three Rounds Was Established

The evaluation criterion used is the Strict Avalanche Criterion (SAC): flipping any single input bit should change each output bit with probability as close to 50% as possible. The hash-prospector project by Chris Wellons provides a framework for scoring integer hash constructions against this criterion using a bias metric, where lower is better and 0 represents ideal.

ConstructionSAC Bias
No hash (identity)1000.000
XOR only1000.000
1-round xorshift-multiply446.852
2-round xorshift-multiply3.447
3-round xorshift-multiply~0.50 (mean)

Critically, the 3-round construction was also tested across 50 independently drawn secret sets. Mean bias was 0.50 with a standard deviation of 0.20. That consistency matters: if quality varied widely depending on which multipliers were randomly generated at startup, some fraction of Node.js processes would be significantly weaker than others. The stability across diverse secrets means the protection is reliably effective regardless of the particular randomness in any given startup.

Performance

The decode operation adds 4 XOR-shifts, 3 multiplies, 3 masks, and 3 cache-local loads compared to the old single bitfield extraction. In isolation that is roughly 10 times more expensive. But it remains far cheaper than re-parsing a string, and it executes on cache-hot data in the same object header read that would have happened anyway.

The benchmark results from the advisory confirm this holds in practice. Across SunSpider, Kraken, Octane, and JetStream 3, the seeded version shows at most 0.2% deviation from baseline, well within measurement noise. The patch touches C++ runtime code, CodeStubAssembler macros for JIT-compiled paths, and Torque builtins, but the macrobenchmark impact is negligible.

What This Reveals About Minimal Security

Comparing the approaches across runtimes makes the design philosophy here explicit. Python uses SipHash-2-4: a cryptographic MAC that provides full pseudorandomness guarantees but is not invertible. Java uses tree-binning: a structural mitigation that does not require changing the hash function at all. Ruby and Perl use per-process hash seeds with standard hash functions. Each of these was chosen to fit the runtime’s specific architecture and threat model.

For V8, neither a non-invertible cryptographic hash nor a purely structural fix was applicable. The solution had to satisfy two constraints simultaneously: it must be unpredictable to an attacker without the secret, and it must be efficiently reversible given the secret. A seeded bijection is the minimal construction that satisfies both.

The Node.js article is explicit about the threat model under which this “minimal” resistance is sufficient: JavaScript cannot read a string’s hash value directly, application strings mix with attacker-controlled strings in the same table, and per-restart secret regeneration limits precomputed attacks. Full cryptographic strength is not needed under those conditions, and demanding it would require discarding the integer extraction optimization.

That reasoning, spelling out what the attacker can and cannot observe before choosing a construction, is often absent from security patches. Here it is documented clearly, and the result is a fix that is appropriately strong, efficiently invertible, and measurably inexpensive to run.

Was this interesting?