· 7 min read ·

Seeded and Reversible: The Engineering Constraints Behind V8's New Integer Hash

Source: nodejs

Hash flooding attacks are old news. The 2011 oCERT advisory that hit PHP, Java, Python, Ruby, and Node.js all at once is over a decade old. The core fix, seeding the hash function with a per-process random value, has been standard practice since 2012. So when Node.js published a March 2026 security release for CVE-2026-21717, the natural reaction is confusion. Didn’t we solve this already?

The answer is that V8 solved it for string keys, but left a specific category of keys unprotected: array index strings. These are numeric strings small enough to be represented as a 24-bit integer value, things like "0", "1024", "16777215". The hash for these strings was not just deterministic, it was a direct bit-packing of the string’s length and numeric value into a 30-bit field. An attacker who knows how V8 lays out that field can enumerate collisions trivially.

The attack is straightforward. A ~2 MB payload fed to JSON.parse(), crafted so the parsed object has thousands of array-index keys that all hash to the same bucket, can stall the process for around 30 seconds on modern hardware. That is not a theoretical concern in a server context.

The interesting engineering story here is not the vulnerability itself but the constraint that made fixing it harder than it looks.

Why You Cannot Just Reach for SipHash

For string keys, V8’s fix in 2012 was effective: introduce a random seed at isolate startup and mix it into the hash computation. The string hash is opaque; once computed, it’s just a number used for bucket selection. Nothing downstream cares about reversing it.

Array index strings are different. V8 has a fast path for recovering integer values from hashed strings. When code accesses a property like obj[42] or calls parseInt("42"), V8 can read the numeric value directly out of the string’s raw_hash_field without re-parsing the string. This is not a minor optimization. It eliminates a parsing step that would otherwise run on every array access involving a string key.

If you replace the deterministic bit-pack with an irreversible hash like SipHash, you break this fast path. The hash value no longer encodes the original integer, so recovering it requires falling back to string parsing. The fast path disappears, and you pay for every array access.

The design requirement is therefore: find a seeded hash for 24-bit integer values that is unpredictable enough to defeat collision attacks, but invertible enough that V8 can efficiently recover the original integer from the hash.

The Construction: Xorshift-Multiply in Three Rounds

Most integer hash functions are already bijective. Thomas Wang’s classic 32-bit hash, which V8 uses for NumberDictionary lookups, is a permutation of the 32-bit integer space. Every input maps to a unique output before the modulo-by-table-size step, so collisions arise only from that final reduction. The issue with array index strings was not the absence of a good integer hash, it was the absence of any hash at all.

The approach chosen for CVE-2026-21717 uses alternating xorshift and multiply operations with seeded multipliers:

const uint32_t kMask = (1 << 24) - 1;
const uint32_t kShift = 12;  // half of 24 bits

uint32_t SeedArrayIndexValue(uint32_t value, uint32_t m[3]) {
  uint32_t x = value;
  x ^= x >> kShift; x = (x * m[0]) & kMask;
  x ^= x >> kShift; x = (x * m[1]) & kMask;
  x ^= x >> kShift; x = (x * m[2]) & kMask;
  x ^= x >> kShift;
  return x;
}

The xorshift step, x ^= x >> k, propagates bits downward. The multiplication step propagates bits upward through the carry chain. Alternating them means each round scrambles what the previous round left ordered. Both operations are bijections over the 24-bit space: the xorshift is an involution (applying it twice returns the original value, provided k >= bitwidth / 2), and multiplication by an odd number is invertible modulo any power of two.

The seeded multipliers m[0], m[1], m[2] are derived from the randomness already generated for V8’s string hashing (rapidhash secrets). They are odd numbers drawn from the low 24 bits of 64-bit secrets, so their modular inverses can be precomputed at startup. Inversion of the full function is straightforward: apply the steps in reverse order, replacing each multiplier with its precomputed inverse:

uint32_t UnseedArrayIndexValue(uint32_t hash, uint32_t m_inv[3]) {
  uint32_t x = hash;
  x ^= x >> kShift; x = (x * m_inv[2]) & kMask;
  x ^= x >> kShift; x = (x * m_inv[1]) & kMask;
  x ^= x >> kShift; x = (x * m_inv[0]) & kMask;
  x ^= x >> kShift;
  return x;
}

The modular inverses are computed using Newton’s method for modular inversion, which converges in O(log log N) iterations for N-bit integers.

Measuring Unpredictability With the Strict Avalanche Criterion

Claiming a hash function is “unpredictable” requires a concrete definition. The Strict Avalanche Criterion (SAC) measures this: flipping any single input bit should flip each output bit with probability exactly 0.5. Perfect SAC means the output looks random conditioned on any single-bit change to the input.

The bias metric used in the evaluation measures deviation from ideal SAC across all input-output bit pairs. Lower is better; zero would be a perfect PRF. The comparison across candidate constructions tells the story:

ConstructionSAC Bias
Identity1000.000
XOR only1000.000
Linear congruential797.523
1-round xorshift-multiply446.852
2-round xorshift-multiply3.447
3-round xorshift-multiply~0.50

The jump from two rounds to three rounds is substantial in terms of stability. With two rounds, testing across 50 randomly generated seed sets showed a standard deviation in bias of 7.19 and a maximum of 40.37, meaning some seed combinations produce noticeably weaker scrambling. Three rounds bring the standard deviation to 0.20 and the maximum to 1.68, effectively making the unpredictability uniform across the entire key space regardless of which random multipliers are generated at startup.

It is worth being clear about what SAC measures and what it does not. SAC is a necessary condition for a good hash function, not a sufficient one for hash table security. The design does not claim cryptographic strength. It relies on the multipliers remaining secret, hash outputs not being observable by the attacker (they are not exposed to JavaScript), and operational defenses like payload size limits doing some of the work. The goal is “unpredictable enough given the threat model,” not “provably secure.”

Performance: The Cost of Invertibility

The unseeded fast path needed two operations: one shift and one mask. The seeded path needs four XORs, four shifts, three multiplications, three masks, and three cache-local loads from the hash seed structure, roughly 14 instructions. That is significantly more work per decode, but the comparison that matters is to the alternative: re-parsing the string. String parsing requires 10 or more instructions plus a potential cache miss. The seeded decode is faster than falling back to parsing, which means V8’s fast path remains valid.

Benchmarks across SunSpider, Kraken, Octane, and JetStream 3 all show impact within measurement noise, roughly 0.0% to -0.15%. For practical workloads this is invisible.

The Chrome Split

The same fix ships in V8 but is disabled in Chrome. The reasoning is direct: HashDoS requires the ability to send large crafted payloads to a server, observe degraded response times, and repeat the attack. A browser processes pages from arbitrary origins but does not act as a server for attacker-controlled input in the same way. The threat model that justifies the hash seeding cost does not apply to the browser context.

This split is implemented via a build flag: v8_enable_seeded_array_index_hash. Node.js (v20, v22, v24, v25 in the March 2026 security release), Deno, and Cloudflare Workers all enable it. Chrome does not. The V8 team notified other embedders during the coordinated disclosure process.

The Broader Point About Calibrated Mitigations

The default response to hash table attacks since 2012 has been “use SipHash with a random key.” That is correct for most contexts. Python switched to SipHash-1-3 for string hashing in 3.4. Rust’s standard HashMap defaults to SipHash-1-3 via RandomState. These are clean, well-understood solutions.

But V8’s fast-path optimization for array index strings is a legitimate performance investment that should not be discarded. The right response was not to reach for the standard solution and accept the regression, but to ask what properties the replacement actually needs. It needs to be seeded and random enough to prevent collision enumeration. It does not need pre-image resistance or second pre-image resistance. It needs to be invertible in constant time. SipHash provides far more than necessary at higher cost.

The xorshift-multiply construction with three rounds and seeded multipliers hits the actual requirements cleanly. The SAC evaluation quantifies the unpredictability rigorously rather than relying on intuition. The modular inverse precomputation makes decoding cheap. The result is a hash that is good enough for the threat without sacrificing the optimization that motivated keeping a specialized fast path in the first place.

That calibration, understanding what a security property actually needs to be rather than defaulting to maximum strength, is the part of this design worth paying attention to.

Was this interesting?