The Hash That Has to Remember What It Hashed: V8's March 2026 HashDoS Fix
Source: nodejs
Hash Denial of Service attacks are old news. The foundational paper, Crosby and Wallach’s 2003 USENIX Security presentation, described the mechanics two decades ago: find inputs that map to the same bucket, insert them all, watch O(1) lookups degrade to O(n). Every major runtime has dealt with this. Perl, Python, Ruby, PHP, Java, and Node.js have all shipped hash randomization fixes at some point. What makes V8’s March 2026 vulnerability (CVE-2026-21717) interesting is not the attack pattern, which is familiar, but the design constraint that made a straightforward fix impossible.
The optimization that created the problem
V8 distinguishes between regular strings and what it calls array index strings: decimal representations of integers with values that fit in 24 bits. Strings like "0", "42", or "16777215" fall into this category. V8 stores them differently because they appear constantly as property keys when JavaScript code indexes into arrays and objects with numeric-looking keys.
For these strings, V8 embedded the numeric value directly into the hash field:
"1234" (length=4, value=1234):
+--------+--------------------------+----+
| 000100 | 000000000000010011010010 | 00 |
| length | numeric value | |
+--------+--------------------------+----+
The hash was deterministic: (length << 24) | value. No seed, no randomization. The benefit was that operations like parseInt("42") could extract the numeric value directly from the hash field without reparsing the string. This is a genuine performance optimization in code that touches arrays heavily.
The problem is that a deterministic hash means an attacker can predict every collision in advance. V8 uses quadratic probing for open addressing: the probe sequence for a slot h goes h, h+1, h+3, h+6, h+10, adding successive integers. An attacker can construct a sequence of array index strings that all map to the same initial bucket and form a collision chain under that probe sequence. Then insert many copies of a target value. Every lookup for the target traverses the entire chain.
The proof-of-concept was roughly 2 MB of JSON that caused roughly 30 seconds of parse time. Node.js processes accepting user-controlled JSON payloads were straightforwardly exploitable.
Why most hash functions do not solve this
The standard fix for HashDoS is to seed your hash function: pick a random secret at startup and mix it into every hash computation. This is what V8 already does for regular strings via rapidhash. Attackers cannot predict collisions without knowing the seed.
But for array index strings, seeding alone breaks the optimization. If you seed the hash, you can no longer recover the original integer value from the hash field. parseInt("42") would need to reparse the string every time. The hash field would be useless for anything except equality comparison.
This rules out the entire family of irreversible hash functions: FNV-1a, MurmurHash3, SipHash, xxHash, rapidhash itself. They all discard information. Once you hash 42, you cannot reconstruct 42 from the output. They are many-to-one (collisions intentional for non-cryptographic use) in the general case, or at minimum not invertible by design.
The fix needed a hash function that is:
- Seeded - uses a per-isolate random secret so attackers cannot predict outputs
- Reversible - given the hash output and the seed, recover the original value in O(1) without string operations
- A bijection on 24-bit integers - one-to-one mapping, no collisions introduced by the hash itself
- Fast - comparable to the original identity-style encoding
- Well-diffused - small input changes produce large output changes, complicating collision construction even for attackers who know the structure
This is a narrow design space.
Xorshift-multiply as a reversible permutation
The solution uses a three-round xorshift-multiply construction. Each round applies two operations:
x ^= x >> kShift; // xorshift (self-inverse)
x = (x * m) & kMask; // multiply modulo 2^24
The xorshift is trivially reversible: applying it twice returns the original value. The multiply is reversible if the multiplier is odd, because odd numbers are coprime to any power of two, so modular inverses exist. Given multiplier m, compute m_inv such that (m * m_inv) & kMask == 1, and then (x * m_inv) & kMask undoes the multiplication.
Full encoding and decoding:
// Encode: seed the array index value
uint32_t SeedArrayIndexValue(uint32_t value, uint32_t m[3]) {
uint32_t x = value;
x ^= x >> 12; x = (x * m[0]) & kMask;
x ^= x >> 12; x = (x * m[1]) & kMask;
x ^= x >> 12; x = (x * m[2]) & kMask;
x ^= x >> 12;
return x;
}
// Decode: recover the original value from the hash
uint32_t UnseedArrayIndexValue(uint32_t hash, uint32_t m_inv[3]) {
uint32_t x = hash;
x ^= x >> 12; x = (x * m_inv[2]) & kMask;
x ^= x >> 12; x = (x * m_inv[1]) & kMask;
x ^= x >> 12; x = (x * m_inv[0]) & kMask;
x ^= x >> 12;
return x;
}
The multipliers m[0], m[1], m[2] are derived from V8’s existing rapidhash secrets, which are randomized per Isolate. The modular inverses are precomputed at startup using Newton’s method.
Why three rounds, and how they validated it
Two rounds are not enough. The team validated diffusion quality using the Strict Avalanche Criterion (SAC): for each possible one-bit input change, measure what fraction of output bits flip. An ideal hash flips every output bit with probability 0.5 when any single input bit changes. Deviation from that ideal is bias.
With two rounds, some secret combinations produced mean bias around 7.9% with maximum bias up to 40%. That is not defensible. Some secrets would produce nearly-as-predictable outputs as the original deterministic hash.
With three rounds, mean bias dropped to 0.5%, maximum to 1.68%. That is close to ideal given the 24-bit word size and the specific xorshift-multiply construction. No secret combination produced unacceptable diffusion.
Three rounds also costs almost nothing at runtime. The full decode adds four xors, four shifts, three multiplies, and three memory reads for the precomputed inverses. These operations execute in a few nanoseconds and are dwarfed by the cache miss that might follow when accessing the string’s data. Benchmark suites (Octane, JetStream 3, Kraken) showed around 0.1-0.2% regression, within measurement noise.
What “minimally HashDoS resistant” actually means
The title of the Node.js advisory describes the fix as achieving “minimal” resistance rather than full immunity, and that framing is honest.
The hash values themselves are never exposed to JavaScript or to the network. An attacker cannot observe them directly. They would need to construct inputs, send them, observe timing differences, and iterate to infer the secret multipliers. On a real server with request rate limits, response time jitter from other workloads, garbage collection, and hash table resizing between requests, this inference is unreliable. The attack surface is not zero, but it is narrow enough that operational controls (payload size limits, request throttling, timeouts) close it in practice.
This is the right framing for a performance-sensitive runtime component. A truly collision-proof hash at this layer would require cryptographic strength, which means non-reversible, which means breaking an optimization that exists for good reason. The goal is to make blind collision attacks impractical, not to achieve theoretical security guarantees against a computationally unbounded attacker with direct access to hash outputs.
The patch shipped in Node.js v20, v22, v24, and v25 in the March 2026 security release. Chrome disabled it since denial of service is outside the browser’s threat model. Deno and Cloudflare Workers received the fix through their own V8 update cycles.
The broader lesson about optimization-security interactions
This vulnerability follows a pattern that appears repeatedly in runtime security. A performance optimization encodes information into a structure in a way that becomes an assumption elsewhere in the codebase. That assumption constrains the fix options when the optimization turns out to have a security implication.
V8’s string hash field serving double duty as both a hash and an integer cache is elegant in isolation. It saves a few nanoseconds per operation, and at JavaScript scale, those nanoseconds add up. But the constraint it imposes on the hash function’s design is invisible until you need to change the hash function.
Similar patterns appear in Python’s hash randomization (added in 3.3 after CVE-2012-1150), Ruby’s hash seed randomization, and PHP’s SipHash adoption. Each of those was simpler because the hash value was not reused for any other purpose. V8’s case is harder because reversibility is load-bearing.
The xorshift-multiply construction the V8 team landed on is not exotic in academic terms, these are standard building blocks from hash function literature, but applying them under a reversibility requirement with strong per-configuration diffusion guarantees is a well-constrained engineering problem. The three-round design with SAC validation shows the kind of careful work that goes into what might look from the outside like a routine security patch.