· 8 min read ·

V8's Reversible Integer Hash Fix and the Constraints That Shaped It

Source: nodejs

Hash denial-of-service attacks are old news. The original HashDoS paper from 2011 demonstrated that most hash table implementations of the era were vulnerable to crafted input that forced worst-case O(n²) behavior. Languages and runtimes scrambled to add seed randomization. Most did. V8, it turns out, had a gap that persisted for years, and the March 2026 Node.js security release finally addressed it through a carefully constrained fix designed around a requirement that most hash functions don’t have: the output must be reversible.

The Specific Vulnerability

V8 has a fast path for property lookup on objects where the key is a numeric string that fits in a small integer range. When you write obj["42"] or access an array element by index, V8 can represent that key as an integer rather than a full string, avoiding heap allocation and string comparison. The hash for such keys was computed entirely from their numeric value, with no random seed:

"1234" (length=4, value=1234):
+--------+--------------------------+---+
| 000100 | 000000000000010011010010 |0 0|
| length |      numeric value       |   |
+--------+--------------------------+---+
31     26 25                       2 1  0

This is entirely deterministic. Given any numeric string, an attacker can compute exactly where it will land in a hash table and craft a payload where thousands of keys hash to the same bucket, forcing quadratic probe chains. The attack was demonstrated: a roughly 2MB JSON payload of carefully chosen numeric keys could hang JSON parsing for around 30 seconds on a modern machine. Multiply that by concurrent connections and you have a practical denial-of-service vector.

The discoverer, Mate Marjanović, reported this as CVE-2026-21717. The fix was implemented by Joyee Cheung of Igalia/Bloomberg in collaboration with the V8 team.

Why This Wasn’t Just “Add a Seed”

The standard fix for HashDoS across most runtimes is seed randomization. Python has done this via PYTHONHASHSEED since 3.3. Perl, Ruby, Java, and most JVMs have equivalent mechanisms. You pick a random seed at startup, fold it into the hash computation, and now an attacker who doesn’t know the seed can’t predict collisions.

For string hashes in V8, this works fine. The hash is opaque; you feed it into the hash table and never need to recover the original input from it. But for array index hashes, V8 has an optimization that depends on recovering the original integer from the stored hash value. When the engine needs to convert an array index key back to its integer representation during certain operations, it can decode it directly from the hash bits rather than re-parsing the string. That’s faster, but it means the hash function must be invertible.

A seeded hash using something like SipHash or even a seeded MurmurHash is not invertible in any practical sense. You can’t recover the input from the output without knowing the full inverse function, and for most cryptographic or near-cryptographic constructions that’s infeasible by design. So the usual “add SipHash” approach used by languages like Rust (which uses SipHash-1-3 by default) or Python’s SipHash-based string hashing was not directly applicable here.

The constraint was: design a seeded hash for 24-bit integers that is resistant to HashDoS, runs fast, and can be inverted efficiently given the seed.

The Construction: Xorshift-Multiply Rounds

The solution is a three-round xorshift-multiply construction. Each round applies a right shift XOR followed by multiplication modulo 2^24:

// Forward: encode integer value to hash
uint32_t SeedArrayIndexValue(uint32_t value, uint32_t m[3]) {
  uint32_t x = value;
  x ^= x >> kShift; x = (x * m[0]) & kMask;  // round 1
  x ^= x >> kShift; x = (x * m[1]) & kMask;  // round 2
  x ^= x >> kShift; x = (x * m[2]) & kMask;  // round 3
  x ^= x >> kShift;
  return x;
}

// Inverse: decode hash back to integer value
uint32_t UnseedArrayIndexValue(uint32_t hash, uint32_t m_inv[3]) {
  uint32_t x = hash;
  x ^= x >> kShift;
  x = (x * m_inv[2]) & kMask;  // undo round 3
  x ^= x >> kShift;
  x = (x * m_inv[1]) & kMask;  // undo round 2
  x ^= x >> kShift;
  x = (x * m_inv[0]) & kMask;  // undo round 1
  x ^= x >> kShift;
  return x;
}

Invertibility of the xorshift step x ^= x >> k over a bounded bit width is a well-known property: it’s its own inverse over the truncated integer space. The multiplication by an odd number modulo 2^n is invertible because odd numbers are coprime with any power of two; their modular multiplicative inverses exist and can be precomputed at startup. This gives you a bijection at each round, and the composition of bijections is a bijection, so the whole construction is invertible.

The multipliers are derived from the rapidhash secrets. They’re chosen to be odd (required for invertibility) and to have good statistical properties.

Measuring Diffusion with the Strict Avalanche Criterion

The team measured how well each round configuration disperses input bits using the Strict Avalanche Criterion, which tests whether flipping any single input bit changes each output bit with probability close to 0.5. A bias score of 0 is ideal; higher values indicate more predictable structure.

ConstructionBias Score
Identity (no-op baseline)1000.000
XOR only1000.000
Multiply + add797.523
1-round xorshift-multiply446.852
2-round xorshift-multiply3.447
3-round xorshift-multiply0.50

The jump from two rounds to three rounds is significant not just in the central score but in variance across different random secrets. Two-round constructions show high variance, meaning some seeds work well and others don’t. Three rounds produces low bias consistently, which matters because the runtime needs to work correctly regardless of what random seed it draws at startup.

This kind of empirical avalanche testing for a non-cryptographic hash is careful engineering. Most performance-oriented hash functions are evaluated on throughput and distribution quality over known input sets, not on cryptographic diffusion properties. The team was deliberately trying to satisfy a weaker-than-cryptographic but stronger-than-statistical security requirement.

The Threat Model: “Minimally” is Load-Bearing

The title of the fix document says “minimally HashDoS resistant,” and that word choice is deliberate. The design does not assume that the hash values are fully secret. It assumes they’re not directly observable. That’s a weaker assumption, and it’s realistic.

Hash values in V8 are internal implementation details. They’re not exposed through any JavaScript API. An attacker interacting with a Node.js server over HTTP cannot read the hash of any key; they can only observe response timing. Timing side channels exist, but the hash values are mixed in with uncontrolled application data (other properties, framework internals) and subject to network jitter. Extracting the seed from timing alone is not feasible in practice for a network adversary.

The threat model explicitly acknowledges additional mitigations that responsible deployments already use: payload size limits, rate limiting, and load balancing across processes that regenerate seeds on restart. These operational controls complement the cryptographic fix. Neither alone is sufficient; together they make a practical attack implausible.

This is different from how, say, Rust’s HashMap approaches the problem. Rust uses SipHash by default specifically because it provides a cryptographic-strength guarantee against an attacker who can observe hash outputs directly. That’s appropriate for a general-purpose data structure exposed through a public API. V8’s integer hash is not a public API, so the threat model is different, and a lighter construction that satisfies the actual threat model is the right call.

Performance Impact

The cost of adding the seeded hash function is real but small. The original decode path was one shift and one mask. The new path requires four XOR operations, four shifts, three multiplications, three masks, and three memory loads for the precomputed inverse multipliers. On modern CPUs this is still cheaper than the alternative: re-parsing the integer from the original string, which involves memory access to the string data, bounds checking, and character-by-character conversion.

Benchmarks across SunSpider, Kraken, Octane, and JetStream 3 show impact within noise:

SuiteBeforeAfterDelta
SunSpider86.9 ms86.9 ms0.0%
Kraken470.3 ms469.2 ms+0.2%
Octane7284872742-0.1%
JetStream 3203.20202.90-0.15%

This is the expected profile for a fix that touches low-level hash computation. The operation is fast, it runs on a subset of property accesses, and modern out-of-order CPUs handle the multiply chains without stalling on anything else.

Deployment and Configuration

The fix shipped in Node.js v20, v22, v24, and v25 as part of the March 2026 security releases. It’s enabled by default through two build flags:

v8_enable_seeded_array_index_hash = true
v8_use_default_hasher_secret = false

Notably, the fix was not enabled in Chrome. The V8 team’s position is that HashDoS attacks require a persistent server context where the attacker can send many requests; in a browser tab, the threat model doesn’t apply. This is a reasonable scoping decision, and it explains why the fix is Node.js-specific rather than a V8 upstream default.

A related vulnerability, CVE-2025-27209, addressed improper seeding in rapidhash itself, which influenced the choice of constants here.

What This Illustrates About Hash Function Design

Hash function design involves navigating a space of competing constraints that rarely all point in the same direction. Cryptographic hashes like SHA-256 are slow and non-invertible by design. Fast general-purpose hashes like xxHash or wyhash prioritize throughput and distribution quality but offer no security guarantees. SipHash sits in the middle, providing DoS resistance at reasonable cost.

The V8 array index hash doesn’t fit cleanly into any of these categories because the invertibility requirement cuts across standard design goals. The solution, a carefully parameterized xorshift-multiply construction measured against the avalanche criterion and validated against realistic threat models, is a good example of applying the right level of rigor to the actual problem rather than defaulting to either “fast and insecure” or “cryptographic and slow.”

Contributors Joyee Cheung, Leszek Swirski from the V8 team, and Gus Caplan from Deno all participated in getting the design right. The public write-up in the Node.js security advisory is unusually detailed for a vulnerability disclosure, walking through the bit layout, the construction, the avalanche measurements, and the threat model in full. It’s worth reading if you work on runtime internals or have any interest in practical hash function design.

Was this interesting?