
The Exploitability Asymmetry: Which Vulnerability Classes Small Models Already Find Reliably

Source: hackernews

The security community’s response to the finding that small models are replicating Mythos-level vulnerability discovery has mostly treated it as a uniform capability claim: small models can find what big models find, so the cost barrier to dangerous capabilities is lower than expected. That framing is broadly correct but operationally vague. The more useful claim is narrower: small models find specific vulnerability classes with high reliability, and those classes are not randomly distributed.

The asymmetry traces to the computational structure of different bug categories. Some bugs are structurally tractable for fine-tuned smaller models backed by good tooling. Others require reasoning capabilities that do not compress well into smaller architectures, regardless of fine-tuning quality. Getting this taxonomy right matters more for defensive prioritization than the aggregate headline does.

What Makes Injection Bugs Tractable

SQL injection, command injection, template injection, LDAP injection, and related classes form the bug category where small model detection is most reliable. The reason is that these bugs reduce to a satisfiability question with a clean formalization: can an attacker construct input that reaches a dangerous sink while bypassing whatever sanitization exists between source and sink?

This is the kind of question Z3 and similar SMT solvers were built to answer. Given a path from user input to a SQL query constructor, the solver asks whether there exists an assignment of the input bytes that makes the query semantically different from what the developer intended. The model’s role in this pipeline is relatively narrow: identify candidate data flows, pick which ones are worth passing to constraint analysis, and interpret the results.
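The satisfiability framing can be illustrated without a solver. The sketch below is a bounded brute-force stand-in for what an SMT solver would do symbolically: it searches for any input that survives a sanitizer and still produces a forbidden token. The sanitizer, the token, and the function names are all hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import Optional

FORBIDDEN = "OR"  # hypothetical token the sanitizer is meant to keep out of the query


def sanitize(value: str) -> str:
    """Hypothetical flawed sanitizer: a single pass that deletes the token."""
    return value.replace(FORBIDDEN, "")


def find_witness(alphabet: str, max_len: int) -> Optional[str]:
    """Return the first input whose sanitized form still contains the token."""
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if FORBIDDEN in sanitize(candidate):
                return candidate
    return None  # no bypass exists within the search bound


witness = find_witness("OR", max_len=4)
print(witness, "->", sanitize(witness) if witness else None)
```

A witness means the path is exploitable in principle. No witness only means none exists within the bound, which is the gap a real solver closes by reasoning over all inputs at once.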

A 7B model fine-tuned on CVE writeups and bug bounty disclosures for injection bugs develops strong pattern recognition for these flows without needing the full general reasoning capacity of a frontier system. The training signal is dense. Every SQL injection CVE in the National Vulnerability Database is a labeled example of what a vulnerable flow looks like. The model learns structural signatures: unsanitized string concatenation into query templates, dynamic SQL constructed from request parameters, user-controlled input routed to shell execution contexts.

Semgrep’s taint tracking mode formalizes this exactly. Define a source (user input), a sink (dangerous function call), and a list of sanitizers. Semgrep propagates taint through the AST and flags paths where tainted data reaches a sink without passing through a sanitizer. A small model integrated into that pipeline reads the output and reasons about whether a flagged path is exploitable in practice. The model does not need to understand the full program; it needs to evaluate one candidate flow. That is a tractable task at smaller scales.
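The source, sink, and sanitizer idea can be sketched in a few lines. What follows is a toy taint tracker built on Python's ast module, not Semgrep's implementation: it handles only straight-line assignments, and the names get_param, run_query, and escape_sql are hypothetical placeholders.

```python
import ast

SOURCES = {"get_param"}      # hypothetical: calls that return user input
SINKS = {"run_query"}        # hypothetical: dangerous calls
SANITIZERS = {"escape_sql"}  # hypothetical: calls that clear taint


def call_name(node):
    """Name of a plain function call such as f(x), else None."""
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        return node.func.id
    return None


def is_tainted(expr, tainted):
    """True if expr may carry user input, given the tainted variable names."""
    name = call_name(expr)
    if name in SANITIZERS:
        return False
    if name in SOURCES:
        return True
    if isinstance(expr, ast.Name):
        return expr.id in tainted
    return any(is_tainted(child, tainted) for child in ast.iter_child_nodes(expr))


def find_tainted_sinks(source_code):
    """Line numbers where tainted data reaches a sink (straight-line code only)."""
    tainted, findings = set(), []
    for stmt in ast.parse(source_code).body:
        # Check sinks first, using the taint state before this statement runs.
        for node in ast.walk(stmt):
            if call_name(node) in SINKS and any(is_tainted(a, tainted) for a in node.args):
                findings.append(node.lineno)
        # Then propagate taint through simple assignments.
        if isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name):
            target = stmt.targets[0].id
            if is_tainted(stmt.value, tainted):
                tainted.add(target)
            else:
                tainted.discard(target)
    return findings


SAMPLE = '''\
user = get_param("name")
query = "SELECT * FROM users WHERE name = '" + user + "'"
run_query(query)
safe = escape_sql(user)
run_query("SELECT * FROM users WHERE name = '" + safe + "'")
'''
print(find_tainted_sinks(SAMPLE))
```

Only the unsanitized flow on the third line of the sample is reported. A flagged line like that, together with its surrounding code, is the single candidate flow a small model would be asked to judge.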

Memory Safety Bugs: Rich Fine-Tuning Signal

Buffer overflows, use-after-free, double-free, heap corruption, and stack-based vulnerabilities form the other class where small model detection is solid. The mechanism here is different from injection: the fine-tuning signal is extraordinarily rich and structured.

The kernel sanitizer stack (KASAN, KMSAN, UBSAN, and KCSAN) generates structured bug reports in a consistent format. A KASAN report, for example, gives the operation type, the memory address, the allocation site, the size, and the stack trace at the point of corruption. Google’s syzbot has been generating these reports against the Linux kernel continuously since 2017, producing thousands of labeled examples of what a memory safety violation looks like at the moment it triggers.
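To show why that consistency matters for training data, here is a minimal sketch that turns the two header lines of a KASAN report into labeled fields. The sample text is modeled on the example in the kernel's KASAN documentation. The regular expressions and field names are assumptions made for illustration, and real reports also carry stack traces and allocation details that this ignores.

```python
import re

# Header lines modeled on the example in the kernel's KASAN documentation.
REPORT = """\
BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0xa8/0xbc [test_kasan]
Write of size 1 at addr ffff8801f44ec37b by task insmod/2760
"""

BUG_LINE = re.compile(r"BUG: KASAN: (?P<bug_type>\S+) in (?P<function>[\w.]+)")
ACCESS_LINE = re.compile(
    r"(?P<access>Read|Write) of size (?P<size>\d+) at addr (?P<addr>[0-9a-f]+)"
)


def parse_kasan_header(text):
    """Extract bug type, faulting function, access kind, size, and address."""
    bug = BUG_LINE.search(text)
    access = ACCESS_LINE.search(text)
    if not (bug and access):
        return None  # not a report header this sketch understands
    fields = {**bug.groupdict(), **access.groupdict()}
    fields["size"] = int(fields["size"])
    return fields


print(parse_kasan_header(REPORT))
```

Each parsed record pairs a bug class with a function name, which is the kind of label that can be joined back to source code when assembling a fine-tuning corpus.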

A model trained on that corpus learns what the structural preconditions of memory corruption look like in source code: missing bounds checks before array indexing, pointer arithmetic without accompanying length validation, lifetime patterns where freed memory is later dereferenced. These are structural patterns visible in the code itself, not temporal or semantic properties that require runtime observation to detect.

The UIUC research from 2024 that reported an 87% exploit success rate for GPT-4 on real CVEs illustrates the dependence on known patterns, although its 15-vulnerability benchmark leaned toward web and application-level bugs rather than memory corruption. The 87% number comes from an agentic setup with tool access, and it was measured when the model received a CVE description labeling the vulnerability. Without the CVE description, the rate dropped to 7%. That gap is a useful diagnostic: the model is strong at exploitation given the class, weaker at independent class identification from first principles. Fine-tuned smaller models that have been trained extensively on the structural signatures close that identification gap for known-pattern classes.

Where Small Models Struggle: Race Conditions

Race conditions, TOCTOU bugs, and concurrency vulnerabilities are where small model capability degrades substantially. The structural reason is that these bugs are temporal properties, and temporal properties require runtime observation to confirm.

A race condition is not visible in a single static snapshot of code. Finding one requires reasoning about possible interleavings of concurrent execution: which interleaving is dangerous, and whether an attacker can influence timing to force the vulnerable ordering to occur. KCSAN detects data races by instrumenting memory accesses at compile time and using sampled watchpoints to catch conflicting concurrent accesses at runtime. That instrumentation-based approach is what actually works.
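The interleaving argument can be made concrete with a deterministic toy model. The sketch below enumerates every ordering of two check-then-act withdrawals against a shared balance and counts the orderings that overdraw it. The account scenario and all names are hypothetical, and real schedulers expose far more interleavings than this.

```python
from itertools import permutations


def run(schedule, start_balance=100, amount=100):
    """Execute one interleaving of two check-then-act withdrawals.

    Each thread appears twice in the schedule: its first step is the
    balance check, its second step is the withdrawal.
    """
    balance = start_balance
    approved = {}
    for thread in schedule:
        if thread not in approved:
            approved[thread] = balance >= amount  # check
        elif approved[thread]:
            balance -= amount                     # act on a possibly stale check
    return balance


# All distinct orderings that preserve each thread's own step order.
schedules = sorted(set(permutations("AABB")))
overdrawn = [s for s in schedules if run(s) < 0]

print(f"{len(overdrawn)} of {len(schedules)} interleavings overdraw the account")
```

The source text of both withdrawals is identical in every schedule. Only the ordering differs, which is why a static snapshot cannot tell the safe runs from the unsafe ones.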

Static analysis can flag shared mutable state accessed without obvious synchronization, but the false positive rate in any real codebase is high enough to erode signal quality. A small model guided by those static flags performs only marginally better than the underlying static analysis, because the model cannot resolve the fundamental ambiguity without runtime information it does not have access to. Fine-tuning on race condition CVEs helps the model recognize common patterns like lock ordering violations or check-then-act sequences, but the coverage is narrow and the confidence is low. This class rewards instrumentation-first approaches over model-first approaches.

Authorization Logic: The Semantic Gap

Authorization and access control bugs, along with general business logic vulnerabilities, represent a third category where small models add little reliable value. These bugs are not structural in the code analysis sense. They require understanding what the code is supposed to do and identifying where the implementation diverges from the intended policy.

Broken object-level authorization in an API endpoint is not detectable by examining the endpoint’s code in isolation. You need to understand the access policy: who owns which resource type, what operations are permitted at what privilege level, how identity flows through the request context. A missing permission check is only a bug if the caller was not supposed to have permission. That determination requires semantic context that lives outside the code itself, in specifications, documentation, or domain knowledge.
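A small sketch makes the point about context concrete. The handler below has no ownership check, and whether that is a bug depends entirely on a policy table the handler never sees. The resource kinds, the policy, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    kind: str
    owner: str
    body: str


# Hypothetical policy. In practice this lives in a specification or in
# documentation, not in the handler code a scanner gets to read.
OWNER_ONLY_KINDS = {"invoice"}


def fetch(resource: Resource, caller: str) -> str:
    """Handler with no ownership check, identical for every resource kind."""
    return resource.body


def violates_policy(resource: Resource, caller: str) -> bool:
    """Whether serving this resource to this caller breaks the stated policy."""
    return resource.kind in OWNER_ONLY_KINDS and resource.owner != caller


post = Resource("blog_post", "alice", "public text")
invoice = Resource("invoice", "alice", "billing details")

print(violates_policy(post, "bob"), violates_policy(invoice, "bob"))
```

The same call to fetch is harmless for the blog post and a broken object-level authorization bug for the invoice. Nothing in the body of fetch distinguishes the two cases.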

This class is where the 30-70% false positive rates in SAST deployments partly originate. Tools flag anything that could be a permission boundary violation. Developers ignore most of it because most flagged cases require context the tool lacks to evaluate correctly. A Microsoft Research study found developers acted on roughly 14% of static analysis findings in practice. A small model reasoning about flagged authorization code adds some filtering value, but the fundamental semantic gap between code structure and intended policy is not something pattern matching closes.

Implications for Defensive Prioritization

The asymmetry has a direct operational implication: the bug classes where small models are already reliable are the classes where hardening investment has the clearest return given the current threat landscape.

SQL injection has well-documented countermeasures: parameterized queries, prepared statements, ORM abstraction that handles quoting correctly by construction. If fine-tuned small models scanning public-facing codebases will find SQLi-vulnerable data flows reliably, and that scanning capability is now at commodity cost, then any SQLi-vulnerable flow in a public application should be treated as having a known path to exploitation. The question shifts from whether an attacker will look for it to how quickly one will find it.
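For readers who want the countermeasure spelled out, here is a minimal sketch using Python's built-in sqlite3 module. The table, the data, and the function names are made up for illustration. The contrast is between splicing input into the SQL text and binding it as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def lookup_concatenated(name):
    # Vulnerable: the input is spliced into the SQL text itself.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_parameterized(name):
    # The input is bound as a value and is never parsed as SQL.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "nobody' OR '1'='1"
print(len(lookup_concatenated(payload)))   # rows returned despite no matching user
print(len(lookup_parameterized(payload)))  # rows returned when the payload is bound
```

With the concatenated query the payload rewrites the WHERE clause and every row comes back. With the bound parameter the same string is compared literally against the name column and matches nothing.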

Memory safety has a different remediation path. Language-level solutions (Rust, modern C++ with bounds-checked containers, Go’s garbage collector handling object lifetime) eliminate whole vulnerability classes at the source. For C codebases that cannot migrate, running AddressSanitizer and the full sanitizer stack in CI pipelines catches violations before they ship. DARPA’s AIxCC demonstrated that automated pipelines can find and generate patch candidates for memory safety classes at scale; the challenge the competition surfaced is that patch correctness remains harder than detection, a pattern that mirrors what the automated program repair research community found a decade earlier.

Race conditions and auth logic bugs, by contrast, are not well-served by small model tooling today. Investment in runtime monitoring, structured threat modeling processes that explicitly enumerate access control decisions, and code review practices focused on temporal and semantic properties remains work that automated tooling cannot substitute for yet.

The headline that small models replicate Mythos findings is a useful forcing function for reconsidering threat models. The more precise picture is that specific, tractable bug classes are now in commodity tooling range, and the defender response to that should be class-specific rather than uniform. Treating all vulnerability classes as equally threatened by capability diffusion leads to unfocused response. The asymmetry is structural, and it is grounded in the same computational properties that make some bugs easy to find with symbolic execution while others have resisted automation for decades.
