Building a Security Harness for AI Code Generation

The Thoughtworks team that discovered their AI coding assistants recommending public storage buckets and overprivileged service accounts learned what many engineering teams are learning right now: telling an AI to write secure code is not the same as enforcing that it does. The statistics bear this out. Twenty-five percent of AI-generated code contains confirmed vulnerabilities, and one in five enterprise breaches in 2026 now stem from AI-generated code, according to recent industry research.

The problem is structural. Language models optimize for the path of least resistance. Public buckets are easier than signed URLs. Broad IAM roles are simpler than granular permissions. The model has no stake in your security posture, so when a user pushes back on a restriction, the constraint often evaporates.

The solution is not to abandon AI coding tools. The solution is to wrap them in what Birgitta Böckeler calls a “harness,” a system of guides and sensors that steer the model before it acts and validate its output afterward. Her framework divides these controls along two axes: feedforward versus feedback, and computational versus inferential.

Computational Controls: The First Line of Defense

Computational controls are deterministic, fast, and CPU-driven. They include static analysis tools, linters, policy-as-code engines, and pre-commit hooks. These tools do not reason about intent; they enforce rules.

Consider Semgrep, which uses pattern matching to catch insecure code structures. You can write a rule that flags any GCP bucket configuration missing uniform bucket-level access controls, or any AWS S3 bucket with acl = "public-read". When an AI agent generates Terraform that violates such a rule, the pre-commit hook or CI job running Semgrep fails, and the commit is blocked.

Here is an example Semgrep rule for catching overly permissive IAM bindings:

rules:
  - id: gcp-overprivileged-service-account
    # Flags any project-level IAM member granted the Token Creator role.
    patterns:
      - pattern: |
          resource "google_project_iam_member" "$NAME" {
            ...
            role = "roles/iam.serviceAccountTokenCreator"
            ...
          }
    message: "IAM member granted the Service Account Token Creator role. Use least-privilege alternatives."
    severity: ERROR
    languages: [hcl]

This rule catches one concrete form of the overprivileged service account problem the Thoughtworks team caught manually; other broad roles need their own patterns. The difference is that Semgrep runs on every commit, not just when someone remembers to audit the code.

Other computational tools serve similar purposes. Checkov scans infrastructure as code for misconfigurations before provisioning. Trivy inspects container images for vulnerabilities and secrets. KICS scans infrastructure-as-code files, including Kubernetes manifests, for misconfigurations. Each of these tools runs in seconds to minutes, requires no model inference, and cannot be convinced to make exceptions through clever prompting.
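A gate built from these tools can be sketched in a few lines of Python. The sketch below is an illustration, not a reference implementation: the function name run_gate, the .security/semgrep rules path, and the exact command lines are assumptions, and each command should be checked against the installed version of the tool. The one design choice worth noting is that a missing scanner counts as a failure, so the gate fails closed.

```python
import subprocess
import sys

# Assumed scanner invocations; adjust paths and flags to your own setup.
# ".security/semgrep" is a hypothetical location for the project's rules.
DEFAULT_COMMANDS = [
    ["semgrep", "scan", "--config", ".security/semgrep", "--error"],
    ["checkov", "-d", "."],
    ["trivy", "config", "--exit-code", "1", "."],
]


def run_gate(commands):
    """Run each scanner command and return the ones that did not pass."""
    failed = []
    for cmd in commands:
        try:
            result = subprocess.run(cmd, capture_output=True, text=True)
        except FileNotFoundError:
            # A scanner that is not installed is treated as a failure,
            # so the gate fails closed instead of silently passing.
            failed.append(cmd)
            continue
        if result.returncode != 0:
            failed.append(cmd)
    return failed


def main():
    """Return a process exit code: 0 if every scanner passed, 1 otherwise."""
    failures = run_gate(DEFAULT_COMMANDS)
    for cmd in failures:
        print("blocked by:", " ".join(cmd), file=sys.stderr)
    return 1 if failures else 0
```

A Git hook script would end with sys.exit(main()); any non-zero exit code aborts the commit.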

The computational layer should also include dependency scanning. Tools like Snyk and Dependabot flag known CVEs in the libraries an AI agent pulls into a project. AI models trained on open source code often suggest popular packages, but popularity does not guarantee security. The Black Duck Open Source Security and Risk Analysis report found that 78% of codebases in 2026 contain high or critical severity vulnerabilities, a number that has doubled year over year as AI-generated code pulls in dependencies without human vetting.
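The core check these scanners perform can be illustrated with a minimal sketch. Everything here is hypothetical: the Advisory type, the package name examplelib, and the advisory ID are placeholders, and real scanners match version ranges against a maintained vulnerability database rather than the exact-version sets used below.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """Hypothetical advisory record; real feeds carry version ranges."""
    package: str
    affected_versions: frozenset
    advisory_id: str


def parse_pins(requirements_text):
    """Parse 'name==version' lines; skip blanks, comments, unpinned entries."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_vulnerable(pins, advisories):
    """Return (package, version, advisory_id) for every pinned match."""
    hits = []
    for adv in advisories:
        version = pins.get(adv.package.lower())
        if version is not None and version in adv.affected_versions:
            hits.append((adv.package, version, adv.advisory_id))
    return hits
```

Because the check is a lookup rather than a judgment, it returns the same answer no matter how the dependency was introduced or how the agent was prompted.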

Inferential Controls: Context-Aware Guardrails

Inferential controls rely on semantic analysis and AI-driven judgment. These include system prompts, security context files, and LLM-based code review agents. They are slower and less deterministic than computational controls, but they handle nuance that rigid rules cannot capture.

The Thoughtworks team implemented a security context file, a document the AI reads before generating any infrastructure or application code. This file does not just say “be secure.” It specifies concrete constraints:

All GCP storage buckets must use uniform bucket-level access with signed URLs.
Service accounts must follow least-privilege principles; use predefined roles, not primitive roles.
No hardcoded secrets. Use Secret Manager with workload identity federation.
All API endpoints must enforce authentication via Identity-Aware Proxy.

The context file turns vague guidance into technical requirements. When the AI proposes a configuration, it must justify how that configuration satisfies each constraint. If it cannot, the proposal is flagged for human review.
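One way to wire this up is sketched below, under stated assumptions. The constraint IDs, the function names, and the idea that the agent returns its justifications as a mapping from constraint ID to text are all illustrative; the source does not describe how the Thoughtworks team implemented this step. The constraint wording is taken from the list above.

```python
# Constraint text comes from the context file; the short IDs are hypothetical.
CONSTRAINTS = {
    "storage": "All GCP storage buckets must use uniform bucket-level access with signed URLs.",
    "iam": "Service accounts must follow least-privilege principles; use predefined roles, not primitive roles.",
    "secrets": "No hardcoded secrets. Use Secret Manager with workload identity federation.",
    "auth": "All API endpoints must enforce authentication via Identity-Aware Proxy.",
}


def build_system_prompt(constraints):
    """Render the constraints as a numbered block for the agent's context."""
    lines = ["Security constraints (justify each one for every proposal):"]
    for number, (cid, text) in enumerate(constraints.items(), start=1):
        lines.append(f"{number}. [{cid}] {text}")
    return "\n".join(lines)


def unjustified_constraints(justifications, constraints):
    """Return IDs of constraints with a missing or blank justification."""
    return [
        cid for cid in constraints
        if not justifications.get(cid, "").strip()
    ]


def needs_human_review(justifications, constraints):
    """A proposal goes to a human if any constraint lacks a justification."""
    return bool(unjustified_constraints(justifications, constraints))
```

This only checks that a justification exists, not that it is true. Verifying the claim is the job of the deterministic scanners and the human reviewer.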

Security context files are not a new idea, but their application to AI coding workflows is. Tools like Claude Code read a CLAUDE.md file, kept in the project root or inside a .claude directory, as persistent context that shapes every interaction with the agent. Other tools, including GitHub Copilot and Cursor, support similar project-level instruction files, recognizing that one-off prompts are insufficient for enforcing enterprise security policies.

Another inferential control is the security intelligence feed. The Thoughtworks team created a daily digest of new CVEs, OWASP updates, and cloud provider security advisories. This feed is injected into the AI’s context window at the start of each session, ensuring the model has access to vulnerability data that postdates its training cutoff. When a zero-day is disclosed, the feed makes it far less likely that the AI will suggest the affected pattern, although, like any inferential control, it cannot guarantee it.
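The digest step might look like the following sketch. The FeedItem fields, the build_digest name, and the item cap are assumptions made for illustration; how the feed is actually collected and formatted is not described in the source.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FeedItem:
    """Hypothetical advisory entry collected by the daily feed job."""
    published: date
    source: str
    identifier: str
    summary: str


def build_digest(items, training_cutoff, max_items=20):
    """Format items published after the cutoff, newest first, for the prompt."""
    recent = [item for item in items if item.published > training_cutoff]
    recent.sort(key=lambda item: item.published, reverse=True)
    lines = ["Security advisories published after the model's training cutoff:"]
    for item in recent[:max_items]:
        lines.append(
            f"- [{item.published.isoformat()}] {item.source} "
            f"{item.identifier}: {item.summary}"
        )
    if len(lines) == 1:
        lines.append("- (none)")
    return "\n".join(lines)
```

Capping the number of items keeps the digest from crowding out the rest of the context window; the cap of 20 is arbitrary.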

Human-in-the-loop review backs up the inferential layer. Not every security decision can be automated. When an AI agent requests a permission that seems excessive, a human must evaluate whether the request is justified by the task or symptomatic of a deeper architectural problem. This is why the Thoughtworks team emphasizes caution with AI permission requests: granting an agent broad access to your cloud environment is equivalent to granting that access to anyone who can manipulate the agent’s prompts.
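A simple triage step can decide which permission requests reach a human. The sketch below is an assumption-laden illustration: the three outcomes, the triage_role_request name, and the notion of a pre-approved role set per task are hypothetical. The role names themselves are real GCP roles, with the basic (formerly primitive) roles denied outright.

```python
# GCP basic (formerly "primitive") roles: too broad for an agent.
BASIC_ROLES = {"roles/owner", "roles/editor", "roles/viewer"}

# Roles that allow acting as other identities; the list is illustrative,
# not exhaustive.
SENSITIVE_ROLES = {
    "roles/iam.serviceAccountTokenCreator",
    "roles/iam.serviceAccountUser",
}


def triage_role_request(role, task_roles):
    """Classify an agent's role request as 'deny', 'human_review', or 'allow'.

    task_roles is the (hypothetical) set of roles pre-approved for the task.
    """
    if role in BASIC_ROLES:
        return "deny"
    if role in SENSITIVE_ROLES or role not in task_roles:
        return "human_review"
    return "allow"
```

Anything outside the pre-approved set defaults to review rather than approval, so an unanticipated request cannot slip through unexamined.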

Feedforward and Feedback: Timing Matters

Böckeler’s framework also distinguishes between feedforward controls, which steer the model before it generates output, and feedback controls, which validate the output after the fact.

Feedforward controls include the security context file, system prompts, and example templates. These shape the AI’s initial response. A secure-by-default template for a serverless function might include IAM role definitions with scoped permissions, environment variable placeholders that reference Secret Manager, and HTTPS-only endpoints. When the AI starts from this template, insecure configurations become the exception rather than the default.

Feedback controls include static analysis, CI/CD gates, and runtime monitoring. These catch what the feedforward controls miss. A CI gate that runs OWASP Dependency-Check will block a pull request if the AI has introduced a vulnerable dependency, even if the context file did not anticipate that specific library.

The most effective harnesses combine both. Feedforward controls reduce the volume of insecure code that reaches the validation stage. Computational feedback controls provide a deterministic safety net when inferential controls fail.

Implementation: A Practical Workflow

Here is a workflow that integrates these principles:

Project setup: Create a .security directory with a context file, secure templates, and a policy-as-code configuration.
Agent initialization: Load the security context file into the AI’s system prompt. Provide templates for common tasks.
Code generation: The AI generates code guided by the context file and templates.
Pre-commit validation: Run Semgrep, Checkov, and Trivy as Git hooks. Block commits that violate policy.
CI/CD enforcement: Run the same tools in CI, plus dependency scanning and container image analysis. Fail the build on critical findings.
Human review: Flag high-risk changes for manual security review before merge.
Runtime monitoring: Use cloud provider audit logs and runtime security tools to detect anomalies in deployed code.
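The CI enforcement and human review steps reduce to a small decision function. The sketch below assumes scanner findings have already been normalized to a severity level; the Severity enum, the high-risk file patterns, and the three outcome labels are hypothetical choices, not part of any tool's API.

```python
from enum import IntEnum
from fnmatch import fnmatch


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical patterns for changes that always deserve a security review.
HIGH_RISK_PATTERNS = ("*.tf", "*iam*", "*Dockerfile*")


def is_high_risk(changed_files, patterns=HIGH_RISK_PATTERNS):
    """True if any changed file matches a high-risk pattern."""
    return any(fnmatch(path, p) for path in changed_files for p in patterns)


def merge_decision(finding_severities, changed_files):
    """Map scanner findings and changed paths to a pipeline outcome."""
    if any(sev >= Severity.CRITICAL for sev in finding_severities):
        return "fail_build"
    if any(sev >= Severity.HIGH for sev in finding_severities):
        return "human_review"
    if is_high_risk(changed_files):
        return "human_review"
    return "auto_merge_eligible"
```

Critical findings are checked first, so a high-risk path can only add review, never downgrade a failed build.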

This workflow is not theoretical. Tools exist for every stage. The pre-commit framework orchestrates Git hooks. Open Policy Agent enforces policy as code in CI/CD. Falco provides runtime threat detection for Kubernetes. The infrastructure is available; what is often missing is the organizational commitment to treat AI-generated code as untrusted by default.

Why This Matters Now

The Thoughtworks article highlights a broader trend: AI coding tools are proliferating faster than security practices are adapting. Forty-two percent of new enterprise software in 2026 is AI-generated, according to a recent developer survey, but 50% of organizations have no policies governing what sensitive data can be shared with AI tools. The result is a security gap that attackers are already exploiting.

The gap is widening because AI-generated code is outpacing security teams’ ability to review it. Sixty-two percent of security teams report that keeping up with the volume is getting harder, according to ProjectDiscovery’s 2026 AI Coding Impact Report. Manual code review cannot scale at the rate AI is generating code. Automated controls are not optional; they are the only path to maintaining security at velocity.

The irony is that AI coding tools could make software more secure, not less. If the harness is well-designed, the AI can be steered toward secure patterns more consistently than human developers, who vary in expertise and attention. The model does not get tired, does not feel deadline pressure, and can be updated with new security guidance instantly. The failure mode is not the AI itself; it is the absence of deterministic enforcement around it.

Conclusion

Prompting an AI to be secure is an inferential control. It is valuable, but insufficient. Security at scale requires computational controls that cannot be overridden by prompt engineering, cannot be fatigued by volume, and do not depend on the AI’s understanding of your intent.

The Thoughtworks team learned this by catching two critical vulnerabilities before they reached production. The next team might not be as vigilant. The solution is to build security into the development workflow itself: context files that set expectations, templates that encode best practices, static analysis that blocks violations, and CI/CD gates that prevent insecure code from reaching production.

The tools exist. The framework exists. What remains is the discipline to treat AI-generated code as untrusted until proven otherwise, and to automate that proof through a harness of computational and inferential controls working in concert.