Building Security Guardrails Into AI-Assisted Development

Source: martinfowler

The Thoughtworks team working on global marketing applications ran into a problem that most teams using AI coding assistants will recognize. Their velocity jumped, prototypes materialized in days instead of weeks, but the code that emerged carried a troubling pattern: insecure defaults, overly permissive configurations, credentials in plain text.

This maps to what the team calls vibe coding, where developers iterate rapidly with AI assistance but skip the deliberate security review that traditionally gates production deployments. The AI suggestions work, they solve the immediate problem, and they ship. The security debt accumulates silently.

The question becomes: how do you preserve the velocity gains while systematically preventing the most common security failures?

The Pattern of Insecure Suggestions

AI models trained on public code repositories inherit the security posture of their training data. GitHub contains millions of examples of AWS credentials in environment variables, database connections without TLS, CORS set to *, JWT secrets hardcoded in source files. When Claude or GPT-4.1 suggests code, it draws from this distribution.

The models understand security concepts when explicitly prompted. Ask Claude Sonnet 4.6 to review code for security issues and it will flag most obvious problems. The failure occurs in the default suggestion path, where the model optimizes for working code over secure code.

Consider a typical exchange. You ask the AI to add database connectivity to your Express application. It suggests:

const pool = new Pool({
  user: 'admin',
  host: 'localhost',
  database: 'myapp',
  password: 'password123',
  port: 5432,
});

This works. It solves the immediate problem. It also hardcodes credentials, uses a weak password, and provides no SSL enforcement. The AI will improve this if you ask, but most developers accept the first working solution.

Security Context Files

The Thoughtworks approach centers on providing the AI with explicit security context before it generates code. This takes the form of a structured document, typically .claude/SECURITY.md or similar, that lives in the repository.

The content specifies mandatory requirements:

## Authentication
- All API endpoints must use JWT validation
- JWTs must be verified using RS256, never HS256
- Secrets must be loaded from environment variables, never hardcoded
- Token expiry must not exceed 15 minutes for access tokens

## Database
- All connections must use TLS 1.3
- Connection strings must be loaded from AWS Secrets Manager
- No direct database credentials in environment variables
- Use read-only connection pools for query-only operations

## Dependencies
- No packages with known critical CVEs
- Pin exact versions, never use `^` or `~` in package.json
- Run `npm audit` before accepting any AI-suggested dependencies

When the AI has this context loaded, either through a system prompt or by reading the file as part of its context window, the suggestions shift. The database connection example becomes:

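import pg from 'pg';
import { SecretsManagerClient, GetSecretValueCommand } from '@aws-sdk/client-secrets-manager';

const { Pool } = pg;

// DB_SECRET_ARN holds only the ARN of the secret, never the credentials themselves.
const client = new SecretsManagerClient({ region: 'us-east-1' });
const response = await client.send(
  new GetSecretValueCommand({ SecretId: process.env.DB_SECRET_ARN })
);
// Assumes the secret's JSON keys match pg's option names (user, host, database, password, port).
const credentials = JSON.parse(response.SecretString);

const pool = new Pool({
  ...credentials,
  // Verify the server certificate and refuse anything older than TLS 1.3.
  ssl: { rejectUnauthorized: true, minVersion: 'TLSv1.3' },
});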

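The AI needs the context present for every request. Claude Code and Cursor both support project-level context files that load automatically (CLAUDE.md and the rules under .cursor/rules, respectively), so a file such as .claude/SECURITY.md typically has to be referenced from one of those to be picked up. GitHub Copilot supports repository-wide custom instructions in .github/copilot-instructions.md; where that is not set up, the context has to be included manually in comments or inferred from existing code patterns.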
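Rules in the context file can also be checked mechanically rather than left to the AI alone. As a minimal sketch of the "pin exact versions" rule, the function below reports dependencies whose version spec is not an exact pin. The function name, the regular expression, and the sample package data are illustrative assumptions, not part of the Thoughtworks setup.

```javascript
// Hypothetical helper: returns dependencies whose version spec is not an exact pin.
// An exact pin is assumed here to mean a plain x.y.z version (optionally with a
// prerelease or build suffix); ranges, tags, and URLs are all reported.
const EXACT_VERSION = /^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)*$/;

function findUnpinnedDependencies(packageJson) {
  const sections = ['dependencies', 'devDependencies', 'optionalDependencies'];
  const unpinned = [];
  for (const section of sections) {
    for (const [name, spec] of Object.entries(packageJson[section] ?? {})) {
      if (!EXACT_VERSION.test(spec)) {
        unpinned.push({ section, name, spec });
      }
    }
  }
  return unpinned;
}

// Sample input standing in for a parsed package.json.
const examplePackage = {
  dependencies: { express: '4.19.2', helmet: '^7.1.0' },
  devDependencies: { eslint: '~9.0.0' },
};

for (const { section, name, spec } of findUnpinnedDependencies(examplePackage)) {
  console.log(`${section}: ${name}@${spec} is not pinned to an exact version`);
}
```

In a real repository the input would come from parsing package.json, and a CI step could fail the build whenever the returned list is non-empty.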

Permission Request Scrutiny

The second vector comes from AI requests for elevated permissions. Tools like Claude Code operate with a permission model: the AI can read files freely but must request permission to execute commands, write files, or access network resources.

The Thoughtworks team found that AI agents frequently request permissions that seem reasonable in isolation but create security holes:

  • chmod 777 on a directory to fix a file access error
  • sudo npm install to bypass permission errors
  • Disabling CORS temporarily for debugging
  • Setting database user to superuser role to resolve a constraint error

Each request arrives with plausible justification. The developer, focused on forward progress, approves. The permission remains in place, often long after the immediate problem is solved.

The mitigation requires policy at the team level. Define a set of requests that always require a second review:

  • Any sudo command
  • Any chmod that grants group or world write access (for example 775, 666, or 777)
  • Any database permission grant
  • Any CORS or CSP policy change
  • Any change to .gitignore entries that cover secret files
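Part of this list lends itself to an automated first pass over the commands an agent asks to run. The sketch below covers only the sudo and chmod items. It interprets the chmod rule as "grants group or world write access", which is an assumption about the intent, and the function name is hypothetical.

```javascript
// Hypothetical policy check for shell commands an agent asks to run.
// Returns the reasons a command needs a second reviewer (empty array if none).
function reviewReasons(command) {
  const reasons = [];
  const tokens = command.trim().split(/\s+/);

  if (tokens.includes('sudo')) {
    reasons.push('uses sudo');
  }

  const chmodIndex = tokens.indexOf('chmod');
  if (chmodIndex !== -1) {
    // Take the first non-flag argument after chmod as the mode. Symbolic modes
    // (e.g. a+w) are flagged conservatively because this sketch does not parse them.
    const args = tokens.slice(chmodIndex + 1).filter((t) => !t.startsWith('-'));
    const mode = args[0] ?? '';
    if (/^[0-7]{3,4}$/.test(mode)) {
      const bits = parseInt(mode, 8);
      // 0o022 covers the group-write and world-write bits.
      if ((bits & 0o022) !== 0) {
        reasons.push(`chmod ${mode} grants group or world write access`);
      }
    } else {
      reasons.push(`chmod with non-octal mode "${mode}" needs manual inspection`);
    }
  }

  return reasons;
}

console.log(reviewReasons('sudo npm install'));
console.log(reviewReasons('chmod -R 777 ./uploads'));
console.log(reviewReasons('chmod 755 ./bin/start.sh'));
```

Checking the write bits rather than comparing the mode as a number matters here: 666 is numerically lower than 755 but still makes a file world-writable.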

Some teams implement this as a pre-commit hook that scans the git diff for patterns, others use a review checklist that gates PR approval.
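A diff-scanning hook of this kind might look roughly like the following. This is a sketch under stated assumptions: the pattern list is illustrative and would need tuning for a given stack, and the function and variable names are hypothetical. It inspects only the added lines of a unified diff.

```javascript
// Hypothetical patterns; a real team would tune these to its own stack.
const RISKY_PATTERNS = [
  { label: 'wildcard CORS origin', regex: /origin\s*:\s*['"]\*['"]/ },
  { label: 'TLS verification disabled', regex: /rejectUnauthorized\s*:\s*false/ },
  { label: 'broad database privilege', regex: /\b(GRANT\s+ALL|ALTER\s+ROLE\s+\S+\s+(WITH\s+)?SUPERUSER)\b/i },
  { label: 'hardcoded password', regex: /password\s*[:=]\s*['"][^'"]+['"]/i },
];

// Scans the added lines of a unified diff and returns one finding per match.
function scanDiff(diffText) {
  const findings = [];
  let currentFile = null;
  for (const line of diffText.split('\n')) {
    if (line.startsWith('+++ ')) {
      // "+++ b/path" names the file whose added lines follow.
      currentFile = line.slice(4).replace(/^b\//, '');
      continue;
    }
    if (!line.startsWith('+')) continue; // only additions matter here
    const added = line.slice(1);
    for (const { label, regex } of RISKY_PATTERNS) {
      if (regex.test(added)) {
        findings.push({ file: currentFile, label, line: added.trim() });
      }
    }
  }
  return findings;
}

// Sample diff standing in for the output of `git diff --cached`.
const exampleDiff = [
  'diff --git a/src/server.js b/src/server.js',
  '--- a/src/server.js',
  '+++ b/src/server.js',
  '@@ -10,3 +10,5 @@',
  ' app.use(express.json());',
  "+app.use(cors({ origin: '*' }));",
  '+const pool = new Pool({ ssl: { rejectUnauthorized: false } });',
].join('\n');

for (const finding of scanDiff(exampleDiff)) {
  console.log(`${finding.file}: ${finding.label}: ${finding.line}`);
}
```

Wired into a pre-commit hook, the script would read the staged diff and exit non-zero when any finding is returned, which forces the second review before the commit lands. Pattern matching like this catches only the obvious cases, so it complements the review checklist rather than replacing it.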

Daily Security Intelligence

The third component addresses the dynamic threat landscape. A secure configuration today becomes vulnerable tomorrow when a new CVE drops or an attack technique evolves.

The Thoughtworks team created an automated daily feed that updates the security context file with current intelligence:

#!/bin/bash
# security-update.sh

# Fetch latest CVE data for project dependencies
# (npm audit exits non-zero when it finds vulnerabilities; the report is still written)
npm audit --json > audit-results.json

# Check for new advisories on used frameworks
# (URL is quoted so the shell does not treat "?" as a glob character)
curl -s "https://api.github.com/advisories?ecosystem=npm" > gh-advisories.json

# Update SECURITY.md with new findings
node scripts/update-security-context.js

# Commit if changes exist
if [[ -n $(git status -s .claude/SECURITY.md) ]]; then
  git add .claude/SECURITY.md
  git commit -m "security: update context with latest advisories"
fi

The update script parses the audit results and advisory data, then appends new restrictions to the security context. If a critical CVE drops for jsonwebtoken below version 9.0.2, the script adds a version constraint that the AI will respect in future suggestions.
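The source does not show the update script itself, so the following is only a sketch of what its core might do. It assumes the npm 7+ audit report shape (a `vulnerabilities` object keyed by package name, each entry carrying `severity` and a vulnerable `range`). The function names, the wording of the generated rule, and the sample report are all hypothetical.

```javascript
// Severities that should turn into a restriction in the context file (an assumption).
const BLOCKING_SEVERITIES = new Set(['high', 'critical']);

// Turns an audit report into markdown bullet lines, one per blocking package.
function buildConstraintLines(auditReport) {
  const lines = [];
  for (const [name, vuln] of Object.entries(auditReport.vulnerabilities ?? {})) {
    if (!BLOCKING_SEVERITIES.has(vuln.severity)) continue;
    lines.push(`- Do not use \`${name}\` in range \`${vuln.range}\` (${vuln.severity} advisory)`);
  }
  return lines.sort();
}

// Appends only the lines that the context file does not already contain,
// so running the job every day does not pile up duplicates.
function mergeIntoContext(existingMarkdown, newLines) {
  const missing = newLines.filter((line) => !existingMarkdown.includes(line));
  if (missing.length === 0) return existingMarkdown;
  return `${existingMarkdown.trimEnd()}\n${missing.join('\n')}\n`;
}

// Made-up sample report; real input would be the parsed audit-results.json.
const exampleReport = {
  vulnerabilities: {
    jsonwebtoken: { severity: 'critical', range: '<9.0.2' },
    debug: { severity: 'low', range: '<2.6.9' },
  },
};

const existingContext = '## Dependencies\n- No packages with known critical CVEs\n';
console.log(mergeIntoContext(existingContext, buildConstraintLines(exampleReport)));
```

Keeping the merge idempotent is the main design choice: the commit step in the shell script only fires when the file actually changes, so a day with no new advisories produces no commit.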

This creates a feedback loop: vulnerabilities discovered in production or reported in the ecosystem inform the AI’s future code generation within a day, at the next scheduled update.

Secure-by-Default Templates

The final layer provides starting points that bake in security from the foundation. Rather than asking the AI to scaffold a new Express API from scratch, you provide a template that already includes:

  • Helmet.js with strict CSP
  • Rate limiting on all routes
  • Input validation with a schema library
  • Structured logging with PII redaction
  • Health check endpoints with auth
  • Docker configuration with non-root user

The AI builds on this foundation rather than starting from first principles. This shifts the security burden from review (finding problems in generated code) to maintenance (keeping the template current).

A minimal secure Express template:

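import express from 'express';
import helmet from 'helmet';
import rateLimit from 'express-rate-limit';
// Schema-validation middleware, applied per route as features are added.
import { validateRequest } from './middleware/validation.js';
import { logger } from './utils/logger.js';

const app = express();

// Security headers, with a CSP that only allows same-origin resources.
app.use(helmet({
  contentSecurityPolicy: {
    directives: {
      defaultSrc: ["'self'"],
      scriptSrc: ["'self'"],
    },
  },
}));

// At most 100 requests per client in each 15-minute window.
app.use(rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
}));

// Cap JSON bodies to limit oversized-payload abuse.
app.use(express.json({ limit: '10kb' }));
// Structured request log; the logger is expected to handle PII redaction.
app.use((req, res, next) => {
  logger.info({ method: req.method, path: req.path, ip: req.ip });
  next();
});

export default app;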

Developers clone this template, then ask the AI to add feature-specific routes and logic. The security baseline persists.

Measuring Effectiveness

The Thoughtworks implementation tracked three metrics:

  1. Security findings per 1000 lines of AI-generated code (measured via static analysis)
  2. Time from vulnerability disclosure to context file update
  3. Percentage of AI permission requests that required manual override

Over six months, findings per 1000 lines dropped from 12 to 3, a 75% reduction. Update latency improved from 5 days to same-day for critical issues. Permission overrides stabilized at 8%, suggesting that, with the context files and review policy in place, the AI’s requests had largely converged on acceptable patterns.
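For teams that want to track the first metric themselves, the arithmetic is simple. The sketch below uses made-up finding and line counts, chosen only so that they reproduce the 12 and 3 quoted above; the function names are hypothetical.

```javascript
// Findings normalized to a rate per 1000 lines of code.
function findingsPerThousandLines(findingCount, lineCount) {
  if (lineCount <= 0) throw new RangeError('lineCount must be positive');
  return (findingCount * 1000) / lineCount;
}

// Fractional drop from one rate to another (0.5 means the rate halved).
function relativeReduction(before, after) {
  if (before <= 0) throw new RangeError('before must be positive');
  return (before - after) / before;
}

// Made-up counts: 240 findings in 20,000 lines, then 75 findings in 25,000 lines.
const rateBefore = findingsPerThousandLines(240, 20000);
const rateAfter = findingsPerThousandLines(75, 25000);

console.log(rateBefore, rateAfter, relativeReduction(rateBefore, rateAfter));
```

Normalizing by line count matters because AI-assisted teams tend to produce more code over time, so a raw count of findings can rise even while the rate falls.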

The approach trades some velocity for systematic risk reduction. Initial setup requires defining the security context, creating templates, and establishing review criteria. Ongoing maintenance requires keeping the context current and reviewing metrics.

The alternative is unbounded security debt that compounds with each AI-assisted feature, eventually requiring expensive remediation or, worse, materialized breaches. The guardrails constrain the blast radius of any single insecure suggestion and create organizational learning that persists beyond individual pull requests.
