· 6 min read ·

The Data Warehouse as Attack Surface: What Snowflake Cortex's Sandbox Escape Reveals

Source: simonwillison

When you layer an LLM onto a data warehouse, you are not just adding a feature. You are introducing an execution environment with a very different threat model than the SQL engine it sits beside. The Snowflake Cortex AI sandbox escape reported by Simon Willison is a concrete demonstration of what that means in practice.

What Snowflake Cortex Actually Is

Cortex is Snowflake’s native AI layer, available directly from SQL without any external service calls. The feature set is broader than most people realize. Cortex LLM functions let you call hosted models (Mistral, Llama 3, Snowflake’s own Arctic) directly from a query:

SELECT SNOWFLAKE.CORTEX.COMPLETE(
  'mistral-large',
  'Summarize the following customer complaint: ' || complaint_text
) AS summary
FROM support_tickets;

Cortex Analyst extends this to natural language to SQL conversion. You describe a question in plain English, Cortex generates SQL against your schema, and that SQL runs. Document AI processes uploaded PDFs and images, extracting structured fields using vision-capable models. Cortex Search handles semantic retrieval over unstructured data.

All of this runs inside Snowflake’s Virtual Warehouse compute clusters. Python workloads go through Snowpark, which executes UDFs and stored procedures in containerized environments on the same infrastructure.

The security perimeter, then, is not a narrow API endpoint. It is a system where LLM outputs feed directly into query execution, document parsing feeds LLM prompts, and Python containers sit adjacent to the data plane.

The Sandbox Problem in AI Execution Environments

Cloud data platforms have spent years hardening their SQL execution layers. The threat model is well understood: users write queries, queries access data they’re authorized to see, and compute resources are metered and isolated per account. That model works because SQL, while powerful, has a constrained execution surface. You can read, write, aggregate, join. You cannot make outbound HTTP calls from a vanilla SELECT.

LLMs break this assumption in two ways.

First, they introduce generative code execution. Cortex Analyst generates SQL that then runs. If an attacker can influence what the LLM generates, they can influence what executes. This is the indirect prompt injection attack pattern: attacker-controlled data flows into an LLM prompt, the LLM interprets embedded instructions in that data, and the resulting output triggers unintended actions.

Second, models with tool-calling capabilities can invoke side effects that pure SQL cannot. Cortex Search and Document AI are not passive retrieval systems. They interact with Snowflake Stages, make network calls, and in agentic configurations can chain multiple operations. Each of those capabilities is a potential pivot point.

Sandbox escapes in this context fall into two categories. The first is functional: an LLM generates code that stays within the intended execution environment but does something the operator did not intend, such as exfiltrating data to an external service or escalating to a more privileged role. The second is literal: code running inside a container breaks out of that isolation and gains access to the host or adjacent infrastructure. The Snowflake incident appears to involve real malware execution, which points toward the latter or a combination of both.

Why Container Isolation Is Harder Than It Looks

Snowpark Python runs in containers, and Snowflake has published details about how those containers are isolated. But container isolation is not the same as VM isolation, and the history of container escapes is long. The runc CVE-2019-5736 vulnerability, for example, allowed a malicious container to overwrite the host runc binary and gain root on the host. CVE-2022-0847 (Dirty Pipe) demonstrated that kernel vulnerabilities can bypass container boundaries entirely.

In a multi-tenant cloud environment, the blast radius of a container escape is not limited to one customer’s data. The surrounding infrastructure, credentials cached in the execution environment, network adjacency to other services, all of these become reachable if container isolation fails.

Snowflake’s architecture specifically routes all customer Python through containerized Snowpark workers. If Cortex AI can generate Python code that gets executed through that path, and if that Python exploits a vulnerability in the container runtime or underlying kernel, the sandboxing that Snowflake relies on stops working.

The Prompt Injection Entry Point

The most likely attack chain here runs through Document AI or Cortex Analyst processing untrusted content. Consider an enterprise Snowflake deployment where:

  1. Vendor invoices are uploaded to a Snowflake Stage and processed by Document AI
  2. A malicious vendor embeds adversarial text in a PDF invoice: instructions telling the model to generate and execute specific Python code when the document is processed
  3. The Document AI extraction pipeline passes this content to an LLM that has tool access
  4. The LLM follows the injected instructions, generating a Snowpark Python UDF or stored procedure that contains malicious code
  5. That code runs inside the Snowpark container

This is not speculative. Researchers including Johann Rehberger have demonstrated this exact class of attack against AI-integrated productivity tools repeatedly since 2023. The Cortex case differs in that the execution environment has direct access to a data warehouse containing potentially sensitive enterprise data, and the Snowpark container is a more capable execution environment than, say, a browser extension.

The Confused Deputy in the Data Plane

There is a deeper architectural issue here. Cortex AI runs with a service identity that has permissions within the Snowflake account. When Cortex Analyst generates SQL and executes it, that SQL runs as a specific role. When Document AI reads from a Stage, it uses a specific credential. When Snowpark Python executes, it inherits the permissions of the invoking context.

This is the confused deputy problem applied to AI agents. The LLM component acts as an intermediary between the user and the data plane. If that intermediary can be manipulated, the attacker gains access to whatever permissions the LLM is acting with, not the permissions of the original requester.

In enterprise Snowflake deployments, service roles often have broad permissions because they need to query across many schemas. A confused deputy attack against Cortex AI is not just about getting code to run; it is about getting code to run with warehouse-level data access.

How This Compares to Prior Cloud AI Incidents

This is not the first cloud AI security incident of this kind. In 2024, Wiz Research discovered multiple server-side request forgery and sandbox escape vulnerabilities across several major cloud AI services, several of which involved tenant isolation failures in model inference infrastructure.

What distinguishes the Snowflake case is the target. Most prior AI service escapes affected the model serving infrastructure itself, infrastructure that is somewhat isolated from customer data. Cortex AI is explicitly designed to operate inside the data warehouse, with full access to the tables, stages, and query engine. An escape here has immediate access to production data rather than just the compute layer.

The 2024 Snowflake credential breach, which affected hundreds of enterprise customers through stolen credentials and lacked MFA enforcement, demonstrated that Snowflake’s customer base holds extremely sensitive data. That breach was credential-based. A sandbox escape through the AI layer is a different class of attack, one that does not require credential theft and targets the service itself.

What This Requires From Snowflake and Its Users

The immediate response for most organizations should be an audit of which Cortex features are enabled and what data they touch. Cortex LLM functions used purely for text transformation on data that never includes user-controlled free-form input are substantially lower risk than Document AI pipelines that process files from external sources.

For Snowflake specifically, the architectural requirements are straightforward to describe and hard to implement: the LLM components need to run with least-privilege service identities, tool-calling capabilities need explicit user confirmation before execution in high-privilege contexts, and the boundary between LLM inference and query execution needs to be treated as an untrusted interface with input validation in both directions.

The deeper problem is that the product direction for Cortex, and for every cloud AI layer in every major data platform, is toward more autonomy, more tool access, and more agentic behavior. Google BigQuery’s Gemini integration, Databricks Mosaic AI, and Microsoft Fabric Copilot are all moving in the same direction. The surface area is growing faster than the security model for it is being developed.

Snowflake’s sandbox escape is a preview of what that trajectory produces when it meets a motivated attacker. The question for every team running AI features inside a data platform is whether their threat model has caught up to their feature adoption.

Was this interesting?