Starlette 1.0 and the Case for a Lean ASGI Foundation in Claude Integrations
Source: simonwillison
Starlette is one of those frameworks most Python developers have depended on without noticing. It is the foundation FastAPI is built on; most developers who use FastAPI have never written a Starlette route directly. Simon Willison’s recent experiments building Claude skills with Starlette 1.0 revisit the framework on its own terms, and they are worth taking seriously as a model for a category of web server that is becoming more common: lightweight HTTP backends for LLM tool use.
What Starlette Is and Where It Came From
Tom Christie built Starlette in 2018 as ASGI was becoming a viable Python server standard. ASGI (Asynchronous Server Gateway Interface) emerged from the Django Channels project, which needed a generalized way to handle WebSockets and long-lived connections alongside traditional HTTP. Christie wrote both Starlette and Uvicorn to make that standard usable. Together they gave Python a production-grade async web stack.
The framework’s core surface area is deliberate and constrained: routing with path parameters and route groups, middleware, request and response types, background tasks, WebSocket support, static file serving, and a test client built on httpx. There is no ORM, no admin interface, no migration tooling, and no built-in dependency injection. Where Django and Flask gave you a complete application framework, Starlette gave you HTTP infrastructure. FastAPI then layered Pydantic validation and OpenAPI generation on top of that infrastructure, and that combination is what made async Python web development click for a wide audience. The majority of people who reached for async Python web development over the last several years were FastAPI users who never needed to think about Starlette directly.
What Version 1.0 Actually Means
Pre-1.0 software carries an implicit cost that is easy to underestimate. Teams building on pre-1.0 frameworks accept that APIs may change without deprecation cycles. Minor version bumps can break code. The upgrade cost is unpredictable and often deferred, which means applications accumulate version debt until some external pressure forces the issue.
Starlette reaching 1.0 removes that cost. It signals a commitment to stability: a defined public API, backward compatibility guarantees, and deprecation cycles before removal. For a framework that has been in production use at significant scale, mostly inside FastAPI, for years, the 1.0 signal is less about technical readiness and more about the relationship between the project and its users.
The lead-up to 1.0 involved cleaning up accumulated API surface. Several deprecated patterns from Starlette’s earlier years were removed. The middleware interface was tightened. Routing edge cases that had been left ambiguous were resolved. The result is a cleaner API than what you would find in documentation from a few years ago, and one you can build on without worrying about version churn.
Claude Skills as HTTP Endpoints
The term “Claude skills” refers to tool definitions you provide when calling Claude through the API. Claude’s tool use mechanism lets you declare functions, services, or data sources that the model can invoke during a response. You describe each tool with a JSON schema specifying its name, description, and parameters. Claude reasons about when to call a tool, formats the call according to your schema, and your code handles execution.
For tools that run in the same process as your application, no web framework is needed. But when you want tools that run as independent services, or that need to be accessible to multiple Claude integrations, or that expose existing internal APIs under a Claude-compatible interface, you need an HTTP server. This is the space that Anthropic’s Model Context Protocol (MCP) was designed to standardize. An MCP server is an HTTP server that speaks a defined JSON protocol: requests describe tool calls, responses carry results.
A minimal Starlette server that works as a skill endpoint looks like this:
from starlette.applications import Starlette
from starlette.requests import Request
from starlette.responses import JSONResponse
from starlette.routing import Route
async def call_tool(request: Request) -> JSONResponse:
body = await request.json()
tool_name = body.get("name")
parameters = body.get("parameters", {})
if tool_name == "search":
results = await search_index(parameters["query"], limit=parameters.get("limit", 5))
return JSONResponse({"results": results})
if tool_name == "read_file":
content = await read_from_storage(parameters["path"])
return JSONResponse({"content": content})
return JSONResponse({"error": f"Unknown tool: {tool_name}"}, status_code=404)
app = Starlette(routes=[
Route("/tools/call", call_tool, methods=["POST"]),
])
The HTTP layer adds almost nothing to the invocation. The logic lives in search_index and read_from_storage, and Starlette routes requests to it without ceremony.
Why Not FastAPI Here
FastAPI would work. It handles async Python routes, it has mature middleware support, and its dependency injection system is genuinely useful for larger applications. Most teams reaching for async Python HTTP will grab FastAPI first, and for good reason.
But FastAPI brings weight in both directions. It pulls in Pydantic as a core dependency and builds the endpoint interface around type-annotated function signatures that Pydantic validates. It generates OpenAPI documentation from those signatures. For a public-facing API where you want that documentation and where input validation at the boundary matters, that weight pays off clearly.
For a skill server, the case is weaker. The JSON protocol is already defined by MCP or your Claude integration layer. Schema enforcement happens on Claude’s side: you declared the tool schema when you called the API, and Claude formats calls according to that schema. Pydantic validation at your server is either redundant or solving a problem in the wrong place. OpenAPI generation for an internal tool server is similarly unnecessary. The documentation Claude needs is the tool description you passed to the API.
Starlette gives you the ASGI foundation without the opinionated additions. For a skill server, that is the right trade.
Middleware and Authentication
Skill servers exposed over HTTP need authentication. You do not want arbitrary callers triggering tool execution. Starlette’s middleware model handles this with minimal setup:
from starlette.applications import Starlette
from starlette.middleware import Middleware
from starlette.middleware.base import BaseHTTPMiddleware
import os
SKILL_TOKEN = os.environ["SKILL_TOKEN"]
class BearerAuthMiddleware(BaseHTTPMiddleware):
async def dispatch(self, request, call_next):
auth = request.headers.get("Authorization", "")
if not auth.startswith("Bearer ") or auth[7:] != SKILL_TOKEN:
return JSONResponse({"error": "Unauthorized"}, status_code=401)
return await call_next(request)
app = Starlette(
routes=[Route("/tools/call", call_tool, methods=["POST"])],
middleware=[Middleware(BearerAuthMiddleware)],
)
BaseHTTPMiddleware lets you wrap every request with arbitrary logic before and after the route handler. The declaration is explicit and visible; there is no hidden registration or annotation magic to trace.
Background Tasks for Long-Running Skills
One pattern that comes up regularly in skill servers is triggering work that takes longer than a synchronous HTTP response allows. Claude calls a tool expecting a result, and your server needs to start a longer operation without blocking.
Starlette’s BackgroundTask handles the simple case cleanly:
from starlette.background import BackgroundTask
async def trigger_analysis(request: Request) -> JSONResponse:
body = await request.json()
repo_url = body["parameters"]["repository"]
task = BackgroundTask(run_repository_analysis, url=repo_url)
return JSONResponse(
{"status": "started", "job_id": generate_job_id()},
background=task
)
The response returns immediately with a job identifier; the analysis runs after the response is sent. A separate tool call can poll job status. For operations that need guaranteed delivery or retry semantics, you would replace the background task with a proper queue, but BackgroundTask covers the common case without additional dependencies.
Testing Skill Endpoints
Starlette ships a test client built on httpx that supports both synchronous and async testing without needing a running server:
from starlette.testclient import TestClient
client = TestClient(app)
def test_search_tool():
response = client.post(
"/tools/call",
json={"name": "search", "parameters": {"query": "starlette routing"}},
headers={"Authorization": f"Bearer {SKILL_TOKEN}"}
)
assert response.status_code == 200
assert "results" in response.json()
def test_unknown_tool_returns_404():
response = client.post(
"/tools/call",
json={"name": "nonexistent", "parameters": {}},
headers={"Authorization": f"Bearer {SKILL_TOKEN}"}
)
assert response.status_code == 404
This is important for skill servers specifically because the tool contract is implicit in your Claude API call, not enforced by a schema validator at the HTTP boundary. Tests are the documentation of what the server accepts.
What Willison’s Experiments Point To
Willison has spent the last several years building quick, exploratory applications at the intersection of web development and LLMs. His datasette project pioneered much of the thinking around exposing structured data over clean HTTP APIs. Starlette’s explicit, low-friction approach fits well with that kind of exploratory development: write the routes you need, run the server, iterate without fighting framework conventions.
Skill servers fit this pattern well. They are small, purpose-built HTTP servers with a specific protocol to speak and specific operations to expose. The 1.0 milestone makes Starlette worth committing to for this use case without the concern that the API will shift during a longer project.
For Python developers building Claude integrations, MCP servers, or anything in the broader category of LLM-adjacent HTTP backends, Starlette 1.0 deserves consideration as a direct choice rather than something inherited from FastAPI. The minimal API surface maps well to the shape of the problem: a thin, stable HTTP layer that routes requests to your logic without adding its own opinions about how that logic should be structured.