Prompt Text or Running Process: The Real Trade-offs Between MCP and Claude Code Skills

David’s post at david.coffee arguing for MCP over Claude Code’s skills system attracted 353 points and 298 comments on Hacker News, which reflects genuine disagreement in the community about the right abstraction for connecting AI assistants to external capabilities. The debate surfaces a real architectural choice with consequences for portability, maintainability, and what your tools can actually do at runtime.

What Each Approach Actually Is

MCP, the Model Context Protocol, is an open protocol published by Anthropic in late 2024. The wire format is JSON-RPC 2.0, carried over stdio or HTTP (with Server-Sent Events in the original specification and Streamable HTTP in later revisions). An MCP server is a separate process that exposes three primitives: tools (callable functions), resources (readable data), and prompts (reusable templates). When a compatible client like Claude Desktop, Claude Code, or Cursor starts a session, it performs an initialize handshake and then calls tools/list to discover capabilities.
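
As a rough illustration of that handshake, the sketch below builds the messages a client would send at session start. The helper names, the client name, and the version string are placeholders chosen for this example, not taken from any particular client.

```python
import json


def make_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request as a plain dict."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def make_notification(method):
    """Build a JSON-RPC 2.0 notification (no id, so no response is expected)."""
    return {"jsonrpc": "2.0", "method": method}


# Assumed values: the clientInfo fields are invented for this sketch.
initialize = make_request(1, "initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.1.0"},
})
initialized = make_notification("notifications/initialized")
list_tools = make_request(2, "tools/list")

# Over the stdio transport, each message travels as one line of JSON.
for message in (initialize, initialized, list_tools):
    print(json.dumps(message))
```

A real client would wait for the server's initialize response before sending the notification and the tools/list request; the loop here only shows the shape of the messages.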

Each tool definition includes a formal JSON Schema describing its inputs:

{
  "name": "schedule_message",
  "description": "Schedule a message to be sent at a future time",
  "inputSchema": {
    "type": "object",
    "properties": {
      "channel": { "type": "string" },
      "content": { "type": "string" },
      "cron": { "type": "string", "description": "Cron expression for timing" }
    },
    "required": ["channel", "content", "cron"]
  }
}

When the model calls a tool, it sends a tools/call request; the server executes whatever logic it wants and returns a content array injected back into the conversation. The server can be written in any language that can emit JSON over a stream, which makes MCP a genuinely language-agnostic standard.
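
To make the request and response shapes concrete, here is a minimal sketch of the server side of that exchange. It is not an SDK-based implementation: handle_request and the schedule_message stub are hypothetical names, and a real server would also handle initialize, notifications, and transport framing.

```python
TOOLS = [{
    "name": "schedule_message",
    "description": "Schedule a message to be sent at a future time",
    "inputSchema": {
        "type": "object",
        "properties": {
            "channel": {"type": "string"},
            "content": {"type": "string"},
            "cron": {"type": "string"},
        },
        "required": ["channel", "content", "cron"],
    },
}]


def schedule_message(arguments):
    # Hypothetical stand-in for real scheduling logic.
    return f"Scheduled message for {arguments['channel']} at '{arguments['cron']}'"


def handle_request(request):
    """Dispatch one JSON-RPC request and return the response dict."""
    request_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "tools/call":
        params = request.get("params", {})
        if params.get("name") != "schedule_message":
            return {"jsonrpc": "2.0", "id": request_id,
                    "error": {"code": -32602, "message": "Unknown tool"}}
        text = schedule_message(params.get("arguments", {}))
        # The content array is what the client feeds back to the model.
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return {"jsonrpc": "2.0", "id": request_id,
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request_id, "result": result}


response = handle_request({
    "jsonrpc": "2.0", "id": 3, "method": "tools/call",
    "params": {"name": "schedule_message",
               "arguments": {"channel": "general", "content": "standup", "cron": "0 9 * * 1"}},
})
print(response["result"]["content"][0]["text"])
```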

Claude Code’s skills are different in a fundamental way: they are markdown files, not running processes. A skill lives in .claude/commands/ as a .md file that describes, in natural language, what the model should do when that slash command is invoked. The model reads the instructions and uses its own built-in tools to carry out the task. There is no JSON-RPC, no separate process, no schema validation. The “tool” is the text itself.

A skill that handles code review might look like:

Review the current diff for correctness, style issues, and potential bugs.
Focus on logic errors first, then naming and structure. Output a bulleted list
of issues grouped by severity.

That is the entire implementation. The model executes the task using its own capabilities, drawing on whatever tools the host client provides.

The Portability Gap

The portability difference between the two approaches is significant and under-discussed. An MCP server works with any MCP-compatible client: Claude Desktop, Claude Code, Cursor, Zed, and any other tool that has adopted the protocol. The ecosystem has grown rapidly since the protocol’s publication; writing a server once and having it available everywhere is a genuine force multiplier.

Skills are a Claude Code feature. They do not transfer to Claude Desktop, they do not work in Cursor, and they have no equivalent outside Anthropic’s own tooling. If you switch clients, or build something that spans multiple clients, you start over. For teams that have fully standardized on Claude Code, this constraint may not matter today. For anyone building integrations meant to outlive their current tool choice, it matters considerably.

The MCP specification is open and permissively licensed, which has encouraged third-party client adoption at a pace that a proprietary format rarely achieves. That ecosystem momentum is part of what David’s argument rests on, and it is a reasonable foundation.

Schemas, Validation, and Failure Modes

The formal JSON Schema in MCP tool definitions is not just documentation. The model sees it as the contract for the call, and the client or the server’s SDK can validate arguments against it, so a malformed call can be rejected at a well-defined boundary before any tool logic runs. With skills, the model’s interpretation of natural language instructions is the only contract you have. This works well in practice, until it does not, and when it does not, you have no schema to point to, no validation layer to inspect, and no structured error to catch.
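
The kind of check a schema makes possible can be sketched in a few lines. This is a deliberately tiny, hand-rolled validator that covers only the required and type keywords, assumed here for illustration; a real server would more likely rely on a full JSON Schema library or its SDK.

```python
SCHEMA = {
    "type": "object",
    "properties": {
        "channel": {"type": "string"},
        "content": {"type": "string"},
        "cron": {"type": "string"},
    },
    "required": ["channel", "content", "cron"],
}

# Only a small subset of JSON Schema types is mapped in this sketch.
JSON_TYPES = {"string": str, "boolean": bool, "object": dict}


def validate_arguments(arguments, schema):
    """Return a list of violations; an empty list means the call is acceptable."""
    if not isinstance(arguments, dict):
        return ["arguments must be an object"]
    errors = []
    for name in schema.get("required", []):
        if name not in arguments:
            errors.append(f"missing required property: {name}")
    for name, value in arguments.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            continue  # unknown properties are ignored in this sketch
        expected = JSON_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            errors.append(f"{name} must be of type {spec['type']}")
    return errors


print(validate_arguments({"channel": 42, "content": "hi"}, SCHEMA))
```

The point is less the validator itself than the fact that the violation is a structured value you can log, test, and return to the model.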

The failure mode for a skill is a model that interprets your instructions differently than intended and does something plausible but wrong. The failure mode for an MCP tool is a JSON Schema violation, which is at least debuggable and catchable at a well-defined boundary.

I run an MCP server for my Discord bot that exposes tools for scheduling messages, managing a kanban board, and querying analytics. Having tool definitions in code, with typed inputs and explicit error handling, means I can write tests against the server in isolation, inspect JSON-RPC traffic with standard tooling, and trust that if the model passes an invalid cron field to schedule_message, the server rejects it before any side effect occurs. That kind of control requires real code running on the other side of the protocol boundary.
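
A sketch of what "rejects it before any side effect occurs" can look like, under stated assumptions: the cron check below is a rough five-field heuristic rather than a real cron parser, and SCHEDULED is a stand-in list for whatever the actual scheduler does. None of this is the code from my bot.

```python
SCHEDULED = []  # stand-in for the real side effect (e.g. a job queue)

CRON_CHARS = set("0123456789*/,-")


def looks_like_cron(expression):
    """Rough heuristic: five whitespace-separated fields of cron characters."""
    if not isinstance(expression, str):
        return False
    fields = expression.split()
    return len(fields) == 5 and all(set(field) <= CRON_CHARS for field in fields)


def call_schedule_message(arguments):
    """Validate first, then perform the side effect; return an MCP-style result."""
    if not looks_like_cron(arguments.get("cron")):
        return {"content": [{"type": "text", "text": "Invalid cron expression"}],
                "isError": True}
    SCHEDULED.append(dict(arguments))
    return {"content": [{"type": "text", "text": "Message scheduled"}],
            "isError": False}


# Because this is ordinary code, it can be tested without a model in the loop.
rejected = call_schedule_message({"channel": "general", "content": "hi", "cron": "every monday"})
accepted = call_schedule_message({"channel": "general", "content": "hi", "cron": "0 9 * * 1"})
print(rejected["isError"], accepted["isError"], len(SCHEDULED))
```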

Composition and Statefulness

MCP enables server-side composition that skills cannot replicate. Skills are stateless by design; each invocation is a fresh prompt injection that carries no state of its own over from earlier invocations. An MCP server is a running process. It can maintain a connection pool, cache expensive lookups, hold session-scoped state, and coordinate between multiple tool calls in ways that the model itself does not need to orchestrate.
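
As a small illustration of process-held state, the sketch below caches an expensive lookup across tool calls. AnalyticsTool and slow_fetch are hypothetical names invented for this example; the only claim is that a long-lived process can keep a dictionary around between calls, which a markdown file cannot.

```python
class AnalyticsTool:
    """Caches results in memory for as long as the server process lives."""

    def __init__(self, fetch):
        self._fetch = fetch   # the expensive lookup, injected for testability
        self._cache = {}

    def query(self, metric):
        if metric not in self._cache:
            self._cache[metric] = self._fetch(metric)
        return self._cache[metric]


fetch_calls = []


def slow_fetch(metric):
    # Stand-in for a database or API round trip.
    fetch_calls.append(metric)
    return {"metric": metric, "value": len(metric)}


tool = AnalyticsTool(slow_fetch)
first = tool.query("active_users")
second = tool.query("active_users")  # served from the cache, no second fetch
print(len(fetch_calls))
```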

For read-only, short-lived tasks this distinction rarely matters. For anything involving multiple steps with shared intermediate state, a live database connection, or coordination across sequential tool calls, the difference between a running process and a markdown file is the difference between a capable system and an elaborate workaround.

Server-side logic also means you can enforce invariants the model cannot. A skills-based workflow that tries to prevent duplicate operations relies on the model remembering context; an MCP server can hold a simple set of in-progress job IDs and reject duplicates unconditionally, regardless of what the model thinks it remembers.
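
The duplicate-rejection idea fits in a handful of lines. This is a minimal single-process sketch with a hypothetical JobRegistry class; a multi-threaded or multi-instance server would need a lock or shared storage on top of it.

```python
class JobRegistry:
    """Tracks in-progress job IDs and refuses to start the same job twice."""

    def __init__(self):
        self._in_progress = set()

    def start(self, job_id):
        """Return True if the job was registered, False if it is a duplicate."""
        if job_id in self._in_progress:
            return False
        self._in_progress.add(job_id)
        return True

    def finish(self, job_id):
        """Mark the job as done so the same ID may be started again later."""
        self._in_progress.discard(job_id)


registry = JobRegistry()
results = [registry.start("export-42"), registry.start("export-42")]
registry.finish("export-42")
results.append(registry.start("export-42"))
print(results)
```

The second start call is rejected no matter what the model believes it has or has not already done; the invariant lives in the process, not in the conversation.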

When Skills Are the Right Tool

Skills are not wrong for every job. For prompt augmentation, they are the simpler and more maintainable choice. A skill that injects project conventions into every prompt, enforces an output format, or reminds the model how your repo uses conventional commits is the right level of complexity for that task. There is no server to run, no process to keep alive, no JSON-RPC handshake to debug. The total cost of ownership for a well-scoped skill file is genuinely low.

The HN thread on David’s post surfaced this divide clearly: commenters satisfied with skills tend to be doing lighter automation work, while those who have run into limits are building integrations where the absence of a real execution layer shows up as a concrete, recurring problem. Both groups are right about their own use cases.

The mistake is not choosing skills over MCP; it is choosing skills for problems that have outgrown them and then patching around the limitations rather than replacing the foundation.

The Underlying Trade-off

Skills put complexity into prompt text, which is fast to write, easy to change, and invisible to conventional tooling like linters, type checkers, and test runners. MCP puts complexity into code, which carries the full weight of a software engineering problem and also the full set of available solutions: type safety, testing, versioning, portability, and real execution semantics.

For prompt templates and lightweight automation, skills are a reasonable choice. For anything that requires server-side logic, shared state, cross-client portability, or formal input validation, MCP is the more capable foundation. The cost of running a sidecar process is usually smaller than it appears at the start of a project, and the benefits compound over time in ways that are hard to see until you have been maintaining a complex skill for six months and wishing you had a test suite.