
The Subcommand Architecture Problem That Python CLI Tools Never Fully Solved

Source: lobsters

The discussion around Rust versus Python for CLI tools usually focuses on startup time and binary distribution. Those advantages are real, well-documented, and the reason most developers first consider the switch. A recent post on smiling.dev describes this familiar arc: rewriting a Python CLI in Rust and finding the result better than expected. What that account, and most of the surrounding commentary, tends to underemphasize is the structural argument, the part that explains why tools rewritten in Rust tend to stay there.

The structural argument is about subcommands: how you model a tool with multiple modes of operation, how shared global options flow through to subcommand handlers, and what happens to that code when you add a new subcommand six months later.

Python’s Subcommand Model

Python’s argparse handles subcommands through add_subparsers(). The result is a flat namespace where every parsed argument lives at the same level regardless of which subcommand was invoked:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--verbose", "-v", action="store_true")
parser.add_argument("--config", default="config.toml")

subparsers = parser.add_subparsers(dest="command")

run_parser = subparsers.add_parser("run")
run_parser.add_argument("--workers", type=int, default=4)
run_parser.add_argument("input_file")

export_parser = subparsers.add_parser("export")
export_parser.add_argument("--format", choices=["json", "csv"])
export_parser.add_argument("output_dir")

args = parser.parse_args()

if args.command == "run":
    run(args)
elif args.command == "export":
    export(args)
else:
    parser.print_help()

This is workable code, but it carries structural problems that compound as the tool grows. The args namespace is flat: global options such as args.verbose and args.config sit beside subcommand options such as args.workers and args.format in a single object, and only the attributes belonging to the invoked subcommand are actually set. Access args.workers from within the export branch and you get an AttributeError at runtime. The relationship between args.command == "run" and the presence of args.workers is implicit, held together by programmer discipline rather than anything the interpreter can verify.
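A minimal sketch of that failure, reusing the parser definition above with an explicit argument list instead of sys.argv. The build_parser helper and the "tool" program name are introduced here only for illustration:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Same schema as the example above, wrapped in a helper (hypothetical name).
    parser = argparse.ArgumentParser(prog="tool")
    parser.add_argument("--verbose", "-v", action="store_true")
    parser.add_argument("--config", default="config.toml")

    subparsers = parser.add_subparsers(dest="command")

    run_parser = subparsers.add_parser("run")
    run_parser.add_argument("--workers", type=int, default=4)
    run_parser.add_argument("input_file")

    export_parser = subparsers.add_parser("export")
    export_parser.add_argument("--format", choices=["json", "csv"])
    export_parser.add_argument("output_dir")
    return parser


# Invoke the "export" subcommand; the "run" subparser never executes.
args = build_parser().parse_args(["export", "out"])

# Only global options and export's own options were set on the namespace.
print(sorted(vars(args)))

try:
    args.workers  # belongs to "run", so it was never set
except AttributeError as exc:
    print("AttributeError:", exc)
```

Nothing flags the bad access before the program runs; it surfaces only when the export path actually executes that line.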

click separates subcommands into functions, which is cleaner, but shared state flows through ctx.obj, a plain dictionary:

import click

@click.group()
@click.option("--verbose", is_flag=True)
@click.option("--config", default="config.toml")
@click.pass_context
def cli(ctx, verbose, config):
    ctx.ensure_object(dict)
    ctx.obj["verbose"] = verbose
    ctx.obj["config"] = config

@cli.command()
@click.option("--workers", default=4)
@click.argument("input_file")
@click.pass_context
def run(ctx, workers, input_file):
    verbose = ctx.obj["verbose"]
    ...

@cli.command()
@click.option("--format", type=click.Choice(["json", "csv"]))
@click.argument("output_dir")
@click.pass_context
def export(ctx, format, output_dir):
    verbose = ctx.obj["verbose"]
    ...

The separation is better. Each subcommand is a distinct function. But ctx.obj carries no type information. If you add a new shared option at the root level, you must remember to populate it in the group handler and reference it in every subcommand that needs it. Add a third subcommand later and forget to read the new shared flag, and the code runs, the tests pass unless you wrote a specific case for it, and the bug reaches users.
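The populate-and-read contract can be sketched without click at all, using a plain dictionary as a stand-in for ctx.obj. The function names and the dry_run key below are hypothetical, and this covers only the "forgot to populate" half of the problem; forgetting to read a flag fails even more quietly, because nothing raises at all:

```python
from typing import Any, Dict


def group_handler(verbose: bool, config: str) -> Dict[str, Any]:
    """Stand-in for the group callback: fills the shared dictionary."""
    obj: Dict[str, Any] = {}
    obj["verbose"] = verbose
    obj["config"] = config
    # A newer shared option, "dry_run", was added to the CLI,
    # but nobody stored it here.
    return obj


def run(obj: Dict[str, Any]) -> str:
    return f"run verbose={obj['verbose']}"


def inspect(obj: Dict[str, Any]) -> str:
    # Written later, on the assumption that the group handler stores "dry_run".
    return f"inspect dry_run={obj['dry_run']}"


shared = group_handler(verbose=True, config="config.toml")
print(run(shared))  # existing subcommands are unaffected

try:
    inspect(shared)
except KeyError as exc:
    # The mismatch is discovered only when this subcommand is invoked.
    print("KeyError:", exc)
```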

Typer narrows the ergonomics gap further by using Python type annotations to drive click’s machinery. It is the best Python has to offer for this problem. But the annotations only drive parsing and conversion at the command-line boundary; the Python interpreter does not enforce them anywhere else in the program. The schema remains advisory.
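What "advisory" means can be shown in plain Python, with no Typer involved. In this small sketch the run function and its arguments are illustrative; the point is only that a call that contradicts the annotation still executes:

```python
def run(workers: int, input_file: str) -> str:
    # The annotation says int, but nothing checks it when the function is called.
    return f"{input_file}: {workers!r} workers ({type(workers).__name__})"


# A string is passed where the annotation promises an int; the call succeeds.
print(run("4", "data.csv"))
```

A static checker such as mypy or pyright would flag that call, but running one is a separate, optional step rather than part of executing the program.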

How Rust Models This

clap’s derive API represents subcommands as an enum. Each variant is a struct containing only the arguments for that specific subcommand. Shared global arguments live in a parent struct:

use clap::{Parser, Subcommand};
use std::path::PathBuf;

#[derive(Parser)]
#[command(version, about)]
struct Args {
    #[arg(short, long)]
    verbose: bool,

    #[arg(long, default_value = "config.toml")]
    config: PathBuf,

    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    Run {
        #[arg(short, long, default_value_t = 4)]
        workers: usize,
        input_file: PathBuf,
    },
    Export {
        #[arg(short, long, value_enum)]
        format: ExportFormat,
        output_dir: PathBuf,
    },
}

#[derive(clap::ValueEnum, Clone)]
enum ExportFormat {
    Json,
    Csv,
}

The entry point dispatches on the enum variant:

fn main() {
    let args = Args::parse();

    match args.command {
        Command::Run { workers, input_file } => {
            run(args.verbose, &args.config, workers, input_file);
        }
        Command::Export { format, output_dir } => {
            export(args.verbose, &args.config, format, output_dir);
        }
    }
}

The compiler guarantees that workers is not accessible from the Export branch. There is no args.workers to reach for; the workers field exists only within the Run variant. Shared global options at the Args level are always present regardless of subcommand. The layout of the type hierarchy directly maps to the structure of the CLI.
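For readers more at home in Python, the same shape can be approximated with one dataclass per subcommand and a union type. This is a sketch with hypothetical names, not something argparse, click, or Typer generates for you, and the guarantees differ: a static checker can verify the dispatch below, but the interpreter itself will not.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import NoReturn, Union


@dataclass(frozen=True)
class Run:
    workers: int
    input_file: Path


@dataclass(frozen=True)
class Export:
    format: str
    output_dir: Path


# The closest Python analogue of the Command enum: a union of variant classes.
Command = Union[Run, Export]


@dataclass(frozen=True)
class Args:
    verbose: bool
    config: Path
    command: Command


def unreachable(value: NoReturn) -> NoReturn:
    # A type checker reports an error here if some variant is left unhandled;
    # at runtime this only fires when the unhandled path actually executes.
    raise AssertionError(f"unhandled command: {value!r}")


def dispatch(args: Args) -> str:
    cmd = args.command
    if isinstance(cmd, Run):
        # Only Run's fields are visible on cmd in this branch.
        return f"run {cmd.input_file} with {cmd.workers} workers"
    if isinstance(cmd, Export):
        return f"export {cmd.format} to {cmd.output_dir}"
    unreachable(cmd)


print(dispatch(Args(False, Path("config.toml"), Run(4, Path("data.csv")))))
```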

Exhaustive Matching as a Maintenance Property

The practical benefit appears when you add a new subcommand to a tool that already exists. In Python, you add a new elif args.command == "inspect": branch. Every other dispatch site continues to run unchanged whether or not you updated it. If you have error-reporting logic that should run across all subcommands and you forget to wire it up for inspect, you find out when someone files a bug.
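A stripped-down sketch of that situation, with two dispatch sites keyed on the command name. Both functions and their return values are hypothetical; the second one stands in for the error-reporting logic that was never updated:

```python
from typing import Optional


def dispatch(command: str) -> str:
    if command == "run":
        return "running"
    elif command == "export":
        return "exporting"
    elif command == "inspect":  # the newly added branch
        return "inspecting"
    raise SystemExit(f"unknown command: {command}")


def failure_label(command: str) -> Optional[str]:
    # An older dispatch site that nobody revisited when "inspect" was added.
    if command == "run":
        return "run-failed"
    elif command == "export":
        return "export-failed"
    return None  # "inspect" silently lands here


# The new subcommand works, and the stale site raises no error at all.
print(dispatch("inspect"), failure_label("inspect"))
```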

In Rust, adding a new variant to the Command enum produces a compile error at every match expression that does not handle the new variant, provided the match has no wildcard (_) arm to absorb it. The compiler identifies exactly where the new case needs handling. This is not a runtime test or a linter rule; it is a build failure. You cannot ship the new subcommand without explicitly deciding what to do at every exhaustive dispatch site.

For a tool with five subcommands and multiple code paths that dispatch on which command ran, this property eliminates an entire category of bugs. Those bugs are not exotic edge cases; they are the predictable consequence of adding code without exhaustively auditing existing dispatch sites. Tests can catch them, but only if the test coverage was written specifically for that scenario.

The Invalid State Argument

A related benefit is that Rust makes certain invalid program states structurally unrepresentable. Consider a --dry-run flag that should suppress writes across all subcommands. In Python, this flag exists in the flat args namespace or in ctx.obj, and every handler that performs writes must remember to check it. Forget to check it in one handler and the behavior is silently wrong.

In Rust, you can make the flag an explicit, typed part of what every handler receives by passing a context struct:

use std::path::PathBuf;

struct RunContext {
    verbose: bool,
    config: PathBuf,
    dry_run: bool,
}

fn run(ctx: &RunContext, workers: usize, input_file: PathBuf) {
    if !ctx.dry_run {
        // perform writes
    }
}

The RunContext struct makes explicit what a handler receives. New handlers get the same struct. There is no ambient namespace to consult, no dictionary to forget to populate. The handler still has to check dry_run itself, so whether this specific pattern is superior to Python’s equivalent depends on the tool. The point is that the options are different: Rust lets you encode invariants in types that the compiler enforces, while Python asks you to enforce them through test coverage and code review.
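One way to stop relying on each handler remembering the check is to decide once, at startup, which kind of writer the handlers receive. This is a different technique from the context struct above, sketched here in Python with hypothetical names (Writer, RealWriter, DryRunWriter, make_writer, and the report.json file name are all assumptions for illustration):

```python
from pathlib import Path
from typing import List, Protocol


class Writer(Protocol):
    def write_text(self, path: Path, text: str) -> None: ...


class RealWriter:
    def write_text(self, path: Path, text: str) -> None:
        path.write_text(text)


class DryRunWriter:
    """Records what would have been written instead of touching the disk."""

    def __init__(self) -> None:
        self.planned: List[Path] = []

    def write_text(self, path: Path, text: str) -> None:
        self.planned.append(path)


def make_writer(dry_run: bool) -> Writer:
    # The only place in the program that looks at the flag.
    return DryRunWriter() if dry_run else RealWriter()


def export(writer: Writer, output_dir: Path) -> None:
    # The handler never sees dry_run, so it cannot forget to check it.
    writer.write_text(output_dir / "report.json", "{}")


writer = DryRunWriter()
export(writer, Path("out"))
print(writer.planned)
```

The trade-off is that every write has to go through the writer; a handler that calls the filesystem directly bypasses the protection, and in Python nothing but review or a lint rule prevents that.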

Where This Matters in Practice

The scale at which this matters is not as large as you might expect. A tool with two subcommands and a stable argument schema does not accumulate many bugs from implicit dispatch. The Python code is readable and the problem stays tractable.

The crossover tends to come when any of three conditions holds: the tool has four or more subcommands with meaningful argument differences between them; more than one person is responsible for maintaining it; or the argument schema evolves frequently. Under those conditions, the type-checked invariants are doing real work.

ruff, the Python linter written in Rust, has a multi-subcommand CLI covering linting, formatting, rule documentation, server mode, and cache management. The team, which has deep Python expertise, chose Rust for the performance requirements, but the structural properties are visible in how the codebase handles subcommand dispatch. Adding a new subcommand to ruff means the compiler flags every exhaustive pattern match on the command type that has not yet been updated. In a Python linter codebase of similar complexity, that enforcement would fall to code review.

uv, the Rust-based Python package manager from the same team, is another example. Its subcommand surface covers pip operations, virtual environment management, tool installation, Python version management, and project builds. Maintaining consistent behavior for flags like --quiet and --python across all of those subcommands is a coordination problem; Rust’s type system makes the coordination explicit rather than implicit.

What Python Wins

The tradeoffs run both directions. Python’s dynamism is a genuine advantage when you are iterating quickly on a schema that is not yet stable. The edit-and-run cycle in Python is faster than the edit-compile-run cycle in Rust, particularly during the early stage of a tool where the argument structure is changing every day. Rust’s compile times are a real tax, and the borrow checker adds friction that Python does not.

For tools distributed to an audience of Python developers who already have the runtime installed, the distribution argument weakens considerably. For internal tooling where the schema is stable and the audience is controlled, the structural advantages of Rust matter less. Typer in particular, with its annotation-driven schema generation, narrows the ergonomics gap enough that switching to Rust would be hard to justify on ergonomics grounds alone.

The argument for Rust on structural grounds is strongest when the tool is external-facing, maintained by multiple people, and expected to grow. That is the context where the compile-time invariants pay for themselves.

The Pattern Behind the Pattern

The smiling.dev article describes what most people experience when they first ship a Rust CLI: the improvements are larger than expected because the baseline Python tool was carrying costs that were not obvious until they disappeared. Startup time is the measurable one. The structural properties around argument schema enforcement are harder to measure but explain why the tools that move to Rust tend not to move back.

The Rust CLI ecosystem, including ripgrep, fd, and the Astral tools, did not coalesce around Rust because it is fashionable. It coalesced because Rust’s output, both the runtime binary and the development-time compiler feedback, fits what CLI tools actually need: fast startup for frequent invocation, a single artifact for clean distribution, and a type system that keeps the argument schema honest as the tool ages.
