
The Projects Where Fully Agentic Coding Delivers

Source: lobsters

hjr265 recently built GitTop, a terminal dashboard that shows git repository activity modeled on htop, using a fully agentic coding workflow. The agent handled implementation end to end while the author directed at a higher level. The tool ships and the author calls it their first fully agentic project. What makes this worth examining is not the tool or the agent, but the project choice: GitTop is a personal tool built for the author’s own daily use, and that fact does most of the explanatory work for why the experiment worked.

Why Project Selection Is the Primary Variable

The conversation around agentic coding tends to center on models and frameworks: which agent works best, how to structure prompts, what tools to give the model access to. These things matter, but they are not the primary variable in whether a fully agentic project succeeds. The primary variable is whether the developer can evaluate the output quickly and accurately.

Evaluation is cheap when you are the primary user of what you are building. You run the thing, you see what it does, you know immediately whether it matches what you wanted. No test suite required, no stakeholder review, no production traffic analysis. The feedback loop collapses to a single question: does it work when I use it?

Building software for others expands the evaluation surface beyond your direct experience. You need to reason about edge cases you will never personally hit. An agentic workflow can generate plausible code that covers your cases while failing silently for different users. Without broader test coverage or user feedback, you will not catch those failures. GitTop sidesteps this entirely: the author uses git, has a specific workflow they want instrumented, and knows what a useful display should feel like. When the agent produces a layout that does not match, they know on first run.

The Specification Problem

One underappreciated aspect of fully agentic development is that the developer’s primary skill shifts. In conventional development, implementation consumes most of the time: reading documentation, writing code, debugging. In a fully agentic workflow, those activities are delegated. What remains is specification and evaluation.

Specification is harder than it looks. “Show git activity like htop shows processes” is directionally clear but leaves most decisions open. How frequently should the data refresh? What columns matter? How should the author list be sorted? What happens when there are no recent commits, or when the working directory is not a repository at all?

The agent will make choices, and those choices will be reasonable defaults, but they may not match what the developer had in mind. For a personal tool, this is manageable. Run the tool, notice that the refresh interval is too aggressive or the sort order is wrong, describe the correction. The loop is tight and the cost of a wrong choice is one revision cycle.

For a shared library or service, wrong choices in specification ripple further. A REST API that returns data in a format convenient for the author but awkward for clients creates rework that only surfaces after integration. The agentic workflow amplifies whatever ambiguity exists in the original requirements; it does not resolve it. This is one reason personal tools are the natural first domain for fully agentic coding: they minimize the specification surface because the developer’s own experience is sufficient to evaluate the output.

What Go Brings to This

The technical choices in GitTop are worth noting because they reflect a project well-suited to agentic generation. Go is explicit, has a large and consistent standard library, and its ecosystem of developer tooling libraries is well-documented. The Charm toolkit in particular (bubbletea for TUI event loops, lipgloss for terminal styling, bubbles for reusable components) provides a well-structured target for agentic code generation because the responsibilities are cleanly separated and the API surface is public and extensively documented.

The git data layer is similarly tractable. Shelling out to git log with a format string produces stable, parseable output that agents generate reliably:

// fetchCommits shells out to git log, requesting hash, author name,
// author email, relative date, and subject, separated by NUL bytes.
// Requires "os/exec", "fmt", and "strings"; commitEntry and
// parseNullDelimited are defined elsewhere in the program.
func fetchCommits(repoPath string) ([]commitEntry, error) {
    cmd := exec.Command("git", "-C", repoPath, "log",
        "--format=%H%x00%an%x00%ae%x00%ar%x00%s",
        "--max-count=100",
    )
    out, err := cmd.Output()
    if err != nil {
        return nil, fmt.Errorf("git log: %w", err)
    }
    return parseNullDelimited(strings.TrimSpace(string(out))), nil
}

Using null delimiters instead of pipes avoids the obvious breakage when commit subjects contain pipe characters. An agent working with well-represented library code and stable shell interfaces, both conditions that hold for this project, produces reliable output. Compare this to a project requiring novel algorithm design or integration with underdocumented internal systems. Where an agent works from sparse or inconsistent training data, it makes more mistakes and requires more correction cycles. The productivity advantage in mechanical assembly erodes when the assembly requires real synthesis.
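The parsing side is equally mechanical. One possible implementation of the parseNullDelimited helper referenced above (the commitEntry field names are assumptions, not GitTop's actual types):

```go
package main

import (
	"fmt"
	"strings"
)

// commitEntry mirrors the five fields in the --format string:
// hash, author name, author email, relative date, subject.
// Field names are illustrative assumptions.
type commitEntry struct {
	Hash, AuthorName, AuthorEmail, RelDate, Subject string
}

// parseNullDelimited splits output produced by
// --format=%H%x00%an%x00%ae%x00%ar%x00%s: one commit per line,
// fields separated by NUL bytes. Malformed lines are skipped.
func parseNullDelimited(out string) []commitEntry {
	var entries []commitEntry
	if out == "" {
		return entries
	}
	for _, line := range strings.Split(out, "\n") {
		fields := strings.Split(line, "\x00")
		if len(fields) != 5 {
			continue // defensive: skip anything that does not parse
		}
		entries = append(entries, commitEntry{
			Hash: fields[0], AuthorName: fields[1],
			AuthorEmail: fields[2], RelDate: fields[3],
			Subject: fields[4],
		})
	}
	return entries
}

func main() {
	raw := "abc123\x00Jane\x00jane@example.com\x002 hours ago\x00fix: handle empty repo"
	fmt.Printf("%+v\n", parseNullDelimited(raw)[0])
}
```

NUL can never appear in a commit subject, so the split is unambiguous where a pipe-delimited format would silently misparse.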

The go-git library provides a programmatic alternative to subprocess invocation for cases where you want tighter control over repository access without spawning processes. Both paths are well-covered in public documentation, which means an agent generating GitTop-style code is drawing on dense, consistent training signal.

The Developer as Evaluator

When implementation is delegated, the remaining job is closer to acceptance testing than software engineering in the conventional sense. This requires a different kind of attention.

Reviewing code you wrote yourself is familiar: you understand the decisions you made, you can spot where you took shortcuts, you know which edge cases you were actively thinking about. Reviewing code an agent wrote requires reconstructing intent from structure. The question shifts from “is this what I meant to write” to “does this do what I need, and does it do anything I did not anticipate.”

The second question is harder, and it grows more consequential as the project scales. For GitTop, the surface area is manageable. Using the tool for a week will surface most of the edge cases worth caring about. Accepting that the agent made sensible choices about terminal compatibility, error handling for non-repository directories, and concurrent data refreshes requires either trusting the agent’s defaults or doing careful code review. That code review is, of course, part of what agentic workflows nominally reduce.

This is a real cost, and it tends to go unacknowledged in descriptions of agentic workflows. The developer saves time on initial implementation and spends it on evaluation and correction. For most personal tools, that trade clearly favors agentic development, since the evaluation is fast and the implementation cost that was avoided was genuinely high. For production software, the evaluation cost needs more deliberate accounting before the trade looks favorable.

It is also worth noting that the developer’s evaluation skill matters as much as the agent’s generation quality. A developer who knows what a well-behaved TUI should feel like, who understands the Charm ecosystem, and who has used htop as a mental model for what they want, can evaluate GitTop output quickly and accurately. A developer less familiar with the domain will spend more time per revision cycle, potentially longer than it would have taken to write the code themselves. The agentic workflow shifts when domain knowledge is applied, not whether it is required.

The Broader Class of Projects

GitTop is one data point, but it points toward a recognizable category: personal developer tools with bounded scope, success criteria derivable from the developer’s own workflow, and implementation work that is primarily mechanical rather than novel.

This category is larger than it might seem. Developers accumulate a long list of tools they want but have not built because the implementation cost is too high relative to the value. A script for automating a local deployment step, a TUI for monitoring a database without leaving the terminal, a small HTTP server for intercepting and logging webhook payloads during development, a git activity dashboard: these are the projects that exist as “I should build that eventually” items for years. Fully agentic coding changes the cost calculation. A tool that would take a weekend to build can be directed in an afternoon, evaluated over a few days of use, and revised until it fits.

What the hjr265 experiment with GitTop demonstrates is that this actually works for that category: a developer who knows what they want, a bounded project with visible output, and a workflow where implementation can be delegated without losing control of the result. Scaling this to larger projects, with ambiguous requirements, multiple stakeholders, or significant edge case surface area, is a separate problem that the GitTop case does not address. For the personal tool category, the experiment makes the case.
