
How 1,000 Lines of C Make HTTP Legible

Source: lobsters

tinyweb is a static HTTP/1.1 server in roughly 1,000 lines of C. It surfaced on Lobsters recently and drew the standard discussion: path traversal risks, SIGPIPE handling, whether the keep-alive loop is complete. The discussion itself is informative, because it maps exactly which problems a 1,000-line C server must confront and which ones fall outside the budget. The constraint is not arbitrary. It does real work as a design tool.

HTTP at the Socket Level

An HTTP server is a TCP server that reads and writes formatted text. The request arrives as raw bytes:

GET /index.html HTTP/1.1\r\n
Host: localhost:8080\r\n
Connection: keep-alive\r\n
\r\n

The server reads until it finds the double CRLF (\r\n\r\n) that terminates the header block, parses the method, path, and version from the first line, then constructs a response:

HTTP/1.1 200 OK\r\n
Content-Type: text/html; charset=utf-8\r\n
Content-Length: 4096\r\n
Date: Wed, 18 Mar 2026 12:00:00 GMT\r\n
\r\n
[body bytes]
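The read-until-blank-line step and the request-line parse described above could be sketched roughly as follows. The function names find_header_end and parse_request_line, and the field sizes, are assumptions made for illustration; they are not taken from tinyweb.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the offset just past the "\r\n\r\n" that ends
 * the header block, or 0 if the terminator has not arrived yet. */
size_t find_header_end(const char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    return 0;
}

/* Hypothetical helper: split "GET /index.html HTTP/1.1" into its three
 * fields. The widths (7, 1023, 15) are arbitrary limits chosen for this
 * sketch. Returns 0 on success, -1 on a malformed request line. */
int parse_request_line(const char *line,
                       char method[8], char path[1024], char version[16])
{
    /* %s stops at whitespace, so the trailing "\r\n" is not captured. */
    if (sscanf(line, "%7s %1023s %15s", method, path, version) != 3)
        return -1;
    if (strncmp(version, "HTTP/1.", 7) != 0)
        return -1;
    return 0;
}
```

A caller would keep reading into its buffer until find_header_end returns nonzero, then hand the start of the buffer to parse_request_line.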

Six system calls carry all of this: socket, bind, listen, accept, read, and write. These have been stable since 4.2BSD in 1983. Setting up the listening socket looks essentially the same in every minimal C server ever written:

int fd = socket(AF_INET, SOCK_STREAM, 0);
int yes = 1;
setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));
bind(fd, (struct sockaddr *)&addr, sizeof(addr));
listen(fd, SOMAXCONN);

SO_REUSEADDR is the detail that breaks most first attempts: without it, restarting the server during TCP’s TIME_WAIT period fails with “Address already in use.” Production frameworks absorb this silently; a minimal C server has to handle it explicitly.
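For reference, here is a slightly fuller sketch of the same setup with the address structure filled in and every return value checked. The helper name open_listener is hypothetical, and binding to all IPv4 interfaces is an assumption of this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket listening on `port` on all IPv4
 * interfaces. Returns the descriptor, or -1 on failure. */
int open_listener(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int yes = 1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* SO_REUSEADDR must be set before bind to have any effect. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) < 0 ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Passing port 0 asks the kernel to pick a free port, which is convenient for tests.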

What the 1,000-Line Budget Buys

A working HTTP/1.0 server takes around 200 lines. It reads a request, opens a file, writes the response, closes the connection. This is the model used in the CS:APP textbook’s “tiny” server, which Bryant and O’Hallaron use to introduce network programming at CMU. It is also the model that sidesteps the most significant complexity in HTTP/1.1.

Persistent connections are where the protocol gets harder. HTTP/1.1 keeps the TCP connection open after each response by default, expecting successive requests on the same socket. The server must loop: parse a request, send a response, return to the top and wait for another request or a client disconnect. The connection has to track whether the client sent Connection: close. The parser has to handle a client that disconnects mid-read without crashing. Getting this right accounts for a substantial portion of the 1,000-line budget.
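The keep-alive decision at the bottom of that loop might look something like the sketch below. The helper names has_token and keep_alive are illustrative, and the code assumes the Connection header value has already been extracted by the parser.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Return 1 if the comma-separated header value contains `token`
 * (compared case-insensitively), 0 otherwise. */
int has_token(const char *value, const char *token)
{
    size_t tlen = strlen(token);
    while (*value) {
        value += strspn(value, ", \t\r\n");     /* skip separators */
        size_t n = strcspn(value, ", \t\r\n");  /* length of the next token */
        if (n == tlen && strncasecmp(value, token, n) == 0)
            return 1;
        value += n;
    }
    return 0;
}

/* Decide whether to keep the socket open after the current response.
 * `connection` is the value of the Connection header, or NULL if absent. */
int keep_alive(const char *version, const char *connection)
{
    if (connection && has_token(connection, "close"))
        return 0;
    if (strcmp(version, "HTTP/1.1") == 0)
        return 1;                   /* persistent by default in HTTP/1.1 */
    /* HTTP/1.0 clients have to opt in explicitly. */
    return connection && has_token(connection, "keep-alive");
}
```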

The other major consumers are: a MIME type table mapping file extensions to Content-Type strings (a static array of structs, typically 30-50 entries), error responses for 400, 403, 404, and 500, directory index handling, the Date header that RFC 9110 requires an origin server with a clock to send on 2xx, 3xx, and 4xx responses, and path sanitization to prevent directory traversal.
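Two of those pieces, the MIME table and the Date header, are small enough to sketch here. The table entries and the names mime_type_for and http_date are assumptions for illustration, and gmtime_r and strcasecmp are POSIX rather than ISO C.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>
#include <time.h>

struct mime_entry { const char *ext; const char *type; };

/* A handful of entries for illustration; a real table would be longer. */
static const struct mime_entry mime_table[] = {
    { ".html", "text/html; charset=utf-8" },
    { ".css",  "text/css" },
    { ".js",   "text/javascript" },
    { ".png",  "image/png" },
    { ".jpg",  "image/jpeg" },
    { ".txt",  "text/plain; charset=utf-8" },
};

/* Look up the Content-Type for a path by its final extension. */
const char *mime_type_for(const char *path)
{
    const char *dot = strrchr(path, '.');
    if (dot) {
        for (size_t i = 0; i < sizeof(mime_table) / sizeof(mime_table[0]); i++) {
            if (strcasecmp(dot, mime_table[i].ext) == 0)
                return mime_table[i].type;
        }
    }
    return "application/octet-stream";  /* safe default for unknown types */
}

/* Format `t` as an HTTP date such as "Sun, 06 Nov 1994 08:49:37 GMT".
 * Returns the number of bytes written, or 0 on failure. The day and month
 * names come from the "C" locale unless the program has called setlocale. */
size_t http_date(time_t t, char *out, size_t cap)
{
    struct tm tm;
    if (gmtime_r(&t, &tm) == NULL)
        return 0;
    return strftime(out, cap, "%a, %d %b %Y %H:%M:%S GMT", &tm);
}
```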

What the 1,000-line envelope cannot hold: TLS, HTTP/2, virtual hosting, CGI, chunked transfer encoding on the send side, byte-range requests for media, compression, and daemonization. Each of these represents a discrete complexity boundary. TLS alone requires an external library; the handshake logic in OpenSSL or mbedTLS is orders of magnitude larger than the server itself. These absences are design choices, not gaps.

What sendfile Reveals

The most instructive system call in a file-serving HTTP server is sendfile(2). On Linux:

sendfile(client_fd, file_fd, NULL, file_size);

This copies bytes from a file descriptor to a socket descriptor inside the kernel, without passing through userspace. No intermediate buffer, no malloc, no round-trip through application memory. The kernel reads from the file’s page cache and writes directly to the socket buffer.

This is what serving a file is at the OS level: a descriptor-to-descriptor copy. The web server negotiates which file to copy and writes the HTTP headers that tell the client what it is receiving. The data transfer is a kernel operation. nginx’s architecture has been organized around this insight since Igor Sysoev began writing it in 2002; a minimal C server makes the same principle concrete because there is no abstraction layer obscuring it.

The portability issue is worth noting. sendfile is not part of POSIX. On macOS and FreeBSD it exists with a different signature: the file descriptor comes before the socket descriptor, the starting offset is passed by value, and a byte count travels through a pointer (an in/out length on macOS, an output count of bytes sent on FreeBSD). A portable minimal server either conditionally compiles per platform or falls back to a read/write loop. The fallback introduces a userspace buffer copy, which matters for throughput but not for correctness. For a learning exercise targeting Linux, the simpler path is to use the Linux form directly and be explicit about the assumption.

A Lineage of Small Servers

Tinyweb belongs to a tradition that goes back to thttpd, Jef Poskanzer’s tiny/turbo/throttling HTTP server from around 1995. Thttpd made an argument that was controversial at the time: a single-process, select-based server with careful buffer management could serve static files faster than Apache’s process-per-connection model while using a fraction of the memory. That argument proved correct, and it influenced the design of lighttpd and, eventually, nginx.

mini_httpd, also from Poskanzer, pushed the same philosophy toward deliberate minimalism: around 2,500 lines, fork-per-connection, CGI support through a clean exec boundary. IBM developerWorks published “nweb,” a tutorial server under 200 lines, which sacrificed correctness for brevity in ways the comments made explicit. The CS:APP “tiny” server sits in the same space, designed to be discussed in a classroom rather than deployed.

Each of these is most valuable as a reading exercise. The line count is the feature; it determines whether you can hold the whole thing in your head in a single session.

The Concurrency Decision

A minimal server can handle concurrency through fork, pthread_create, select, poll, or Linux’s epoll. The choice shapes everything else.

A fork-per-connection model takes around 20 lines and requires no per-connection state management. The parent accepts, the child handles the request and exits, the parent loops. It does not scale much past a few hundred concurrent connections: fork is comparatively expensive, and although copy-on-write spares the child a full copy of the parent’s address space, every connection still costs a whole process. Those constraints do not matter for a pedagogical server.

Moving to epoll with non-blocking I/O is the production answer, but it costs roughly twice the code. Non-blocking sockets require per-connection state machines because a read may return only part of a request, and the server cannot simply block until the rest arrives. The server must track where each connection is in the request parsing state, resume correctly on the next epoll_wait event, and handle EAGAIN without treating it as an error. This is a different programming model from blocking I/O, and it pushes the implementation well past the 1,000-line target.

Most servers in this genre, including tinyweb and the CS:APP tiny server, use blocking I/O. The pedagogical case is strong: a blocking server maps almost directly to the HTTP model as described in RFC 9112, where requests and responses are sequential and complete. The performance ceiling is real, but performance is not the point of the exercise.

Security Details That Appear in the Review Thread

Lobsters discussions of minimal server implementations reliably raise the same issues. Path traversal is the most serious: a request for /../../etc/passwd can escape the document root if the server prepends its root to the path without normalizing the result first. The correct defense is realpath(3), which resolves the canonical absolute path; the server then verifies that the result still begins with the document root, followed by a path separator, before opening the file. Without the separator check, a root of /var/www would also match /var/www-private. A 1,000-line implementation that gets this right is demonstrating something meaningful. One that skips it is illustrating why production server software is harder than it looks.

SIGPIPE is the other standard omission. When a client disconnects while the server is writing a response, the next write or sendfile on that socket raises SIGPIPE, which terminates the process by default. The fix is one line at startup:

signal(SIGPIPE, SIG_IGN);

With SIGPIPE ignored, write returns -1 with errno set to EPIPE, so the server can close that connection cleanly and move on to the next one. Frameworks suppress this automatically. A minimal C server requires the programmer to know it exists, which is part of what makes writing one useful.

Why the Code Is Worth Reading

A 1,000-line HTTP server in C is small enough to read completely in a single session. You can trace a request from accept through the header parser, the MIME type lookup, the sendfile call, and back to the top of the keep-alive loop. Production web servers are not structured this way; nginx’s source runs to around 150,000 lines and understanding any one subsystem requires context from several others.

The tinyweb project, like thttpd and the CS:APP tiny server before it, demonstrates something worth knowing: HTTP is not a complicated protocol, and the C standard library plus the POSIX socket API are sufficient to implement it correctly. The complexity in production servers comes from features, performance requirements, and decades of accumulated edge cases, not from any fundamental difficulty in the protocol itself. Reading a minimal implementation before reading a production one is a legitimate learning strategy, and the 1,000-line C server remains one of the cleaner entry points available.
