The Case for Korean Keywords: Han, Rust, and the Persistent Dream of Non-English Code

The Han programming language is a statically typed language, implemented in Rust, in which every keyword is written in Hangul, the Korean writing system. Its author built it as a side project with AI assistance, inspired by watching someone rewrite an entire C++ codebase in Rust using AI in under two weeks. The compiler pipeline is complete: lexer, parser, AST, interpreter, and LLVM IR codegen. It ships with a REPL and a basic LSP server, and it supports structs with impl blocks, closures, pattern matching, try/catch, file I/O, and module imports.

On the surface this looks niche, and it is. But it sits at the intersection of two threads worth examining separately: the decades-long history of non-English programming languages, and the current practical state of building a language implementation in Rust with LLVM.

Non-English programming has a longer history than most developers realize

The assumption that programming languages use English keywords is so thoroughly ingrained that most developers never examine it. That assumption has been challenged repeatedly since the 1960s.

APL, designed by Kenneth Iverson and introduced in his 1962 book A Programming Language, used a custom set of mathematical symbols rather than English words. It required a special keyboard. Code in APL is dense, symbolic, and has no natural English reading. The point was expressiveness per character, not alignment with any particular human language. The language influenced J, K, and the entire array programming tradition and demonstrated early that the English-keyword convention was a choice, not a technical requirement.

Graham Nelson’s Inform 7, released in 2006 for interactive fiction authoring, went in the opposite direction: the language is designed to read like natural English prose. “The candle is in the brass holder” is valid Inform 7 code, and it compiles down to bytecode for the Z-machine or Glulx virtual machines. It found a dedicated user base of writers who were not traditional programmers and had no interest in learning C syntax.

The more direct precedents for Han are languages that use non-Latin scripts as keywords. 易语言 (YiYuYan, roughly “Easy Language”) is a Chinese programming language commercially available since 2000. It uses Chinese characters for all keywords and has a substantial user base in China, particularly for Windows application development. The fact that it is largely invisible to the Western developer community is itself informative: a language can be widely used and still fail to register in English-language tech media.

قلب (Qalb), Ramsey Nasser’s Scheme-based language written entirely in Arabic, was released in 2012 as an explicit cultural provocation. Nasser’s argument was that the assumption code must be in English encodes a hierarchy, and a Lisp in right-to-left Arabic script forces developers to sit with that assumption. It was never meant for production; it was meant to make people uncomfortable in productive ways.

Korean has prior hobby experiments and educational tools, but Han appears to be among the more complete modern implementations targeting a real compiler architecture.

What Han actually builds

The feature set is notable for a side project: static typing with type inference, structs with impl blocks, closures, pattern matching, try/catch for error handling, file I/O, and module imports. The compiler produces LLVM IR, meaning code goes through the same optimizer that backs Clang, Swift, and Rust before hitting machine code.
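To make the first stage of such a pipeline concrete, here is a minimal sketch of a lexer that recognizes Hangul keywords, using only the Rust standard library. It is not Han's code. The spelling 만약 for "if" is the one discussed later in this article; 함수 and 반환, the Token enum, and the function names are assumptions made for illustration.

```rust
// Minimal keyword-aware lexer sketch. The keyword table is an illustrative
// assumption, not Han's actual keyword set.

#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    If,            // 만약
    Function,      // 함수 (hypothetical choice)
    Return,        // 반환 (hypothetical choice)
    Ident(String),
    Int(i64),
    Symbol(char),
}

/// Map a scanned word to a keyword token, or fall back to an identifier.
fn keyword_or_ident(word: &str) -> Token {
    match word {
        "만약" => Token::If,
        "함수" => Token::Function,
        "반환" => Token::Return,
        _ => Token::Ident(word.to_string()),
    }
}

/// Split source text into tokens. Iterating over `char`s rather than bytes
/// means a multi-byte Hangul syllable is never cut in half.
/// (Integer overflow on very long literals is not handled in this sketch.)
pub fn lex(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_digit() {
            let mut value = 0i64;
            while let Some(d) = chars.peek().and_then(|ch| ch.to_digit(10)) {
                value = value * 10 + i64::from(d);
                chars.next();
            }
            tokens.push(Token::Int(value));
        } else if c.is_alphabetic() || c == '_' {
            // `is_alphabetic` is Unicode-aware, so Hangul syllables qualify.
            let mut word = String::new();
            while let Some(&ch) = chars.peek() {
                if ch.is_alphanumeric() || ch == '_' {
                    word.push(ch);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(keyword_or_ident(&word));
        } else {
            tokens.push(Token::Symbol(c));
            chars.next();
        }
    }
    tokens
}
```

Calling lex("만약 값 > 10") yields an If token, an identifier 값, the symbol '>', and the integer 10. Nothing in the scanning loop is specific to Korean; the only language-dependent part is the small keyword table, which is where the design questions discussed below actually live.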

The REPL and LSP server deserve separate attention. Most hobby language implementations stop at the interpreter. Shipping even a basic LSP server opens the door to editor integration such as error diagnostics and hover documentation, and in some editors semantic highlighting. This is significant additional engineering, and it signals that the author thought about the developer experience, not just the compiler internals.

For LLVM integration from Rust, the standard approach is inkwell, a safe wrapper around the raw llvm-sys C bindings. Inkwell provides a typed Rust API for constructing LLVM IR programmatically:

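use std::path::Path;

use inkwell::context::Context;

fn main() {
    // The context owns every LLVM type and value created below.
    let context = Context::create();
    let module = context.create_module("han_module");
    let builder = context.create_builder();

    // Declare a zero-argument function returning i64, named with the Hangul syllable 주.
    let fn_type = context.i64_type().fn_type(&[], false);
    let function = module.add_function("주", fn_type, None);

    // Emit the body: a single basic block that returns the constant 42.
    let entry = context.append_basic_block(function, "entry");
    builder.position_at_end(entry);

    let result = context.i64_type().const_int(42, false);
    builder.build_return(Some(&result)).unwrap();

    // Write the textual IR to disk.
    module.print_to_file(Path::new("output.ll")).unwrap();
}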

The resulting .ll file is standard LLVM IR, which llc can compile to native assembly. It is the same kind of IR that rustc emits before handing off to LLVM, so a small language gets access to the full optimization pipeline without having to build one. Once the frontend produces valid IR and optimization is enabled, LLVM's passes handle constant folding, dead code elimination, and loop unrolling, and its backends handle platform-specific instruction selection.
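One detail worth knowing when function names are Hangul: in textual IR, a symbol such as 주 does not appear as readable Korean. The sketch below mirrors the quoting rule that LLVM's IR printer is understood to apply (names made only of ASCII letters, digits, '-', '.', and '_' that do not start with a digit print bare; anything else is quoted with non-printable bytes hex-escaped). The function name llvm_global_name is hypothetical, and the rule itself is an assumption worth checking against an actual output.ll.

```rust
/// Render a global symbol name the way LLVM's textual IR printer is
/// understood to (an assumption; verify against real `.ll` output).
pub fn llvm_global_name(name: &str) -> String {
    let bytes = name.as_bytes();
    // A name prints bare only if it is ASCII alphanumerics plus '-', '.', '_'
    // and does not begin with a digit.
    let plain = !bytes.is_empty()
        && !bytes[0].is_ascii_digit()
        && bytes
            .iter()
            .all(|&b| b.is_ascii_alphanumeric() || b == b'-' || b == b'.' || b == b'_');
    if plain {
        return format!("@{name}");
    }

    let mut out = String::from("@\"");
    for &b in bytes {
        match b {
            // A backslash is doubled.
            b'\\' => out.push_str("\\\\"),
            // Printable ASCII other than the double quote passes through.
            0x20..=0x7e if b != b'"' => out.push(b as char),
            // Everything else, including each UTF-8 byte of a Hangul
            // syllable, becomes a backslash followed by two hex digits.
            _ => out.push_str(&format!("\\{b:02X}")),
        }
    }
    out.push('"');
    out
}
```

Under this rule, llvm_global_name("main") gives @main, while llvm_global_name("주") gives @"\EC\A3\BC", the three UTF-8 bytes of the syllable. The symbol still links and runs normally; the practical consequence is that anyone reading IR or a symbol table while debugging sees escaped bytes rather than Korean.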

Rust has become a popular implementation language for new programming languages. Gleam, a statically typed functional language targeting the Erlang VM, is written in Rust. Roc is a Rust-implemented language pursuing fast native compilation without a garbage collector. Nickel, a configuration language with gradual typing, is also implemented in Rust. The pattern holds for good reasons: Rust’s memory safety without a GC matters when you are building a runtime that will manage other programs’ memory, and the crate ecosystem (logos for lexing, chumsky for parsing, inkwell for codegen) has matured enough to cover the standard pipeline.

AI-assisted compiler development in practice

The author credits AI assistance throughout the build and treats that as part of the story, which is worth examining rather than glossing over.

A compiler pipeline has unusually clear structure. The components (lexer, parser, AST, type checker, interpreter, codegen) are thoroughly documented in academic literature, textbooks, and tutorials. Crafting Interpreters by Robert Nystrom walks through two complete language implementations with full source code. The LLVM Kaleidoscope tutorial does the same for LLVM-based codegen. An AI assistant trained on this material can scaffold components quickly, catch type errors in AST visitor implementations, suggest inkwell API usage, and help debug IR output. The scaffolding work, which is tedious but not intellectually novel, compresses significantly.

What AI cannot do is make the interesting design decisions. Choosing which Hangul words map to which language constructs requires both Korean fluency and an understanding of what a keyword needs to communicate. In Korean, the concept of “function” alone branches several ways: 함수 (hamsu) carries a mathematical connotation, 기능 (gineung) suggests capability or feature, and 메서드 (meseodeu) is a transliteration of the English “method”. Each choice reads differently to a native speaker. The Hacker News discussion of Han raised exactly this: keyword design in a non-English language is a linguistic and cultural decision that happens to have technical constraints, not the other way around.

The AI-assisted development story does something useful here: it separates the mechanical from the meaningful. The mechanical parts (wiring up inkwell, implementing a Pratt parser, building a visitor for the AST) can be accelerated. The meaningful parts (deciding that Korean programmers should write 만약 rather than 이프, that the struct syntax should feel natural to someone reading Korean) require human judgment that the tooling cannot supply.
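As a sense of scale for the "mechanical" side, here is a minimal sketch of a Pratt parser with a tree-walking evaluator for integer arithmetic, using only the standard library. It is a generic textbook-style illustration, not Han's parser; the Expr type, the binding-power table, and all function names are assumptions.

```rust
use std::iter::Peekable;
use std::str::Chars;

#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i64),
    Binary(Box<Expr>, char, Box<Expr>),
}

/// Binding power of each infix operator; higher binds tighter.
fn binding_power(op: char) -> Option<u8> {
    match op {
        '+' | '-' => Some(1),
        '*' | '/' => Some(2),
        _ => None,
    }
}

fn skip_spaces(input: &mut Peekable<Chars<'_>>) {
    while input.peek().is_some_and(|c| c.is_whitespace()) {
        input.next();
    }
}

/// Parse a number or a parenthesised sub-expression.
fn parse_atom(input: &mut Peekable<Chars<'_>>) -> Option<Expr> {
    skip_spaces(input);
    if input.peek() == Some(&'(') {
        input.next();
        let inner = parse_expr(input, 0)?;
        skip_spaces(input);
        return (input.next() == Some(')')).then_some(inner);
    }
    let mut value: Option<i64> = None;
    while let Some(d) = input.peek().and_then(|c| c.to_digit(10)) {
        value = Some(value.unwrap_or(0) * 10 + i64::from(d));
        input.next();
    }
    value.map(Expr::Num)
}

/// Core Pratt loop: keep absorbing operators that bind at least `min_bp`.
fn parse_expr(input: &mut Peekable<Chars<'_>>, min_bp: u8) -> Option<Expr> {
    let mut lhs = parse_atom(input)?;
    loop {
        skip_spaces(input);
        let Some(&op) = input.peek() else { break };
        let Some(bp) = binding_power(op) else { break };
        if bp < min_bp {
            break;
        }
        input.next();
        // Requiring `bp + 1` on the right makes operators left-associative.
        let rhs = parse_expr(input, bp + 1)?;
        lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
    }
    Some(lhs)
}

/// Parse a whole string; returns None on malformed or trailing input.
pub fn parse(source: &str) -> Option<Expr> {
    let mut input = source.chars().peekable();
    let expr = parse_expr(&mut input, 0)?;
    skip_spaces(&mut input);
    input.peek().is_none().then_some(expr)
}

/// Tree-walking evaluation, the interpreter stage in miniature.
/// Returns None on overflow or division by zero.
pub fn eval(expr: &Expr) -> Option<i64> {
    match expr {
        Expr::Num(n) => Some(*n),
        Expr::Binary(l, op, r) => {
            let (l, r) = (eval(l)?, eval(r)?);
            match *op {
                '+' => l.checked_add(r),
                '-' => l.checked_sub(r),
                '*' => l.checked_mul(r),
                '/' => l.checked_div(r),
                _ => None,
            }
        }
    }
}
```

Here parse("1 + 2 * 3").and_then(|e| eval(&e)) evaluates to Some(7), and parse("10 - 4 - 3") groups to the left and evaluates to Some(3). The whole technique fits in well under a hundred lines and is documented in many tutorials, which is why this layer compresses so well with assistance. Nothing in it says what the operators or keywords should be called.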

The real obstacles

Every non-English language eventually runs into the same set of practical walls. Tooling assumes ASCII, or at minimum Latin-script identifiers. Syntax highlighters need updating. Keyboard input for Hangul requires an IME switch on most systems, which disrupts typing flow for developers who do not already type Korean all day. Error messages need to match the script of the keywords, which means compiler diagnostics and standard library output need to be in Korean too, or the cognitive split between the code and its diagnostics becomes a source of friction.
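The ASCII assumption shows up concretely in source positions. A diagnostic has to say where an error is, and for Hangul text the byte offset, the character count, and the UTF-16 offset (the default position unit in the Language Server Protocol) are no longer the same number. The sketch below, with a hypothetical Column type and column_at function, computes all three.

```rust
/// Three ways to measure the same prefix of a source line.
#[derive(Debug, PartialEq)]
pub struct Column {
    /// Offset into the UTF-8 source buffer.
    pub bytes: usize,
    /// Unicode scalar values, roughly what a reader would count.
    pub chars: usize,
    /// UTF-16 code units, the Language Server Protocol's default unit.
    pub utf16: usize,
}

/// Measure the text that precedes `byte_offset` on `line`.
/// Returns None if the offset is out of range or falls inside a
/// multi-byte character.
pub fn column_at(line: &str, byte_offset: usize) -> Option<Column> {
    let prefix = line.get(..byte_offset)?;
    Some(Column {
        bytes: prefix.len(),
        chars: prefix.chars().count(),
        utf16: prefix.encode_utf16().count(),
    })
}
```

For the line "만약 x > 10", the identifier x starts at byte offset 7, but column_at reports 3 characters and 3 UTF-16 units, because each Hangul syllable occupies three bytes in UTF-8 and one unit in UTF-16. A tool that treats bytes as columns would underline the wrong part of the line, and an offset of 1 is not a valid position at all, since it lands inside 만.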

The deeper question is purpose. YiYuYan exists because there is a large population of Chinese-speaking developers who genuinely prefer it for certain tasks. Qalb exists as art, which liberates it entirely from the adoption question. Han sits somewhere less clearly defined: it is explicitly not pitched as a production tool, but it is also a more complete implementation than most art projects.

The value it offers most clearly is demonstrative. It shows that Hangul can carry the semantic weight of a programming language’s keywords, that a full compiler pipeline in Rust with LLVM codegen and LSP support is achievable as a side project with AI assistance in a compressed timeframe, and that the English monoculture of programming keywords reflects historical accident and ecosystem inertia more than any technical constraint.

That combination, the tradition of non-English language experiments intersecting with the current tooling maturity around Rust and LLVM, makes Han a useful data point regardless of whether it ever gets a second user. The technical bar for building something real has dropped, and the cultural argument for trying has always been there.