Deriving Repositories from Declarations: Scheme Macros and the Persistence Problem
Source: lobsters
The Repository Pattern occupies a strange position in software design literature. Martin Fowler defined it in Patterns of Enterprise Application Architecture (2002) as something that “mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.” Eric Evans gave it canonical form in Domain-Driven Design (2003). Since then, it has been implemented in Java with annotations, in C# with generics, in Haskell with type classes, and in Clojure with protocols. A recent post on jointhefreeworld.org demonstrates the pattern in Scheme using hygienic macros, which opens up a worthwhile conversation about what the pattern is actually doing and why the macro approach is a credible alternative to framework-based solutions.
What the Repository Pattern Is Really About
Strip away the enterprise terminology and the pattern has three core properties: it hides persistence details behind a procedure-level interface, it lets the implementation behind that interface be swapped without touching the call sites, and it keeps business logic free of any dependency on a specific storage backend. The canonical OOP formulation achieves this through interfaces and classes:
public interface UserRepository {
    Optional<User> findById(long id);
    void save(User user);
    void delete(long id);
}

public class PostgresUserRepository implements UserRepository {
    // SQL via JDBC
}

public class InMemoryUserRepository implements UserRepository {
    private final Map<Long, User> store = new HashMap<>();
    // no SQL, used in tests
}
Business logic receives a UserRepository reference and never needs to know which implementation is behind it. The value in testing is immediate: swap the Postgres implementation for the in-memory one, run deterministic tests without a database, ship with confidence. The value in production is longer-term: if you ever need to swap backends, the business logic does not change.
The pattern has nothing inherent to do with classes or objects. It is a statement about dependency direction: callers depend on an abstraction, not on a concrete storage technology.
How Functional Languages Approach This
In Haskell, the standard approach uses a type class to define the interface and multiple instances for different backends. The MTL style makes business logic polymorphic over any monad that satisfies the constraint:
class Monad m => UserRepo m where
  findById :: UserId -> m (Maybe User)
  save     :: User -> m ()
  delete   :: UserId -> m ()

createUser :: UserRepo m => Name -> Email -> m User
createUser name email = do
  let user = mkUser name email
  save user
  return user
Production code runs in AppM (wrapping IO with a database connection); test code runs in a State-based monad that holds a Map. GHC implements the type class constraint by passing a dictionary of methods, and it can specialise that dictionary away when the concrete monad is known, so the abstraction is often cheap or free at runtime, though that is an optimisation rather than a guarantee. The downside is that the approach requires a type system to enforce it, and the machinery of monad transformers is not trivial to reason about.
Clojure takes a different path. Protocols provide a runtime dispatch mechanism:
(defprotocol UserRepository
  (find-by-id [repo id])
  (save! [repo user])
  (delete! [repo id]))

(defrecord InMemoryUserRepo [store-atom]
  UserRepository
  (find-by-id [_ id] (get @store-atom id))
  (save! [_ user] (swap! store-atom assoc (:id user) user))
  (delete! [_ id] (swap! store-atom dissoc id)))
This is idiomatic Clojure, but it carries the overhead of protocol dispatch and requires the protocol to be defined separately from its implementations.
A simpler approach, available in any language with first-class functions, is to represent the repository as a record of procedures. You pass the record around, callers invoke the appropriate procedure, and swapping implementations is a matter of constructing a different record. No classes, no type classes, no language-level dispatch mechanism: the only lookup is fetching a procedure out of a data structure. This maps directly onto Scheme’s model, where closures are cheap and association lists or vectors of procedures are natural data structures.
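As a minimal sketch of the record-of-procedures idea, the following R7RS-small program stores three procedures in a record. Everything here is illustrative rather than taken from the original post: the record type `<user-repo>`, the `rename-user!` procedure, and the choice to model a user as an `(id . name)` pair are all assumptions.

```scheme
(import (scheme base) (scheme write))

;; A repository is a plain record whose fields hold procedures.
(define-record-type <user-repo>
  (make-user-repo find-by-id save! delete!)
  user-repo?
  (find-by-id user-repo-find-by-id)
  (save!      user-repo-save!)
  (delete!    user-repo-delete!))

;; Hypothetical in-memory backend: users are (id . name) pairs kept in an
;; association list that the three closures share.
(define (make-in-memory-user-repo)
  (let ((store '()))
    ;; Return the store with any entry for `id` removed.
    (define (without id)
      (let loop ((rest store) (kept '()))
        (cond ((null? rest) (reverse kept))
              ((eqv? (caar rest) id) (loop (cdr rest) kept))
              (else (loop (cdr rest) (cons (car rest) kept))))))
    (make-user-repo
     (lambda (id) (cond ((assv id store) => cdr) (else #f)))
     (lambda (id name) (set! store (cons (cons id name) (without id))))
     (lambda (id) (set! store (without id))))))

;; Business logic sees only the record accessors, never the backend.
(define (rename-user! repo id new-name)
  (if ((user-repo-find-by-id repo) id)
      ((user-repo-save! repo) id new-name)
      (error "no such user" id)))

(define repo (make-in-memory-user-repo))
((user-repo-save! repo) 1 "ada")
(rename-user! repo 1 "grace")
(display ((user-repo-find-by-id repo) 1)) ; prints grace
(newline)
```

A second backend would be another constructor that returns a `<user-repo>` with different closures; `rename-user!` would not change.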
Hygienic Macros in Scheme
Before examining the macro approach, it is worth understanding what “hygienic” means and why it matters. The classic problem with unhygienic macros, as in Common Lisp’s defmacro, is variable capture. Consider a naive swap macro:
;; Common Lisp -- broken
(defmacro swap! (a b)
  `(let ((tmp ,a))
     (setf ,a ,b)
     (setf ,b tmp)))

(let ((tmp 10) (x 20))
  (swap! tmp x)) ; BUG: the macro's 'tmp' shadows the outer 'tmp'
The macro introduces tmp as a local binding, but if the caller also uses tmp, the two collide. The Common Lisp solution is gensym, which generates a fresh uninterned symbol at expansion time:
(defmacro swap! (a b)
  (let ((g (gensym)))
    `(let ((,g ,a))
       (setf ,a ,b)
       (setf ,b ,g))))
This works, but it is a manual discipline. Every macro author must remember to use gensym for every introduced binding, and the language provides no enforcement.
Scheme’s syntax-rules, described in an appendix to R4RS (1991) and made a required part of the language in R5RS (1998), handles this automatically. The macro expander tracks the scope of every identifier. Identifiers introduced by the macro resolve in the macro’s definition environment; identifiers provided by the caller resolve in the caller’s environment. No collision is possible by construction, and no gensym is needed:
(define-syntax my-or
  (syntax-rules ()
    ((_ a b)
     (let ((tmp a))
       (if tmp tmp b)))))

(let ((tmp 42))
  (my-or #f tmp)) ; => 42, correct -- the macro's 'tmp' is distinct
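For a direct comparison with the broken Common Lisp macro shown earlier, here is the same swap written with syntax-rules, as a small sketch in R7RS-small. The wrapper procedure `swapped-pair` is a hypothetical name added only so the result can be inspected.

```scheme
(import (scheme base) (scheme write))

;; Same shape as the naive Common Lisp version, with no gensym.
(define-syntax swap!
  (syntax-rules ()
    ((_ a b)
     (let ((tmp a))
       (set! a b)
       (set! b tmp)))))

;; The caller deliberately uses a variable that is also called `tmp`.
(define (swapped-pair)
  (let ((tmp 10) (x 20))
    (swap! tmp x)
    (list tmp x)))

(display (swapped-pair)) ; prints (20 10)
(newline)
```

The expander keeps the macro's `tmp` distinct from the caller's `tmp`, so both variables are actually exchanged.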
The R7RS-small standard (2013) retains syntax-rules as the macro system for the small language. Relative to R5RS, it adds the _ wildcard pattern and permits more flexible ellipsis placement. syntax-case, the more powerful procedural macro system from R6RS (2007), is deferred to R7RS-large, which as of 2026 remains unratified as a single document. The Kohlbecker et al. paper “Hygienic Macro Expansion” (LFP ’86) first formalized the concept; the core insight was that the expander should track definition-time scope rather than leaving it to the programmer.
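The two R7RS additions mentioned above can be illustrated with a short sketch. The macro names `third-of` and `last-of` are hypothetical, chosen only for this example.

```scheme
(import (scheme base) (scheme write))

;; `_` matches any form without binding it, so it may appear repeatedly.
;; (A repeated ordinary pattern variable would be an error.)
(define-syntax third-of
  (syntax-rules ()
    ((_ _ _ c) c)))

;; A pattern may continue after the ellipsis: `final` matches the last form.
(define-syntax last-of
  (syntax-rules ()
    ((_ x ... final) final)))

(display (third-of 1 2 3))  ; prints 3
(newline)
(display (last-of 1 2 3 4)) ; prints 4
(newline)
```

In both macros the unmatched forms are simply dropped from the expansion, so they are never evaluated.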
syntax-rules uses pattern matching. Ellipsis patterns (...) match zero or more subforms, which makes it possible to write macros that accept variable numbers of declarations:
(define-syntax my-let
  (syntax-rules ()
    ((_ ((var val) ...) body ...)
     ((lambda (var ...) body ...) val ...))))
This expands my-let into a direct lambda application with no intermediate allocation, no runtime dispatch, and no possibility of name collision in the generated bindings.
Generating a Repository with Macros
The article demonstrates using these hygienic macros to define the repository pattern in Scheme. The core insight is that a macro can generate several related definitions from a single declarative form, the same way a Java annotation processor generates boilerplate or Haskell’s persistent library generates type-safe query functions from a schema declaration via Template Haskell.
From a functional standpoint, the generated repository is a record of procedures, not an object. Each backend is a constructor function that closes over its state and returns a dispatch structure:
;; Hash-table procedures follow SRFI 69. user-id, handle-user and
;; make-sqlite3-user-repo are assumed to be defined elsewhere.
(define (make-in-memory-user-repo)
  (let ((store (make-hash-table equal?)))
    (list
     (cons 'find-by-id (lambda (id) (hash-table-ref/default store id #f)))
     (cons 'save! (lambda (user) (hash-table-set! store (user-id user) user)))
     (cons 'delete! (lambda (id) (hash-table-delete! store id))))))

;; Business logic is agnostic to the backend
(define (process-user repo id)
  (let ((user ((cdr (assoc 'find-by-id repo)) id)))
    (if user
        (handle-user user)
        (error "not found" id))))

;; In tests:
(process-user (make-in-memory-user-repo) 42)

;; In production:
(process-user (make-sqlite3-user-repo db-conn) 42)
A macro can generate this entire structure from a compact declaration, enforcing that every backend implements the same set of operations and producing consistent dispatch helpers. The generated code is fully transparent: you can macroexpand the declaration and read exactly what the expander produced. This is not possible with Java reflection or runtime proxies, where the generated behaviour lives behind an opaque layer.
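What such a macro might look like can be sketched in R7RS-small as follows. This is not the macro from the original post: `define-repository`, its declaration syntax, and the helper names `user-find`, `user-save!` and `user-delete!` are all assumptions made for illustration. Because syntax-rules cannot build new identifiers by concatenation, the declaration names each helper explicitly.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical macro: one declaration yields a constructor plus one
;; dispatch helper per operation.
(define-syntax define-repository
  (syntax-rules ()
    ((_ constructor (helper op) ...)
     (begin
       ;; The constructor takes one procedure per declared operation, in
       ;; declaration order, so a backend cannot leave one out.
       (define (constructor op ...)
         (list (cons 'op op) ...))
       ;; Each helper looks its operation up in the alist and applies it.
       (define (helper repo . args)
         (apply (cdr (assq 'op repo)) args))
       ...))))

(define-repository make-user-repo
  (user-find    find-by-id)
  (user-save!   save!)
  (user-delete! delete!))

;; In-memory backend; users are (id . name) pairs in a shared alist.
(define (make-in-memory-user-repo)
  (let ((store '()))
    (define (without id)
      (let loop ((rest store) (kept '()))
        (cond ((null? rest) (reverse kept))
              ((eqv? (caar rest) id) (loop (cdr rest) kept))
              (else (loop (cdr rest) (cons (car rest) kept))))))
    (make-user-repo
     (lambda (id) (cond ((assv id store) => cdr) (else #f)))
     (lambda (id name) (set! store (cons (cons id name) (without id))))
     (lambda (id) (set! store (without id))))))

(define repo (make-in-memory-user-repo))
(user-save! repo 42 "ada")
(display (user-find repo 42)) ; prints ada
(newline)
(user-delete! repo 42)
(display (user-find repo 42)) ; prints #f
(newline)
```

The `repo` and `args` parameters inside each generated helper are introduced by the macro, so they cannot clash with names in the declaring module. A production backend would call the same `make-user-repo` with closures over a database connection.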
Hygiene matters here in a concrete way. If you define multiple repositories in the same module, the expander ensures that identifiers the macro introduces inside its expansion, such as parameter names and temporaries, do not collide with each other across definitions or with names in the surrounding code. The macro author describes the structure; the expander manages name uniqueness. This is the opposite of the annotation-processor model, where the generator must be careful to produce globally unique class names.
Why This Matters Beyond Scheme
The macro approach exposes something that framework-based repository implementations tend to obscure: the pattern is fundamentally about generating code from a specification. In Java, the specification is the interface declaration combined with annotation metadata, and a framework typically reads it at runtime using reflection, or an annotation processor reads it at build time. In Haskell, the specification is the type class definition, and the compiler generates dispatch tables at compile time. In Scheme, the specification is the macro call, and the expander generates definitions at expansion time: the macro itself adds no runtime overhead beyond the procedure lookup it generates, and the result is fully transparent as source.
This is what Lisp programmers mean when they describe macros as a tool for building language abstractions rather than just reducing boilerplate. The repository macro is not a shorthand for writing the same code faster. It is a facility that enforces a structural constraint, generates coherent sets of definitions from a single declaration, and does so within the language itself, without reaching for an external code generator or a runtime reflection API.
Scheme is a small language by design. R7RS-small fits in under 90 pages. There is no ORM, no DI container, no annotation framework. There is a hygienic macro system, first-class functions, and closures. The repository pattern emerges from composing those three things, and the result is arguably cleaner than most framework-based versions because every part of it is visible and inspectable. The pattern does not require the language to be extended; it requires the language to be understood.