Static Graphs Are Snapshots: What Raphtory Gets Right About Time

Most graph databases and libraries operate on a snapshot: a fixed set of nodes and edges representing relationships at some point, or aggregated across all time. For a lot of problems this is fine. But for a surprising number of real-world questions, collapsing time into a static structure destroys exactly the information you need.

Raphtory is a temporal graph engine built to address that gap directly. Its source is on GitHub, it has a system paper on arXiv, and the core team gave a solid, accessible explanation on Computerphile that is worth watching before diving into the codebase. The engine is written in Rust with Python bindings via PyO3, and it ships an optional GraphQL server for deployment.

The Problem With Static Aggregation

Consider a contact-tracing scenario. You have a graph of people where edges represent physical proximity events, each with a timestamp. You want to know: starting from a given infected individual, who could plausibly have been infected?

In a static graph, you aggregate all edges regardless of time and ask for reachable nodes. That gives you a wildly inflated answer. Suppose person A met person B on Monday, and person B met person C on the Sunday before. In the static projection, C is reachable from A. But if A infected B on Monday, B’s contact with C had already happened by then, so C could not have been infected through that path. The arrow of time matters.

The correct concept here is a time-respecting path: a sequence of edges where each hop occurs after the previous one. The set of nodes reachable from a source via time-respecting paths is often dramatically smaller than the static reachable set, and it is the set that actually matters for modeling spread, influence, or sequential dependency.
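To make the difference concrete, here is a minimal pure-Python sketch of static versus time-respecting reachability. It does not use Raphtory at all; the function names, the `(time, src, dst)` tuple layout, and the choice to treat contacts as directed are assumptions made for illustration. Days are encoded as integers, with Sunday as 0 and Monday as 1.

```python
from collections import defaultdict

def static_reachable(edges, source):
    """Nodes reachable from source when timestamps are ignored."""
    adj = defaultdict(set)
    for _, u, v in edges:
        adj[u].add(v)
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def temporal_reachable(edges, source):
    """Nodes reachable from source via time-respecting paths.

    Edges are scanned in timestamp order. An edge (t, u, v) can be used
    only if u was already reached strictly before t.
    """
    earliest = {source: float("-inf")}  # node -> earliest arrival time
    for t, u, v in sorted(edges):
        if u in earliest and earliest[u] < t and v not in earliest:
            earliest[v] = t
    return set(earliest)

# The example from the text: A meets B on Monday (1), B met C on Sunday (0).
contacts = [(1, "A", "B"), (0, "B", "C")]

print(sorted(static_reachable(contacts, "A")))    # ['A', 'B', 'C']
print(sorted(temporal_reachable(contacts, "A")))  # ['A', 'B']
```

A single pass is enough here because edges are visited in time order, so the first time a node is reached is also the earliest time it can be reached.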

This is not an edge case. Fraud rings, information cascades, lateral movement in network intrusions, supply chain failure propagation, citation influence: all of these require reasoning about paths where temporal order is part of what makes a path valid. Static graph tools simply do not model this.

Raphtory’s Data Model

Raphtory models a graph as a stream of timestamped events. Each node or edge addition carries an integer or float timestamp, and edges can carry time-varying properties as well. The underlying storage keeps a sorted list of events per node and per edge, which makes the data structure append-friendly and enables binary search for time-range queries.
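The idea of a sorted per-entity event list can be sketched in a few lines of plain Python. This is an illustration of the general technique, not Raphtory's actual storage layout; the `EventLog` class and its method names are hypothetical.

```python
import bisect

class EventLog:
    """Sorted timestamps for one node or edge (illustrative only)."""

    def __init__(self):
        self._times = []

    def add(self, t):
        # In-order arrivals land at the end of the list, which is cheap.
        # Late events are still placed at their correct sorted position.
        bisect.insort(self._times, t)

    def count_in(self, start, end):
        """Number of events with start <= t < end, via two binary searches."""
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_left(self._times, end)
        return hi - lo

    def active_in(self, start, end):
        """True if at least one event falls inside [start, end)."""
        return self.count_in(start, end) > 0

log = EventLog()
for t in (1, 3, 5, 4):  # 4 arrives out of order
    log.add(t)

print(log.count_in(1, 3))    # 1
print(log.count_in(3, 6))    # 3
print(log.active_in(6, 10))  # False
```

Because the list stays sorted, answering "was this entity active in the window?" never requires scanning the full history.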

The core abstraction is the windowed view: given any graph, you can create a sub-view restricted to events within a time range without copying data.

from raphtory import Graph

g = Graph()
g.add_edge(1, "Alice", "Bob", properties={"weight": 1.0})
g.add_edge(3, "Bob", "Carol")
g.add_edge(5, "Alice", "Carol")

# Only edges with timestamps in [1, 3)
w = g.window(1, 3)
print(list(w.edges))  # Only Alice->Bob

Because the event lists are sorted, this windowing is O(log n) per node or edge lookup, not O(n). You are not materializing a copy; you are constructing a view that interprets the underlying data through a time filter. This is the design decision that makes iterative temporal analysis feasible: running an algorithm over a rolling window does not require n full copies of the graph, just n re-evaluations of the same event store with different bounds.
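The "view, not copy" idea can be sketched without Raphtory. In the sketch below, `EdgeStore`, `WindowView`, and `rolling` are hypothetical names chosen for illustration, and the structure is a simplification rather than a description of Raphtory's internals. Each view holds only a reference to the shared store plus its bounds.

```python
import bisect

class EdgeStore:
    """Shared event store: one sorted timestamp list per (src, dst) pair."""

    def __init__(self):
        self.events = {}

    def add_edge(self, t, src, dst):
        bisect.insort(self.events.setdefault((src, dst), []), t)

class WindowView:
    """A view is a reference to the store plus [start, end) bounds."""

    def __init__(self, store, start, end):
        self.store, self.start, self.end = store, start, end

    def edges(self):
        for pair, times in self.store.events.items():
            # Binary search for the first event at or after start.
            i = bisect.bisect_left(times, self.start)
            if i < len(times) and times[i] < self.end:
                yield pair

def rolling(store, first, last, width):
    """Yield consecutive non-overlapping views covering [first, last)."""
    for start in range(first, last, width):
        yield WindowView(store, start, min(start + width, last))

store = EdgeStore()
store.add_edge(1, "Alice", "Bob")
store.add_edge(3, "Bob", "Carol")
store.add_edge(5, "Alice", "Carol")

# Three windows, [1, 3), [3, 5), [5, 7), all backed by the same store.
snapshots = [sorted(view.edges()) for view in rolling(store, 1, 7, 2)]
print(snapshots)
```

Every view yielded by `rolling` points at the same `store` object, so adding more windows costs a few integers each rather than another copy of the graph.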

The Python API mirrors the Rust API closely, because the Python classes are thin PyO3 wrappers around Rust structs. Method calls cross the PyO3 boundary with minimal overhead, and pure-computation algorithms run in Rust threads that release the Python GIL. Results come back as Python dicts or Pandas DataFrames.

Algorithms and Temporal Motifs

Raphtory ships a library of algorithms that operate on these windowed views. The standard suite includes PageRank, HITS, weakly and strongly connected components, Louvain community detection, degree centralities, and shortest paths. What makes these meaningful in Raphtory is that they run on a time-restricted view of the graph, so you can observe how community structure or centrality rankings evolve across time windows without writing the windowing logic yourself.

The more distinctive capability is temporal motif counting. A temporal motif is a small subgraph with a specified time ordering on its edges. A two-path motif, for example, requires that edge (A, B) occurs before edge (B, C) within some time delta. A temporal triangle requires three nodes to interact in a specific sequence within a window.

Counting these motifs in a static graph is straightforward subgraph matching. Counting them in a temporal graph requires tracking the ordering of events, which most graph tools do not support as a primitive. Raphtory implements temporal motif counting as a first-class algorithm, which means you can do things like identify accounts that form rapid sequential transfer rings (fraud), or nodes that consistently relay information within short time windows (influence nodes in a cascade), without writing custom event-sequence logic from scratch.
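For intuition, the simplest of these motifs, the two-path, can be counted with a short pure-Python sketch. This is not Raphtory's motif algorithm; `count_two_paths` and the `(time, src, dst)` event layout are assumptions for illustration, and the sketch also counts paths that return to their starting node.

```python
import bisect
from collections import defaultdict

def count_two_paths(edges, delta):
    """Count event pairs (a->b at t1, b->c at t2) with t1 < t2 <= t1 + delta."""
    out_times = defaultdict(list)  # node -> timestamps of its outgoing events
    for t, src, _ in edges:
        out_times[src].append(t)
    for times in out_times.values():
        times.sort()

    total = 0
    for t1, _, b in edges:
        times = out_times.get(b, [])
        lo = bisect.bisect_right(times, t1)          # skip events at or before t1
        hi = bisect.bisect_right(times, t1 + delta)  # keep events up to t1 + delta
        total += hi - lo
    return total

events = [(1, "A", "B"), (2, "B", "C"), (10, "B", "D"), (3, "C", "A")]

print(count_two_paths(events, 5))   # 2
print(count_two_paths(events, 20))  # 3
```

Ignoring time entirely, the same four edges contain four two-paths, so even this toy example shows the ordering constraint pruning matches.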

The Rust Core and PyO3 Pattern

The architecture of Raphtory, a Rust core with Python bindings via PyO3, is increasingly the standard for high-performance Python libraries. Polars uses it, and so does Pydantic v2; ruff and uv follow a related approach, shipping Rust binaries as Python packages. The pattern works well because Rust gives you memory safety without a garbage collector, LLVM-quality codegen, and straightforward multi-threading. PyO3 handles the bridge, including managing Python reference counts from Rust so that objects shared across the boundary stay alive under both Rust’s ownership model and Python’s GC.

For Raphtory specifically, the graph data lives in Rust-owned memory. The Python Graph object holds an Arc<RwLock<GraphStorage>> under the hood. Windowed views hold a reference to that same storage plus their time bounds. When an algorithm runs, it operates entirely in Rust, bypassing the GIL and letting Python threads do other work concurrently. Only the return values cross the boundary, typically as Python dicts or Arrow-backed DataFrames.

For disk persistence, Raphtory uses an Apache Arrow-backed format via the raphtory-disk-graph crate. This means graphs that do not fit in RAM can be memory-mapped from disk, and the columnar format compresses temporal data efficiently since event timestamps tend to be monotonic and sorted.

The paper reports ingest speeds around 5 to 10 million edges per second from CSV on a modern machine, and algorithm performance 10 to 100 times faster than NetworkX on multi-million edge graphs. Those numbers are plausible given the Rust implementation; NetworkX is pure Python and not designed for throughput.

The GraphQL Layer

For deployment, Raphtory ships raphtory-graphql, a standalone binary built on Actix-web and async-graphql. You load a persisted graph, start the server, and external applications can query nodes, edges, windowed subgraphs, and algorithm results over HTTP. This is a practical bridge between the analytical engine and production applications: a dashboard, an API, or an alerting system can pull temporal graph analytics without needing to embed the Rust or Python runtime directly.

Where It Fits in the Ecosystem

NetworkX is the obvious Python comparison. It is flexible and easy to use, but it is pure Python, stores everything in memory as Python dicts, has no native time support, and gets slow around a million edges. On large graphs Raphtory is faster by one to two orders of magnitude, going by the figures above, and it adds the temporal dimension NetworkX does not have.

Neo4j is the leading graph database. It supports timestamps as properties but has no native temporal windowing or time-respecting path primitives. You can simulate some temporal queries in Cypher, but the query planner is not built for them, and temporal motif counting would require complex custom logic or external processing.

TigerGraph and Amazon Neptune support some temporal features but are primarily distributed, enterprise-oriented systems. They are solving a different scale problem and carry corresponding operational complexity.

GraphX on Spark and Apache Flink’s graph processing target distributed batch and stream scenarios. They are the right tools if you have hundreds of billions of edges and a cluster. For a single-node workload up to a few billion edges, Raphtory’s simpler deployment and richer temporal API are likely a better fit.

The honest limitation of Raphtory today is that it is single-node. Graphs that genuinely require distributed execution are out of reach for now; the team at Pometry has indicated that distributed execution is on the roadmap. For most research workloads and many production use cases in fraud, cybersecurity, and social analysis, single-node with disk persistence is sufficient.

Practical Use Cases

Fraud detection is the use case I find most compelling. A fraudster opens a set of accounts and moves money through a sequence of intermediaries to obscure the source. In a static graph, ring structures are common enough that simple pattern matching generates too many false positives. In a temporal graph, you add the constraint that each transfer in the ring must happen in sequence within a short time window. That constraint eliminates most of the static false positives and surfaces the suspicious sequential rings that matter.
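A rough sketch of that constraint, in plain Python rather than Raphtory's motif API, looks like this. The function name `sequential_rings`, the three-account ring size, and the `(time, sender, recipient)` layout are all assumptions for illustration.

```python
from collections import defaultdict

def sequential_rings(transfers, max_span):
    """Find rings a -> b -> c -> a whose transfers occur in strictly
    increasing time order, all within max_span of the first transfer."""
    out = defaultdict(list)  # account -> [(time, recipient), ...]
    for t, src, dst in transfers:
        out[src].append((t, dst))

    rings = []
    for t1, a, b in transfers:
        deadline = t1 + max_span
        for t2, c in out[b]:
            if not (t1 < t2 <= deadline) or c in (a, b):
                continue
            for t3, back in out[c]:
                if back == a and t2 < t3 <= deadline:
                    rings.append(((a, b, c), (t1, t2, t3)))
    return rings

transfers = [
    (1, "X", "Y"), (2, "Y", "Z"), (3, "Z", "X"),     # sequential ring
    (10, "P", "Q"), (5, "Q", "R"), (20, "R", "P"),   # ring only in the static view
]

print(sequential_rings(transfers, 5))
```

The P, Q, R accounts form a cycle in the static projection, but no starting point yields three transfers in increasing time order, so only the X, Y, Z ring is reported. Each ring is anchored at its earliest transfer, which avoids reporting the same ring once per rotation.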

Cybersecurity lateral movement works similarly. An attacker compromises a machine, uses credentials from that machine to pivot to another, and eventually reaches a high-value target. The detection question is: find all time-respecting paths from known initial compromise indicators to crown-jewel systems. Static reachability massively overapproximates; temporal reachability finds the actual pivot chains.
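As a sketch of that detection question, the following plain-Python function enumerates time-respecting chains between two hosts. It is a naive depth-first search with exponential worst-case cost, not something Raphtory provides under this name; `pivot_chains`, the host names, and the `(time, from_host, to_host)` login layout are hypothetical.

```python
from collections import defaultdict

def pivot_chains(logins, source, target):
    """Enumerate time-respecting paths from source to target."""
    out = defaultdict(list)  # host -> [(time, next_host), ...]
    for t, src, dst in logins:
        out[src].append((t, dst))

    chains = []

    def extend(host, last_time, path, visited):
        if host == target:
            chains.append(list(path))
            return
        for t, nxt in out.get(host, []):
            # Each hop must happen strictly after the previous one.
            if t > last_time and nxt not in visited:
                path.append((t, host, nxt))
                visited.add(nxt)
                extend(nxt, t, path, visited)
                visited.remove(nxt)
                path.pop()

    extend(source, float("-inf"), [], {source})
    return chains

logins = [
    (1, "laptop", "fileserver"),
    (4, "fileserver", "dc"),
    (2, "laptop", "jumpbox"),
    (1, "jumpbox", "dc"),   # happened before the attacker reached jumpbox
]

print(pivot_chains(logins, "laptop", "dc"))
```

Static reachability would report two routes to the domain controller; only the route through the file server respects the order of events.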

Epidemiology and contact tracing are the textbook examples, but the same logic applies anywhere you have events that propagate through a network sequentially, including information diffusion, cascading failures in infrastructure, and dependency propagation in software supply chains.

What It Is

Raphtory is a focused tool that takes one idea seriously: time is not metadata on a graph, it is part of the graph’s structure. By building that assumption into the storage model, the query API, and the algorithm library, it enables a class of analyses that would require substantial custom work in any other system. The Rust core keeps it performant without requiring the operational overhead of a distributed system. The Python interface keeps it accessible. The project is still maturing, particularly around distributed scale and community size, but the core design is sound, and the problem it solves is real.