Two Noisy Sources Are Better Than One: What the Kalman Filter Gets Right

Rudolf Kalman published his seminal paper in 1960 in the Journal of Basic Engineering, and the filter that bears his name was running on Apollo spacecraft computers within a decade. Sixty-five years later, a tutorial about understanding it with a simple radar example is sitting near the top of Hacker News with 400 points. That longevity is worth thinking about. The kalmanfilter.net resource that generated the discussion is a clean introduction, but the reason the concept keeps surfacing is that the underlying problem it solves, estimating hidden state from noisy, incomplete observations, appears everywhere in software and hardware systems.

The Problem Before the Solution

Consider a radar tracking an aircraft. Every few milliseconds, the radar measures range and bearing. Those measurements are noisy. The physics of the aircraft gives you a model for how it should move, but the model is imperfect too. You have two imperfect information sources: the sensor readings and your motion model. The naive approaches are obvious and obviously wrong. Trusting only the model ignores useful measurement data. Trusting only the raw measurements produces a jittery estimate that cannot predict where the target will be between pings.

The Kalman filter resolves this by maintaining a probability distribution over the unknown state rather than a single point estimate. It tracks not just the best guess for position and velocity, but also how uncertain that guess is. Every time step, it predicts where the state should be according to the model, then corrects that prediction using the measurement, weighting each source by its reliability.

State Space and the Two-Step Loop

The filter operates in a state space representation. For a simple 2D radar tracker, the state vector holds position and velocity:

x = [pos_x, pos_y, vel_x, vel_y]

The state transition matrix F encodes constant-velocity kinematics over time step dt:

import numpy as np

dt = 0.1  # seconds

F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0,  1, 0],
              [0, 0,  0, 1]])

The full system has two covariance matrices. Q is the process noise covariance, encoding uncertainty in the motion model. R is the measurement noise covariance, encoding uncertainty in the sensor. These are the two parameters you tune when deploying a Kalman filter, and getting them right requires understanding your system.

The prediction step advances the state estimate and inflates the covariance to reflect growing uncertainty:

# Predict
x_pred = F @ x
P_pred = F @ P @ F.T + Q

The update step incorporates the measurement. The key quantity is the Kalman gain K, which determines how much weight to give the new measurement versus the prediction:

# Observation matrix: we measure position only
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]])

# Innovation covariance
S = H @ P_pred @ H.T + R

# Kalman gain
K = P_pred @ H.T @ np.linalg.inv(S)

# Innovation: difference between measurement and prediction
y = z - H @ x_pred

# Update
x = x_pred + K @ y
P = (np.eye(4) - K @ H) @ P_pred

When measurement noise R is large relative to predicted covariance, K becomes small and the filter trusts its model more. When the model uncertainty has grown large, K becomes large and the filter leans on the measurement. This ratio is computed automatically from the covariances at each step, which is the core insight.

The Radar Nonlinearity Problem

For a linear system where the measurement is a linear function of the state, the linear Kalman filter is the optimal estimator under Gaussian noise. Radar measurements are not linear. The sensor returns range rho and bearing theta:

rho = sqrt(x^2 + y^2)
theta = atan2(y, x)

This is a nonlinear transformation from Cartesian state to polar observation. The standard Kalman filter breaks down here because its update step assumes the measurement model H is a matrix. The Extended Kalman Filter (EKF) handles this by linearizing around the current state estimate using the Jacobian of the measurement function:

def measurement_jacobian(x):
    px, py, vx, vy = x
    r = np.sqrt(px**2 + py**2)
    return np.array([
        [px/r,    py/r,    0, 0],
        [-py/r**2, px/r**2, 0, 0]
    ])

The EKF substitutes this Jacobian for H in the update equations. The approximation is good when the nonlinearity is mild and the uncertainty is small, but it can diverge badly in highly nonlinear regimes.

The Unscented Kalman Filter takes a different approach. Instead of linearizing the function, it propagates a set of carefully chosen sigma points through the nonlinear function and reconstructs the statistics on the other side. For a state of dimension n, it uses 2n+1 sigma points placed symmetrically around the mean. This captures the mean and covariance more accurately to third order in the Taylor expansion, versus first order for the EKF. The UKF does not require computing Jacobians, which matters when your measurement function is ugly or numerically unstable to differentiate.

What Particle Filters Do Instead

Both the EKF and UKF still assume the posterior distribution is approximately Gaussian. That assumption fails for systems with multimodal distributions, non-Gaussian noise, or severe discontinuities. Particle filters drop the Gaussian assumption entirely. They represent the distribution with a weighted set of samples, each propagated through the nonlinear dynamics and reweighted by measurement likelihood.

Particle filters are flexible and handle arbitrary distributions, but they scale poorly with state dimension. A Kalman filter with a 100-dimensional state still runs in polynomial time. A particle filter with 100 dimensions needs an exponentially large particle cloud to adequately cover the space. In practice, particle filters are used for lower-dimensional problems with non-Gaussian characteristics, such as robot localization in a known map where the robot might face ambiguity about which corridor it is in.

The complementary filter, common in IMU-based attitude estimation for drones and flight controllers, is simpler still. It fuses high-rate gyroscope integration with low-rate accelerometer correction using a fixed weighting coefficient. This is the Kalman filter with the gain frozen, which is computationally cheaper and works well when the noise characteristics are stable and well-characterized in advance.

The Apollo Context and Why It Matters

Kalman presented the filter to NASA scientists in 1960, and by the mid-1960s it was being implemented for the Apollo guidance computer. The constraints were brutal: the AGC had 4KB of RAM and ran at 1.024 MHz. The Kalman filter was attractive precisely because it is recursive. You do not need to store the entire history of measurements to compute the optimal estimate; you only need the previous state estimate and covariance. Each step has fixed memory and compute cost regardless of how long the mission has been running. That recursiveness is why it remains relevant in resource-constrained embedded systems today.

Modern implementations in Python are straightforward. The filterpy library provides clean implementations of linear KF, EKF, UKF, and particle filters with good documentation. For computer vision applications, OpenCV includes a cv2.KalmanFilter class that handles the common case of tracking objects across frames. The scipy signal module provides state-space model tools that can be used for Kalman filtering in batch mode.

Tuning and the Real Difficulty

The mathematics is not where people get stuck when deploying Kalman filters. The difficulty is parameter estimation: choosing Q and R to match your actual system. R is often measurable empirically by observing sensor output when the true state is known to be stationary. Q is harder because it represents model uncertainty, which depends on how much your dynamics model deviates from reality.

Poor Q choices have predictable failure modes. Set Q too small and the filter over-trusts its model, ignoring measurement data even when the model has drifted far from reality. Set Q too large and the filter behaves like raw measurement smoothing, losing the benefit of the motion model. Adaptive Kalman filter variants exist that estimate Q and R online from the innovation statistics, though they add complexity and can themselves be sensitive to initial conditions.

The innovation sequence, the series of y values from the update step, should be white noise with zero mean if the filter is performing correctly. Plotting its autocorrelation function is a standard diagnostic for detecting model mismatch.

Why This Keeps Coming Back

Sensor fusion is not going away. Every modern autonomous system, from delivery drones to surgical robots to the positioning stack in a mobile phone, combines multiple imperfect information sources into a coherent state estimate. The Kalman filter and its variants remain the dominant solution because they are analytically grounded, computationally tractable, and provide uncertainty estimates alongside state estimates. That last point is often undervalued: knowing how confident the estimate is matters as much as the estimate itself when downstream decisions depend on it.

The radar tutorial that surfaced on Hacker News is a good starting point for building intuition. The two-step predict-update loop is simple enough to implement from scratch in an afternoon, and doing so is the fastest way to internalize why the Kalman gain has the form it does. Once you see the mechanism clearly, you start recognizing the problem it solves in places you would not have looked before.