The Kalman filter (Kalman, 1960) is the optimal Bayesian estimator of the hidden state of a linear-Gaussian state-space model given a stream of noisy measurements. It assumes:
$$x_t = F x_{t-1} + B u_t + w_t, \qquad w_t \sim \mathcal{N}(0, Q)$$ $$z_t = H x_t + v_t, \qquad v_t \sim \mathcal{N}(0, R)$$
where $x_t \in \mathbb{R}^n$ is the latent state (e.g. position and velocity), $u_t$ is a known control input, $z_t \in \mathbb{R}^m$ is the measurement, and $F, B, H, Q, R$ are the dynamics, control, observation, process-noise covariance, and measurement-noise covariance matrices.
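As a concrete instance of this model, the sketch below simulates a hypothetical 1-D constant-velocity tracker (state = position and velocity, position observed with noise). The specific matrices `F`, `B`, `H`, `Q`, `R` and the `simulate` helper are illustrative choices, not part of the definition above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed 1-D constant-velocity model: state = [position, velocity].
dt = 1.0
F = np.array([[1.0, dt],
              [0.0, 1.0]])      # dynamics
B = np.array([[0.5 * dt**2],
              [dt]])            # control maps an acceleration input
H = np.array([[1.0, 0.0]])      # we observe position only
Q = 0.01 * np.eye(2)            # process-noise covariance
R = np.array([[0.25]])          # measurement-noise covariance

def simulate(x0, us, steps):
    """Draw one state trajectory and its measurements from the model."""
    xs, zs = [], []
    x = x0
    for t in range(steps):
        w = rng.multivariate_normal(np.zeros(2), Q)
        x = F @ x + B @ us[t] + w        # x_t = F x_{t-1} + B u_t + w_t
        v = rng.multivariate_normal(np.zeros(1), R)
        z = H @ x + v                    # z_t = H x_t + v_t
        xs.append(x)
        zs.append(z)
    return np.array(xs), np.array(zs)

xs, zs = simulate(np.zeros(2), np.zeros((50, 1)), 50)
```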
The filter maintains a Gaussian belief $\mathcal{N}(\hat x_{t|t}, P_{t|t})$ over the current state and updates it in two steps each timestep.
Predict step: propagate the belief forward through the dynamics:
$$\hat x_{t|t-1} = F \hat x_{t-1|t-1} + B u_t$$ $$P_{t|t-1} = F P_{t-1|t-1} F^\top + Q$$
The covariance grows because process noise $Q$ is added.
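The predict step is a direct transcription of the two equations above; here is a minimal sketch (the toy identity-dynamics check at the end is an assumption chosen to make the covariance growth visible):

```python
import numpy as np

def predict(x_est, P, F, B, Q, u):
    """Kalman predict step: push the Gaussian belief through the dynamics."""
    x_pred = F @ x_est + B @ u          # \hat x_{t|t-1}
    P_pred = F @ P @ F.T + Q            # P_{t|t-1}
    return x_pred, P_pred

# Toy check: identity dynamics, no control. Prediction alone can only
# add uncertainty, so the covariance grows by exactly Q.
F = np.eye(2)
B = np.zeros((2, 1))
Q = 0.1 * np.eye(2)
x_pred, P_pred = predict(np.zeros(2), np.eye(2), F, B, Q, np.zeros(1))
# P_pred == I + Q == 1.1 * I
```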
Update step: incorporate the new measurement $z_t$. First compute the innovation $y_t = z_t - H \hat x_{t|t-1}$ and its covariance $S_t = H P_{t|t-1} H^\top + R$. Then form the Kalman gain:
$$K_t = P_{t|t-1} H^\top S_t^{-1}$$
and combine prediction with measurement:
$$\hat x_{t|t} = \hat x_{t|t-1} + K_t y_t$$ $$P_{t|t} = (I - K_t H) P_{t|t-1}$$
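The update step can likewise be sketched as a small function; the scalar check at the end (equal prior and measurement variances) is an assumed example showing the posterior splitting the difference:

```python
import numpy as np

def update(x_pred, P_pred, H, R, z):
    """Kalman update step: fold measurement z into the predicted belief."""
    y = z - H @ x_pred                        # innovation
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
    x_est = x_pred + K @ y
    P_est = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_est, P_est

# Toy check: scalar state observed directly, prior variance == R == 0.5.
# The gain is 0.5, so the estimate lands halfway between prediction (0)
# and measurement (1), and the variance halves to 0.25.
H = np.array([[1.0]])
R = np.array([[0.5]])
x_est, P_est = update(np.zeros(1), np.array([[0.5]]), H, R, np.array([1.0]))
```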
The Kalman gain is the optimal weighting between trusting the prediction and trusting the measurement: when $P_{t|t-1}$ is small relative to $R$, $K_t$ is small and the filter largely keeps its prediction; when $R$ is small relative to $P_{t|t-1}$, $K_t$ approaches $H^{-1}$ (when $H$ is invertible) and the filter largely adopts the measurement.
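The two limiting cases of the gain can be checked numerically; the scalar setup below (with $H = 1$, so $H^{-1} = 1$) is an assumed example:

```python
import numpy as np

H = np.array([[1.0]])
P_pred = np.array([[1.0]])

def gain(P_pred, H, R):
    """Kalman gain K = P H^T (H P H^T + R)^{-1}."""
    S = H @ P_pred @ H.T + R
    return P_pred @ H.T @ np.linalg.inv(S)

K_noisy = gain(P_pred, H, np.array([[100.0]]))  # R >> P: gain ~ 0
K_clean = gain(P_pred, H, np.array([[1e-6]]))   # R << P: gain ~ H^{-1} = 1
```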
Three properties give the filter its power. It is optimal in the minimum mean-squared error sense for linear-Gaussian systems, and the maximum-a-posteriori estimator coincides with the conditional mean. It is recursive: the new state estimate depends only on the previous estimate and the current measurement, so memory and compute per step are constant in $t$. And it is closed-form: no sampling or iteration is required.
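The recursive property is easiest to see in a full filtering loop: the loop below (a sketch on an assumed scalar random-walk model with no control term) carries only the pair $(\hat x, P)$ from step to step, so memory and compute per step do not grow with $t$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed scalar random walk: x_t = x_{t-1} + w_t, z_t = x_t + v_t.
F = np.array([[1.0]])
H = np.array([[1.0]])
Q = np.array([[0.01]])
R = np.array([[1.0]])

x_true = 0.0
x_est, P = np.zeros(1), np.array([[1.0]])
for _ in range(200):
    # Simulate one step of the true system and its measurement.
    x_true += rng.normal(0.0, np.sqrt(Q[0, 0]))
    z = np.array([x_true + rng.normal(0.0, np.sqrt(R[0, 0]))])
    # Predict (this model has no control input).
    x_pred = F @ x_est
    P_pred = F @ P @ F.T + Q
    # Update.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_est = x_pred + K @ (z - H @ x_pred)
    P = (np.eye(1) - K @ H) @ P_pred
# Only (x_est, P) survive each iteration; P converges to a steady-state
# value set by Q, R, and the dynamics, not by how many steps were run.
```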
The Kalman filter underlies the Apollo guidance computer (its first famous application), GPS receivers, inertial navigation systems, radar tracking, financial time-series filtering, and almost any sensor-fusion problem with approximately Gaussian noise. When the system is non-linear, the extended Kalman filter linearises the dynamics and observation around the current estimate, and the unscented Kalman filter propagates a small set of sigma points; for severely non-Gaussian or multimodal posteriors, the particle filter replaces the Gaussian belief with a weighted set of samples. But the linear-Gaussian Kalman filter remains the foundation: the cleanest, most efficient case of recursive Bayesian estimation, and the model against which all extensions are measured.
Related terms: Extended Kalman Filter, Particle Filter, SLAM, Gaussian Distribution, Bayes' Theorem
Discussed in:
- Chapter 12: Sequence Models, Robotics and Control