While our state vector $\mathbf{x}$ contains all the variables we are interested in tracking (position $p$ and velocity $v$), our sensors often measure only a subset of these, or a linear combination of them.
The relationship between the state and the noisy measurement $\mathbf{z}_k$ is defined by the observation equation:
$$\mathbf{z}_k = \mathbf{H}_k \mathbf{x}_k + \mathbf{v}_k$$
Here, $\mathbf{H}_k$ is the observation matrix and $\mathbf{v}_k$ is the measurement noise (defined below).
The matrix $\mathbf{H}$ dictates how the elements of the internal state vector $\mathbf{x} = \begin{bmatrix} p \\ v \end{bmatrix}$ are combined to produce the measurement $\mathbf{z}$.
In other words, $\mathbf{H}$ acts as a bridge between the state space and the measurement space:
Example A: Measuring Only Position ($p$)
If we use a GPS sensor that measures only the position ($p$) and nothing else, our measurement vector is $\mathbf{z}_k = [p_{measured}]$.
The matrix $\mathbf{H}$ must select the position from $\mathbf{x}_k$:
$$\mathbf{z}_k = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} p_k \\ v_k \end{bmatrix}$$
$$\mathbf{H} = \begin{bmatrix} 1 & 0 \end{bmatrix}$$
This matrix multiplication yields: $p_{measured} = 1 \cdot p_k + 0 \cdot v_k$, meaning the measurement is simply the position.
Example B: Measuring Position and Velocity
If we use a sophisticated sensor (e.g., radar or a highly integrated IMU) that provides independent measurements of both position ($p$) and velocity ($v$), the measurement vector is $\mathbf{z}_k = \begin{bmatrix} p_{measured} \\ v_{measured} \end{bmatrix}$.
The matrix $\mathbf{H}$ must select both elements:
$$\mathbf{z}_k = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} p_k \\ v_k \end{bmatrix}$$
$$\mathbf{H} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \mathbf{I} \text{ (Identity Matrix)}$$
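As a quick sanity check, both observation matrices can be verified in a few lines of NumPy (the state values here are arbitrary, chosen only for illustration):

```python
import numpy as np

x = np.array([[3.0],   # position p
              [1.5]])  # velocity v

# Example A: GPS measures position only
H_a = np.array([[1, 0]])
z_a = H_a @ x          # selects the position alone

# Example B: sensor measures position and velocity
H_b = np.eye(2)        # identity matrix
z_b = H_b @ x          # passes the full state through
```

Multiplying by `H_a` collapses the two-element state into a single position reading, while `H_b` leaves it untouched.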
The vector $\mathbf{v}_k$ (measurement noise) accounts for the fact that sensors are never perfectly accurate. This noise is modeled as a zero-mean Gaussian distribution, characterized by the Measurement Noise Covariance Matrix ($\mathbf{R}$):
$$\mathbf{R} = \mathbb{E}[\mathbf{v}_k \mathbf{v}_k^T]$$
$\mathbf{R}$ is a symmetric matrix that quantifies the sensor's inherent unreliability. For a single position measurement (Example A), $\mathbf{R}$ is a $1 \times 1$ matrix:
$$\mathbf{R} = \begin{bmatrix} \sigma^2_{\text{GPS}} \end{bmatrix}$$
This $\mathbf{R}$ matrix is the formal representation of the single-measurement variance ($\sigma^2_{\text{sensor}}$) discussed in the error calculation section.
Just like $\mathbf{P}$ and $\mathbf{Q}$, the diagonal of $\mathbf{R}$ contains the variances of the individual sensors, and the off-diagonal elements hold the covariances between sensor noises (usually zero for independent sensors).
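For Example B, with independent position and velocity sensors, $\mathbf{R}$ is a diagonal $2 \times 2$ matrix. A minimal sketch (the standard deviations are illustrative, not from the source):

```python
import numpy as np

sigma_pos = 1.2   # position sensor std dev (illustrative)
sigma_vel = 0.4   # velocity sensor std dev (illustrative)

# Independent sensors -> variances on the diagonal, zero covariance elsewhere
R = np.diag([sigma_pos**2, sigma_vel**2])
```

If the two sensor noises were correlated (e.g., two channels of the same instrument), the off-diagonal entries would carry their covariance instead of zero.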
The $\mathbf{R}$ matrix is used in the filter's Correction Step.
A smaller $\mathbf{R}$ means the sensor is highly reliable, causing the filter to trust the measurement more than the prediction.
A larger $\mathbf{R}$ means the sensor is noisy, causing the filter to rely more on its internal model prediction.
The Observation Model ($\mathbf{H}$ and $\mathbf{R}$) and the Prediction Model ($\mathbf{F}$ and $\mathbf{Q}$) together form the complete framework for the linear Kalman Filter. The Kalman Filter constantly compares the size of $\mathbf{P}$ (how much we trust our prediction) with the size of $\mathbf{R}$ (how much we trust our sensor). This comparison is what calculates the Kalman Gain ($K$) (see below).
Now that we have a Predicted State ($\mathbf{x}_k^-$) and an Actual Measurement ($\mathbf{z}_k$), the filter must decide how to fuse them. This happens in three logical steps.
1. The Innovation (Measurement Residual)
First, we calculate the difference between what the sensor actually saw ($\mathbf{z}_k$) and what it expected to see based on our prediction ($\mathbf{H}\mathbf{x}_k^-$). This is called the Innovation ($\mathbf{y}_k$):
$$\mathbf{y}_k = \mathbf{z}_k - \mathbf{H} \mathbf{x}_k^-$$
If $\mathbf{y}_k$ is zero, our prediction was perfect!
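Using the position-only model from Example A, the innovation is simply the measured position minus the predicted position. A small sketch with made-up numbers:

```python
import numpy as np

H = np.array([[1, 0]])
x_pred = np.array([[4.8],   # predicted position
                   [2.0]])  # predicted velocity
z = np.array([[5.1]])       # noisy GPS reading (illustrative)

y = z - H @ x_pred          # innovation: the sensor's "surprise"
```

Here the sensor saw the object 0.3 units ahead of where the model predicted it.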
2. The Kalman Gain ($\mathbf{K}_k$)
The Kalman Gain is the "Trust Factor." It is a weighting matrix that determines how much of that innovation error we should apply to our state.
$$\mathbf{K}_k = \mathbf{P}_k^- \mathbf{H}^T (\mathbf{H} \mathbf{P}_k^- \mathbf{H}^T + \mathbf{R})^{-1}$$
The Intuition of $\mathbf{K}$: if $\mathbf{R}$ is small (a trustworthy sensor), the gain grows and the filter leans heavily on the measurement; if $\mathbf{P}_k^-$ is small (a trustworthy prediction), the gain shrinks toward zero and the innovation is largely ignored.
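In the scalar, position-only case, the gain formula collapses to $K = P^- / (P^- + R)$, which makes the trust trade-off explicit. A sketch with arbitrary numbers:

```python
P_prior = 4.0  # variance of the prediction

# Reliable sensor: small R -> K near 1, correction follows the measurement
K_good = P_prior / (P_prior + 0.5)

# Noisy sensor: large R -> K near 0, correction sticks with the prediction
K_bad = P_prior / (P_prior + 40.0)
```

The gain is always between 0 and 1 in this scalar case: it is literally the fraction of the innovation that gets applied to the state.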
3. The State and Covariance Update
Finally, we update our "Best Guess" and our "Uncertainty":
Update State:
$$\hat{\mathbf{x}}_k^+ = \hat{\mathbf{x}}_k^- + \mathbf{K}_k \mathbf{y}_k$$ (New Guess = Old Guess + Gain $\times$ Error)
Update Uncertainty:
$$\mathbf{P}_k^+ = (\mathbf{I} - \mathbf{K}_k \mathbf{H}) \mathbf{P}_k^-$$ (New Uncertainty = Old Uncertainty - Information Gained)
| Term | Symbol | Intuition |
|---|---|---|
| Innovation | $\mathbf{y}$ | "How much did the sensor surprise us?" |
| Kalman Gain | $\mathbf{K}$ | "How much do we trust that surprise?" |
| A Posteriori State | $\hat{\mathbf{x}}^+$ | Our final, fused "truth" for this time step. |
| A Posteriori Covariance | $\mathbf{P}^+$ | Our updated (smaller) uncertainty after the measurement. |
The Correction Step is the only place in the filter where Uncertainty ($\mathbf{P}$) actually shrinks.
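A one-step sketch makes this asymmetry concrete: prediction inflates $\mathbf{P}$, correction deflates it. The model matches the position/velocity setup above; the noise values and measurement are illustrative only:

```python
import numpy as np

dt = 0.4
F = np.array([[1, dt], [0, 1]])
H = np.array([[1, 0]])
Q = np.eye(2) * 0.01          # illustrative process noise
R = np.array([[1.44]])        # illustrative sensor variance

x = np.array([[0.0], [2.0]])
P = np.eye(2) * 10.0

# PREDICT: uncertainty grows (F P F^T + Q adds spread)
x_p = F @ x
P_p = F @ P @ F.T + Q

# CORRECT: uncertainty shrinks (measurement adds information)
z = np.array([[1.0]])         # a made-up measurement
K = P_p @ H.T @ np.linalg.inv(H @ P_p @ H.T + R)
x = x_p + K @ (z - H @ x_p)
P = (np.eye(2) - K @ H) @ P_p
```

Comparing the traces of the covariance before and after each step shows `P_p` larger than the initial `P`, and the corrected `P` smaller than `P_p`.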
In our first example, the input $u_{k-1} = a_{k-1} = 0$.
With $u_{k-1} = 0$, the prediction equation simplifies to:

$$\mathbf{x}_k = \mathbf{F} \mathbf{x}_{k-1} + \mathbf{G} w_{k-1}$$

Expanding this into the individual components:

$$\begin{bmatrix} p \\ v \end{bmatrix}_k = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} p \\ v \end{bmatrix}_{k-1} + \begin{bmatrix} \frac{1}{2}\Delta t^2 \\ \Delta t \end{bmatrix} w_{k-1}$$
```python
import numpy as np
import matplotlib.pyplot as plt

# --- 1. SETUP ---
np.random.seed(420)
dt = 0.4
t = np.arange(0, 20, dt)
n_steps = len(t)

# Kalman/Physics Matrices
F = np.array([[1, dt], [0, 1]])    # state transition
G = np.array([[0.5*dt**2], [dt]])  # noise gain
H = np.array([[1, 0]])             # observe position only

# Noise Parameters
process_std_true = 0.5   # Standard deviation of the random "push" (w)
process_std = 0.5        # Process noise assumed by the Kalman filter
measurement_std = 1.2    # Standard deviation of the sensor noise (v)
Q = G @ G.T * process_std**2
R = np.array([[measurement_std**2]])

# --- 2. GENERATE CONSISTENT GROUND TRUTH ---
# We simulate the "Real World" using the exact physical model
true_x = np.array([[0.0], [2.0]])  # Initial position=0, velocity=2
true_pos, true_vel = [], []
for i in range(n_steps):
    # The world moves: x_k = F*x_{k-1} + G*w
    w = np.random.normal(0, process_std_true)
    true_x = F @ true_x + G * w
    true_pos.append(true_x[0, 0])
    true_vel.append(true_x[1, 0])

# Measurements are taken from the consistent truth
measurements = np.array(true_pos) + np.random.normal(0, measurement_std, n_steps)

# --- 3. THE KALMAN FILTER ---
x = np.array([[0.0], [0.0]])  # Initial guess (deliberately wrong velocity)
P = np.array([[10.0, 0.0], [0.0, 10.0]])
preds_pos, ests_pos, stds_pos = [], [], []
preds_vel, ests_vel, stds_vel = [], [], []
for i in range(n_steps):
    # PREDICT (A Priori)
    x_p = F @ x
    P_p = F @ P @ F.T + Q
    preds_pos.append(x_p[0, 0])
    preds_vel.append(x_p[1, 0])

    # CORRECT (A Posteriori)
    z = measurements[i]
    K = P_p @ H.T @ np.linalg.inv(H @ P_p @ H.T + R)
    x = x_p + K @ (z - H @ x_p)
    P = (np.eye(2) - K @ H) @ P_p
    ests_pos.append(x[0, 0])
    ests_vel.append(x[1, 0])
    stds_pos.append(np.sqrt(P[0, 0]))
    stds_vel.append(np.sqrt(P[1, 1]))

# --- 4. PLOTTING ---
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 12), sharex=True)

# Position Plot
ax1.plot(t, true_pos, 'g-', label='True Position (Consistent Physics)', linewidth=2)
ax1.errorbar(t, measurements, yerr=2*measurement_std, fmt='o', color='red',
             ecolor='lightcoral', elinewidth=1, capsize=2, label='Measurements (z) ± 2σ')
ax1.scatter(t, preds_pos, color='orange', marker='x', s=40, label='Predictions (x⁻)')
ax1.errorbar(t, ests_pos, yerr=2*np.array(stds_pos), fmt='o', color='blue',
             ecolor='lightblue', elinewidth=2, capsize=3, label='Estimates (x⁺) ± 2σ')
ax1.set_title('Consistent Position Tracking', fontsize=14)
ax1.legend()

# Velocity Plot
ax2.plot(t, true_vel, 'g-', label='True Velocity (Consistent Physics)', linewidth=2)
ax2.scatter(t, preds_vel, color='orange', marker='x', s=40, label='Predictions (x⁻)')
ax2.errorbar(t, ests_vel, yerr=2*np.array(stds_vel), fmt='o', color='blue',
             ecolor='lightblue', elinewidth=2, capsize=3, label='Estimates (x⁺) ± 2σ')
ax2.set_title('Consistent Velocity Inference', fontsize=14)
ax2.legend()
plt.tight_layout()
plt.show()
```