Process Noise: Modeling Model Imperfection

The prediction step we derived ($\mathbf{x}_k = \mathbf{F}_{k-1} \mathbf{x}_{k-1} + \mathbf{G} \mathbf{u}_{k-1}$) assumes our model is perfect: that the system is truly linear and that the acceleration $a_{k-1}$ is known and perfectly constant over the interval $\Delta t$.

In reality, neither of these is true. Process Noise accounts for unmodeled disturbances and inaccuracies.

The Full Prediction Equation

The general prediction equation including model imperfections is:

$$\mathbf{x}_k = \mathbf{F}_{k-1} \mathbf{x}_{k-1} + \mathbf{G}_{k-1} \mathbf{u}_{k-1} + \mathbf{L}_{k-1} \mathbf{w}_{k-1}$$

Where:

$\mathbf{x}_k$: $n \times 1$ State Vector.
$\mathbf{F}_{k-1}$: $n \times n$ State Transition Matrix.
$\mathbf{G}_{k-1}$: $n \times m$ Control Input Matrix.
$\mathbf{u}_{k-1}$: $m \times 1$ Control Input Vector.
$\mathbf{L}_{k-1}$: $n \times q$ Noise Gain Matrix.
$\mathbf{w}_{k-1}$: $q \times 1$ Process Noise Vector.
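
To make the dimensions concrete, here is a minimal NumPy sketch of a single evaluation of this equation. The function name `propagate_state`, the numeric values, and the constant-velocity form of $\mathbf{F}$ are illustrative assumptions (the matrix $\mathbf{F}$ comes from the earlier derivation and is not restated in this section).

```python
import numpy as np

def propagate_state(F, x_prev, G, u_prev, L, w_prev):
    """Evaluate x_k = F x_{k-1} + G u_{k-1} + L w_{k-1} for column vectors."""
    return F @ x_prev + G @ u_prev + L @ w_prev

# Illustrative values (assumed): n = 2 states, m = 1 input, q = 1 noise source.
dt = 0.1
F = np.array([[1.0, dt],
              [0.0, 1.0]])           # n x n, assumed constant-velocity model
G = np.array([[0.5 * dt**2],
              [dt]])                 # n x m
L = G.copy()                         # n x q, assumed equal to G for this sketch
x_prev = np.array([[0.0], [1.0]])    # n x 1: position 0 m, velocity 1 m/s
u_prev = np.array([[2.0]])           # m x 1: commanded acceleration in m/s^2
w_prev = np.array([[0.0]])           # q x 1: one noise sample (zero here)

x_k = propagate_state(F, x_prev, G, u_prev, L, w_prev)
print(x_k.ravel())
```

Keeping every quantity as an explicit column vector or matrix means a shape mismatch in any term raises an error immediately instead of silently broadcasting.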

What Does Process Noise Represent?

The term $\mathbf{w}_{k-1}$ represents any effect that causes the true state of the system to deviate from our model's prediction, including:

Model Error: The true acceleration is never perfectly constant; it jitters, changes slightly, or is affected by unmodeled forces (like wind resistance or friction).
Input Error: Errors in measuring or calculating the input $a_{k-1}$ itself.
Discretization Error: The error introduced by assuming a constant acceleration over a finite time step $\Delta t$.

Application to the 1D Point Mass (Scalar Case)

For our 1D point mass, we have only one input (acceleration) and one noise source (acceleration jitter). Here, we assume the noise enters the system through the exact same physical "door" as the control command, meaning $\mathbf{L} = \mathbf{G}$.

The 1D Prediction Equation:

$$\mathbf{x}_k = \mathbf{F} \mathbf{x}_{k-1} + \mathbf{G} (u_{k-1} + w_{k-1})$$

Here, $u_{k-1} = a_{k-1}$ and $w_{k-1}$ are scalars ($1 \times 1$), while $\mathbf{G}$ is the $2 \times 1$ matrix that maps that acceleration into position and velocity.

The acceleration noise is modeled as a zero-mean Gaussian random variable, $w_{k-1} \sim \mathcal{N}(0, \sigma_a^2)$, where $\sigma_a^2$ is the variance of the acceleration noise.
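
A rough simulation sketch of one step of the "true" system under this noise model is shown below. The values of `dt` and `sigma_a`, the helper name `simulate_true_step`, and the constant-velocity $\mathbf{F}$ are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seeded so the run is repeatable

dt = 0.1        # time step in seconds (assumed)
sigma_a = 0.5   # standard deviation of the acceleration noise in m/s^2 (assumed)

F = np.array([[1.0, dt],
              [0.0, 1.0]])            # assumed constant-velocity model
G = np.array([[0.5 * dt**2],
              [dt]])

def simulate_true_step(x_prev, u_prev):
    """Advance the true state one step with a fresh acceleration-noise sample."""
    # `scale` is the standard deviation sigma_a, not the variance sigma_a^2.
    w_prev = rng.normal(loc=0.0, scale=sigma_a)
    return F @ x_prev + G * (u_prev + w_prev)

x = np.array([[0.0], [1.0]])          # position 0 m, velocity 1 m/s
x_next = simulate_true_step(x, u_prev=2.0)
```

Note that the noise sample perturbs both position and velocity, and always in the fixed ratio set by the two entries of $\mathbf{G}$.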

The Process Noise Covariance Matrix ($\mathbf{Q}$)

Even though $w_{k-1}$ is a scalar, its effect on the state is 2-dimensional. While $w$ is just a "jitter in acceleration," that jitter causes a "jitter in position" and a "jitter in velocity."

$\mathbf{Q}$ defines:

How much position varies ($\sigma_p^2$).
How much velocity varies ($\sigma_v^2$).
How position and velocity vary together (the covariance $\sigma_{pv}$).

$\mathbf{Q}$ is the covariance matrix of the process noise after it has been mapped into the state space. It is the quantity the Kalman Filter uses to account for that noise.

For a zero-mean scalar noise $w$ with variance $\sigma_a^2$ entering through matrix $\mathbf{G}$, the definition is:

$$\mathbf{Q} = \mathbb{E}[(\mathbf{G}w)(\mathbf{G}w)^T] = \mathbf{G} \mathbb{E}[w^2] \mathbf{G}^T = \mathbf{G} \sigma_a^2 \mathbf{G}^T$$

Substituting our values:

$$\mathbf{Q} = \begin{bmatrix} \frac{1}{2}(\Delta t)^2 \\ \Delta t \end{bmatrix} \begin{bmatrix} \frac{1}{2}(\Delta t)^2 & \Delta t \end{bmatrix} \sigma_a^2 = \begin{bmatrix} \frac{1}{4}(\Delta t)^4 & \frac{1}{2}(\Delta t)^3 \\ \frac{1}{2}(\Delta t)^3 & (\Delta t)^2 \end{bmatrix} \sigma_a^2$$
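
The closed form above can be checked numerically. The sketch below builds $\mathbf{Q}$ from $\mathbf{G}$ and compares it with a Monte Carlo estimate of $\mathbb{E}[(\mathbf{G}w)(\mathbf{G}w)^T]$. The function name `process_noise_covariance` and the values of `dt` and `sigma_a` are assumptions for illustration.

```python
import numpy as np

def process_noise_covariance(dt, sigma_a):
    """Return Q = G sigma_a^2 G^T for scalar acceleration noise."""
    G = np.array([[0.5 * dt**2],
                  [dt]])
    return sigma_a**2 * (G @ G.T)

dt, sigma_a = 0.1, 0.5                 # assumed example values
Q = process_noise_covariance(dt, sigma_a)

# Monte Carlo check: average the outer product (G w)(G w)^T over many samples.
rng = np.random.default_rng(0)
w = rng.normal(0.0, sigma_a, size=200_000)
G = np.array([[0.5 * dt**2],
              [dt]])
samples = G @ w[np.newaxis, :]         # shape 2 x N, one column per noise draw
Q_mc = samples @ samples.T / w.size

print(Q)
print(Q_mc)
```

Because a single scalar drives both state components, this $\mathbf{Q}$ has rank one: its determinant is zero and all of the injected uncertainty lies along the direction of $\mathbf{G}$.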

State Covariance Matrix ($\mathbf{P}$)

The vector $\mathbf{x}_k$ is our best guess of the state, but we know it isn't perfect. To handle this, we use the State Covariance Matrix ($\mathbf{P}$).

The matrix $\mathbf{P}$ represents the "error budget" of our estimate. It quantifies how much we trust our current values for position and velocity. For our 1D point mass (which has a 2D state: position and velocity), $\mathbf{P}$ is a $2 \times 2$ matrix:

$$\mathbf{P} = \begin{bmatrix} \sigma_p^2 & \sigma_{pv} \\ \sigma_{vp} & \sigma_v^2 \end{bmatrix}$$

The Diagonal ($\sigma_p^2, \sigma_v^2$): These are the variances. They tell us the "spread" of our uncertainty for each variable. A large $\sigma_p^2$ means we are very unsure about the robot's location.
The Off-Diagonal ($\sigma_{pv}$): This is the covariance. It tells us how errors in position and velocity are linked. Because a covariance matrix is symmetric, $\sigma_{vp} = \sigma_{pv}$.

The Prediction Equation

We propagate the State Covariance Matrix from the previous time step ($k-1$) to the current time step ($k$) using:

$$\mathbf{P}_k^- = \mathbf{F} \mathbf{P}_{k-1}^+ \mathbf{F}^T + \mathbf{Q}$$

$\mathbf{P}_{k-1}^+$: The "a posteriori" (updated) uncertainty from the previous time step.
$\mathbf{P}_k^-$: The "a priori" (predicted) uncertainty.
$\mathbf{Q}$: The uncertainty added by the random jitter during the time step.

Here, we assume that $\mathbf{Q}$ is a constant matrix. In more advanced systems (like those with changing time steps $\Delta t$), $\mathbf{Q}_k$ may also get a time index $k$, but the underlying "stretch and grow" logic remains identical.
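
A minimal sketch of this covariance prediction in NumPy follows. The function name `predict_covariance`, the starting covariance, and the values of `dt` and `sigma_a` are assumptions, as is the constant-velocity form of $\mathbf{F}$.

```python
import numpy as np

def predict_covariance(P_post, F, Q):
    """Return P_k^- = F P_{k-1}^+ F^T + Q."""
    return F @ P_post @ F.T + Q

dt, sigma_a = 0.1, 0.5                 # assumed example values
F = np.array([[1.0, dt],
              [0.0, 1.0]])             # assumed constant-velocity model
G = np.array([[0.5 * dt**2],
              [dt]])
Q = sigma_a**2 * (G @ G.T)             # constant Q, as assumed in the text

# Assumed starting uncertainty: 0.04 m^2 in position, 0.25 (m/s)^2 in velocity,
# with no initial correlation between the two.
P_post = np.diag([0.04, 0.25])

P_prior = predict_covariance(P_post, F, Q)
print(P_prior)
```

If $\Delta t$ changed from step to step, `F` and `Q` would simply be rebuilt before each call; the function itself would stay the same.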

The Projection ($\mathbf{F} \mathbf{P}_{k-1}^+ \mathbf{F}^T$)
This term takes our existing uncertainty and moves it forward in time using our physics model ($\mathbf{F}$).

Why the "Sandwich" Product? In 1D, if $x_{new} = f \cdot x$, then the variance scales by the square: $\sigma_{new}^2 = f^2 \sigma^2$. In matrix form, $\mathbf{F} (\cdot) \mathbf{F}^T$ is the multi-dimensional equivalent of "squaring" the transformation. How tensors transform with "Sandwich" Products will be explained in detail later in the course.
Geometric Effect: It reshapes and stretches the uncertainty. If you are uncertain about your velocity, as time passes ($\Delta t$), that uncertainty "bleeds" into your position.
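
As a worked illustration of this bleeding effect, take a diagonal $\mathbf{P}_{k-1}^+$ (no initial correlation) and assume the constant-velocity transition matrix $\mathbf{F} = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix}$. This form of $\mathbf{F}$ is an assumption here, since it is not restated in this section:

$$\mathbf{F} \mathbf{P}_{k-1}^+ \mathbf{F}^T = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \sigma_p^2 & 0 \\ 0 & \sigma_v^2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \Delta t & 1 \end{bmatrix} = \begin{bmatrix} \sigma_p^2 + (\Delta t)^2 \sigma_v^2 & \Delta t \, \sigma_v^2 \\ \Delta t \, \sigma_v^2 & \sigma_v^2 \end{bmatrix}$$

The position variance picks up an extra $(\Delta t)^2 \sigma_v^2$, and a nonzero off-diagonal term appears even though the starting estimate had none.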

The Injection ($\mathbf{Q}$)
The matrix $\mathbf{Q}$ represents the uncertainty growth caused by random noise (the "jitters" and "bumps" discussed in the Process Noise section).

The Role: Even if we knew our position perfectly at $k=0$, random disturbances during the interval $\Delta t$ mean we are less certain at $k=1$.
Geometric Effect: While the first term reshapes the bubble, $\mathbf{Q}$ increases its overall size (volume).

Summary: The Result of Prediction

Every time we run the prediction step:

State Prediction: Our "best guess" of the state moves forward. $$\mathbf{x}_k^- = \mathbf{F}\mathbf{x}_{k-1}^+ + \mathbf{G}u_{k-1}$$
Uncertainty Prediction: Our "uncertainty bubble" stretches (due to $\mathbf{F}$) and grows (due to $\mathbf{Q}$). $$\mathbf{P}_k^- = \mathbf{F}\mathbf{P}_{k-1}^+\mathbf{F}^T + \mathbf{Q}$$

Without a measurement update to "shrink" this bubble back down, the filter would eventually become so uncertain that the prediction becomes useless.
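
The growth described above can be seen by running the two prediction equations in a loop with no measurement update. This is a sketch under assumed values of `dt`, `sigma_a`, and the initial state, starting from perfect knowledge ($\mathbf{P} = \mathbf{0}$) and using the assumed constant-velocity $\mathbf{F}$.

```python
import numpy as np

dt, sigma_a = 0.1, 0.5                 # assumed example values
F = np.array([[1.0, dt],
              [0.0, 1.0]])             # assumed constant-velocity model
G = np.array([[0.5 * dt**2],
              [dt]])
Q = sigma_a**2 * (G @ G.T)

x = np.array([[0.0], [1.0]])           # position 0 m, velocity 1 m/s
P = np.zeros((2, 2))                   # perfect knowledge at k = 0
u = 0.0                                # no commanded acceleration

position_variances = []
for k in range(50):
    x = F @ x + G * u                  # state prediction
    P = F @ P @ F.T + Q                # uncertainty prediction
    position_variances.append(P[0, 0])

print(position_variances[0], position_variances[-1])
```

After the first step the covariance equals $\mathbf{Q}$ exactly, and the position variance keeps increasing with every further prediction because nothing in the loop ever removes uncertainty.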