Extended Kalman Filter¶

Introduction: From Particles to Pose¶

To track a robot, we must maintain linear and angular motion simultaneously.

We define the state vector $\mathbf{x}$ as a single column containing five pieces of information:

$$\mathbf{x} = \begin{bmatrix} p_1 \\ p_2 \\ v_1 \\ v_2 \\ \theta \end{bmatrix} \begin{matrix} \text{(Position X)} \\ \text{(Position Y)} \\ \text{(Velocity X)} \\ \text{(Velocity Y)} \\ \text{(Orientation)} \end{matrix}$$

We also define an input vector $\mathbf{u}$ consisting of the IMU-measurements (Inertial Measurement Unit) that "drive" the change in our state:

linear accelerations ($a_1^B, a_2^B$)
angular velocity ($\omega$).

$$ \mathbf{u} = \begin{bmatrix} a_1^B \\ a_2^B \\ \omega \end{bmatrix} $$

Note: The linear accelarations are measured in the body frame (indicated by the index $B$). In contrast, position and velocity are tracked in the world frame (no index).

The Breakdown of the Linear Model¶

The standard Kalman Filter (KF) assumes the next state is a linear combination of the previous state, control inputs, and noise:

$$\mathbf{x}_k = \mathbf{F}_{k-1} \mathbf{x}_{k-1} + \mathbf{G}_{k-1} \mathbf{u}_{k-1} + \mathbf{L}_{k-1} \mathbf{w}_{k-1}$$ Where:

$\mathbf{F}_{k-1}$: State Transition Matrix (how position/velocity evolve).
$\mathbf{G}_{k-1}$: Control Matrix (how the IMU inputs $u$ change the state).
$\mathbf{L}_{k-1}$: Noise Gain Matrix (how the random noise $\mathbf{w}$ is distributed into the states).
$\mathbf{w}_{k-1}$ is the raw process noise from our sensors

However, this linear assumption breaks down for our robot. When we use body-frame measurements ($a_1^B, a_2^B, \omega$) and rotate them into the world frame ($W$), the equations become non-linear due to their dependence on the heading $\theta$.

Recalling the rotation into the world frame (no upper index):

$a_1 = a_1^B \cos\theta - a_2^B \sin\theta$
$a_2 = a_1^B \sin\theta + a_2^B \cos\theta$

The resulting state transition equations for our system (ignoring noise for a moment) are:

$p_{1,t} = p_{1,t-1} + v_{1,t-1}\Delta t + \frac{1}{2}\Delta t^2 (a_1^B \cos\theta - a_2^B \sin\theta)$
$p_{2,t} = p_{2,t-1} + v_{2,t-1}\Delta t + \frac{1}{2}\Delta t^2 (a_1^B \sin\theta + a_2^B \cos\theta)$
$v_{1,t} = v_{1,t-1} + \Delta t (a_1^B \cos\theta - a_2^B \sin\theta)$
$v_{2,t} = v_{2,t-1} + \Delta t (a_1^B \sin\theta + a_2^B \cos\theta)$
$\theta_{t} = \theta_{t-1} + \omega \Delta t$

Notice:

Trigonometric Coupling: The state variable $\theta$ is "trapped" inside sine and cosine functions.
Cross-Term Interaction: We see terms like $a_1^B \cos\theta$. Here, an input ($a_1^B$) is being multiplied by a function of a state ($\theta$). In a linear system, inputs and states must remain independent additive terms; once they multiply, the geometry of our uncertainty "warps."

This means the transition from $\mathbf{x}_{k-1}$ to $\mathbf{x}_k$ can no longer be expressed as a simple matrix multiplication.

Moving to the Extended Kalman Filter (EKF)¶

Since we cannot use a simple matrix multiplication to predict the state, we define a nonlinear transition function $\mathbf{f}$:

$$\mathbf{x}_t = \mathbf{f}(\mathbf{x}_{t-1}, \mathbf{u}_{t-1}, \mathbf{w}_{t-1})$$

where with the dimension of our state space $n$:

$\mathbf f$ is a $n$-vector-valued function.

To keep the "Kalman logic" alive, we must linearize this function at every time step using a first-order Taylor expansion. This is the essence of the EKF.

1. The Prediction Step (Nonlinear) We calculate the new state $\hat{\mathbf{x}}_k^-$ directly using the nonlinear trigonometric equations above. This gives us our "best guess" for the robot's new pose.

2. The Uncertainty Propagation (The Jacobian) To update the error covariance $P$ (our "certainty"), we cannot use $\mathbf{F}$ or $\mathbf{G}$ because they don't exist in constant form. Instead, we calculate the Jacobian Matrix—the matrix of partial derivatives—which represents the local linear slope of the function $f$ at the current state.

The State Transition Jacobian (Process Jacobian) $\mathbf{J}_f$¶

The State Transition Jacobian, which we will call $\mathbf{J}_f$, is a matrix of partial derivatives that represents the "linear slope" of the movement $f$ at the current state variables. For example, it tells the filter: Given our current heading and acceleration, how much will a $1^\circ$ error in $\theta$ bleed into our position and velocity estimates?

The Jacobian matrix, denoted in general as $\mathbf{J}$, is a matrix of all first-order partial derivatives of a vector function. In out EFK context, we use the symbol $\mathbf{J}_f$ for the Jacobian of the nonlinear prediction function $f(\mathbf{x}_{t-1}, \mathbf{u}_{t-1})$. This Jacobian tells us how a tiny change in any part of the state ($\mathbf{x}$) affects the next state.

Rows ($n$): Each row represents one output of your $n$-vector-valued function $\mathbf{f}$.
Columns ($n$): Each column represents the derivative with respect to one of the $n$ variables in your state vector $\mathbf{x}$.

Because you are mapping the "change in the state" to the "next state," the input and output counts are identical, making the matrix perfectly square.

The State Transition Jacobian $\mathbf{J}_f$ ($n \times n$-matrix) is defined as:

$$\mathbf{J}_f = \frac{\partial \mathbf{f}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \dots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_1} & \frac{\partial f_n}{\partial x_2} & \dots & \frac{\partial f_n}{\partial x_n} \end{bmatrix}$$

Why we need it for Uncertainty

In a Kalman Filter, we track the Covariance Matrix ($\mathbf{P}$), which represents a "cloud" of uncertainty around the robot.

If the robot moves in a straight line, the cloud just grows.
If the robot turns, the cloud warps and stretches.

Because the robot's heading $\theta$ affects $p_1$ and $p_2$, an error in your angle "leaks" into an error in your position. The Jacobian calculates exactly how much "leakage" occurs. It answers the question: "If my orientation $\theta$ is off by $0.1^\circ$, how many centimeters will my X-position be off after this step?"

Calculating the Jacobian for your 5-State System¶

Definition of the Jacobian for our 5-state system:

$$\mathbf{J}_f = \frac{\partial \mathbf{f}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial f_1}{\partial p_1} & \frac{\partial f_1}{\partial p_2} & \dots & \frac{\partial f_1}{\partial \theta} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_5}{\partial p_1} & \frac{\partial f_5}{\partial p_2} & \dots & \frac{\partial f_5}{\partial \theta} \end{bmatrix}$$

For your specific equations, let’s look at the most interesting part: the derivatives with respect to $\theta$.

Take your first equation: $f_1 = p_1 + v_1 \Delta t + \frac{1}{2}\Delta t^2 (a_1^B \cos\theta - a_2^B \sin\theta)$.

If we derive $f_1$ by $p_1$, we get 1.
If we derive $f_1$ by $v_1$, we get $\Delta t$.
If we derive $f_1$ by $\theta$, we use the rules $(\cos\theta)' = -\sin\theta$ and $(\sin\theta)' = \cos\theta$: $$\frac{\partial f_1}{\partial \theta} = \frac{1}{2}\Delta t^2 (-a_1^B \sin\theta - a_2^B \cos\theta)$$

This entry in the Jacobian tells the EKF how much the X-position ($p_1$) will change if the angle ($\theta$) is slightly different than we thought.

The Resulting Jacobian Matrix $\mathbf{J}_f$

For our system, the Jacobian looks like this:

$$\mathbf{J}_f = \begin{bmatrix} 1 & 0 & \Delta t & 0 & \frac{1}{2}\Delta t^2 (-a_1^B \sin\theta - a_2^B \cos\theta) \\ 0 & 1 & 0 & \Delta t & \frac{1}{2}\Delta t^2 (a_1^B \cos\theta - a_2^B \sin\theta) \\ 0 & 0 & 1 & 0 & \Delta t (-a_1^B \sin\theta - a_2^B \cos\theta) \\ 0 & 0 & 0 & 1 & \Delta t (a_1^B \cos\theta - a_2^B \sin\theta) \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$

Unlike the linear $\mathbf{F}$, which is constant, the Jacobian $\mathbf{J}_f$ depends on:

The current estimated orientation $\theta$ (part of $\mathbf x $).
The current measured accelerations $a_1^B, a_2^B$ (part of $\mathbf u $).

Summary for the Jacobian $\mathbf{J}_f$

The Jacobian $\mathbf{J}_f$ replaces the constant transition matrix $\mathbf{F}$. Unlike $\mathbf{F}$, which is the same every time, $\mathbf{J}_f$ is recalculated at every step using the current $\theta$ and the current accelerations $a^B$.

It serves as the "linear bridge" that allows the Kalman Filter to propagate uncertainty through the nonlinear turns of the robot.

The Control Jacobian $\mathbf{J}_u$ (Mapping Sensor Noise)¶

While the State Jacobian $\mathbf{J}_f$ tells us how existing uncertainty in our current state (like a fuzzy position or heading) propagates into the future, we also need to account for the new noise introduced by our sensors in every step.

Since we use the IMU measurements $\mathbf{u} = [a_1^B, a_2^B, \omega]^\top$ to "drive" the physics of our model, any noise in those measurements ($a$ and $\omega$) will immediately corrupt our state estimate. We define the Control Jacobian $\mathbf{J}_u$ to map this input noise into the state space:$$\mathbf{J}_u = \frac{\partial \mathbf{f}}{\partial \mathbf{u}}$$ For example, looking at the velocity equation $v_{1,t} = v_{1,t-1} + \Delta t (a_1^B \cos\theta - a_2^B \sin\theta)$:
The derivative with respect to $a_1^B$ is $\Delta t \cos\theta$.

This tells the filter: "If the accelerometer has a noise of $0.01 m/s^2$, how much does that specifically shake our $v_1$ estimate at this current angle?"

Dimensions and the Resulting Matrix¶

Since our state $\mathbf{x}$ has $n=5$ variables and our input $\mathbf{u}$ has $p=3$ variables, $\mathbf{J}_u$ is an $n \times p$ matrix ($5 \times 3$).

$$\mathbf{J}_u = \begin{bmatrix} \frac{1}{2}\Delta t^2 \cos\theta & -\frac{1}{2}\Delta t^2 \sin\theta & 0 \\ \frac{1}{2}\Delta t^2 \sin\theta & \frac{1}{2}\Delta t^2 \cos\theta & 0 \\ \Delta t \cos\theta & -\Delta t \sin\theta & 0 \\ \Delta t \sin\theta & \Delta t \cos\theta & 0 \\ 0 & 0 & \Delta t \end{bmatrix}$$

Defining the Input Noise Matrix ($\mathbf{Q}_u$)¶

The matrix $\mathbf{Q}_u$ represents the uncertainty of your "drivers"—the raw measurements from the IMU. We assume these sensors are uncorrelated (e.g., noise in the X-accelerometer doesn't affect the Gyroscope), which allows us to use a diagonal form:

$$\mathbf{Q}_u = \begin{bmatrix} \sigma_{a1}^2 & 0 & 0 \\ 0 & \sigma_{a2}^2 & 0 \\ 0 & 0 & \sigma_\omega^2 \end{bmatrix}$$

Finding Values in the Datasheet

Sensor manufacturers usually specify noise in one of two ways:

Standard Deviation ($\sigma$): Often listed as "Noise RMS." You must square this value to get the variance ($\sigma^2$) for the matrix.
Noise Density: Listed in units like $\text{mg}/\sqrt{\text{Hz}}$ (for accelerometers) or $^\circ/\text{s}/\sqrt{\text{Hz}}$ (for gyros). To convert this to a standard deviation, use the formula: $$\sigma = \text{Noise Density} \cdot \sqrt{\text{Sampling Bandwidth}}$$

Accounting for Vibration and "Real-World" Noise

The datasheet provides the "laboratory" noise. In a real robot, motors and surface roughness create significant mechanical vibrations.

The Multiplier Rule: It is common practice to start with the datasheet values and multiply them by a factor of 2 to 10 to account for these environmental factors.
Static Test: A professional way to determine these values is to let the robot sit perfectly still for 10 seconds, record the IMU data, and calculate the variance of the signal: variance = numpy.var(sensor_data_while_still)

The Two-Track Prediction¶

In the EKF, we perform two parallel tasks to move our robot forward in time:

The State Prediction: We use the full, nonlinear function $\mathbf{f}$ to move our "best guess" for the robot’s position and orientation.
The Uncertainty Prediction: We use the Jacobians ($\mathbf{J}_f$ and $\mathbf{J}_u$) to move and grow our "uncertainty cloud" ($\mathbf{P}$).

The EKF Covariance Prediction Formula¶

Now that we have both Jacobians, we can propagate the uncertainty (the Covariance Matrix $\mathbf{P}$) correctly. The new predicted uncertainty is the sum of "old uncertainty moved forward" and "new noise from the sensors":

$$\mathbf{P}_{t|t-1} = \mathbf{J}_f \mathbf{P}_{t-1} \mathbf{J}_f^\top + \mathbf{J}_u \mathbf{Q}_u \mathbf{J}_u^\top$$

Where:

$\mathbf{P}_{t-1}$ is your previous uncertainty.
$\mathbf{Q}_u$ is the raw noise matrix from your IMU datasheet (diagonal matrix of $\sigma^2_{a1}, \sigma^2_{a2}, \sigma^2_{\omega}$).
$\mathbf{J}_u \mathbf{Q}_u \mathbf{J}_u^\top$ is the Process Noise Matrix $\mathbf{Q}_t$ we discussed earlier—it is now non-linearly projected into the world frame!

Notational Summary

Matrix	Name	Role	Calculation
$\mathbf{J}_f$	State Jacobian	Maps how the current state ($\mathbf{x}_{k-1}$) affects the future state.	$\frac{\partial \mathbf{f}}{\partial \mathbf{x}}$
$\mathbf{J}_u$	Control Jacobian	Maps how IMU inputs ($\mathbf{u}_{k-1}$) affect the future state.	$\frac{\partial \mathbf{f}}{\partial \mathbf{u}}$
$\mathbf{Q}_u$	Input Noise	The raw variance of the IMU sensors (from datasheet).	$\text{diag}(\sigma^2_{a1}, \sigma^2_{a2}, \sigma^2_\omega)$
$\mathbf{Q}_k$	Process Noise	The projected uncertainty added during the movement.	$\mathbf{J}_u \mathbf{Q}_u \mathbf{J}_u^\top$

The Measurement Jacobian ($\mathbf{J}_h$)¶

Just as $\mathbf{J}_f$ and $\mathbf{J}_u$ handle the nonlinearity of the robot's motion, the Measurement Jacobian ($\mathbf{J}_h$) handles the relationship between your state and your sensors.

In an EKF, we define a measurement function $\mathbf{h}(\mathbf{x})$ that predicts what the sensor should see based on our current estimate. Because this function is often nonlinear (like the Pythagorean theorem for distance or trigonometry for a compass), the EKF uses a Jacobian to linearize the relationship at the robot's current location.

Mathematically, $\mathbf{J}_h$ is the matrix of partial derivatives of the measurement function with respect to each state variable:$$\mathbf{J}_h = \frac{\partial \mathbf{h}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial h_1}{\partial x_1} & \frac{\partial h_1}{\partial x_2} & \cdots & \frac{\partial h_1}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial h_m}{\partial x_1} & \frac{\partial h_m}{\partial x_2} & \cdots & \frac{\partial h_m}{\partial x_n} \end{bmatrix}$$

When is a Measurement Nonlinear?

A relationship is nonlinear if the sensor does not "see" the state variables directly, but instead sees a result of a trigonometric or geometric calculation:

Range Sensors (e.g., Ultra-Wideband or Radar): The sensor measures a distance $d$. To predict this, the filter must calculate $\mathbf{h}(\mathbf{x}) = \sqrt{p_1^2 + p_2^2}$. The derivative of this square root is nonlinear.
Bearing-Only Sensors (e.g., Camera tracking a landmark): The sensor measures the angle $\alpha$ to an object. The prediction function $\mathbf{h}(\mathbf{x}) = \arctan2(p_2, p_1)$ is highly nonlinear.

In these cases, $\mathbf{J}_h$ must be recalculated at every step because the "slope" of a circle or an angle changes depending on where the robot is standing.

When to use the Linear Measurement Update

If your sensors provide direct observations of your state variables, the relationship is linear. For your 5-state system, this applies to:

GPS/MoCap: Provides $(p_1, p_2)$ directly.
Magnetometer: Provides $\theta$ directly.

In these cases, $\mathbf{J}_h$ is simply a constant "selection matrix" (a matrix of 1s and 0s) that picks the relevant variable out of the state vector. For a sensor measuring only global position, the Jacobian is constant:

$$\mathbf{J}_h = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \end{bmatrix}$$

Since the derivative of $1 \cdot p_1$ is simply $1$, the Jacobian of a linear measurement is identical to the standard linear matrix $\mathbf{H}$.

Summary

We only "extend" the measurement step (calculating a fresh Jacobian at every step) if the sensor output is a complex function of the state. If the sensor measures position, velocity, or angle directly, we stick to the much simpler linear $\mathbf{J}_h$ matrix.

The EKF Algorithm Cycle¶

The EKF follows the same "Predict-Correct" loop as the standard Kalman Filter, but uses the nonlinear function $\mathbf{f}$ for the state and Jacobians for the uncertainty.

1. Prediction (Time Update)¶

We move the robot forward in time based on the IMU inputs.

State Prediction: $$\hat{\mathbf{x}}_k^- = \mathbf{f}(\hat{\mathbf{x}}_{k-1}, \mathbf{u}_{k-1})$$ (Use the trigonometric equations to find the new $p_1, p_2, v_1, v_2, \theta$)
Covariance Prediction: $$\mathbf{P}_k^- = \mathbf{J}_f \mathbf{P}_{k-1} \mathbf{J}_f^\top + \mathbf{J}_u \mathbf{Q}_u \mathbf{J}_u^\top$$ (This warps the old uncertainty and adds the rotated IMU noise)

2. Correction (Measurement Update)¶

We "pull" the prediction toward the actual sensor reading $z_k$.

Innovation (Residual): $$\mathbf{y}_k = \mathbf{z}_k - \mathbf{h}(\hat{\mathbf{x}}_k^-)$$ (The difference between what the sensor says and what we expected to see)
Innovation Covariance:$$\mathbf{S}_k = \mathbf{J}_h \mathbf{P}_k^- \mathbf{J}_h^\top + \mathbf{R}_k$$
Kalman Gain:$$\mathbf{K}_k = \mathbf{P}_k^- \mathbf{J}_h^\top \mathbf{S}_k^{-1}$$
State Update:$$\hat{\mathbf{x}}_k = \hat{\mathbf{x}}_k^- + \mathbf{K}_k \mathbf{y}_k$$
Covariance Update:$$\mathbf{P}_k = (\mathbf{I} - \mathbf{K}_k \mathbf{J}_h) \mathbf{P}_k^-$$

EKF Matrix Summary Table¶

Matrix	Dimensions	Source	Physical Meaning
$\mathbf{\hat x}$	$5 \times 1$	State Vector	Our current "best guess" for ($p_1, p_2, v_1, v_2, \theta$).
$\mathbf{P}$	$5 \times 5$	Covariance	The uncertainty "cloud" around the state estimate.
$\mathbf{u}$	$3 \times 1$	Control Input	Raw IMU data: ($a_1^B, a_2^B, \omega$).
$\mathbf{Q}_u$	$3 \times 3$	Input Noise	Variance of the IMU sensors (from datasheet).
$\mathbf{R}$	$m \times m$	Sensor Noise	Variance of the measurement sensor (GPS, MoCap, etc.).
$\mathbf{J}_f$	$5 \times 5$	State Jacobian	How the current state affects the predicted next state.
$\mathbf{J}_u$	$5 \times 3$	Control Jacobian	How IMU inputs (and their noise) affect the state.
$\mathbf{J}_h$	$m \times 5$	Meas. Jacobian	How the state maps to the sensor's observation space.
$\mathbf{K}$	$5 \times m$	Kalman Gain	The optimal weight given to the measurement correction.