import numpy as np
import scipy.stats
import matplotlib.pyplot as plt
from ipynb.fs.defs.plot_Kalman_Filter_1D import plot_prediction_results, plot_update_step, plot_step
The Kalman Filter (KF) is a specific, computationally efficient implementation of the general Bayes Filter.
The reason the Kalman Filter is tractable and widely used is that it imposes two powerful, simplifying constraints on the system:
If these two conditions hold, the 1D-system remains entirely defined by just two parameters at every step: the mean ($\mu$) and the variance ($\sigma^2$).
This means the entire probability distribution can be tracked simply by updating these two values. This replaces the complex integral calculations (such as those required for the convolution in the prediction step) with simple algebraic operations.
Everything you've learned so far—the Markov Assumption, the recursive Prediction/Update cycle, and the application of Bayes' Rule—forms the direct probabilistic foundation upon which the 1D-Kalman Filter is built.
In the continuous domain, the Kalman Filter (KF) is an instance of the Bayes Filter where the probability distribution is a Probability Density Function (PDF). The critical assumption that simplifies the KF is the Linear Gaussian model.
We use superscripts to indicate whether a value includes the most recent sensor measurement:
Note the hat in the notation to indicate that these are estimates.
The true state transition (the actual physics) is expressed as a constant linear model:
$$x_k = F x_{k-1} + G (u_{k-1} + w)$$
where
For our example is $F=1$, $G=1$, and $u_{k-1} = \Delta x_{k-1}$:
$$x_k = x_{k-1} + \Delta x_{k-1} + w$$
where
Note: We do not use a hat ($\hat{x}$) in the equations above because it represents the true (physical) state. Since we cannot know $x_k$ perfectly, we use the Kalman Filter to calculate $\hat{x}_k$, our best possible guess.
We represent our uncertainty about the state $x_k$ using a Gaussian distribution. We say that the possible values for the state are distributed around our best guess, $\hat{x}_k$:$$x_k \sim \mathcal{N}(\hat{x}_k, \sigma^2)$$Here, $x_k$ acts as the random variable representing the robot's position, while $\hat{x}_k$ is the mean (our current belief) and $\sigma^2$ is our uncertainty.
The probability density of being at a specific location $x$ is given by the function $p(x) = \mathcal{N}(x; \hat x_k, \sigma^2)$.
The state transition probability density function is:
$$p(x_k \mid x_{k-1}) = \mathcal{N}(x_k; x_{k-1} + \Delta x_{k-1}, \sigma^2_w)$$
This equation states that the probability of being at $x_k$, given the starting point $x_{k-1}$ and the planned movement $\Delta x_{k-1}$, is described by a Gaussian centered at the expected new position ($x_{k-1} + \Delta x_{k-1}$) , with a variance equal to the motion noise $\sigma^2_w$.
The initial estimated position (the prior) is described by the distribution $p^+(x_{k-1}) = \mathcal{N}(x_{k-1}; \hat{x}_{k-1}^+, \sigma^2_{k-1})$. The movement, $\Delta x_{k-1}$, shifts the position and adds noise $\mathcal{N}(0, \sigma^2_w)$.
To find the new predicted distribution $p(x_k^-)$, we combine the motion model with our previous belief via convolution:
$$p(x_k^-) = \int_{-\infty}^{\infty} p(x_k \mid x_{k-1}) \cdot p^+(x_{k-1}) \, dx_{k-1}$$Substituting the Gaussian densities:$$p(x_k^-) = \int_{-\infty}^{\infty} \mathcal{N}(x_k; x_{k-1} + \Delta x_{k-1}, \sigma^2_w) \cdot \mathcal{N}(x_{k-1}; \hat{x}_{k-1}^+, \sigma^2_{k-1}) \, dx_{k-1} $$
To see the connection to the standard convolution formula $(f * g)(t) = \int f(t - \tau) g(\tau) \, d\tau$, look at how $x_{k-1}$ is used in each part of your integral:
The Mathematical "Slide"
In a standard convolution, we look for the term $(t - \tau)$. If we look at the exponent of your first Gaussian:
$$\text{Exponent} \propto (x_k - (x_{k-1} + \Delta x_{k-1}))^2$$
Rearranging this gives:
$$( (x_k - \Delta x_{k-1}) - x_{k-1} )^2$$
This matches the $(t - \tau)$ structure perfectly. The "target" ($t$) is our net predicted movement ($x_k - \Delta x_{k-1}$), and the "source" ($\tau$) is the previous position $x_{k-1}$.
The remarkable simplification provided by the Gaussian assumption is that the convolution of two Gaussians is always a new Gaussian (see e.g. [Thr06][Bro14]) by simply summing the means and variances
Combined, the Predicted State Estimate is:
$$p(x_k^-) = \mathcal{N}(\hat{x}_{k-1}^+ + \Delta x_{k-1}, \sigma^2_{k-1} + \sigma^2_w)$$
def predict(x_hat_prev, sigma_prev, delta_x, sigma_w):
"""
Performs the prediction step of the 1D Kalman Filter.
Combines the previous posterior state with the motion model.
"""
# 1. Update Mean: New mean is the sum of previous estimate and deterministic motion
x_hat_pred = x_hat_prev + delta_x
# 2. Update Variance: New variance is the sum of previous variance and process noise
# We calculate the new standard deviation by taking the square root of the sum of variances
sigma_pred = np.sqrt(sigma_prev**2 + sigma_w**2)
return x_hat_pred, sigma_pred
# --- Simulation Parameters ---
x_space = np.arange(-10., 15., .05)
# Initial state at t-1 (Posterior from previous step)
x_hat_prev = -2.0
sigma_prev = 1.5
# Motion parameters
delta_x = 2.5 # Expected displacement (control input)
sigma_w = 1.9 # Process noise standard deviation
# --- Calculation ---
# 1. Generate PDFs for visualization
prior_pdf = scipy.stats.norm(loc=x_hat_prev, scale=sigma_prev).pdf(x_space)
motion_pdf = scipy.stats.norm(loc=delta_x, scale=sigma_w).pdf(x_space)
# 2. Run Prediction Step
x_hat_pred, sigma_pred = predict(x_hat_prev, sigma_prev, delta_x, sigma_w)
# 3. Generate Predicted PDF
pred_pdf = scipy.stats.norm(loc=x_hat_pred, scale=sigma_pred).pdf(x_space)
# --- Plotting ---
params = {
'x_hat_prev': x_hat_prev,
'sigma_prev': sigma_prev,
'x_hat_pred': x_hat_pred,
'sigma_pred': sigma_pred,
'sigma_w': sigma_w
}
plot_prediction_results(x_space, prior_pdf, pred_pdf, motion_pdf, params)
Interpretation
The second plot clearly demonstrates the two effects of the Prediction step:
In the continuous 1D Kalman Filter, the measurement update step uses Bayes' Rule to fuse our uncertain prior belief with new sensor data.
We combine:
To yield:
Mathematically, while prediction involves a convolution, the update step involves the product of two Gaussians (normalized via Bayes' Rule)
The relationship between the true state ($x_k$) and the noisy observation ($z_k$) is modeled linearly with additive Gaussian noise.
The general form of the measurement equation is:
$$z_k = H x_k + r$$
Where:
For the simple 1D case where the sensor directly measures position, the Observation constant $H$ simplifies to $1$:
$$z_k = x_k + r$$
This formulation leads to the Measurement Likelihood Function $p(z_k \mid x_k)$:
$$p(z_k \mid x_k) = \mathcal{N}(z_k; x_k, \sigma_r^2)$$This notation states that if we know the true state $x_k$, the resulting measurement $z_k$ is a random variable centered at $x_k$ with sensor noise $\sigma_r^2$.
Note for the Update Step:
In the Bayes Update, we treat $z_k$ as a fixed observation. Because the Gaussian PDF is symmetric, the likelihood of the state $x_k$ given $z_k$ is mathematically equivalent to a Gaussian centered at $z_k$:
$$\mathcal{N}(z_k; x_k, \sigma_r^2) = \mathcal{N}(x_k; z_k, \sigma_r^2)$$
Effectively, the update step "pulls" our predicted estimate toward the sensor reading $z_k$.
The update fuses our prediction with the new measurement:
$$p^+(x_k) = p(x_k \mid z_k) = \frac{p(z_k \mid x_k) \cdot p(x_k^-)}{p(z_k)} = \frac{\mathcal{N}(x_k; z_k, \sigma_r^2) \cdot \mathcal{N}(x_k; \hat{x}_k^-, \sigma_k^{2-})}{p(z_k)}$$
Conjugacy: Crucially, because both the prior and the likelihood are Gaussian, their product is proportional to a new Gaussian. This property (conjugacy) allows us to calculate the new mean and variance through simple algebra, bypassing the complex integration of the denominator $p(z_k)$.
Multiplying the prior by the likelihood results in a Posterior Gaussian with the following parameters (for the derivation, see e.g., [Bro14]):
Intuition: The Weighted Average
The update formula for the mean is essentially a weighted average of our prediction and our measurement. Each source is weighted by the certainty (the inverse of the variance) of the other source:
The Variance Shrinkage:
The posterior variance $\sigma_k^{2+}$ is always smaller than both the prior and the measurement variance. This reflects the reduction in uncertainty achieved by the fusion of information: even a noisy sensor makes our estimate more certain than the prediction alone.
Let's rewrite the formula for an intuitive form of the new mean:
$$\begin{aligned} \hat{x}_k^+ &= \frac{\sigma_r^2 \hat{x}_k^- + \sigma_k^{2-} z_k}{\sigma_k^{2-} + \sigma_r^2} \\ &= \frac{\sigma_r^2 \hat{x}_k^- + \sigma_k^{2-} \hat{x}_k^- - \sigma_k^{2-} \hat{x}_k^- + \sigma_k^{2-} z_k}{\sigma_k^{2-} + \sigma_r^2} \\ &= \frac{(\sigma_r^2 + \sigma_k^{2-}) \hat{x}_k^- + \sigma_k^{2-} (z_k - \hat{x}_k^-)}{\sigma_k^{2-} + \sigma_r^2} \\ &= \hat{x}_k^- + \underbrace{\left(\frac{\sigma_k^{2-}}{\sigma_k^{2-} + \sigma_r^2}\right)}_{K_k} \underbrace{(z_k - \hat{x}_k^-)}_{y_k} \\ &= \hat{x}_k^- + K_k y_k \end{aligned}$$
Interpretation of $K_k$
The Kalman Gain is an adaptable factor that ranges between 0 and 1. It determines how much of the innovation should be used to correct the predicted state:
Final Variance Update
Using the Kalman Gain, the update for the variance also simplifies to a standard form:
$$\begin{aligned} \sigma_k^{2+} &= \frac{\sigma_k^{2-} \sigma_r^2}{\sigma_k^{2-} + \sigma_r^2} \\ &= \sigma_k^{2-} \left( \frac{\sigma_r^2}{\sigma_k^{2-} + \sigma_r^2} \right) \\ &= \sigma_k^{2-} \left( \frac{(\sigma_k^{2-} + \sigma_r^2) - \sigma_k^{2-}}{\sigma_k^{2-} + \sigma_r^2} \right) \\ &= \sigma_k^{2-} \left( \frac{\sigma_k^{2-} + \sigma_r^2}{\sigma_k^{2-} + \sigma_r^2} - \frac{\sigma_k^{2-}}{\sigma_k^{2-} + \sigma_r^2} \right) \\ &= \sigma_k^{2-} (1 - K_k) \end{aligned}$$
This formula beautifully illustrates that the new uncertainty ($\sigma_k^{2+}$) is always a fraction of the old uncertainty ($\sigma_k^{2-}$), reflecting our increased confidence after seeing the measurement (data).
Measurement Update Function
The measurement function implements the core algebraic rules derived from the Product of Two Gaussians (Bayes' Rule) to fuse the predicted state (prior) with the observation (likelihood).
def update(x_hat_pred, sigma_pred, z_k, sigma_r):
"""
Performs the measurement update (correction) step of the 1D Kalman Filter.
Fuses the predicted state (Prior) with the sensor measurement (Likelihood).
"""
# 1. Calculate the Kalman Gain (K_k)
# This is the 'weight' that balances our trust between prediction and measurement.
k_k = sigma_pred**2 / (sigma_pred**2 + sigma_r**2)
# 2. Update Mean (Posterior Mean: \hat{x}_k^+)
# This is algebraically equivalent to your variance-weighted average:
# x_hat_plus = (sigma_r**2 * x_hat_pred + sigma_pred**2 * z_k) / (sigma_pred**2 + sigma_r**2)
x_hat_plus = x_hat_pred + k_k * (z_k - x_hat_pred)
# 3. Update Standard Deviation (Posterior Variance: \sigma_k^{2+})
# This is algebraically equivalent to the reciprocal sum of inverse variances:
# sigma_plus = np.sqrt( 1./( 1./sigma_pred**2 + 1./sigma_r**2) )
sigma_plus = np.sqrt((1 - k_k) * sigma_pred**2)
return x_hat_plus, sigma_plus
Demonstration of the Update Step
The following code segment visually demonstrates the fusion process, showing how the Prior and the Likelihood combine to create a sharper, more certain Posterior.
Parameters for the Simulation:
# --- Simulation Parameters & Setup ---
# (Using results from your previous prediction step)
x_hat_pred = 0.5 # Corresponds to mu1
sigma_pred = 2.42 # Corresponds to sigma1
# Measurement (Sensor Data)
z_k = -1.0 # The reading (mu_o)
sigma_r = 3.0 # Sensor noise (sigma_z)
# --- Calculation ---
# 1. Run the update function
x_hat_plus, sigma_plus = update(x_hat_pred, sigma_pred, z_k, sigma_r)
# 2. Generate PDFs for plotting
pred_pdf = scipy.stats.norm(loc=x_hat_pred, scale=sigma_pred).pdf(x_space)
likelihood_pdf = scipy.stats.norm(loc=z_k, scale=sigma_r).pdf(x_space)
post_pdf = scipy.stats.norm(loc=x_hat_plus, scale=sigma_plus).pdf(x_space)
# --- Execution ---
params = {
'x_hat_pred': x_hat_pred, 'sigma_pred': sigma_pred,
'z_k': z_k, 'sigma_r': sigma_r,
'x_hat_plus': x_hat_plus, 'sigma_plus': sigma_plus
}
plot_update_step(x_space, pred_pdf, likelihood_pdf, post_pdf, params)
Interpretation:
The resulting Posterior (blue line) is narrower than both the Prior and the Likelihood, confirming that the filter has reduced the overall uncertainty. Its mean is optimally located between the Prior and the Likelihood, weighted by the respective variances (Kalman Gain).
This final code block demonstrates the full, recursive operation of the 1D Kalman Filter over multiple time steps ($t=1$ to $t=5$).
Initial Setup and Cycle
def run_kalman_simulation(x_hat_start, sigma_start, z_obs, delta_x_vel, s_r, s_w):
"""
Runs and visualizes a 1D Kalman Filter simulation.
"""
# 1. Setup Space and Plotting
x_space = np.arange(-6., 10., 0.05)
fig, axes = plt.subplots(len(z_obs) + 1, 1, figsize=(10, 14), sharex=True)
x_hat = x_hat_start
sigma = sigma_start
# 2. Plot Initial State (t=0)
x_pdf_initial = scipy.stats.norm(loc=x_hat, scale=sigma).pdf(x_space)
axes[0].plot(x_space, x_pdf_initial, 'b-', label=r"Initial Position ($\hat{x}_0^+$)")
axes[0].set_ylim(0., 0.6)
axes[0].legend(loc='upper right')
axes[0].set_title("Kalman Filter Recursive Estimation")
# 3. Recursive Loop
for i, (z_k, delta_x) in enumerate(zip(z_obs, delta_x_vel)):
t = i + 1
# --- PREDICT STEP ---
x_hat_pred = x_hat + delta_x
sigma_pred = np.sqrt(sigma**2 + s_w**2)
# Calculate PDF for visualization
x_pdf_pred = scipy.stats.norm(loc=x_hat_pred, scale=sigma_pred).pdf(x_space)
# --- MEASUREMENT LIKELIHOOD ---
likelihood_pdf = scipy.stats.norm(loc=z_k, scale=s_r).pdf(x_space)
# --- UPDATE STEP ---
k_gain = sigma_pred**2 / (sigma_pred**2 + s_r**2)
x_hat = x_hat_pred + k_gain * (z_k - x_hat_pred)
sigma = np.sqrt((1 - k_gain) * sigma_pred**2)
# Calculate PDF for visualization
x_pdf_post = scipy.stats.norm(loc=x_hat, scale=sigma).pdf(x_space)
# --- VISUALIZATION ---
axes[t].plot(x_space, x_pdf_pred, 'r--', alpha=0.6, label=f"Prediction $p(x_{t}^-)$")
axes[t].plot(x_space, likelihood_pdf, 'g:', alpha=0.6, label=f"Likelihood $p(z_{t}|x_{t})$")
axes[t].plot(x_space, x_pdf_post, 'b-', linewidth=2, label=f"Update $p^+(x_{t})$")
axes[t].set_ylim(0., 0.6)
axes[t].set_ylabel(f"Step {t}")
axes[t].legend(loc='upper right', fontsize='small')
plt.xlabel("Position (x)")
plt.tight_layout()
plt.show()
run_kalman_simulation(
x_hat_start=0.0,
sigma_start=6.0,
z_obs=np.array([-2, -1.5, -0.4, 1.2, 2.1]),
delta_x_vel=np.array([1, 1.1, 1.2, 1.2, 1.2]),
s_r=1.6,
s_w=0.9
)
Final Interpretation of the Results
The sequence of plots shows:
This recursive cycle—Prediction followed by Measurement Update—is the complete implementation of the Bayes Filter and the definition of the Kalman Filter in its simplest 1D form.
This chapter introduced the Kalman Filter (KF) as an efficient, specialized form of the Bayes Filter for continuous state estimation. The KF provides an mathematically optimal, recursive solution when two conditions hold:
| Component | Symbol | Density/Parameter | Description |
|---|---|---|---|
| Initial State | $\hat{x}_{k-1}^+$ | $\mathcal{N}(x_{k-1}; \hat{x}_{k-1}^+, \sigma^2_{k-1})$ | The final estimate from the previous time step (Posterior). |
| Control Input | $\Delta x_{k-1}$ | $u_{k-1}$ | The intended displacement (deterministic motion). |
| Process Noise | $w$ | $\mathcal{N}(w; 0, \sigma^2_w)$ | Uncertainty in movement (e.g., wheel slip). |
| Prediction (Prior) | $p(x_k^-)$ | $\mathcal{N}(x_k; \hat{x}_k^-, \sigma_k^{2-})$ | The "Dead Reckoning" estimate before sensor data. |
| Measurement | $z_k$ | — | The actual sensor reading received at time $k$. |
| Measurement Noise | $r$ | $\mathcal{N}(r; 0, \sigma^2_r)$ | The inherent uncertainty of the sensor. |
| Likelihood | $p(z_k \mid x_k)$ | $\mathcal{N}(z_k; x_k, \sigma^2_r)$ | Probability of reading $z_k$ given the true state $x_k$, but seen as function of $x_k$. |
| Kalman Gain | $K_k$ | Range $(0, 1)$ | The weighting factor assigned to the "surprise" (innovation). |
| Update (Posterior) | $p^+(x_k)= p(x_k \mid z_k)$ | $\mathcal{N}(x_k; \hat{x}_k^+, \sigma_k^2)$ | The optimal fusion of prediction and measurement. |
The KF operates in a continuous, two-step cycle, tracking the entire belief using only the mean ($\hat{x}$) and variance ($\sigma^2$).
| Step | Purpose | Algebraic Rule | Effect on Uncertainty ($\sigma^2$) |
|---|---|---|---|
| 1. Prediction (Motion Update) | Updates state based on motion laws. | $\hat{x}_k^- = \hat{x}_{k-1}^+ + \Delta x_{k-1}$ $\sigma_k^{2-} = \sigma_{k-1}^{2+} + \sigma_w^2$ |
Increases ($\sigma^2_w$ is added via convolution). |
| 2. Update (Sensor Update) | Refines prediction using measurement $z_k$. | $\hat{x}_k^+ = \hat{x}_k^- + K_k (z_k - \hat{x}_k^-)$ $\sigma_k^{2+} = (1 - K_k)\sigma_k^{2-}$ |
Decreases (Information is fused via multiplication). |
The Role of the Kalman Gain ($K_k$): $K_k = \frac{\sigma_k^{2-}}{\sigma_k^{2-} + \sigma_r^2}$
The Kalman Gain is the "volume knob" that optimally balances the trust between the uncertain prediction and the uncertain measurement. The resulting Posterior state ($\hat{x}_k^+, \sigma_k^{2+}$) then becomes the starting point (the Prior) for the next cycle.