In [1]:

import numpy as np
import scipy.stats
import matplotlib.pyplot as plt
from ipynb.fs.defs.plot_Kalman_Filter_1D import plot_prediction_results, plot_update_step, plot_step

Principle of the Kalman Filter (1D-case)¶

The Kalman Filter (KF) is a specific, computationally efficient implementation of the general Bayes Filter.

The reason the Kalman Filter is tractable and widely used is that it imposes two powerful, simplifying constraints on the system:

Linear Dynamics: The motion (transition) and measurement models must be linear functions.
Gaussian Distributions: All uncertainties—the prior belief, the motion noise, and the measurement noise—must be modeled using Gaussian (Normal) distributions.

If these two conditions hold, the 1D-system remains entirely defined by just two parameters at every step: the mean ($\mu$) and the variance ($\sigma^2$).

This means the entire probability distribution can be tracked simply by updating these two values. This replaces the complex integral calculations (such as those required for the convolution in the prediction step) with simple algebraic operations.

Everything you've learned so far—the Markov Assumption, the recursive Prediction/Update cycle, and the application of Bayes' Rule—forms the direct probabilistic foundation upon which the 1D-Kalman Filter is built.

1D Continuous State Space¶

In the continuous domain, the Kalman Filter (KF) is an instance of the Bayes Filter where the probability distribution is a Probability Density Function (PDF). The critical assumption that simplifies the KF is the Linear Gaussian model.

Note on Notation¶

We use superscripts to indicate whether a value includes the most recent sensor measurement:

The Prediction ($\hat{x}_k^-$): This is the Prior. The minus sign ($^-$) signifies the state estimate after the motion model has been applied, but before we have incorporated the sensor data at time $k$. We are projecting the state forward.
The Update ($\hat{x}_k^+$): This is the Posterior. The plus sign ($^+$) signifies the refined estimate after we have integrated the new measurement.

Note the hat in the notation to indicate that these are estimates.

Motion Model and Linear Gaussian Assumption¶

The true state transition (the actual physics) is expressed as a constant linear model:

$$x_k = F x_{k-1} + G (u_{k-1} + w)$$

where

$F$ and $G$ are constants (in the multidimensional case matrices)
$x_k$ is the true state at time $k \in \mathbb N^+$
$x_{k-1}$ is the true state at time $k-1$
$u_{k-1}$ is the control signal at time $k-1$

For our example is $F=1$, $G=1$, and $u_{k-1} = \Delta x_{k-1}$:

$$x_k = x_{k-1} + \Delta x_{k-1} + w$$

where

$\Delta x_{k-1}$ is the control input (expected traveling distance in time $\Delta t$ one time step), representing the deterministic part of the motion.
- Note: $\Delta x_{k-1} = v_{k-1} \Delta t$
$w$ is the process noise, representing the uncertainty in motion, with variance $\sigma^2_w$
- $w \sim \mathcal{N}(0, \sigma^2_w)$.

Note: We do not use a hat ($\hat{x}$) in the equations above because it represents the true (physical) state. Since we cannot know $x_k$ perfectly, we use the Kalman Filter to calculate $\hat{x}_k$, our best possible guess.

We represent our uncertainty about the state $x_k$ using a Gaussian distribution. We say that the possible values for the state are distributed around our best guess, $\hat{x}_k$:$$x_k \sim \mathcal{N}(\hat{x}_k, \sigma^2)$$Here, $x_k$ acts as the random variable representing the robot's position, while $\hat{x}_k$ is the mean (our current belief) and $\sigma^2$ is our uncertainty.

The probability density of being at a specific location $x$ is given by the function $p(x) = \mathcal{N}(x; \hat x_k, \sigma^2)$.

The state transition probability density function is:

$$p(x_k \mid x_{k-1}) = \mathcal{N}(x_k; x_{k-1} + \Delta x_{k-1}, \sigma^2_w)$$

This equation states that the probability of being at $x_k$, given the starting point $x_{k-1}$ and the planned movement $\Delta x_{k-1}$, is described by a Gaussian centered at the expected new position ($x_{k-1} + \Delta x_{k-1}$) , with a variance equal to the motion noise $\sigma^2_w$.

The Prediction (Convolution of Gaussians)¶

The initial estimated position (the prior) is described by the distribution $p^+(x_{k-1}) = \mathcal{N}(x_{k-1}; \hat{x}_{k-1}^+, \sigma^2_{k-1})$. The movement, $\Delta x_{k-1}$, shifts the position and adds noise $\mathcal{N}(0, \sigma^2_w)$.

To find the new predicted distribution $p(x_k^-)$, we combine the motion model with our previous belief via convolution:

$$p(x_k^-) = \int_{-\infty}^{\infty} p(x_k \mid x_{k-1}) \cdot p^+(x_{k-1}) \, dx_{k-1}$$Substituting the Gaussian densities:$$p(x_k^-) = \int_{-\infty}^{\infty} \mathcal{N}(x_k; x_{k-1} + \Delta x_{k-1}, \sigma^2_w) \cdot \mathcal{N}(x_{k-1}; \hat{x}_{k-1}^+, \sigma^2_{k-1}) \, dx_{k-1} $$

Why this creates a Convolution?¶

To see the connection to the standard convolution formula $(f * g)(t) = \int f(t - \tau) g(\tau) \, d\tau$, look at how $x_{k-1}$ is used in each part of your integral:

The "Signal" ($g$): $\mathcal{N}(x_{k-1}; \hat{x}_{k-1}^+, \sigma^2_{k-1})$
Here, $x_{k-1}$ is the variable. This represents our static belief of where the robot was.
The "Kernel" ($f$): $\mathcal{N}(x_k; x_{k-1} + \Delta x_{k-1}, \sigma^2_w)$
Here, $x_{k-1}$ is part of the mean. As we integrate over all possible values of $x_{k-1}$, we are effectively "shifting" the motion noise kernel by every possible starting position.

The Mathematical "Slide"

In a standard convolution, we look for the term $(t - \tau)$. If we look at the exponent of your first Gaussian:

$$\text{Exponent} \propto (x_k - (x_{k-1} + \Delta x_{k-1}))^2$$

Rearranging this gives:

$$( (x_k - \Delta x_{k-1}) - x_{k-1} )^2$$

This matches the $(t - \tau)$ structure perfectly. The "target" ($t$) is our net predicted movement ($x_k - \Delta x_{k-1}$), and the "source" ($\tau$) is the previous position $x_{k-1}$.

Convolution of two Gaussians¶

The remarkable simplification provided by the Gaussian assumption is that the convolution of two Gaussians is always a new Gaussian (see e.g. [Thr06][Bro14]) by simply summing the means and variances

Mean: $\hat{x}_{k}^- = \hat{x}_{k-1}^+ + \Delta x_{k-1}$ (Previous guess + deterministic movement)
Variance: $\sigma_k^{2-} = \sigma_{k-1}^{2+} + \sigma_w^2$ (Accumulated uncertainty)

Combined, the Predicted State Estimate is:

$$p(x_k^-) = \mathcal{N}(\hat{x}_{k-1}^+ + \Delta x_{k-1}, \sigma^2_{k-1} + \sigma^2_w)$$

In [2]:

def predict(x_hat_prev, sigma_prev, delta_x, sigma_w):
    """
    Performs the prediction step of the 1D Kalman Filter.
    Combines the previous posterior state with the motion model.
    """
    # 1. Update Mean: New mean is the sum of previous estimate and deterministic motion
    x_hat_pred = x_hat_prev + delta_x
    
    # 2. Update Variance: New variance is the sum of previous variance and process noise
    # We calculate the new standard deviation by taking the square root of the sum of variances
    sigma_pred = np.sqrt(sigma_prev**2 + sigma_w**2)
    
    return x_hat_pred, sigma_pred

In [3]:

# --- Simulation Parameters ---
x_space = np.arange(-10., 15., .05)

# Initial state at t-1 (Posterior from previous step)
x_hat_prev = -2.0  
sigma_prev = 1.5   

# Motion parameters
delta_x = 2.5      # Expected displacement (control input)
sigma_w = 1.9      # Process noise standard deviation

# --- Calculation ---

# 1. Generate PDFs for visualization
prior_pdf = scipy.stats.norm(loc=x_hat_prev, scale=sigma_prev).pdf(x_space)
motion_pdf = scipy.stats.norm(loc=delta_x, scale=sigma_w).pdf(x_space)

# 2. Run Prediction Step
x_hat_pred, sigma_pred = predict(x_hat_prev, sigma_prev, delta_x, sigma_w)

# 3. Generate Predicted PDF
pred_pdf = scipy.stats.norm(loc=x_hat_pred, scale=sigma_pred).pdf(x_space)

# --- Plotting ---
params = {
    'x_hat_prev': x_hat_prev,
    'sigma_prev': sigma_prev,
    'x_hat_pred': x_hat_pred,
    'sigma_pred': sigma_pred,
    'sigma_w': sigma_w
}

In [4]:

plot_prediction_results(x_space, prior_pdf, pred_pdf, motion_pdf, params)

No description has been provided for this image

Interpretation

The second plot clearly demonstrates the two effects of the Prediction step:

Shift in Mean: The distribution shifts to the right (from $\hat x_{0}=-2.0$ to $\hat x_{1}=0.5$) due to the expected motion $\Delta x_{k-1}=2.5$.
Increase in Uncertainty: The red predicted distribution is wider and flatter than the blue prior distribution ($\sigma_1=2.42$ vs $\sigma_0=1.5$). This reflects the accumulation of the motion noise ($\sigma_v=1.9$) onto the existing position uncertainty.

Measurement Update (The Correction Step)¶

In the continuous 1D Kalman Filter, the measurement update step uses Bayes' Rule to fuse our uncertain prior belief with new sensor data.

We combine:

The Prediction (Prior) $p(x_k^-)$: Our belief based on movement, $ \mathcal{N}(x_k; \hat{x}_k^-, \sigma_k^{2-})$.
The Likelihood $p(z_k \mid x_k)$: The probability of seeing measurement $z_k$ if the robot were actually at $x_k$. The information from the sensor, representing the probability of seeing measurement $z_k$ given the robot's actual position $x_k$. We assume the sensor has Gaussian noise $v \sim \mathcal{N}(0, \sigma_r^2)$, making the likelihood a Gaussian centered at $z_k$.

To yield:

The Update (Posterior) $p^+(x_k)$: Our refined belief, $\mathcal{N}(x_k; \hat{x}_k^+, \sigma_k^{2+})$.

Mathematically, while prediction involves a convolution, the update step involves the product of two Gaussians (normalized via Bayes' Rule)

1. The Measurement Model (Likelihood)¶

The relationship between the true state ($x_k$) and the noisy observation ($z_k$) is modeled linearly with additive Gaussian noise.

The general form of the measurement equation is:

$$z_k = H x_k + r$$

Where:

$z_k$ is the actual sensor reading received at time $k$.
$H$ is the Observation constant (for the multidimensional case a matrix), relating the state to the expected measurement.
$x_k$ is the true state (note the absence of a hat, as this is the physical reality, not an estimate).
$r$ is the measurement noise: $r \sim \mathcal{N}(0, \sigma_r^2)$.

For the simple 1D case where the sensor directly measures position, the Observation constant $H$ simplifies to $1$:

$$z_k = x_k + r$$

This formulation leads to the Measurement Likelihood Function $p(z_k \mid x_k)$:

$$p(z_k \mid x_k) = \mathcal{N}(z_k; x_k, \sigma_r^2)$$This notation states that if we know the true state $x_k$, the resulting measurement $z_k$ is a random variable centered at $x_k$ with sensor noise $\sigma_r^2$.

Note for the Update Step:

In the Bayes Update, we treat $z_k$ as a fixed observation. Because the Gaussian PDF is symmetric, the likelihood of the state $x_k$ given $z_k$ is mathematically equivalent to a Gaussian centered at $z_k$:

$$\mathcal{N}(z_k; x_k, \sigma_r^2) = \mathcal{N}(x_k; z_k, \sigma_r^2)$$

Effectively, the update step "pulls" our predicted estimate toward the sensor reading $z_k$.

2. Bayes' Rule and Conjugate Priors¶

The update fuses our prediction with the new measurement:

$$p^+(x_k) = p(x_k \mid z_k) = \frac{p(z_k \mid x_k) \cdot p(x_k^-)}{p(z_k)} = \frac{\mathcal{N}(x_k; z_k, \sigma_r^2) \cdot \mathcal{N}(x_k; \hat{x}_k^-, \sigma_k^{2-})}{p(z_k)}$$

Likelihood $p(z_k \mid x_k)$: The sensor model. For the update, we use the symmetric form $\mathcal{N}(x_k; z_k, \sigma_r^2)$, which treats the observed measurement $z_k$ as the mean.
Prior $p(x_k^-)$: Our prediction, $\mathcal{N}(x_k; \hat{x}_k^-, \sigma_k^{2-})$.
Posterior $p^+(x_k) = p(x_k \mid z_k)$: The final refined belief.

Conjugacy: Crucially, because both the prior and the likelihood are Gaussian, their product is proportional to a new Gaussian. This property (conjugacy) allows us to calculate the new mean and variance through simple algebra, bypassing the complex integration of the denominator $p(z_k)$.

3. Product of Two Gaussians (The Algebraic Update)¶

Multiplying the prior by the likelihood results in a Posterior Gaussian with the following parameters (for the derivation, see e.g., [Bro14]):

Posterior Mean ($\hat{x}_k^+$): $$\hat{x}_k^+ = \frac{\sigma_r^2 \hat{x}_k^- + \sigma_k^{2-} z_k}{\sigma_k^{2-} + \sigma_r^2}$$
Posterior Variance ($\sigma_k^{2+}$): $$\sigma_k^{2+} = \frac{1}{\frac{1}{\sigma_k^{2-}} + \frac{1}{\sigma_r^2}} = \frac{\sigma_k^{2-} \sigma_r^2}{\sigma_k^{2-} + \sigma_r^2}$$

Intuition: The Weighted Average
The update formula for the mean is essentially a weighted average of our prediction and our measurement. Each source is weighted by the certainty (the inverse of the variance) of the other source:

If the sensor is very noisy ($\sigma_r^2$ is huge), the posterior mean stays close to our prediction $\hat{x}_k^-$.
If our prediction is very uncertain ($\sigma_k^{2-}$ is huge), the posterior mean trusts the sensor reading $z_k$ more.

The Variance Shrinkage:

The posterior variance $\sigma_k^{2+}$ is always smaller than both the prior and the measurement variance. This reflects the reduction in uncertainty achieved by the fusion of information: even a noisy sensor makes our estimate more certain than the prediction alone.

The Kalman Gain ($K_k$)¶

Let's rewrite the formula for an intuitive form of the new mean:

$$\begin{aligned} \hat{x}_k^+ &= \frac{\sigma_r^2 \hat{x}_k^- + \sigma_k^{2-} z_k}{\sigma_k^{2-} + \sigma_r^2} \\ &= \frac{\sigma_r^2 \hat{x}_k^- + \sigma_k^{2-} \hat{x}_k^- - \sigma_k^{2-} \hat{x}_k^- + \sigma_k^{2-} z_k}{\sigma_k^{2-} + \sigma_r^2} \\ &= \frac{(\sigma_r^2 + \sigma_k^{2-}) \hat{x}_k^- + \sigma_k^{2-} (z_k - \hat{x}_k^-)}{\sigma_k^{2-} + \sigma_r^2} \\ &= \hat{x}_k^- + \underbrace{\left(\frac{\sigma_k^{2-}}{\sigma_k^{2-} + \sigma_r^2}\right)}_{K_k} \underbrace{(z_k - \hat{x}_k^-)}_{y_k} \\ &= \hat{x}_k^- + K_k y_k \end{aligned}$$

The Kalman Gain $K_k \in \left] 0, 1 \right[$ is the "weighting factor" that determines the optimal balance between trusting your prediction versus trusting the measurement $K_k $
The difference $y_k =z_k - \hat{x}_k^-$ is known as the Innovation or Measurement Residual. It represents the "surprise"—how much the new sensor data differs from what we expected.

Interpretation of $K_k$

The Kalman Gain is an adaptable factor that ranges between 0 and 1. It determines how much of the innovation should be used to correct the predicted state:

High Confidence in Measurement ($\sigma_r^2$ is small): If the sensor is very precise, then $K_k \to 1$. The posterior mean $\hat{x}_k^+$ will be adjusted almost entirely toward the measurement $z_k$.
High Confidence in Prediction ($\sigma_k^{2-}$ is small): If our motion model is very reliable compared to a noisy sensor, then $K_k \to 0$. The posterior mean will stay close to the prediction $\hat{x}_k^-$, effectively "filtering out" the noisy measurement.

Final Variance Update

Using the Kalman Gain, the update for the variance also simplifies to a standard form:

$$\begin{aligned} \sigma_k^{2+} &= \frac{\sigma_k^{2-} \sigma_r^2}{\sigma_k^{2-} + \sigma_r^2} \\ &= \sigma_k^{2-} \left( \frac{\sigma_r^2}{\sigma_k^{2-} + \sigma_r^2} \right) \\ &= \sigma_k^{2-} \left( \frac{(\sigma_k^{2-} + \sigma_r^2) - \sigma_k^{2-}}{\sigma_k^{2-} + \sigma_r^2} \right) \\ &= \sigma_k^{2-} \left( \frac{\sigma_k^{2-} + \sigma_r^2}{\sigma_k^{2-} + \sigma_r^2} - \frac{\sigma_k^{2-}}{\sigma_k^{2-} + \sigma_r^2} \right) \\ &= \sigma_k^{2-} (1 - K_k) \end{aligned}$$

This formula beautifully illustrates that the new uncertainty ($\sigma_k^{2+}$) is always a fraction of the old uncertainty ($\sigma_k^{2-}$), reflecting our increased confidence after seeing the measurement (data).

Demonstration with Code¶

Measurement Update Function

The measurement function implements the core algebraic rules derived from the Product of Two Gaussians (Bayes' Rule) to fuse the predicted state (prior) with the observation (likelihood).

In [5]:

def update(x_hat_pred, sigma_pred, z_k, sigma_r):
    """
    Performs the measurement update (correction) step of the 1D Kalman Filter.
    Fuses the predicted state (Prior) with the sensor measurement (Likelihood).
    """
    
    # 1. Calculate the Kalman Gain (K_k)
    # This is the 'weight' that balances our trust between prediction and measurement.
    k_k = sigma_pred**2 / (sigma_pred**2 + sigma_r**2)

    # 2. Update Mean (Posterior Mean: \hat{x}_k^+)
    # This is algebraically equivalent to your variance-weighted average:
    # x_hat_plus = (sigma_r**2 * x_hat_pred + sigma_pred**2 * z_k) / (sigma_pred**2 + sigma_r**2)
    x_hat_plus = x_hat_pred + k_k * (z_k - x_hat_pred)
    
    # 3. Update Standard Deviation (Posterior Variance: \sigma_k^{2+})
    # This is algebraically equivalent to the reciprocal sum of inverse variances:
    # sigma_plus = np.sqrt( 1./( 1./sigma_pred**2 + 1./sigma_r**2) )
    sigma_plus = np.sqrt((1 - k_k) * sigma_pred**2)
    
    return x_hat_plus, sigma_plus

Demonstration of the Update Step

The following code segment visually demonstrates the fusion process, showing how the Prior and the Likelihood combine to create a sharper, more certain Posterior.

Prior ($p(x_k^-)$): Our prediction from the motion model, defined by mean $\hat{x}_k^-$ and variance $\sigma_k^{2-}$.
Likelihood ($p(z_k \mid x_k)$): The sensor model, defined by the measurement $z_k$ and sensor noise $\sigma_r^2$.
Posterior ($p^+(x_k)$): The updated belief, defined by mean $\hat{x}_k^+$ and variance $\sigma_k^{2+}$.

Parameters for the Simulation:

Sensor Noise ($\sigma_r^2$): $9.0$ (Measurement uncertainty is quite high, where $\sigma_r = 3.0$).
Measurement ($z_k$): $-1.0$ (The sensor reports the position is at $-1.0$).

In [6]:

# --- Simulation Parameters & Setup ---
# (Using results from your previous prediction step)
x_hat_pred = 0.5   # Corresponds to mu1
sigma_pred = 2.42  # Corresponds to sigma1

# Measurement (Sensor Data)
z_k = -1.0         # The reading (mu_o)
sigma_r = 3.0      # Sensor noise (sigma_z)

# --- Calculation ---
# 1. Run the update function
x_hat_plus, sigma_plus = update(x_hat_pred, sigma_pred, z_k, sigma_r)

# 2. Generate PDFs for plotting
pred_pdf = scipy.stats.norm(loc=x_hat_pred, scale=sigma_pred).pdf(x_space)
likelihood_pdf = scipy.stats.norm(loc=z_k, scale=sigma_r).pdf(x_space)
post_pdf = scipy.stats.norm(loc=x_hat_plus, scale=sigma_plus).pdf(x_space)

# --- Execution ---
params = {
    'x_hat_pred': x_hat_pred, 'sigma_pred': sigma_pred,
    'z_k': z_k, 'sigma_r': sigma_r,
    'x_hat_plus': x_hat_plus, 'sigma_plus': sigma_plus
}

In [7]:

plot_update_step(x_space, pred_pdf, likelihood_pdf, post_pdf, params)

Interpretation:

The resulting Posterior (blue line) is narrower than both the Prior and the Likelihood, confirming that the filter has reduced the overall uncertainty. Its mean is optimally located between the Prior and the Likelihood, weighted by the respective variances (Kalman Gain).

Full Recursive Kalman Filter Example¶

This final code block demonstrates the full, recursive operation of the 1D Kalman Filter over multiple time steps ($t=1$ to $t=5$).

Initial Setup and Cycle

Initial State ($t=0$): The robot's position is highly uncertain. The initial belief is $\hat x_0=0$ and $\sigma=6$.
Loop: For each time step, the filter performs the two-step cycle:
- Prediction: The predict function updates the mean based on velocity $v$ and increases the variance by adding the motion noise $\sigma_w^2$. (Red curve in plots)
- Measurement Update: The measurement function fuses the new prediction (Prior) with the sensor reading $o$ (Likelihood) to generate the new, refined belief (Posterior). (Blue curve in plots)
Recursion: The Posterior from the Measurement Update becomes the new Prior for the next time step's Prediction.

In [8]:

def run_kalman_simulation(x_hat_start, sigma_start, z_obs, delta_x_vel, s_r, s_w):
    """
    Runs and visualizes a 1D Kalman Filter simulation.
    """
    # 1. Setup Space and Plotting
    x_space = np.arange(-6., 10., 0.05)
    fig, axes = plt.subplots(len(z_obs) + 1, 1, figsize=(10, 14), sharex=True)
    
    x_hat = x_hat_start
    sigma = sigma_start

    # 2. Plot Initial State (t=0)
    x_pdf_initial = scipy.stats.norm(loc=x_hat, scale=sigma).pdf(x_space)
    axes[0].plot(x_space, x_pdf_initial, 'b-', label=r"Initial Position ($\hat{x}_0^+$)")
    axes[0].set_ylim(0., 0.6)
    axes[0].legend(loc='upper right')
    axes[0].set_title("Kalman Filter Recursive Estimation")

    # 3. Recursive Loop
    for i, (z_k, delta_x) in enumerate(zip(z_obs, delta_x_vel)):
        t = i + 1
        
        # --- PREDICT STEP ---
        x_hat_pred = x_hat + delta_x
        sigma_pred = np.sqrt(sigma**2 + s_w**2)
        
        # Calculate PDF for visualization
        x_pdf_pred = scipy.stats.norm(loc=x_hat_pred, scale=sigma_pred).pdf(x_space)
        
        # --- MEASUREMENT LIKELIHOOD ---
        likelihood_pdf = scipy.stats.norm(loc=z_k, scale=s_r).pdf(x_space)
        
        # --- UPDATE STEP ---
        k_gain = sigma_pred**2 / (sigma_pred**2 + s_r**2)
        
        x_hat = x_hat_pred + k_gain * (z_k - x_hat_pred)
        sigma = np.sqrt((1 - k_gain) * sigma_pred**2)
        
        # Calculate PDF for visualization
        x_pdf_post = scipy.stats.norm(loc=x_hat, scale=sigma).pdf(x_space)
        
        # --- VISUALIZATION ---
        axes[t].plot(x_space, x_pdf_pred, 'r--', alpha=0.6, label=f"Prediction $p(x_{t}^-)$")
        axes[t].plot(x_space, likelihood_pdf, 'g:', alpha=0.6, label=f"Likelihood $p(z_{t}|x_{t})$")
        axes[t].plot(x_space, x_pdf_post, 'b-', linewidth=2, label=f"Update $p^+(x_{t})$")
        
        axes[t].set_ylim(0., 0.6)
        axes[t].set_ylabel(f"Step {t}")
        axes[t].legend(loc='upper right', fontsize='small')

    plt.xlabel("Position (x)")
    plt.tight_layout()
    plt.show()

In [9]:

run_kalman_simulation(
     x_hat_start=0.0, 
     sigma_start=6.0, 
     z_obs=np.array([-2, -1.5, -0.4, 1.2, 2.1]), 
     delta_x_vel=np.array([1, 1.1, 1.2, 1.2, 1.2]),
     s_r=1.6, 
     s_w=0.9
)

Final Interpretation of the Results

The sequence of plots shows:

Shrinking Uncertainty (Blue Curve): The initial blue curve at $t=0$ is very wide ($\sigma=6$). After the first few cycles, the subsequent blue curves (Posteriors) are much narrower, demonstrating the filter's ability to reduce initial uncertainty by incorporating measurements.
Prediction (Red) vs. Observation (Green): The red distribution (Prediction) and the green distribution (Observation) often show a discrepancy, but the Posterior (Blue) always finds the optimal position between them, weighted by their respective certainties ($\sigma_v$ and $\sigma_z$).
Stable Estimate: The blue curve successfully tracks the robot's motion (which is generally positive due to the positive velocity array), providing a much more stable and confident estimate of the position than either the noisy observations alone or the uncertain predictions alone.

This recursive cycle—Prediction followed by Measurement Update—is the complete implementation of the Bayes Filter and the definition of the Kalman Filter in its simplest 1D form.

Summary: The 1D Kalman Filter¶

This chapter introduced the Kalman Filter (KF) as an efficient, specialized form of the Bayes Filter for continuous state estimation. The KF provides an mathematically optimal, recursive solution when two conditions hold:

Linear Dynamics: Motion and measurement models are linear functions.
Gaussian Distributions: All uncertainties (prior, motion noise, measurement noise) are modeled as Gaussians $\mathcal{N}(x; \mu, \sigma^2)$.

Key Components and Notation¶

Component	Symbol	Density/Parameter	Description
Initial State	$\hat{x}_{k-1}^+$	$\mathcal{N}(x_{k-1}; \hat{x}_{k-1}^+, \sigma^2_{k-1})$	The final estimate from the previous time step (Posterior).
Control Input	$\Delta x_{k-1}$	$u_{k-1}$	The intended displacement (deterministic motion).
Process Noise	$w$	$\mathcal{N}(w; 0, \sigma^2_w)$	Uncertainty in movement (e.g., wheel slip).
Prediction (Prior)	$p(x_k^-)$	$\mathcal{N}(x_k; \hat{x}_k^-, \sigma_k^{2-})$	The "Dead Reckoning" estimate before sensor data.
Measurement	$z_k$	—	The actual sensor reading received at time $k$.
Measurement Noise	$r$	$\mathcal{N}(r; 0, \sigma^2_r)$	The inherent uncertainty of the sensor.
Likelihood	$p(z_k \mid x_k)$	$\mathcal{N}(z_k; x_k, \sigma^2_r)$	Probability of reading $z_k$ given the true state $x_k$, but seen as function of $x_k$.
Kalman Gain	$K_k$	Range $(0, 1)$	The weighting factor assigned to the "surprise" (innovation).
Update (Posterior)	$p^+(x_k)= p(x_k \mid z_k)$	$\mathcal{N}(x_k; \hat{x}_k^+, \sigma_k^2)$	The optimal fusion of prediction and measurement.

The Recursive Cycle¶

The KF operates in a continuous, two-step cycle, tracking the entire belief using only the mean ($\hat{x}$) and variance ($\sigma^2$).

Step	Purpose	Algebraic Rule	Effect on Uncertainty ($\sigma^2$)
1. Prediction (Motion Update)	Updates state based on motion laws.	$\hat{x}_k^- = \hat{x}_{k-1}^+ + \Delta x_{k-1}$ $\sigma_k^{2-} = \sigma_{k-1}^{2+} + \sigma_w^2$	Increases ($\sigma^2_w$ is added via convolution).
2. Update (Sensor Update)	Refines prediction using measurement $z_k$.	$\hat{x}_k^+ = \hat{x}_k^- + K_k (z_k - \hat{x}_k^-)$ $\sigma_k^{2+} = (1 - K_k)\sigma_k^{2-}$	Decreases (Information is fused via multiplication).

The Role of the Kalman Gain ($K_k$): $K_k = \frac{\sigma_k^{2-}}{\sigma_k^{2-} + \sigma_r^2}$

The Kalman Gain is the "volume knob" that optimally balances the trust between the uncertain prediction and the uncertain measurement. The resulting Posterior state ($\hat{x}_k^+, \sigma_k^{2+}$) then becomes the starting point (the Prior) for the next cycle.

Literature and Links:¶

Fred Hamprecht: Lecture in Course Pattern Recognition (Video), Directed Graphical Models; State Space Models, Uni Heidelberg
C. Bishop: "Graphical Models" Talk at MLSS 2013 (1:38-)
[Bro14] P.A. Bromiley: Products and Convolutions of Gaussian Probability Density Functions
[Thu06] Thrun, Sebastian, Wolfram Burgard, and Dieter Fox: Probabilistic robotics. MIT press, 2006.
Roger Labbe: Kalman and Bayesian Filters in Python, Online Book as jupyter notebook files