
Discrete Kalman Filter Tutorial

Gabriel A. Terejanu
Department of Computer Science and Engineering
University at Buffalo, Buffalo, NY 14260
terejanu@buffalo.edu
1 Introduction
Consider the following stochastic dynamic model and the sequence of noisy observations z_k:

x_k = f(x_{k-1}, u_{k-1}, w_{k-1}, k)    (1)
z_k = h(x_k, u_k, v_k, k)    (2)
Also let x_0 be the random initial condition of the system and

Z_k = {z_i | 1 ≤ i ≤ k}    (3)

be the set of k observations. Finding x^a_k, the estimate or analysis of the state x_k, given Z_k and the initial conditions, is called the filtering problem. When the dynamic models for the process, f(·), and for the measurements, h(·), are linear, and the random components x_0, w_k, v_k are uncorrelated Gaussian random vectors, the solution is given by the classical Kalman filter equations [7].
The Kalman filter is named after Rudolph E. Kalman, who in 1960 published his famous paper describing a recursive solution to the discrete-data linear filtering problem (Kalman 1960) [11]. It is the optimal estimator for a large class of problems, finding the most probable state as an unbiased linear minimum variance estimate of a system, based on discrete observations of the system and a model which describes its evolution [5].
2 Dynamic process
A stochastic time-variant linear system is described by the difference equation and the observation model:

x_k = A_{k-1} x_{k-1} + B_{k-1} u_{k-1} + w_{k-1}    (4)
z_k = H_k x_k + v_k    (5)
where the control input u_k is a known nonrandom vector. The initial state x_0 is a random vector with known mean μ_0 = E[x_0] and covariance P_0 = E[(x_0 - μ_0)(x_0 - μ_0)^T].
In the following we assume that the random vector w_k captures uncertainties in the model and v_k denotes the measurement noise. Both are temporally uncorrelated (white noise), zero-mean random sequences with known covariances, and both of them are uncorrelated with the initial state x_0:
E[w_k] = 0,   E[w_k w_k^T] = Q_k,   E[w_k w_j^T] = 0 for k ≠ j,   E[w_k x_0^T] = 0 for all k    (6)
E[v_k] = 0,   E[v_k v_k^T] = R_k,   E[v_k v_j^T] = 0 for k ≠ j,   E[v_k x_0^T] = 0 for all k    (7)

Also the two random vectors w_k and v_k are uncorrelated:

E[w_k v_j^T] = 0 for all k and j    (8)
The assumptions of unbiasedness and uncorrelatedness are not critical; extensions of the Kalman filter can be derived for the cases where they do not hold.
Dimension and description of variables:

x_k   n × 1   State vector
u_k   l × 1   Input/control vector
w_k   n × 1   Process noise vector
z_k   m × 1   Observation vector
v_k   m × 1   Measurement noise vector
A_k   n × n   State transition matrix
B_k   n × l   Input/control matrix
H_k   m × n   Observation matrix
Q_k   n × n   Process noise covariance matrix
R_k   m × m   Measurement noise covariance matrix
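To make these dimensions concrete, here is a minimal NumPy sketch that simulates the model (4)-(5). The particular matrices A, B, H, Q, R, the control input, and the horizon are assumed values chosen only for illustration; they do not come from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed 2-state / 1-observation example (n = 2, l = 1, m = 1).
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition matrix (n x n)
B = np.array([[0.5], [1.0]])             # input/control matrix (n x l)
H = np.array([[1.0, 0.0]])               # observation matrix (m x n)
Q = 0.01 * np.eye(2)                     # process noise covariance (n x n)
R = np.array([[0.25]])                   # measurement noise covariance (m x m)

def simulate(T=50, x0=np.zeros(2)):
    """Generate a state trajectory x_k and observations z_k from (4)-(5)."""
    xs, zs = [], []
    x = x0
    for _ in range(T):
        u = np.array([0.1])                          # known nonrandom control input
        w = rng.multivariate_normal(np.zeros(2), Q)  # process noise w_{k-1}
        v = rng.multivariate_normal(np.zeros(1), R)  # measurement noise v_k
        x = A @ x + B @ u + w                        # state equation (4)
        z = H @ x + v                                # observation equation (5)
        xs.append(x)
        zs.append(z)
    return np.array(xs), np.array(zs)

states, observations = simulate()
```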
3 KF derivation
The optimal (minimum variance unbiased) estimate is the conditional mean and is computed in two steps: the forecast step using the model difference equations and the data assimilation step. Hence the Kalman Filter has a predictor-corrector structure.
Model Forecast Step
Initially, since the only available information is the mean, μ_0, and the covariance, P_0, of the initial state, the initial optimal estimate x^a_0 and its error covariance are:

x^a_0 = μ_0 = E[x_0]    (9)
P_0 = E[(x_0 - x^a_0)(x_0 - x^a_0)^T]    (10)
Assume now that we have an optimal estimate x^a_{k-1} ≜ E[x_{k-1} | Z_{k-1}] with covariance P_{k-1} ≜ E[(x_{k-1} - x^a_{k-1})(x_{k-1} - x^a_{k-1})^T] at time k-1. The predictable part of x_k is given by:

x^f_k ≜ E[x_k | Z_{k-1}]    (11)
     = E[A_{k-1} x_{k-1} + B_{k-1} u_{k-1} + w_{k-1}]
     = A_{k-1} x^a_{k-1} + B_{k-1} u_{k-1}
The forecast error is:

e^f_k ≜ x_k - x^f_k    (12)
     = A_{k-1}(x_{k-1} - x^a_{k-1}) + w_{k-1}
     = A_{k-1} e_{k-1} + w_{k-1}
The forecast error covariance is given by:

P^f_k ≜ E[e^f_k (e^f_k)^T]    (13)
     = E[(A_{k-1} e_{k-1} + w_{k-1})(A_{k-1} e_{k-1} + w_{k-1})^T]
     = A_{k-1} E[e_{k-1} e_{k-1}^T] A_{k-1}^T + Q_{k-1}
     = A_{k-1} P_{k-1} A_{k-1}^T + Q_{k-1}    (14)
Data Assimilation Step
At time k we have two pieces of information: the forecast value x^f_k with covariance P^f_k and the measurement z_k with covariance R_k. We know that:
x^a_k ≜ E[x_k | Z_k]    (15)
     = E[x_k | Z_{k-1}] + E[x_k | z_k]
Assume that the last term is a linear operation on the innovation z_k - H_k x^f_k [10] (see also [9]: Projection theorem p. 408 and Kalman innovations p. 443). The innovation represents the new information contained in the observation z_k:
E[x_k | z_k] = K_k (z_k - H_k x^f_k)    (16)
Therefore:

x^a_k = x^f_k + K_k (z_k - H_k x^f_k)
      = (I - K_k H_k) x^f_k + K_k z_k
So, the easiest way to combine the two pieces of information is to assume that the unbiased estimate x^a_k is a linear combination of both the forecast and the measurement:

x^a_k = L_k x^f_k + K_k z_k,   where L_k = I - K_k H_k    (17)

In other words, the optimal estimate at time k is equal to the best prediction plus a correction term, an optimal weighting value K_k times the innovation, as in (17) [8].
Substitute (4), (5) and (11) into (16):

x^a_k = A_{k-1} x^a_{k-1} + B_{k-1} u_{k-1} + K_k (H_k x_k + v_k - H_k (A_{k-1} x^a_{k-1} + B_{k-1} u_{k-1}))    (18)
      = A_{k-1} x^a_{k-1} + B_{k-1} u_{k-1} + K_k (H_k A_{k-1}(x_{k-1} - x^a_{k-1}) + H_k w_{k-1} + v_k)
Figure 1: Sequential assimilation
The error in the estimate x^a_k is:

e_k ≜ x_k - x^a_k    (19)
    = A_{k-1} e_{k-1} - K_k H_k A_{k-1} e_{k-1} + (I - K_k H_k) w_{k-1} - K_k v_k
    = (I - K_k H_k)(A_{k-1} e_{k-1} + w_{k-1}) - K_k v_k
Then, the posterior covariance of the new estimate is:
P_k ≜ E[e_k e_k^T]    (20)
    = E[(L_k (A_{k-1} e_{k-1} + w_{k-1}) - K_k v_k)(L_k (A_{k-1} e_{k-1} + w_{k-1}) - K_k v_k)^T]
    = L_k E[(A_{k-1} e_{k-1} + w_{k-1})(A_{k-1} e_{k-1} + w_{k-1})^T] L_k^T + K_k E[v_k v_k^T] K_k^T
    = L_k (A_{k-1} P_{k-1} A_{k-1}^T + Q_{k-1}) L_k^T + K_k R_k K_k^T
    = (I - K_k H_k) P^f_k (I - K_k H_k)^T + K_k R_k K_k^T
    = P^f_k - K_k H_k P^f_k - P^f_k H_k^T K_k^T + K_k D_k K_k^T    (21)
where

D_k = H_k P^f_k H_k^T + R_k    (22)
The posterior covariance formula holds for any K_k. The cross terms canceled because x_{k-1}, w_{k-1} and v_k are uncorrelated and e_{k-1} is a function of x_{k-1}.
Our goal is to minimize the error in the estimate, e_{k,i}, for every state component i = 1, ..., n. The problem is constructed as a mean squared error minimizer. The cost functional to be minimized is given by [1]:

J = E[ Σ_{i=1}^{n} e_{k,i}^2 ]    (23)
This is the sum of error variances for each state variable. Therefore the cost functional can be expressed as the trace of the error covariance:

J = tr(P_k)    (24)
Since tr(P_k) is a function of K_k and K_k is the only unknown, we minimize tr(P_k) with respect to K_k:

∂ tr(P_k) / ∂K_k = 0    (25)
The partial derivative of the trace is easily given using matrix calculus rules [4]:

∂/∂K_k tr(P^f_k - K_k H_k P^f_k - P^f_k H_k^T K_k^T + K_k D_k K_k^T) = 0    (26)
-(H_k P^f_k)^T - P^f_k H_k^T + 2 K_k D_k = 0
Thus, the Kalman gain is given by:

K_k = P^f_k H_k^T D_k^{-1}    (27)
    = P^f_k H_k^T (H_k P^f_k H_k^T + R_k)^{-1}
Substituting this back into (20):

P_k = P^f_k - K_k H_k P^f_k - P^f_k H_k^T (D_k^{-1})^T H_k (P^f_k)^T + P^f_k H_k^T D_k^{-1} D_k (D_k^{-1})^T H_k (P^f_k)^T    (28)
    = (I - K_k H_k) P^f_k
Note that the covariance P_k does not directly depend on the observations, z_k, or on the input vector. This property makes it possible to compute and to analyse the covariance matrix in the absence of any observation.
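As a quick illustration of this property, the following sketch precomputes the whole sequence of forecast covariances, gains and posterior covariances before any measurement arrives, using only equations (14), (27) and (28). The matrices and the horizon are assumed placeholders, not values from the tutorial.

```python
import numpy as np

def precompute_gains(A, H, Q, R, P0, T):
    """Run the covariance/gain recursions (14), (27), (28) with no data.

    Because P^f_k, K_k and P_k do not depend on z_k or u_k, the entire
    gain schedule can be computed offline.
    """
    n = A.shape[0]
    P = P0
    gains, covariances = [], []
    for _ in range(T):
        Pf = A @ P @ A.T + Q                              # eq. (14)
        K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)    # eq. (27)
        P = (np.eye(n) - K @ H) @ Pf                      # eq. (28)
        gains.append(K)
        covariances.append(P)
    return gains, covariances

# Assumed 2-state / 1-observation example.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])
gains, covs = precompute_gains(A, H, Q, R, P0=np.eye(2), T=20)
```

For time-invariant A, H, Q, R the precomputed gains typically converge to a steady-state value after a few steps, which is why a constant-gain filter is often used in practice.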
4 Summary of Kalman Filter
Model and Observation:

x_k = A_{k-1} x_{k-1} + B_{k-1} u_{k-1} + w_{k-1}
z_k = H_k x_k + v_k

Initialization:

x^a_0 = μ_0 with error covariance P_0

Model Forecast Step/Predictor:

x^f_k = A_{k-1} x^a_{k-1} + B_{k-1} u_{k-1}
P^f_k = A_{k-1} P_{k-1} A_{k-1}^T + Q_{k-1}

Data Assimilation Step/Corrector:

x^a_k = x^f_k + K_k (z_k - H_k x^f_k)
K_k = P^f_k H_k^T (H_k P^f_k H_k^T + R_k)^{-1}
P_k = (I - K_k H_k) P^f_k
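The summary maps directly onto code. Below is a minimal NumPy sketch of one predictor/corrector cycle; the function names, the stand-in observations, and the time-invariant system matrices are my own assumptions, intended only to show the structure of the recursion.

```python
import numpy as np

def kf_predict(x_a, P, A, B, u, Q):
    """Model forecast step: x^f_k and P^f_k from the summary."""
    x_f = A @ x_a + B @ u
    P_f = A @ P @ A.T + Q
    return x_f, P_f

def kf_update(x_f, P_f, z, H, R):
    """Data assimilation step: gain K_k, analysis x^a_k, covariance P_k."""
    S = H @ P_f @ H.T + R                        # innovation covariance D_k
    K = P_f @ H.T @ np.linalg.inv(S)             # Kalman gain
    x_a = x_f + K @ (z - H @ x_f)                # correct with the innovation
    P = (np.eye(P_f.shape[0]) - K @ H) @ P_f     # posterior covariance
    return x_a, P

# Assumed example: same matrices as the earlier sketch, two stand-in observations.
A = np.array([[1.0, 1.0], [0.0, 1.0]]); B = np.array([[0.5], [1.0]])
H = np.array([[1.0, 0.0]]); Q = 0.01 * np.eye(2); R = np.array([[0.25]])
u = np.array([0.1])
x_a, P = np.zeros(2), np.eye(2)                  # x^a_0 = mu_0 and P_0 (assumed)
for z in [np.array([0.3]), np.array([0.7])]:     # stand-in observations z_k
    x_f, P_f = kf_predict(x_a, P, A, B, u, Q)
    x_a, P = kf_update(x_f, P_f, z, H, R)
```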
Figure 2: The block diagram for Kalman Filter
5 KF original derivation
The following derivation follows Kalman's original concept of derivation [10]. The notation has been changed for consistency with the rest of the tutorial. The optimal estimate for the system (4)-(5) is derived using orthogonal projections on the vector space of random variables.
Orthogonal Projection
Let the vector space Z_k be the set of all linear combinations of the random variables (observations) z_1, ..., z_k. Z_k is a finite-dimensional subspace of the space of all possible observations:

Z_k ≜ { z̄ | z̄ = Σ_{i=1}^{k} α_i z_i }    (29)
Two vectors u, v ∈ Z_k are orthogonal if their correlation is zero. Any vector x can be uniquely decomposed into two parts: x̄ ∈ Z_k and x̃ ⊥ Z_k:

x = x̄ + x̃    (30)
Theorem [10]: Let {x_k}, {z_k} be random processes with zero mean. If either (1) the random processes are Gaussian, or (2) the optimal estimate is restricted to be a linear function of the observed random variables and the loss function L(e_k) is quadratic in e_k = x_k - x̄_k, where L(·) is a positive non-decreasing function of the error, then

x̄_k = optimal estimate of x_k given {z_k}    (31)
    = orthogonal projection x̄_k of x_k on Z_k
Derivation
Assume Z_{k-1} is known and z_k is measured. Let z̃_k be the component of z_k orthogonal to Z_{k-1}. The component z̃_k generates a linear manifold Z̃_k:

Z_k = Z_{k-1} ⊕ Z̃_k    (32)
Every vector in Z̃_k is orthogonal to every vector in Z_{k-1}.
Assume that x^a_{k-1} is known; then:
x^a_k ≜ E[x_k | Z_k]    (33)
     = E[x_k | Z_{k-1}] + E[x_k | Z̃_k]
     = x^f_k + E[x_k | Z̃_k]
where the forecast value x^f_k of x_k can be obtained as in (11) and the forecast covariance matrix is given by (13):

x^f_k = A_{k-1} x^a_{k-1} + B_{k-1} u_{k-1}    (34)
P^f_k = A_{k-1} P_{k-1} A_{k-1}^T + Q_{k-1}    (35)
Assume that the last term in (33) is a linear operation on the random variable z̃_k (called the innovation):

E[x_k | Z̃_k] = K_k z̃_k    (36)
where

z̃_k = z_k - z̄_k    (37)

and z̄_k is the orthogonal projection of z_k on Z_{k-1}. So:
z̄_k = E[z_k | Z_{k-1}]    (38)
    = E[H_k x_k + v_k | Z_{k-1}]
    = H_k x^f_k
Hence, (33) becomes:

x^a_k = x^f_k + K_k (z_k - H_k x^f_k)    (39)
      = (I - K_k H_k) x^f_k + K_k z_k
The estimate error is:

e_k ≜ x_k - x^a_k    (40)
    = A_{k-1} e_{k-1} - K_k H_k A_{k-1} e_{k-1} + (I - K_k H_k) w_{k-1} - K_k v_k
    = (I - K_k H_k)(A_{k-1} e_{k-1} + w_{k-1}) - K_k v_k
Therefore the error covariance matrix, derived as in (20), is:

P_k ≜ E[e_k e_k^T]    (41)
    = P^f_k - K_k H_k P^f_k - P^f_k H_k^T K_k^T + K_k D_k K_k^T

where

D_k = H_k P^f_k H_k^T + R_k    (42)
We have to find an explicit formula for K_k by noting that the residual x_k - E[x_k | Z̃_k] is orthogonal to Z̃_k, and therefore it is orthogonal to z̃_k. It follows that:
0 = E[(x_k - E[x_k | Z̃_k]) z̃_k^T]    (43)
  = E[(x_k - K_k z̃_k) z̃_k^T]
  = E[x_k z̃_k^T] - K_k E[z̃_k z̃_k^T]
We know that x_k = x̄_k + x̃_k, where x̄_k ∈ Z_{k-1}; therefore x̄_k is orthogonal to Z̃_k, so x̄_k ⊥ z̃_k.
E[x_k z̃_k^T] = E[(x̄_k + x̃_k) z̃_k^T]    (44)
            = E[x̃_k z̃_k^T]
            = E[(x_k - x̄_k) z̃_k^T]
            = E[(A_{k-1} x_{k-1} + B_{k-1} u_{k-1} + w_{k-1} - E[x_k | Z_{k-1}]) z̃_k^T]
            = E[(A_{k-1} x_{k-1} + B_{k-1} u_{k-1} + w_{k-1} - A_{k-1} x^a_{k-1} - B_{k-1} u_{k-1}) z̃_k^T]
            = E[(A_{k-1} e_{k-1} + w_{k-1}) z̃_k^T]
            = A_{k-1} E[e_{k-1} (z_k - H_k x^f_k)^T] + E[w_{k-1} (z_k - H_k x^f_k)^T]
We can obtain an expression of the innovation as a function of the estimate error:

z_k - H_k x^f_k = H_k x_k + v_k - H_k x^f_k    (45)
               = H_k (x_k - x^f_k) + v_k
               = H_k (A_{k-1} x_{k-1} + B_{k-1} u_{k-1} + w_{k-1} - A_{k-1} x^a_{k-1} - B_{k-1} u_{k-1}) + v_k
               = H_k A_{k-1} e_{k-1} + H_k w_{k-1} + v_k
Substituting this into (44):

E[x_k z̃_k^T] = A_{k-1} E[e_{k-1} (H_k A_{k-1} e_{k-1} + H_k w_{k-1} + v_k)^T]    (46)
              + E[w_{k-1} (H_k A_{k-1} e_{k-1} + H_k w_{k-1} + v_k)^T]
            = A_{k-1} P_{k-1} A_{k-1}^T H_k^T + Q_{k-1} H_k^T
            = P^f_k H_k^T
The last term from (43) is:

E[z̃_k z̃_k^T] = E[(H_k A_{k-1} e_{k-1} + H_k w_{k-1} + v_k)(H_k A_{k-1} e_{k-1} + H_k w_{k-1} + v_k)^T]    (47)
            = H_k A_{k-1} P_{k-1} A_{k-1}^T H_k^T + H_k Q_{k-1} H_k^T + R_k
            = H_k P^f_k H_k^T + R_k
With (46) and (47), (43) becomes:

0 = E[x_k z̃_k^T] - K_k E[z̃_k z̃_k^T]    (48)
  = P^f_k H_k^T - K_k (H_k P^f_k H_k^T + R_k)
It follows that the gain matrix K_k is:

K_k = P^f_k H_k^T (H_k P^f_k H_k^T + R_k)^{-1}    (49)
The Kalman Filter equations are given by (34), (35), (33), (49) and (41). Note that in the original Kalman paper the gain derived there, denoted here Δ_k, is given by Δ_k = A_{k-1} K_k.
6 Information form
In the information filter (inverse-covariance filter) the estimated vector and the covariance matrix are replaced by the information state y_k and the information matrix Y_k, respectively:

y^a_k ≜ Y_k x^a_k    (50)
Y_k ≜ P_k^{-1}    (51)
The forecast estimate and the forecast covariance matrix have the same information form:

y^f_k ≜ Y^f_k x^f_k    (52)
Y^f_k ≜ (P^f_k)^{-1}    (53)
With these changes we can write the Kalman filter equations in the information form [6]. The data assimilation equations become:

x^a_k = x^f_k + K_k (z_k - H_k x^f_k)    (54)
P_k y^a_k = P^f_k y^f_k + K_k (z_k - H_k P^f_k y^f_k)
          = (I - K_k H_k) P^f_k y^f_k + K_k z_k
          = P_k y^f_k + K_k z_k
y^a_k = y^f_k + P_k^{-1} K_k z_k
      = y^f_k + (P^f_k)^{-1} (I - K_k H_k)^{-1} K_k z_k
      = y^f_k + [K_k^{-1} (I - K_k H_k) P^f_k]^{-1} z_k
      = y^f_k + (K_k^{-1} P^f_k - H_k P^f_k)^{-1} z_k
      = y^f_k + [(H_k P^f_k H_k^T + R_k)(H_k^T)^{-1} (P^f_k)^{-1} P^f_k - H_k P^f_k]^{-1} z_k
y^a_k = y^f_k + H_k^T R_k^{-1} z_k    (55)
Derivation of the information matrix follows immediately from the posterior covariance matrix formula.
P_k^{-1} = (P^f_k)^{-1} (I - K_k H_k)^{-1}    (56)
         = (P^f_k)^{-1} [K_k (K_k^{-1} - H_k)]^{-1}
         = (P^f_k)^{-1} { K_k [(H_k P^f_k H_k^T + R_k)(H_k^T)^{-1} (P^f_k)^{-1} - H_k] }^{-1}
         = (P^f_k)^{-1} [K_k R_k (H_k^T)^{-1} (P^f_k)^{-1}]^{-1}
         = H_k^T R_k^{-1} K_k^{-1}
         = H_k^T R_k^{-1} (H_k P^f_k H_k^T + R_k)(H_k^T)^{-1} (P^f_k)^{-1}
         = H_k^T R_k^{-1} H_k + H_k^T R_k^{-1} R_k (H_k^T)^{-1} (P^f_k)^{-1}

Y_k = Y^f_k + H_k^T R_k^{-1} H_k    (57)
Provided that A_{k-1} is nonsingular, the equations of the model forecast become:

x^f_k = A_{k-1} x^a_{k-1} + B_{k-1} u_{k-1}    (58)
y^f_k = (P^f_k)^{-1} A_{k-1} P_{k-1} y^a_{k-1} + Y^f_k B_{k-1} u_{k-1}
      = [A_{k-1} P_{k-1} A_{k-1}^T + Q_{k-1}]^{-1} A_{k-1} P_{k-1} y^a_{k-1} + Y^f_k B_{k-1} u_{k-1}
      = { P_{k-1}^{-1} A_{k-1}^{-1} [A_{k-1} P_{k-1} A_{k-1}^T + Q_{k-1}] }^{-1} y^a_{k-1} + Y^f_k B_{k-1} u_{k-1}
      = (A_{k-1}^T + Y_{k-1} A_{k-1}^{-1} Q_{k-1})^{-1} y^a_{k-1} + Y^f_k B_{k-1} u_{k-1}
      = [I + (A_{k-1}^{-1})^T Y_{k-1} A_{k-1}^{-1} Q_{k-1}]^{-1} (A_{k-1}^T)^{-1} y^a_{k-1} + Y^f_k B_{k-1} u_{k-1}
y^f_k = (I + M_{k-1} Q_{k-1})^{-1} (A_{k-1}^T)^{-1} y^a_{k-1} + Y^f_k B_{k-1} u_{k-1}    (59)
where M_k = (A_k^{-1})^T Y_k A_k^{-1}. Using the forecast covariance matrix recurrence formula we can derive its counterpart information matrix:
Y^f_k = (A_{k-1} P_{k-1} A_{k-1}^T + Q_{k-1})^{-1}    (60)
      = (I + M_{k-1} Q_{k-1})^{-1} M_{k-1}
The summary of the information form of the Kalman filter:

Initialization (given μ_0 and P_0):

Y_0 = P_0^{-1}
y^a_0 = Y_0 μ_0

Model Forecast Step/Predictor:

M_k = (A_k^{-1})^T Y_k A_k^{-1}
Y^f_k = (I + M_{k-1} Q_{k-1})^{-1} M_{k-1}
y^f_k = (I + M_{k-1} Q_{k-1})^{-1} (A_{k-1}^T)^{-1} y^a_{k-1} + Y^f_k B_{k-1} u_{k-1}

Data Assimilation Step/Corrector:

y^a_k = y^f_k + H_k^T R_k^{-1} z_k
Y_k = Y^f_k + H_k^T R_k^{-1} H_k
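As a sanity check on these formulas, the sketch below runs one predictor/corrector cycle in information coordinates and then recovers the state estimate x^a_k = Y_k^{-1} y^a_k. The matrices are the same assumed placeholders used in the earlier sketches, with A taken invertible as this section requires.

```python
import numpy as np

def info_predict(y_a, Y, A, B, u, Q):
    """Information-form forecast: M_{k-1}, Y^f_k and y^f_k from the summary."""
    n = A.shape[0]
    Ainv = np.linalg.inv(A)
    M = Ainv.T @ Y @ Ainv
    F = np.linalg.inv(np.eye(n) + M @ Q)          # common factor (I + M Q)^{-1}
    Yf = F @ M
    yf = F @ np.linalg.inv(A.T) @ y_a + Yf @ B @ u
    return yf, Yf

def info_update(yf, Yf, z, H, R):
    """Information-form correction: y^a_k and Y_k."""
    Rinv = np.linalg.inv(R)
    return yf + H.T @ Rinv @ z, Yf + H.T @ Rinv @ H

# Assumed example system (A invertible), one forecast/assimilation cycle.
A = np.array([[1.0, 1.0], [0.0, 1.0]]); B = np.array([[0.5], [1.0]])
H = np.array([[1.0, 0.0]]); Q = 0.01 * np.eye(2); R = np.array([[0.25]])
Y0 = np.linalg.inv(np.eye(2))                     # Y_0 = P_0^{-1}
y0 = Y0 @ np.zeros(2)                             # y^a_0 = Y_0 mu_0
yf, Yf = info_predict(y0, Y0, A, B, np.array([0.1]), Q)
ya, Yk = info_update(yf, Yf, np.array([0.3]), H, R)
x_a = np.linalg.solve(Yk, ya)                     # recover x^a_k = Y_k^{-1} y^a_k
```

The information form is convenient when little is known about the initial state and when there are many measurements, since the corrector equations (55) and (57) are simple additive updates in H_k^T R_k^{-1}.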
7 Innovation approach
This method solves the estimation problem more easily by using the innovation process, which is the observed process converted into a white-noise process. The innovation represents the new information contained in the observation z_k, given all the past observations and the old information deduced from them. It is defined as:

z̃_k = z_k - E[z_k | Z_{k-1}]    (61)
Several properties of the innovation [3]:
1. The innovation z̃_k, associated with the current observation, is uncorrelated with the past observations: E[z̃_k z_j^T] = 0 for j = 1, 2, ..., k-1.
2. The innovations are orthogonal to each other: E[z̃_i z̃_j^T] = 0 for i ≠ j.
3. There is a one-to-one correspondence between the innovation and the associated observation.
4. It has zero mean.
We know that x^a_k ≜ E[x_k | Z_k]. Since the innovation sequence {z̃_k} contains all the information in the observation sequence {z_k} [2], the estimate can be assumed to be a linear combination of all the innovations up to k [9]:

x^a_k = Σ_{i=1}^{k} I_i z̃_i    (62)
where I_i is an n × m matrix to be determined. We know by the Projection Theorem that the estimate error e_k is uncorrelated with the innovation sequence. Therefore, for all i up to k:

0 = E[e_k z̃_i^T]    (63)
  = E[(x_k - x^a_k) z̃_i^T]
E[x_k z̃_i^T] = E[x^a_k z̃_i^T]
E[x_k z̃_i^T] = Σ_{l=1}^{k} I_l E[z̃_l z̃_i^T]
            = I_i E[z̃_i z̃_i^T]
All the terms up to k-1 go to zero since the innovations are temporally uncorrelated. Here E[z̃_i z̃_i^T] is the innovation covariance Cov(z̃_i). It follows that:

I_i = E[x_k z̃_i^T] Cov^{-1}(z̃_i)    (64)
Substitute this into (62):

x^a_k = Σ_{i=1}^{k} E[x_k z̃_i^T] Cov^{-1}(z̃_i) z̃_i    (65)
      = Σ_{i=1}^{k-1} E[x_k z̃_i^T] Cov^{-1}(z̃_i) z̃_i + E[x_k z̃_k^T] Cov^{-1}(z̃_k) z̃_k
      = Σ_{i=1}^{k-1} E[(A_{k-1} x_{k-1} + B_{k-1} u_{k-1} + w_{k-1}) z̃_i^T] Cov^{-1}(z̃_i) z̃_i + E[x_k z̃_k^T] Cov^{-1}(z̃_k) z̃_k
      = A_{k-1} Σ_{i=1}^{k-1} E[x_{k-1} z̃_i^T] Cov^{-1}(z̃_i) z̃_i + E[x_k z̃_k^T] Cov^{-1}(z̃_k) z̃_k
      = A_{k-1} x^a_{k-1} + E[x_k z̃_k^T] Cov^{-1}(z̃_k) z̃_k
      = A_{k-1} x^a_{k-1} + K_k z̃_k
where K_k = E[x_k z̃_k^T] Cov^{-1}(z̃_k).
The error in the estimate is:

e_k ≜ x_k - x^a_k    (66)
    = A_{k-1} e_{k-1} + w_{k-1} - K_k z̃_k
Then, the posterior covariance of the new estimate is:

P_k ≜ E[e_k e_k^T]    (67)
    = A_{k-1} E[e_{k-1} e_{k-1}^T] A_{k-1}^T - A_{k-1} E[e_{k-1} z̃_k^T] K_k^T + E[w_{k-1} w_{k-1}^T] - E[w_{k-1} z̃_k^T] K_k^T
      - K_k E[z̃_k e_{k-1}^T] A_{k-1}^T - K_k E[z̃_k w_{k-1}^T] + K_k E[z̃_k z̃_k^T] K_k^T
    = A_{k-1} P_{k-1} A_{k-1}^T - A_{k-1} P_{k-1} A_{k-1}^T H_k^T K_k^T + Q_{k-1} - Q_{k-1} H_k^T K_k^T
      - K_k H_k A_{k-1} P_{k-1} A_{k-1}^T - K_k H_k Q_{k-1} + K_k Cov(z̃_k) K_k^T

Denote P^f_k = A_{k-1} P_{k-1} A_{k-1}^T + Q_{k-1}; substituting this back into (67) yields:
P_k = P^f_k - K_k H_k P^f_k - P^f_k H_k^T K_k^T + K_k Cov(z̃_k) K_k^T    (68)
    = P^f_k - K_k H_k P^f_k - P^f_k H_k^T K_k^T + E[x_k z̃_k^T] K_k^T

But E[x_k z̃_k^T] = P^f_k H_k^T (see (46)) and Cov(z̃_k) = H_k P^f_k H_k^T + R_k (see (47)). Then:
P_k = (I - K_k H_k) P^f_k    (69)
K_k = P^f_k H_k^T (H_k P^f_k H_k^T + R_k)^{-1}    (70)
The equations (61), (62), (69) and (70) define the Kalman Filter algorithm.
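The whiteness of the innovation sequence (property 2 above) can be checked numerically. The following sketch filters simulated data and estimates the lag-1 correlation of the innovations, which should be close to zero when the A, H, Q, R used by the filter match the true system; the system matrices, horizon, and random seed are the same assumed placeholders as in the earlier sketches.

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[1.0, 1.0], [0.0, 1.0]]); B = np.array([[0.5], [1.0]])
H = np.array([[1.0, 0.0]]); Q = 0.01 * np.eye(2); R = np.array([[0.25]])
u = np.array([0.1])

x_true = np.zeros(2)                 # true state of the simulated system
x_a, P = np.zeros(2), np.eye(2)      # filter state: x^a_0 and P_0 (assumed)
innovations = []
for _ in range(500):
    # simulate the true system (4)-(5)
    x_true = A @ x_true + B @ u + rng.multivariate_normal(np.zeros(2), Q)
    z = H @ x_true + rng.multivariate_normal(np.zeros(1), R)
    # forecast step, equations (11) and (14)
    x_f = A @ x_a + B @ u
    P_f = A @ P @ A.T + Q
    # innovation and data assimilation step, equations (27), (39), (69)
    nu = z - H @ x_f
    innovations.append(nu.item())
    K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)
    x_a = x_f + K @ nu
    P = (np.eye(2) - K @ H) @ P_f

nu = np.array(innovations)
print("lag-1 innovation correlation:", np.corrcoef(nu[:-1], nu[1:])[0, 1])
```

A clearly non-zero lag-1 correlation is a common symptom of a mismatched model (for instance, a Q_k that is too small), which connects to the divergence discussion in the next section.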
8 Properties of Kalman Filter
Stability - Jazwinski Theorem
Asymptotic stability of the KF means that its solution will gradually become insensitive to its initial conditions, provided that the norms of the noise covariance matrices, Q_k and R_k, are bounded.
If the system (4)-(5), with x_0, w_k, v_k independent, is uniformly completely observable and uniformly completely controllable, and if P_0 ≥ 0, then the discrete-time KF is uniformly asymptotically stable.
Filter Divergence
This phenomenon occurs when the filter seems to behave well, having low error variance, but the estimate is far away from the truth. This is due to errors in the system modeling: the model error is higher than expected, the system model has the wrong form, the system is unstable or has bias errors.
9 Remarks
1. The filter produces the error covariance matrix P_k, which is an important measure of the accuracy of the estimate.
2. The filter is optimal for Gaussian sequences only.
3. While the measurement noise covariance R_k can usually be determined, the process noise covariance matrix Q_k has to be adjusted to the different dynamics, since we are not able to directly observe the process we are estimating. Therefore a tuning of Q_k has to be performed for superior filter performance.
10 Conclusion
While most of the filters are formulated in the frequency domain, the Kalman Filter is a purely time-domain filter.
The main issue remains how the uncertainties are represented.
References
[1] Michael Athans. The Control Handbook, chapter Kalman Filtering, pages 589-594. CRC Press, 1996.
[2] A. V. Balakrishnan. Kalman Filtering Theory. Optimization Software, Inc., 1984.
[3] Mourad Barkat. Signal Detection and Estimation. Artech House Inc., 2005.
[4] Jon Dattorro. Convex Optimization & Euclidean Distance Geometry, chapter Matrix Calculus. Meboo Publishing USA, 2006.
[5] Henk Eskes. Data Assimilation: The Kalman Filter.
[6] Mohinder S. Grewal and Angus P. Andrews. Kalman Filtering: Theory and Practice Using MATLAB, 2nd edition. John Wiley & Sons, 2001.
[7] John M. Lewis and S. Lakshmivarahan. Dynamic Data Assimilation: A Least Squares Approach. 2006.
[8] Peter S. Maybeck. Introduction to Random Signals and Applied Kalman Filtering. Academic Press, 1979.
[9] Athanasios Papoulis. Probability, Random Variables, and Stochastic Processes. McGraw-Hill, Inc., 2nd edition, 1965.
[10] R. E. Kalman. A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME, 1960.
[11] Greg Welch and Gary Bishop. An Introduction to the Kalman Filter. SIGGRAPH, ACM, 2001.