Camera-IMU-based localization: Observability analysis and consistency improvement
Joel A Hesch, Dimitrios G Kottas, Sean L Bowman and Stergios I Roumeliotis
The International Journal of Robotics Research 2014 33: 182, originally published online 13 November 2013
DOI: 10.1177/0278364913509675
The online version of this article can be found at: http://ijr.sagepub.com/content/33/1/182
Published by:
http://www.sagepublications.com
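$$ {}^I\dot{\bar q}_G(t) = \tfrac{1}{2}\,\Omega(\omega(t))\,{}^I\bar q_G(t) \tag{2} $$
$$ {}^G\dot p_I(t) = {}^G v_I(t) \tag{3} $$
$$ {}^G\dot v_I(t) = {}^G a_I(t) \tag{4} $$
$$ \dot b_g(t) = n_{wg}(t) \tag{5} $$
$$ \dot b_a(t) = n_{wa}(t) \tag{6} $$
$$ {}^G\dot p_{f_i}(t) = 0_{3\times 1}, \quad i = 1, \ldots, N \tag{7} $$
In these expressions, $\omega(t) = [\omega_1(t)\ \ \omega_2(t)\ \ \omega_3(t)]^T$ is the rotational velocity of the IMU, expressed in {I}, ${}^G a_I(t)$ is the body acceleration expressed in {G}, and
$$ \Omega(\omega) = \begin{bmatrix} -\lfloor \omega \times \rfloor & \omega \\ -\omega^T & 0 \end{bmatrix}, \qquad \lfloor \omega \times \rfloor \triangleq \begin{bmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{bmatrix} $$
The gyroscope and accelerometer measurements, $\omega_m$ and $a_m$, are modeled as
$$ \omega_m(t) = \omega(t) + b_g(t) + n_g(t) $$
$$ a_m(t) = C({}^I\bar q_G(t))\,\left({}^G a_I(t) - {}^G g\right) + b_a(t) + n_a(t) $$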
where $n_g$ and $n_a$ are zero-mean, white Gaussian noise processes, and ${}^G g$ is the gravitational acceleration. The matrix $C(\bar q)$ is the rotation matrix corresponding to $\bar q$. The observed features belong to a static scene; hence, their time derivatives are zero (see (7)). Linearizing at the current estimates and applying the expectation operator on both sides of (2)–(7), we obtain the state estimate propagation model
$$ {}^I\dot{\hat{\bar q}}_G(t) = \tfrac{1}{2}\,\Omega(\hat\omega(t))\,{}^I\hat{\bar q}_G(t) \tag{8} $$
$$ {}^G\dot{\hat p}_I(t) = {}^G\hat v_I(t) \tag{9} $$
$$ {}^G\dot{\hat v}_I(t) = C^T({}^I\hat{\bar q}_G(t))\,\hat a_I(t) + {}^G g \tag{10} $$
$$ \dot{\hat b}_g(t) = 0_{3\times 1} \tag{11} $$
$$ \dot{\hat b}_a(t) = 0_{3\times 1} \tag{12} $$
$$ {}^G\dot{\hat p}_{f_i}(t) = 0_{3\times 1}, \quad i = 1, \ldots, N \tag{13} $$
where $\hat a_I(t) = a_m(t) - \hat b_a(t)$, and $\hat\omega(t) = \omega_m(t) - \hat b_g(t)$. The $(15+3N) \times 1$ error-state vector is defined as
$$ \tilde x = \begin{bmatrix} {}^I\delta\theta_G^T & \tilde b_g^T & {}^G\tilde v_I^T & \tilde b_a^T & {}^G\tilde p_I^T & | & {}^G\tilde p_{f_1}^T & \cdots & {}^G\tilde p_{f_N}^T \end{bmatrix}^T = \begin{bmatrix} \tilde x_s^T & | & \tilde x_m^T \end{bmatrix}^T $$
where $\tilde x_s(t)$ is the $15 \times 1$ error state corresponding to the sensing platform, and $\tilde x_m(t)$ is the $3N \times 1$ error state of the map. For the IMU position, velocity, biases, and the map, an additive error model is employed (i.e. $\tilde y = y - \hat y$ is the error in the estimate $\hat y$ of a quantity $y$). However, for the quaternion we employ a multiplicative error model (Trawny and Roumeliotis 2005). Specifically, the error between the quaternion $\bar q$ and its estimate $\hat{\bar q}$ is the $3 \times 1$ angle-error vector, $\delta\theta$, implicitly defined by the error quaternion
$$ \delta\bar q = \bar q \otimes \hat{\bar q}^{-1} \simeq \begin{bmatrix} \tfrac{1}{2}\delta\theta^T & 1 \end{bmatrix}^T $$
where $\delta\bar q$ describes the small rotation that causes the true and estimated attitude to coincide. This allows us to represent the attitude uncertainty by the $3 \times 3$ covariance matrix $E[\delta\theta\,\delta\theta^T]$, which is a minimal representation.
The linearized continuous-time error-state equation is
$$ \dot{\tilde x} = \begin{bmatrix} F_s & 0_{15\times 3N} \\ 0_{3N\times 15} & 0_{3N} \end{bmatrix} \tilde x + \begin{bmatrix} G_s \\ 0_{3N\times 12} \end{bmatrix} n = F_c\,\tilde x + G_c\,n $$
where $0_{3N}$ denotes the $3N \times 3N$ matrix of zeros, $n = [\,n_g^T\ \ n_{wg}^T\ \ n_a^T\ \ n_{wa}^T\,]^T$ is the system noise, $F_s$ is the continuous-time error-state transition matrix corresponding to the sensor platform state, and $G_s$ is the continuous-time input noise matrix, i.e.
$$ F_s = \begin{bmatrix}
-\lfloor \hat\omega \times \rfloor & -I_3 & 0_3 & 0_3 & 0_3 \\
0_3 & 0_3 & 0_3 & 0_3 & 0_3 \\
-C^T({}^I\hat{\bar q}_G)\lfloor \hat a_I \times \rfloor & 0_3 & 0_3 & -C^T({}^I\hat{\bar q}_G) & 0_3 \\
0_3 & 0_3 & 0_3 & 0_3 & 0_3 \\
0_3 & 0_3 & I_3 & 0_3 & 0_3
\end{bmatrix} $$
$$ G_s = \begin{bmatrix}
-I_3 & 0_3 & 0_3 & 0_3 \\
0_3 & I_3 & 0_3 & 0_3 \\
0_3 & 0_3 & -C^T({}^I\hat{\bar q}_G) & 0_3 \\
0_3 & 0_3 & 0_3 & I_3 \\
0_3 & 0_3 & 0_3 & 0_3
\end{bmatrix} $$
where $0_3$ is the $3 \times 3$ matrix of zeros. The system noise is modelled as a zero-mean white Gaussian process with autocorrelation $E[n(t)\,n^T(\tau)] = Q_c\,\delta(t - \tau)$, which depends on the IMU noise characteristics and is computed off-line (Trawny and Roumeliotis 2005).
3.1.2. Discrete-time implementation The IMU signals $\omega_m$ and $a_m$ are sampled at a constant rate $1/\delta t$, where $\delta t \triangleq t_{k+1} - t_k$. Every time a new IMU measurement is received, the state estimate is propagated using integration of (8)–(13). In order to derive the covariance propagation equation, we compute the discrete-time state transition matrix, $\Phi_{k+1,k}$, from time-step $t_k$ to $t_{k+1}$, as the solution to the following matrix differential equation:
$$ \dot\Phi_{k+1,k} = F_c\,\Phi_{k+1,k}, \qquad \text{initial condition } \Phi_{k,k} = I_{15+3N} \tag{14} $$
which can be calculated analytically, as we show in Hesch et al. (2012a), or numerically. We also compute the discrete-time system noise covariance matrix, $Q_{d,k}$,
$$ Q_{d,k} = \int_{t_k}^{t_{k+1}} \Phi(t_{k+1}, \tau)\,G_c\,Q_c\,G_c^T\,\Phi^T(t_{k+1}, \tau)\,d\tau. $$
The propagated covariance is then computed as
$$ P_{k+1|k} = \Phi_{k+1,k}\,P_{k|k}\,\Phi_{k+1,k}^T + Q_{d,k}. \tag{15} $$
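As a minimal sketch of the covariance propagation step (15), the snippet below applies it to a toy two-state system. The 2×2 matrices, the constant-velocity transition, and the helper names (`matmul`, `transpose`, `propagate_covariance`) are illustrative assumptions, not the paper's 15+3N-dimensional VINS quantities or implementation.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def propagate_covariance(phi, p, q_d):
    """Return Phi * P * Phi^T + Q_d, the form of equation (15)."""
    prop = matmul(matmul(phi, p), transpose(phi))
    return [[prop[i][j] + q_d[i][j] for j in range(len(p))]
            for i in range(len(p))]


# Toy example (assumed values): position/velocity state, constant-velocity model.
dt = 0.1
phi = [[1.0, dt], [0.0, 1.0]]      # stand-in for Phi_{k+1,k}
p0 = [[1.0, 0.0], [0.0, 1.0]]      # stand-in for P_{k|k}
q_d = [[0.0, 0.0], [0.0, 0.01]]    # stand-in for Q_{d,k}
p1 = propagate_covariance(phi, p0, q_d)
```

The propagated covariance stays symmetric because both terms of (15) are symmetric; in a real filter, $\Phi_{k+1,k}$ and $Q_{d,k}$ would come from (14) and the integral above.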
3.2. Measurement update model
As the camera-IMU platform moves, the camera observes
visual features which are tracked over multiple image
frames. These measurements are exploited to estimate the
motion of the sensing platform and (optionally) the map of
the environment.
To simplify the discussion, we consider the observation of a single point $p_{f_i}$. The camera measures $z_i$, which is the perspective projection of the 3D point ${}^I p_{f_i}$, expressed in the current IMU frame {I}, onto the image plane, i.e.
$$ z_i = \frac{1}{p_z} \begin{bmatrix} p_x \\ p_y \end{bmatrix} + \eta_i \tag{16} $$
where
$$ \begin{bmatrix} p_x \\ p_y \\ p_z \end{bmatrix} = {}^I p_{f_i} = C({}^I\bar q_G)\left({}^G p_{f_i} - {}^G p_I\right) \tag{17} $$
Here, the measurement noise, $\eta_i$, is modeled as zero-mean, white Gaussian with covariance $R_i$. We note that, without loss of generality, we consider the image measurement in normalized pixel coordinates, and define the camera frame to be coincident with the IMU. In practice, we perform both intrinsic and extrinsic camera-IMU calibration off-line (Bouguet 2006; Mirzaei and Roumeliotis 2008).
The linearized error model is
$$ \tilde z_i = z_i - \hat z_i \simeq H_i\,\tilde x + \eta_i, \tag{18} $$
where $\hat z_i = h(\hat x)$ is the expected measurement computed by evaluating (16)–(17) at the current state estimate, and the measurement Jacobian, $H_i$, is
$$ H_i = H_c \begin{bmatrix} H_\theta & 0_{3\times 9} & H_p & | & 0_3 & \cdots & H_{f_i} & \cdots & 0_3 \end{bmatrix} \tag{19} $$
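where the partial derivatives are
$$ H_c = \frac{\partial h}{\partial\,{}^I p_{f_i}} = \frac{1}{p_z^2} \begin{bmatrix} p_z & 0 & -p_x \\ 0 & p_z & -p_y \end{bmatrix} $$
$$ H_\theta = \frac{\partial\,{}^I p_{f_i}}{\partial\,\delta\theta} = \left\lfloor C({}^I\bar q_G)\left({}^G p_{f_i} - {}^G p_I\right) \times \right\rfloor $$
$$ H_p = \frac{\partial\,{}^I p_{f_i}}{\partial\,{}^G p_I} = -C({}^I\bar q_G) $$
$$ H_{f_i} = \frac{\partial\,{}^I p_{f_i}}{\partial\,{}^G p_{f_i}} = C({}^I\bar q_G) $$
i.e. $H_c$ is the Jacobian of the perspective projection with respect to ${}^I p_{f_i}$, while $H_\theta$, $H_p$, and $H_{f_i}$ are the Jacobians of ${}^I p_{f_i}$ with respect to ${}^I\bar q_G$, ${}^G p_I$, and ${}^G p_{f_i}$, respectively.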
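The projection Jacobian $H_c$ can be sanity-checked numerically. The sketch below compares the closed form $\frac{1}{p_z^2}[p_z\ 0\ -p_x;\ 0\ p_z\ -p_y]$ against central finite differences of the perspective projection; the test point and the function names are assumptions chosen only for illustration.

```python
def project(p):
    """Perspective projection h(p) = [p_x / p_z, p_y / p_z] of a point in {I}."""
    px, py, pz = p
    return [px / pz, py / pz]


def projection_jacobian(p):
    """Closed-form H_c = (1 / p_z^2) * [[p_z, 0, -p_x], [0, p_z, -p_y]]."""
    px, py, pz = p
    s = 1.0 / (pz * pz)
    return [[s * pz, 0.0, -s * px], [0.0, s * pz, -s * py]]


def numeric_jacobian(p, eps=1e-6):
    """Central-difference approximation of dh/dp, returned as a 2x3 matrix."""
    cols = []
    for j in range(3):
        hi, lo = list(p), list(p)
        hi[j] += eps
        lo[j] -= eps
        zh, zl = project(hi), project(lo)
        cols.append([(zh[i] - zl[i]) / (2.0 * eps) for i in range(2)])
    return [[cols[j][i] for j in range(3)] for i in range(2)]


p_f_in_I = [0.4, -0.2, 2.0]   # assumed feature position in the IMU frame
H_c = projection_jacobian(p_f_in_I)
H_num = numeric_jacobian(p_f_in_I)
max_err = max(abs(H_c[i][j] - H_num[i][j]) for i in range(2) for j in range(3))
```

A small `max_err` indicates that the analytic and numerical Jacobians agree at this point; it is not a proof, but it is a cheap guard against sign errors when implementing (19).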
This measurement model is used independently of whether the map of the environment $x_m$ is part of the state vector (V-SLAM) or not (VIO). Specifically, for the case of V-SLAM, when features that are already mapped are observed, the measurement model (16)–(19) can be directly applied to update the filter. In particular, we compute the measurement residual,
$$ r_i = z_i - \hat z_i $$
the covariance of the residual,
$$ S_i = H_i\,P_{k+1|k}\,H_i^T + R_i $$
and the Kalman gain,
$$ K_i = P_{k+1|k}\,H_i^T\,S_i^{-1} $$
Employing these quantities, we compute the EKF state and covariance update as
$$ \hat x_{k+1|k+1} = \hat x_{k+1|k} + K_i\,r_i $$
$$ P_{k+1|k+1} = P_{k+1|k} - K_i\,S_i\,K_i^T. $$
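The update equations above can be sketched for a small, purely additive state as follows. This is an illustrative toy, assuming a 3-element state and a 2-element measurement so that $S_i$ is 2×2 and can be inverted in closed form; it ignores the multiplicative quaternion correction that the full filter requires, and all helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(c) for c in zip(*a)]


def add(a, b, sign=1.0):
    """Return a + sign * b element-wise."""
    return [[a[i][j] + sign * b[i][j] for j in range(len(a[0]))]
            for i in range(len(a))]


def inv2(m):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def ekf_update(x, P, H, R, r):
    """One EKF update: S = H P H^T + R, K = P H^T S^-1, P <- P - K S K^T."""
    Ht = transpose(H)
    S = add(matmul(matmul(H, P), Ht), R)
    K = matmul(matmul(P, Ht), inv2(S))
    dx = matmul(K, [[ri] for ri in r])
    x_new = [xi + d[0] for xi, d in zip(x, dx)]
    P_new = add(P, matmul(matmul(K, S), transpose(K)), sign=-1.0)
    return x_new, P_new


# Assumed toy values: identity prior covariance, first two states observed.
x0 = [0.0, 0.0, 0.0]
P0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
R = [[1.0, 0.0], [0.0, 1.0]]
residual = [1.0, -1.0]
x1, P1 = ekf_update(x0, P0, H, R, residual)
```

With equal prior and measurement variance, the observed states move halfway toward the measurement and their variance halves, while the unobserved third state is left untouched.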
When features are first observed in V-SLAM, we initialize them into the feature map. To accomplish this, we compute an initial estimate, along with covariance and cross-correlations, by solving a bundle-adjustment over a short time window (Hesch et al. 2012a). Finally, for the case of VIO, the map is not estimated explicitly; instead we use the Multi-State Constraint Kalman Filter (MSC-KF) approach (Mourikis and Roumeliotis 2007) to impose a filter update constraining all the views from which a feature was seen. To accomplish this, we employ stochastic cloning (Roumeliotis and Burdick 2002) over a window of M camera poses.
4. Nonlinear system observability analysis
In this section, we provide a brief overview of the method in Hermann and Krener (1977) for studying the observability of nonlinear systems and then introduce a new methodology for determining its unobservable directions.
4.1. Observability analysis with Lie derivatives
Consider a nonlinear, continuous-time system:
$$ \dot x = f_0(x) + \sum_{i=1}^{\ell} f_i(x)\,u_i, \qquad z = h(x) \tag{20} $$
where $u = [\,u_1\ \ldots\ u_\ell\,]^T$ is the control input, $x = [\,x_1\ \ldots\ x_m\,]^T$ is the state vector, $z$ is the output, and the vector functions $f_i$, $i = 0, \ldots, \ell$, comprise the process model.
Our objective is to study the observability properties of the system and to determine the directions in state-space along which the measurements provide information. To this end, we compute the Lie derivatives of the system. The zeroth-order Lie derivative of the measurement function $h$ is defined as the function itself (Hermann and Krener 1977):
$$ \mathfrak{L}^0 h = h(x) $$
Each subsequent Lie derivative is formed recursively from the definition of $\mathfrak{L}^0 h$. Specifically, for any $i$th-order Lie derivative, $\mathfrak{L}^i h$, the $(i+1)$th-order Lie derivative $\mathfrak{L}^{i+1}_{f_j} h$ with respect to a process function $f_j$ is computed as:
$$ \mathfrak{L}^{i+1}_{f_j} h = \nabla\mathfrak{L}^i h \cdot f_j $$
where $\nabla\mathfrak{L}^i h$ denotes the span of the $i$th-order Lie derivative, i.e.
$$ \nabla\mathfrak{L}^i h = \begin{bmatrix} \dfrac{\partial \mathfrak{L}^i h}{\partial x_1} & \dfrac{\partial \mathfrak{L}^i h}{\partial x_2} & \cdots & \dfrac{\partial \mathfrak{L}^i h}{\partial x_m} \end{bmatrix} $$
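As a small worked illustration of this recursion (a toy double integrator chosen here for clarity, not a system from the paper), take the state $x = [x_1\ x_2]^T$, drift $f_0 = [x_2\ 0]^T$, input direction $f_1 = [0\ 1]^T$, and measurement $h(x) = x_1$:

```latex
\begin{aligned}
\mathfrak{L}^0 h &= x_1, &
\nabla\mathfrak{L}^0 h &= \begin{bmatrix} 1 & 0 \end{bmatrix}, \\
\mathfrak{L}^1_{f_0} h &= \nabla\mathfrak{L}^0 h \cdot f_0 = x_2, &
\nabla\mathfrak{L}^1_{f_0} h &= \begin{bmatrix} 0 & 1 \end{bmatrix}, \\
\mathfrak{L}^1_{f_1} h &= \nabla\mathfrak{L}^0 h \cdot f_1 = 0, &
\nabla\mathfrak{L}^1_{f_1} h &= \begin{bmatrix} 0 & 0 \end{bmatrix}.
\end{aligned}
```

Here the spans $\nabla\mathfrak{L}^0 h$ and $\nabla\mathfrak{L}^1_{f_0} h$ are linearly independent and together cover both state directions, so measuring position alone already provides information about velocity in this toy case.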
In order to determine the directions along which information can be acquired, we examine the span of the Lie derivatives. We do this by forming the observability matrix, $O$, whose block rows comprise the spans of the Lie derivatives of the system, i.e.
$$ O = \begin{bmatrix} \nabla\mathfrak{L}^0 h \\ \nabla\mathfrak{L}^1_{f_i} h \\ \nabla\mathfrak{L}^2_{f_i f_j} h \\ \nabla\mathfrak{L}^3_{f_i f_j f_k} h \\ \vdots \end{bmatrix} $$
where $i, j, k = 1, \ldots, \ell$. Based on Hermann and Krener (1977), to prove that a system is observable, it suffices to show that a submatrix of $O$ comprising a subset of its rows is of full column rank. In contrast, to prove that a system is unobservable and find its unobservable directions, we need to: (a) Show that the infinitely many block rows of $O$ can be written as a linear combination of a subset of its block rows, which form a submatrix [...] $[\,\beta_1(x)^T\ \ldots\ \beta_t(x)^T\,]^T$. These bases are functions of the variable $x$ in (20), and the number of basis elements, $t$, is defined so as to fulfill:
(C1) $\beta_1(x) = h(x)$.
(C2) $\dfrac{\partial\beta}{\partial x} \cdot f_i$, $i = 0, \ldots, \ell$, is a function of $\beta$.
(C3) The system:
$$ \dot\beta = g_0(\beta) + \sum_{i=1}^{\ell} g_i(\beta)\,u_i, \qquad z = h = \beta_1 \tag{21} $$
where $g_i(\beta) = \dfrac{\partial\beta}{\partial x} f_i(x)$, $i = 0, \ldots, \ell$, is observable.
Then:
(i) The observability matrix of (20) can be factorized as:
$$ O = \Xi \cdot B $$
where $\Xi$ is the observability matrix of system (21) and $B \triangleq \dfrac{\partial\beta}{\partial x}$.
(ii) $\text{null}(O) = \text{null}(B)$.
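Proof:
(i) Based on the chain rule, the span of any Lie derivative $\nabla\mathfrak{L}^i h$ can be written as:
$$ \nabla\mathfrak{L}^i h = \frac{\partial \mathfrak{L}^i h}{\partial x} = \frac{\partial \mathfrak{L}^i h}{\partial \beta}\,\frac{\partial \beta}{\partial x} $$
Thus, the observability matrix $O$ of (20) can be factorized as:
$$ O = \begin{bmatrix} \nabla\mathfrak{L}^0 h \\ \nabla\mathfrak{L}^1_{f_i} h \\ \nabla\mathfrak{L}^2_{f_i f_j} h \\ \nabla\mathfrak{L}^3_{f_i f_j f_k} h \\ \vdots \end{bmatrix} = \begin{bmatrix} \dfrac{\partial \mathfrak{L}^0 h}{\partial\beta} \\ \dfrac{\partial \mathfrak{L}^1_{f_i} h}{\partial\beta} \\ \dfrac{\partial \mathfrak{L}^2_{f_i f_j} h}{\partial\beta} \\ \dfrac{\partial \mathfrak{L}^3_{f_i f_j f_k} h}{\partial\beta} \\ \vdots \end{bmatrix} \frac{\partial\beta}{\partial x} = \Xi \cdot B \tag{22} $$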
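Next we prove that $\Xi$ is the observability matrix of the system (21) by induction.
To distinguish them from the Lie derivatives of system (20), let $\mathfrak{I}$ denote the Lie derivatives of system (21). Then, the span of its zeroth-order Lie derivative is:
$$ \nabla\mathfrak{I}^0 h = \frac{\partial h}{\partial\beta} = \frac{\partial \mathfrak{L}^0 h}{\partial\beta} $$
[...]
$$ \nabla\mathfrak{I}^{i+1}_{g_j} h = \frac{\partial\left(\nabla\mathfrak{I}^i h \cdot g_j\right)}{\partial\beta} = \frac{\partial\left(\dfrac{\partial \mathfrak{L}^i h}{\partial\beta}\,\dfrac{\partial\beta}{\partial x}\,f_j(x)\right)}{\partial\beta} = \frac{\partial\left(\dfrac{\partial \mathfrak{L}^i h}{\partial x}\,f_j(x)\right)}{\partial\beta} = \frac{\partial \mathfrak{L}^{i+1}_{f_j} h}{\partial\beta} $$
[...]
$$ = \frac{1}{2}\left(I_3 + \lfloor s \times \rfloor + s\,s^T\right) \tag{24} $$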
5.2. Determining the system's basis functions
In this section, we define the basis functions for the VINS model that satisfy conditions C1 and C2 of Theorem 4.1. We achieve this by applying C1 to obtain $\beta_1$ and recursively employing C2 to define the additional elements $\beta_j$, $j = 2, \ldots, 6$. We note that at each step of this process there may be multiple options for selecting $\beta_j$, and we mitigate this by favoring bases that have a meaningful physical interpretation. After determining the bases, we present the model of the corresponding system (43), and show that it is observable in the next section.
To preserve the clarity of presentation, we retain only a few of the subscripts and superscripts in the state elements and write the system state vector as:
$$ x = \begin{bmatrix} s^T & b_g^T & v^T & b_a^T & p^T & p_f^T \end{bmatrix}^T $$
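The VINS model (see (2)–(7), (16)–(17), and (23)) is expressed in input-affine form as:
$$ \begin{bmatrix} \dot s \\ \dot b_g \\ \dot v \\ \dot b_a \\ \dot p \\ \dot p_f \end{bmatrix} = \underbrace{\begin{bmatrix} -D\,b_g \\ 0_{3\times 1} \\ g - C^T b_a \\ 0_{3\times 1} \\ v \\ 0_{3\times 1} \end{bmatrix}}_{f_0} + \underbrace{\begin{bmatrix} D \\ 0_3 \\ 0_3 \\ 0_3 \\ 0_3 \\ 0_3 \end{bmatrix}}_{f_1} \omega + \underbrace{\begin{bmatrix} 0_3 \\ 0_3 \\ C^T \\ 0_3 \\ 0_3 \\ 0_3 \end{bmatrix}}_{f_2} a \tag{25} $$
$$ z = \frac{1}{p_z} \begin{bmatrix} p_x \\ p_y \end{bmatrix}, \quad \text{where } \begin{bmatrix} p_x \\ p_y \\ p_z \end{bmatrix} = {}^I p_f = C\,(p_f - p) \tag{26} $$
and $C \triangleq C(s)$. Note that $f_0$ is an $18 \times 1$ vector, while $f_1$ and $f_2$ are both $18 \times 3$ matrices, which is a compact way of representing three process functions:
$$ f_1\,\omega = f_{11}\,\omega_1 + f_{12}\,\omega_2 + f_{13}\,\omega_3 $$
$$ f_2\,a = f_{21}\,a_1 + f_{22}\,a_2 + f_{23}\,a_3 $$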
Using this model, we define the bases for this system by applying the conditions of Theorem 4.1. Specifically, we (a) select $\beta_1$ as the measurement function $z$, and (b) recursively determine the remaining bases so that $\dfrac{\partial\beta_j}{\partial x} \cdot f_i$ can be expressed in terms of $\beta$ for all the process functions. Note also that the definition of the bases is not unique: any basis functions that satisfy the conditions of Theorem 4.1 span the same space.
The first basis is defined as the measurement function:
$$ \beta_1 \triangleq h(x) = \frac{1}{p_z} \begin{bmatrix} p_x \\ p_y \end{bmatrix} $$
In order to compute the remaining basis elements, we must ensure that the properties of Theorem 4.1 are satisfied. We do so by applying C2 to $\beta_1$.
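5.2.1. Satisfying condition C2 of Theorem 4.1 for $\beta_1$. We start by computing the span of $\beta_1$ with respect to $x$, i.e.
$$ \frac{\partial\beta_1}{\partial x} = \begin{bmatrix} \dfrac{\partial\beta_1}{\partial s} & \dfrac{\partial\beta_1}{\partial b_g} & \dfrac{\partial\beta_1}{\partial v} & \dfrac{\partial\beta_1}{\partial b_a} & \dfrac{\partial\beta_1}{\partial p} & \dfrac{\partial\beta_1}{\partial p_f} \end{bmatrix} = \underbrace{\begin{bmatrix} \frac{1}{p_z} & 0 & -\frac{p_x}{p_z^2} \\ 0 & \frac{1}{p_z} & -\frac{p_y}{p_z^2} \end{bmatrix}}_{\partial h / \partial\,{}^I p_f} \underbrace{\begin{bmatrix} \lfloor {}^I p_f \times \rfloor \dfrac{\partial\theta}{\partial s} & 0_3 & 0_3 & 0_3 & -C & C \end{bmatrix}}_{\partial\,{}^I p_f / \partial x} \tag{27} $$
where $\dfrac{\partial\theta}{\partial s} = D^{-1}$ (see (24)). Once the span of the first basis function $\beta_1$ is obtained, we project it onto all the process functions, $f_0$, $f_1$, and $f_2$ (see (25)), in order to determine the other basis functions that satisfy condition C2 of Theorem 4.1. During this procedure, our aim is to ensure that every term in the resulting product is a function of the existing basis elements. Whenever a term cannot be expressed by the previously defined basis functions, we incorporate it as a new basis function.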
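Specifically, beginning with the projection of $\dfrac{\partial\beta_1}{\partial x}$ along $f_0$ we obtain
$$ \frac{\partial\beta_1}{\partial x} \cdot f_0 = \begin{bmatrix} \frac{1}{p_z} & 0 & -\frac{p_x}{p_z^2} \\ 0 & \frac{1}{p_z} & -\frac{p_y}{p_z^2} \end{bmatrix} \left( -\lfloor {}^I p_f \times \rfloor\,b_g - C\,v \right) = \begin{bmatrix} I_2 & -\beta_1 \end{bmatrix} \left( -\left\lfloor \begin{bmatrix} \beta_1 \\ 1 \end{bmatrix} \times \right\rfloor b_g - \frac{1}{p_z}\,C\,v \right) \tag{28} $$
This is a function of $\beta_1$ and of other elements of the state $x$, namely $b_g$ and $v$, as well as of functions of $x$, which are $1/p_z$ and $C$. Hence, in order to satisfy C2, we must define new basis elements, which we select as physically interpretable quantities:
$$ \beta_2 \triangleq \frac{1}{p_z}, \qquad \beta_3 \triangleq C\,v, \qquad \beta_4 \triangleq b_g \tag{29} $$
where $\beta_2$ is the inverse depth to the point, $\beta_3$ is the velocity expressed in the local frame, and $\beta_4$ is the gyroscope bias. Rewriting (28) using these definitions we have:
$$ \frac{\partial\beta_1}{\partial x} \cdot f_0 = \begin{bmatrix} I_2 & -\beta_1 \end{bmatrix} \left( -\left\lfloor \begin{bmatrix} \beta_1 \\ 1 \end{bmatrix} \times \right\rfloor \beta_4 - \beta_2\,\beta_3 \right) $$
Note that later on we will need to ensure that the properties of Theorem 4.1 are also satisfied for these new elements, $\beta_2$, $\beta_3$, and $\beta_4$, but first we examine the projections of the span of $\beta_1$ along $f_1$ and $f_2$.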
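The projections of $\dfrac{\partial\beta_1}{\partial x}$ along the three directions of $f_1$ (i.e. $f_1 e_i$, $i = 1, 2, 3$, where $[\,e_1\ e_2\ e_3\,] = I_3$) are
$$ \frac{\partial\beta_1}{\partial x} \cdot f_1 e_i = \begin{bmatrix} \frac{1}{p_z} & 0 & -\frac{p_x}{p_z^2} \\ 0 & \frac{1}{p_z} & -\frac{p_y}{p_z^2} \end{bmatrix} \lfloor {}^I p_f \times \rfloor\,e_i = \begin{bmatrix} I_2 & -\beta_1 \end{bmatrix} \left\lfloor \begin{bmatrix} \beta_1 \\ 1 \end{bmatrix} \times \right\rfloor e_i, \quad i = 1, 2, 3 \tag{30} $$
Note that in this case no new basis functions need to be defined since (30) already satisfies condition C2 of Theorem 4.1. Lastly, the projections of $\dfrac{\partial\beta_1}{\partial x}$ along the $f_2$ directions are
$$ \frac{\partial\beta_1}{\partial x} \cdot f_2 e_i = 0_{2\times 1}, \quad i = 1, 2, 3 $$
Hence, by adding the new basis elements $\beta_2$, $\beta_3$, and $\beta_4$, we ensure that the properties of Theorem 4.1 are fulfilled for $\beta_1$. To make the newly defined basis functions, $\beta_2$, $\beta_3$, and $\beta_4$, satisfy condition C2, we proceed by projecting their spans on the process functions.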
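5.2.2. Satisfying condition C2 of Theorem 4.1 for $\beta_2$. The derivative of $\beta_2$ (see (29)) with respect to the state is:
$$ \frac{\partial\beta_2}{\partial x} = -\frac{1}{p_z^2}\,e_3^T \begin{bmatrix} \lfloor {}^I p_f \times \rfloor \dfrac{\partial\theta}{\partial s} & 0_3 & 0_3 & 0_3 & -C & C \end{bmatrix} \tag{31} $$
Projecting (31) along $f_0$ we obtain
$$ \frac{\partial\beta_2}{\partial x} \cdot f_0 = -\frac{1}{p_z^2}\,e_3^T \left( -\lfloor {}^I p_f \times \rfloor\,b_g - C\,v \right) = -\beta_2\,e_3^T \left( -\left\lfloor \begin{bmatrix} \beta_1 \\ 1 \end{bmatrix} \times \right\rfloor \beta_4 - \beta_2\,\beta_3 \right) $$
which is a function of only the currently enumerated basis elements.
We also project $\dfrac{\partial\beta_2}{\partial x}$ along the remaining input directions, i.e. $f_j e_i$, $j = 1, 2$, $i = 1, 2, 3$:
$$ \frac{\partial\beta_2}{\partial x} \cdot f_1 e_i = -\frac{1}{p_z^2}\,e_3^T \lfloor {}^I p_f \times \rfloor\,e_i = -\beta_2\,e_3^T \left\lfloor \begin{bmatrix} \beta_1 \\ 1 \end{bmatrix} \times \right\rfloor e_i, \quad i = 1, 2, 3 \tag{32} $$
$$ \frac{\partial\beta_2}{\partial x} \cdot f_2 e_i = 0, \quad i = 1, 2, 3 $$
Neither projection introduces any new basis elements. Thus, we see that $\beta_2$ fulfills the properties of Theorem 4.1 without requiring us to define any new basis elements.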
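5.2.3. Satisfying condition C2 of Theorem 4.1 for $\beta_3$. Following the same procedure again, we compute the span of $\beta_3$ with respect to $x$:
$$ \frac{\partial\beta_3}{\partial x} = \begin{bmatrix} \lfloor C v \times \rfloor \dfrac{\partial\theta}{\partial s} & 0_3 & C & 0_3 & 0_3 & 0_3 \end{bmatrix} \tag{33} $$
and then the projection of (33) along the input direction $f_0$
$$ \frac{\partial\beta_3}{\partial x} \cdot f_0 = -\lfloor C v \times \rfloor\,b_g + C\,g - b_a \triangleq -\lfloor \beta_3 \times \rfloor\,\beta_4 + \beta_5 - \beta_6 $$
where we assign two new basis elements, i.e.
$$ \beta_5 \triangleq C\,g, \qquad \beta_6 \triangleq b_a $$
Note again that we selected physically interpretable functions: (a) $\beta_5$ is the gravity vector expressed in the local frame, and (b) $\beta_6$ is the accelerometer bias. The projections of (33) along $f_j e_i$, $j = 1, 2$, $i = 1, 2, 3$, are
$$ \frac{\partial\beta_3}{\partial x} \cdot f_1 e_i = \lfloor C v \times \rfloor\,e_i = \lfloor \beta_3 \times \rfloor\,e_i, \quad i = 1, 2, 3 $$
$$ \frac{\partial\beta_3}{\partial x} \cdot f_2 e_i = I_3\,e_i = e_i, \quad i = 1, 2, 3 $$
which do not produce additional bases.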
5.2.4. Satisfying condition C2 of Theorem 4.1 for $\beta_4$. We proceed by examining the span of $\beta_4$ with respect to $x$, i.e.
$$ \frac{\partial\beta_4}{\partial x} = \begin{bmatrix} 0_3 & I_3 & 0_3 & 0_3 & 0_3 & 0_3 \end{bmatrix} \tag{34} $$
with corresponding projections
$$ \frac{\partial\beta_4}{\partial x} \cdot f_0 = 0_{3\times 1} $$
$$ \frac{\partial\beta_4}{\partial x} \cdot f_j e_i = 0_{3\times 1}, \quad j = 1, 2, \ i = 1, 2, 3 $$
We note here that no additional basis elements are produced.
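5.2.5. Satisfying condition C2 of Theorem 4.1 for $\beta_5$. The derivative of $\beta_5$ with respect to $x$ is:
$$ \frac{\partial\beta_5}{\partial x} = \begin{bmatrix} \lfloor C g \times \rfloor \dfrac{\partial\theta}{\partial s} & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 \end{bmatrix} \tag{35} $$
Projecting (35) along the input directions, we obtain
$$ \frac{\partial\beta_5}{\partial x} \cdot f_0 = -\lfloor C g \times \rfloor\,b_g = -\lfloor \beta_5 \times \rfloor\,\beta_4 $$
$$ \frac{\partial\beta_5}{\partial x} \cdot f_1 e_i = \lfloor C g \times \rfloor\,e_i = \lfloor \beta_5 \times \rfloor\,e_i, \quad i = 1, 2, 3 $$
$$ \frac{\partial\beta_5}{\partial x} \cdot f_2 e_i = 0_{3\times 1}, \quad i = 1, 2, 3 $$
All of these are either a function of the existing basis elements, or are equal to zero, and thus we do not need to define any additional bases.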
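5.2.6. Satisfying condition C2 of Theorem 4.1 for $\beta_6$. Lastly, we examine the span of the remaining basis element $\beta_6$, i.e.
$$ \frac{\partial\beta_6}{\partial x} = \begin{bmatrix} 0_3 & 0_3 & 0_3 & I_3 & 0_3 & 0_3 \end{bmatrix} \tag{36} $$
The projections of (36) along the input directions are
$$ \frac{\partial\beta_6}{\partial x} \cdot f_0 = 0_{3\times 1} $$
$$ \frac{\partial\beta_6}{\partial x} \cdot f_j e_i = 0_{3\times 1}, \quad j = 1, 2, \ i = 1, 2, 3 $$
which do not produce any additional basis elements.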
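At this point, we have proved that the conditions C1 and C2 of Theorem 4.1 are satisfied for all of the basis elements; hence, we have defined a complete basis set for the VINS model:
$$ \beta_1 = h(x) \tag{37} $$
$$ \beta_2 = \frac{1}{p_z} \tag{38} $$
$$ \beta_3 = C\,v \tag{39} $$
$$ \beta_4 = b_g \tag{40} $$
$$ \beta_5 = C\,g \tag{41} $$
$$ \beta_6 = b_a \tag{42} $$
These correspond to the landmark projection on the image plane (37), the inverse depth to the landmark (38), the velocity expressed in the local frame (39), the gyro bias (40), the gravity vector expressed in the local frame (41), and the accelerometer bias (42). Based on Theorem 4.1, the resulting system in the basis functions (see (21)) is:
$$ \begin{bmatrix} \dot\beta_1 \\ \dot\beta_2 \\ \dot\beta_3 \\ \dot\beta_4 \\ \dot\beta_5 \\ \dot\beta_6 \end{bmatrix} = \underbrace{\begin{bmatrix} -\Pi_1\left( \lfloor \bar\beta_1 \times \rfloor\,\beta_4 + \beta_2\,\beta_3 \right) \\ \beta_2\,e_3^T\left( \lfloor \bar\beta_1 \times \rfloor\,\beta_4 + \beta_2\,\beta_3 \right) \\ -\lfloor \beta_3 \times \rfloor\,\beta_4 + \beta_5 - \beta_6 \\ 0_{3\times 1} \\ -\lfloor \beta_5 \times \rfloor\,\beta_4 \\ 0_{3\times 1} \end{bmatrix}}_{g_0} + \underbrace{\begin{bmatrix} \Pi_1 \lfloor \bar\beta_1 \times \rfloor \\ -\beta_2\,e_3^T \lfloor \bar\beta_1 \times \rfloor \\ \lfloor \beta_3 \times \rfloor \\ 0_3 \\ \lfloor \beta_5 \times \rfloor \\ 0_3 \end{bmatrix}}_{g_1} \omega + \underbrace{\begin{bmatrix} 0_{2\times 3} \\ 0_{1\times 3} \\ I_3 \\ 0_3 \\ 0_3 \\ 0_3 \end{bmatrix}}_{g_2} a, \qquad y = \beta_1, \tag{43} $$
where $\bar\beta_1 = [\,\beta_1^T\ \ 1\,]^T$ denotes $\beta_1$ expressed as a $3 \times 1$ homogeneous vector, and $\Pi_1 = [\,I_2\ \ -\beta_1\,]$. In the next section, we will show that system (43) is observable by proving its observability matrix is of full column rank. Therefore, the basis functions $\beta_1$ to $\beta_6$ correspond to the observable modes of system (25)–(26), and the system model (43) governs the time evolution of the observable state.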
5.3. Determining the system's observability matrix and its unobservable directions
Based on Theorem 4.1, the observability matrix $O$ of the VINS model (see (25)) is the product of the observability matrix $\Xi$ of system (43) with the matrix $B$ comprising the derivatives of the basis functions. In what follows, we first prove that matrix $\Xi$ is of full column rank. Then, we find the nullspace of matrix $B$, which according to Theorem 4.1 corresponds to the unobservable directions of the VINS model.
Lemma 5.1: System (43) is observable.
Proof: See Appendix B.
Since system (43) is observable, based on Theorem 4.1, we can find the unobservable directions of system (25) from the nullspace of matrix $B$.
Theorem 5.2: The VINS model (25) is unobservable, and its unobservable subspace is spanned by four directions (see (45)) corresponding to the IMU-camera global position and its rotation around the gravity vector in the global frame.
Proof: System (43) satisfies the conditions of Theorem 4.1. Therefore, $\text{null}(O) = \text{null}(B)$, which spans the unobservable subspace of the original system (25). Stacking the derivatives of the basis functions with respect to the variable $x$, the matrix $B$ can be written as (see (27), (31), (33), (34), (35), and (36)):
$$ B = \underbrace{\begin{bmatrix} \Psi & 0_3 & 0_3 & 0_3 & 0_3 \\ 0_3 & I_3 & 0_3 & 0_3 & 0_3 \\ 0_3 & 0_3 & I_3 & 0_3 & 0_3 \\ 0_3 & 0_3 & 0_3 & I_3 & 0_3 \\ 0_3 & 0_3 & 0_3 & 0_3 & I_3 \end{bmatrix}}_{B_1} \underbrace{\begin{bmatrix} \lfloor {}^I p_f \times \rfloor \frac{\partial\theta}{\partial s} & 0_3 & 0_3 & 0_3 & -C & C \\ \lfloor C v \times \rfloor \frac{\partial\theta}{\partial s} & 0_3 & C & 0_3 & 0_3 & 0_3 \\ 0_3 & I_3 & 0_3 & 0_3 & 0_3 & 0_3 \\ \lfloor C g \times \rfloor \frac{\partial\theta}{\partial s} & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 \\ 0_3 & 0_3 & 0_3 & I_3 & 0_3 & 0_3 \end{bmatrix}}_{B_2} \tag{44} $$
where we have factorized $B = B_1 B_2$ to further simplify the proof, and, for conciseness, we have denoted the first subblock of $B_1$ as
$$ \Psi = \begin{bmatrix} \frac{1}{p_z} & 0 & -\frac{p_x}{p_z^2} \\ 0 & \frac{1}{p_z} & -\frac{p_y}{p_z^2} \\ 0 & 0 & -\frac{1}{p_z^2} \end{bmatrix} $$
It is easy to verify that $B_1$ is full rank, since it is block diagonal, comprising identity matrices as well as the $3 \times 3$ upper-triangular matrix $\Psi$, which is itself full rank (since $\frac{1}{p_z} \neq 0$). Hence, we can study the unobservable modes of VINS by examining the right nullspace of $B_2$.
The $15 \times 18$ matrix $B_2$ is rank deficient by exactly four, and these four unobservable modes are spanned by the columns of the following matrix
$$ N = \begin{bmatrix} 0_3 & \dfrac{\partial s}{\partial\theta}\,C\,g \\ 0_3 & 0_{3\times 1} \\ 0_3 & -\lfloor v \times \rfloor\,g \\ 0_3 & 0_{3\times 1} \\ I_3 & -\lfloor p \times \rfloor\,g \\ I_3 & -\lfloor p_f \times \rfloor\,g \end{bmatrix} \tag{45} $$
By multiplying $B_2$ from the right with $N$, it is straightforward to verify that $N$ is indeed the right nullspace of $B_2$ (see (44) and (45)). We note that the first three columns of $N$ correspond to globally translating the feature and the IMU-camera sensor pair together, while the fourth column corresponds to global rotations about the gravity vector.
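The claim $B_2 N = 0$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the attitude column is expressed directly in angle-error coordinates, so the factors $\partial\theta/\partial s$ in (44) and $\partial s/\partial\theta$ in (45) cancel and are replaced by the identity; the rotation, velocity, and position values are arbitrary; and all helper names are hypothetical.

```python
import math


def skew(v):
    """Skew-symmetric cross-product matrix of a 3-vector."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


def zeros(r, c):
    return [[0.0] * c for _ in range(r)]


def neg(a):
    return [[-x for x in row] for row in a]


def col(v):
    return [[x] for x in v]


def block(rows):
    """Assemble a matrix from block rows (lists of equal-height blocks)."""
    out = []
    for blocks in rows:
        for i in range(len(blocks[0])):
            out.append([x for b in blocks for x in b[i]])
    return out


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


# Arbitrary (assumed) state values.
C = matmul(rot_x(0.3), rot_z(-0.7))          # stand-in for C(s)
g = [0.0, 0.0, -9.81]
v, p, p_f = [0.5, -1.0, 0.2], [1.0, 2.0, 0.5], [4.0, -1.0, 3.0]
Ip_f = matvec(C, [p_f[i] - p[i] for i in range(3)])

Z, I3 = zeros(3, 3), [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B2 = block([
    [skew(Ip_f), Z, Z, Z, neg(C), C],
    [skew(matvec(C, v)), Z, C, Z, Z, Z],
    [Z, I3, Z, Z, Z, Z],
    [skew(matvec(C, g)), Z, Z, Z, Z, Z],
    [Z, Z, Z, I3, Z, Z],
])
z1 = zeros(3, 1)
N = block([
    [Z, col(matvec(C, g))],
    [Z, z1],
    [Z, neg(col(matvec(skew(v), g)))],
    [Z, z1],
    [I3, neg(col(matvec(skew(p), g)))],
    [I3, neg(col(matvec(skew(p_f), g)))],
])
residual = max(abs(x) for row in matmul(B2, N) for x in row)
```

The residual is at the level of floating-point round-off, which relies on the identity $\lfloor C u \times \rfloor = C \lfloor u \times \rfloor C^T$ for a proper rotation $C$.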
We further prove that there are no additional right nullspace directions by showing that the $15 \times 18$ matrix $B_2$ has rank 14 (note that if $B_2$ had five or more right nullspace directions, then it would be of rank 13 or less). To do so, we examine the left nullspace of $B_2$. Specifically, we postulate that $B_2$ has a left nullspace comprising the block elements $M_1, \ldots, M_5$, i.e.
$$ 0 = \begin{bmatrix} M_1 & M_2 & M_3 & M_4 & M_5 \end{bmatrix} \begin{bmatrix} \lfloor {}^I p_f \times \rfloor \frac{\partial\theta}{\partial s} & 0_3 & 0_3 & 0_3 & -C & C \\ \lfloor C v \times \rfloor \frac{\partial\theta}{\partial s} & 0_3 & C & 0_3 & 0_3 & 0_3 \\ 0_3 & I_3 & 0_3 & 0_3 & 0_3 & 0_3 \\ \lfloor C g \times \rfloor \frac{\partial\theta}{\partial s} & 0_3 & 0_3 & 0_3 & 0_3 & 0_3 \\ 0_3 & 0_3 & 0_3 & I_3 & 0_3 & 0_3 \end{bmatrix} $$
Based on the relationships involving the second and fourth block columns of $B_2$, we see that $M_3 I_3 = 0$ and $M_5 I_3 = 0$, which can only hold if both $M_3$ and $M_5$ are zero. From the third and sixth columns of $B_2$ we see that $M_2 C = 0$ and $M_1 C = 0$, which again can only hold if $M_1$ and $M_2$ are zero, since the rotation matrix $C$ is full rank. Thus far, the only potentially nonzero element in the left nullspace of $B_2$ is $M_4$. By writing the relationship involving the first block column of $B_2$, we obtain
$$ M_4\,\lfloor C g \times \rfloor\,\frac{\partial\theta}{\partial s} = 0 $$
The matrix $\dfrac{\partial\theta}{\partial s}$ is full rank; hence, the only nonzero $M_4$ which can satisfy this relationship is (up to scale)
$$ M_4 = (C g)^T $$
Therefore, we conclude that $B_2$ has a one-dimensional left nullspace, i.e.
$$ M = \begin{bmatrix} 0_{1\times 3} & 0_{1\times 3} & 0_{1\times 3} & (C g)^T & 0_{1\times 3} \end{bmatrix} \tag{46} $$
Since $B_2$ is a matrix of dimensions $15 \times 18$ with exactly one left null vector (see (46)), it is of rank 14. Applying this fact to determine the dimension of the right nullspace, we see that the right nullspace comprises $18 - 14 = 4$ directions, which are spanned by $N$ (see (45)).
6. Observability-constrained VINS
Whereas in the previous section we concerned ourselves with the properties of the underlying nonlinear system, here we analyze the observability properties of the linearized system employed for estimation purposes. When using a linearized estimator, such as the EKF, errors in linearization while evaluating the system and measurement Jacobians change the directions in which information is acquired by the estimator.² We postulate that if this information mistakenly lies along unobservable directions, it can lead to larger errors, erroneously smaller uncertainties, and inconsistency. We first analyze this issue, and subsequently present an Observability-Constrained VINS (OC-VINS) that explicitly adheres to the observability properties of VINS.
The observability matrix (Maybeck 1979) is defined as a function of the linearized measurement model, $H$, and the discrete-time state transition matrix, $\Phi$, which are in turn functions of the linearization point, $x$, i.e.
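$$ M(x) = \begin{bmatrix} H_1 \\ H_2\,\Phi_{2,1} \\ \vdots \\ H_k\,\Phi_{k,1} \end{bmatrix} \tag{47} $$
where $\Phi_{k,1} \triangleq \Phi_{k,k-1}\,\Phi_{k-1,k-2} \cdots \Phi_{2,1}$ is the state transition matrix from time step 1 to $k$. In Hesch et al. (2012a), we show that $\Phi_{k+1,k}$ (see (14)) has the following block structure:
$$ \Phi_{k+1,k} = \begin{bmatrix} \Phi^R_{k+1,k} & 0_{15\times 3N} \\ 0_{3N\times 15} & I_{3N} \end{bmatrix} \tag{48} $$
with
$$ \Phi^R_{k+1,k} = \begin{bmatrix} \Phi_{11} & \Phi_{12} & 0_3 & 0_3 & 0_3 \\ 0_3 & I_3 & 0_3 & 0_3 & 0_3 \\ \Phi_{31} & \Phi_{32} & I_3 & \Phi_{34} & 0_3 \\ 0_3 & 0_3 & 0_3 & I_3 & 0_3 \\ \Phi_{51} & \Phi_{52} & \delta t\,I_3 & \Phi_{54} & I_3 \end{bmatrix} $$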
To simplify the discussion, we consider a single landmark in the state vector (i.e. $N = 1$), and write the first block row of $M(x)$ as (see (19))
$$ H_1 = H_{c,1}\,C({}^{I_1}\bar q_G) \begin{bmatrix} \lfloor {}^G p_f - {}^G p_{I_1} \times \rfloor\,C({}^{I_1}\bar q_G)^T & 0_3 & 0_3 & 0_3 & -I_3 & I_3 \end{bmatrix} $$
where ${}^{I_1}\bar q_G$ denotes the rotation of {G} with respect to frame {I} at time step 1, and for the purposes of the observability analysis, all the quantities appearing in the previous expression are the true ones. As shown in Hesch et al. (2012a), the $k$th block row, for $k > 1$, is of the form:
$$ H_k\,\Phi_{k,1} = H_{c,k}\,C({}^{I_k}\bar q_G) \begin{bmatrix} \Gamma_k & D_k & -I_3\,\delta t_{k-1} & E_k & -I_3 & I_3 \end{bmatrix} \tag{49} $$
where
$$ \Gamma_k = \left\lfloor {}^G p_f - {}^G p_{I_1} - {}^G v_{I_1}\,\delta t_{k-1} + \tfrac{1}{2}\,{}^G g\,\delta t_{k-1}^2 \ \times \right\rfloor C({}^{I_1}\bar q_G)^T, \qquad \delta t_{k-1} = (k-1)\,\delta t $$
We note that in (49) $D_k$ and $E_k$ are both time-varying matrices, which do not affect the observability properties. It is straightforward to verify that the right nullspace of $M(x)$ spans four directions, i.e.
$$ M(x)\,N_1 = 0 \tag{50} $$
$$ N_1 = \begin{bmatrix} 0_3 & C({}^{I_1}\bar q_G)\,{}^G g \\ 0_3 & 0_{3\times 1} \\ 0_3 & -\lfloor {}^G v_{I_1} \times \rfloor\,{}^G g \\ 0_3 & 0_{3\times 1} \\ I_3 & -\lfloor {}^G p_{I_1} \times \rfloor\,{}^G g \\ I_3 & -\lfloor {}^G p_f \times \rfloor\,{}^G g \end{bmatrix} = \begin{bmatrix} N_{t,1} & | & N_{r,1} \end{bmatrix} \tag{51} $$
where $N_{t,1}$ corresponds to global translations and $N_{r,1}$ corresponds to global rotations about the gravity vector, which are the same as those of the nonlinear system.³
Ideally, any estimator we employ should correspond to a system with an unobservable subspace that matches these directions, both in number and structure. However, when linearizing about the estimated state $\hat x$, $M(\hat x)$ gains rank due to errors in the state estimates across time (Hesch et al. 2012a). Hence, the linearized system used by the EKF has different observability properties than the nonlinear system it approximates, which leads to the acquisition of nonexistent information about global rotations around the gravity vector (yaw). To address this problem and ensure that (51) is satisfied for every block row of $M$ when the state estimates are used for computing $H_\ell$ and $\Phi_{\ell,1}$, $\ell = 1, \ldots, k$, we must ensure that $H_\ell\,\Phi_{\ell,1}\,N_1 = 0$, $\ell = 1, \ldots, k$ (see (47) and (50)).
One way to enforce this is by requiring that, at each time step, $\Phi_{\ell+1,\ell}$ and $H_\ell$ satisfy
$$ N_{\ell+1} = \Phi_{\ell+1,\ell}\,N_\ell \tag{52} $$
$$ H_\ell\,N_\ell = 0, \quad \ell = 1, \ldots, k \tag{53} $$
where $N_\ell$ is computed following the process described in the next section.
6.1. OC-VINS: Algorithm description
Hereafter, we present our OC-VINS algorithm, which enforces the observability constraints dictated by the VINS system structure. Rather than changing the linearization points explicitly (e.g. as in Huang et al. (2008)), we maintain the nullspace, $N_k$, at each time step, and use it to enforce the unobservable directions. We refer to the first set of block rows of $N_k$ as the nullspace corresponding to the robot state, which we term $N^R_k$, whereas the last block row of $N_k$ is the nullspace corresponding to the feature state, i.e. $N^f_k$. Specifically, the $15 \times 4$ nullspace sub-block, $N^R_k$, corresponding to the robot state is analytically defined as (see (51) and Hesch et al. (2012a)):
$$ N^R_1 = \begin{bmatrix} 0_3 & C({}^I\hat{\bar q}_{G,1|1})\,{}^G g \\ 0_3 & 0_{3\times 1} \\ 0_3 & -\lfloor {}^G\hat v_{I,1|1} \times \rfloor\,{}^G g \\ 0_3 & 0_{3\times 1} \\ I_3 & -\lfloor {}^G\hat p_{I,1|1} \times \rfloor\,{}^G g \end{bmatrix}, \qquad N^R_k = \begin{bmatrix} 0_3 & C({}^I\hat{\bar q}_{G,k|k-1})\,{}^G g \\ 0_3 & 0_{3\times 1} \\ 0_3 & -\lfloor {}^G\hat v_{I,k|k-1} \times \rfloor\,{}^G g \\ 0_3 & 0_{3\times 1} \\ I_3 & -\lfloor {}^G\hat p_{I,k|k-1} \times \rfloor\,{}^G g \end{bmatrix} = \begin{bmatrix} N^R_{t,k} & | & N^R_{r,k} \end{bmatrix} \tag{54} $$
The $3 \times 4$ nullspace sub-block, $N^f_k$, corresponding to the feature state, is a function of the feature estimate at time t [...]
$$ \begin{bmatrix} N^R_{k+1} \\ N^f_{k+1} \end{bmatrix} = \begin{bmatrix} \Phi^R_{k+1,k} & 0_{15\times 3} \\ 0_{3\times 15} & I_3 \end{bmatrix} \begin{bmatrix} N^R_k \\ N^f_k \end{bmatrix} $$
which, after multiplying out, provides two relationships that should be satisfied:
$$ N^R_{k+1} = \Phi^R_{k+1,k}\,N^R_k \tag{56} $$
$$ N^f_{k+1} = N^f_k \tag{57} $$
From the definition of $N^f_k$ (see (55)), it is clear that (57) holds automatically, and does not require any modification of $\Phi_{k+1,k}$. However, (56) will in general not hold, and hence it requires changing $\Phi^R_{k+1,k}$ such that $N^R_{k+1} = \Phi^R_{k+1,k}\,N^R_k$.
In order to determine which elements of $\Phi^R_{k+1,k}$ should be modified to satisfy (56), we further analyze the structure of this constraint. To do so, we partition $N^R_k$ into two components: (a) the first three columns corresponding to the unobservable translation, $N^R_{t,k}$, and (b) the fourth column corresponding to the unobservable rotation about the gravity vector, $N^R_{r,k}$ (see (51)). We rewrite (56) based on this partitioning to obtain:
$$ N^R_{k+1} = \Phi^R_{k+1,k}\,N^R_k \ \Rightarrow\ \begin{bmatrix} N^R_{t,k+1} & N^R_{r,k+1} \end{bmatrix} = \Phi^R_{k+1,k} \begin{bmatrix} N^R_{t,k} & N^R_{r,k} \end{bmatrix} $$
which is equivalent to satisfying the following two relationships simultaneously, i.e.
$$ N^R_{t,k+1} = \Phi^R_{k+1,k}\,N^R_{t,k} \tag{58} $$
$$ N^R_{r,k+1} = \Phi^R_{k+1,k}\,N^R_{r,k} \tag{59} $$
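Treating these in order, we see that (58) is automatically satisfied, since every block row results in $0_3 = 0_3$ or $I_3 = I_3$, i.e.
$$ N^R_{t,k+1} = \Phi^R_{k+1,k}\,N^R_{t,k} \ \Rightarrow\ \begin{bmatrix} 0_3 \\ 0_3 \\ 0_3 \\ 0_3 \\ I_3 \end{bmatrix} = \begin{bmatrix} \Phi_{11} & \Phi_{12} & 0_3 & 0_3 & 0_3 \\ 0_3 & I_3 & 0_3 & 0_3 & 0_3 \\ \Phi_{31} & \Phi_{32} & I_3 & \Phi_{34} & 0_3 \\ 0_3 & 0_3 & 0_3 & I_3 & 0_3 \\ \Phi_{51} & \Phi_{52} & \delta t\,I_3 & \Phi_{54} & I_3 \end{bmatrix} \begin{bmatrix} 0_3 \\ 0_3 \\ 0_3 \\ 0_3 \\ I_3 \end{bmatrix} $$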
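We proceed by expanding the second relationship element-wise (see (59)) and we obtain
$$ N^R_{r,k+1} = \Phi^R_{k+1,k}\,N^R_{r,k} \ \Rightarrow\ \begin{bmatrix} C({}^I\hat{\bar q}_{G,k+1|k})\,{}^G g \\ 0_{3\times 1} \\ -\lfloor {}^G\hat v_{I,k+1|k} \times \rfloor\,{}^G g \\ 0_{3\times 1} \\ -\lfloor {}^G\hat p_{I,k+1|k} \times \rfloor\,{}^G g \end{bmatrix} = \begin{bmatrix} \Phi_{11} & \Phi_{12} & 0_3 & 0_3 & 0_3 \\ 0_3 & I_3 & 0_3 & 0_3 & 0_3 \\ \Phi_{31} & \Phi_{32} & I_3 & \Phi_{34} & 0_3 \\ 0_3 & 0_3 & 0_3 & I_3 & 0_3 \\ \Phi_{51} & \Phi_{52} & \delta t\,I_3 & \Phi_{54} & I_3 \end{bmatrix} \begin{bmatrix} C({}^I\hat{\bar q}_{G,k|k-1})\,{}^G g \\ 0_{3\times 1} \\ -\lfloor {}^G\hat v_{I,k|k-1} \times \rfloor\,{}^G g \\ 0_{3\times 1} \\ -\lfloor {}^G\hat p_{I,k|k-1} \times \rfloor\,{}^G g \end{bmatrix} $$
From the first block row we have that
$$ C({}^I\hat{\bar q}_{G,k+1|k})\,{}^G g = \Phi_{11}\,C({}^I\hat{\bar q}_{G,k|k-1})\,{}^G g \ \Rightarrow\ \Phi_{11} = C({}^{I,k+1|k}\hat{\bar q}_{I,k|k-1}) \tag{60} $$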
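The requirements for the third and fifth block rows are:
$$ \Phi_{31}\,C({}^I\hat{\bar q}_{G,k|k-1})\,{}^G g = \lfloor {}^G\hat v_{I,k|k-1} \times \rfloor\,{}^G g - \lfloor {}^G\hat v_{I,k+1|k} \times \rfloor\,{}^G g \tag{61} $$
$$ \Phi_{51}\,C({}^I\hat{\bar q}_{G,k|k-1})\,{}^G g = \delta t\,\lfloor {}^G\hat v_{I,k|k-1} \times \rfloor\,{}^G g + \lfloor {}^G\hat p_{I,k|k-1} \times \rfloor\,{}^G g - \lfloor {}^G\hat p_{I,k+1|k} \times \rfloor\,{}^G g \tag{62} $$
both of which are of the form $A u = w$, where $u$ and $w$ comprise nullspace elements that are fixed (see (54)), and we seek to find a perturbed $A^*$, for $A = \Phi_{31}$ and $A = \Phi_{51}$, that fulfills the constraint. To compute the minimum perturbation, $A^*$, of $A$, we solve
$$ \min_{A^*} \ \|A^* - A\|_F^2, \quad \text{s.t. } A^* u = w \tag{63} $$
where $\|\cdot\|_F$ denotes the Frobenius matrix norm. After employing the method of Lagrange multipliers, and solving the corresponding KKT optimality conditions (Boyd and Vandenberghe 2004), the optimal $A^*$ is
$$ A^* = A - (A u - w)\,(u^T u)^{-1}\,u^T \tag{64} $$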
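The closed-form solution (64) is straightforward to evaluate, because $u^T u$ is a scalar and the correction is a rank-one update. The following sketch uses arbitrary 3×3 numbers (assumed here, not taken from the paper) and a hypothetical function name to show that the perturbed matrix satisfies the constraint $A^* u = w$.

```python
def constrained_perturbation(A, u, w):
    """Return A* = A - (A u - w) (u^T u)^{-1} u^T, the form of equation (64)."""
    uu = sum(x * x for x in u)                       # scalar u^T u
    Au = [sum(A[i][k] * u[k] for k in range(len(u))) for i in range(len(A))]
    return [[A[i][j] - (Au[i] - w[i]) * u[j] / uu for j in range(len(u))]
            for i in range(len(A))]


# Assumed toy values standing in for a block such as Phi_31 and the
# nullspace-derived vectors u and w.
A = [[1.0, 0.2, 0.0], [0.0, 1.0, -0.1], [0.3, 0.0, 1.0]]
u = [0.0, 0.0, -9.81]
w = [0.5, -0.2, -9.0]
A_star = constrained_perturbation(A, u, w)
check = [sum(A_star[i][k] * u[k] for k in range(3)) for i in range(3)]
max_violation = max(abs(check[i] - w[i]) for i in range(3))
```

Because the correction is proportional to $u^T$, only the components of each row of $A$ along $u$ change; in this toy case, where $u$ has a single nonzero entry, only the third column of $A$ is modified.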
In summary, satisfying (52) only requires modifying three block elements of $\Phi_k$ during each propagation step. Specifically, we compute the modified $\Phi_{11}$ from (60), and $\Phi_{31}$ and $\Phi_{51}$ from (63)–(64), and construct the observability-constrained discrete-time state transition matrix. We then proceed with covariance propagation (see (15)).
6.1.2. Modication of the measurement Jacobian H Dur-
ing each update step, we seek to satisfy (53), i.e. H
k
N
k
= 0.
Based on (19), (54), and (55) we can write this relationship
per feature as
H
c
_
H
0
39
H
p
| H
f
_
_
_
_
_
_
_
_
_
_
0
3
C
_
I
q
G,k|k1
_
G
g
0
3
0
31
0
3
G
v
I,k|k1
G
g
0
3
0
31
I
3
G
p
I,k|k1
G
g
I
3
G
p
f
|
G
g
_
_
= 0
(65)
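The first block column of (65) requires that $\mathbf{H}_{\mathbf{f}} = -\mathbf{H}_{\mathbf{p}}$.
Hence, we rewrite the second block column of (65) as

$$
\mathbf{H}_c
\begin{bmatrix} \mathbf{H}_{\boldsymbol{\theta}} & \mathbf{H}_{\mathbf{p}} \end{bmatrix}
\begin{bmatrix}
\mathbf{C}\big({}^{I}\hat{\bar{q}}_{G,k|k-1}\big)\,{}^{G}\mathbf{g} \\
\big( \lfloor {}^{G}\hat{\mathbf{p}}_{f,\ell|\ell} \times \rfloor - \lfloor {}^{G}\hat{\mathbf{p}}_{I,k|k-1} \times \rfloor \big)\,{}^{G}\mathbf{g}
\end{bmatrix}
= \mathbf{0}
$$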
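This is a constraint of the form $\mathbf{A}\mathbf{u} = \mathbf{0}$, where $\mathbf{u}$ is a fixed quantity
determined by elements in the nullspace, and $\mathbf{A}$ comprises elements of the
measurement Jacobian $\mathbf{H}_k$. We compute the optimal $\mathbf{A}^{\star}$ that satisfies this
constraint as in (63)–(64), with $\mathbf{w} = \mathbf{0}$. From $\mathbf{A}^{\star}$, we recover the
Jacobian as

$$
\mathbf{H}_c \mathbf{H}_{\boldsymbol{\theta}} = \mathbf{A}^{\star}_{1:2,\,1:3} \tag{66}
$$

$$
\mathbf{H}_c \mathbf{H}_{\mathbf{p}} = \mathbf{A}^{\star}_{1:2,\,4:6} \tag{67}
$$

$$
\mathbf{H}_c \mathbf{H}_{\mathbf{f}} = -\mathbf{A}^{\star}_{1:2,\,4:6} \tag{68}
$$

where the subscripts $(i{:}j,\, m{:}n)$ denote the matrix sub-block spanning rows
$i$ to $j$, and columns $m$ to $n$. After computing the modified measurement
Jacobian, we proceed with the filter update as described in Section 3.2.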
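The Jacobian modification described above can be illustrated with a small numerical sketch. It assumes a 2 × 6 matrix `A` holding the nominal products of the camera projection Jacobian with the orientation and position blocks, and gravity [0, 0, -9.81] in {G}; the function name `constrain_measurement_jacobian` and all numbers are hypothetical.

```python
def skew(v):
    """Skew-symmetric (cross-product) matrix of a 3-vector."""
    return [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]


def matvec(A, u):
    return [sum(a * x for a, x in zip(row, u)) for row in A]


def constrain_measurement_jacobian(A, C, p_I, p_f, g):
    """Project the 2x6 matrix A = Hc [H_theta  H_p] so that A* u = 0,
    then split A* into the three blocks of (66)-(68)."""
    diff = [f - i for f, i in zip(p_f, p_I)]
    # (skew(p_f) - skew(p_I)) g equals skew(p_f - p_I) g by linearity.
    u = matvec(C, g) + matvec(skew(diff), g)
    uu = sum(x * x for x in u)
    Au = matvec(A, u)
    A_star = [[a - r * uj / uu for a, uj in zip(row, u)]
              for row, r in zip(A, Au)]
    H_theta = [row[:3] for row in A_star]          # (66)
    H_p = [row[3:] for row in A_star]              # (67)
    H_f = [[-x for x in row] for row in H_p]       # (68)
    return H_theta, H_p, H_f


# Hypothetical inputs for illustration.
g = [0.0, 0.0, -9.81]
C = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]   # a valid rotation
p_I, p_f = [1.0, 2.0, 0.5], [4.0, -1.0, 2.0]
A = [[0.2, -0.1, 0.4, 0.25, 0.0, -0.05],
     [0.1, 0.3, -0.2, 0.0, 0.25, -0.10]]

H_theta, H_p, H_f = constrain_measurement_jacobian(A, C, p_I, p_f, g)

# Second block column of (65): H_theta C g - H_p [p_I x] g - H_f [p_f x] g.
t1 = matvec(H_theta, matvec(C, g))
t2 = matvec(H_p, matvec(skew(p_I), g))
t3 = matvec(H_f, matvec(skew(p_f), g))
rotation_column = [a - b - c for a, b, c in zip(t1, t2, t3)]
# First block column of (65): H_p + H_f.
translation_column = [[a + b for a, b in zip(rp, rf)] for rp, rf in zip(H_p, H_f)]
```

Both `rotation_column` and `translation_column` vanish, so the modified Jacobian annihilates all four nullspace directions for this feature.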
6.2. Application to the MSC-KF
The MSC-KF (Mourikis and Roumeliotis 2007) is a VINS that performs tightly
coupled visual-inertial odometry over a sliding window of M poses, while
maintaining linear complexity in the number of observed features. The key
advantage of the MSC-KF is that it exploits all the constraints for each
feature observed by the camera over M poses, without requiring a map to be
built or the features to be estimated as part of the state vector. We
hereafter describe how to apply our OC-VINS methodology to the MSC-KF.
Each time the camera records an image, the MSC-KF creates a stochastic clone
(Roumeliotis and Burdick 2002) of the sensor pose. This enables the MSC-KF to
use delayed image measurements; in particular, it allows all of the
observations of a given feature $\mathbf{p}_{f_i}$ to be processed during a single
update step (when the first pose that observed the feature is about to be
marginalized). Every time the current pose is cloned, we also clone the
corresponding nullspace elements to obtain an augmented nullspace, i.e.

$$
\mathbf{N}^{\mathrm{aug}}_{k} = \begin{bmatrix} \mathbf{N}_{k} \\ \mathbf{N}_{k,\mathrm{clone}} \end{bmatrix},
\quad \text{where} \quad
\mathbf{N}_{k,\mathrm{clone}} =
\begin{bmatrix}
\mathbf{0}_3 & \mathbf{C}\big({}^{I}\hat{\bar{q}}_{G,k|k-1}\big)\,{}^{G}\mathbf{g} \\
\mathbf{I}_3 & -\lfloor {}^{G}\hat{\mathbf{p}}_{I,k|k-1} \times \rfloor\,{}^{G}\mathbf{g}
\end{bmatrix}
$$
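During propagation, the current state estimate evolves forward in time by
integrating (8)–(13), while the current clone poses are static. Moreover, we
employ (60) and solve in closed form the optimization problem (63) for the
constraints (61)–(62), using (64), so as to compute the
observability-constrained discrete-time state transition matrix $\boldsymbol{\Phi}_{k+1,k}$,
and propagate the covariance as

$$
\mathbf{P}^{\mathrm{aug}}_{k+1|k} =
\boldsymbol{\Phi}^{\mathrm{aug}}_{k+1,k}\,\mathbf{P}^{\mathrm{aug}}_{k|k}\,\boldsymbol{\Phi}^{\mathrm{aug}\,T}_{k+1,k}
+ \begin{bmatrix} \mathbf{Q}_{k} & \mathbf{0}_{15\times 6M} \\ \mathbf{0}_{6M\times 15} & \mathbf{0}_{6M} \end{bmatrix},
\qquad
\boldsymbol{\Phi}^{\mathrm{aug}}_{k+1,k} =
\begin{bmatrix} \boldsymbol{\Phi}_{k+1,k} & \mathbf{0}_{15\times 6M} \\ \mathbf{0}_{6M\times 15} & \mathbf{I}_{6M} \end{bmatrix}
$$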
where $\mathbf{P}^{\mathrm{aug}}_{i|j}$ denotes the covariance of the augmented state
corresponding to M cloned poses, along with the current state.
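Because the clones are static, the augmented propagation only touches the current-state block and its cross-correlations with the clones. The sketch below compares the full product with a block-wise evaluation on a toy problem; the dimensions (2 current states, 1 clone state) stand in for the 15 and 6M of the text, and the function names and numbers are hypothetical.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def propagate_full(P, Phi, Q):
    """Evaluate Phi_aug P Phi_aug^T + Q_aug with explicitly augmented matrices."""
    n, N = len(Phi), len(P)
    Phi_aug = [[Phi[i][j] if (i < n and j < n) else float(i == j)
                for j in range(N)] for i in range(N)]
    Q_aug = [[Q[i][j] if (i < n and j < n) else 0.0
              for j in range(N)] for i in range(N)]
    out = matmul(matmul(Phi_aug, P), transpose(Phi_aug))
    return [[out[i][j] + Q_aug[i][j] for j in range(N)] for i in range(N)]


def propagate_blockwise(P, Phi, Q):
    """Same result, using the block structure: the clone block is untouched."""
    n = len(Phi)
    P_II = [row[:n] for row in P[:n]]          # current-state covariance
    P_IC = [row[n:] for row in P[:n]]          # current-state/clone cross terms
    P_CC = [row[n:] for row in P[n:]]          # clone covariance (unchanged)
    new_II = matmul(matmul(Phi, P_II), transpose(Phi))
    new_II = [[a + q for a, q in zip(ra, rq)] for ra, rq in zip(new_II, Q)]
    new_IC = matmul(Phi, P_IC)
    new_CI = transpose(new_IC)
    top = [ri + rc for ri, rc in zip(new_II, new_IC)]
    bottom = [rc + rcc for rc, rcc in zip(new_CI, P_CC)]
    return top + bottom


# Toy example: 2 current states, 1 clone state.
Phi = [[1.0, 0.1], [0.0, 1.0]]
Q = [[0.01, 0.0], [0.0, 0.02]]
P = [[2.0, 0.3, 0.5],
     [0.3, 1.0, 0.2],
     [0.5, 0.2, 1.5]]

P_full = propagate_full(P, Phi, Q)
P_block = propagate_blockwise(P, Phi, Q)
max_diff = max(abs(a - b) for ra, rb in zip(P_full, P_block) for a, b in zip(ra, rb))
```

The block-wise form avoids multiplying by the identity and zero blocks, which matters when the window contains many clones.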
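During the MSC-KF update step, we process all measurements of the features
observed by the Mth clone (i.e. the one about to be marginalized from the
sliding window of poses). We use (66)–(68) to compute the
observability-constrained measurement Jacobian, $\mathbf{H}_k$, for each measurement and
stack all observations of the ith feature across M time steps into a large
measurement vector

$$
\begin{bmatrix} \tilde{\mathbf{z}}_{k} \\ \vdots \\ \tilde{\mathbf{z}}_{k-M} \end{bmatrix}
=
\begin{bmatrix} \mathbf{H}_{k} \\ \vdots \\ \mathbf{H}_{k-M} \end{bmatrix}
\begin{bmatrix} \tilde{\mathbf{x}}^{\mathrm{aug}} \\ \tilde{\mathbf{p}}_{f} \end{bmatrix}
+
\begin{bmatrix} \boldsymbol{\eta}_{k} \\ \vdots \\ \boldsymbol{\eta}_{k-M} \end{bmatrix}
= \mathbf{H}_{x}\,\tilde{\mathbf{x}}^{\mathrm{aug}} + \mathbf{H}_{f}\,\tilde{\mathbf{p}}_{f} + \boldsymbol{\eta}
\tag{69}
$$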
Fig. 2. (a) Camera-IMU trajectory and 3D features. (b) Errors (lines with
markers) and 3σ bounds (lines without markers) for the rotation about the
gravity vector, for the three filters, plotted for a single run. Note that
the errors and 3σ bounds corresponding to the ideal VINS and the proposed
OC-VINS are almost identical, which makes it difficult to distinguish the
corresponding lines in the figure.
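where $\mathbf{H}_{x}$ and $\mathbf{H}_{f}$ are the Jacobians corresponding to the augmented
state vector $\mathbf{x}^{\mathrm{aug}}$, and to the feature, respectively. To avoid including
$\mathbf{p}_{f}$ in the state, we marginalize it by projecting (69) onto the left
nullspace of $\mathbf{H}_{f}$, which we term $\mathbf{W}$. This yields

$$
\mathbf{W}^{T}\tilde{\mathbf{z}} = \mathbf{W}^{T}\mathbf{H}_{x}\,\tilde{\mathbf{x}}^{\mathrm{aug}} + \mathbf{W}^{T}\boldsymbol{\eta}
\;\Rightarrow\;
\tilde{\mathbf{z}}' = \mathbf{H}'_{x}\,\tilde{\mathbf{x}}^{\mathrm{aug}} + \boldsymbol{\eta}',
$$

which we employ to update the state estimate and covariance using the
standard EKF update equations (Mourikis and Roumeliotis 2007).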
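The left-nullspace projection can be sketched as follows. This plain-Python example builds an orthonormal basis of the left nullspace with modified Gram-Schmidt, which is a simple stand-in chosen for readability and is not claimed to be the procedure used by the authors (a QR or Givens-based factorization would be the usual choice in practice). All names and numbers are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def remove_components(v, basis):
    """Modified Gram-Schmidt step: subtract the components of v along each
    orthonormal vector in basis."""
    v = list(v)
    for q in basis:
        c = dot(v, q)
        v = [x - c * y for x, y in zip(v, q)]
    return v


def left_nullspace(H, tol=1e-9):
    """Return orthonormal vectors w with w^T H = 0 (the columns of W)."""
    n_rows = len(H)
    range_basis = []
    for col in zip(*H):                       # orthonormalize the columns of H
        v = remove_components(col, range_basis)
        norm = math.sqrt(dot(v, v))
        if norm > tol:
            range_basis.append([x / norm for x in v])
    W = []
    for i in range(n_rows):                   # complete the basis with unit vectors
        e = [1.0 if j == i else 0.0 for j in range(n_rows)]
        v = remove_components(e, range_basis + W)
        v = remove_components(v, range_basis + W)   # second pass for accuracy
        norm = math.sqrt(dot(v, v))
        if norm > tol:
            W.append([x / norm for x in v])
    return W


def project(W, M):
    """Compute W^T M for a matrix M given as a list of rows."""
    return [[dot(w, col) for col in zip(*M)] for w in W]


# Three stacked 2D observations of one feature (toy numbers).
H_f = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.1], [1.0, 0.0, 0.4],
       [0.0, 1.0, -0.3], [1.0, 0.0, -0.5], [0.0, 1.0, 0.6]]
H_x = [[0.5, 0.1], [0.0, 0.3], [0.2, 0.2], [0.1, -0.4], [0.3, 0.0], [-0.2, 0.1]]
z = [0.01, -0.02, 0.03, 0.00, -0.01, 0.02]

W = left_nullspace(H_f)
z_proj = [dot(w, z) for w in W]       # W^T z
H_x_proj = project(W, H_x)            # W^T H_x
H_f_proj = project(W, H_f)            # W^T H_f, numerically zero
```

With 2M stacked rows and a 3D feature, the projected system has 2M - 3 rows (3 in this toy case), and the feature error no longer appears in it.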
7. Simulations
We conducted Monte-Carlo simulations to evaluate the impact of the proposed
OC-VINS method on estimator consistency. We compared its performance to the
standard VINS (Std-VINS), as well as the ideal VINS that linearizes about the
true state. Since the ideal VINS has access to the true state, it is not
realizable in practice, but we included it here as a baseline comparison.
Specifically, we computed the Root Mean Squared Error (RMSE) and Normalized
Estimation Error Squared (NEES) over 100 trials in which the camera-IMU
platform traversed a circular trajectory of radius 5 m at an average velocity
of 60 cm/s. The
camera had 45
${}^{i}\bar{q}_{j}$ (using the gyroscope measurements), we compute the essential
matrix, which now has only two DOF, from only two feature correspondences,
i.e. $\mathbf{u}_{i}^{T}\,\mathbf{E}\,\mathbf{u}_{j} = 0$, where $\mathbf{u}_{i}$ is the unit-norm vector of a
feature at time step $i$, and $\mathbf{E} \triangleq \lfloor {}^{i}\mathbf{t}_{j} \times \rfloor\,\mathbf{C}({}^{i}\bar{q}_{j})$, where
${}^{i}\mathbf{t}_{j}$ is the two DOF direction of motion between camera poses $i$ and
$j$. This approach is more robust than the five-point algorithm (Nistér 2003)
because it provides 2 solutions for the essential matrix rather than up to
10, and as it requires only 2 data points, it reaches a consensus with fewer
hypotheses when used in a RANSAC framework.
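The two-point hypothesis can be sketched in a few lines. Under the assumption that the relative rotation is known, each correspondence constrains the direction of motion to be orthogonal to the cross product of the rotated bearing at step j and the bearing at step i, so two correspondences determine the direction up to sign. The symbol names (`u_i`, `u_j`, `C_ij`), the frame convention p_i = C_ij p_j + t, and the synthetic data are assumptions made for this illustration.

```python
import math


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def matvec(A, u):
    return [sum(a * x for a, x in zip(row, u)) for row in A]


def motion_direction(C_ij, pairs, eps=1e-12):
    """Given the known rotation C_ij and two bearing pairs (u_i, u_j),
    return the two candidate unit directions of motion (+t, -t).

    The epipolar constraint u_i^T [t x] C_ij u_j = 0 is equivalent to
    t . ((C_ij u_j) x u_i) = 0 for each correspondence."""
    normals = [cross(matvec(C_ij, u_j), u_i) for u_i, u_j in pairs]
    t = cross(normals[0], normals[1])
    if math.sqrt(dot(t, t)) < eps:
        raise ValueError("degenerate correspondences")
    t = unit(t)
    return t, [-x for x in t]


# Synthetic check: rotate about z, translate, and recover the direction.
a = 0.1
C_ij = [[math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0]]
t_true = [0.3, 0.1, 0.05]
points_j = [[1.0, 0.5, 4.0], [-0.8, 0.3, 5.0]]

pairs = []
for p_j in points_j:
    p_i = [x + y for x, y in zip(matvec(C_ij, p_j), t_true)]
    pairs.append((unit(p_i), unit(p_j)))

t_plus, t_minus = motion_direction(C_ij, pairs)
alignment = abs(dot(t_plus, unit(t_true)))
epipolar = [dot(u_i, cross(t_plus, matvec(C_ij, u_j))) for u_i, u_j in pairs]
```

In a RANSAC loop, each sampled pair of correspondences would yield one such hypothesis, and the remaining correspondences would be scored against the epipolar residual.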
At every time step, the robot poses corresponding to the last M images are
kept in the state vector, as described in Roumeliotis and Burdick (2002).
Before marginalizing a pose, all the features that first appeared at the
oldest augmented robot pose are processed following the MSC-KF approach, as
discussed in Section 3.2.
8.2. Experimental evaluation
Experiments were performed with a PointGrey Chameleon⁴ camera and a Navchip
IMU⁵, which are rigidly attached to a light-weight sensing platform (see
Figure 4). For both
Fig. 5. Experiment 1: (a) The estimated uncertainty in yaw computed by the
Std-VINS and OC-VINS methods. (b) An x–y view of the trajectory, which
covered 550 m over three floors. (c) A 3D side view of the trajectory.
Fig. 4. The hardware setup comprises a miniature monochrome Point Grey
Chameleon camera recording images at 7.5 Hz, and a rigidly attached
InterSense NavChip IMU operating at 100 Hz. A coin (US dime, diameter
1.8 cm) is included as a size reference.
experiments, IMU signals were sampled at a frequency of 100 Hz while camera
images were acquired at 7.5 Hz. Features were tracked using a window of
M = 20 images. In the first experiment, the platform traveled a total
distance of 550 m over three floors of Walter Library at the University of
Minnesota, traversing regions with a variety of lighting conditions,
containing areas that were both rich and poor in distinctive features, and
passing through both crowded and empty scenes. A video demonstrating the
robustness of the proposed algorithm can be found in Extension 1,
accompanying the present paper. At the end of the trajectory, the sensing
platform was placed back in its original configuration, so as to provide a
quantitative characterization of the achieved accuracy.
For the Std-VINS the final position error was 5.33 m, while the OC-VINS
achieved a final error of 4.60 m, corresponding to 0.97% and 0.83% of the
total distance travelled, respectively (see Figure 5). In addition, the
estimated covariances from the Std-VINS are smaller than those from the
OC-VINS (see Figure 6). Furthermore, uncertainty estimates from the Std-VINS
decreased in directions that are unobservable (i.e. rotations about the
gravity vector); this violates the observability properties of the system
and demonstrates that spurious information is injected into the filter.
Figure 5(a) highlights the difference in estimated yaw uncertainty between
the OC-VINS and the Std-VINS. In contrast to the OC-VINS, the Std-VINS
covariance rapidly decreases, violating the observability properties of the
system. Similarly, large differences can be seen in the covariance estimates
for the x- and y-position estimates (see Figure 6(a) and (b)). The Std-VINS
estimates a much smaller uncertainty than the OC-VINS, supporting the claim
that Std-VINS tends to be inconsistent.
The second experiment was conducted on the fifth floor of Keller Hall at the
University of Minnesota, for which the floor plans are available, over a
trajectory of 144 m. The IMU-camera sensor platform was initially aligned
with the walls, so that the comparison of the estimated trajectory with the
floor plans can provide strong qualitative evidence of the filter's
improvement in accuracy and consistency. As evident from Figures 7 and 8,
the Std-VINS erroneously injects information along the global yaw direction,
which results in an infeasible trajectory when overlaid on the building's
blueprint (i.e. the path passes through the walls). Both Std-VINS and
OC-VINS achieved a final position error smaller than 0.5% of the total
distance travelled. However, as evident from the overlay of the estimated
trajectories on the floor plan, the violation of the correct observability
properties by the Std-VINS leads to inconsistent yaw estimates, which cause
significant position errors, especially at the parts of the trajectory that
are farthest from the starting point (see Figures 7(a) and (b)).
9. Conclusion and future work
In this work, we studied the observability properties of VINS, and leveraged
our key results for mitigating filter inconsistency and improving the
performance of linearized estimators. To do so, we first introduced a new
method for determining the unobservable modes of a nonlinear system. In
particular, and in contrast to previous approaches that require considering
the infinitely many block rows of the observability matrix, we employ a set
of auxiliary variables (basis functions) to achieve a factorization of the
observability matrix, and hence a decomposition of the
Fig. 6. Experiment 1: The estimated uncertainties for the Std-VINS and
OC-VINS estimators computed for the x-, y-, and z-axes of position. We note
that because Std-VINS is overconfident in its yaw estimate, it also becomes
overconfident along the x- and y-axes, since the horizontal positioning
uncertainty is coupled with the global yaw uncertainty.
Fig. 7. Experiment 2: (a) The estimated uncertainty in yaw computed by the
Std-VINS and OC-VINS methods. (b) Overhead x–y view of the trajectory,
projected on the building's floor plans. The Std-VINS violation of the
correct observability properties results in an infeasible trajectory (i.e.
passing through walls), while the OC-VINS remains consistent with the
ground truth floor drawings.
Fig. 8. Experiment 2: The estimated uncertainties for the Std-VINS and
OC-VINS estimators computed for the x-, y-, and z-axes of position.
original system into observable and unobservable modes. Using this approach,
the observability matrix of the resulting (reduced) unobservable system has
a bounded number of rows, which greatly simplifies the process for computing
all its unobservable modes, and thus those of the original system.
Next, we applied this method to the VINS state model and derived the
analytical form of the unobservable directions of the nonlinear system.
Furthermore, we showed that these coincide with the unobservable directions
of the linearized system, when linearization is performed around the true
state. In practice, however, when the system is linearized about the state
estimates, the observability properties are violated, allowing spurious
information to enter into the estimator and leading to inconsistency. To
address this issue, we employed our analysis for improving the
consistency and accuracy of VINS. In particular, we explicitly enforced the
correct observability properties (in terms of number and structure of the
unobservable directions) by performing simple modifications of the system
and measurement Jacobians. Finally, we presented both simulation and
experimental results that validate the superior performance and improved
consistency of the proposed observability-constrained estimator.
In our future work, we are interested in analyzing additional sources of
estimator inconsistency in VINS such as the existence of multiple local
minima.
Funding
This work was supported by the University of Minnesota (DTC)
and AFOSR (grant number FA9550-10-1-0567).
Notes
1. As defined in Bar-Shalom et al. (2001), a state estimator is consistent
if the estimation errors are zero-mean and have covariance equal to the one
calculated by the filter.
2. The same phenomenon occurs in other estimation frameworks such as batch
least-squares, unless the entire cost function (including both IMU and
vision constraints) is relinearized at every iteration.
3. Note that the unobservable directions for the linearized system when the
Jacobians are evaluated at the true states (see (51)) are the same as those
of the underlying nonlinear system (see (45)). The term
s
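$$
\begin{bmatrix}
\nabla \mathfrak{L}^{0}\mathbf{h} \\
\nabla \mathfrak{L}^{3}_{\mathbf{g}_{0}\mathbf{g}_{13}\mathbf{g}_{21}}\mathbf{h} \\
\nabla \mathfrak{L}^{1}_{\mathbf{g}_{0}}\mathbf{h} \\
\nabla \mathfrak{L}^{3}_{\mathbf{g}_{0}\mathbf{g}_{13}\mathbf{g}_{13}}\mathbf{h} \\
\nabla \mathfrak{L}^{3}_{\mathbf{g}_{0}\mathbf{g}_{0}\mathbf{g}_{21}}\mathbf{h} \\
\nabla \mathfrak{L}^{2}_{\mathbf{g}_{0}\mathbf{g}_{0}}\mathbf{h} \\
\nabla \mathfrak{L}^{3}_{\mathbf{g}_{0}\mathbf{g}_{0}\mathbf{g}_{13}}\mathbf{h} \\
\nabla \mathfrak{L}^{3}_{\mathbf{g}_{0}\mathbf{g}_{0}\mathbf{g}_{0}}\mathbf{h}
\end{bmatrix}
\tag{70}
$$

which, after expanding all of the spans of the Lie derivatives in (70), has
the following structure:

$$
\begin{bmatrix}
\mathbf{I}_{3\times 3} & \mathbf{0}_{3\times 6} & \mathbf{0}_{3\times 6} \\
\mathbf{0}_{1\times 3} & \mathbf{0}_{1\times 6} & \mathbf{0}_{1\times 6} \\
\mathbf{X}_{6\times 3} & \boldsymbol{\Psi}_{6\times 6} & \mathbf{0}_{6\times 6} \\
\mathbf{Y}_{6\times 3} & \mathbf{Z}_{6\times 6} & \boldsymbol{\Theta}_{6\times 6}
\end{bmatrix}
$$

Removing the zero block row leaves

$$
\begin{bmatrix}
\mathbf{I}_{3\times 3} & \mathbf{0}_{3\times 6} & \mathbf{0}_{3\times 6} \\
\mathbf{X}_{6\times 3} & \boldsymbol{\Psi}_{6\times 6} & \mathbf{0}_{6\times 6} \\
\mathbf{Y}_{6\times 3} & \mathbf{Z}_{6\times 6} & \boldsymbol{\Theta}_{6\times 6}
\end{bmatrix}
$$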
Hence, we can prove that system (43) is observable, by showing that the
diagonal blocks $\boldsymbol{\Psi}_{6\times 6}$ and $\boldsymbol{\Theta}_{6\times 6}$ are full-rank matrices.
Specifically,
=
_
_
_
_
_
_
_
_
_
21
0
11
21
11
12
2
11
+1
12
0
21
12
21
2
12
1
11
12
11
21
0
11
21
4
11
12
2
2
12
2
2
11
12
0
21
12
21
2
2
12
2
2
11
4
11
12
11
0 0 2
2
21
2
12
21
4
11
21
0
0 0 0 0 2
12
21
2
21
_
_
where $\beta_{ij}$ denotes the $j$th component of basis element $\boldsymbol{\beta}_{i}$.
Examining the determinant of $\boldsymbol{\Psi}$, we see that

$$
\det(\boldsymbol{\Psi})
= 4\,\beta_{21}^{5}\,\big(\beta_{11}^{2} + \beta_{12}^{2} - 1\big)\big(2\beta_{11}^{2} + 2\beta_{12}^{2} + 1\big)
= 4\,\frac{1}{p_{z}^{5}}\left(\frac{p_{x}^{2}}{p_{z}^{2}} + \frac{p_{y}^{2}}{p_{z}^{2}} - 1\right)\left(2\frac{p_{x}^{2}}{p_{z}^{2}} + 2\frac{p_{y}^{2}}{p_{z}^{2}} + 1\right)
\tag{71}
$$

where for the purpose of analyzing the determinant, we have substituted the
basis element definitions (see (37) and (38)). First, we note that since the
observed point cannot be coincident with the camera center (due to the
physical size of the lens and optics), $p_{z} \neq 0$. Moreover, since we only
process features whose positions can be triangulated from multiple views
(i.e. features that are not at infinite distance from the camera),
$\frac{1}{p_{z}} \neq 0$. Second, we note that all quantities in the last term are
nonnegative, hence,

$$
\left(2\frac{p_{x}^{2}}{p_{z}^{2}} + 2\frac{p_{y}^{2}}{p_{z}^{2}} + 1\right) \geq 1
$$

This means that $\boldsymbol{\Psi}$ is only rank deficient when the relationship

$$
\left(\frac{p_{x}^{2}}{p_{z}^{2}} + \frac{p_{y}^{2}}{p_{z}^{2}} - 1\right) = 0
$$

holds. This equation is satisfied when the observed point lies on a circle
with radius 1 on the normalized image plane (i.e. at focal length 1 from the
optical center). The corresponding bearing angle to a point on this circle
is 45°. This corresponds to a zero-probability event, since the control
inputs of the system take arbitrary values across time. Thus, we conclude
that $\boldsymbol{\Psi}$ is generically full rank.
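The degenerate configuration described above is easy to verify numerically. The sketch below evaluates the factored determinant of (71), with the leading constant taken as written there (the zero set does not depend on it), together with the bearing angle of the feature. The function names and test points are hypothetical.

```python
import math


def det_factored(p_x, p_y, p_z):
    """Evaluate the factored determinant of (71) for a feature at
    (p_x, p_y, p_z) in the camera frame."""
    if p_z == 0.0:
        raise ValueError("the feature cannot coincide with the camera center")
    b11, b12, b21 = p_x / p_z, p_y / p_z, 1.0 / p_z
    return (4.0 * b21 ** 5
            * (b11 ** 2 + b12 ** 2 - 1.0)
            * (2.0 * b11 ** 2 + 2.0 * b12 ** 2 + 1.0))


def bearing_angle_deg(p_x, p_y, p_z):
    """Angle between the optical axis and the ray to the feature, in degrees."""
    return math.degrees(math.atan2(math.hypot(p_x, p_y), p_z))


on_circle = (3.0, 4.0, 5.0)   # (p_x/p_z)^2 + (p_y/p_z)^2 = 1
on_axis = (0.0, 0.0, 1.0)     # feature on the optical axis

det_circle = det_factored(*on_circle)
angle_circle = bearing_angle_deg(*on_circle)
det_axis = det_factored(*on_axis)
```

The determinant vanishes (to rounding) for the point on the unit circle of the normalized image plane, whose bearing angle is 45 degrees, and is nonzero for the on-axis point.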
We now turn our attention to the 6 × 6 submatrix $\boldsymbol{\Theta}$:
=
_
_
_
_
_
_
_
_
21
0
11
21
21
0
11
21
0
21
12
21
0
21
12
21
0
21
12
21
0 0
12
21
21
0
11
21
0 0
11
21
5,1
5,2
5,3
5,4
5,5
5,6
6,1
6,2
6,3
6,4
6,5
6,6
_
_
where $\Theta_{i,j}$ denotes the element in the $i$th row and $j$th column of the
matrix $\boldsymbol{\Theta}$, with
5,1
= 2
21
_
11
42
12
41
+
21
33
_
21
_
2
11
42
12
41
+
21
33
_
2
11
21
42
5,2
= 2
21
43
+
21
_
43
+
11
41
_
+2
11
21
41
5,3
= 2
21
_
42
21
31
12
43
+
11
_
11
42
12
41
+
21
33
__
2
21
42
21
2
_
31
11
33
_
12
21
_
43
+
11
41
_
+2
11
21
_
11
42
12
41
+
21
33
_
+
11
21
_
2
11
42
12
41
+
21
33
_
5,4
= 2
21
_
11
42
12
41
+
21
33
_
+
21
_
2
11
42
12
41
+
21
33
_
+
11
21
42
5,5
=
21
43
21
_
43
+
11
41
_
11
21
41
5,6
=
21
2
_
31
11
33
_
+
21
42
2
21
_
42
21
31
12
43
+
11
_
11
42
12
41
+
21
33
__
+
12
21
_
43
+
11
41
_
2
11
21
_
11
42
12
41
+
21
33
_
11
21
_
2
11
42
12
41
+
21
33
_
6,1
= 2
21
43
21
_
43
+
12
42
_
2
12
21
42
6,2
= 2
12
21
41
21
_
11
42
2
12
41
+
21
33
_
2
21
_
11
42
12
41
+
21
33
_
6,3
= 2
21
41
21
2
_
32
12
33
_
2
21
_
41
+
21
32
11
43
12
_
11
42
12
41
+
21
33
__
+
11
21
_
43
+
12
42
_
+2
12
21
_
11
42
12
41
+
21
33
_
+
12
21
_
11
42
2
12
41
+
21
33
_
6,4
=
21
43
+
21
_
43
+
12
42
_
+
12
21
42
6,5
= 2
21
_
11
42
12
41
+
21
33
_
+
21
_
11
42
2
12
41
+
21
33
_
12
21
41
6,6
=
21
2
_
32
12
33
_
21
41
+2
21
_
41
+
21
32
11
43
12
_
11
42
12
41
+
21
33
__
11
21
_
43
+
12
42
_
2
12
21
_
11
42
12
41
+
21
33
_
12
21
_
11
42
2
12
41
+
21
33
_
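Again, by examining the matrix determinant, we can show that $\boldsymbol{\Theta}$ is
generically full rank. Specifically,

$$
\det(\boldsymbol{\Theta})
= 3\,\beta_{21}^{7}\,\big(\beta_{11}\beta_{33}\beta_{41} - \beta_{32}\beta_{42} - \beta_{31}\beta_{41} + \beta_{12}\beta_{33}\beta_{42}\big)
= 3\,\beta_{21}^{7}
\begin{bmatrix} \beta_{11}\beta_{33} - \beta_{31} & \beta_{12}\beta_{33} - \beta_{32} \end{bmatrix}
\begin{bmatrix} \beta_{41} \\ \beta_{42} \end{bmatrix}
\tag{72}
$$

We hereafter employ the definitions of the basis elements (see (37)–(40)) in
order to analyze $\det(\boldsymbol{\Theta})$. As before, the first term, $\beta_{21} = \frac{1}{p_{z}}$, is
strictly positive and finite. For the remaining two terms, it suffices to
show that they and their product are generically non-zero.
Starting from the last term, we note that this is zero only when
$\mathbf{b}_{g} = \boldsymbol{\beta}_{4} = \mathbf{0}_{3\times 1}$. However, this corresponds to a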
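different system whose system equations would need to be modified to reflect
that its gyro is bias free.
The second term is a function of the feature observation, $\boldsymbol{\beta}_{1} = \mathbf{h}$, and
the velocity expressed in the local frame, $\boldsymbol{\beta}_{3} = \mathbf{C}\mathbf{v}$, which can be
written in a matrix vector form as

$$
\begin{bmatrix} \beta_{11}\beta_{33} - \beta_{31} & \beta_{12}\beta_{33} - \beta_{32} \end{bmatrix}^{T}
= \mathbf{A}\,\boldsymbol{\beta}_{3},
\quad \text{where} \quad
\mathbf{A} = \begin{bmatrix} -\mathbf{I}_{2} & \boldsymbol{\beta}_{1} \end{bmatrix}
$$

Since, generically, $\boldsymbol{\beta}_{3} \neq \mathbf{0}_{3\times 1}$ (the camera is moving), and $\mathbf{A}$ is
full rank, their product is generically nonzero. Thus, it suffices to
examine the case for which

$$
\begin{bmatrix} \beta_{11}\beta_{33} - \beta_{31} & \beta_{12}\beta_{33} - \beta_{32} \end{bmatrix}
\begin{bmatrix} \beta_{41} \\ \beta_{42} \end{bmatrix} = 0
\;\Leftrightarrow\;
\begin{bmatrix} \beta_{41} & \beta_{42} \end{bmatrix} \mathbf{A}\,\boldsymbol{\beta}_{3} = 0
\;\Leftrightarrow\;
\mathbf{A}\,\boldsymbol{\beta}_{3} = \lambda \begin{bmatrix} \beta_{42} \\ -\beta_{41} \end{bmatrix}, \quad \lambda \in \mathbb{R}
$$

This condition, for particular values of $\beta_{41}$ and $\beta_{42}$ (constant), and
for time-varying values of $\boldsymbol{\beta}_{1}$ and hence $\mathbf{A}$, restricts
$\boldsymbol{\beta}_{3} = \mathbf{C}\mathbf{v}$ to always reside in a manifold. This condition, however,
cannot hold given that arbitrary control inputs (linear acceleration and
rotational velocity) are applied to the system.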
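We have shown that the diagonal blocks $\boldsymbol{\Psi}$ and $\boldsymbol{\Theta}$ are both full
rank (see (71) and (72)). We can now apply block-Gaussian elimination,
eliminating the off-diagonal blocks $\mathbf{X}_{6\times 3}$, $\mathbf{Y}_{6\times 3}$, and $\mathbf{Z}_{6\times 6}$, to
obtain the following matrix whose columns span the same space:

$$
\begin{bmatrix}
\mathbf{I}_{3\times 3} & \mathbf{0}_{3\times 6} & \mathbf{0}_{3\times 6} \\
\mathbf{0}_{6\times 3} & \boldsymbol{\Psi}_{6\times 6} & \mathbf{0}_{6\times 6} \\
\mathbf{0}_{6\times 3} & \mathbf{0}_{6\times 6} & \boldsymbol{\Theta}_{6\times 6}
\end{bmatrix}
$$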
Since the block-diagonal elements of