This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/TPAMI.2014.2315808, IEEE Transactions on Pattern Analysis and Machine Intelligence
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
1 INTRODUCTION
Alternatively, high-dimensional features can be projected to a low-dimensional space from which a classifier can be constructed. The compressive sensing (CS) theory [17], [18] shows that if the dimension of the feature space is sufficiently high, these features can be projected to a randomly chosen low-dimensional space which contains enough information to reconstruct the original high-dimensional features. Dimensionality reduction via random projection (RP) [19], [20] is data-independent, non-adaptive and information-preserving. In this paper, we propose an effective and efficient tracking algorithm with an appearance model based on features extracted in the compressed domain [1]. The main components of the proposed compressive tracking algorithm are shown in Figure 1. We use a very sparse measurement matrix that asymptotically satisfies the restricted isometry property (RIP) in compressive sensing theory [17], thereby facilitating an efficient projection from the image feature space to a low-dimensional compressed subspace. For tracking, the positive and negative samples are projected (i.e., compressed) with the same sparse measurement matrix and discriminated by a simple naive Bayes classifier learned online. The proposed compressive tracking algorithm runs in real time and performs favorably against state-of-the-art trackers on challenging sequences in terms of efficiency, accuracy and robustness.
The rest of this paper is organized as follows. We first
review the most relevant work on online object tracking
in Section 2. The preliminaries of compressive sensing and
random projection are introduced in Section 3. The proposed
algorithm is detailed in Section 4, and the experimental results
are presented in Section 5 with comparisons to state-of-the-art
methods on challenging sequences. We conclude with remarks
on our future work in Section 6.
0162-8828 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
"
Samples
"
"
#
Frame(t)
Sparse
measurement
Multiscale
matrix
image features
Multiscale
filter bank
"
"
"
Compressed
vectors
Classifer
Sparse
measurement
matrix
Multiscale
filter bank
Frame(t+1)
Multiscale
image features
Compressed
vectors
Classifier
2 RELATED WORK
3 PRELIMINARIES
r_{ij} = \sqrt{3}\times\begin{cases}
  +1 & \text{with probability } 1/6 \\
  0  & \text{with probability } 2/3 \\
  -1 & \text{with probability } 1/6
\end{cases}   (5)
3.2
A typical measurement matrix satisfying the restricted isometry property is the random Gaussian matrix R ∈ R^{n×m} where r_{ij} ~ N(0, 1) (i.e., zero mean and unit variance), as used in recent work [11], [37], [38]. However, as the matrix is dense, the memory and computational loads are very expensive when m is large. In this paper, we adopt a very sparse random measurement matrix with entries defined as

r_{ij} = \sqrt{s}\times\begin{cases}
  +1 & \text{with probability } \frac{1}{2s} \\
  0  & \text{with probability } 1-\frac{1}{s} \\
  -1 & \text{with probability } \frac{1}{2s}.
\end{cases}   (7)
Achlioptas [19] proves that this type of matrix with s = 1 or 3 satisfies the Johnson-Lindenstrauss lemma (i.e., (4) and (5)). This matrix is easy to compute as it requires only a uniform random generator. More importantly, when s = 3 it is sparse and two thirds of the computation can be avoided. In addition, Li et al. [39] show that for s = o(m) (x ∈ R^m), the random projections are almost as accurate as the conventional random projections where r_{ij} ~ N(0, 1). Therefore, the random matrix (7) with s = o(m) asymptotically satisfies the JL lemma. In this work, we set s = o(m) = m/(a log_{10}(m)) = m/(10a) ~ m/(6a) with a fixed constant a because the dimensionality m is typically in the order of 10^6 to 10^{10}. For each row of R, only about c = (\frac{1}{2s} + \frac{1}{2s}) × m = a log_{10}(m) ≤ 10a nonzero entries need to be computed. We observe that good results can be obtained by fixing a = 0.4 in our experiments. Therefore, the computational complexity is only O(cn) (n = 100 in this work), which is very low. Furthermore, only the nonzero entries of R need to be stored, which makes the memory requirement also very light.
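As a concrete illustration, the sparse matrix in (7) can be generated and applied as in the following NumPy sketch, drawing and storing only the few nonzero entries of each row as described above. The function names are ours; a = 0.4 and n = 100 follow the settings in the text, and m = 10^6 is one value in the stated range.

```python
import numpy as np

def sparse_measurement_rows(n, m, a=0.4, rng=None):
    """Rows of a very sparse measurement matrix as in (7).

    Each entry is +sqrt(s) w.p. 1/(2s), 0 w.p. 1 - 1/s, -sqrt(s) w.p.
    1/(2s), with s = m / (a * log10(m)); only the ~a*log10(m) nonzero
    entries per row are drawn and stored.
    """
    rng = rng or np.random.default_rng(0)
    s = m / (a * np.log10(m))
    rows = []
    for _ in range(n):
        k = rng.binomial(m, 1.0 / s)                # number of nonzeros
        idx = rng.choice(m, size=k, replace=False)  # sensed feature indices
        sign = rng.choice([-1.0, 1.0], size=k)      # +/- with equal probability
        rows.append((idx, sign * np.sqrt(s)))
    return rows

def project(rows, x):
    """Compress x in R^m to v in R^n: v_i = sum_j r_ij * x_j."""
    return np.array([vals @ x[idx] for idx, vals in rows])

R = sparse_measurement_rows(n=100, m=10**6)
x = np.random.default_rng(1).standard_normal(10**6)
v = project(R, x)  # 100-dimensional compressed feature vector
```

With a = 0.4 and m = 10^6, each row carries only about 2-3 nonzero entries, so both the projection cost and the storage are negligible compared with a dense Gaussian matrix.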
[Fig. 2: The original image is convolved with a multiscale filter bank {F_{1,1}, F_{1,2}, ..., F_{w,h}}; the filtered images are concatenated as a column vector x = (x_1, x_2, ..., x_{wh})^T.]
4 PROPOSED ALGORITHM

4.1 Image representation
F_{w,h}(x, y) = \frac{1}{wh}\times\begin{cases}
  1, & 1 \le x \le w,\ 1 \le y \le h \\
  0, & \text{otherwise}
\end{cases}   (8)

where w and h are the width and height of a rectangle filter, respectively.
Then, we represent each filtered image as a column vector in R^{wh} and concatenate these vectors as a very high-dimensional multiscale image feature vector x = (x_1, ..., x_m)^T ∈ R^m where m = (wh)^2. The dimensionality m is typically in the order of 10^6 to 10^{10}. We adopt the sparse random matrix R in (7) to project x onto a vector v ∈ R^n in a low-dimensional space. The random matrix R needs to be computed only once off-line and remains fixed throughout the tracking process.
Fig. 3: Graphical representation of compressing a high-dimensional vector x to a low-dimensional vector v = Rx, where R ∈ R^{n×m} and v_i = \sum_j r_{ij} x_j. In the matrix R, dark, gray and white rectangles represent negative, positive, and zero entries, respectively. The blue arrows illustrate that one nonzero entry of a row of R sensing an element in x is equivalent to a rectangle filter convolving the intensity at a fixed position of an input image.
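To make the filter bank of (8) concrete, the responses of all rectangle (mean) filters can be computed from a single integral image, as in the following sketch. The function name and the boundary handling (valid positions only, zero elsewhere) are our own illustrative choices.

```python
import numpy as np

def box_filter_bank(img):
    """Multiscale rectangle (mean) filters F_{w,h} via an integral image.

    Each filtered image is the response of a w x h averaging filter as
    in (8); stacking all filtered images column-wise yields the
    high-dimensional feature vector x with m = (wh)^2 entries.
    """
    H, W = img.shape
    # Integral image with a zero border: ii[i, j] = sum of img[:i, :j].
    ii = np.pad(img, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    feats = []
    for h in range(1, H + 1):
        for w in range(1, W + 1):
            # Sum over every h x w window (valid top-left positions).
            S = ii[h:, w:] - ii[:-h, w:] - ii[h:, :-w] + ii[:-h, :-w]
            filt = np.zeros_like(img, dtype=float)
            filt[:H - h + 1, :W - w + 1] = S / (w * h)
            feats.append(filt.ravel())
    return np.concatenate(feats)  # x in R^{(H*W)^2}

x = box_filter_bank(np.arange(16.0).reshape(4, 4))
```

For a 4x4 image this already produces a 256-dimensional vector, which shows why m grows so quickly and why the sparse projection of (7) is needed.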
Fig. 5: Probability distributions of three different features in a low-dimensional space. The red stairs represent the histogram of positive samples while the blue ones represent the histogram of negative samples. The red and blue lines denote the corresponding distributions estimated by the proposed incremental update method.
\mu_i^1 \leftarrow \lambda\mu_i^1 + (1 - \lambda)\mu^1
\sigma_i^1 \leftarrow \sqrt{\lambda(\sigma_i^1)^2 + (1 - \lambda)(\sigma^1)^2 + \lambda(1 - \lambda)(\mu_i^1 - \mu^1)^2},   (11)
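The update rule (11) can be sketched for a single feature as follows. The function name is ours, and the learning rate λ = 0.85 is an illustrative value rather than a setting confirmed by the text.

```python
import numpy as np

def update_gaussian(mu, sigma, samples, lam=0.85):
    """Incremental update of one feature's Gaussian parameters as in (11).

    mu, sigma: current mean/std of the feature for one class;
    samples: values of that feature in the newly drawn samples;
    lam: learning rate (larger values adapt more slowly).
    """
    mu_new = samples.mean()       # mean over the new samples
    sigma_new = samples.std()     # std over the new samples
    mu_upd = lam * mu + (1 - lam) * mu_new
    sigma_upd = np.sqrt(lam * sigma**2 + (1 - lam) * sigma_new**2
                        + lam * (1 - lam) * (mu - mu_new)**2)
    return mu_upd, sigma_upd

mu, sigma = update_gaussian(0.0, 1.0, np.array([0.5, 1.5, 1.0]))
```

The cross term λ(1 − λ)(μ − μ_new)^2 keeps the updated variance consistent when the old and new means differ, which a naive blend of the two variances alone would miss.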
4.4 Discussion
Fig. 7: Illustration of the proposed algorithm in dealing with ambiguity in detection. Top row: three positive samples. The sample in the red rectangle is the most correct positive sample while the other two in yellow rectangles are less correct positive samples. Bottom row: the probability distributions for a feature extracted from positive and negative samples. The green markers denote the feature extracted from the most correct positive sample while the yellow markers denote the features extracted from the two less correct positive samples. The red and blue stairs as well as lines denote the estimated distributions of positive and negative samples as shown in Figure 5.
than p = \prod_{i=1}^{n} p_i(y = 0) = 1\% \times 0.5^{100} \approx 0 when a failure happens.
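The magnitude of this product is easy to check numerically (a quick sketch of the arithmetic in the text, not code from the paper):

```python
import math

# Product of n = 100 conditionally independent likelihoods of 0.5 each,
# scaled by the 1% term from the text: the result vanishes, so a
# detection failure is extremely unlikely to be sustained.
p = 0.01 * math.prod([0.5] * 100)
print(p)  # smaller than 1e-30, effectively zero
```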
5 EXPERIMENTS

5.1 Experimental setup
TABLE 2: Success rate (SR) (%). Bold fonts in the original indicate the best performance while italic fonts indicate the second best. The total number of evaluated frames is 8,762.

Sequence        SFCT FCT  CT   CS Frag  OAB SemiB IVT MIL VTD L1T TLD  DF MTT Struck CST SCM ASLA
Animal            99  92  96    4    3   17   51    4  83  96   6  37   6  87   96  100  98   96
Biker             85  35  84    5    3   66   39   10   1  15   3   2   6   9    9    9   5   10
Bolt              99  94  90    5   41    0   18   17  92   0   2   0   1   1    8   92   1    1
Cliff bar         95  99  89   24   24   66   24   47  71  53  24  63  26  55   44   96  41   40
Chasing           88  79  47   67   21   71   62   91  65  70  72  76  70  96   85   96  64   63
Coupon book       99  98  97   17   26   98   23   98  98  17  16  31  34  39   98   81  32   71
David indoor      99  98  94    8    8   32   46   98  71  98  83  98  51  41   33   66  30   34
Dark car          75  36  53    6    0   14   19   54  48  25  46  67  78  59   18   48  47   57
Football          77  76  74   35   26   31   17   64  77  83  35  59  56  67   62   69  17    7
Goat              75  77  26   26   14   46   43   37  27  39  24  48  44  39   59   89  57   37
Occluded face     98  99  99   39   54   49   41   96  97  79  96  87  78  88   97   99  76   93
Panda             91  84  90    1    9   83   71   11  80   7  63  34  13  11   43   15  29   71
Pedestrian        82  83  13    1    0    1    3    0   1   3   4   0   7   4    1    1   5    1
Skating           96  97  83    7   11   68   39    8  21  96  65  37  19  10   84    9  76   61
Shaking 1         72  84  80    9   25   39   30    1  83  93   3  15  84   2   48   36  54   98
Shaking 2         97  88  55   12   34   74   46   39  41  80  36  56  95  93   53   43  86   82
Sylvester         83  77  69   57   34   65   66   45  77  33  40  89  32  67   80   83  76   82
Tiger 1           49  52  50   62   19   24   29    8  34  78  18  40  36  25   87   42  31   14
Tiger 2           61  72  48   11   12   36   16   19  44  13  11  24  65  34   62   37   2   24
Twinnings         72  98  70   41   73   99   23   49  83  75  82  91  82  77   95   86  89   63
Average SR        86  82  73   29   33   56   43   49  68  51  47  57  50  50   64   66  51   58
TABLE 3: Center location error (CLE) (in pixels) and average frames per second (FPS). Bold fonts in the original indicate the best performance while italic fonts indicate the second best. The total number of evaluated frames is 8,762.

Sequence        SFCT FCT  CT   CS Frag  OAB SemiB IVT MIL VTD L1T TLD  DF MTT Struck CST SCM ASLA
Animal            13  15  16  271  100   62   26  207  32  16 122 125 252  17   19   15  16   13
Biker              6  12   6  176  107   10   14  111  44  86  89 166  76  68   95   53 227  109
Bolt               8  10   9  152   44  227  102   60   8 146 261 286 277 293  148   12 200  210
Cliff bar          7   6   8   69   34   33   56   37  14  31  40  70  52  25   46    7  99   49
Chasing            9  10  12    9   56    9   44    5  13  23   9  47  31   4    5    4  61   47
Coupon book        5   4   7  175   62    9   74    4   6  74  75  81  23  72    6   21  73   23
David indoor       8  11  14   72   73   57   37    6  19   6  17   8  56 125   64   18 150   57
Dark car           7   9  10   89  116   11   11    8   9  20   8  13   6   7    9    8  45    8
Football           8  13  14   43  144   37   58   10  13   6  39  15  33   9   26   17 200  207
Goat              20  18 103  137  140   71   77   94 109  92  88 103  86  99   22    9  75   94
Occluded face     11  12  16   29   57   36   39   14  17  36  17  24  22  19   15   13  24   20
Panda              6   6  10  157   56    8    9   58   7  61   9  16  64  47   11   46 156    9
Pedestrian         7   6  70   78  160   91   86   84  71  74  76 211  90  76   72  104 210   93
Skating           16  14  21  207  176   74   76  144 136   9  87 204 174  78   15   10  42   72
Shaking 1         13  10  14  119   55   22  134  122  12   6  72 232  10 115   24   21  47   10
Shaking 2         14  15  46  255  119   18  124  109  58  41 113 144   7  16   48   84  18   27
Sylvester          9   9  14   84   47   12   14  138   9  66  50   8  56  18   10    8  10    9
Tiger 1           13  23  25   48   39   42   38   45  27   8  37  24  30  61    8   25 146   49
Tiger 2           16  10  17   84   36   22   30   44  18  47  48  40  13  24   11   22 230   36
Twinnings         11  10  15   44   15    7   70   23  11  19  11   8  12  12    9   11   9   29
Average CLE        9  10  18   96   60   31   48   56  22  42  45  65  48  57   24   23  87   45
Average FPS      135 149  80   40    6   22   11   33  38   6 0.5  28  13   5   20  362   1    7
5.2 Evaluation criteria

5.3 Tracking results

5.3.1 Pose and illumination change
Fig. 8: Screenshots of some sample tracking results when there are pose variations and severe illumination changes.
Fig. 9: Screenshots of some sample tracking results when there are severe occlusion and pose variations.
Except for the FCT and SFCT algorithms, all the other methods lose track of the target in numerous frames.
The proposed FCT and SFCT algorithms handle occlusion and pose variation well as the adopted scale-invariant appearance model is discriminatively learned from the target and background with data-independent measurements, thereby alleviating the influence of background pixels (see also Figure 9(c)). Furthermore, the FCT and SFCT algorithms perform well on objects with non-rigid shape deformation and camera view change in the Panda, Bolt and Goat sequences (Figure 9(b), (c), and (d)) as the adopted appearance model is based on spatially local scale-invariant features which are less sensitive to non-rigid shape deformation.
5.3.3 Rotation and abrupt motion
Fig. 10: Screenshots of some sample tracking results when there are rotation and abrupt motion.
Fig. 11: Screenshots of some sample tracking results with background clutter.
5.3.4 Background clutter
The object in the Cliff bar sequence changes in scale and orientation, and the surrounding background has similar texture (Figure 11(a)). As the Frag, IVT, VTD, L1T, DF, CS, MTT and ASLA methods use generative appearance models that do not exploit the background information, it is difficult for them to keep track of the target.
[11] H. Li, C. Shen, and Q. Shi, "Real-time visual tracking using compressive sensing," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1305-1312, 2011.
[12] X. Mei and H. Ling, "Robust visual tracking and vehicle classification via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 11, pp. 2259-2272, 2011.
[13] S. Hare, A. Saffari, and P. Torr, "Struck: Structured output tracking with kernels," in Proceedings of the IEEE International Conference on Computer Vision, pp. 263-270, 2011.
[14] J. Kwon and K. M. Lee, "Tracking by sampling trackers," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1195-1202, 2011.
[15] J. Henriques, R. Caseiro, P. Martins, and J. Batista, "Exploiting the circulant structure of tracking-by-detection with kernels," in Proceedings of European Conference on Computer Vision, pp. 702-715, 2012.
[16] J. Ho, K. Lee, M. Yang, and D. Kriegman, "Visual tracking using learned linear subspaces," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. I-782, 2004.
[17] E. Candes and T. Tao, "Decoding by linear programming," IEEE Transactions on Information Theory, vol. 51, no. 12, pp. 4203-4215, 2005.
[18] E. Candes and T. Tao, "Near-optimal signal recovery from random projections: Universal encoding strategies?," IEEE Transactions on Information Theory, vol. 52, no. 12, pp. 5406-5425, 2006.
[19] D. Achlioptas, "Database-friendly random projections: Johnson-Lindenstrauss with binary coins," Journal of Computer and System Sciences, vol. 66, no. 4, pp. 671-687, 2003.
[20] E. Bingham and H. Mannila, "Random projection in dimensionality reduction: applications to image and text data," in International Conference on Knowledge Discovery and Data Mining, pp. 245-250, 2001.
[21] A. Yilmaz, O. Javed, and M. Shah, "Object tracking: A survey," ACM Computing Surveys, vol. 38, no. 4, 2006.
[22] K. Cannons, "A review of visual tracking," Tech. Rep. CSE-2008-07, York University, 2008.
[23] M.-H. Yang and J. Ho, "Toward robust online visual tracking," in Distributed Video Sensor Networks (B. Bhanu, C. Ravishankar, A. Roy-Chowdhury, H. Aghajan, and D. Terzopoulos, eds.), pp. 119-136, Springer, 2011.
[24] S. Salti, A. Cavallaro, and L. Di Stefano, "Adaptive appearance modeling for video tracking: Survey and evaluation," IEEE Transactions on Image Processing, vol. 21, no. 10, pp. 4334-4348, 2012.
[25] A. Adam, E. Rivlin, and I. Shimshoni, "Robust fragments-based tracking using the integral histogram," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 798-805, 2006.
[26] T. Zhang, B. Ghanem, S. Liu, and N. Ahuja, "Robust visual tracking via multi-task sparse learning," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2042-2049, 2012.
[27] C. Bao, Y. Wu, H. Ling, and H. Ji, "Real time robust L1 tracker using accelerated proximal gradient approach," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1830-1837, 2012.
[28] S. Oron, A. Bar-Hillel, D. Levi, and S. Avidan, "Locally orderless tracking," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1940-1947, 2012.
[29] L. Sevilla-Lara and E. Learned-Miller, "Distribution fields for tracking," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1910-1917, 2012.
[30] S. Wang, H. Lu, F. Yang, and M. Yang, "Superpixel tracking," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1323-1330, 2011.
[31] D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-based object tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 5, pp. 564-577, 2003.
[32] C. Shen, J. Kim, and H. Wang, "Generalized kernel-based visual tracking," IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, no. 1, pp. 119-130, 2010.
[33] Z. Kalal, J. Matas, and K. Mikolajczyk, "P-N learning: Bootstrapping binary classifiers by structural constraints," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 49-56, 2010.
[34] D. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006.
[35] R. Baraniuk, "Compressive sensing," IEEE Signal Processing Magazine, vol. 24, no. 4, pp. 118-121, 2007.
6 CONCLUDING REMARKS
In this paper, we propose a simple yet robust tracking algorithm with an appearance model based on non-adaptive random projections that preserve the structure of the original image space. A very sparse measurement matrix is adopted to efficiently compress features from both the foreground target and the background. The tracking task is formulated as a binary classification problem with online update in the compressed domain. Numerous experiments with state-of-the-art algorithms on challenging sequences demonstrate that the proposed algorithm performs well in terms of accuracy, robustness, and speed.
Our future work will focus on applications of the developed
algorithm for object detection and recognition under heavy
occlusion. In addition, we will explore efficient detection
modules for persistent tracking (where objects disappear and
reappear after a long period of time).
ACKNOWLEDGMENT
The authors would like to thank the editor and reviewers for their valuable comments. Kaihua Zhang is partially supported by NSFC (61272223) and NSF of Jiangsu Province (BK2012045).
REFERENCES
[1]