You are on page 1of 12

A Scale-Space Approach to Nonlocal Optical

Flow Calculations
Luis Alvarez1 , Joachim Weickert2, and Javier Sanchez1
Departamento de Informatica y Sistemas,
Universidad de Las Palmas, Campus de Ta ra,
SP-35017 Las Palmas, Spain.
E-Mail: flalvarez,jsanchezg@dis.ulpgc.es
WWW: http://serdis.dis.ulpgc.es/~lalvarez
1

Computer Vision, Graphics and Pattern Recognition Group,


Department of Mathematics and Computer Science,
University of Mannheim, D-68131 Mannheim, Germany.
E-Mail: Joachim.Weickert@ti.uni-mannheim.de
WWW: http://www.ti.uni-mannheim.de/~bmg/weickert/

Abstract. We interpret a classical method for optical ow computation


due to Nagel and Enkelmann (1986) as a tensor-driven anisotropic di usion approach in digital image analysis. We introduce an improvement
into the model formulation, and we establish well-posedness results for
the resulting system of parabolic partial di erential equations. Since our
method avoids linearizations in the optical ow constraint, it can recover
displacement elds which are far beyond the typical one-pixel limits that
are characteristic for di erential methods for optical ow recovery. A
robust numerical scheme is presented in detail. We avoid convergence to
irrelevant local minima by embedding our method into a linear scalespace framework and using a focusing strategy from coarse to ne scales.
We illustrate the high accuracy of the proposed method by means of a
synthetic and a real-world image sequence.

1 Introduction
Optical ow computation consists of nding the apparent motion of objects in
a sequence of images. It is a key problem in arti cial vision and much research
has been devoted to this eld; for a survey see e.g. [20].
In the present paper we shall consider two images I1 (x; y) and I2 (x; y) (de ned
on R 2 to simplify the discussion) which represent two consecutive views in a
sequence of images. Under the assumption that corresponding pixels have equal
grey values, the determination of the optical ow from I1 to I2 comes down to
nding a function h(x; y) = (u(x; y); v(x; y)) such that
I1 (x; y )

= I2 (x + u(x; y); y + v(x; y)),

8(x; y) 2 R 2 :

(1)

To compute h(x; y) the preceding equality is usually linearized yielding the


so-called "optical ow constraint"
I1 (x) ? I2 (x)  hrI2 (x); h(x)i

8x 2 R 2 ;

where x := (x; y). The linearized optical ow constraint assumes that the object
displacements h(x) are small. In other cases, this linearization is no longer valid.
Frequently, instead of equation (1), researchers use the alternative equality
I1 (x ? u(x; y ); y ? v (x; y ))

8(x; y) 2 R 2 :

= I2 (x; y),

(2)

In this case the displacement h(x; y) is centered in the image I2 (x; y):
The determination of optical ow is a classic ill-posed problem in computer
vision [6], and it requires to be supplemented with additional regularizing assumptions. The regularization by Horn and Schunck [13] assumes that the optical ow
eld is smooth. Much research has been done to modify the Horn and Schunck
approach in order to permit discontinuous ow elds; see [7, 8, 18, 21{24, 28] and
the references therein.
An important improvement in this direction has been achieved by Nagel and
Enkelmann [21] in 1986. They consider the following minimization problem:

ENE (h) =

R2

(I1 (x ? u(x; y); y ? v(x; y)) ? I2 (x; y))2 dx

+C

(rh)T D (rI1 ) (rh)

trace
2

(3)

dx

where C is a positive constant and D (rI1 ) is a regularized projection matrix in


the direction perpendicular of rI1 :

80 1 0 1T
9
>
>
@I
@I
<
=
1
@y
@y
2
@
A
@
A
D (rI1 ) =
+

Id
:
>
jrI1 j2 + 22 >
: ? @I@x ? @I@x
;
1

In this formulation, Id denotes the identity matrix. The advantage of this method
is that it inhibits blurring of the ow across boundaries of I1 where jrI1 j >> .
This model, however, uses an optical ow constraint which is centered in I2 ,
while the projection matrix D in the smoothness term depends on I1 . This
inconsistency may lead to erroneous results for large displacement elds. In order
to avoid this problem, we consider a modi ed energy functional where both the
optical ow constraint and the smoothness constraint are related to I1 :
E (h) =

R2

+C

(I1 (x; y) ? I2 (x + u(x; y); y + v(x; y)))2 dx

trace
2

(rh)T D (rI1 ) (rh)

dx:

(4)

The associated Euler-Lagrange equations are given by the PDE system


C div (D (rI1 )
C div (D (rI1 )

2 (x + h(x)) = 0;
ru) + (I1 (x) ? I2 (x + h(x))) @I
@x
2 (x + h(x)) = 0:
rv) + (I1 (x) ? I2 (x + h(x))) @I
@y

(5)
(6)

In this paper, we are interested in solutions of the equations (5)-(6) in the case
of large displacement elds. Therefore, we do not introduce any linearization in
the above system. We obtain the solutions by calculating the asymptotic state
(t ! 1) of the parabolic system
@u
@t
@v
@t

2 (x + h(x));
= C div (D (rI1 ) ru) + (I1 (x) ? I2 (x + h(x))) @I
@x
2 (x + h(x)):
= C div (D (rI1 ) rv) + (I1 (x) ? I2 (x + h(x))) @I
@y

(7)
(8)

Interestingly, this coupled system of di usion{reaction equations reveals a


di usion tensor which resembles the one used for edge-enhancing anisotropic
di usion ltering. Indeed, D(rI1 ) has the eigenvectors v1 := rI1 and v2 :=
rI1? . The corresponding eigenvalues are given by
1 (jrI1 j) =

2

jrI1 j2 + 22 ;
jrI1 j2 + 2 :
2 (jrI1 j) =
jrI1 j2 + 22

(9)
(10)

We observe, that 1 + 2 = 1 holds independently of rI1 . In the interior


of objects we have jrI1 j ! 0, and therefore 1 ! 1=2 and 2 ! 1=2. At
ideal edges where jrI1 j ! 1, we obtain 1 ! 0 and 2 ! 1. Thus, we
have isotropic behaviour within regions, and at image boundaries the process
smoothes anisotropically along the edge. This behaviour is very similar to edgeenhancing anisotropic di usion ltering [26], and it is also close in spirit to the
modi ed mean-curvature motion considered in [3]. In this sense, one may regard
the Nagel{Enkelmann method as an early predecessor of modern PDE techniques
for image restoration. For a detailed treatment of anisotropic di usion ltering
we refer to [27], and an axiomatic classi cation of mean-curvature motion and
related morphological PDEs for image analysis is presented in [2].
In general, we cannot expect the uniqueness of solutions of the elliptic system
(5)-(6), and the asymptotic state of the above parabolic system depends on the
initial data for the ow u and v. In order to encourage convergence to the
physically correct solution in case of large displacement ow, we will design a
linear scale-space focusing procedure for the optical ow constraint. One may
expect that this leads to a signi cant improvement, since methods which consider
the optical ow constraint under Gaussian convolution have demonstrated their
advantages already without nonlinear smoothness terms; see e.g. Florack et al.

[11]. Using a scale-space approach enables us also to perform a ner and more
reliable scale focusing as it would be the case for related pyramid [4] or multigrid
approaches [9].
The paper is organized as follows: In Section 2 we show the existence and
uniqueness of solutions of the nonlinear parabolic system (7)-(8), under suitable
conditions on the data. In Section 3 we apply a linear scale-space focusing which
enables us to achieve convergence to realistic solutions for large displacement
vectors. Section 4 describes a numerical discretization of the parabolic system
(7)-(8) based on an explicit nite di erence scheme. In Section 5 we present
experimental results on a synthetic and a real-world image sequence. Finally, in
Section 6 we conclude with a summary.

2 Existence and Uniqueness of the Parabolic System


We will study the solutions of the parabolic system of nonlinear partial di erential equations (7)-(8). In [1], the authors develop a theoretical framework to
study the existence and uniqueness of solutions of a similar parabolic system,
but with a di erent regularization term. The main techniques used in [1] can be
applied in order to obtain the existence and uniqueness of the solutions of the
system (7)-(8).
First, we notice that the di usion tensor D(rI1 ) is uniformly bounded. Moreover, if we assume that I1 2 C 1 (R 2 ), then the di usion tensor is a continuous
function and, on the other hand, the eigenvalues of the di usion tensor are
uniformly bounded from below by a positive constant. Then, by classical results,
we obtain that the homogeneous problem
@u
@t
@v
@t

= C div (D(rI1 ) ru)


= C div (D(rI1 ) rv)

with C > 0 has a unique solution for any initial datum h0 = (u0 ; v0 ) 2 L2 (R 2 ) 
L2 (R 2 ). Using the semigroup standard notation we denote by S (t)h0 the solution
of the above homogeneous system for the initial data h0 : Let us also denote by
F : L2 (R 2 )  L2 (R 2 ) ! L2 (R 2 )  L2 (R 2 ) the function
F (h) = (I1 ? I2 (Id + h)) rI2 (Id + h):

Then the following result holds.


Theorem21. Let I22 2 C 2(R 2 ) and I1 2 C 1(R 2 ). Then, for all initial ows
h0 2 L2 (R )  L2 (R ), there exists a unique generalized solution
h(:; t) 2 C

?[0; 1); L2(R 2)  L2(R 2)

of the nonlinear PDE system (7){(8) .

Proof. We will only sketch the proof. For more details we refer the reader to
[1] where all mathematical details are presented. First, we note that, under the
conditions I2 2 C 2 (R 2 ) and I1 2 C 1 (R 2 ), the function F (:) is Lipschitz. On
the other hand, using standard mathematical techniques we can show that the
solutions of (7)-(8) can be expressed as a xed point of the function
(h)(t) = S (t)h0 +

Zt
0

S (t ? s)F (h(s))ds:

Therefore, the solutions of (7)-(8) satisfy


(h)(t)

= h(t)

and nally, since F (:) is Lipschitz, we can show that  is a contraction, and by
the Banach xed point theorem there exists a unique solution of the problem.

3 A Linear Scale-Space Approach to Recover Large


Displacements
In general, the Euler-Lagrange equations (5)-(6) will have multiple solutions. As
a consequence, the asymptotic state of the parabolic system (7)-(8), which we use
for approximating the optical ow, will depend on the initial data. Typically, the
convergence is the better, the closer the initial data is to the asymptotic state.
When we expect small displacements in the scene, the natural choice is to take
u  v  0 as initialization of the ow. For large displacement elds, however,
this will not work, and we need better initial data. To this end, we embed our
method into a linear scale-space framework [14, 29]. Considering the problem at
a coarse scale avoids that the algorithm gets trapped in physically irrelevant
local minima. The coarse-scale solution serves then as initial data for solving
the problem at a ner scale. Scale focusing has a long tradition in linear scalespace theory (see e.g. Bergholm [5] for an early approach. Detailed descriptions
of linear scale-space theory can be found in [10, 12, 15, 16, 19, 25].
We proceed as follows. First, we introduce a linear scale factor in the parabolic
PDE system in order to end up with
@u
@t

= C div (D(rI1 ) ru ) +


+

@v
@t

G

 I1 (x) ? G  I2 (x + h (x))

= C div (D(rI1 ) rv ) +


+

G

 I1 (x) ? G  I2 (x + h (x))

 @ (G  I2 )

(x + h (x));

(11)

 @ (G  I2 )

(x + h (x));

(12)

@x

@y

where G  I represents the convolution of I with a Gaussian of standard deviation :

The convolution with a Gaussian blends the information in the images and
allows us to recover a connection between the objects in I1 and I2 : We start with
a large initial scale 0 . Then we compute the optical ow (u0 ; v0 ) at scale 0
as the asymptotic state of the solution of the above PDE system using as initial
data u  v  0: Next, we choose a number of scales n < n?1 < :::: < 0 , and
for each scale i we compute the optical ow (u ; v ) as the asymptotic state
of the above PDE system with initial data (u ?1 ; v ?1 ). The nal computed
ow corresponds to the smallest scale n . In accordance with the logarithmic
sampling strategy in linear scale-space theory [17], we choose i := i 0 with
some decay rate  2 (0; 1).
i

4 Numerical Scheme
We discretize the parabolic system (11){(12) by nite di erences. Derivatives
in x and y are approximated by central di erences, and for the discretization
in t direction we use an explicit (Euler forward) scheme. Gaussian convolution
was performed in the spatial domain with renormalizedGaussians,
 which where
a
b
truncated at 5 times their standard deviation. Let D = b c . Then our explicit
scheme has the structure
uki;j+1 ? uki;j


+ ai;j uki+1;j ? uki;j + ai?1;j + ai;j uki?1;j ? uki;j +


2
h21
2
h21
uk
? uk
uk
? uk
+ ci;j+12+ ci;j i;j+1h2 i;j + ci;j?12+ ci;j i;j?1h2 i;j +
2
2
bi?1;j ?1 + bi;j uki?1;j ?1 ? uki;j
bi+1;j +1 + bi;j uki+1;j +1 ? uki;j
+
+
2
2h1h2
2
2h1h2 !?
uk
? uk
uk
? uk
? bi+1;j?21 + bi;j i+1;j2h?1h i;j ? bi?1;j+12 + bi;j i?12;jh+1h i;j +
1 2
1 2
=C

ai+1;j

+ I1; (xi;j ? hk;i;j ) ? I2; (xi;j ) I1;x; (xi;j ? hk;i;j );


(13)
k+1 ? v k
k
k
k
k
vi;j
i;j = C ai+1;j + ai;j vi+1;j ? vi;j + ai?1;j + ai;j vi?1;j ? vi;j +

2
h21
2
h21
vk
? vk
vk
? vk
+ ci;j+12+ ci;j i;j+1h2 i;j + ci;j?12+ ci;j i;j?1h2 i;j +
2
2
k
k
bi?1;j ?1 + bi;j vik?1;j ?1 ? vi;j
bi+1;j +1 + bi;j vik+1;j +1 ? vi;j
+
+
2
2h1 h2
2
2h1 h2 !?
k
k
k
k
? bi+1;j?21 + bi;j vi+1;j2h?1h? vi;j ? bi?1;j+12 + bi;j vi?12;jh+1h? vi;j +
1 2
1 2
+ I1; (xi;j ? hk;i;j ) ? I2; (xi;j ) I1;y; (xi;j ? hk;i;j ):

(14)

The notations are almost selfexplaining: for instance,  is the time step size, h1
and h2 denote the pixel size in x and y direction, respectively, uki;j approximates
1
u in some grid point xi;j at time k , and I1;x; is an approximation to G  @I
@x .
k
We calculate values of type I1; (xi;j ? hi;j ) by linear interpolation, and we use
the time step size
0:5
(15)
 =
4C + max
(jI (x )j2 ; jI1;y; (xi;j )j2 ) :
i;j 1;x; i;j
This step size can be motivated from a stability analysis in the maximum norm
when considering a corresponding isotropic problem with a linearized optical
ow constraint.

5 Experimental Results
The complete algorithm for computing the optical ow depends on a number of
parameters which have an intuitive meaning:
{ The regularization parameter C speci es the balance between the smoothing
term and the optical ow constraint. Larger values lead to smoother ow
elds.
{ The constant  in the smoothing term serves as a contrast parameter: locations where the image gradient magnitude is larger than  are regarded as
edges. The di usion process smoothes anisotropically along these edges. In
our experiments we used  := 1. The results were not very sensitive to
underestimations of .
{ The scale 0 denotes the standard deviation of the largest Gaussian. In
general, 0 is chosen according to the maximum displacement expected. In
our case we used 0 := 10.
{ The decay rate  2 (0; 1) for the computation of the scales m := m 0. We
may expect a good focusing if  is close to 1. We have chosen  := 0:95.
{ The smallest scale is given by n . It should be close to the inner scale of the
image in order to achieve optimal ow localization.
{ The stopping time T for solving the system (11){(12) at each scale m .
We observed that T := 50 gives results which are suciently close to the
asymptotic state.
In our rst experiment we use a synthetic image composed of four black
squares on a white background. Each square moves in a di erent direction and
with a di erent displacement magnitude: the left square on the top moves with
(u; v) = (10; 5), the right square on the top is discplaced with (u; v) = (?10; 0),
the left square on the bottom is shifted by (u; v) = (0; ?5), and the right square
on the bottom undergoes a translation by (?10; ?10). Figure 1 illustrates the
improvements due to scale-space focusing. In order to visualize the ow eld
(u; v) we use two grey level images (ugl ; vgl ) de ned by ugl := 128 + 8u and
vgl := 128 + 8v . We use the regularization parameter C = 15000, start with

initial scale 0 = 10 and show the results for focusing to the scales 10 = 5:99,
20 = 3:58, 30 = 2:15, 40 = 1:29, 50 = 0:77, 60 = 0:46, and 70 = 0:28,
respectively. It can be observed that that this leads to a signi cantly improved
ow localization.

Fig. 1. From top to bottom and from left to right: the original pair of images I1

and I2 ; and the ow components (un ; vn ) resulting from focusing to the scales 10 =
5:99, 20 = 3:58, 30 = 2:15, 40 = 1:29, 50 = 0:77, 60 = 0:46, and 70 = 0:28,
respectively.

We notice that the computed ow is a good approximation of the expected


ow. In fact, not only the orientation of the ow is correct, but also the ow
magnitude is surprisingly accurate: the maximum of the computed optic ow
magnitude is 14:13, which is a very good approximation of the ground truth

Fig. 2. Optic ow obtained without the scalespace approach (T = 800).

maximum 10 2  14:14. It results from the square which moves in (?10; ?10)
direction. This indicates that { under speci c circumstances { our method may
even lead to optic ow results with subpixel accuracy.
In a second experiment, we illustrate the necessity of the scale-space approach.
Figure 2 depicts the optical ow that we obtain without scale-space focusing,
i.e. with 0 = 0. As can be expected, the algorithm gets trapped in a physically
irrelevant local minimum.
In the third experiment, we use the classical taxi sequence, but instead of
taking two consecutive frames { as is usually done { we consider the frames 15
and 19. The dark car at the left creates a largest displacement magnitude of
approximately 12 pixels. In Figure 3 we present the computed ow using the
regularization parameter C = 500 and focusing from 0 = 10 to 70 = 0:28.
The computed maximal ow magnitude is 11:68, which is a good approximation
of the actual displacement of the dark car. Figure 4 shows a vector plot of the
computed ow eld.

6 Conclusions
Usually, when computer vision researchers deal with variational methods for
optical ow calculations, they linearize the optical ow constraint. However, such
a linearization works only for small displacements. In this paper we investigate
an improved formulation of a classical method by Nagel and Enkelmann where
no linearization is used. We identify this method with two coupled anisotropic

Fig. 3. Optic ow computation of the taxi sequence using frames 15 and 19:
di usion lters with a nonlinear reaction term. We showed that this parabolic
system is well-posed from a mathematical viewpoint, and we presented a nite
di erence scheme for its numerical solution. In order to avoid that the algorithms
converges to physically irrelevant local minima, we embedded it into a linear
scale-space approach for focusing the solution from a coarse to a ne scale.
The numerical results that we have presented for a synthetic and a real-world
sequence are very encouraging: it was possible to recover displacements of more
than 10 pixels with high accuracy. It is our hope that this successful blend
of nonlinear anisotropic PDEs and linear scale-space techniques may serve as
a motivation to study other combinations of linear and nonlinear scale-space
approaches in the future.
Acknowledgements. This work has been supported by the European TMR
network "Viscosity Solutions and their Applications".

References
1. L. Alvarez, J. Esclarin, M. Lefebure and J. Sanchez, A PDE model for computing
the optical ow, Proc. C.E.D.Y.A. XVI, Universidad de Las Palmas de Gran
Canaria, 1999.

Fig. 4. Vector plot of the optic ow between the frames 15 and 19 of the taxi
sequence.
2. L. Alvarez, F. Guichard, P.-L. Lions, J.-M. Morel, Axioms and fundamental equations in image processing, Arch. Rational Mech. Anal., Vol. 123, 199{257, 1993.
3. L. Alvarez, P.-L. Lions, J.-M. Morel, Image selective smoothing and edge detection
by nonlinear di usion. II, SIAM J. Numer. Anal., Vol. 29, 845{866, 1992.
4. P. Anandan, A computational framework and an algorithm for the measurement of
visual motion, Int. J. Comput. Vision, Vol. 2, 283{310, 1989.
5. F. Bergholm, Edge focusing, IEEE Trans. Pattern Anal. Mach. Intell., Vol. 9, 726{
741, 1987.
6. M. Bertero, T.A. Poggio, V. Torre, Ill-posed problems in early vision, Proc. IEEE,
Vol. 76, 869{889, 1988.
7. I. Cohen, Nonlinear variational method for optical ow computation, Proc. Eighth
Scandinavian Conf. on Image Analysis (SCIA '93, Troms, May 25{28, 1993), Vol.
1, 523{530, 1993.
8. R. Deriche, P. Kornprobst, G. Aubert, Optical- ow estimation while preserving
its discontinuities: A variational approach, Proc. Second Asian Conf. Computer
Vision (ACCV '95, Singapore, December 5{8, 1995), Vol. 2, 290{295, 1995.
9. W. Enkelmann, Investigation of multigrid for algorithms for the estimation of
optical ow elds in image sequences, Computer Vision, Graphics and Image
Processing, Vol. 43, 150-177, 1988.
10. L. Florack, Image structure, Kluwer, Dordrecht, 1997.
11. L.M.J. Florack, W.J. Niessen, M. Nielsen, The intrinsic structure of the optic ow
incorporating measurement duality, Int. J. Comput. Vision, Vol. 27, 263{286, 1998.

12. B. ter Haar Romeny, L. Florack, J. Koenderink, M. Viergever (Eds.), Scale-space


theory in computer vision, Lecture Notes in Computer Science, Vol. 1252, Springer,
Berlin, 1997.
13. B. Horn, B. Schunck, Determining optical ow, Artif. Intell., Vol. 17, 185{203,
1981.
14. T. Iijima, Basic theory on normalization of pattern (in case of typical onedimensional pattern), Bulletin of the Electrotechnical Laboratory, Vol. 26, 368{388,
1962 (in Japanese).
15. T. Iijima, Pattern recognition, Corona-sha, 1973 (in Japanese).
16. T. Iijima, Theory of pattern recognition, Series of Basic Information Technology,
Vol. 6, Morishita Publishing, 1989 (in Japanese).
17. J.J. Koenderink, The structure of images, Biological Cybernetics, Vol. 50, 363{370,
1984.
18. A. Kumar, A.R. Tannenbaum, G.J. Balas, Optic ow: a curve evolution approach,
IEEE Trans. Image Proc., Vol. 5, 598{610, 1996.
19. T. Lindeberg, Scale-space theory in computer vision, Kluwer, Boston, 1994.
20. A. Mitiche, P. Bouthemy, Computation and analysis of image motion: a synopsis
of current problems and methods, Int. J. Comput. Vision, Vol. 19, 29{55, 1996.
21. H.H. Nagel, W. Enkelmann, An investigation of smoothness constraints for the estimation of displacement vector elds from images sequences, IEEE Trans. Pattern
Anal. Mach. Intell., Vol. 8, 565{593, 1986.
22. P. Nesi, Variational approach to optical ow estimation managing discontinuities,
Image Vision Comput., Vol. 11, 419{439, 1993.
23. M. Proesmans, L. Van Gool, E. Pauwels, A. Oosterlinck, Determination of
optical ow and its discontinuities using non-linear di usion, J.-O. Eklundh (Ed.),
Computer vision { ECCV '94, Lecture Notes in Computer Science, Vol. 801,
Springer, Berlin, 295{304, 1994.
24. C. Schnorr, Segmentation of visual motion by minimizing convex non-quadratic
functionals, Proc. 12th Int. Conf. Pattern Recognition (ICPR 12, Jerusalem, Oct.
9{13, 1994), Vol. A, IEEE Computer Society Press, Los Alamitos, 661{663, 1994.
25. J. Sporring, M. Nielsen, L. Florack, P. Johansen (Eds.), Gaussian scale-space
theory, Kluwer, Dordrecht, 1997.
26. J. Weickert, Theoretical foundations of anisotropic di usion in image processing,
Computing, Suppl. 11, 221{236, 1996.
27. J. Weickert, Anisotropic di usion in image processing, Teubner-Verlag, Stuttgart,
1998.
28. J. Weickert, On discontinuity-preserving optic ow, S. Orphanoudakis, P. Trahanias, J. Crowley, N. Katevas (Eds.), Proc. Computer Vision and Mobile Robotics
Workshop (CVMR '98, Santorini, Sept. 17{18, 1998), 115{122, 1998.
29. J. Weickert, S. Ishikawa, A. Imiya, Linear scale-space has rst been proposed in
Japan, J. Math. Imag. Vision, Vol. 10, May 1999.

You might also like