You are on page 1of 955

On the roots of a random system of equations.

The theorem of Shub & Smale and some


extensions.
Jean-Marc Azas , azais@cict.fr
Mario Wschebor , wschebor@cmat.edu.uy
August 20, 2004

Abstract
We give a new proof of a theorem of Shub & Smale [9] on the
expectation of the number of roots of a system of m random polynomial equations in m real variables, having a special isotropic Gaussian
distribution. Further, we present a certain number of extensions, including the behaviour as m + of the variance of the number of
roots, when the system of equations is also stationary.

AMS subject classification: Primary 60G60, 14Q99. Secondary: 30C15.


Short Title: Random systems
Key words and phrases: Random polynomials, system of random equations, Rice Formula.

Introduction

Let us consider m polynomials in m variables with real coefficients


Xi (t) = Xi (t1 , ..., tm ), i = 1, ..., m.

Laboratoire de Statistique et Probabilites. UMR-CNRS C5583 Universite Paul


Sabatier. 118, route de Narbonne. 31062 Toulouse Cedex 4. France.

Centro de Matem
atica. Facultad de Ciencias. Universidad de la Rep
ublica. Calle
Igua 4225. 11400 Montevideo. Uruguay.

We use the notation


(i)

a j tj ,

Xi (t) :=

(1)

j di

where j := (j1 , ..., jm ) is a multi-index of non-negative integers, j := j1 +


(i)
(i)
... + jm , j! := j1 !...jm !, tj := tj11 ....tjmm , aj := aj1 ...,jm . The degree of the i-th
polynomial is di and we assume that di 1 i.
Let N X (V ) be the number of roots lying in the subset V of Rm , of the
system of equations
Xi (t) = 0, i = 1, ..., m.
(2)
We will assume throughout that V is a Borel set with the regularity property
that its boundary has zero Lebesgue measure. We denote N X = N X (Rm ).
We will be interested in random real-valued functions Xi (i = 1, ...m), in
which case we will call random fields the Xi s, as well as the Rm -valued
random function X(t) = (X1 (t), ..., Xm (t))T , t Rm . Whenever the Xi s are
polynomials we will say that the Xi s and X are polynomial random fields.
Generally speaking, little is known on the distribution of N X (V ), even
for simple choices of the law on the coefficients. In the case of one equation
in one variable, a certain number of results have been known since a long
time, starting with the work of Marc Kac [6]. See for example the book by
Bharucha-Reid & Sambandham [2]
Shub & Smale [9] computed the expectation of N X when the coefficients
are Gaussian, centered independent random variables with certain specified
variances (see Theorem 3 below and also the book by Blum et al. [3]).
Extensions of their work, including new results for one polynomial in one
variable, can be found in the review paper by Edelman & Kostlan [5], see
also Kostlan [7].
The primary aim of the present paper is to give a new proof of Shub
& Smales Theorem, based upon the so-called Rice formula to compute the
moments of the number of roots of random fields. At the same time, this
permits certain extensions (some of which are already present in the cited
papers by Edelman & Kostlan) to classes of Gaussian polynomials not considered before, and for which some new behaviour of the number of roots can
be observed.
Additionally, in Section 6 we consider non-polynomial systems such that
the lines are independent and the law of each line is centered Gaussian,
2

invariant under isometries as well as translations. Under general conditions,


we are able to estimate a lower bound for the variance of the number of roots
and show that for very general sets, the ratio of the standard deviation over
the mean tends to infinity as the number m of variables tends to infinity.

Rice formulae

In this section we give a brief account without proofs of Rice formulae, contained in the statements of the following two theorems (Azas and Wschebor,
[1]).
Theorem 1 Let V be a compact subset of Rm , Z : V Rm be a random
field and u Rm be a fixed point.
Assume that:
1) Z is Gaussian,
2) x
Z(x) is a.s. of class C 1 ,
3) for each x V , Z(x) has a non degenerate distribution and denote by
pZ(x) its density.
4) P{x V , Z(x) = u, det Z (x) = 0} = 0. Here, V is the interior of
V and Z denotes the derivative of the field Z(.).
5) m (V ) = 0, where V is the boundary of V and m is the Lebesgue
measure on Rm (we will also use dx instead of m (dx)). Then, denoting
NuZ (V ) := {x V : Z(x) = u}, one has
E NuZ (V ) =

E (| det(Z (x))|/Z(x) = u) pZ(x) (u)dx,

(3)

and both members are finite. E(X/.) denotes conditional expectation.


Theorem 2 Let k, k 2 be an integer. Assume the same hypotheses as in
Theorem 1 excepting for 3) that is replaced by the stronger one:
3) for x1 , ..., xk V pairwise different values of the parameter, the distribution of the random vector Z(x1 ), ..., Z(xk ) does not degenerate in (Rm )k

and we denote by pZ(x1 ),...,Z(xk ) its density. Then


E

NuZ (V ) NuZ (V ) 1 ... NuZ (V ) k + 1


k

E
Vk

j=1

| det Z (xj ) |/Z(x1 ) = ... = Z(xk ) = u


pZ(x1 ),...,Z(xk ) (u, ..., u)dx1 ...dxk , (4)

where both members may be infinite.


If one wants to prove Theorem 1, a direct approach is as follows. Assume
that u is not a critical value of Z (This holds true with probability 1 under
the hypotheses of Theorem 1). Put n := NuZ (V ). Since V is compact, n is
finite, and if n = 0, let x(1) , ..., x(n) be the roots of Z(x) = u belonging to
V . One can prove that almost surely x(i)
/ V for all i = 1, ..., n. Hence,
applying the inverse function theorem, if is small enough, one can find in
V open neighborhoods U1 , ..., Un of x(1) ..., x(n) respectively so that:
1. Z is a C 1 diffeomorphism Ui Bm (u, ), the open ball centered in u
with radius , for each i = 1, ..., n.
2. U1 , ..., Un are pairwise disjoint,
3. if x
/

n
i=1

Ui , then Z(x)
/ Bm (u, ).

Using the change of variable formula, we have:


i=1
V

| det Z (x) |1I{

Z(x)u <} dx

=
n

Ui

| det Z (x) |dx = m (Bm (u, ))n.

| det Z (x) |1I{

Hence,
NuZ (V ) = n = lim
0

1
m (Bm (u, ))

Z(x)u <} dx.

(5)

If n = 0, (5) is obvious. Now an informal computation of E(NuZ (V )) can be


performed in the following way:
E NuZ (V ) = lim
0

dx
V

1
m (Bm (u, ))
=
V

Bm (u,)

E | det Z (x) |/Z(x) = y pZ(x) (y)dy

E | det Z (x) |/Z(x) = u pZ(x) (u)dx. (6)


4

Instead of formally justifying these equalities, the proof in Azas and


Wschebor [1] goes in fact through a different path. The proof of Theorem 2
is similar.
For Gaussian fields , an essential simplification in the application of Theorem 1 comes from the fact that, in this case orthogonality implies independence and this is helpful to simplify the conditional expectation in the
integrand.

Main results

We begin with the statement of Shub & Smales Theorem.


Theorem 3 ([9]) Let
(i)

a j tj ,

Xi (t) =

i = 1, ..., m

j di
(i)

Assume that the real-valued random variables aj are independent Gaussian


centered, and
(i)

V ar aj

di
j1 .....jm

di !
j1 !...jm !(di

Then,
E NX =

h=m
h=1 jh )!

(7)

where d = d1 ...dm is the Bezout-number of the polynomial system X(t).


A direct computation shows that under the Shub & Smale hypothesis, the
Xi s are centered independent Gaussian fields, and the covariance function
of Xi is given by
r Xi (s, t) = E (Xi (s)Xi (t)) = (1 + s, t )di ,
where s, t denotes the usual scalar product in Rm .
More generally, assume that we only require that the polynomials random
fields Xi are independent and that their covariances r Xi (s, t) are invariant
under isometries of Rm , i.e. r Xi (U s, U t) = r Xi (s, t) for any isometry U and
(i)
any pair (s, t). This implies in particular that the coefficients aj remain
5

independent for different is but can be now correlated from one j to another
for the same value of i. It is easy to check that this implies that for each
i = 1, ...m, the covariance r Xi (s, t) is a function of the triple ( s, t , s 2 , t 2 )
( . is Euclidean norm in Rm ). It can also be proved (Spivak [10]) that this
function is in fact a polynomial with real coefficients, say Q(i)
r Xi (s, t) = Q(i) ( s, t , s 2 , t 2 ),

(8)

satisfying the symmetry condition


Q(i) (u, v, w) = Q(i) (u, w, v)

(9)

A simple way to construct a class of covariances of this type is to take


Q(i) (u, v, w) = P (u, vw)

(10)

where P is a polynomial in two variables with non-negative coefficients. In


fact, consider the two functions defined on Rm Rm by means of (s, t)
s, t
2
2
and (s, t)
s
t . It is easy to see that both are covariances of polynomial random fields. On the other hand, the set of covariances of polynomial
random fields is closed under linear combinations with non-negative coefficients as well as under multiplication, so that P ( s, t , s 2 t 2 ) is also the
covariance of some polynomial random field.
One can check that using this recipe one cannot construct all the possible covariances of polynomial random fields. For example, the following
polynomial is a covariance (of some polynomial random field).
m+1
1
s, t 2 ( s 2 t 2 ).
m
m
but, if m 2, it can not be obtained from the construction of (10).
The situation becomes simpler if one considers only functions of the scalar
product, i.e.
r(s, t) = 1 +

di
(i)

c k uk .

Q (u, v, w) =
k=0

In this case, it is known that the necessary and sufficient condition for it
to be a covariance is that ck 0 k = 0, 1, ..., di . [Shub & Smale corresponds
to the choice ck = dki ]. Here is a simple proof of this fact using the method
of Box & Hunter [4]. The covariance of the random field
a j tj ,

X(t) =
j d

t Rm

having the form (1), where the random variables aj are centered and in L2
is given by
E (X(s)X(t)) =
j,j sj tj
(11)
j d, j d

where j,j := E(aj aj ). If dk=0 ck s, t


random field as in (11), one can write:
d
j j

ck

j,j s t =
k=0

j d, j d

j =k

is the covariance of a polynomial

k!
(s1 t1 )j1 ... (sm tm )jm =
j!

j ! j j
st
j!

j d

Identifying coefficients, it follows that j,j = 0 if j = j and for each k =


0, 1, ..., d,
j!
(12)
ck = j,j
k!
whenever j = k. This shows that ck 0 since j,j is the variance of the
random variable aj . Reciprocally, if all the ck re positive, defining j,j by
means of (12) and setting i,j = 0 for i = j shows that dk=0 ck s, t k is the
covariance of a polynomial random field.
Notice that the foregoing argument shows at the same time that if the
i
polynomial random field {X(t) : t Rm } is Gaussian and has dk=0
ck s, t k
as covariance function, then its coefficients are independent random variables.
A description of the homogeneous polynomial covariances that are invariant
under isometries has been given by Kostlan [7], part II.
We now state an extension of the Shub & Smale theorem, valid under
more general conditions.
Theorem 4 Assume that the Xi are independent centered Gaussian polynomial random fields with covariances r Xi (s, t) = Q(i) ( s, t , s 2 , t 2 ) (i =
1, ..., m).
(i)
(i)
(i)
Let us denote by Qu , Qw , Quv , ... the partial derivatives of Q(i) and set
(i)

qi (x) :=
(i)

(i)

(i)

Qu
Q(i)

(i)

(i)

(i)

(i) 2

Q(i) Quu + 2Quv + 2Quw + 4Qvw Qu + Qv + Qw


ri (x) :=
(Q(i) )2
7

where the functions in the right-hand sides are always computed at the triplet
(x, x, x).
Put:
ri (x)
.
hi (x) := 1 + x
qi (x)
Then for all Borel sets V with boundary having zero Lebesgue measure, we
have
m
X

E N (V ) = (2)

m/2

qi ( t 2 )

Lm1
V

Here

1/2

Eh ( t 2 )dt.

(13)

i=1

hi (x)i2 )1/2

Eh (x) := E (
i=1

where 1 , ..., m are i.i.d. standard normal in R and


n

Ln :=

Kj
j=1

with Kj = E( j ) with j standard normal in Rj .


Elementary computations give the identities:
Km =

((m + 1)/2)
2
(m/2)

1 m+1 m + 1
Lm = 2 2 (
).
2
2
We define the integral
+

Jm :=
0

m1
d =
(1 + 2 )(m+1)/2

/2

1
Km

that will appear later on. We need also the surface area m1 of the unit
2 m/2
sphere S m1 in Rm , m1 = (m/2)
.
Remark on formula (13). Note that formula (13) takes simpler forms
8

in some special cases. For example, when the functions hi (x) do not depend
on i, denoting by h(x) their common value, we have
Eh (x) =

h(x)Km .

Under the hypothesis that Q(i) (u, v, w) = Qdi (u), we have


2
(x)
(x)
, hi (x) = h(x) = 1 x Q (x)Q(x)Q
. Then, for the
qi (x) = di q(x) = di QQ(x)
Q(x)Q (x)
expectation of the total number of roots i.e. in case V = Rm , using polar
coordinates, we get from the last theorem the formula:

E(N X ) = (2)m/2

m1 q(2 )m/2

d1 ...dm Lm m1
0

2/Km

h(2 )d

m1 q(2 )m/2

d1 ...dm

h(2 )d. (14)

Proof of Theorem 4

Consider the normalized Gaussian fields


Zi (t) :=

Xi (t)
Q(i) ( t 2 , t 2 , t 2 )

1/2

which have variance 1. Denote Z(t) = (Z1 (t) , ..., Zm (t))T . Applying Rice
Formula for the expectation of the number of zeros of Z (Theorem 1):
E N X (V ) = E N Z (V ) =

E (|det(Z (t)| /Z(t) = 0)


V

1
m

(2) 2

dt,

...
where Z (t) := [Z1 (t) .. .. .. Zm (t)] is the matrix obtained by concatenation of
the vectors Z1 (t), ..., Zm (t). Note that since E (Zi2 (t)) is constant, it follows
i
that E Zi (t) Z
(t) = 0 for all i, j = 1, ..., m. Since the field is Gaussian this
tj
implies that Zi (t) and Zi (t) are independent and given that the coordinate
fields Z1 , ...Zm are independent, one can conclude that for each t, Z(t) and
Z (t) are independent. So
E N X (V ) = E N Z (V ) =

E (|det(Z (t)|) dt.

(2) 2

(15)

A straightforward computation shows that the (, )- entry, , = 1, ..., m,


in the covariance matrix of Zi (t) is
E

Zi Zi
(t)
(t)
t t

2
r Zi (s, t) |s=t = ri ( t 2 )t t + qi ( t 2 ) ,
s t

where , denotes the Kronecker symbol. This can be rewritten as


Var Zi (t) = qi Im + ri ttT ,
where the functions in the right-hand side are to be computed at the point
t 2 . Let U be the orthogonal transformation of Rm that gives the coordinates in a basis with first vector tt , we get
Var U Zi (t) = Diag (ri . t
so that
Var

+ qi ), qi , ..., qi

U Zi (t)
= Diag hi , 1, ..., 1

qi

Put now

U Zi (t)

qi

Ti :=
and set

...
T := [T1 .. .. .. Tm ]

We have

| det Z (t) | = | det T |

1/2

qi .

(16)

i=1

Now, we write

T =

W1

Wm

where the Wi are random row vectors. Because of the properties of independence of all the entries of T , we know that :
W2 , ..., Wm are independent standard Gaussian vectors in Rm
10

W1 is independent from the other Wi , i 2, with distribution


N 0, Diag(h1 , ..., hm )
Now E | det(T )| is calculated as the expectation of the volume of the
parallelotope generated by W1 , ..., Wm in Rm . That is,
m

| det(T )| = W1

d(Wj , Sj1 ),
j=2

where Sj1 denotes the subspace of Rm generated by W1 , ..., Wj1 and d denotes the Euclidean distance. Using the invariance under isometries of the
standard normal distribution of Rm we know that, conditioning on W1 , ..., Wj1 ,

the projection PSj1


(Wj ) of Wj on the orthogonal Sj1 of Sj1 has a distri
bution which is standard normal on the space Sj1
which is of dimension
m j + 1 with probability 1. Thus E d(Wj , Sj1 )/W1 , ..., Wj1 = Kmj+1 .
By successive conditionings on W1 , W1 , W2 etc... , we get:
m1

E | det(T )| = E (

hi (x)i2 )1/2
i=1

Kj ,
j=1

where 1 , ..., m are i.i.d. standard normal in R. Using (16) and (15) we
obtain (13) .

5
5.1

Examples
Shub & Smale

In this case we have Q(i) = Qdi with Q(u, v, w) = 1 + u. We get


h(x) = q(x) =

1
,
1+x

and (7) follows from formula (14).


A simple variant of Shub & Smale theorem corresponds to taking Q(i) (u) =
1 + ud for all i = 1, ..., m (here all the Xi s have the same law), which yields
q(x) = qi (x) =

d
dud1
; h(x) = hi (x) =
d
1+u
1 + ud
11

E(N X ) =

2
Km

+
0

md1
d = d(m1)/2
(1 + 2d )(m+1)/2

which differs by a constant factor from the analogous Shub & Smale result
for (1 + u)d which is dm/2 .

5.2

Linear systems with a quadratic perturbation

Consider linear systems with a quadratic perturbation


Xi (s) = i + < i , s > +i s 2 ,
where the i , i , i , i = 1, ..., m are independent and standard normal in R, R
and Rm respectively. This corresponds to the covariance
r Xi (s, t) = 1 + s, t + s 2 t 2 .
If there is no quadratic perturbation, it is obvious that the number of
roots is almost surely equal to 1.
For the perturbed system, applying Theorem 4 and performing the computations required in this case, we obtain:
1
4
(1 + 2x)2
1 + 4x + x2
q(x) =
; r(x) =

; h(x) =
1 + x + x2
1 + x + x2 (1 + x + x2 )2
1 + x + x2
and
Hm
E(N ) =
Jm
X

with Hm =
0

m1 (1 + 42 + 4 ) 2
d.
m
(1 + 2 + 4 ) 2 +1

An elementary computation shows that E(N X ) = o(1) as m + (see the


next example for a more precise behavior). In other words, the probability
that the perturbed system has no solution tends to 1 as m +.

5.3

More general perturbed systems

Let us consider the covariances given by the polynomials


Qi (u, v, w) = Q(u, v, w) = 1 + 2ud + (vw)d .
This corresponds to adding a perturbation depending on the product of the
norms of s, t to the modified Shub & Smale systems considered in our first
example. We know that for the unperturbed system, one has E(N X ) =
12

m1

d 2 . Note that the factor 2 in Q has only been added for computational
convenience and does not modify the random variable N X of the unperturbed
system. For the perturbed system, we get
2dxd1
2d(d 1)xd2
q(x) =
; r(x) =
; h(x) = d.
(1 + xd )2
(1 + xd )2
Therefore,
X

E(N ) =

2
Km

m
2

2d2(d1)
(1 + 2d )2

m1

d d

m+1
2
Km 2m/2 d 2

+
0

md1
d. (17)
(1 + 2d )m

The integral can be evaluated by an elementary computation and we obtain


E(N X ) = 2

m2
2

m1
2

which shows that the mean number of zeros is reduced by the perturbation
at a geometrical rate as m grows.

5.4

Polynomial in the scalar product, real roots

Consider again the case in which the polynomials Q(i) are all equal and the
covariances depend only on the scalar product, i.e. Q(i) (u, v, w) = Q(u). We
assume further that the roots of Q, that we denote 1 , ..., d , are real
(0 < 1 .... d ). We get
d

q(x) =
h=1

1
; r(x) =
x + h

h=1

1
1
; h(x) =
2
(x + h )
qi (x)

h=1

h
.
(x + h )2

It is easy now to write an upper bound for the integrand in (13) and compute
the remaining integral, thus obtaining the inequality
E(N X )
which is sharp if 1 = ... = d .
13

d m/2
d ,
1

If we further assume that d = 2, with no loss of generality Q(u) has the


1
1
form Q(u) = (u + 1)(u + ) with [0, 1]. Replacing q by x+1
+ x+
in
formula (14) we get:
E(N X ) =

2/Km

m1
0

(18)
1
1
+
2
1+
+ 2

(m1)/2

+
2
2
(1 + )
( + 2 )2

1/2

d.

One can compute the limit of the right-hand side as 0. For this purpose,

2
notice that the function (+
and is
2 )2 attains its maximum at =
1
dominated by 42 . We divide the integral in the right-hand member of (18)
into two parts, setting for some > 0

m1

I, :=
0

1
1
+
2
1+
+ 2

(m1)/2

+
2
2
(1 + )
( + 2 )2

1/2

d,

and
+

m1

J, :=

1
1
+
2
1+
+ 2

(m1)/2

+
2
2
(1 + )
( + x2 )2

1/2

d.

By dominated convergence,
+

J,

as 0. On the other hand

22 + 1
2 + 1

(m1)/2

d
,
1 + 2

I, I,
I,

where

I,

:=
0

2
2
+
1 + 2 + 2
/

=
0

d
+
z 2
z 2
+
1 + z 2 (z 2 + 1)

as 0, and

+
I,

:=
0

2
2
+
1 + 2 + 2

(m1)/2

(m1)/2

dz
Jm , (19)
+1

z2

+ 2
d
2
1+
+

22 + 1 (m1)/2 d
+ Jm , (20)

2 + 1
1 + 2
0

(m1)/2

14

as 0. Since is arbitrary, the integral in the right-hand size of (20) can


be chosen arbitrarily small. Using the identity Km Jm = /2, we get
E(N X ) := 1 +
as 0. Since

22
2 +1

<

22 +1
2 +1

1
Jm

22 + 1
2 + 1

d
1 + 2

< 2:

1 + 2(m1)/2 < < 1 +

5.5

(m1)/2

2(m1)/2
.
Jm 2

An analytic example

Our main result can be extended to random analytic functions in an obvious


manner.
Consider the case
Q(i) (u, v, w) = exp di (u + vw)

; di > 0, 0.

(21)

The case di = 1 i = 1, ..., m, = 0 has been treated by Edelman & Kostlan


[5]. We have E(N X ) = + but it is possible to get a closed expression for
E N X (V ) . We have
qi (x) = di ; r(x) = 4di ; h(x) = 1 + 4x.
Hence
E N X (V ) =

gm (t)dt,
V

with
gm (t) =

1
(m + 1)
(m/2 + 1) (4)m/2

d1 ...dm (1+4 t 2 )1/2 =

Lm
(2)m/2

d1 ...dm (1+4 t 2 )1/2 .

Notice that if = 0, the integrand is constant.

Systems of equations having a probability


law invariant under isometries and translations

In this section we assume that Xi : Rm R, i = 1, ...m are independent


Gaussian centered random fields with covariance of the form
15

r Xi (s, t) = i ( t s 2 ), (i = 1, ...m).

(22)

We will assume that i is of class C 2 and, with no loss of generality, that


i (0) = 1.
In what follows, V is a Borel subset of Rm with positive Lebesgue measure
and the regularity property that its boundary has zero Lebesgue measure.
For the computation of the expectation of the number of roots of the system
of equations
Xi (t) = 0, (i = 1, ...m)
that belong to the set V , we may use the same procedure as in Theorem 4,
obtaining:
E N X (V ) = (2)m/2 E | det(X (0))| m (V )
(23)
where we have used that the law of the random field {X(t) : t Rm } is
invariant under translation and that X(t) and X (t) are independent. One
easily computes, for i, , = 1, ..., m
E

Xi
Xi
(0)
(0)
t
t

2 r Xi
s t

t=s

= 2i (0) ,

which implies, again using the same method as in the proof of Theorem 4 :
m

E | det(X (0))| = 2

m/2

Lm
i=1

|i (0)|1/2

and replacing in (23)


m
X

E N (V ) =

m/2
i=1

|i (0)|1/2 Lm m (V ).

(24)

Our next task is to give a formula for the variance of N X (V ) and use it to
prove that -under certain additional conditions - the variance of
nX (V ) =

N X (V )
E N X (V )

- which has obviously mean value equal to 1- grows exponentially when the
dimension m tends to infinity. In other words, one should expect to have large
fluctuations of nX (V ) around its mean for systems having large m. Moreover
16

this exponential growth implies that an exact expression of the variance


would not improve bounds on the probabilities of the kind P{N X (V ) > A}
that follow from (24) and the Markov inequality.
Our additional requirements are the following:
1) All the i coincide : r Xi (s, t) = r(s, t) = i ( t s 2 ) = ( t s 2 ),
i = 1, ..., m,
( t s 2 ) is a covariance for all

2) the function is such that (s, t)


dimensions m.

It is well known [8] that satisfies 2) and (0) = 1 if and only if there exists
a probability measure G on [0, +) such that
+

(x) =
0

exw G(dw) for all x 0.

(25)

Theorem 5 Let r Xi (s, t) = ( t s 2 ) for i = 1, ..., m where is of the


form (25). We assume further that
1. G is not concentrated at a single point and
+
0

x2 G(dx) < .

2. {Vm }m=1,2... is a sequence of Borel sets, Vm Rm , m (Vm ) = 0 and


there exist two positive constants , such that for each m, Vm contains
a ball with radius and is contained in a ball with radius .
Then,
Var nX (Vm ) +,

(26)

exponentially fast as m +.
Proof: To compute the variance of N X (V ) note first that
Var N X (V )
= E N X (V ) N X (V ) 1

+ E N X (V ) E N X (V )

17

, (27)

so that to prove (26), it suffices to show that


E N X (V ) N X (V ) 1
E N X (V )

(28)

exponentially fast as m +. The denominator in (28) is given by formula


(24). For the numerator, we can apply Theorem 2 with k = 2 to obtain:
E N X (V ) N X (V ) 1
=
V V

E |det(X (s)) det(X (t))| /X(s) = X(t) = 0 pX(s),X(t) (0, 0) ds dt,


(29)

where pX(s),X(t) (., .) denotes the joint density of the random vectors X(s), X(t).
Next we compute the ingredients of the integrand in (29). Because of
invariance under translations, the integrand is a function of = t s. We
denote with 1 , ..., m the coordinates of .
The Gaussian density is immediate:
pX(s),X(t) (0, 0) =

1
1
m
2
(2) [1 (

2 )]m/2

(30)

Let us turn to the conditional expectation in (29). We put


E |det(X (s)) det(X (t))| /X(s) = X(t) = 0 = E det(As ) det(At ) ,
where As = ((Asi )), At = ((Ati )) are m m random matrices having as joint
- Gaussian - distribution the conditional distribution of the pair X (s), X (t)
given that X(s) = X(t) = 0. So, to describe this joint distribution we must
compute the conditional covariances of the elements of the matrices X (s)
and X (t) given the condition C : {X(s) = X(t) = 0}. This is easily done
using standard regression formulae:
E
E

Xi Xi
(s)
(s)/C
s
s

Xi Xi
(s)
(t)/C
s
t

2r
s t

t=s

1
r
r
(s, t)
(s, t)
2
1 (r(s, t)) s
s

r
2r
1
r
(s, t) +
(s, t)
(s, t)r(s, t).
2
s t
1 (r(s, t)) s
t

Replacing in our case, we obtain

18

Asi Asi

Ati Ati

=E

= 2 (0)

E Asi Ati = 4 2 4
and for every i = j:

2
,
4
1 2
2
,
1 2

(31)
(32)

E Asi Asj = E Ati Atj = E Asi Atj = 0,


where = ( 2 ), = ( 2 ), = ( 2 ).
Take now an orthonormal basis of Rm having the unit vector as first
element. Then the variance (2m) (2m) matrix of the pair Asi , Ati - the ith
rows of As and At respectively - takes the following form:

U0
.
. | U1
.
.
. V0 . | . V 1 .

..
..

.
.
.
.
.
|
.
.
.

. V0 | .
. V1
.
T =
,
.
. | U0
.
.
U1

. V1 . | . V 0 .

..
..
.
. . | .
. .
.
.
.
. V1 | .
. V0
where

U0 = U 0 (

) = 2 (0) 4

V0 = 2 (0) ;
U1 = U 1 (

V1 = V 1 (

) = 4

) = 2 ;

2 2
;
1 2

2 4

2 2
;
1 2

and there are zeros outside the diagonals of each one of the four blocks. Let
us perform a second regression of Ati on Asi , that is, write the orthogonal
decompositions
t,s
Ati = Bi
+ C Asi (i, = 1, m),

19

t,s
where Bi
is centered Gaussian independent of the matrix As , and

U12
;
U02
V2
t,s
Var(Bi
) = V0 1 12 .
V0

U1
,
U0
V1
,
=
V0

t,s
Var(Bi1
) = U0 1

For = 1, C1 =
For > 1, C
Conditioning we have :

t,s
E | det(As )|| det(At )| = E | det(As )|E | det((Bi
+ C Asi )i,=1,..,m )|/As

with obvious notations. For the inner conditional expectation, we can proceed in the same way as we did in the proof of Theorem 4 to compute the
determinant, obtaining a product of expectations of Euclidean norms of noncentered Gaussian vectors in Rk for k = 1, ..., m. Now we use the well-known
inequality
E +v E

valid for standard Gaussian in Rk and v any vector in Rk , and it follows


that
E | det(As )|| det(At )| E | det(As )| E | det(B t,s )| .
Since the elements of As (resp. B t,s ) are independent, centered Gaussian
with known variance, we obtain:
E det(As ) det(At ) U0 V0m1 1

U12
U02

1/2

V12
V02

(m1)/2

L2m .

Going back to (28) and on account of (24) and (29) we have


E N X (V ) N X (V ) 1
E

N X (V

m (V )

2
V V

1 V12 V02
dsdt
1 2

m/2

H(

(33)
Let us put V = Vm in (33) and study the integrand in the right hand member.
The function
1/2
U02 (x) U12 (x)
H(x) =
V02 V12 (x)

is continuous for x > 0. Let us show that it does not vanish if x > 0.
It is clear that U12 U02 on applying the Cauchy-Schwarz inequality to
the pair of variables Asi1 , Ati1 . The equality holds if and only if the variables
20

).

Asi1 , Ati1 are linearly dependent. This would imply that the distribution - in
R4 - of the random vector
:= X(s), X(t), 1 X(s), 1 X(t)
would degenerate for s = t (we have denoted 1 differentiation with respect
to the first coordinate). We will show that this is not possible. Notice first
that for each w > 0, the function
e

(s, t)

ts

2w

is positive definite, hence the covariance of a centered Gaussian stationary


field defined on Rm , say {Z w (t) : t Rm } whose spectral measure has the
non-vanishing density:
f w (x) = (2)m/2 (2w)m/2 exp

x 2
4w

(x Rm ).

The field {Z w (t) : t Rm } satisfies the conditions of Proposition 3.1 of Azas


& Wschebor [1] so that the distribution of the 4-tuple
w := Z w (s), Z w (t), 1 Z w (s), 1 Z w (t)
does not degenerate for s = t. On account of (25) we have,
+

Var( w )G(dw),

Var() =
0

where integration of the matrix is integration term by term. This implies


that the distribution of does not degenerate for s = t and that H(x) > 0
for x > 0.
We now show that for = 0:
1 V12 ( 2 )V02
>1
1 2( 2)
which is equivalent to
(x) < (0)(x) , x > 0.
The left-hand member of (34) can be written as
(x) =

1
2

w1 exp(xw1 ) + w2 exp(xw2 ) G(dw1 )G(dw2 )


0

21

(34)

and the right-hand member


(0)(x) =

1
2

w1 exp(xw2 ) + w2 exp(xw1 ) G(dw1 )G(dw2 ),


0

so that
(0)(x)+ (x) =

1
2

+
0

(w2 w1 ) exp(xw1 )exp(xw2 ) G(dw1 )G(dw2 ),

which is 0 and is equal to zero only if G is concentrated at a point, which


is not the case. This proves (34). Now, using the hypotheses on the inner
and outer diameter of Vm , the result follows by a compactness argument. .
Remark: On studying the behaviour of the function H(x) as well as the
ratio
1 V12 (x)V02
1 2 (x)
at zero one can show that the result holds true if we let the radius of the
ball contained in Vm tend to zero not too fast as m +.
Similarly, one can let tend to + in a controlled way and use the same
calculations to get asymptotic lower bounds for the variance as m +.

6.1

Acknowledgments

This work was supported by ECOS action U03E01. The authors thank two
anonymous referees for their remarks that have contributed to improve the
final version and for drawing our attention to the paper by Kostlan [7]

References
[1] J-M. Azas and M. Wschebor. On the distribution of the maximum of
a gaussian field with d parameters. Ann. Appl. Probability, to appear,
2004. see also preprint http://www.lsp.ups-tlse.fr/Azais/publi/ds1.pdf.
[2] A. T. Bharucha-Reid and M. Sambandham. Random polynomials. Probability and Mathematical Statistics. Academic Press Inc., Orlando, FL,
1986.

22

[3] L. Blum, F. Cucker, M. Shub, and S. Smale. Complexity and real computation. Springer-Verlag, New York, 1998. With a foreword by Richard
M. Karp.
[4] G. Box and J. Hunter. Multi-factor experimental designs for exploring
response surfaces. Ann. Math. Stat., 28:195241, 1957.
[5] A. Edelman and E. Kostlan. How many zeros of a random polynomial
are real? Bull. Amer. Math. Soc. (N.S.), 32(1):137, 1995.
[6] M. Kac. On the average number of real roots of a random algebraic
equation. Bull. Amer. Math. Soc., 49:314320, 1943.
[7] E. Kostlan. On the expected number of real roots of a system of random
polynomial equations. In Foundations of computational mathematics
(Hong Kong, 2000), pages 149188. World Sci. Publishing, River Edge,
NJ, 2002.
[8] I. J. Schoenberg. Metric spaces and completely monotone functions.
Ann. of Math. (2), 39(4):811841, 1938.
[9] M. Shub and S. Smale. Complexity of Bezouts theorem. II. Volumes
and probabilities. In Computational algebraic geometry (Nice, 1992),
volume 109 of Progr. Math., pages 267285. Birkhauser Boston, Boston,
MA, 1993.
[10] M. Spivak. A comprehensive introduction to differential geometry. Vol.
V. Publish or Perish Inc., Wilmington, Del., second edition, 1979.

23

Computation of Very High Dimensional integrals


by Quasi Monte-Carlo methods
Jean-Marc Azas
Universite de Toulouse,
IMT, LSP, F31062 Toulouse Cedex 9, France
Email: azais@cict.fr
Let X = (X1 , . . . , Xn ) a Gaussian vector of dimension n. Many problems of
simultaneous statistical inference correspond to the computation of the probability of X to be in an hyper-rectangle.
The evaluation of such a probability for n of the size of (say) 1000 seems
very difficult. In can be conducted in two steps: the first uses elementary
conditioning and the change of variable formula to transform the probability
into an integral over the unit hyper-cube:
h(t)dt.
[0,1]n

This transformation already implies an important reduction of variance if a


Monte-Carlo (MC) method is used.
The second step uses an evaluation on the integral over the hyper-cube using
a lattice rule that generate low discrepancy sequences.
More precisely, let n be prime and let Z1 be a nice integer sequence in Nn ,
the rule consist of choosing
ti =

i.z
M

and computing I =

1
M

h(ti )
i=1

where the notation


means that we have taken the fractional part componentwise. M is chosen prime.
Theorem 1 (Nuyens and Cools, 2006) Assume that h is the tensorial product
of periodic functions that belong to a Koborov space (RKHS). Then the minimax
sequence and the worst error can be calculated by a polynomial algorithm.
This result concerns the worst case so in many cases, the convergence is faster.
Numerical results show it is roughly O(M 1 ) thus much faster than MC.
If h does not satisfies the conditions of the preceding theorem we can still hope
QMC to be faster than MC and a reliable estimation the estimation error can
1

be obtained by adding a Monte-Carlo step:


Let (ti , i) be the lattice sequence, the way of estimating the integral can be
turn to be random but exactly unbiased by setting
M

I(U ) = 1/M

ti + U

i=1

where U is uniform on [0, 1]n . It is clear that E I(U ) = I and general considerations on QMC integration imply that I(U ) has small variance.
So we can make N independent replications of this calculation, computing
I = 1/N (I(U1 ) + + I(UN )
and construct Student-type confidence intervals. This interval is correct whatever the properties of the function h are. In practice N
is chosen rather small
(12) so that the MC implies roughly a loss of speed of 12 with respect to a
pure QMC method. But on the other hand we have a reliable estimation of
error.
We will present some numerical applications and also application to design of
experiments in large dimension making a comparison with LHS and Othogonal
Arrays.

References
Azas Genz, A. (2009),Computation of the distribution of the maximum of
stationary Gaussian sequences and processes. In preparation
Allan Genz web site http://www.math.wsu.edu/faculty/genz/homepage
Genz, A. (1992), Numerical Computation of Multivariate Normal Probabilities,
J. Comp. Graph. Stat. 1, pp. 141150.
Nuyens, D., and Cools, R. (2006), Fast algorithms for component-by-component
construction of rank-1 lattice rules in shift-invariant reproducing kernel
Hilbert spaces, Math. Comp 75, pp. 903920.

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series

Computing the maximum of random processes


and series
Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State

University ; Cecile
Mercadier Lyon, France and Mario Wschebor
Universite de Toulouse

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

1 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series

Introduction

MCQMC computations of Gaussian integrals


Reduction of variance
MCQMC

Maxima of Gaussian processes

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

2 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Introduction

The lynx data


7000

6000

5000

4000

3000

2000

1000

20

40

60

80

100

120

Annual record of the number of the Canadian lynx trapped in the


Mackenzie River district of the North-West Canada for the period
1821 - 1934, (Elton and Nicholson, 1942)
Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

3 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Introduction

After passage to the log and centering


3

4
0

20

40

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

60

80

100

120

4 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Introduction

Testing
The maximum of absolute value of the series is 3.0224. An estimation
of the covariance with WAFO gives
2

1.5

0.5

0.5

1.5
0

20

40

60

80

100

120

Can we judge the significativity of this quantity ?


Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

5 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Introduction

We assume the series is Gaussian.


Let the Gaussian density in R114 . We have to compute
3.0224

(x1 , . . . , x114 )dx1 , . . . , dx114


3.0224

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

6 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

MCQMC computations of Gaussian integrals


Reduction of variance

Let us consider our problem in a general setting. is a n n


covariance matrix
u
u
1

I :=
l1

(x)dx

(1)

ln

By conditioning or By Choleski decomposition we can write


x1 = T11 z1
x2 = T12 z1 + T22 z2
.....................................
Where the Zi s are independent standard. Integral I becomes
u1 /T11

I :=

(z1 )dz1
l1 /T11

u2 T12 z1
T22
(z2 )dz2
l2 T12 z1
T22

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

(2)

7 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

MCQMC computations of Gaussian integrals


Reduction of variance

Now making the change of variables ti = (zi )


1

1 (u1 /T11 )

I :=

dt1
1 (l1 /T11 )

u2 T12 1 (t1 )
T22
dt2
l2 T12 1 (t1 )
T22

(3)

And by a final scaling this integral can be written as an integral on the


hypercube [0, 1]n .
I :=

h(t)dt.

(4)

[0,1]n

At this stage, if form (4) is evaluated by MC it corresponds to an


important reduction of variance (102 , 103 ) with respect to the form
(1). The transformation up to there is elementary but efficient.

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

8 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


MCQMC computations of Gaussian integrals
MCQMC

QMC
In the form (4) the MC evaluation is based on
M

h(ti )

I = 1/M
i=1

it is well known that its convergence is slow : O(M 1/2 ).


The Quasi Monte Carlo Method is based on the of searching
sequences that are more random than random. A popular method is
based on lattice rules. Let Z1 be a nice integer sequence in Nn , the
rule consist of choosing
i.z
ti =
,
M
where the notation
means that we have taken the fractional part
componentwise. M is chosen prime.
Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

9 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


MCQMC computations of Gaussian integrals
MCQMC

Theorem
(Nuyens and cools, 2006) Assume that h is the tensorial product of
periodic functions that belong to a Koborov space (RKHS). Then the
minimax sequence and the worst error can be calculated by a
polynomial algorithm. Numerical results show that the convergence is
roughly O(M 1 ).
This result concerns the worst case so it is not so relevant

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

10 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


MCQMC computations of Gaussian integrals
MCQMC

A meta theorem

If h does not satisfies the conditions of the preceding theorem we can


still hope QMC to be faster than MC

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

11 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


MCQMC computations of Gaussian integrals
MCQMC

MCQMC
Let (ti , i) be the lattice sequence, the way of estimating the integral
can be turn to be random but exactly unbiased by setting
M

I = 1/M

ti + U

i=1

where U is uniform on [0, 1]n .


By the meta theorem I has small variance.
So we can make N independent replications of this calculation and
construct Student-type confidence intervals. It is correct whatever the
properties of the function h are.
N must be chosen small : in practical 12.

Conclusion : At the cost of a small loss in speed ( 12 ) we have a


reliable estimation of error.
Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

12 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


MCQMC computations of Gaussian integrals
MCQMC

This method has been used to construct confidence bands for


electrical load curves prediction. Azas, Bercu, Fort, Lagnoux L e
(2009)

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

13 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

Do processes exist ?
In this part X(t) is a Gaussian process defined on a compact interval
[0, T].
Since such a process is always observed in a finite set of times and
since the previous method work with say n = 1000, is it relevant to
consider continuous case ?
Answer yes : random process occur as limit statistics. Consider for
example the simple mixture model
H0 : Y N(0, 1)
H1 : Y pN(0, 1) + (1 p)N(, 1) p [0, 1], M R

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

(5)

14 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

Theorem (Asymptotic distribution of the LRT)


Under some conditions the LRT of H0 against H1 has , under H0 , the
distribution of the random variable
1
sup {Z 2 (t)},
2 tM

(6)

where Z(.) is a centered Gaussian process covariance function


r(s, t) =

est 1
es2 1

et2 1

In this case there is no discretization.

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

15 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Maxima of Gaussian processes

The record method

IE X (t)+ 1IX(s)u,s<t X(t) = u)pX(t) (u)dt

P{M > u} = P{X(0) > u} +


0

(7)

after discretization of [0, T], Dn = {0, T/n, 2T/n, . . . , T} Then


P{sup X(t) > u} P{M > u} P{X(0) > u}
tDn
T

IE X (t)+ 1IX(s)u,s<t,sDn X(t) = u)pX(t) (u)dt

(8)

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

16 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

Now the integral is replaced by a trapezoidal rule using the same


discretization. Error of the trapezoidal rule is easy to evaluate .
Moreover that the different terms involved can be computed in a
recursive way.

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

17 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

An example

Using MGP written by Genz , let us consider the centered stationary


Gaussian process with covariance exp(t2 /2)
[ pl, pu, el, eu, en, eq ] = MGP( 100000, 0.5, 50,
@(t)exp(-t.2 /2), 0, 4);
pu upper bound with
eu = estimate for total error,
en = estimate for discretization error, and
eq = estimate for MCQMC error ;
pl lower bound
el = error estimate (MCQMC)

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

18 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

Extensions

Treat all the cases : maximum of the absolute value, non centered,
non-stationary. In each case some tricks have to be used.
A great challenge is to use such formulas for fields .

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

19 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

References

Level Sets and Extrema


of Random Processes
and Fields
Jean-Marc Azas and Mario Wschebor

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

20 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

Azas Genz, A. (2009),Computation of the distribution


of the maximum of stationary Gaussian sequences and
processes. In preparation
Allan Genz web site
http ://www.math.wsu.edu/faculty/genz/homepage
Mercadier,C. (2005). MAGP tooolbox,
http ://math.univ-lyon1.fr/ mercadier/
Mercadier, C. (2006), Numerical Bounds for the
Distribution of the Maximum of Some One- and
Two-Parameter Gaussian Processes, Adv. in Appl.
Probab. 38, pp. 149170.
Nuyens, D., and Cools, R. (2006), Fast algorithms for
component-by-component construction of rank-1 lattice
rules in shift-invariant reproducing kernel Hilbert
spaces, Math. Comp 75, pp. 903920.
Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

21 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

THANK-YOU
MERCI
GRACIAS

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

22 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

THANK-YOU
MERCI
GRACIAS

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

22 / 22

On the tails of the distribution of the maximum of a


smooth stationary Gaussian process
Jean-Marc Bardet 1, Mario Wschebor 2
Laboratoire de Statistique et de Probabilites, URA CNRS 745,
Universite Paul Sabatier, 118 route de Narbonne,
31062 Toulouse Cedex, France.
2
Centro de Matematica, Facultad de Ciencias, Universidad de la Republica,
Calle Igua 4225, 11400 Montevideo, Uruguay.
March 24, 2000
1

Abstract

This paper deals with the asymptotic behavior when the level tends to +1, of
the tail of the distribution of the maximum of a stationary Gaussian process on a
xed interval of the line. For processes satisfying certain regularity conditions, we
give a second order term for this asymptotics.

Mathematics Subject Classi cation (1991): 60Gxx, 60E05, 60G15, 65U05.


Key words: Tail of Distribution of the Maximum, Stationary Gaussian processes
Short Title: Distribution of the Maximum.

Introduction
X = fX (t); t 2 0; T ]g, T > 0 is a real-valued centered stationary Gaussian process with

continuous paths and MT = tmax


X (t). We denote F (u) = P (MT u) the distribution
2 ;T
function of the random variable MT , r(t) = E fX (s)X (s + t)g the covariance function
and k (k = 0; 1; 2; :::) the spectral moments of the process, whenever they are de ned.
With no loss of generality we will assume that = r(0) = 1:
Under certain regularity conditions, Piterbarg (1981, Theorem 2.2.) proved that for
each T > 0 and any u 2 R:
0

B exp ? 1 u+

1 ? (u) + 2 T (u) ? P (MT > u)


2

(1)

for some constants B > 0 and < 1: (respectively ) denotes the standard normal
distribution (respectively density).
The aim of this paper is to improve the description of the asymptotic behavior of
P (MT > u) as u ! +1 that follows from (1) replacing the bound for the error by
an equivalent as u ! +1. More precisely, under the regularity conditions required in
Theorem 1.1, we will prove that:
r

P (MT >u)=1 ? (u) + 2 T (u)? 2T ( ? )


2

2
2

"

3 1?

!#

? u
4

2
2

1 + o(1)]

(2)
This contradicts Theorem 3.1. in Piterbarg's paper in which a di erent equivalent is
given in case T is small enough (see also Aza s e t al. (1999)).
We will assume further that X has C sample paths (this implies < 1) and that
for every n 1 and pairwise di erent values t ; ::; tn in 0; T ], the distribution of the set of
5n random variables (X j (t ); ::; X j (tn); j = 0; 1; 2; 3; 4) is non-degenerate. A su cient
condition for this to hold is the spectral measure of the process not to be purely atomic
or, if it is purely atomic, that the set of atoms have an accumulation point in the real
line (A proof of this facts can be done in the same way as in Chap. 10 of Cramer and
Leadbetter, 1967).
If is a random vector with values in Rn whose distribution has a density with respect
to Lebesgue measure, we denote by p (x) the density of at the point x 2Rn . 1C denotes
the indicator function of the set C .
If Y = fY (t) : t 2 Rg is a process in L we put ?Y (s; t) for its covariance function and
i j Y
?Yij (s; t) = @@si@t? j (s; t) for the partial derivatives, whenever they exist.
The proof of (2) will consist in computing the density of the distribution of the
random variable MT and studying its asymptotic behavior as u ! +1. Our main tool is
4

( )

( )

the following proposition which is a special case of the di erentiation Lemma 3.3 in Aza s
and Wschebor (1999):
Proposition 1.1 Let Y be a Gaussian process with C paths and such that for every
n 1 and pairwise di erent values t ; ::; tn in 0; T ] the distribution of the set of 3n
random variables (Y j (t ); ::; Y j (tn); j = 0; 1; 2) is non-degenerate. Assume also that
E fY (t)g = 0; ?Y (t; t) = E fY (t)g = 1.
Then, if is a C -function on 0; T ],
2

( )

( )

d P (Y (t) u (t); 8t 2 0; T ]) = LY (u; ) + LY (u; ) + LY (u; ); with


du
LY (u; )= (0)P (Y `(s) u `(s); 8s 2 0; T ]):pY ( (0)u)

(3)

LY (u; )= (T )P (Y a(s) u a(s); 8s 2 0; T ]):pY T ( (T )u)

(4)

(0)

( )

Z T

L (u; )=? (t)E (Y t(t)? t(t)u)1fY t s


Y

( )

u t (s);8s2 0;T ]g

p Y t ;Y
( ( )

(t))

( (t)u; 0(t)u)dt: (5)

Here the functions `; a; t and the (random) functions Y `; Y a ; Y t are the continuous
extensions to 0; T ] of:
?
?
`
(s) = 1 (s) ? ?Y (s; 0) (0) ; Y `(s) = 1 Y (s) ? ?Y (s; 0)Y (0) for 0 < s T; (6)

1 (s)??Y (s; T ) (T ) ; Y a(s)= 1 ?Y (s)??Y (s; T )Y (T ) for 0 s< T; (7)


(s)= T?
s
T?s
t

(s) = (s ?2 t)

Y
(s) ? ?Y (s; t) (t) ? ??Y ((t;t; st)) 0(t)
10

11

Y
Y t(s) = (s ?2 t) Y (s) ? ?Y (s; t)Y (t) ? ??Y ((t;t; st)) Y (t)
0

10

11

0 s T; s 6= t;
0 s T; s 6= t:

(8)
(9)

We will repeatedly use the following Lemma. Its proof is elementary and we omit it.
Lemma 1.1 Let f and g be real-valued functions of class C de ned on the interval 0; T ]
of the real line verifying the conditions:
1) f has a unique minimum on 0; T ] at the point t = t , and f 0(t ) = 0; f "(t ) > 0:
2) Let k = inf j : g j (t ) 6= 0 and suppose k = 0 ; 1 or 2.
De ne
Z T
h(u) =
g(t) exp ? 21 u f (t) dt:
Then, as u ! 1:
Z
k (t ) 1
1
g
xk exp ? 41 f "(t )x dx;
h(u) t k! uk exp ? 2 u f (t )
J
where J = 0; +1) ; J = (?1; 0] or J = ]?1; +1 according as t = 0; t =
T or 0 < t < T respectively.
2

( )

( )

+1

We now turn to our result.

Theorem 1.1 Let X = fX (t) : t 2 0; T ]g be a Gaussian centered stationary process with

C -paths, covariance r(:), = 1, and such that for every n 1 and pairwise di erent
t ; ::; tn in 0; T ], the distribution of the set of 5n random variables (X j (t ); ::; X j (tn); j =
0; 1; 2; 3; 4) is non-degenerate. We shall also assume the additional hypothesis that r0 < 0
in a set dense in 0; T ].
4

( )

( )

Then (2) holds true.


Proof.

We divide the proof into several steps.

Step 1. Proposition1.1 applied to the process Y = X and the function (t) = 1 for all
t 2 0; T ] enables to write the density pMT of the distribution of the maximum MT as:
pMT (u) = A (u) + A (u) + A (u)] : (u); with
(10)
1

A (u) = P (X `(s) u `(s); 8s 2 0; T ]);


1

A (u) = P (X a(s) u a(s); 8s 2 0; T ]);


2

Z T
1
E (X t(t)? t(t)u)1fX t s u t s ;8s2 ;T g dt:
A (u) = ? p
2
Since X is a stationary process and (t) 1, it follows that the processes X and
Xe - de ned as Xe (t) = X (T ? t) - have the same law, so that P (X (s) u for all
s 2 0; T ] jX (0) = u) = P (X (s) u for all s 2 0; T ] jX (T ) = u). Hence, A (u) = A (u).
3

( )

( )

Step 2.

We now consider A (u) and write it in the form:


1

A (u) = P (Y (s) u (s); 8s 2 0; T ])


where Y is the continuous extension to 0; T ] of:
`
`
Y (s) = n X (s) o = s:X (s) 1 s 2]0; T ]
1 ? r (s)] 2
(E X `(s)] ) =
1

1 2

and

? r(s) s 2 0; T ]:
(s) = n (s) o = 11 +
r(s)
(E X `(s)] ) =
`

1 2

Thus, Proposition 1.1 can be applied and:

d A (u) = LY (u; ) + LY (u; ) + LY (u; ):


du
1

(11)

LY (u; ) = 0 because (0) = 0.


1

LY (u; ) = (T )P (Y a(s) u a(s); 8s 2 0; T ]) ( (T )u) , with


a
(s) = 1 + r(T ) ? r(s) ?pr(T ? s) s 2]0; T
(T ? s)(1 + r(T )) 1 ? r (s)
2

0 (T )
r
p
(0) =
T (1 + r(T ))

0 and

(T ) =

On the other hand:

(1 + r(T )) 1 ? r (T )
2

0:

P Y a(s) u a(s); 8s 2 0; T ]

check that

r0(pT )

P Y a(T ) u a(T ) ;

0
E (Y a(T )) ) = E (Y 0(T )) = (1 ?(1r ?(Tr))(?T ))(r (T )) ) ;
so that, since the non-degeneracy hypothesis implies that for each T > 0, E (Y 0(T )) ) is
2

non-zeo, it follows that the numerator in the right-hand member is stricly positive for
T > 0.
Hence,
?
P Y a(s) u a(s); 8s 2 0; T ] ( (T )u) C (T ) exp ? u2 F (T ) ;
with C (T ) > 0, where F (t) is the function
2

(12)

F (t) = (1 ? r(1(?t))r(?t))(r0(t)) for t 2 ?T; T ]; t 6= 0:


2

which is well de ned since the denominator does not vanish because of the previous
remark.
The following properties of the function F are elementary and will be useful in our
calculations.
(a) F has a continuous extension at t = 0.
(b) F (t) > F (0) = ? for t 6= 0 because:
0
0
r00(t))(1 ? r(t)) and
* F 0(t) = 2 (1 ? r(t))( r ((1t)((?rr(t())t))??((r0(?
t)) )
* r0(t) < 0 for t 2 A 0; T ] with A dense in 0; T ], and
2
2

2
2

2 2

* For t 6= 0,
(r0(t)) ? ( ? r00(t))(1 ? r(t)) =
?
(E (X 0(t) ? X 0(0))(X (t) ? X (0))) ? E (X 0(t) ? X (0))
2

E (X (t) ? X (0)) < 0,

from Cauchy-Schwartz inequality (and non-degeneracy hypothesis).

(c) F 0(0) = 0:
(d) F 00(0) = 9(( ? ? ) ) .
2

2
4
2 2
2

From (12) and (b), it follows that

LY (u; ) D(T ) exp ? u2


with D(T ) > 0 and (T ) > 0 for all T > 0.

L (u; ) = ?
Y

Z T

with

p Y t ;Y
( ( )

(t)E (Y t(t)? t(t)u)1fY t s

( )

2
2

(T ) +

pY

u t (s);8s2 0;T ]g

(t))

(13)

(t);Y 0 (t))

(u (t); u 0(t))dt;

0
(t) + E f((Y(0t())t)) g
(u (t); u 0(t)) = 2 (E f(Y10(t)) g) = exp ? u2
= 2 (E f(Y10(t)) g) = exp ? u2 F (t) :
2

2
2

1 2

1 2

2
2

We know that t2min;T F (t) = F (0) = ? .


Also check that:
p
d t(t) = 1
(0) = 0; 0(0) = 21
> 0; (0) = 0; lim
t!
dt
12
and E f(Y (0)) g > 0 from the non-degeneracy condition.
As a consequence:
0

2
2

? >0
?
2
4

6
4

2
2

Z T

(t)E Y t(t)1fY t s

( )

p Y t ;Y

u t (s);8s2 0;T ]g

( ( )

(t))

(u (t); u 0(t))dt

Z T
t (t)) g) =
u
(
E
f
(
Y
(t) 2 (E f(Y 0(t)) g) = exp ? 2 F (t) dt C (t) exp ? u2 F (t) dt;
where C is a positive constant. Also, from Lemma 1.1 and the properties of the functions
F and , one gets:
Z T
(t) exp ? u2 F (t) dt C u1 exp ? u2 ?
:
C a positive constant. Hence,
Z T

1 2

1 2

2
2

2
2

Z T

(t)E Y t(t)1fY t s

( )

u t (s);8s2 0;T ]g

p Y t ;Y
( ( )

(t))

(u (t); u 0(t))dt =

= O u1 exp ? u2

2
2

2
2

(14)

On the other hand,


Z T
0

(t)E

(t)1fY t s

( )

Z T
2
0

p Y t ;Y

u t (s);8s2 0;T ]g

( ( )

(t))

(u (t); u 0(t))dt

(t) t(t) exp ? u2 F (t) dt;


2

where A a positive constant. Since the function g(t) = (t) t(t) veri es g(0) = g0(0) = 0,
and g00(0) 6= 0, Lemma 1.1 implies:
2

Z T
0

(t)E

(t)1fY t s

( )

pY

u t (s);8s2 0;T ]g

= O u1 exp ? u2

(t);Y 0 (t))

(u (t); u 0(t))dt =

2
2

2
2

(15)

From (14) and (15) follows:

LY (u; ) = O u1 exp ? u2

2
2

2
2

(16)

and from (13) and (16):

d A (u) = O 1 exp ? u
:
(17)
du
u
2 ?
Further, observe that since Y is continuous and (s) > 0 for s 2 ]0; T ], (0) = 0,
if Y (0) > 0 the event fY (s) u (s); 8s 2 0; T ]g does not occur for positive u, and if
Y (0) < 0, the same event occurs if u is large enough. This implies that
2

2
2

2
2

A (u) = P (Y (s) u (s); 8s 2 0; T ]) ! P (Y (0) < 0) = 21 as u ! +1


1

and so,

A (u) ? 12 = ?
1

Z
u

d A (v)dv = O 1 exp ? u
dv
u
2

2
2

2
2

(18)

on applying (17).
Step 3.

We will now give an equivalent for A (u): Introduce the following notations:
3

for t 2 ]0; T , Zt(s) is the centered Gaussian process


t
Zt(s) = (E f(XXt((ss))) g) = ; s 2 0; T ]:
2

for t 2 0; T ],

1 2

(s)
(1 ? r(t ? s))
p
=
t(s)=
t
=
(E f(X (s)) g)
(1 ? r (t ? s)) ? (r0(t ? s))
t

1 2

= F (t ? s) for s 2 0; T ]; s 6= t
and
t

(t) = p

2
4

2
2

Hence,
0

F 0(0) = 0:
(t) = p
2 F (0)

B (u; t) = P (Zt(s) u t(s); 8s 2 0; T ]);


Be (u; t) = E X t (t)1fZt s u t s ;8s2 ;T g
3

so that

( )

( )

Z T

Z T
1
B (u; t) dt ? p
Be (u; t) dt = S (u) ? T (u): (19)
A (u) = u 2
2
We will consider in detail the behavior of the rst term as u ! +1.
2

We apply again Proposition 1.1 to compute the derivative of B (u; t) with respect to u.
For t 2 ]0; T :
3

d B (u; t) = LZt (u; ) + LZt (u; ) + LZt (u; ); where


t
t
t
du
LZt (u; t) = t(0)P (Zt`(s) u t`(s); 8s 2 0; T ]) ( t(0)u). Then :
p
LZt (u; ) p1 F (t)exp ? u F (t) ;
3

so that as u ! +1 :

Z T
0

LZt (u; t)dt C u1 exp ? u2 F (0)


2

(20)

for some constant C .


1

* LZt (u; t) = t(T )P (Zta(s) u ta(s); 8s 2 0; T ]) ( t(T )u). In the same way:
p
LZt (u; t) p1 F (T ? t) exp ? u2 F (T ? t) :
2
and:
Z T
LZt (u; t)dt C u1 exp ? u2 F (0) :
for some constant C .
2

(21)

Z T

LZt (u; t) =?
3

(x)E (Ztx(x)? tx(x)u)1fZtx s

( )

u tx (s);8s2 0;T ]g

p Zt x ;Zt x (u t(x); u t0(x))dx:


(

( )

( ))

We rst consider the density in the integrand:


p Zt x ;Zt x (u t(x); u t0(x)) =
(

=
De ne

( )

( ))

u F (x ? t) +
(F 0(x ? t))
1
exp
?
2
4F (x ? t)E f(Zt0(x)) g
E f(Zt0(x)) g
2

(22)

0
Gt (x) = F (x ? t) + 4F (x(?F t()xE?f(tZ))0(x)) g :
2

Check that
min Gt(x) = Gt(t) = F (0);

G0t(t) = F 0(0) = 0:

x2 0;T ]

Moreover, for x 2 0; T ] one has:


x ? t)F 00(0)) + O((x ? t) );
Gt(x) = F (0) + (x ?2 t) F 00(0) + 4F(((0)
E f(Zt0(t)) g
and thus
00
G00t (t) = F 00(0) + 2F (0)(FE f(0))
(Zt0(t)) g :
Also,
X t (x) = (x ?2 t) (X (t) + (x ? t)X 0(t) + (x ?2 t) X 00(t) + (x ?6 t) X 000(t)) ? ::
? X (t)(1 ? (x ? t) ) ? X 0(t)(? (x ? t) + (x ?6 t) ) + O((x ? t) );
2

and

Zt(x) =

1
(X 00(t) + X (t)) +
? + O((x ? t) )
t) ( X 000(t) + X 0(t)) + O((x ? t) ):
+ (x ?
3

p
2

2
2

It follows that:

00(0)
E (Zt0(t)) = 9 ( ?? ) = FF (0)
;
2

and

2
4
2
2

G00t (t) = 23 F 00(0) = 6(( ? ? ) ) :


2

We also have:

2
4
2 2
2

? (x; s) 0(x) :
Zt
t(s) ? ? (x; s): t(x) ? Zt
(s ? x)
? (x; x) t
Zt
2
(t; s) 0(t) =
?
t(s) =
Z
t
t(s) ? ? (t; s): t(t) ? Zt
t
(t ? s)
? (t; t) t
p
p
= (t ?2 s) F (t ? s) ? F (0):E fZt(t)Zt(s)g for s 6= t;
F 00(0) + pF (0)E n(Z 0(t)) o = 3 pF 00(0) = ( ? ) > 0;
t
(t t) = 21 p
t
2 F (0) 6( ? ) =
F (0)
where the last inequality is a consequence of the non-degeneracy condition.
Note that since E fZt(t)Zt(s)g 1,
2 pF (t ? s) ? pF (0) for s 6= t;
t
t (s)
(t ? s)
so that
inf tt(s) > 0:
s;t2 ;T
x
t

(s) =

Zt
10

11

10

11

2
4
2 3 2
2

On the other hand, it is easy to see that tx(s) is a continuous function of the triplet
(x; t; s) and a uniform continuity argument shows that one can nd > 0 in such a
way that if jx ? tj
then
x
c > 0 for all s 2 0; T ]:
t (s)
Thus, for jx ? tj , using the Landau-Shepp-Fernique inequality (see Fernique,
1974):
x
? x
t (x) t (x)u
t (x)
x
x s u x s ;8s2 ;T g = ? p
p
(1 + R)
E
(
Z
(
x
)
?
(
x
)
u
)1
f
Z
t
t
t
t
E f(Zt0(x)) g
E f(Zt0(x)) g
( )

where R

( )

: exp(? :u ) and ; positive constants independent of t; x and u.


2

10

So,
Z T
t(
p

x) tx(x) (1 + R) exp ? 1 u G (x) dx:


2 t
E f(Zt0(x)) g
Using the fact that B (+1; t) = 1 for every t 2 0; T ] we can write:
L (u; t) = 2u
Zt
3

(23)

S (u) = u 2 T ? u 2
2

Z T
0

dt

+1

d B (v; t) dv:
dv
3

The same method of Lemma 1.1, plus


t
3 pF "(0)F (0)
t(t) t (t)
p
=
E f(Zt0(t)) g 2
and (20), (21), (23), (22) show that
2

S (u) = u 2 T ? (1 + o(1)) 2T

3 ( ? ) exp ? u
2

2
2

2
2

2
2

(24)

The second term in (19) can be treated in a similar way, only one should use the full
statement of Lemma 3.3 in Aza s and Wschebor (1999) instead of Proposition 1.1, thus
obtaining:
T (u) = O( u1 ): exp ? u2 ? :
(25)
Then, (24) together with (25) imply that as u ! +1:
2

2
2

A (u) = u 2 T ? (1 + o(1)) 2T
3

2
2

3 ( ? ) exp ? u
2
2
2

2
2

2
4

2
2

(26)

Replacing (18), (26) into (10) and integrating, one obtains (2).

Acknowledgment. The authors thank Professors J-M. Aza s, P. Carmona and C. Del-

mas for useful talks on the subject of this paper.

References

Aza s, J-M., Cierco-Ayrolles, C. and Croquette, A. (1999). Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process.
ESAIM Probab. Statist., 3, 107-129.
Aza s, J-M. and Wschebor, M. (1999). On the Regularity of the Distribution of the
Maximum of One-parameter Gaussian Processes. Submitted.

11

Cramer, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J.
Wiley & Sons, New-York.
Fernique, X. (1974). Regularite des trajectoires des fonctions aleatoires gaussiennes. Ecole
d'Ete de Probabilites de St. Flour. Lecture Notes in Mathematics, 480, Springer-Verlag.
New-York.
Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian
processes. Th. Prob. Appl., 26, 687-705.

12

Condition Numbers and Extrema of Random Fields.


J.A. Cuesta-Albertos
Departamento de Matematicas, Estadstica y Computacion,
Universidad de Cantabria. Santander. Spain
E-mail: cuestaj@unican.es
Mario Wschebor
Centro de Matematica, Facultad de Ciencias
Universidad de la Republica. Montevideo. Uruguay.
E-mail: wschebor@cmat.edu.uy
May 20, 2003

Introduction.

Let A be an m m real matrix. We denote by


A = sup Ax
x =1

its Euclidean operator norm, where we denote by v the Euclidean norm of v Rm . If


A is non-singular, its condition number (A) is dened by
(A) = A

A1 .

The role of (A) in Numerical Linear Algebra has been recognized since a long time
[11], [12], [13] as well as its importance in the evaluation of algorithm complexity [5], [7].
(A) measures, to the rst order of approximation, the largest expansion in the relative
error of the solution of the m m linear system of equations
Ax = b

(1)

when its input is measured with error.


In other words, log2 (A) is the loss of precision in the solution x of (1) due to illconditioning of A, measured in number of places in the nite binary expansion of x.
A natural problem is trying to understand the behaviour of (A) when the matrix A
is chosen at random, that is, to estimate the tail
P [(A) > x], for each x R+ ,
This author has been partially supported by the Spanish Ministerio de Ciencia y Tecnolog
a, grants PB98-0369-C02-02
and BFM2002-04430-C02-02.

(where P is the probability dened on the probability space in which A is dened) or the
moments of the random variable (A). Of course, a priori this will depend on the meaning
of choosing A at random, that is, which is the probability distribution of A. A typical
result is the following:
Theorem 1 (Edelman, 1988) Let A = (ai,j )i,j=1,...,m and assume that the ai,j s are
i.i.d. Gaussian standard random variables. Then:
E {log (A)} = log m + C0 + m ,

(2)

where C0 is a konwn constant (C0


= 1, 537) and m 0 as m +.
In [3] one can nd some elementary inequalities for the moments of log (A) when
the entries of A are i.i.d. but not necessarily Gaussian. In a recent paper [9] bounds for
P [(A) > x] are given when the ai,j s are i.i.d. Gaussian with a common variance but
may be non-centered (this has been called smoothed analysis). More precisely:
Theorem 2 (Sankar, Spielman, Teng, 2002) Assume ai,j = mi,j +gi,j (i, j = 1, ..., m)
where the gi,j s are i.i.d. centered Gaussian with common variance 2 1 and the (non
random) matrix
M = (mi,j )i,j=1,...,m
1

veries M m 2 .
Then, there exists x0 such that, if x > x0 , then
1

P [(A) > x]

4.734m 1 + 4 (log x) 2
x

(3)

Remark 3 There are a few dierences between this statement and the actual statement
in [9]. The rst one is that instead of 4.734 their constant is 3.646, apparently due to a
mistake in the numerical evaluation. The second one, their hypothesis is supi,j |mi,j | 1
1
instead of M m 2 , which they actually use in their proof and which is not implied by
the previous one. Finally, the inequality from [10], which is applied in their proof, does
not apply for every x > 0.
If one denotes 1 , ....m , 0 1 .... m , the eigenvalues of the matrix At A (At
stands for the transpose of A), then
(A) =

1
2

m
1

MA
=
mA

where
MA = max f (x);
x =1

mA = min f (x);
x =1

1
2

f (x) = xt At Ax (x Rm ).

It is possible to study the random variable (A) using techniques related to extrema
of random elds. More precisely, if a > 0:
1
P [MA > a] = P [M + (X, a) 2] E M + (X, a) ,
2

(4)

where, if S m1 is the unit sphere in the m-dimensional euclidean space, then X is the
real-valued random eld
X = f | S m1
and
M + (X, a) = # x : x S m1 , X has a local maximum at the point x and X(x) > a
(note that since f is an even function, {MA > a} occurs if and only if {M + (X, a) 2}).
The main point in making inequality (4) a useful tool is that the expectation in the
right-hand side member can be computed - or at least estimated - using Rice formula for
the expectation of the number of critical points of the random eld X (the derivative of
X).
In fact, we will only use an upper bound for E {M + (X, a)}, as will be explained below.
The upper bound thus obtained for P [MA > a] will be one of the tools to prove Theorem
11 which contains a variant of (3) that implies an improvement if x is large enough.
m
However Conjecture 1 in [9] which states that P [(A) > x] O( x
) remains an open
problem.
Inequality in Proposition 6 is a variant of results that have been known since a certain
time (see for example Lemma 2.8 in [10]).
Our main point here is the connection between the spectrum of random matrices and
the zeros of random elds which makes useful Rice formulae for the moments of the
number of zeros. In our context, inequality (13) is interesting for large values of a for
which the classical inequalities are of the same order. Note also that in this case, the
constant 1/4 in the exponent can be replaced by any constant strictly smaller than 1/2,
if a is large enough.
On the other hand, for the time being, this method does not provide the precise bounds
on the distribution of the largest eigenvalue of a Wishart matrix for values of a close to
a = 2 (c.f. [4] or [8]).
These inequalities permit to deduce inequalities for the moments of log (A), as in
Corollary 13, which gives a bound for E {log (A)} for non-centered random matrices.
This also leads to an alternative proof of a weak version of Edelmans Theorem, which
instead of (2) states that
E {log (A)} log m + C
(5)
for some constant C.
Rice formulae for the moments of the number of zeros of a random eld can be applied
in some other related problems, which are in fact more complicated than the one we are
adressing here. In [2] this is the case for condition numbers in linear programming. We
briey sketch one of the results in this paper.
3

Consider the system of inequalities


Ax < 0

(6)

where A is an n m real matrix, n > m, and y < 0 denotes that all the coordinates
of the vector y are negative. In [1] the following condition number was dened, for the
(feasibility) problem of determining wheather the set of solutions of (6) is empty or not.
Denote by at1 , ...., atn the rows of A,
fk (x) =

atk x
(k = 1, ..., n),
ak

max fk (x).
D(A) = min
m1
xS

1kn

Cheung-Cucker condition number is


C(A) = |D(A)|1 ,
with the convention C(A) = + when D(A) = 0. [2] contains the following result:
Theorem 4 Assume that

m(1 + log n)
1.
n
If ai,j , , i = 1, ..., n, j = 1, ..., m are i.i.d. Gaussian standard random variables, then
E {log C(A)} max (log m, log log n) + K,

(7)

where K is a constant.
To prove (7) one can also use a method based upon the formulae on extrema of random
elds, since the problem consists in giving ne bounds for the probability
P

min Z(x) < b ,

xS m1

where Z(x) = max1kn fk (x) and b is a (small) positive number.


The dierence between the proof of Theorem 4 and the study of (A) for square
matrices is that in the latter case the random function X that is to be considered is the
restriction of a quadratic form to the unit sphere, hence a nice regular function, while the
study of the local extrema of Z is complicated. This is due to the fact that x Z(x) is
a non-dierentiable piecewise ane function.
The plan of the paper is as follows. In Section 2 we include some technical results
which are required to state main results in this paper. Those results appear in Section 3.

Technical preliminaries.

In this section, Bm1 (0, ) is the Euclidean ball centered at the origin with radius in
Rm1 , |Bm1 (0, )| is its Lebesgue measure, m1 the (m 1)-dimensional geometric
measure in S m1 and T 0 denotes that the bilinear form T is negative denite.

Proposition 5 (Kacs formula) Let F : S m1 R be a C 2 function. Denote:


M+ (F, a) = x : x S m1 , F has a local maximum at the point x and F (x) > a ,
M + (F, a) = # M+ (F, a) .
We assume that
x : x S m1 , F (x) = 0, det(F (x)) = 0 =

(8)

i.e. that there are no critical points of F in which F is singular.


Then,
M + (F, a) = lim
0

1
|Bm1 (0, )|

S m1

|det(F (x))| 1{

F (x) <,F (x)0,F (x)>a} m1 (dx)

(9)

Proof. The hypothesis implies that the points of M+ (F, a) are isolated, hence, that
M + (F, a) is nite. Put
M+ (F, a) = {x1 , ...., xN } .
Then, for j = 1, ..., N :

F (xj ) 0.

F (xj ) = 0;

If 0 is small enough, using the inverse function theorem, there exist pairwise disjoint
open neighbourhoods U1 , ..., UN in S m1 of the points x1 , ..., xN respectively, such that
for each j = 1, ..., N the map x F (x) is a dieomorphism between Uj and Bm1 (0, 0 )
and
N

Uj = x : x S m1 , F (x) < 0 , F (x) 0, F (x) > a .

j=1

Using the change of variable formula,


Uj

|det(F (x))| m1 (dx) = |Bm1 (0, 0 )| ,

it follows that
N |Bm1 (0, 0 )| =
=

N
j=1

S m1

Uj

|det(F (x))| m1 (dx)

|det(F (x))|{

F (x) <0 ,F (x)0,F (x)>a}

m1 (dx).

This proves (9).


Suppose now that X(x) = xt At Ax, A = (ai,j )i,j=1,...,m , x Rm and introduce the
following notations: for x S m1 , let {e1 , ..., em } be an orthonormal basis of Rm such that
e1 = x. We denote Ax = (axi,j )i,j=1,...,m the matrix associated to the linear transformation
in Rm dened by y Ay, when one takes {e1 , ..., em } as reference basis.
x
x
Put also B x = (Ax )t Ax = (bxi,j )i,j=1,...,m , bxi,j = m
h=1 ah,i ah,j .
5

Direct computations show that:


X(x) = bx1,1

(10)

X (x) = 2 bx2,1 , ...., bxm,1

X (x) =

x
b2,2 bx1,1

...
2

bxm,2

(11)

....
bx2,m

x
x
....
....
= 2(B2,2 b1,1 Im1 )
... bxm,m bx1,1

(12)

(Im1 is the (m 1) (m 1) identity matrix).


In the rest of this paper, G = (gi,j )i,j=1,...,m will denote a random matrix with i.i.d.
centered Gaussian entries and common variance 2 and A = (ai,j )i,j=1,...,m , where ai,j =
gi,j + mi,j and M = (mi,j )i,j=1,...,m is a nonrandom matrix.
Since our interest is in studying (A), the fact that ( 1 A) = (A), for every = 0,
implies that we may assume that = 1 if we replace the expected matrix M by 1 M .
The next proposition is a variant of a known inequality (Lemma 2.8 in [10] and references therein). We give here an independent proof.
Proposition 6 Assume G = (gi,j )i,j=1,...,m where the gi,j s are i.i.d. random variables,
each one having standard Gaussian probability distribution. Assume m 3.
Then, for each a 4 one has
P
where C1 (a) =

2
36 2e

7a3

C1 (a)
1
G a m exp a2 m ,
m
4

2
36 2e

43 7

(13)

= C1 0.008677...

Proof. Step 1.
We consider the quadratic form dened on Rm :
fG (x) = xt Gt Gx.
We have, for t > 0:

1
> t E M + (fG , t) .
(14)
2
To be able to apply Proposition 5 to M + (fG , t) we need to check condition (8). One
way to do this is to use Proposition 4 in [2], applying it to the random vector eld V = fG
since the random variable fG (x) has a bounded density in Rm1 . One can conclude that
almost surely formula (9) holds true for F = fG .
P

Step 2.
For each x S m1 let us compute the joint distribution of fG (x) and fG (x) in RRm1 .
Note rst that due to the invariance under linear isometries, this joint distribution is
the same for all x S m1 . We compute it for x = w = (1, 0, ..., 0)t . Notice that, in this
case Aw = A, B w = B, ...
6

m
h=1

Conditionally on (g1,1 , ..., gm,1 ), fG (w) is constant and equal to b1,1 =


the random variables
m
bi,1 =

gh,i gh,1

2
gh,1
and

(i = 2, ..., m)

h=1

are independent, each one being Gaussian centered with variance b1,1 .
So, since the distribution of b1,1 is 2 with m degrees of freedom and on account of
(10) and (11), the joint density of fG (w) and fG (w) is equal to:
pfG (w),fG (w) (y, z) = 2m (y)
=

exp 12
(2)

m1
2

z 2
4y

2m1 y

1
2

3m
1
2

( m2 )(2)

m1
2

m1
2

exp 12 y +

z 2
4y

(15)

On the other hand, using Step 1 and Proposition 5, we obtain:


2

2P ( G

> t)

E lim
0

1
|Bm1 (0, )|

dy

S m1

|det(fG (x))| 1{

fG (x) <,fG (x)0,fG (x)>t} m1 (dx)

E |det(fG (x))| 1fG (x)0 /fG (x) = y, fG (x) = 0 pfG (x),fG (x) (y, 0) m1 (dx)
+

= m1 (S m1 )

S m1

E |det(fG (w))| 1fG (w)0 /fG (w) = y, fG (w) = 0 pfG (w),fG (w) (y, 0)dy.

In the last equality we have used again the fact that the law of the random eld
{fG (x) : x S m1 } is invariant under a linear isometry of Rm .
Substituting the density from (15) and taking into account that
m

m1 (S

m1

2 2
)= m ,
( 2 )

we obtain:
P( G

> t)

2 2

2m ( m2 )

exp y2
E |det(fG (w))| 1fG (w)0 /(b1,1 , ..., bm,1 ) = (y, 0, ..., 0)
dy.

y
(16)

Step 3.
From the expression (12) for fG (w), since B2,2 is positive denite we have that:
|det (fG (w))| 1{f

G (w)0

m1
1{f (w)0} (2b1,1 )m1 .
} (2b1,1 )
G

Replacing in the integrand in (16) we get:

+
3
y
2
2
y m 2 exp
P G >t
2
2
t
2m ( m2 )
7

dy =

2m ( m2 )

2 Jm (t).

For the remaining, we use the inequalities:


m
m

1
2
2
and
3

Jm (t) 2tm 2 1 +

m
1
2

exp

1
1
+ ... +
8
8

m1

m
1
2
exp

m
1
2

3
t
16
t
tm 2 exp ,
2
7
2

the second one valid for t 16m.


So, if a 4:
P

>a m

em
16 2 m 3
e2 2 1/2
a2 m
2
(a m)
exp
4
(m 2)m1 7
2
2
m1
2
4 2e
2
am

(ea2 )m 1 +
exp
3
7a m
m2
2
2
36 2e
a2 m
2 m

)
exp

(ea
7a3 m
2
2
36 2e
a2

m
exp
1 log(a2 )
3
7a m
2
2
36 2e
a2 m
1

,
exp
7a3
m
4

which is the inequality in the statement.


Next we obtain an upper bound for the tail probabilities P [ A1 > x]. This was done
in Theorem 3.2 in [9] .We include here a proof that in fact uses their technique and also
provides a slight improvement in the numerical constant.
We will employ the following lemma.
Lemma 7 (Lemma 3.1, [9]) Assume that A = (ai,j )i,j=1,...,m , ai,j = mi,j + gi,j (i, j =
1, ..., m), where the gi,j s are i.i.d. standard Gaussian r.v.s. Let v S m1 . Then
P

A1 v > x <

1/2

1
.
x

Lemma 8 Let U = (U1 , ..., Um ) be an m-dimensional vector chosen uniformly on S m1


and let tm1 be a real valued r.v. with a Student distributon with m 1 degrees of freedom.
Then, if c (0, m), we have that
P U12 >

c
m1
= P t2m1 >
c .
m
mc

Proof. Let V = (V1 , ..., Vm ) be a m-dimensional random vector with standard Gaussian
distribution. We can assume that
V
.
U=
V
Let us denote, to simplify the notation K = V22 + ... + Vm2 . Then the statement
V12
c
>
2
V1 + K
m
is equivalent to that

V12
c
>
,
K
mc

and we have that


P U12 >

(m 1)V12
c
m1
m1
=P
>
c = P t2m1 >
c ,
m
K
mc
mc

where tm1 is a real valued r.v. having Students distribution with m 1 degrees of
freedom.
Proposition 9 Assume that A = (ai,j )i,j=1,...,m , ai,j = mi,j + gi,j (i, j = 1, ..., m), where
the gi,j s are i.i.d. standard Gaussian r.v.s and M = (mi,j )i,j=1,...,m is non random.
Then, for each x > 0 :
P [ A1 x] C2 (m)

m1/2
,
x

(17)

where
2
C2 (m) =

1/2

sup

cP

c(0,m)

t2m1

m1
>
c
mc

C2 () = C2 2.34737...

Proof. Let U be an n-dimensional random vector, independent of A with uniform distribution on S m1 .


Aplying Lemma 7 we have that
P

A1 U > x = E P

A1 U > x U

1/2

1
.
x

(18)

Now, since if wA satises that A1 wA = A1 , and u = 1, then,


A1 u A1 | < wA , u > |,
we have that, if c (0, m), then
P

A1 U x

c
m

1/2

A1 x

and

| < wA , U > |

c
m

1/2

A1 x

= E P

x} P | < wA , U > |

= E I{

A1

= E I{

A1 x} P

= P t2m1 >

t2m1 >

c
m

| < wA , U > |

and

c
m

1/2

1/2

m1
c
mc

m1
c P [ A1 x].
mc

where we have applied Lemma 8. From here and (18) we have that
P [ A1 x]

1
P t2m1 >

m1
c
mc

1/2

1 m
x c

1/2

To end the proof notice that, if g is a standard Gaussian random variable, then
sup c1/2 P t2m1 >

c(0,m)

m1
c
mc

sup c1/2 P t2m1 >

c(0,1)

m1
c
mc

(19)

sup c1/2 P t2m1 > c

c(0,1)

sup c1/2 P g 2 > c


c(0,1)

0.5651/2 P g 2 > 0.565 = 0.3399.

Remark 10 Explicit expressions for C2 (m) dont seem to be easy to obtain. Therefore, we have carried out some numerical computations with MatLab in order to have
approximations to this value.
In the following table we include the results.
Table 1. Optimal values for C2 (m) and values of c in which they are reached.
m
3
C2 (m) 1.879
c
1.146

4
2.038
0.923

5
2.086
0.823

10
2.244
0.672

25
2.309
0.604

50
100

2.328 2.338 2.347


.584 0.574 0.565

Notice from the table that restriction in (19) to that c (0, 1) is not important as long
as m 4.

Main results.

Theorem 11 Assume that A = (ai,j )i,j=1,...,m , ai,j = mi,j + gi,j (i, j = 1, ..., m), where
the gi,j s are i.i.d. centered Gaussian with common variance 2 and M = (mi,j )i,j=1,...,m
10

is non random. Let m 3. If log x 4m one has:


P [(A) > x]

M
1
1 C1
+ C2 (m) m
+ C2 (m) 4m (log x) 2 ,
x
m

(20)

where C1 and C2 (m) were dened in Propositions 6 and 9 respectively.


Proof. As we noticed above, we may assume that = 1 and replace the matrix M by
1
M . Put G = (gi,j )i,j=1,...,m . From Proposition 6, if a 4 :

A >

1
M +a m P

C1
a2 m
G > a m exp
.
m
4

Using also Proposition 9:

1
M +a m +P

C1
a2 m
C2 (m) m
exp
+
m
4
x

P [(A) > x] P

A >

A1 >

M +a m

M
+a m .

Putting
a=

4 log x
m

the result follows.

Corollary 12 With the notations and hypotheses of Theorem 11, m 3, for any x large
enough

1
m 1
M
P ((A) > x) H
+
+ (log x) 2 .
x m

where H is a constant.
Proof. Apply Theorem 11.
One can also use Propositions 6 and 9 to get bounds for the moments of log (A). For
example we can obtain the following corollary:
Corollary 13 With the notations and hypotheses of Theorem 11. If m 3, then
E {log (A)} log(m) + 1 + log C2 + log

11

M
C
+ 4 + 1 exp [4m] .
m
2m

Proof. We may assume that = 1 and replace the matrix M by


log (C2 m). Applying Proposition 9, we have that
E log A1

1
M.

P A1 > ex

+ C2 me = log C2 m + 1.
+

+ 4 m . Notice that, if x , then ex


Now, let = log M

Therefore, applying Proposition 6 we obtain


E {log A } +
+

Let =

(21)
M

4 m.

P [ A > ex ] dx

G > ex

C1
+
m

dx

1 x
M
exp
e
4

From here, if we make the change of variable y = ex

dx.

, we obtain that

C1
1 2
E {log A } +
dy
exp y
m 4 m
2
C1
exp (4m) .
+
2m
And the corollary follows from here and (21).
Putting M = 0, = 1, the last Corollary provides a weak version of Edelmans
Theorem of the form (5).
Acknowledgment. Authors want to thank an anonymous referee whose comments and
suggestions have improved the paper.

References
[1] Cheung, D.; Cucker, F. (2001). A new Condition Number for Linear Programming.
Math. Programming, 91, 163-174.
[2] Cucker, F.; Wschebor, M. (2002). On the Expected Condition Number of Linear
Programming Problems. Numer. Mathem. To appear.
[3] Cuesta-Albertos, J.; Wschebor, M. (2002). Some Remarks on the Condition Number
of a Real Random Square Matrix. Submitted.
[4] Davidson, K.R.; Szarek, S.J. (2001) Local Operator Theory, Random Matrices and
Banach Spaces, in Handbook of the Geometry of Banach Spaces, Vol. 1, Ch. 8, Eds.
W.B. Johnson and J. Lindenstrauss, Elsevier, pp. 317-366.
12

[5] Demmel, J. (1997). Applied Numerical Linear Algebra. Ed. SIAM.


[6] Edelman, A. (1988). Eigenvalues and condition numbers of random matrices. SIAM
J. of Matrix Anal. and Appl., 9, 543-556.
[7] Higham, N. (1996). Accuracy and Stability of Numerical Algorithms. Ed. SIAM.
[8] Ledoux, M. (2002) A Remark on Hypercontractivity and Tail Inequalities for the
Largest Eigenvalues of Random Matrices, preprint.
[9] Sankar, A.; Spielman, D.A.; Teng, S.H. (2002). Smoothed Analysis of the Condition
Numbers and Growth Factors of Matrices. Preprint.
n
[10] Szarek, S.J. (1990). Spaces with large distance to l
and random matrices. Amer. J.
Math. 112, 899-942.

[11] Turing, A. (1948). Rounding-o errors in matrix processes. Quart. J. Mech. Appl.
Math. 1, 287-308.
[12] von Neumann, J; Goldstine, H. (1947). Numerical inverting matrices of high order.
Bull. Amer. Math. Soc. 53, 1021-1099.
[13] Wilkinson, J.H. (1963). Rounding Errors in Algebraic Processes. Ed. Prentice-Hall.

13

Zeitschrift ffir
Z. Wahrscheinlichkeitstheorie verw. Gebiete
60, 393-401 (1982)

Wahrscheinliehkeitstheorie
und verwandte Gebiete
9 Springer-Verlag 1982

Formule de Rice en dimension d


Mario Wschebor
Universidad Sim6n Bolivar, Dep. de Matemfiticas y C.C., Apartado Postal N~
Venezuela

Caracas,

Summary. Let {x(t): t ~ R d} a stochastic process with parameter in R a, and u


a fixed real number. Denote by C,, A,, B, respectively the r a n d o m sets {t:
x(t)=u}, {t: x(t)<u}, {t: x(t)>u}. The paper contains two main results for
processes with continuously differentiable paths plus some additional requirements: First, a formula for the expectation of QT(Au) and QT(Bu),
where for a given bounded open set T in R a, QT(B) denotes the "perimeter
of B relative to T" and second, sufficient conditions on the process, so that
it does not have local extrema on the barrier u. The second result can also
be used to interpret the first in terms of C,.

L Soient 0 un ensemble ouvert et B u n Bor61ien dans R d. Le <<p6rim4tre de B


relativement a 0~> (fini ou infini) est d6fini par:

Qo(B)=sup{y div(u)dt: u e ,(C kcot0~d


, , , Ilu(011 < 1 v t e e a}

(1)

off (C~~ e d4note les fonctions C ~ fi valeurs dans R e et support compacte


contenu dans 0, I]" I/ la norme euclidienne et (dt) la mesure de Lebesgue dans
R d.

Exemples. a) Si d = l , on a toujours Qo(B)=~(OBc~O), o6 ~B denote la


frontiSre essentielle de B.
b) Si 8B est suffisamment rhguli6re on a
Qo(B) = ad- 1(OB c~ 0),
off a d_ 1 est la mesure superficielle ( d - 1)-dimensionelle de 0B.
La d6monstration de a) et de b), ainsi que celle du lemme suivant, ne
pr6sente pas de difficult6 [-5]. Nous allons l'appliquer ~t l'estimation de certains
p6rimStres.

Lemme 1. (i) Si {gn} est une suite uniformdment bornde de fonctions contin~ment
diffdrentiables qui converge presque s~rement vers ZB (la fonction indicatrice de
B ), alors :
(20 (B) =<lira inf ~ ]lgrad g,(t)[I dr.
(2)
n~oo

0044- 3719/82/0060/0393/$01.80

394

M. Wschebor

(ii) Soit ~9~: Rd--~R une approximation


dans la boule de rayon e (~>0), et posons:

C ~ de l'unitk avec support contenu

pour chaque fonction g6L]or Alors, si O < e < &


Qo(B)>

~ I[grad()~,)~(t)ll dr,

0_~

(3)

off
0 a={t: Ht-t'H>bVt'r
II. On considbre maintenant un processus stochastique r6el {X(t): t e R e} parametris6 dans R e. On va supposer que ses trajectoires soient continfiment
differentiables et dhnoter par
Ptt

.....

tk; tl ..... ta(Xl .... , Xk; 21 .... ,2h) dx~ ... dx k d21 ... d2 h

(4)

la distribution jointe de X ( t 0 .... , X ( tk) ; grad X ( t'O, . . . , grad X ( t'h),


[q-tj[=~0,

pour i4=j.

It'i-tj[+O,

Pour u r6el fix6, on d6notera:

c.={t: x(t)=u),

A . = { t : X(t)<u},

B~={t: X(0>u).

Nous dirons que le processus {X(t): t ~ R d} v6rifie la condition (H1) si:


1) ps;t(x,2) est une fonction continue de x au point x = u , pour s, t, 2 fixbs,
et de s pour t, x, 2 fix6s.
2)
~ Jr21-A211ps;~,t(x;21,22)d21d22
tends vers z6ro quand s---~t,
R d R a

uniform6ment pour x dans un voisinage de u et t dans un compacte de R a. (I1


est clair que la mesure sur R d x R a ps.,s,t(x;2122)d2 l d 2 2 se concentre, pour
chaque x, duns la diagonal {2~ =22} quand s - ~ t . On exige, donc, une condition plus forte pour x pr6s de u.)
3) Pour chaque ouvert born6 T dans R d, le deuxi6me membre de (5)
ci-dessous est une fonction continue au point u.
Th6or~me 1. Sous la condition (H1) on a:
E ( Q T ( A . ) ) = E ( Q r ( B . ) ) = j" dt ~ 11211p~,t(u; 2) d2,
T

(5)

Ra

o~ T e s t un ouvert bornd de R e.
DOmonstration. On prouve l'6galit6 pour l'6sp6rance de la variable al6atoire
QT(A~). Pour QT(B.) c'est pareil.
Pour r e = l , 2 .... soit f,.~(C~176
1, non-croissante, f m ( x ) = O pour x > u ,
fro(x)= 1 pour x < = u - 1 / m . I1 est clair que

f ~ ( x ( o ) - , z ( . . . . ~(x(o)

v t ~ R e.

Formule de Rice

395

D'apr6s le lemme 1 et le lemme de Fatou:

E(QT(A,) )__<lim infE{~ Ngradfm(X (t))ll dt}


m~

oo

= lim inf ~ E {ls


m~co

[rgrad X(t)[I } dt

= l i m i n f ~ If/.(x)ldx ~ dt ~ []2IPpt;t(x;2)d2
m~

oo

R 1

Ra

=~ dt ~ []2clIpt;t(u;2:)d2.
T

Ra

L'in6galit6 inverse s'obtienne en appliquant la partie (ii) du lemme 1. En


effet, si 0 < e < 6 on a:

E(QT(Au))~E { ~ Hgrad(zAu)~(t)[[dt}
w~
= lira ~ E { ]j ~ tp~( t - s)f" (X (s)) grad X (s) ds H} &.
rn~ao

T-6

(6)

Ra

L'hsphrance dans la dernihre integrale peut etre minor6 par:

E{ y 6~(t-s)Is
~

][gradX(t)l] ds}

- E { ~ ~ ( t - s)If'(X(s))t Ilgrad (X(s) - X(t))Jr ds}.

(7)

Ra

En faisant m - - - ~ et e--,0 (dans cet ordre), le lemme de Fatou et les


conditions verifi6es par le processus permettent de conclure:

E(QT(Au))>= ~ dt ~ I]2Hpt;t(u;2)d2.
T - ,~

Rd

Maintenant, fiN0 donne le r6sultat.

IlL Remarques diverses


(A) Si le processus est stationnaire, (5) devienne:

E(QT(A.)) =E(Qr(Bu))=l~d(T) ~ I1=~11po=o(U;~)d~

(5')

Ra

off #d(T) c'est la mesure de Lebesgue de T.


(B) Darts le cas off le processus est Gaussien et stationnaire, on peut avoir
des r6sultats plus pr6cis que le Th6or6me 1, [6].
(C) Soit W un ensemble ouvert dans R e. Si on s'interesse seulement aux
points de l'ensemble {t: X(t)=u, gradX(t)~ W}, il s'agit de calculer

E {Qr~t: graax(t)~w~(A,)},

396

M. Wschebor

et on obtient, par la marne m6thode et sous les m~mes hypoth6ses:

S dt~ 11211pt;,(u;2)d2
1"

(8)

au lieu du second membre de (5).


En particulier, si d = l et W = { r : r > 0 } , C 2 = { t : X(t)=u, 2 ( 0 > 0 } est
appel6 l'enscmble des ~upcrossings~ de la barri6re u, ct puisque le proccssus,
presque sfirement, n'a pas d'extremum locales (voir le Th6or6me 2 ci-dessous),
on a bien la formule:
E ( e (C + a T)) = f dt ~ 2 Pt;t(u; 2) d2.
T

(9)

(D) La continuith des trajectoires entraine:

~A,~C.,

(10)

ct on a egalit6 presque sfire dans (10), si et seulement si, presque sfirement, le


processus n'a pas d" extremum locales. Dans ce cas, QT(Au) devienne une mesure
de l'intersection du processus avec la barri~re u, regard~e sur le sousensemble
T de R e.
Le Th6or~me suivant donne des conditions suffisantes pour avoir
0A, = ~B, = C,

presque sfirement,

(11)

c'est ~t dire, pour que, presque sfirement, le processus n'ait pas d'extremum
locales. Dans le cas d = l le r6sultat est contenu dans un th6or6me de
Bulinskaya, [3]. N6anmoins, nous avons inclu la d6monstration parce qu'elle
donne une id6e des m6thodes employ6es pour d > 1.
(11) a ~t6 demontr6e par Ylvisaker [7] pour les processus Gaussiens centr6s
avec variance constante et trajectoires continues. Pour des processus g6n6raux,
Belyaiev [1] a prouv6 que si les trajectoires sont, presque sfirement, deux lois
continfiment differentiables et Pt;~(x; 2) est une fonction bonr6e, alors
T~={t: X(t)=u, grad X(t)=0} =~b

presque sfirement,

(12)

ce qui implique (11).


Posons encore:

N,+ = {t: X(t)=u, X a u n minimum local au point t},


N~-={t: X(t)=u, X a un maximum local au point t}.
Th6or~me 2. Soit {X (t): t ~ R e} un processus stochastique presque sfirement avec
des trajectoires contin~ment diff~rentiables et tel que l'ensemble des variables
al~atoires
x ( t ) , x ~ ( t ) .... , x ~ ( t )

(Xi(t) indique la derivde par rapport dr la j-~me coordonde, calcul~e au point t), a
densit~
p,(x, y~,, ..., y J
localement bornde pour chaque choix de il, ..., ih, 1 <=i~< i 2 <

<ibid.

Formule de Rice

397

Alors, pour chaque barriOre u:


(i) Si d= 1, on a

T, = q5 presque sfirement.
(ii) Si d > 1, posons ad = ( d - l)/(d + 1), et pour K compacte dans R d
sup

w(f,K)=

I]gradX(t)-gradX(s)H

s, t E K

IIt-sll <,~

(le module de continuit~ de g r a d X sur K).


Si pour chaque compacte K dans R d il existe ~ > c~d tel que
w(6, K) = 0(3 ~) en probabilit~
c'est d dire que pour tout s >0 il existe 7 tel que
pour tout 6 > 0 )

P (~>y)<e

(13)

on d~duit que N, ~ = N,- = 0 presque s~trement.


La d6monstration est bas6e sur les lemmes suivants.

Lemme 2. Posons, pour ~>0, D~={t: IlgradX(t)ll <e}. Alors, sous les hipoth~ses
du th~or~me 2, si T est un ouvert born~ dam R e, on a:
E(QT~D~(A,)) <__Lea+1,

(14)

E(QT~D~(B,)) < Le~+ ~

(14')

ou L d~pend de la mesure de Lebesgue de T, de la dimension d et est localement


bornde comme fonction de u.
D~monstration. En proc6dant de la m~me forme que dans la premi6re partie de
la d6monstration du th6or6me 1, on obtient:
E ( Q T ~ ( A , ) ) < lim inf ~ E {[f,~(X(t))[ []grad X(t)1] ZD~(t)}dt
m~oo

= l i m i n f ~ dt ~ If,~(x)ldx
m~oo

R1

[12[[pt;t(x,2)d2

{~:ll~ II <e}

~ g 8 d+ l

Une majoration analogue permet de prouver (14').


Soit A un ensemble ouvert born6 dans R e, ro=sup{Htlr: t e A } , et on va
supposer que 0 ~ A et que QR,(A) est fini.
Introduisons les notations (B(t; r) d6note la boule de centre t et rayon r):
VA(r)=t2d(AnB(O; r)),

SA(r)=ad_I(AC3~B(O; r)),

d'ofl l'on d6duit


%_ I(OB(O; 1)) =dca,

Kd=dc TM.

ca=#d(B(O; 1))

398

M. Wschebor

Soit a un nombre r66el fixe, 0 < a < K a et CA l'ensemble:

C A = {p: p > O, S A (p) < a (VA (p))(a- 1)/a},


"CA= i n f {p: #i([0, p] c~ CA)>0 },
I1 est clair que 0 < ? A<r0, puisque 0 ~ A et ceci implique

S A(p ) = Ka. (Va (p))(a- 1)/d


s i p est sufisemment petit.
Lemma 3. Avec les notations ci-dessus, il existe un nombre positif b e - qui
d@end seutement de ta dimension d - tel que
QB(0; 2rA)(A)~ ba (2rA)a- t.

(15)

D~monstration. I1 est clair que:


r

VA(r) = .I SA(y)dy.

(16)

Donc, si la condition

SA (Y) >=a . (VA(y))(a- 1)/e

(17)

est presque partout verifi6e pour O < y < R , on conclut de (16) que
r2

VA(r2)--VA(rl)>a ~ (VA(y))(e-l~/adv.

pour 0=<q<r= 2=<R,

(18)

ri

qui entraine, apr6s un calcul 616mentaire:

VA(r) _--_(d,) a- /

pour O<r<_R.

(19)

Cela nous montre que

D'autre part, on peut prouver que, presque partout pour p > 0 on a

QR~(A c~ B(0; p)) < SA(p) + QB~O:p)(A)

(21)

(en fair, l'6galit6 est vraie, presque partout pour p > 0 , [5], p. 35) et on peut
trouver un rayon p 1 E CA, rA < P l < 27A tel que (21) soit satisfaite.
En utilisant l'inhgalit6 isop6rimetrique dans R a on a:

SA(p O + Qmo; m)(A) > Ka'(#e(A c~ B(0; pl))) (a- 1)/d


= Ke.(VA (p 1))(e- l)/a.
Compte tenue de (20) et de pl ~ CA:
QB<O;2,'A)(A) > Qm0; O,)(A) > (ge - a)(~4(p,))(a- 1)/a

>=(Ka-a)(VA(?A))'a-1)/a>(Ka-a) ( d ) a - l ? y ( t.

(22)

Formule de Rice

399

I1 suffit donc de prendre


b d = (K d - a). (a/(2d)) d- 1
pour avoir (15).
DOmonstration du ThOorOme. Observons d'abord que P ( X ( t ) = u V r ~ V ) = 0 pour
chaque sousensemble ouvert V de R d. Pour d6montrer (i), on voit que pour
tout e > 0
T u ~ ( a A . u a B . u N~+ ~ N.-)c~D~.

(23)

De l'exemple a) de I et du lemme 2, pour tout T ouvert born6 on a:


E( 4~ (~ A . u ~ U.) c~D~ c~ T) ~ E (Q o. ~ r (A~)) + E (QD~~ r (Bu)) ~ 2 L e 2.
D'autre part, la continuit~ de grad (X) entraine que, presque sfirement:
(Nu- c~ T) < lira inf 4~(~ Au_ 1/m ~ T c~ D ~)
m - - + cx)

et d'apr~s le lemme de Fatou:


E ( 4~(N~ c~ T)) < lira inf E ( 4~(0A~_ ~/~ c~ T c~D~)) < lira inf E (Qoo ~ r (A._ ~/~)) < L e2.
Le m~me argument pour N +, plus (23) permettent de conclure E ( ~ ( T ~ T ) )
< 4 L e 2. e 6tant arbitraire et T u n ouvert born6 quelconque, cette inegalit6
entraine T, = q5 p.s.
Maintenant nous dhmontrons (ii). Nous verrons que Nu+ = q5 presque sfirement. La d6monstration est la m6me pour N,-.
On voit facilement qu'il suffit de prouver que, pour chaque u on a:
P(S)=0

(24)

S 6tant d6fini par S = { m r = u } , m B = i n f X ( t ) et T u n parall616pip6de ouvert


d'ar6tes parall~les aux axes.
t~B
Nous proc6dons par r6currence e n d . On a prouv6 dans (i) un rdsultat qui
entra~ne la th6se pour d = 1. Supposons que le r6sultat est vraie pour dimension
plus petite que d. Les processus avec param6tre dans R d-a qui r6sultent de
fixer une des d coordon6es de t, satisfont les hypoth6ses du Th6or6me 2
correspondantes /t la dimension d - l , 6tant donn6 que ed croit avec d. On
d6duit de l'hypoth6se de r6currence que
P(m~r=u)=O,
c'est g dire que, presque sfirement dans S, l'ensemble C,c~T est un sousensemble compacte de T.
Supposons que P ( S ) > 0 . Nous allons voir que ceci nous am6ne a une
contradiction.
Choisissons t / > 0 et S 1 ~ S , P ( S O > ( 1 / 2 ) P ( S ) de fagon que
S~c{C

c~ T +qS Vv tel que u < v < u + ~ } .

400

M. Wschebor

La possibilit6 d'un tel choix d~coule imm6diatement de la d6finition de S,


de la continuit6 des trajectoires et du fait que celles-ci sont, presque sfirement,
localement non-constantes.
D'autre part,
31 =dist { C , n T, 0T} > 0
p.s.
et si V = { t : d i s t {t,C,c~T}<6t/2}, 6tant donn6 que, p.s. dans St,
A~c~T',~C, n T quand vNu, un argument de compacit6 montre qu'il existe
I/'>0 et SacS1, P(S2)>(1/2)P(S 0 de fa~on que
S 2 ~ { A v ("I T Q V

V v

tel que u<v<u+tl'}.

(25)

Prenons maintenant une suite u. N u, u < u. < u + rain {t/, t/'}.


Pour chaque trajectoire appartenant /t $2, soit t ~ C. choisi de telle fagon
que t-soit une variable al6atoire. En appliquant le lemme 3 ~t l'ouvert A.c~ T
on obtient T. > 0 tel que

QB6; ~.)(A.. c~ T) >=bd.vd.- t,


7 . < 2 sup {dist {t, i-}: t ~ A . c ~ T } .

(26)

Toujours dans S 2 d6notons par s. la variable al6atoire


s.=

sup

dist{t,C uc~T}

t~Aunc~T

qui d6croit vers z6ro.


Puisque l'adh6rence (compacte) de Auc~T
{t: dist {t, C~n T} <2s~} =B~, nous avons:

QBo; r

est contenue dans l'ouvert

n T) = QBe;~.)~B.(A.. C~T) <_QDo ~ r(A..)

off
e. = w(6., ~')

6. = m i n {?., 2s.},
la derniSre in6galit6 6tant une cons6quance du fait que grad X s'annule sur C,.
gn r6sumant, nous avons:

QD~.~ r(Au.) )~s2>=bdf~- 1Zs2.

(27)

Nous raisons maintenant intervenir l'hypothSse (13) et d'aprSs le th6or6me


d'Egorov, on peut obtenir un sous-6v6nement S 3, S 3 a S 2, P(S3)>(1/2)P(S2) et
un nombre positif 7 (suffisamment grand) pour que:
a) % = sup 6n T

0,

$3

b) w(6,
6~T ) < y

V6>0

Posons finalement

~,~= Qo~. nw(Aun)


-d - 1
rn

)@3"

sur S 3.

Formule de Rice

401

On a:
k=0

< Z oo

k= o ('~./2k+')d- ~ E {Q.w(~./2~, r)~ r(A..) Zs~}


NL.

~= o ( o . / 2 ~+ 1)~-~ (~(o./2~)~) " ~,

en utilisant le lemme 2.
Si on pose ~l=C~(d+ 1 ) - ( d - i ) > 0 , on a:
(29)

E(~.) < L7 '~+ 12 ~- 1a~l ~ (1/2k,1) ~


Mais (27) entraine, d'autre part,

O.

k=O

E(~.) > bdP(S3) >=(1/8) baP(S),


et ceci contredit (29).

Bibliographie
1.Belyaiev, Y.: Point Processes and First Passage Problems. Proc. Sixth Berkeley Sympos. Math.
Statist. Probab. 3, 1-17 (1972)
2. Benzaquen, S., Cabafia, E.M.: The Expected Measure of the Level Set of a Regular Stationary
Gaussian Process, [A para~tre dans Pacific J. Math.]
3. Bulinskaya, E.V.: On the mean number of crossings of a level by a stationary Gaussian process.
Theor. Probab. Appl. 6, 435-438 (1961)
4. Crfimer, H., Leadbetter, M.R.: Stationary and Related Stochastic Processes. NewYork: J. Wiley
1967
5. Miranda, M.: Frontiere minime. Mort. Mat. No. 27. IMPA (1976)
6. Wschebor, M.: On Crossings of Gaussian Fields. [A paraltre dans Stochastic Processes Appl.]
7. Ylvisaker, D.: The Expected Number of Zeros of a Stationary Gaussian Process. Ann. Math.
Statist. 1043-1046 (1965)
Received February 2, 1981; in revised form 2.2.82

On the tails of the distribution of the maximum of a


smooth stationary Gaussian process
Jean-Marc Bardet 1, Mario Wschebor 2
Laboratoire de Statistique et de Probabilites, URA CNRS 745,
Universite Paul Sabatier, 118 route de Narbonne,
31062 Toulouse Cedex, France.
2
Centro de Matematica, Facultad de Ciencias, Universidad de la Republica,
Calle Igua 4225, 11400 Montevideo, Uruguay.
March 24, 2000
1

Abstract

This paper deals with the asymptotic behavior when the level tends to +1, of
the tail of the distribution of the maximum of a stationary Gaussian process on a
xed interval of the line. For processes satisfying certain regularity conditions, we
give a second order term for this asymptotics.

Mathematics Subject Classi cation (1991): 60Gxx, 60E05, 60G15, 65U05.


Key words: Tail of Distribution of the Maximum, Stationary Gaussian processes
Short Title: Distribution of the Maximum.

Introduction
X = fX (t); t 2 0; T ]g, T > 0 is a real-valued centered stationary Gaussian process with

continuous paths and MT = tmax


X (t). We denote F (u) = P (MT u) the distribution
2 ;T
function of the random variable MT , r(t) = E fX (s)X (s + t)g the covariance function
and k (k = 0; 1; 2; :::) the spectral moments of the process, whenever they are de ned.
With no loss of generality we will assume that = r(0) = 1:
Under certain regularity conditions, Piterbarg (1981, Theorem 2.2.) proved that for
each T > 0 and any u 2 R:
0

B exp ? 1 u+

1 ? (u) + 2 T (u) ? P (MT > u)


2

(1)

for some constants B > 0 and < 1: (respectively ) denotes the standard normal
distribution (respectively density).
The aim of this paper is to improve the description of the asymptotic behavior of
P (MT > u) as u ! +1 that follows from (1) replacing the bound for the error by
an equivalent as u ! +1. More precisely, under the regularity conditions required in
Theorem 1.1, we will prove that:
r

P (MT >u)=1 ? (u) + 2 T (u)? 2T ( ? )


2

2
2

"

3 1?

!#

? u
4

2
2

1 + o(1)]

(2)
This contradicts Theorem 3.1. in Piterbarg's paper in which a di erent equivalent is
given in case T is small enough (see also Aza s e t al. (1999)).
We will assume further that X has C sample paths (this implies < 1) and that
for every n 1 and pairwise di erent values t ; ::; tn in 0; T ], the distribution of the set of
5n random variables (X j (t ); ::; X j (tn); j = 0; 1; 2; 3; 4) is non-degenerate. A su cient
condition for this to hold is the spectral measure of the process not to be purely atomic
or, if it is purely atomic, that the set of atoms have an accumulation point in the real
line (A proof of this facts can be done in the same way as in Chap. 10 of Cramer and
Leadbetter, 1967).
If is a random vector with values in Rn whose distribution has a density with respect
to Lebesgue measure, we denote by p (x) the density of at the point x 2Rn . 1C denotes
the indicator function of the set C .
If Y = fY (t) : t 2 Rg is a process in L we put ?Y (s; t) for its covariance function and
i j Y
?Yij (s; t) = @@si@t? j (s; t) for the partial derivatives, whenever they exist.
The proof of (2) will consist in computing the density of the distribution of the
random variable MT and studying its asymptotic behavior as u ! +1. Our main tool is
4

( )

( )

the following proposition which is a special case of the di erentiation Lemma 3.3 in Aza s
and Wschebor (1999):
Proposition 1.1 Let Y be a Gaussian process with C paths and such that for every
n 1 and pairwise di erent values t ; ::; tn in 0; T ] the distribution of the set of 3n
random variables (Y j (t ); ::; Y j (tn); j = 0; 1; 2) is non-degenerate. Assume also that
E fY (t)g = 0; ?Y (t; t) = E fY (t)g = 1.
Then, if is a C -function on 0; T ],
2

( )

( )

d P (Y (t) u (t); 8t 2 0; T ]) = LY (u; ) + LY (u; ) + LY (u; ); with


du
LY (u; )= (0)P (Y `(s) u `(s); 8s 2 0; T ]):pY ( (0)u)

(3)

LY (u; )= (T )P (Y a(s) u a(s); 8s 2 0; T ]):pY T ( (T )u)

(4)

(0)

( )

Z T

L (u; )=? (t)E (Y t(t)? t(t)u)1fY t s


Y

( )

u t (s);8s2 0;T ]g

p Y t ;Y
( ( )

(t))

( (t)u; 0(t)u)dt: (5)

Here the functions `; a; t and the (random) functions Y `; Y a ; Y t are the continuous
extensions to 0; T ] of:
?
?
`
(s) = 1 (s) ? ?Y (s; 0) (0) ; Y `(s) = 1 Y (s) ? ?Y (s; 0)Y (0) for 0 < s T; (6)

1 (s)??Y (s; T ) (T ) ; Y a(s)= 1 ?Y (s)??Y (s; T )Y (T ) for 0 s< T; (7)


(s)= T?
s
T?s
t

(s) = (s ?2 t)

Y
(s) ? ?Y (s; t) (t) ? ??Y ((t;t; st)) 0(t)
10

11

Y
Y t(s) = (s ?2 t) Y (s) ? ?Y (s; t)Y (t) ? ??Y ((t;t; st)) Y (t)
0

10

11

0 s T; s 6= t;
0 s T; s 6= t:

(8)
(9)

We will repeatedly use the following Lemma. Its proof is elementary and we omit it.
Lemma 1.1 Let f and g be real-valued functions of class C de ned on the interval 0; T ]
of the real line verifying the conditions:
1) f has a unique minimum on 0; T ] at the point t = t , and f 0(t ) = 0; f "(t ) > 0:
2) Let k = inf j : g j (t ) 6= 0 and suppose k = 0 ; 1 or 2.
De ne
Z T
h(u) =
g(t) exp ? 21 u f (t) dt:
Then, as u ! 1:
Z
k (t ) 1
1
g
xk exp ? 41 f "(t )x dx;
h(u) t k! uk exp ? 2 u f (t )
J
where J = 0; +1) ; J = (?1; 0] or J = ]?1; +1 according as t = 0; t =
T or 0 < t < T respectively.
2

( )

( )

+1

We now turn to our result.

Theorem 1.1 Let X = fX (t) : t 2 0; T ]g be a Gaussian centered stationary process with

C -paths, covariance r(:), = 1, and such that for every n 1 and pairwise di erent
t ; ::; tn in 0; T ], the distribution of the set of 5n random variables (X j (t ); ::; X j (tn); j =
0; 1; 2; 3; 4) is non-degenerate. We shall also assume the additional hypothesis that r0 < 0
in a set dense in 0; T ].
4

( )

( )

Then (2) holds true.


Proof.

We divide the proof into several steps.

Step 1. Proposition1.1 applied to the process Y = X and the function (t) = 1 for all
t 2 0; T ] enables to write the density pMT of the distribution of the maximum MT as:
pMT (u) = A (u) + A (u) + A (u)] : (u); with
(10)
1

A (u) = P (X `(s) u `(s); 8s 2 0; T ]);


1

A (u) = P (X a(s) u a(s); 8s 2 0; T ]);


2

Z T
1
E (X t(t)? t(t)u)1fX t s u t s ;8s2 ;T g dt:
A (u) = ? p
2
Since X is a stationary process and (t) 1, it follows that the processes X and
Xe - de ned as Xe (t) = X (T ? t) - have the same law, so that P (X (s) u for all
s 2 0; T ] jX (0) = u) = P (X (s) u for all s 2 0; T ] jX (T ) = u). Hence, A (u) = A (u).
3

( )

( )

Step 2.

We now consider A (u) and write it in the form:


1

A (u) = P (Y (s) u (s); 8s 2 0; T ])


where Y is the continuous extension to 0; T ] of:
`
`
Y (s) = n X (s) o = s:X (s) 1 s 2]0; T ]
1 ? r (s)] 2
(E X `(s)] ) =
1

1 2

and

? r(s) s 2 0; T ]:
(s) = n (s) o = 11 +
r(s)
(E X `(s)] ) =
`

1 2

Thus, Proposition 1.1 can be applied and:

d A (u) = LY (u; ) + LY (u; ) + LY (u; ):


du
1

(11)

LY (u; ) = 0 because (0) = 0.


1

LY (u; ) = (T )P (Y a(s) u a(s); 8s 2 0; T ]) ( (T )u) , with


a
(s) = 1 + r(T ) ? r(s) ?pr(T ? s) s 2]0; T
(T ? s)(1 + r(T )) 1 ? r (s)
2

0 (T )
r
p
(0) =
T (1 + r(T ))

0 and

(T ) =

On the other hand:

(1 + r(T )) 1 ? r (T )
2

0:

P Y a(s) u a(s); 8s 2 0; T ]

check that

r0(pT )

P Y a(T ) u a(T ) ;

0
E (Y a(T )) ) = E (Y 0(T )) = (1 ?(1r ?(Tr))(?T ))(r (T )) ) ;
so that, since the non-degeneracy hypothesis implies that for each T > 0, E (Y 0(T )) ) is
2

non-zeo, it follows that the numerator in the right-hand member is stricly positive for
T > 0.
Hence,
?
P Y a(s) u a(s); 8s 2 0; T ] ( (T )u) C (T ) exp ? u2 F (T ) ;
with C (T ) > 0, where F (t) is the function
2

(12)

F (t) = (1 ? r(1(?t))r(?t))(r0(t)) for t 2 ?T; T ]; t 6= 0:


2

which is well de ned since the denominator does not vanish because of the previous
remark.
The following properties of the function F are elementary and will be useful in our
calculations.
(a) F has a continuous extension at t = 0.
(b) F (t) > F (0) = ? for t 6= 0 because:
0
0
r00(t))(1 ? r(t)) and
* F 0(t) = 2 (1 ? r(t))( r ((1t)((?rr(t())t))??((r0(?
t)) )
* r0(t) < 0 for t 2 A 0; T ] with A dense in 0; T ], and
2
2

2
2

2 2

* For t 6= 0,
(r0(t)) ? ( ? r00(t))(1 ? r(t)) =
?
(E (X 0(t) ? X 0(0))(X (t) ? X (0))) ? E (X 0(t) ? X (0))
2

E (X (t) ? X (0)) < 0,

from Cauchy-Schwartz inequality (and non-degeneracy hypothesis).

(c) F 0(0) = 0:
(d) F 00(0) = 9(( ? ? ) ) .
2

2
4
2 2
2

From (12) and (b), it follows that

LY (u; ) D(T ) exp ? u2


with D(T ) > 0 and (T ) > 0 for all T > 0.

L (u; ) = ?
Y

Z T

with

p Y t ;Y
( ( )

(t)E (Y t(t)? t(t)u)1fY t s

( )

2
2

(T ) +

pY

u t (s);8s2 0;T ]g

(t))

(13)

(t);Y 0 (t))

(u (t); u 0(t))dt;

0
(t) + E f((Y(0t())t)) g
(u (t); u 0(t)) = 2 (E f(Y10(t)) g) = exp ? u2
= 2 (E f(Y10(t)) g) = exp ? u2 F (t) :
2

2
2

1 2

1 2

2
2

We know that t2min;T F (t) = F (0) = ? .


Also check that:
p
d t(t) = 1
(0) = 0; 0(0) = 21
> 0; (0) = 0; lim
t!
dt
12
and E f(Y (0)) g > 0 from the non-degeneracy condition.
As a consequence:
0

2
2

? >0
?
2
4

6
4

2
2

Z T

(t)E Y t(t)1fY t s

( )

p Y t ;Y

u t (s);8s2 0;T ]g

( ( )

(t))

(u (t); u 0(t))dt

Z T
t (t)) g) =
u
(
E
f
(
Y
(t) 2 (E f(Y 0(t)) g) = exp ? 2 F (t) dt C (t) exp ? u2 F (t) dt;
where C is a positive constant. Also, from Lemma 1.1 and the properties of the functions
F and , one gets:
Z T
(t) exp ? u2 F (t) dt C u1 exp ? u2 ?
:
C a positive constant. Hence,
Z T

1 2

1 2

2
2

2
2

Z T

(t)E Y t(t)1fY t s

( )

u t (s);8s2 0;T ]g

p Y t ;Y
( ( )

(t))

(u (t); u 0(t))dt =

= O u1 exp ? u2

2
2

2
2

(14)

On the other hand,


Z T
0

(t)E

(t)1fY t s

( )

Z T
2
0

p Y t ;Y

u t (s);8s2 0;T ]g

( ( )

(t))

(u (t); u 0(t))dt

(t) t(t) exp ? u2 F (t) dt;


2

where A a positive constant. Since the function g(t) = (t) t(t) veri es g(0) = g0(0) = 0,
and g00(0) 6= 0, Lemma 1.1 implies:
2

Z T
0

(t)E

(t)1fY t s

( )

pY

u t (s);8s2 0;T ]g

= O u1 exp ? u2

(t);Y 0 (t))

(u (t); u 0(t))dt =

2
2

2
2

(15)

From (14) and (15) follows:

LY (u; ) = O u1 exp ? u2

2
2

2
2

(16)

and from (13) and (16):

d A (u) = O 1 exp ? u
:
(17)
du
u
2 ?
Further, observe that since Y is continuous and (s) > 0 for s 2 ]0; T ], (0) = 0,
if Y (0) > 0 the event fY (s) u (s); 8s 2 0; T ]g does not occur for positive u, and if
Y (0) < 0, the same event occurs if u is large enough. This implies that
2

2
2

2
2

A (u) = P (Y (s) u (s); 8s 2 0; T ]) ! P (Y (0) < 0) = 21 as u ! +1


1

and so,

A (u) ? 12 = ?
1

Z
u

d A (v)dv = O 1 exp ? u
dv
u
2

2
2

2
2

(18)

on applying (17).
Step 3.

We will now give an equivalent for A (u): Introduce the following notations:
3

for t 2 ]0; T , Zt(s) is the centered Gaussian process


t
Zt(s) = (E f(XXt((ss))) g) = ; s 2 0; T ]:
2

for t 2 0; T ],

1 2

(s)
(1 ? r(t ? s))
p
=
t(s)=
t
=
(E f(X (s)) g)
(1 ? r (t ? s)) ? (r0(t ? s))
t

1 2

= F (t ? s) for s 2 0; T ]; s 6= t
and
t

(t) = p

2
4

2
2

Hence,
0

F 0(0) = 0:
(t) = p
2 F (0)

B (u; t) = P (Zt(s) u t(s); 8s 2 0; T ]);


Be (u; t) = E X t (t)1fZt s u t s ;8s2 ;T g
3

so that

( )

( )

Z T

Z T
1
B (u; t) dt ? p
Be (u; t) dt = S (u) ? T (u): (19)
A (u) = u 2
2
We will consider in detail the behavior of the rst term as u ! +1.
2

We apply again Proposition 1.1 to compute the derivative of B (u; t) with respect to u.
For t 2 ]0; T :
3

d B (u; t) = LZt (u; ) + LZt (u; ) + LZt (u; ); where


t
t
t
du
LZt (u; t) = t(0)P (Zt`(s) u t`(s); 8s 2 0; T ]) ( t(0)u). Then :
p
LZt (u; ) p1 F (t)exp ? u F (t) ;
3

so that as u ! +1 :

Z T
0

LZt (u; t)dt C u1 exp ? u2 F (0)


2

(20)

for some constant C .


1

* LZt (u; t) = t(T )P (Zta(s) u ta(s); 8s 2 0; T ]) ( t(T )u). In the same way:
p
LZt (u; t) p1 F (T ? t) exp ? u2 F (T ? t) :
2
and:
Z T
LZt (u; t)dt C u1 exp ? u2 F (0) :
for some constant C .
2

(21)

Z T

LZt (u; t) =?
3

(x)E (Ztx(x)? tx(x)u)1fZtx s

( )

u tx (s);8s2 0;T ]g

p Zt x ;Zt x (u t(x); u t0(x))dx:


(

( )

( ))

We rst consider the density in the integrand:


p Zt x ;Zt x (u t(x); u t0(x)) =
(

=
De ne

( )

( ))

u F (x ? t) +
(F 0(x ? t))
1
exp
?
2
4F (x ? t)E f(Zt0(x)) g
E f(Zt0(x)) g
2

(22)

0
Gt (x) = F (x ? t) + 4F (x(?F t()xE?f(tZ))0(x)) g :
2

Check that
min Gt(x) = Gt(t) = F (0);

G0t(t) = F 0(0) = 0:

x2 0;T ]

Moreover, for x 2 0; T ] one has:


x ? t)F 00(0)) + O((x ? t) );
Gt(x) = F (0) + (x ?2 t) F 00(0) + 4F(((0)
E f(Zt0(t)) g
and thus
00
G00t (t) = F 00(0) + 2F (0)(FE f(0))
(Zt0(t)) g :
Also,
X t (x) = (x ?2 t) (X (t) + (x ? t)X 0(t) + (x ?2 t) X 00(t) + (x ?6 t) X 000(t)) ? ::
? X (t)(1 ? (x ? t) ) ? X 0(t)(? (x ? t) + (x ?6 t) ) + O((x ? t) );
2

and

Zt(x) =

1
(X 00(t) + X (t)) +
? + O((x ? t) )
t) ( X 000(t) + X 0(t)) + O((x ? t) ):
+ (x ?
3

p
2

2
2

It follows that:

00(0)
E (Zt0(t)) = 9 ( ?? ) = FF (0)
;
2

and

2
4
2
2

G00t (t) = 23 F 00(0) = 6(( ? ? ) ) :


2

We also have:

2
4
2 2
2

? (x; s) 0(x) :
Zt
t(s) ? ? (x; s): t(x) ? Zt
(s ? x)
? (x; x) t
Zt
2
(t; s) 0(t) =
?
t(s) =
Z
t
t(s) ? ? (t; s): t(t) ? Zt
t
(t ? s)
? (t; t) t
p
p
= (t ?2 s) F (t ? s) ? F (0):E fZt(t)Zt(s)g for s 6= t;
F 00(0) + pF (0)E n(Z 0(t)) o = 3 pF 00(0) = ( ? ) > 0;
t
(t t) = 21 p
t
2 F (0) 6( ? ) =
F (0)
where the last inequality is a consequence of the non-degeneracy condition.
Note that since E fZt(t)Zt(s)g 1,
2 pF (t ? s) ? pF (0) for s 6= t;
t
t (s)
(t ? s)
so that
inf tt(s) > 0:
s;t2 ;T
x
t

(s) =

Zt
10

11

10

11

2
4
2 3 2
2

On the other hand, it is easy to see that tx(s) is a continuous function of the triplet
(x; t; s) and a uniform continuity argument shows that one can nd > 0 in such a
way that if jx ? tj
then
x
c > 0 for all s 2 0; T ]:
t (s)
Thus, for jx ? tj , using the Landau-Shepp-Fernique inequality (see Fernique,
1974):
x
? x
t (x) t (x)u
t (x)
x
x s u x s ;8s2 ;T g = ? p
p
(1 + R)
E
(
Z
(
x
)
?
(
x
)
u
)1
f
Z
t
t
t
t
E f(Zt0(x)) g
E f(Zt0(x)) g
( )

where R

( )

: exp(? :u ) and ; positive constants independent of t; x and u.


2

10

So,
Z T
t(
p

x) tx(x) (1 + R) exp ? 1 u G (x) dx:


2 t
E f(Zt0(x)) g
Using the fact that B (+1; t) = 1 for every t 2 0; T ] we can write:
L (u; t) = 2u
Zt
3

(23)

S (u) = u 2 T ? u 2
2

Z T
0

dt

+1

d B (v; t) dv:
dv
3

The same method of Lemma 1.1, plus


t
3 pF "(0)F (0)
t(t) t (t)
p
=
E f(Zt0(t)) g 2
and (20), (21), (23), (22) show that
2

S (u) = u 2 T ? (1 + o(1)) 2T

3 ( ? ) exp ? u
2

2
2

2
2

2
2

(24)

The second term in (19) can be treated in a similar way, only one should use the full
statement of Lemma 3.3 in Aza s and Wschebor (1999) instead of Proposition 1.1, thus
obtaining:
T (u) = O( u1 ): exp ? u2 ? :
(25)
Then, (24) together with (25) imply that as u ! +1:
2

2
2

A (u) = u 2 T ? (1 + o(1)) 2T
3

2
2

3 ( ? ) exp ? u
2
2
2

2
2

2
4

2
2

(26)

Replacing (18), (26) into (10) and integrating, one obtains (2).

Acknowledgment. The authors thank Professors J-M. Aza s, P. Carmona and C. Del-

mas for useful talks on the subject of this paper.

References

Aza s, J-M., Cierco-Ayrolles, C. and Croquette, A. (1999). Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process.
ESAIM Probab. Statist., 3, 107-129.
Aza s, J-M. and Wschebor, M. (1999). On the Regularity of the Distribution of the
Maximum of One-parameter Gaussian Processes. Submitted.

11

Cramer, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J.
Wiley & Sons, New-York.
Fernique, X. (1974). Regularite des trajectoires des fonctions aleatoires gaussiennes. Ecole
d'Ete de Probabilites de St. Flour. Lecture Notes in Mathematics, 480, Springer-Verlag.
New-York.
Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian
processes. Th. Prob. Appl., 26, 687-705.

12

ESAIM: Probability and Statistics

September 1999, Vol. 3, p. 107129

URL: http://www.emath.fr/ps/

BOUNDS AND ASYMPTOTIC EXPANSIONS FOR THE DISTRIBUTION


OF THE MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

Jean-Marc Azas 1 , Christine Cierco-Ayrolles 1, 2 and Alain Croquette 1


Abstract. This paper uses the Rice method [18] to give bounds to the distribution of the maximum
of a smooth stationary Gaussian process. We give simpler expressions of the first two terms of the Rice
series [3,13] for the distribution of the maximum. Our main contribution is a simpler form of the second
factorial moment of the number of upcrossings which is in some sense a generalization of Steinberg
et al.s formula ([7] p. 212). Then, we present a numerical application and asymptotic expansions that
give a new interpretation of a result by Piterbarg [15].

R
esum
e. Dans cet article nous utilisons la methode de Rice (Rice, 1944-1945) pour trouver un encadrement de la fonction de repartition du maximum dun processus Gaussien stationnaire regulier.
Nous derivons des expressions simplifiees des deux premiers termes de la serie de Rice (Miroshin, 1974,
Azas et Wschebor, 1997) suffisants pour lencadrement cherche. Notre contribution principale est la
donnee dune forme plus simple du second moment factoriel du nombre de franchissements vers le
haut, ce qui est, en quelque sorte, une generalisation de la formule de Steinberg et al. (Cramer and
Leadbetter, 1967, p. 212). Nous presentons ensuite une application numerique et des developpements
asymptotiques qui fournissent une nouvelle interpretation dun resultat de Piterbarg (1981).

AMS Subject Classification. 60Exx, 60Gxx, 60G10, 60G15, 60G70, 62E17, 65U05.
Received June 4, 1998. Revised June 8, 1999.

1. Introduction
1.1. Framework
Many statistical models involve nuisance parameters. This is the case for example for mixture models [10],
gene detection models [5,6], projection pursuit [20]. In such models, the distributions of test statistics are those
of the maximum of stochastic Gaussian processes (or their squares). Dacunha-Castelle and Gassiat [8] give for
example a theory for the so-called locally conic models.
Thus, the calculation of threshold or power of such tests leads to the calculation of the distribution of the
maximum of Gaussian processes. This problem is largely unsolved [2].
Keywords and phrases: Asymptotic expansions, extreme values, stationary Gaussian process, Rice series, upcrossings.

This paper is dedicated to Mario Wschebor on the occasion of his 60th birthday.

1 Laboratoire de Statistique et Probabilités, UMR C55830 du CNRS, Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse Cedex 4, France.
2 Institut National de la Recherche Agronomique, Unité de Biométrie et Intelligence Artificielle, BP. 27, Chemin de Borde-Rouge, 31326 Castanet-Tolosan Cedex, France; e-mail: azais@cict.fr, cierco@toulouse.inra.fr, croquett@cict.fr
© EDP Sciences, SMAI 1999


Miroshin [13] expressed the distribution function of this maximum as the sum of a series, the so-called Rice series. Recently, Azaïs and Wschebor [3, 4] proved the convergence of this series under certain conditions and proposed a method giving the exact distribution of the maximum for a class of processes including smooth stationary Gaussian processes with real parameter.
The formula given by the Rice series is rather complicated, involving multiple integrals with complex expressions. Fortunately, for some processes, the convergence is very fast, so the present paper studies the bounds given by the first two terms, which are in some cases sufficient for applications.
We give identities that yield simpler expressions of these terms in the case of stationary processes. Generalization to other processes is possible using our techniques but will not be detailed, for shortness and simplicity. For other processes, the calculation of more than two terms of the Rice series is necessary. In such a case, the identities contained in this paper (and other similar ones) give a list of numerical tricks used by a program under construction by Croquette.
We then use Maple to derive asymptotic expansions of some terms involved in these bounds. Our bounds are shown to be sharp, and our expansions are made for a fixed time interval and a level tending to infinity. Other approaches can be found in the literature [12]. For example, Kratz and Rootzén [11] propose asymptotic expansions for the case where the length of the time interval and the level tend jointly to infinity.
We consider a real-valued centred stationary Gaussian process with continuous paths X = {X_t ; t ∈ [0, T] ⊂ R}. We are interested in the random variables
$$X^* = \sup_{t\in[0,T]} X_t \qquad\text{and}\qquad X^{**} = \sup_{t\in[0,T]} |X_t| .$$
For shortness and simplicity, we will focus attention on the variable X*; the necessary modifications for adapting our method to X** are easy to establish [5].
We denote by dF(λ) the spectral measure of the process X and by λ_p the spectral moment of order p when it exists. The spectral measure is supposed to have a finite second moment and a continuous component. This implies ([7] p. 203) that the process is differentiable in quadratic mean and that for all pairwise different time points t_1, . . . , t_n in [0, T], the joint distribution of X_{t_1}, . . . , X_{t_n}, X'_{t_1}, . . . , X'_{t_n} is non-degenerate.
For simplicity, we will assume moreover that the process admits C¹ sample paths. We will denote by r(·) the covariance function of X and, without loss of generality, we will suppose that λ_0 = r(0) = 1.
Let u be a real number; the number of upcrossings of the level u by X, denoted by U_u, is defined as follows:
$$U_u = \#\{t \in [0, T] : X_t = u,\ X'_t > 0\}.$$
For k ∈ N*, we denote by ν_k(u, T) the factorial moment of order k of U_u and by ν̃_k(u, T) the factorial moment of order k of U_u 1_{\{X_0 \le u\}}. We also define ν̄_k(u, T) = ν_k(u, T) − ν̃_k(u, T). These factorial moments can be calculated by Rice formulae. For example:
$$\nu_1(u, T) = E(U_u) = \frac{T\sqrt{\lambda_2}}{2\pi}\, e^{-u^2/2}
\qquad\text{and}\qquad
\nu_2(u, T) = E\big(U_u(U_u - 1)\big) = \int_0^T\!\!\int_0^T A_{st}(u)\, ds\, dt,$$
with A_{st}(u) = E((X'_s)^+ (X'_t)^+ | X_s = X_t = u) p_{s,t}(u, u), where (X')^+ is the positive part of X' and p_{s,t} the joint density of (X_s, X_t).
These two formulae are proved to hold under our hypotheses ([7], p. 204). See also Wschebor [21], Chapter 3, for the case of more general processes.
We will denote by φ the density of the standard Gaussian distribution. In order to have simpler expressions of rather complicated formulae, we will use the following three functions:
$$\Phi(x) = \int_{-\infty}^{x} \varphi(y)\, dy, \qquad \bar\Phi(x) = 1 - \Phi(x) \qquad\text{and}\qquad \bar\varphi(x) = \int_{0}^{x} \varphi(y)\, dy = \Phi(x) - \frac{1}{2}.$$
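As a quick sanity check of the first Rice formula above (our own illustrative sketch, not part of the paper), one can simulate the stationary process with covariance r(t) = exp(-t^2/2) used in Section 3 (so lambda_2 = -r''(0) = 1) on a fine grid of [0, 1], count upcrossings of a level u, and compare with nu_1(u, T); the grid size, jitter and number of replications below are arbitrary choices.

import numpy as np

# Monte Carlo check of nu_1(u, T) = E(U_u) = T * sqrt(lambda2) / (2*pi) * exp(-u^2/2)
# for the stationary Gaussian process with covariance r(t) = exp(-t^2/2) (lambda2 = 1) on [0, T].
rng = np.random.default_rng(0)
T, u, n_grid, n_rep = 1.0, 1.0, 300, 10000
t = np.linspace(0.0, T, n_grid)
cov = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2)
L = np.linalg.cholesky(cov + 1e-8 * np.eye(n_grid))     # small jitter for numerical stability
paths = L @ rng.standard_normal((n_grid, n_rep))        # each column is one discretized path
upcross = np.sum((paths[:-1] < u) & (paths[1:] >= u), axis=0)
rice = T * 1.0 / (2.0 * np.pi) * np.exp(-u ** 2 / 2.0)  # sqrt(lambda2) = 1 here
print("empirical E(U_u):", upcross.mean())
print("Rice formula    :", rice)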


1.2. Main inequalities


Since the pioneering works of Rice [18], the most commonly used upper bound for the distribution of the maximum is the following:
$$P(X^* > u) \le P(X_0 > u) + P(U_u > 0) \le P(X_0 > u) + E(U_u).$$
That is: $P(X^* > u) \le \bar\Phi(u) + T\,\frac{\sqrt{\lambda_2}}{2\pi}\, e^{-u^2/2}$. One can also see the works by [9, 15, 16].
We propose here a slight refinement of this inequality, but also a lower bound using the second factorial moment of the number of upcrossings. Our results are based on the following remark, which is easy to check: if ξ is a non-negative integer valued random variable, then
$$E(\xi) - \frac{1}{2}\, E\big(\xi(\xi - 1)\big) \;\le\; P(\xi > 0) \;\le\; E(\xi).$$
Noting that P-almost surely $\{X^* > u\} = \{X_0 > u\} \cup \{X_0 \le u,\ U_u > 0\}$ and that $E\big(U_u(U_u - 1)\,1_{\{X_0 \le u\}}\big) \le \nu_2$, we get:
$$P(X_0 > u) + \tilde\nu_1(u, T) - \frac{\nu_2(u, T)}{2} \;\le\; P(X^* \ge u) \;\le\; P(X_0 > u) + \tilde\nu_1(u, T), \qquad (1.1)$$
with $\tilde\nu_1(u, T) = E\big(U_u\, 1_{\{X_0 \le u\}}\big)$.
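The elementary bracketing E(xi) - (1/2) E(xi(xi-1)) <= P(xi > 0) <= E(xi) on which (1.1) rests is easy to check numerically; the toy example below (ours, with a Poisson variable chosen only for illustration) prints the three quantities side by side.

import numpy as np

# For xi ~ Poisson(lam): E(xi) = lam, E(xi*(xi-1)) = lam**2, P(xi > 0) = 1 - exp(-lam).
for lam in (0.05, 0.2, 1.0):
    lower = lam - 0.5 * lam ** 2
    p_pos = 1.0 - np.exp(-lam)
    print(f"lam={lam}:  {lower:.5f} <= {p_pos:.5f} <= {lam:.5f}")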


Using the same technique as for calculating E(U_u) and E(U_u(U_u − 1)), one gets
$$\tilde\nu_1(u, T) = \int_0^T dt \int_{-\infty}^{u} dx \int_0^{+\infty} y\; p_{0,t;t}(x, u; y)\, dy,$$
where p_{0,t;t} stands for the density of the vector (X_0, X_t, X'_t).
Azaïs and Wschebor [3, 4] have proved, under certain conditions, the convergence of the Rice series [13]
$$P(X^* \ge u) = P(X_0 > u) + \sum_{m=1}^{+\infty} (-1)^{m+1}\, \frac{\tilde\nu_m(u, T)}{m!} \qquad (1.2)$$
and the enveloping property of this series: if we set
$$S_n = P(X_0 > u) + \sum_{m=1}^{n} (-1)^{m+1}\, \frac{\tilde\nu_m(u, T)}{m!},$$
then, for all n > 0:
$$S_{2n} \;\le\; P(X^* \ge u) \;\le\; S_{2n-1}. \qquad (1.3)$$
Using relation (1.3) with n = 1 gives
$$P(X_0 > u) + \tilde\nu_1(u, T) - \frac{\tilde\nu_2(u, T)}{2} \;\le\; P(X^* \ge u) \;\le\; P(X_0 > u) + \tilde\nu_1(u, T).$$
Since ν̃2(u, T) ≤ ν2(u, T), we see that, except for this last modification which gives a simpler expression, Main inequality (1.1) is relation (1.3) with n = 1.


Remark 1.1. In order to calculate these bounds, we are interested in the quantity ν̃1(u, T). For asymptotic calculations, and to compare our results with Piterbarg's ones, we will also consider the quantity ν_k(u, T). From a numerical point of view, ν_k(u, T) and ν̃_k(u, T) are worth being distinguished because they are not of the same order of magnitude as u → +∞. In the following sections, we will work with ν̃1(u, T).

2. Some identities
First, let us introduce some notations that will be used in the rest of the paper. We set:
$$\mu(t) = E(X'_0 \mid X_0 = X_t = u) = -\frac{r'(t)}{1 + r(t)}\, u,$$
$$\sigma^2(t) = Var(X'_0 \mid X_0 = X_t = u) = \lambda_2 - \frac{r'^2(t)}{1 - r^2(t)},$$
$$\rho(t) = Cor(X'_0, X'_t \mid X_0 = X_t = u) = \frac{-r''(t)\big(1 - r^2(t)\big) - r(t)\, r'^2(t)}{\lambda_2\big(1 - r^2(t)\big) - r'^2(t)},$$
$$k(t) = \sqrt{\frac{1 + \rho(t)}{1 - \rho(t)}} \qquad\text{and}\qquad b(t) = \frac{\mu(t)}{\sigma(t)}.$$
Note that, since the spectrum of the process X admits a continuous component, |ρ(t)| ≠ 1.
In the sequel, the variable t will be omitted when it is not confusing and we will write r, r', μ, σ, ρ, k, b instead of r(t), r'(t), μ(t), σ(t), ρ(t), k(t), b(t).
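For concreteness, the quantities mu, sigma^2, rho, k and b can be evaluated directly for the covariance Gamma(t) = exp(-t^2/2) of the numerical example in Section 3, for which r' and r'' are available in closed form; the small helper below is our own sketch, with function and variable names chosen freely.

import numpy as np

lambda2 = 1.0                                            # -r''(0) for r(t) = exp(-t^2/2)
r   = lambda t: np.exp(-0.5 * t ** 2)
rp  = lambda t: -t * np.exp(-0.5 * t ** 2)               # r'(t)
rpp = lambda t: (t ** 2 - 1.0) * np.exp(-0.5 * t ** 2)   # r''(t)

def conditional_quantities(t, u):
    one_minus_r2 = 1.0 - r(t) ** 2
    mu   = -rp(t) * u / (1.0 + r(t))
    sig2 = lambda2 - rp(t) ** 2 / one_minus_r2
    rho  = (-rpp(t) * one_minus_r2 - r(t) * rp(t) ** 2) / (lambda2 * one_minus_r2 - rp(t) ** 2)
    k    = np.sqrt((1.0 + rho) / (1.0 - rho))
    b    = mu / np.sqrt(sig2)
    return mu, sig2, rho, k, b

print(conditional_quantities(0.5, 2.0))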

Proposition 2.1. (i) If (X, Y ) has a centred normal bivariate distribution with covariance matrix

1
1

then a R+
a

1
P (X > a, Y > a) = arctan

1+
(x)
2
1
0
1+
x (x) dx
1

=2

(ii) 1 (u, T ) = (u)


0
T

2 (T t)

(iii) 2 (u, T ) =
0

1 r 2
u
1+r

1
2
1 r2 (t)

1+
x
1

1r
r
u (b)
1+r
1 r2

u
1 + r(t)

dx

dt

[T1 (t) + T2 (t) + T3 (t)] dt

with:
T1 (t) = 2 (t)

1 2 (t) (b(t)) (k(t) b(t)) ,

(2.1)

T2 (t) = 2 (2 (t)(t) 2 (t))

(k(t) x) (x) dx,

(2.2)

b(t)

T3 (t) = 2 (t) (t) (k(t) b(t)) (b(t)) .

(2.3)

(iv) A second expression for T2 (t) is:


T2 (t) = ( (t)(t) (t))
2

1
arctan (k(t)) 2

b(t)

(k(t) x) (x) dx .
0

(2.4)


Remark 2.2.
1. Formula (i) is analogous to formula (2.10.4) given in Cramér and Leadbetter [7], p. 27:
$$P(X > a, Y > a) = \bar\Phi(a)\,\bar\Phi(a) + \frac{1}{2\pi}\int_0^{\rho} \frac{1}{\sqrt{1 - z^2}}\, \exp\Big(-\frac{a^2}{1 + z}\Big)\, dz.$$
Our formula is easier to prove and is more adapted to numerical application because, when t → 0, |ρ(t)| → 1 and the integrand in Cramér and Leadbetter's formula tends to infinity.
2. Utility of these formulae: they permit a computation of Main inequality (1.1) at the cost of a double integral with finite bounds. This is a notable reduction of complexity with respect to the original form. The form (2.4) is more adapted to effective computation, because it involves an integral on a bounded interval; this method has been implemented in an S+ program that needs about one second of CPU time to run an example. It has been applied to a genetical problem in Cierco and Azaïs [6].
The form (iii) has some consequences both for numerical and theoretical purposes. The calculation of ν2(u, T) yields some numerical difficulties around t = 0: the sum of the three terms is infinitely small with respect to each term. To discard the diagonal from the computation, we use formula (iii) and Maple to calculate the equivalent of the integrand in the neighbourhood of t = 0 at fixed u.
Recall that we have set $\nu_2(u, T) = \int_0^T\!\!\int_0^T A_{st}(u)\, ds\, dt$. The following proposition gives the Taylor expansion of A at zero.

Proposition 2.3. Assume that 8 is finite. Then, as t 0:


3/2

At (u) =

1
(2 6 4 )
1 4
exp
u2
1296 (4 2 )1/2 2 2
2 4 22
2
2

t4 + O(t5 ).

Piterbarg [17] or Wschebor [21] proved that At (u) = O ( (u(1 + ))) for some 0. Our result is more precise.
Our formulae give some asymptotic expansions as u + for 1 (u, T ) and 2 (u, T ) for small T .
Proposition 2.4. Assume that 8 is finite. Then, there exists a value T0 such that, for every T < T0
11/2

4 22
27
1 (u, T ) =

4 5 (2 6 2 )3/2
2
4

4
u
4 22

u6

1+O

1
u

9/2
4 22
3 3T
2 (u, T ) =

9/2 (2 6 2 )
2
4

4
u
4 22

u5

1+O

1
u

as u +.

3. A numerical example
In the following example, we show how the upper and lower bounds (1.1) permit to evaluate the distribution of X* with an error less than 10⁻⁴.
We consider the centered stationary Gaussian process with covariance Γ(t) := exp(−t²/2) on the interval I = [0, 1], and the levels u = −3, −2.5, . . . , 3. The term P(X0 ≤ u) is evaluated by the S-plus function pnorm, ν̃1 and ν2 using Proposition 2.1 and the Simpson method. Though it is rather difficult to assess the exact precision of these evaluations, it is clear that it is considerably smaller than 10⁻⁴. So, the main source of error


is due to the difference between the upper and lower bounds in (1.1).
  u      P(X0 <= u)   nu1~      nu2       lower bound   upper bound
 -3      0.00135      0.00121   0         0.00014       0.00014
 -2.5    0.00621      0.00518   0         0.00103       0.00103
 -2      0.02275      0.01719   0         0.00556       0.00556
 -1.5    0.06681      0.04396   0.00001   0.02285       0.02285
 -1      0.15866      0.08652   0.00002   0.07213       0.07214
 -0.5    0.30854      0.13101   0.00004   0.17753       0.17755
  0      0.50000      0.15272   0.00005   0.34728       0.34731
  0.5    0.69146      0.13731   0.00004   0.55415       0.55417
  1      0.84134      0.09544   0.00002   0.74591       0.74592
  1.5    0.93319      0.05140   0.00001   0.88179       0.88180
  2      0.97725      0.02149   0         0.95576       0.95576
  2.5    0.99379      0.00699   0         0.98680       0.98680
  3      0.99865      0.00177   0         0.99688       0.99688

The calculation demands 14 s on a Pentium 100 MHz.


The corresponding program is available sending an e-mail to croquett@cict.fr.
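The figures in the table can be compared with a crude Monte Carlo experiment; the sketch below (ours, not the authors' S+ program) simulates the process with covariance exp(-t^2/2) on a grid of [0, 1] and contrasts the empirical distribution of the maximum with the classical one-term lower bound 1 - [Phi_bar(u) + T*sqrt(lambda2)/(2*pi)*exp(-u^2/2)] for P(X* <= u); grid discretization slightly inflates the Monte Carlo values.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
T, n_grid, n_rep = 1.0, 300, 20000
t = np.linspace(0.0, T, n_grid)
cov = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2)
L = np.linalg.cholesky(cov + 1e-8 * np.eye(n_grid))
maxima = (L @ rng.standard_normal((n_grid, n_rep))).max(axis=0)

for u in (0.0, 1.0, 2.0, 3.0):
    mc = np.mean(maxima <= u)
    one_term = 1.0 - (norm.sf(u) + T * 1.0 / (2.0 * np.pi) * np.exp(-u ** 2 / 2.0))  # lambda2 = 1
    print(f"u={u}:  Monte Carlo {mc:.5f}   one-term lower bound {one_term:.5f}")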

4. Proofs
Proof of Proposition 2.1
Proof of point (i). We first search P (X > a, Y > a).
Put = cos(), [0, [, and use the orthogonal decomposition Y = X +
a X
Then {Y > a} = Z >
. Thus:
1 2
+

P (X > a, Y > a) =

a x

(x)

(x)(z) dx dz,

dx =

1 2

1 2 Z.

1
where D is the domain located between the two half straight lines starting from the point a, a
1+

with angle and .


2
2

Using a symmetry with respect to the straight line with angle passing through the origin, we get:
2
+

P (X > a, Y > a) = 2

(x)
a

1
x
1+

dx.

(4.1)

Now,
P (X > a, Y > a) = (a) P (X > a, Y < a) = (a) P (X > a, (Y ) > a) .
Applying relation (4.1) to (X, Y ) yields
+

P (X > a, Y > a) = (a) 2

(x)
a

1+
x
1

dx = 2

and

1+
x
1

(x) dx.


Now, using polar coordinates, it is easy to establish that


+

(k x) (x) dx =
0

1
arctan(k)
2

which yields the first expression.
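Reading the display above as int_0^inf Phi_bar(k x) phi(x) dx = 1/4 - arctan(k)/(2*pi) for k > 0 (equivalently arctan(1/k)/(2*pi)), the identity is easy to confirm numerically; the check below is our own.

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

for k in (0.3, 1.0, 2.5):
    lhs, _ = quad(lambda x: norm.sf(k * x) * norm.pdf(x), 0.0, np.inf)
    rhs = 0.25 - np.arctan(k) / (2.0 * np.pi)
    print(f"k={k}:  quadrature {lhs:.8f}   closed form {rhs:.8f}")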


Proof of point (ii). Conditionally on (X0 = x, Xt = u), X't is Gaussian with
mean $m(t) = \frac{r'(t)\,(x - r(t)u)}{1 - r^2(t)}$ and variance σ²(t) already defined.
It is easy to check that, if Z is a Gaussian random variable with mean m and variance σ², then
$$E(Z^+) = m\,\Phi\Big(\frac{m}{\sigma}\Big) + \sigma\,\varphi\Big(\frac{m}{\sigma}\Big).$$
These two remarks yield ν̃1(u, T) = I1 + I2, with:
These two remarks yield 1 (u, T ) = I1 + I2 , with:


T
+
r (x r u)
I1 =
dt

p0,t (x, u) dx
(1 r2 )
0
u
T
+
r (x r u)
r (x r u)
I2 =
dt

p0,t (x, u) dx.


2)
(1

r
(1 r2 )
0
u
T

I1 can be written under the following form: I1 = (u)


0

parts leads to

I2 = (u)
0

Finally, noticing that 2 +

1 (u, T ) =

22

2
(u)
2

r
1r

u (b)

2
1r
1+r

2 1 r
r2

u
+

1+r
22 (1 r2 )

1r
u
1+r

dt. Integrating I2 by

dt.

r2
= 2 , we obtain:
1 r2
T

1r
u
1+r

dt + (u)
0

1 r2

1r
u
1+r

(b) dt.

Proof of point (iii). We set:


(x b)2 2(x b)(y + b) + (y + b)2
v(x, y) =
2(1 2 )
for (i, j) {(0, 0); (1, 0); (0, 1); (1, 1); (2, 0); (0, 2)}
+

Jij =
0

xi y j
2

1 2

exp (v(x, y)) dydx.

We first calculate the values of Jij . The following relation is clear


+

J10 J01 (1 + )bJ00

1 2
0

1 2 (k b) (b).

exp (v(x, y))


v(x, y)
dx
x
2 1 2

dy
(4.2)


Symmetrically, replacing x with y and b with b in (4.2) yields


J01 J10 + (1 + )bJ00 = 1 2 (k b) (b).

(4.3)

In the same way, multiplying the integrand by y, we get


J11 J02 (1 + ) b J01 = 1 2

3/2

(k b) k b (k b) (b).

(4.4)

[ (k b) + k b (k b)] (b).

(4.5)

And then, multiplying the integrand by x leads to


J11 J20 + (1 + ) b J10 = 1 2
+

Finally, J20 J11 (1 + ) b J10 = (1 2 )

x
0

parts

3/2

exp (v(x, y))

v(x, y)
dx dy. Then, integrating by
x
2 1 2

J20 J11 (1 + ) b J10 = (1 2 ) J00 .

(4.6)

Multiplying equation (4.6) by and adding (4.5) gives:


J11 = b J10 + J00 +

1 2 [ (k b) + k b (k b)] (b).

Multiplying equation (4.3) by and adding equation (4.2) yields:


J10 = b J00 + (k b) + (k b) (b).
+

And, by formula (i), J00 = 2

(k x) (x) dx. Finally, gathering the pieces, it comes:


b

J11 = J11 (b, ) =

1 2 2

b
1

(b) + 2 b2

(k x) (x) dx + 2 b (k b) (b).
b

The final result is obtained remarking that


+

E (X0 ) (Xt ) |X0 = Xt = u = 2 (t) J11 (b(t), (t)) .


Proof of point (iv). Expression (2.4) is obtained simply using the second expression of J00 .
Note 4.1. In the following proofs, some expansions are made as t → 0, some as u → +∞ and some as (t, u) → (0, +∞).
We define the uniform Landau symbol O_U as a(t, u) = O_U(b(t, u)) if there exist T0 and u0 such that, for t < T0 < T and u > u0,
a(t, u) ≤ (const) b(t, u).
We also define the symbol ≈ as: a(t, u) ≈ b(t, u) if and only if a(t, u) = O_U(b(t, u)) and b(t, u) = O_U(a(t, u)).
Note 4.2. Many results of this section are based on tedious Taylor expansions. These expansions have been made or checked by a computer algebra system (Maple). They are not detailed in the proofs.


1 + (t)
= O(t) is small,
1 (t)

Proof of Proposition 2.3. Use form (iii) and remark that, when t is small, k(t) =
1
and, since () =
2

3
6

+ O 5 as 0, we get:
b(t)

b(t)

k(t)
arctan(k(t))
k 3 (t)

x(x)dx +
x3 (x)dx + O(t5 )
2
2 0
6 2 0

1
k(t)
2 arctan(k(t)) 2 ((0) (b(t)))
+ O(t5 ).
= 2 2 (t)(t) 2 (t)

k 3 (t)
2
+
2(0) b (t) + 2 (b(t))
6 2
In the same way:
2(t)(t)
k 3 (t) 3
b (t) + O(t5 ).
T3 (t) =
(b(t)) k(t)b(t)
6
2
And then, assuming 8 finite, use Maple to get the result.
T2 (t) = 2 2 (t)(t) 2 (t)

Proof of Proposition 2.4. We first prove the following two lemmas.


Lemma 4.3. Let l be a real positive function of class C 2 satisfying l(t) = ct + O(t2 ) as t 0, c > 0. Suppose
that 8 is finite, with the above definitions of k(t) and b(t), we have as u +:
arctan

(p+1)

(i) Ip =

t (k(t) b(t)) (l(t) u) dt = (c u)


0

1 Mp+1

2 2

( dc )
p

(cos ) d 1 + O
0

1
u

22 6 2 24
p+1
and Mp+1 = E |Z|
where Z is a standard Gaussian random variable.
4 22
T
Mp
1
(ii) Jp =
tp (l(t) u) dt = (c u)(p+1)
1+O

2
u
0

with d =

1
6

Proof of Lemma 4.3. Since the derivative of l at zero is non zero, l is invertible in some neighbourghood of zero
1
1
and its inverse l1 satisfies l1 (t) = t + O(t2 ), l1 (t) = + O(t).
c
c
We first consider Ip and use the change of variable y = l(t)u, then
l(T )u

Ip =

y
u

l1

(kb) l1

y
u

(y) l1

y
u

dy

From the expressions of k(t) and b(t), we know that


(kb)(t) =
Thus (kb) l1

y
d
= y + u OU
u
c

1
6
y2
u2

and

l(T )u

(p+1)

yp

Ip = (c u)

We use the following lemma.

22 6 2 24
t u + u O(t3 ) = d u t + u O(t3 ).
4 22

d
y + u OU
c

y2
u2

(y) 1 + OU

y
u

dy.


Lemma 4.4. Let h be a real function such that h(t) = O t2 as t


0, then there exists T0 such that for
0 t T0
(u(t + h(t))) = (t u) [1 + OU (t)] .
Proof of Lemma 4.4. Taking T0 sufficiently small, we can assume that h(t)
A = | (u(t + h(t))) (t u)| u |h(t)|

tu
2

t
. Then
2

(const) u t2

tu
2

We want to prove that, in every case,


A (const) t (t u)

(4.7)

when tu 1, (t u) tu(1) and A (const) u t2 (0), thus (4.7) holds.


ut
when tu > 1, (t u) > (1) and A (const) t2 u
and (4.7) holds again.
2
End of proof of Lemma 4.3.
Due to Lemma 4.4,
l(T )u

(p+1)

yp

Ip = (c u)

0
l(T )u

yp

Put Kp (u) =
0

d
y
c

d
y
c

(y) 1 + OU

y
u

dy.

(4.8)

(y) dy. It is easy to see that, when u +,


+

yp

Kp (u) =
0

d
y
c

(y) dy + O un for every integer n > 0.


d

+
c y yp
y2 + z 2
d
y (y) dy =
exp
dz dy. Then, using polar coorc
2
2
0
0
0
d
1 Mp+1 arctan( c )
p
dinates, we derive that Kp () =
(cos ) d. So we can see that the contribution of the
2 2
0
y
term OU
in formula (4.8) is O u(p+2) which gives the desired result for Ip .
u

Moreover, Kp () =

yp

The same kind of proof gives the expression of Jp .


Proof of the equivalent of 1 (u, T ). We set
A1 (t) = (u)

2 (1 r)
u
2 (1 + r)

1r
u
1+r

r
(b)
1 r2

Then, 1 (u, T ) =

A1 (t) dt.
0

It is well known ([1], p. 932) that, as z tends to infinity,


(z) = (z)

1
1
3
3 + 5 + O(z 7 ) .
z
z
z

(4.9)


2 (1 r(t))
u for the first term and z = b(t)
2 (t)(1 + r(t))

We use this expansion for both terms of 1 (u, T ), with z =


for the second one.
Besides, remarking that

2 (1 r)
u
2 (1 + r)

1r
u
1+r

(b) ,

we get:

2 (1 + r) 1

2 (1 r) u

2 (1 r)
2 (1 + r)

u
+ OU

2
(1 + r)

2 (1 r)

1
r
1
+
3 + OU
2
b
1r b

(u)
A1 (t) =
2

3/2

2 (1 + r)
2 (1 r)
5/2

1
u5

1
b5

u3


From Taylor expansion made by Maple assuming 8 finite, we know that:


5/2

4
2
exp 2(u4
2)
1 4 2
2
t2 + O(t4 ).
A1 (t) =

7/2
8
u3 2
2

To use Lemma 4.3 point (ii) to calculate 1 (u, T ), it is necessary to have a Taylor expansion of the coefficient
22
2 (1 r)
2 (1 r(t))
of u in
u
.
We
have
lim
=
, therefore, we set:
t0 2 (t)(1 + r(t))
2 (1 + r)
4 22
2 (1 r)
22
.

2 (1 + r)
4 22

l(t) =

From Taylor expansion made by Maple assuming 8 finite, we get


1
l(t) =
6

2 (2 6 24 )
t + O(t2 ).
4 22

And, according to Lemma 4.3 point (ii),


T

1
t (l(t) u) dt =
2

1
6

Finally, remarking that (u)

2
4

22

2 (2 6 24 )
u
4 22

1
=
2
11/2

4 22
27
1 (u, T ) =

4 5 (2 6 2 )3/2
2
4

1+O

1
u

4
u , we get the equivalent for 1 (u, T ).
4 22
4
u
4 22

u6

1+O

1
u


Proof of the equivalent of 2 (u, T ). Remember that


T

2 (T t)

2 (u, T ) =
0

1
2
1 r2 (t)

u
1 + r(t)

[T1 (t) + T2 (t) + T3 (t)] dt.

(4.10)

We first calculate an expansion of term T2 = 2 2 ( b2 )

(x) (k x) dx.

b
The function x x2 1 (x) being bounded, we have
(kx) = (k b) + k (k b) (x b)

1 3
2
3
k b (k b) (x b) + OU k 3 (x b) ,
2

(4.11)

where the Landaus symbol has here the same meaning as in Lemma 4.3.
Moreover, using the expansion of given in formula (4.9), it is easy to check that as z +,
+

(z)
(z)
(z)
3 4 +O
2
z
z
z6
z
+
(z)
(z)
2
(x z) (x) dx = 2 3 + O
z
z5
z
+
(z)
3
(x z) (x) dx = O
.
z4
z
(x z) (x) dx =

Therefore, multiplying formula (4.11) by (x), integrating on [b; +[ and applying formula (4.9) once again
yield:

3
1 k2
3
1
1

+
+ k (k b) (b)
4

(k b) (b)

b b3 b5
b2
b

(k b) (b)
k
2
2
T2 = 2 b
+O
(k
b)
(b)

+
O

b7
b6

3
3

k
k

(k b) (b) + O
(b)

+O
b4
b4

Note that the penultimate term can be forgotten. Then, remarking that, as u +, b =
u, t and

k t, we obtain:
T2

2
2
= 2 2 b (k b) (b) + 2
(k b) (b) + 2
(k b) (b)
b2
b
2

2 3 (k b) (b) 6 3 (k b) (b) + 2 2 k 3 (k b) (b)


b
b
2 k
2 k
2 2 k (k b) (b) + 2
(k b) (b) + 6 2 (k b) (b)
2
b
b
+ OU t2 u5 (k b) (b) + OU t3 u4 (k b) (b) + OU t5 u2 (b)

Remark 4.5. As it will be seen later on, Lemma 4.3 shows that the contribution of the remainder to the
1
integral (4.10) can be neglected since the degrees in t and of each term are greater than 5. So, in the sequel,
u
we will denote the sum of these terms (and other terms that will appear later) by Remainder and we set:
T2 = U1 + U2 + U3 + U4 + U5 + U6 + U7 + U8 + U9 + Remainder.


Now, we have

U1 + T 3 = 0
1 2 2 k = (1 + ) k so that U7 + T1 = (1 + ) 2 k (k b) (b)
2
U2 + U3 = 2
(1 + ) (k b) (b)
b
2

U4 + U5 = 4 3 (k b) (b) 1 + O t2
b
2
U8 + U9 = 4 2 k (k b) (b) 1 + O t2
b
since = 1 + O t2 .

By the same remark as Remark 4.5 above, the term O t2 can be neglected. Consequently,

T1 + T2 + T3

= 2

2
2
(1 + ) (k b) (b) 4 3 (k b) (b)
b
b

(1 + ) 2 k (k b) (b) + 2 2 k 3 (k b) (b) + 4

2
k (k b) (b)
b2

+ Remainder.

Therefore, we are leaded to use Lemma 4.3 in order to calculate the following integrals:

(T t)

0
T

(T t)
0
T

(T t)
0
T

(T t)
0

u
2u
(kb) (b) dt = (T t) m1 (t) (kb) b2 +
dt
1+r
1
+r
0

2
2u
m2 (t) (k b) b2 +
dt
1+r

2
2u
dt
m3 (t) b2 (1 + k 2 ) +
1+r

2
2u
dt
m4 (t) b2 (1 + k 2 ) +
1+r

2
2u
dt
m5 (t) b2 (1 + k 2 ) +
1+r

(T t) m1 (t) exp
0


with:
m1 (t)

=
=

2
2
1
(t) (1 + (t))
1 r2 (t) b
4 22 3
1 2 6 24
t + O t5
5/2
36
u
2

m2 (t)
m3 (t)

=
=

m4 (t)

=
=

m5 (t)

=
=

5/2

4 22
2
1
(t)
=

t + O t3
7/2
1 r2 (t) b3
u3 2
1
1

(1 + (t)) 2 (t) k(t)
2 1 r2 (t)

3/2
2 2 6 24

t4 + O t6
864 22 4 22 3/2
2
1

2 (t) k 3 (t)
2 1 r2 (t)
3/2
2 4
1 2 6 24
t + O t6
864 22 4 22 3/2
4
1
( 1998).2

(t) k(t)
b2
2 1 r2 (t)
3/2
2 6 24 4 22
2 2
1
t + O t4 .
12
32 3/2 u2

4
=

Lemma 4.3 shows that we can neglect the terms issued from the t part of the factor T t in formula (4.10).

We now consider the argument of in Lemma 4.3. We have:


2
b2
4

lim 2 +
=
t0 u
1+r
4 22
b2
2
4

lim 2 1 + k 2 +
=

t0 u
1+r
4 22
Therefore, we set:
2 2 6 24

l1 (t)

b2 (t)
2
4
+

=
2
u
1 + r(t)
4 22

l2 (t)

b2 (t)
2
4
1 + k 2 (t) +

u2
1+r
4 22

2 2 6 24
2

12 (4 22 )

t + O t3

5/2

18 (4 22 )

t + O t3 .

Then, with the notations of Lemma 4.3, we obtain:

2 = T exp

4 u2
2 (4 22 )

4 22
4 22
1 2 6 24

3
5/2
7/2
36
2 u
u3 2

3/2

2 6 24 4 22
2
1
+
J2
12
32 3/2 u2

I1

1+O

1
u


Where I1 and I3 (resp. J2 ) are defined as in Lemma


4.3 point (i) (resp. (ii) ) with l(t) = l1 (t) (resp. l(t) = l2 (t)).

2
2

2
2
8 3
3
3
(cos ) d =
and that
cos d =
, we find
Noting that
27
3
0
0

4
144 3 4 22
1
I3 =
u4 1 + O
2
u
2 22 (2 6 24 )

2 2
3 3 4 2
1
I1 =
u2 1 + O
2
u
2 2 (2 6 4 )

3
12 3 4 22
1
J2 =
u3 1 + O

2
2
u
2 (2 6 4 ) 2 (2 6 4 )
Finally, gathering the pieces, we obtain the desired expression of 2 .

5. Discussion
Using the general relation (1.3) with n = 1, we get
P X u P (X0 > u) 1 (u, T ) +

2 (u, T ) 3 (u, T )
2 (u, T )

2
2
6

A conjecture is that the orders of magnitude of 2 (u, T ) and 3 (u, T ) are considerably smaller than those of
1 (u, T ) and 2 (u, T ). Admitting this conjecture, Proposition 2.4 implies that for T small enough

9/2
4 22
T 2
3 3T
P X u = (u) +
(u)

2 9/2 (2 6 2 )
2
4
2

4
u
4 22

u5

1+O

1
u

which is Piterbarg's theorem with a better remainder ([15], Th. 3.1, p. 703). Piterbarg's theorem is, as far as we know, the most precise expansion of the distribution of the maximum of smooth Gaussian processes. Moreover, very tedious calculations would give extra terms of the Taylor expansion.

References
[1] M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables. Dover, New York (1972).
[2] R.J. Adler, An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes. IMS, Hayward, CA (1990).
[3] J.-M. Azaïs and M. Wschebor, Une formule pour calculer la distribution du maximum d'un processus stochastique. C.R. Acad. Sci. Paris Ser. I Math. 324 (1997) 225-230.
[4] J.-M. Azaïs and M. Wschebor, The Distribution of the Maximum of a Stochastic Process and the Rice Method, submitted.
[5] C. Cierco, Problèmes statistiques liés à la détection et à la localisation d'un gène à effet quantitatif. PhD dissertation, University of Toulouse, France (1996).
[6] C. Cierco and J.-M. Azaïs, Testing for Quantitative Gene Detection in Dense Map, submitted.
[7] H. Cramér and M.R. Leadbetter, Stationary and Related Stochastic Processes. J. Wiley & Sons, New York (1967).
[8] D. Dacunha-Castelle and E. Gassiat, Testing in locally conic models, and application to mixture models. ESAIM: Probab. Statist. 1 (1997) 285-317.
[9] R.B. Davies, Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64 (1977) 247-254.
[10] J. Ghosh and P. Sen, On the asymptotic performance of the log-likelihood ratio statistic for the mixture model and related results, in Proc. of the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer, Le Cam L.M. and Olshen R.A., Eds. (1985).
[11] M.F. Kratz and H. Rootzén, On the rate of convergence for extremes of mean square differentiable stationary normal processes. J. Appl. Prob. 34 (1997) 908-923.
[12] M.R. Leadbetter, G. Lindgren and H. Rootzén, Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York (1983).
[13] R.N. Miroshin, Rice series in the theory of random functions. Vestnik Leningrad Univ. Math. 1 (1974) 143-155.
[14] M.B. Monagan et al., Maple V Programming Guide. Springer (1998).
[15] V.I. Piterbarg, Comparison of distribution functions of maxima of Gaussian processes. Theory Probab. Appl. 26 (1981) 687-705.
[16] V.I. Piterbarg, Large deviations of random processes close to Gaussian ones. Theory Probab. Appl. 27 (1982) 504-524.
[17] V.I. Piterbarg, Asymptotic Methods in the Theory of Gaussian Processes and Fields. American Mathematical Society, Providence, Rhode Island (1996).
[18] S.O. Rice, Mathematical Analysis of Random Noise. Bell System Tech. J. 23 (1944) 282-332; 24 (1945) 45-156.
[19] SPLUS, Statistical Sciences, S-Plus Programmer's Manual, Version 3.2. Seattle: StatSci, a division of MathSoft, Inc. (1993).
[20] J. Sun, Significance levels in exploratory projection pursuit. Biometrika 78 (1991) 759-769.
[21] M. Wschebor, Surfaces aléatoires. Mesure géométrique des ensembles de niveau. Springer-Verlag, New York, Lecture Notes in Mathematics 1147 (1985).


TAYLOR EXPANSIONS BY MAPLE

GENERAL FORMULAE
>

phi:=t->exp(-t*t/2)/sqrt(2*pi);
2

e(1/2 t )

2
We introduce mu4=lambda4-lambda22 and mu6= lambda2*lambda6-lambda4^2
to make the outputs clearer.
>
assume(t>0);
>
assume(lambda2 > 0);
>
assume(mu4 > 0);
>
assume(mu6>0);
>
interface(showassumed=2);
>
Order:=12;
:= t

>

Order := 12
r:=t->1-lambda2*t^2/2!+lambda4*t^4/4!-lambda6*t^6/6!+lambda8*t^8/8!;
1
1
1
1
2 t2 +
4 t4
6 t6 +
8 t8
2
24
720
40320
siderels:= {lambda4=mu4+lambda2^2,lambda2*lambda6-lambda4^2=mu6}:
I_r2:=t->1-r(t)*r(t);
r := t 1

>
>

I r2 := t 1 r(t)2
>

simplify(simplify(series(I_r2(t),t=0,8),siderels));

>

1
1
1
1
1
2 t2 + ( 22
4) t4 + (
6 +
2 4 +
23 ) t6 + O(t8 )
3
12
360
24
24
with assumptions on t, 2 and 4
rp:=t->diff(r(t),t);
rp := t diff(r(t), t)

>

eval(rp(t));
1
1
1
4 t3
6 t5 +
8 t7
6
120
5040
with assumptions on 2 and t

2 t +
>

rs:=t->diff(r(t),t$2);
rs := t

>

2
r(t)
t2

eval(rs(t));
1
1
1
4 t2
6 t4 +
8 t6
2
24
720
with assumptions on 2 and t

2 +



>

mu:=t->-u*rp(t)/(1+r(t));
:= t

>

u rp(t)
1 + r(t)

sig2:=t->lambda2-rp(t)*rp(t)/I_r2(t);
sig2 := t 2

>

rp(t)2
I r2(t)

simplify(taylor(sig2(t),t=0,8),siderels);
1
1 6 22 4 3 42 2 6 4
4 t2 +
t + O(t6 )
4
144
2
with assumptions on t, 4, 2 and 6

>

sigma:=t->sqrt(sig2(t));
:= t

>

simplify(taylor(sigma(t),t=0,6),siderels);
1
2

>

sig2(t)

1 6 22 4 3 42 2 6 3

t + O(t5 )
144
4 2
with assumptions on t, 4, 2 and 6

4 t +

b:=t->mu(t)/sigma(t);
b := t

>

(t)
(t)

simplify(taylor(b(t),t=0,6),siderels);
u 2
1
1 u 6
+ ( u 4 +
) t2 + O(t4 )
8
36 4(3/2)
4
with assumptions on 2, 4, t and 6

>

sig2rho:=t->-rs(t)-r(t)*rp(t)*rp(t)/I_r2(t);
sig2rho := t rs(t)

>

r(t) rp(t)2
I r2(t)

simplify(taylor(sig2rho(t),t=0,8),siderels);
1
1 6 22 4 + 3 42 + 4 6 4
4 t2 +
t + O(t6 )
4
144
2
with assumptions on t, 4, 2 and 6

>

rho:=t->sig2rho(t)/sig2(t);
:= t

>

sig2rho(t)
sig2(t)

simplify(taylor(rho(t),t=0,8),siderels);
1 6 2
t + O(t4 )
18 2 4
with assumptions on t, 6, 2 and 4

1 +


PROOF OF PROPOSITION 2.3


>

k2:=t->(1+rho(t))/(1-rho(t));
k2 := t

>

1 + (t)
1 (t)

sk2:=simplify(taylor(k2(t),t=0),siderels);
1
1 6 2
t +
(3 26 4 + 9 24 42 + 9 22 43 2 6 22 4 3 8 22 4
36 2 4
2160
1
+ 3 44 + 13 6 42 + 5 62 ) (22 42 )t4 +
(147 28 42
907200
+ 175 6 26 4 273 26 43 + 63 24 44 + 196 6 24 42 + 120 8 24 42
+ 357 22 45 + 707 6 22 43 195 8 22 43 175 8 22 6 4 + 168 46

sk2 :=

+ 518 62 42 + 686 6 44 + 175 63) (23 43 )t6 + O(t8 )


with assumptions on t, 6, 2 and 4
>

k:=t->taylor(sqrt(sk2),t=0);

k := t taylor( sk2 , t = 0)

>

simplify(taylor(k(t),t=0,3),siderels);
1
6

>

6
t + O(t3 )
2 4
with assumptions on t, 6, 2 and 4

sqrtI_rho2:=t->k(t)*(1-rho(t));
sqrtI rho2 := t k(t) (1 (t))

>

T1:=t->sig2(t)*sqrtI_rho2(t)*phi(b(t))*phi(k(t)*b(t));
T1 := t sig2(t) sqrtI rho2(t) (b(t)) (k(t) b(t))

>

simplify(simplify(series(T1(t),t=0,6),siderels),power);
1
24

u2 22
6 4 e(1/2 4 ) 3
1

t
((5 62 22 u2 + 3 22 42 8 3 26 42 9 24 43
2880
2
9 22 44 15 6 22 42 u2 18 6 22 42 3 45 + 5 62 4 3 6 43 )

e(1/2

u2 22
4

) ( 6 4(3/2) 2(3/2) )t5 + O(t7 )

with assumptions on t, 6, 4 and 2


>
>

T2 := t->2*sig2(t)*(rho(t)-(b(t))^2)*(arctan(k(t))/(2*pi)
-k(t)/sqrt(2*pi)*(phi(0)-phi(b(t))-k(t)^2/6*(2*phi(0)-((b(t))^2+2)*phi(b(t)))));
T2 := t 2sig2(t) ((t) b(t)2 )

1
2
2
k(t)
((0)

(b(t))

k(t)
(2
(0)

(b(t)
+
2)
(b(t))))

1 arctan(k(t))
6



>

simplify(simplify(series(T2(t),t=0,6),siderels),power);
1

24

u2 22
6 (u2 22 + 4) e(1/2 4 ) 3
t + O(t5 )

4 2
with assumptions on t, 6, 2 and 4

>

T3:=t->(2*sig2(t)*(k(t)*b(t)^2))/sqrt(2*pi)*(1-(k(t)*b(t))^2/6)*phi(b(t));

>

1
sig2(t) k(t) b(t)2 (1 k(t)2 b(t)2 ) (b(t))
6
T3 := t 2
2
simplify(simplify(series(T3(t),t=0,6),siderels),power);
u2 22
1
1 e(1/2 4 ) 6 2(3/2) u2 3

t
2 u2 (27 8 22 42 + 35 62 22 u2
24
25920
4
27 26 42 81 24 43 81 22 44 162 6 22 42 135 6 22 42 u2

27 45 45 62 4 + 243 6 43)e(1/2

u2 22
4

( 6 4(5/2) )t5 + O(t7 )

with assumptions on t, 2, 4 and 6


>

A:=t->((phi(u/sqrt((1+r(t)))))^2/sqrt(I_r2(t)))*(T1(t)+T2(t)+T3(t));
(
A := t

>

u
)2 (T1(t) + T2(t) + T3(t))
1 + r(t)
I r2(t)

simplify(simplify(series(A(t),t=0,6),siderels),power);
O(t4 )
with assumptions on t

PROOF OF THE EQUIVALENT OF NU1


>

Cphib:=t->phi(t)/t-phi(t)/t^3;
Cphib := t

>

sq:=t->sqrt((1-r(t))/(1+r(t)));
sq := t

>

(t) (t)
3
t
t

1 r(t)
1 + r(t)

simplify(simplify(series(sq(t),t=0,4),siderels),power);
1 2 22 + 4 3
1

2 t
t + O(t5 )
2
48
2
with assumptions on t, 2 and 4

>

nsigma:=t->sigma(t)/sqrt(lambda2);
(t)
nsigma := t
2



>
>

A1:=t->(1/sqrt(2*pi))*phi(u)*phi(sq(t)*u/nsigma(t))*((nsigma(t)/(sq(t)*u)
-(nsigma(t)/(sq(t)*u))^3)*sqrt(lambda2)+(1/b(t)-1/b(t)^3)*rp(t)/sqrt(I_r2(t)));

1
1
(

)
rp(t)
3
nsigma(t) nsigma(t)

sq(t) u
b(t) b(t)3

(u) (
) 2 +
)
(

3
3
nsigma(t)
sq(t) u
sq(t) u
I r2(t)

2
SA1:=simplify(simplify(series(A1(t),t=0,6),siderels),power);
A1 := t

>

u2 (4+22 )
)
4
1 2 e(1/2
4(5/2) 2
SA1 :=
t + O(t4 )
16
2(7/2) (3/2) u3
with assumptions on t, 4 and 2

Expansion of the exponent for using Lemma 4.3 (ii), p=2


>

L2:= t->(1-r(t))/((1+r(t))*nsigma(t)^2)-(lambda4-mu4)/mu4;

>

4 4
1 r(t)

(1 + r(t)) nsigma(t)2
4
SL2:=simplify(simplify(series(L2(t),t=0,6),siderels),power);
L2 := t

1 2 6 2
t + O(t4 )
18 42
with assumptions on t, 2, 6 and 4
We define c as the square root of the coefficient of t2
c:=sqrt(op(1,SL2))

1 2 2 6
c :=
6
4
with assumptions on 2, 6 and 4
>
nu1b:=(sqrt(2*pi))*op(1,SA1)*(c^(-3)*u^(-3)/2);
SL2 :=

u2 (4+22 )
)
4
27 2 e(1/2
4(11/2)
nu1b :=
8
2(7/2) u6 (2 6)(3/2)
with assumptions on 4, 2 and 6
PROOF OF THE EQUIVALENT OF NU2
>

m1:=t->(1+rho(t))*2*sigma(t)^2/(pi*b(t)*sqrt(I_r2(t)));
m1 := t 2

>

(1 + (t)) (t)2
b(t) I r2(t)

sm1:=simplify(simplify(series(m1(t),t=0,8),siderels),power);

1 6 4 3
sm1 :=
t + O(t5 )
36 2(5/2) u
with assumptions on t, 6, 4 and 2



>

m2:=t->(-4/pi)*sigma(t)^2*b(t)^(-3)/sqrt(I_r2(t));

>

(t)2
b(t)3 I r2(t)
sm2:=simplify(simplify(series(m2(t),t=0,6),siderels),power);

>

4(5/2)
t + O(t3 )
u3 2(7/2)
with assumptions on t, 4 and 2
m3:=t->-(1+rho(t))*sigma(t)^2*k(t)/(pi*sqrt((2*pi)*I_r2(t)));

m2 := t 4

sm2 :=

(1 + (t)) (t)2 k(t)


2 I r2(t)
sm3:=simplify(simplify(series(m3(t),t=0,6),siderels),power);

1
6(3/2) 2

sm3 :=
t4 + O(t6 )
864 22 4 (3/2)
with assumptions on t, 6, 2 and 4
m4:=t->(2/pi)*sigma(t)^2*k(t)^3/sqrt(2*pi*I_r2(t));
m3 := t

>

>

m4 := t 2

>

>

(t)2 k(t)3

2 I r2(t)
sm4:=simplify(simplify(series(m4(t),t=0,6),siderels),power);

1
6(3/2) 2

sm4 :=
t4 + O(t6 )
864 22 4 (3/2)
with assumptions on t, 6, 2 and 4
m5:=t->(4/pi)*sigma(t)^2*k(t)*b(t)^(-2)/sqrt(2*pi*I_r2(t));

>

(t)2 k(t)
b(t)2 2 I r2(t)
sm5:=simplify(simplify(series(m5(t),t=0,6),siderels),power);

1 6 4(3/2) 2 2
sm5 :=
t + O(t4 )
12 23 (3/2) u2
with assumptions on t, 6, 4 and 2
l12:=t-> (b(t)/u)^2 + 2/(1+r(t))-lambda4/mu4;

>

b(t)2
1
4
+2

u2
1 + r(t) 4
simplify(simplify(series(l12(t),t=0,8),siderels),power);

m5 := t 4

>

l12 := t

1 2 6 2
t + O(t4 )
18 42
with assumptions on t, 2, 6 and 4
>

l22:=t-> ((b(t)/u)^2 )*(1+k(t)^2)+2/(1+r(t))-lambda4/mu4;


l22 := t

b(t)2 (1 + k(t)2 )
1
4
+2

u2
1 + r(t) 4



>

simplify(simplify(series(l22(t),t=0,8),siderels),power);
1 2 6 2
t + O(t4 )
12 42
with assumptions on t, 2, 6 and 4

>

simplify(int( cos(t)^3, t=0..arctan(sqrt(2)/2)),power);


8
3
27

>

opm1:=op(1,sm1);

1 6 4
36 2(5/2) u
with assumptions on 6, 4 and 2

opm1 :=
>

opm2:=op(1,sm2);
4(5/2)
u3 2(7/2)
with assumptions on 4 and 2

opm2 :=
>

>

>

>

>

>

opm5:=op(1,sm5);

1 6 4(3/2) 2
opm5 :=
12 23 (3/2) u2
with assumptions on 6, 4 and 2
c1:=144*sqrt(3)*mu4^4*u^(-4)/(sqrt(2*pi)*lambda2^2*mu6^2);

3 44 2
c1 := 72 4
u 22 62
with assumptions on 4, 2 and 6
c2:=3*sqrt(3)*mu4^2*u^(-2)/(sqrt(2*pi)*lambda2*mu6);

3
3 42 2

c2 :=
2 u2 2 6
with assumptions on 4, 2 and 6
c5:=12*sqrt(3)*mu4^3*u^(-3)/(lambda2^(3/2)*mu6^(3/2));

3 43
c5 := 12 3 (3/2) (3/2)
u 2
6
with assumptions on 4, 2 and 6
B:=opm1*c1+opm2*c2+opm5*c5;

3 4(9/2) 3 2
B :=
2 (3/2) u5 2(9/2) 6
with assumptions on 4, 2 and 6
simplify(B);

3 4(9/2) 3 2
2 (3/2) u5 2(9/2) 6
with assumptions on 4, 2 and 6


On the tails of the distribution of the maximum of a smooth stationary Gaussian process
Jean-Marc Bardet 1, Mario Wschebor 2
1 Laboratoire de Statistique et de Probabilités, URA CNRS 745, Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse Cedex, France.
2 Centro de Matemática, Facultad de Ciencias, Universidad de la República, Calle Iguá 4225, 11400 Montevideo, Uruguay.
March 24, 2000

Abstract
This paper deals with the asymptotic behavior, when the level tends to +∞, of the tail of the distribution of the maximum of a stationary Gaussian process on a fixed interval of the line. For processes satisfying certain regularity conditions, we give a second order term for this asymptotics.
Mathematics Subject Classification (1991): 60Gxx, 60E05, 60G15, 65U05.
Key words: Tail of Distribution of the Maximum, Stationary Gaussian processes
Short Title: Distribution of the Maximum.

Introduction
X = {X(t); t ∈ [0, T]}, T > 0, is a real-valued centered stationary Gaussian process with continuous paths and M_T = max_{t∈[0,T]} X(t). We denote by F(u) = P(M_T ≤ u) the distribution function of the random variable M_T, by r(t) = E{X(s)X(s+t)} the covariance function and by λ_k (k = 0, 1, 2, ...) the spectral moments of the process, whenever they are defined. With no loss of generality we will assume that λ_0 = r(0) = 1.
Under certain regularity conditions, Piterbarg (1981, Theorem 2.2) proved that for each T > 0 and any u ∈ R:
$$0 \;\le\; 1 - \Phi(u) + \sqrt{\frac{\lambda_2}{2\pi}}\; T\, \varphi(u) - P(M_T > u) \;\le\; B\,\exp\Big(-\frac{1}{2}\,\frac{u_+^2}{\alpha}\Big) \qquad (1)$$
for some constants B > 0 and α < 1. Φ (respectively φ) denotes the standard normal distribution (respectively density).
The aim of this paper is to improve the description of the asymptotic behavior of P(M_T > u) as u → +∞ that follows from (1), replacing the bound for the error by an equivalent as u → +∞. More precisely, under the regularity conditions required in Theorem 1.1, we will prove that:
r

P (MT >u)=1 ? (u) + 2 T (u)? 2T ( ? )


2

2
2

"

3 1?

!#

? u
4

2
2

1 + o(1)]

(2)
This contradicts Theorem 3.1 in Piterbarg's paper, in which a different equivalent is given in case T is small enough (see also Azaïs et al. (1999)).
We will assume further that X has C⁴ sample paths (this implies λ_8 < ∞) and that for every n ≥ 1 and pairwise different values t_1, .., t_n in [0, T], the distribution of the set of 5n random variables (X^{(j)}(t_1), .., X^{(j)}(t_n); j = 0, 1, 2, 3, 4) is non-degenerate. A sufficient condition for this to hold is that the spectral measure of the process is not purely atomic or, if it is purely atomic, that the set of atoms has an accumulation point in the real line (a proof of these facts can be done in the same way as in Chap. 10 of Cramér and Leadbetter, 1967).
If ξ is a random vector with values in R^n whose distribution has a density with respect to Lebesgue measure, we denote by p_ξ(x) the density of ξ at the point x ∈ R^n. 1_C denotes the indicator function of the set C.
If Y = {Y(t) : t ∈ R} is a process in L², we put Γ^Y(s, t) for its covariance function and Γ^Y_{ij}(s, t) = ∂^{i+j}Γ^Y(s, t)/(∂s^i ∂t^j) for the partial derivatives, whenever they exist.
The proof of (2) will consist in computing the density of the distribution of the random variable M_T and studying its asymptotic behavior as u → +∞. Our main tool is the following proposition, which is a special case of the differentiation Lemma 3.3 in Azaïs and Wschebor (1999):
Proposition 1.1 Let Y be a Gaussian process with C paths and such that for every
n 1 and pairwise di erent values t ; ::; tn in 0; T ] the distribution of the set of 3n
random variables (Y j (t ); ::; Y j (tn); j = 0; 1; 2) is non-degenerate. Assume also that
E fY (t)g = 0; ?Y (t; t) = E fY (t)g = 1.
Then, if is a C -function on 0; T ],
2

( )

( )

d P (Y (t) u (t); 8t 2 0; T ]) = LY (u; ) + LY (u; ) + LY (u; ); with


du
LY (u; )= (0)P (Y `(s) u `(s); 8s 2 0; T ]):pY ( (0)u)

(3)

LY (u; )= (T )P (Y a(s) u a(s); 8s 2 0; T ]):pY T ( (T )u)

(4)

(0)

( )

Z T

L (u; )=? (t)E (Y t(t)? t(t)u)1fY t s


Y

( )

u t (s);8s2 0;T ]g

p Y t ;Y
( ( )

(t))

( (t)u; 0(t)u)dt: (5)

Here the functions `; a; t and the (random) functions Y `; Y a ; Y t are the continuous
extensions to 0; T ] of:
?
?
`
(s) = 1 (s) ? ?Y (s; 0) (0) ; Y `(s) = 1 Y (s) ? ?Y (s; 0)Y (0) for 0 < s T; (6)

1 (s)??Y (s; T ) (T ) ; Y a(s)= 1 ?Y (s)??Y (s; T )Y (T ) for 0 s< T; (7)


(s)= T?
s
T?s
t

(s) = (s ?2 t)

Y
(s) ? ?Y (s; t) (t) ? ??Y ((t;t; st)) 0(t)
10

11

Y
Y t(s) = (s ?2 t) Y (s) ? ?Y (s; t)Y (t) ? ??Y ((t;t; st)) Y (t)
0

10

11

0 s T; s 6= t;
0 s T; s 6= t:

(8)
(9)

We will repeatedly use the following Lemma. Its proof is elementary and we omit it.
Lemma 1.1 Let f and g be real-valued functions of class C de ned on the interval 0; T ]
of the real line verifying the conditions:
1) f has a unique minimum on 0; T ] at the point t = t , and f 0(t ) = 0; f "(t ) > 0:
2) Let k = inf j : g j (t ) 6= 0 and suppose k = 0 ; 1 or 2.
De ne
Z T
h(u) =
g(t) exp ? 21 u f (t) dt:
Then, as u ! 1:
Z
k (t ) 1
1
g
xk exp ? 41 f "(t )x dx;
h(u) t k! uk exp ? 2 u f (t )
J
where J = 0; +1) ; J = (?1; 0] or J = ]?1; +1 according as t = 0; t =
T or 0 < t < T respectively.
2

( )

( )

+1

We now turn to our result.

Theorem 1.1 Let X = {X(t) : t ∈ [0, T]} be a Gaussian centered stationary process with C⁴-paths, covariance r(·), λ_0 = 1, and such that for every n ≥ 1 and pairwise different t_1, .., t_n in [0, T], the distribution of the set of 5n random variables (X^{(j)}(t_1), .., X^{(j)}(t_n); j = 0, 1, 2, 3, 4) is non-degenerate. We shall also assume the additional hypothesis that r'(t) < 0 on a set dense in [0, T].
Then (2) holds true.


Proof.

We divide the proof into several steps.

Step 1. Proposition1.1 applied to the process Y = X and the function (t) = 1 for all
t 2 0; T ] enables to write the density pMT of the distribution of the maximum MT as:
pMT (u) = A (u) + A (u) + A (u)] : (u); with
(10)
1

A (u) = P (X `(s) u `(s); 8s 2 0; T ]);


1

A (u) = P (X a(s) u a(s); 8s 2 0; T ]);


2

Z T
1
E (X t(t)? t(t)u)1fX t s u t s ;8s2 ;T g dt:
A (u) = ? p
2
Since X is a stationary process and (t) 1, it follows that the processes X and
Xe - de ned as Xe (t) = X (T ? t) - have the same law, so that P (X (s) u for all
s 2 0; T ] jX (0) = u) = P (X (s) u for all s 2 0; T ] jX (T ) = u). Hence, A (u) = A (u).
3

( )

( )

Step 2.

We now consider A (u) and write it in the form:


1

A (u) = P (Y (s) u (s); 8s 2 0; T ])


where Y is the continuous extension to 0; T ] of:
`
`
Y (s) = n X (s) o = s:X (s) 1 s 2]0; T ]
1 ? r (s)] 2
(E X `(s)] ) =
1

1 2

and

? r(s) s 2 0; T ]:
(s) = n (s) o = 11 +
r(s)
(E X `(s)] ) =
`

1 2

Thus, Proposition 1.1 can be applied and:

d A (u) = LY (u; ) + LY (u; ) + LY (u; ):


du
1

(11)

LY (u; ) = 0 because (0) = 0.


1

LY (u; ) = (T )P (Y a(s) u a(s); 8s 2 0; T ]) ( (T )u) , with


a
(s) = 1 + r(T ) ? r(s) ?pr(T ? s) s 2]0; T
(T ? s)(1 + r(T )) 1 ? r (s)
2

0 (T )
r
p
(0) =
T (1 + r(T ))

0 and

(T ) =

On the other hand:

(1 + r(T )) 1 ? r (T )
2

0:

P Y a(s) u a(s); 8s 2 0; T ]

check that

r0(pT )

P Y a(T ) u a(T ) ;

0
E (Y a(T )) ) = E (Y 0(T )) = (1 ?(1r ?(Tr))(?T ))(r (T )) ) ;
so that, since the non-degeneracy hypothesis implies that for each T > 0, E (Y 0(T )) ) is
2

non-zeo, it follows that the numerator in the right-hand member is stricly positive for
T > 0.
Hence,
?
P Y a(s) u a(s); 8s 2 0; T ] ( (T )u) C (T ) exp ? u2 F (T ) ;
with C (T ) > 0, where F (t) is the function
2

(12)

F (t) = (1 ? r(1(?t))r(?t))(r0(t)) for t 2 ?T; T ]; t 6= 0:


2

which is well de ned since the denominator does not vanish because of the previous
remark.
The following properties of the function F are elementary and will be useful in our
calculations.
(a) F has a continuous extension at t = 0.
(b) F (t) > F (0) = ? for t 6= 0 because:
0
0
r00(t))(1 ? r(t)) and
* F 0(t) = 2 (1 ? r(t))( r ((1t)((?rr(t())t))??((r0(?
t)) )
* r0(t) < 0 for t 2 A 0; T ] with A dense in 0; T ], and
2
2

2
2

2 2

* For t 6= 0,
(r0(t)) ? ( ? r00(t))(1 ? r(t)) =
?
(E (X 0(t) ? X 0(0))(X (t) ? X (0))) ? E (X 0(t) ? X (0))
2

E (X (t) ? X (0)) < 0,

from Cauchy-Schwartz inequality (and non-degeneracy hypothesis).

(c) F 0(0) = 0:
(d) F 00(0) = 9(( ? ? ) ) .
2

2
4
2 2
2

From (12) and (b), it follows that

LY (u; ) D(T ) exp ? u2


with D(T ) > 0 and (T ) > 0 for all T > 0.

L (u; ) = ?
Y

Z T

with

p Y t ;Y
( ( )

(t)E (Y t(t)? t(t)u)1fY t s

( )

2
2

(T ) +

pY

u t (s);8s2 0;T ]g

(t))

(13)

(t);Y 0 (t))

(u (t); u 0(t))dt;

0
(t) + E f((Y(0t())t)) g
(u (t); u 0(t)) = 2 (E f(Y10(t)) g) = exp ? u2
= 2 (E f(Y10(t)) g) = exp ? u2 F (t) :
2

2
2

1 2

1 2

2
2

We know that t2min;T F (t) = F (0) = ? .


Also check that:
p
d t(t) = 1
(0) = 0; 0(0) = 21
> 0; (0) = 0; lim
t!
dt
12
and E f(Y (0)) g > 0 from the non-degeneracy condition.
As a consequence:
0

2
2

? >0
?
2
4

6
4

2
2

Z T

(t)E Y t(t)1fY t s

( )

p Y t ;Y

u t (s);8s2 0;T ]g

( ( )

(t))

(u (t); u 0(t))dt

Z T
t (t)) g) =
u
(
E
f
(
Y
(t) 2 (E f(Y 0(t)) g) = exp ? 2 F (t) dt C (t) exp ? u2 F (t) dt;
where C is a positive constant. Also, from Lemma 1.1 and the properties of the functions
F and , one gets:
Z T
(t) exp ? u2 F (t) dt C u1 exp ? u2 ?
:
C a positive constant. Hence,
Z T

1 2

1 2

2
2

2
2

Z T

(t)E Y t(t)1fY t s

( )

u t (s);8s2 0;T ]g

p Y t ;Y
( ( )

(t))

(u (t); u 0(t))dt =

= O u1 exp ? u2

2
2

2
2

(14)

On the other hand,


Z T
0

(t)E

(t)1fY t s

( )

Z T
2
0

p Y t ;Y

u t (s);8s2 0;T ]g

( ( )

(t))

(u (t); u 0(t))dt

(t) t(t) exp ? u2 F (t) dt;


2

where A a positive constant. Since the function g(t) = (t) t(t) veri es g(0) = g0(0) = 0,
and g00(0) 6= 0, Lemma 1.1 implies:
2

Z T
0

(t)E

(t)1fY t s

( )

pY

u t (s);8s2 0;T ]g

= O u1 exp ? u2

(t);Y 0 (t))

(u (t); u 0(t))dt =

2
2

2
2

(15)

From (14) and (15) follows:

LY (u; ) = O u1 exp ? u2

2
2

2
2

(16)

and from (13) and (16):

d A (u) = O 1 exp ? u
:
(17)
du
u
2 ?
Further, observe that since Y is continuous and (s) > 0 for s 2 ]0; T ], (0) = 0,
if Y (0) > 0 the event fY (s) u (s); 8s 2 0; T ]g does not occur for positive u, and if
Y (0) < 0, the same event occurs if u is large enough. This implies that
2

2
2

2
2

A (u) = P (Y (s) u (s); 8s 2 0; T ]) ! P (Y (0) < 0) = 21 as u ! +1


1

and so,

A (u) ? 12 = ?
1

Z
u

d A (v)dv = O 1 exp ? u
dv
u
2

2
2

2
2

(18)

on applying (17).
Step 3.

We will now give an equivalent for A (u): Introduce the following notations:
3

for t 2 ]0; T , Zt(s) is the centered Gaussian process


t
Zt(s) = (E f(XXt((ss))) g) = ; s 2 0; T ]:
2

for t 2 0; T ],

1 2

(s)
(1 ? r(t ? s))
p
=
t(s)=
t
=
(E f(X (s)) g)
(1 ? r (t ? s)) ? (r0(t ? s))
t

1 2

= F (t ? s) for s 2 0; T ]; s 6= t
and
t

(t) = p

2
4

2
2

Hence,
0

F 0(0) = 0:
(t) = p
2 F (0)

B (u; t) = P (Zt(s) u t(s); 8s 2 0; T ]);


Be (u; t) = E X t (t)1fZt s u t s ;8s2 ;T g
3

so that

( )

( )

Z T

Z T
1
B (u; t) dt ? p
Be (u; t) dt = S (u) ? T (u): (19)
A (u) = u 2
2
We will consider in detail the behavior of the rst term as u ! +1.
2

We apply again Proposition 1.1 to compute the derivative of B (u; t) with respect to u.
For t 2 ]0; T :
3

d B (u; t) = LZt (u; ) + LZt (u; ) + LZt (u; ); where


t
t
t
du
LZt (u; t) = t(0)P (Zt`(s) u t`(s); 8s 2 0; T ]) ( t(0)u). Then :
p
LZt (u; ) p1 F (t)exp ? u F (t) ;
3

so that as u ! +1 :

Z T
0

LZt (u; t)dt C u1 exp ? u2 F (0)


2

(20)

for some constant C .


1

* LZt (u; t) = t(T )P (Zta(s) u ta(s); 8s 2 0; T ]) ( t(T )u). In the same way:
p
LZt (u; t) p1 F (T ? t) exp ? u2 F (T ? t) :
2
and:
Z T
LZt (u; t)dt C u1 exp ? u2 F (0) :
for some constant C .
2

(21)

Z T

LZt (u; t) =?
3

(x)E (Ztx(x)? tx(x)u)1fZtx s

( )

u tx (s);8s2 0;T ]g

p Zt x ;Zt x (u t(x); u t0(x))dx:


(

( )

( ))

We rst consider the density in the integrand:


p Zt x ;Zt x (u t(x); u t0(x)) =
(

=
De ne

( )

( ))

u F (x ? t) +
(F 0(x ? t))
1
exp
?
2
4F (x ? t)E f(Zt0(x)) g
E f(Zt0(x)) g
2

(22)

0
Gt (x) = F (x ? t) + 4F (x(?F t()xE?f(tZ))0(x)) g :
2

Check that
min Gt(x) = Gt(t) = F (0);

G0t(t) = F 0(0) = 0:

x2 0;T ]

Moreover, for x 2 0; T ] one has:


x ? t)F 00(0)) + O((x ? t) );
Gt(x) = F (0) + (x ?2 t) F 00(0) + 4F(((0)
E f(Zt0(t)) g
and thus
00
G00t (t) = F 00(0) + 2F (0)(FE f(0))
(Zt0(t)) g :
Also,
X t (x) = (x ?2 t) (X (t) + (x ? t)X 0(t) + (x ?2 t) X 00(t) + (x ?6 t) X 000(t)) ? ::
? X (t)(1 ? (x ? t) ) ? X 0(t)(? (x ? t) + (x ?6 t) ) + O((x ? t) );
2

and

Zt(x) =

1
(X 00(t) + X (t)) +
? + O((x ? t) )
t) ( X 000(t) + X 0(t)) + O((x ? t) ):
+ (x ?
3

p
2

2
2

It follows that:

00(0)
E (Zt0(t)) = 9 ( ?? ) = FF (0)
;
2

and

2
4
2
2

G00t (t) = 23 F 00(0) = 6(( ? ? ) ) :


2

We also have:

2
4
2 2
2

? (x; s) 0(x) :
Zt
t(s) ? ? (x; s): t(x) ? Zt
(s ? x)
? (x; x) t
Zt
2
(t; s) 0(t) =
?
t(s) =
Z
t
t(s) ? ? (t; s): t(t) ? Zt
t
(t ? s)
? (t; t) t
p
p
= (t ?2 s) F (t ? s) ? F (0):E fZt(t)Zt(s)g for s 6= t;
F 00(0) + pF (0)E n(Z 0(t)) o = 3 pF 00(0) = ( ? ) > 0;
t
(t t) = 21 p
t
2 F (0) 6( ? ) =
F (0)
where the last inequality is a consequence of the non-degeneracy condition.
Note that since E fZt(t)Zt(s)g 1,
2 pF (t ? s) ? pF (0) for s 6= t;
t
t (s)
(t ? s)
so that
inf tt(s) > 0:
s;t2 ;T
x
t

(s) =

Zt
10

11

10

11

2
4
2 3 2
2

On the other hand, it is easy to see that tx(s) is a continuous function of the triplet
(x; t; s) and a uniform continuity argument shows that one can nd > 0 in such a
way that if jx ? tj
then
x
c > 0 for all s 2 0; T ]:
t (s)
Thus, for jx ? tj , using the Landau-Shepp-Fernique inequality (see Fernique,
1974):
x
? x
t (x) t (x)u
t (x)
x
x s u x s ;8s2 ;T g = ? p
p
(1 + R)
E
(
Z
(
x
)
?
(
x
)
u
)1
f
Z
t
t
t
t
E f(Zt0(x)) g
E f(Zt0(x)) g
( )

where R

( )

: exp(? :u ) and ; positive constants independent of t; x and u.


2

10

So,
Z T
t(
p

x) tx(x) (1 + R) exp ? 1 u G (x) dx:


2 t
E f(Zt0(x)) g
Using the fact that B (+1; t) = 1 for every t 2 0; T ] we can write:
L (u; t) = 2u
Zt
3

(23)

S (u) = u 2 T ? u 2
2

Z T
0

dt

+1

d B (v; t) dv:
dv
3

The same method of Lemma 1.1, plus


t
3 pF "(0)F (0)
t(t) t (t)
p
=
E f(Zt0(t)) g 2
and (20), (21), (23), (22) show that
2

S (u) = u 2 T ? (1 + o(1)) 2T

3 ( ? ) exp ? u
2

2
2

2
2

2
2

(24)

The second term in (19) can be treated in a similar way, only one should use the full
statement of Lemma 3.3 in Aza s and Wschebor (1999) instead of Proposition 1.1, thus
obtaining:
T (u) = O( u1 ): exp ? u2 ? :
(25)
Then, (24) together with (25) imply that as u ! +1:
2

2
2

A (u) = u 2 T ? (1 + o(1)) 2T
3

2
2

3 ( ? ) exp ? u
2
2
2

2
2

2
4

2
2

(26)

Replacing (18), (26) into (10) and integrating, one obtains (2).

Acknowledgment. The authors thank Professors J-M. Azaïs, P. Carmona and C. Delmas for useful talks on the subject of this paper.

References

Azaïs, J-M., Cierco-Ayrolles, C. and Croquette, A. (1999). Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process. ESAIM Probab. Statist., 3, 107-129.
Azaïs, J-M. and Wschebor, M. (1999). On the Regularity of the Distribution of the Maximum of One-parameter Gaussian Processes. Submitted.
Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
Fernique, X. (1974). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de St. Flour. Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian processes. Th. Prob. Appl., 26, 687-705.


CLT for the number of crossing of random processes


Jean-Marc Azas
Institut de Mathematiques de Toulouse (CNRS UMR 5219)
Universite Paul Sabatier, 118 route de Narbonne,
31062 Toulouse, France
jean-marc.azais@math.univ-toulouse.fr
December 2, 2013
Keywords: Rice formula, Wiener chaos, Gaussian processes.

Introduction

This course presents the application of the Malevich [15], Cuzick [6] and Berman [4] method for establishing a central limit theorem for nonlinear functionals of Gaussian processes (see Section 3). These methods were introduced in the 70s for studying zero crossings of stationary processes or the sojourn time of a stochastic process. We present here mainly their application to the number of roots of random processes. The basic argument is the approximation of the original process by an m-dependent process (see Section 3). Section 2 presents a short memento of crossings of processes and the calculation of their moments. Our main tools and results are presented in Section 3. Section 4 presents generalizations and applications to some particular processes, in particular random trigonometric polynomials and specular points in sea-wave modeling.

Basic facts on crossings of functions

This section contains preliminary results, almost without proofs. They can be found for example in Azaïs and Wschebor [3].
For simplicity all the functions f(t) considered are real and of class C¹. If I is a real interval we will define:
$$N_u(f, I) := \#\{t \in I : f(t) = u\}.$$
N_u(f, I) (N_u for short in case of no ambiguity) is the number of crossings of the level u or the number of roots of the equation f(t) = u in the interval I. In a similar way, we define the number of up-crossings or down-crossings:
$$U_u(f, I) := \#\{t \in I : f(t) = u,\ f'(t) > 0\}, \qquad D_u(f, I) := \#\{t \in I : f(t) = u,\ f'(t) < 0\}.$$
Down-crossings will not be considered in the sequel since the results are strictly equivalent to those for the up-crossings.
We will say that the real-valued function f defined on the interval I = [t1, t2] satisfies hypothesis H1.u if:
f is a function of class C¹;
f(t1) ≠ u, f(t2) ≠ u;
{t : t ∈ I, f(t) = u, f'(t) = 0} = ∅.
Proposition 1 (Kac's counting formula) If f satisfies H1.u, then
$$N_u(f, I) = \lim_{\delta \to 0} \frac{1}{2\delta} \int_I \mathbf{1}_{\{|f(t) - u| < \delta\}}\; |f'(t)|\, dt. \qquad (1)$$
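A direct discretization of (1) already behaves well for smooth deterministic functions; the short sketch below (our own) approximates N_u(f, I) for f(t) = sin(2*pi*t) on I = [0, 1] and u = 0.3, whose exact number of roots is 2.

import numpy as np

u, delta = 0.3, 1e-3
t = np.linspace(0.0, 1.0, 2_000_001)
dt = t[1] - t[0]
f = np.sin(2.0 * np.pi * t)
fprime = 2.0 * np.pi * np.cos(2.0 * np.pi * t)
kac = np.sum((np.abs(f - u) < delta) * np.abs(fprime)) * dt / (2.0 * delta)
print("Kac approximation of N_u:", kac)   # close to 2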

The Kac counting formula has a weak version that will be useful.
Proposition 2 (Banach formula) Assume that f is only absolutely continuous. Then for any bounded Borel-measurable function g : R → R, one has:
$$\int_{-\infty}^{+\infty} N_u(f, I)\, g(u)\, du = \int_I |f'(t)|\; g(f(t))\, dt. \qquad (2)$$
This formula is a version of the change of variable formula for non one-to-one functions.
From these formulae we deduce, by passage to the limit, the Rice formula that gives the factorial moments of the number of (up-)crossings. For simplicity we limit ourselves to the Gaussian case and to the first two moments.
Theorem 3 (Gaussian Rice formula) Let X = {X(t) : t ∈ I}, I a compact interval of the real line, be a Gaussian process having C¹-paths. Suppose that for every point t ∈ I the variance of X(t) does not vanish. Then
$$E(N_u) = \int_I E\big(|X'(t)| \,\big|\, X(t) = u\big)\; p_{X(t)}(u)\, dt, \qquad (3)$$
and the expression above is finite.
Suppose that
for every s ≠ t ∈ I, the distribution of (X(s), X(t)) does not degenerate. (4)
Then
$$E\big(N_u(N_u - 1)\big) = \int_{I^2} E\big(|X'(s)|\,|X'(t)| \,\big|\, X(s) = X(t) = u\big)\; p_{X(s),X(t)}(u, u)\, ds\, dt, \qquad (5)$$
and the expression above may be finite or infinite.


Remarks: We have the same kind of formulas for the up-crossings if we replace |X'(t)| by the positive part (X'(t))⁺.
In case of stationary processes, assuming that the process is centered with variance 1, (3) takes the simpler form
$$E(N_u) = 2\,E(U_u) = |I|\,\sqrt{\frac{2\lambda_2}{\pi}}\;\varphi(u),$$
where φ(·) is the standard normal density.

A very important issue is the finiteness of the second (factorial) moment. For stationary processes a necessary and sufficient condition (in addition to (4)) is given by the Geman condition: let ρ(·) be the covariance of the process and define the function θ(·) by means of
$$\rho(\tau) := E\big(X(t)X(t+\tau)\big) = 1 - \lambda_2\,\frac{\tau^2}{2} + \theta(\tau).$$
The Geman condition [5] is
$$\int \frac{\theta'(\tau)}{\tau^2}\, d\tau \ \text{ converges at } \tau = 0^+. \qquad (6)$$

More precisely we have the bound


Proposition 4 Let X(t) be a stationary Gaussian process normalized by E(X(t)) = 0, Var(X(t)) =
1. Let (.) be its covariance function, we assume that for every > 0, ( ) = 1 and the Geman
condition. Let Uu = Uu ([0, T ]), then
T

E (Uu )(Uu 1) = 2

(T )E |X(0)||X ( )| X(0) = X( ) = u pX(0),X( ) (u, u)d 2


0

(T )
0

Remark that, because of Rolle's theorem, N_u ≤ 2U_u + 1; thus the proposition above also gives a bound for the variance of the number of crossings.

Central limit theorem for non-linear functionals
Our next main tool will be chaos expansions and Hermite polynomials. These polynomials are orthogonal polynomials for the Gaussian measure φ(x)dx, where φ is the standard normal density. The n-th Hermite polynomial H_n can be defined by means of the identity:
$$\exp(tx - t^2/2) = \sum_{n=0}^{\infty} H_n(x)\, \frac{t^n}{n!}.$$
We have for example H_0(x) = 1, H_1(x) = x, H_2(x) = x² − 1.


For F in L²(φ(x)dx), F can be written as
$$F(x) = \sum_{n=0}^{\infty} a_n H_n(x), \qquad\text{with}\qquad a_n = \frac{1}{n!}\int F(x)\, H_n(x)\,\varphi(x)\, dx,$$
and the norm of F in L²(φ(x)dx) satisfies
$$\|F\|_2^2 = \sum_{n=0}^{\infty} a_n^2\, n!.$$
The Hermite rank of F is defined as the smallest n such that a_n ≠ 0. For our purpose, we can assume that this rank is greater than or equal to 1.
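The Hermite coefficients and the Hermite rank are easy to compute numerically; the sketch below (our own, using F(x) = |x| purely as an illustration) evaluates a_n = (1/n!) int F(x) H_n(x) phi(x) dx by Gauss-Hermite quadrature for the weight exp(-x^2/2) and reads off the rank (2 here, since |x| is even).

import math
import numpy as np
from numpy.polynomial import hermite_e as He

nodes, weights = He.hermegauss(80)            # quadrature nodes/weights for the weight exp(-x^2/2)
F = np.abs

coeffs = []
for n in range(6):
    Hn = He.hermeval(nodes, [0.0] * n + [1.0])           # probabilists' Hermite polynomial H_n
    a_n = np.sum(weights * F(nodes) * Hn) / (np.sqrt(2.0 * np.pi) * math.factorial(n))
    coeffs.append(a_n)

print("a_0..a_5:", np.round(coeffs, 5))
rank = next(n for n, a in enumerate(coeffs) if n >= 1 and abs(a) > 1e-8)
print("Hermite rank of |x|:", rank)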
A useful standard tool to perform computations with Hermite polynomials and Gaussian variables is Mehler's formula, which we state with an extension (see León and Ortega [13]).

Lemma 5 (Generalized Mehler's formula) (a) Let (X, Y) be a centered Gaussian vector with E(X²) = E(Y²) = 1 and ρ = E(XY). Then,
$$E\big(H_j(X)\, H_k(Y)\big) = \delta_{j,k}\; j!\; \rho^{\,j}.$$
(b) Let (X1, X2, X3, X4) be a centered Gaussian vector with variance matrix
$$\Lambda = \begin{pmatrix} 1 & 0 & \rho_{13} & \rho_{14}\\ 0 & 1 & \rho_{23} & \rho_{24}\\ \rho_{13} & \rho_{23} & 1 & 0\\ \rho_{14} & \rho_{24} & 0 & 1 \end{pmatrix}.$$
Then, if r1 + r2 = r3 + r4,
$$E\big(H_{r_1}(X_1) H_{r_2}(X_2) H_{r_3}(X_3) H_{r_4}(X_4)\big) = \sum_{(d_1,d_2,d_3,d_4)\in \mathcal Z} \frac{r_1!\, r_2!\, r_3!\, r_4!}{d_1!\, d_2!\, d_3!\, d_4!}\; \rho_{13}^{\,d_1}\, \rho_{14}^{\,d_2}\, \rho_{23}^{\,d_3}\, \rho_{24}^{\,d_4},$$
where Z is the set of the d_i's satisfying: d_i ≥ 0 and
d1 + d2 = r1;  d3 + d4 = r2;  d1 + d3 = r3;  d2 + d4 = r4.  (7)
If r1 + r2 ≠ r3 + r4 the expectation is equal to zero.


Notice that the four equations in (7) are not independent, and that the set Z is finite and contains,
in general, more than one 4-tuple.
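A quick Monte Carlo sanity check of part (a) of the lemma (our own sketch; the correlation ρ and the sample size are arbitrary choices):

```python
import numpy as np
from math import factorial
from numpy.polynomial import hermite_e as He

# For a standard Gaussian pair (X, Y) with correlation rho,
# E(H_j(X) H_k(Y)) should equal delta_{jk} * j! * rho^j.
rng = np.random.default_rng(0)
rho, n = 0.6, 2_000_000
X = rng.standard_normal(n)
Y = rho * X + np.sqrt(1 - rho**2) * rng.standard_normal(n)

def H(m, x):
    return He.hermeval(x, [0] * m + [1])     # probabilists' Hermite polynomial H_m

for j, k in [(2, 2), (3, 3), (2, 3)]:
    mc = np.mean(H(j, X) * H(k, Y))
    theory = factorial(j) * rho**j if j == k else 0.0
    print(f"j={j}, k={k}: Monte Carlo {mc:+.4f}  vs  theory {theory:+.4f}")
```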

Wiener chaos
Let L^2(\Omega, \mathcal{A}, P) be the space of square integrable variables generated by the process X(t), t \in \mathbb{R}.
This Hilbert space is the orthogonal sum of the Wiener chaoses of order p, p = 0, 1, \ldots: H_p, where H_p
is defined as the closed linear subspace of L^2(\Omega, \mathcal{A}, P) generated by the variables H_p(X(t)), t \in \mathbb{R}.
In particular the space H_1 is simply the Gaussian space associated to X(t). A good reference on
this subject is the book by Nualart [17].

3.1  A first central limit theorem

Let X = \{X(t) : t \in \mathbb{R}\} be a centered real-valued stationary Gaussian process. Without loss of
generality, we assume that Var(X(t)) = 1 for all t \in \mathbb{R}. We want to consider functionals having the
form:
$$T_t := \frac{1}{t}\int_0^t F(X(s))\, ds, \qquad (8)$$
where F is some function in L^2(\varphi(x)dx).


Set \mu := E(F(Z)), Z being a standard normal variable; \mu is well defined. The Maruyama
Theorem implies that if the spectral measure of the process X(t) has no atoms, then X is ergodic and
T_t converges almost surely to \mu. Our aim is to compute the speed of convergence and establish
for it a central limit theorem.
For the statement of the next result, which is not hard to prove, we need the following additional
definition.
Definition 6 Let m be some positive real. The Gaussian process \{X(t) : t \in \mathbb{R}\} is called m-dependent if Cov(X(s), X(t)) = 0 whenever |t - s| > m.

An example of such a 1-dependent process is the Slepian process, which is stationary with covariance
r(t) = (1 - |t|)^+.
Theorem 7 (Hoeffding and Robbins [7]) With the notations and hypotheses above, if the process X(t) is m-dependent, then
$$\sqrt{t}\,\Big(\frac{1}{t}\int_0^t F(X(s))\, ds - \mu\Big) \to N(0, \sigma^2) \quad \text{in distribution as } t \to +\infty,$$
where
$$\sigma^2 = \frac{1}{m}\,\mathrm{Var}\Big(\int_0^m F(X(s))\,ds\Big).$$

The proof is easy by the "shortening" method: we cut [0, T] into smaller intervals separated
by gaps of size m, which gives independence.
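As an illustration (our own sketch, using the standard construction of the Slepian process X(t) = W(t+1) − W(t) with W a Brownian motion, which is stationary, Gaussian and 1-dependent with covariance (1 − |t|)^+), one can check empirically that √t(T_t − μ) is approximately centered Gaussian; the discretization step, horizon and replication count below are arbitrary choices.

```python
import numpy as np

# Slepian process X(t) = W(t+1) - W(t): 1-dependent, covariance (1-|t|)^+.
# For F(x) = x^2 we have mu = 1, and for this example the limiting variance
# is int_{-1}^{1} 2 (1-|s|)^2 ds = 4/3.
rng = np.random.default_rng(1)
steps = 100                     # grid points per unit time (assumption)
horizon, n_rep = 200, 400       # time horizon t and number of replications
dt = 1.0 / steps
n_incr = (horizon + 1) * steps  # Brownian increments on [0, horizon + 1]

def one_statistic():
    W = np.concatenate(([0.0], np.cumsum(rng.standard_normal(n_incr)) * np.sqrt(dt)))
    X = W[steps:] - W[:-steps]              # X(t) = W(t+1) - W(t) on [0, horizon]
    T_t = np.mean(X**2)                     # (1/t) * integral of F(X(s)) with F(x) = x^2
    return np.sqrt(horizon) * (T_t - 1.0)

samples = np.array([one_statistic() for _ in range(n_rep)])
print("empirical mean:", round(samples.mean(), 3),
      "  empirical variance:", round(samples.var(), 3))
```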
Our aim is to extend this result to processes which are not m-dependent. The proof we
present follows Berman [4] with a generalization, due to Kratz and León [10], to functions F in
(8) having a Hermite rank not necessarily equal to 1.
For \varepsilon > 0, we will approximate the given process X(t) by a new one, X_\varepsilon(t), which is 1/\varepsilon-dependent, and estimate the error.
As an additional hypothesis, we will assume that the process X(t) has a spectral density f(\lambda).
It then has the following spectral representation:
$$X(t) = \int_0^{+\infty} \cos(\lambda t)\sqrt{2f(\lambda)}\, dW_1(\lambda) + \int_0^{+\infty} \sin(\lambda t)\sqrt{2f(\lambda)}\, dW_2(\lambda), \qquad (9)$$
where W_1 and W_2 are two independent Wiener processes (Brownian motions). Indeed, using
isometry properties of the stochastic integral, it is easy to see that the process given by (9) is
centered, Gaussian and has the right covariance:
$$r(t) = E\big(X(s)X(s+t)\big) = 2\int_0^{+\infty} \cos(\lambda s)\cos(\lambda(t+s))f(\lambda)\,d\lambda + 2\int_0^{+\infty} \sin(\lambda s)\sin(\lambda(t+s))f(\lambda)\,d\lambda = 2\int_0^{+\infty} \cos(\lambda t)f(\lambda)\,d\lambda.$$
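The representation (9) also gives a convenient way to simulate such a process. The following sketch (ours; the triangular spectral density, the frequency grid and the sample sizes are arbitrary choices) discretizes the spectral integrals and compares the empirical number of zero crossings on [0, T] with Rice's formula E(N_0) = (T/π)√λ₂.

```python
import numpy as np

# Discretized spectral representation of a stationary Gaussian process, then
# a comparison of the mean number of zero crossings with Rice's formula.
rng = np.random.default_rng(2)
lam = np.linspace(0.0, 3.0, 1000)            # frequency grid (assumption)
dlam = lam[1] - lam[0]
f = np.maximum(3.0 - lam, 0.0)               # one-sided triangular density
f = f / (2.0 * np.sum(f) * dlam)             # normalize so that r(0) = 1
lambda2 = 2.0 * np.sum(lam**2 * f) * dlam    # second spectral moment

T = 50.0
t = np.linspace(0.0, T, 5001)
C, S = np.cos(np.outer(t, lam)), np.sin(np.outer(t, lam))

counts = []
for _ in range(50):
    g1 = rng.standard_normal(len(lam)) * np.sqrt(2.0 * f * dlam)
    g2 = rng.standard_normal(len(lam)) * np.sqrt(2.0 * f * dlam)
    X = C @ g1 + S @ g2                      # discretized version of (9)
    counts.append(np.sum(X[:-1] * X[1:] < 0))

print("mean number of zero crossings   :", np.mean(counts))
print("Rice formula T*sqrt(lambda_2)/pi:", T * np.sqrt(lambda2) / np.pi)
```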

Define now the function \psi(\cdot) as the convolution 1\!\!1_{[-\frac12,\frac12]} * 1\!\!1_{[-\frac12,\frac12]}. This function is even, non-negative, satisfies \psi(0) = 1, has support included in [-1, 1] and a non-negative Fourier transform. Set
\psi_\varepsilon(\cdot) := \psi(\varepsilon\,\cdot) and let \hat\psi_\varepsilon be its Fourier transform. Define
$$X_\varepsilon(t) := \int_0^{+\infty} \cos(\lambda t)\sqrt{2\,(f * \hat\psi_\varepsilon)(\lambda)}\, dW_1(\lambda) + \int_0^{+\infty} \sin(\lambda t)\sqrt{2\,(f * \hat\psi_\varepsilon)(\lambda)}\, dW_2(\lambda), \qquad (10)$$
where the convolution must be understood after prolonging f as an even function on \mathbb{R}. The
covariance function of X_\varepsilon(t) satisfies r_\varepsilon(t) = r(t)\,\psi_\varepsilon(t). This implies that the process X_\varepsilon(t) is
1/\varepsilon-dependent. We have the following proposition:
Proposition 8 Let X be a centered stationary Gaussian process with spectral density f(\lambda) and
covariance function r \in L^1(\mathbb{R}), and let \ell be a positive integer. Let X_\varepsilon(t) be defined by (10). Then
$$\lim_{\varepsilon\to 0}\ \lim_{t\to\infty}\ \frac{1}{t}\, E\Big(\int_0^t \big(H_\ell(X(s)) - H_\ell(X_\varepsilon(s))\big)\, ds\Big)^2 = 0. \qquad (11)$$

Theorem 9 Let X be a Gaussian process satisfying the hypotheses of Proposition 8 and F a
function in L^2(\varphi(x)dx) with Hermite rank \ge 1. Then, as t \to +\infty,
$$\sqrt t\; T_t = \frac{1}{\sqrt t}\int_0^t F(X(s))\, ds \ \to\ N\big(0, \sigma^2(F)\big) \quad \text{in distribution},$$
where
$$\sigma^2(F) := 2\sum_{k=1}^{\infty} a_k^2\, k! \int_0^{+\infty} r^k(s)\, ds.$$

Proof:
Define F_M := \sum_{n=1}^{M} a_n H_n(x) and T_t^M := \frac{1}{t}\int_0^t F_M(X(s))\, ds. Let M = M(\delta) be such that
$$\sum_{k=M+1}^{\infty} a_k^2\, k! < \delta.$$
Using Mehler's formula, we get
$$t\,\mathrm{Var}(T_t - T_t^M) = 2\sum_{k=M+1}^{\infty} a_k^2\, k! \int_0^t \Big(1 - \frac{s}{t}\Big) r^k(s)\, ds \le 2\sum_{k=M+1}^{\infty} a_k^2\, k! \int_0^{+\infty} |r(s)|^k\, ds \le 2\,\delta \int_0^{+\infty} |r(s)|\, ds.$$
Since \delta is arbitrary, we only need to prove the asymptotic normality of T_t^M. Let us introduce
$$T_t^{M,\varepsilon} := \frac{1}{t}\int_0^t F_M(X_\varepsilon(s))\, ds,$$
where X_\varepsilon(t) has been defined in (10). By Proposition 8, recalling that for k \ge 1, r^k is in L^1(\mathbb{R}) since r is, we obtain:
$$\lim_{\varepsilon\to 0}\ \lim_{t\to\infty}\ t\,\mathrm{Var}(T_t^M - T_t^{M,\varepsilon}) = 0.$$
Now Theorem 7 for m-dependent processes implies that \sqrt t\, T_t^{M,\varepsilon} is asymptotically normal. Notice
that
$$\sigma_{M,\varepsilon} := \lim_{t\to\infty}\ t\,\mathrm{Var}(T_t^{M,\varepsilon}) = 2\sum_{k=1}^{M} a_k^2\, k! \int_0^{+\infty} r_\varepsilon^k(s)\, ds,$$
and that \sigma_{M,\varepsilon} \to \sigma^2(F) when \varepsilon \to 0 and M \to \infty, giving the result.

3.2  Hermite expansion for crossings of regular processes

Our aim is to extend the result above to crossings. Let X(t) be a centered stationary Gaussian
process. With no loss of generality for our purposes, we assume that r(0) = -r''(0) = 1 and
r(t) \neq 1 for t \neq 0. We also assume Geman's condition (6):
$$r(t) = 1 - t^2/2 + \theta(t), \qquad \text{with} \qquad \int_{0^+}\frac{\theta(t)}{t^2}\, dt \ \text{convergent at } 0^+.$$

We define the following expansions:
$$x^+ = \sum_{k=0}^{\infty} a_k H_k(x), \qquad x^- = \sum_{k=0}^{\infty} b_k H_k(x), \qquad |x| = \sum_{k=0}^{\infty} c_k H_k(x). \qquad (12)$$
We have a_1 = 1/2, b_1 = -1/2, c_1 = 0, and using integration by parts, for k \ge 2:
$$a_k = \frac{1}{k!}\int_0^{+\infty} x\, H_k(x)\,\varphi(x)\,dx = \frac{1}{k!\,\sqrt{2\pi}}\, H_{k-2}(0).$$
The classical properties of Hermite polynomials easily imply that for positive k:
$$a_{2k+1} = b_{2k+1} = c_{2k+1} = 0, \qquad a_{2k} = b_{2k} = \frac{(-1)^{k+1}}{\sqrt{2\pi}\; 2^k\, k!\, (2k-1)}, \qquad c_{2k} = 2\,a_{2k}.$$
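These closed forms can be checked numerically; the small sketch below (ours, with arbitrary grid and truncation choices) compares the quadrature values of a_k with the formula just stated.

```python
import numpy as np
from math import factorial, sqrt, pi
from numpy.polynomial import hermite_e as He

# a_k = (1/k!) int_0^infty x H_k(x) phi(x) dx versus the closed form above.
xs = np.linspace(0.0, 12.0, 200_001)
phi = np.exp(-xs**2 / 2) / sqrt(2 * pi)

def a_num(k):
    Hk = He.hermeval(xs, [0] * k + [1])       # probabilists' Hermite polynomial
    return np.trapz(xs * Hk * phi, xs) / factorial(k)

def a_closed(k):
    if k % 2 == 1:
        return 0.0
    l = k // 2
    return (-1) ** (l + 1) / (sqrt(2 * pi) * 2**l * factorial(l) * (2 * l - 1))

for k in (2, 3, 4, 6):
    print(f"a_{k}: numerical {a_num(k):+.6f}   closed form {a_closed(k):+.6f}")
```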
We have the following Hermite expansion for the number of up-crossings:
Theorem 10 Under the conditions above,
$$U_u := U_u(X, [0, T]) = \sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j(u)\, a_k \int_0^T H_j(X(s))\, H_k(X'(s))\, ds \quad \text{a.s.},$$
where d_j(u) = \frac{1}{j!}\varphi(u)H_j(u) and a_k is defined by (12). We have similar results, replacing a_k by b_k
or c_k, for the number D_u([0, T]) of down-crossings and for the total number of crossings N_u([0, T]).
Proof: Let g(\cdot) \in L^2(\varphi(x)dx) and define the functional
$$T_g^+(t) = \int_0^t g(X(s))\,(X'(s))^+\, ds.$$
The convergence of the Hermite expansion implies that a.s.
$$T_g^+(t) = \sum_{j=0}^{\infty}\sum_{k=0}^{\infty} g_j\, a_k \int_0^t H_j(X(s))\, H_k(X'(s))\, ds, \qquad (13)$$
where the g_j's are the coefficients of the Hermite expansion of g. Using that for each s, X(s) and
X'(s) are independent, we get:
$$E\Big(\int_0^t \Big[g(X(s))(X'(s))^+ - \sum_{j,k\ge 0:\; k+j\le Q} g_j\, a_k\, H_j(X(s))\, H_k(X'(s))\Big]\, ds\Big)^2 \le (\text{const})\, t^2 \sum_{j,k\ge 0:\; k+j> Q} j!\, g_j^2\; k!\, a_k^2. \qquad (14)$$
On the other hand, using the Geman condition,
$$\mu_2(u, T) := E\big(U_u([0, T])\,(U_u([0, T]) - 1)\big) < +\infty.$$
For every T, \mu_2(u, T) is a bounded continuous function of u, and the same holds true for E(U_u^2).
Let us now define
$$U_u^\delta := \frac{1}{2\delta}\int_0^T \mathbf{1}_{\{|X(t)-u|\le\delta\}}\, (X'(t))^+\, dt.$$
In our case, the hypotheses of Proposition 1 are a.s. satisfied. The result can be easily extended to
up-crossings, showing that
$$U_u^\delta \to U_u \quad \text{a.s. as } \delta \to 0.$$

By Fatou's Lemma,
$$E(U_u^2) \le \liminf_{\delta\to 0} E\big((U_u^\delta)^2\big).$$
To obtain an inequality in the opposite sense, we use the Banach formula (Proposition 2). To
do that, notice that this formula remains valid if one replaces in the left-hand side the total number
of crossings by the up-crossings and in the right-hand side |f'(t)| by (f'(t))^+. So, on applying it to
the random path X(\cdot), we see that:
$$U_u^\delta = \frac{1}{2\delta}\int_{u-\delta}^{u+\delta} U_x\, dx.$$
Using Jensen's inequality,
$$\limsup_{\delta\to 0} E\big((U_u^\delta)^2\big) \le \limsup_{\delta\to 0} \frac{1}{2\delta}\int_{u-\delta}^{u+\delta} E(U_x^2)\, dx = E(U_u^2).$$
So, E\big((U_u^\delta)^2\big) \to E(U_u^2), and since the random variables involved are non-negative, a standard
argument of passage to the limit based upon Fatou's Lemma shows that U_u^\delta \to U_u in L^2.
We now apply (13) to U_u^\delta:
$$U_u^\delta = \sum_{j,k=0}^{\infty} d_j^\delta(u)\, a_k\, \eta_{jk}, \qquad (15)$$
where the d_j^\delta(u) are the Hermite coefficients of the function x \mapsto \frac{1}{2\delta}\mathbf{1}_{\{|x-u|\le\delta\}} and
$$\eta_{jk} = \int_0^T H_j(X(s))\, H_k(X'(s))\, ds.$$
Notice that
$$d_j^\delta(u) \to \frac{1}{j!}\varphi(u)H_j(u) = d_j(u) \quad \text{as } \delta \to 0. \qquad (16)$$
This implies that:
$$U_u = \sum_{q=0}^{\infty}\ \sum_{j+k=q} d_j(u)\, a_k\, \eta_{jk}. \qquad (17)$$

Theorem 11 Let \{X(t) : t \in \mathbb{R}\} be a centered stationary Gaussian process verifying the conditions at the beginning of this subsection. Furthermore, let us assume that:
$$\int_0^{+\infty} |r(t)|\, dt, \quad \int_0^{+\infty} |r'(t)|\, dt, \quad \int_0^{+\infty} |r''(t)|\, dt < \infty. \qquad (18)$$
Let \{g_k\}_{k=0,1,2,\ldots} be a sequence of coefficients which satisfies \sum_{k=0}^{+\infty} g_k^2\, k! < \infty. Put:
$$F_t := \frac{1}{\sqrt t}\sum_{k,j\ge 0} g_j\, a_k \int_0^t H_j(X(s))\, H_k(X'(s))\, ds,$$
where a_k has been defined in (12). Then
$$F_t - E(F_t) \to N(0, \sigma^2) \quad \text{in distribution as } t \to +\infty,$$
where
$$0 < \sigma^2 = \sum_{q=1}^{\infty} \sigma^2(q) < \infty,$$
and
$$\sigma^2(q) := 2\sum_{k=0}^{q}\sum_{k'=0}^{q} a_k\, a_{k'}\, g_{q-k}\, g_{q-k'} \int_0^{+\infty} E\Big(H_{q-k}(X(0))H_k(X'(0))\,H_{q-k'}(X(s))H_{k'}(X'(s))\Big)\, ds.$$
The integrand in the right-hand side of this formula can be computed using Lemma 5. Similar
results exist, mutatis mutandis, for the sequences \{b_k\} and \{c_k\}.
A consequence is
Corollary 12 If the process X(t) satisfies the conditions of Theorem 11 then, as T \to +\infty,
$$\frac{1}{\sqrt T}\Big(U_u[0, T] - T\,\frac{e^{-u^2/2}}{2\pi}\Big) \to N(0, \sigma_1^2) \quad \text{in distribution},$$
$$\frac{1}{\sqrt T}\Big(N_u[0, T] - T\,\frac{e^{-u^2/2}}{\pi}\Big) \to N(0, \sigma_2^2) \quad \text{in distribution},$$
where \sigma_1^2 and \sigma_2^2 are finite and positive.


Remark The result of Theorem 11 is in fact true under weaker hypotheses, namely
\int_0^{+\infty} |r(t)|\, dt < \infty and \int_0^{+\infty} (r''(t))^2\, dt < \infty; see Theorem 1 of Kratz and León [11] or Kratz [9]. See
also Azaïs and León [1] for another generalization, where the integral \int_{\mathbb{R}} r(t)\,dt is defined only in
a generalized sense. Our stronger hypotheses make it possible to give a shorter proof.
Proof of the theorem:
Since r is integrable, the process X admits a spectral density. The hypotheses and the
Riemann-Lebesgue lemma imply that:
$$r^{(i)}(t) \to 0 \quad \text{as } t \to +\infty, \qquad i = 0, 1, 2.$$
Hence, we can choose T_0 so that for t \ge T_0,
$$\kappa(t) := \sup\{|r(t)|, |r'(t)|, |r''(t)|\} \le 1/4. \qquad (19)$$

Step 1. In this step we prove that one can choose Q large enough (not depending on
t) so that F_t can be replaced, with an arbitrarily small error (in the L^2 sense), by its components
in the first Q chaoses:
$$F_t^Q := \frac{1}{\sqrt t}\sum_{q=0}^{Q} G_t^q, \qquad \text{with} \qquad G_t^q := \sum_{k=0}^{q} g_{q-k}\, a_k \int_0^t H_{q-k}(X(s))\, H_k(X'(s))\, ds.$$

Let us consider
$$\frac{1}{t}\,E\big((G_t^q)^2\big) = \frac{1}{t}\sum_{k,k'=0}^{q} g_{q-k}\, a_k\, g_{q-k'}\, a_{k'} \int_0^t dt_1 \int_0^t E\Big(H_{q-k}(X(t_1))H_k(X'(t_1))\,H_{q-k'}(X(t_2))H_{k'}(X'(t_2))\Big)\, dt_2. \qquad (20)$$

To give an upper bound for this quantity we split it into two parts.
The part corresponding to |t_1 - t_2| \ge T_0 is bounded, using Lemma 5, by
$$(\text{const})\sum_{k,k'=0}^{q} |g_{q-k}||a_k||g_{q-k'}||a_{k'}| \sum_{(d_1,d_2,d_3,d_4)\in Z} \frac{k!\,(q-k)!\,k'!\,(q-k')!}{d_1!\,d_2!\,d_3!\,d_4!} \int_{T_0}^{\infty} |r(s)|^{d_1}|r'(s)|^{d_2+d_3}|r''(s)|^{d_4}\, ds$$
$$\le (\text{const})\sum_{k,k'=0}^{q} |g_{q-k}||a_k||g_{q-k'}||a_{k'}| \sum_{(d_1,d_2,d_3,d_4)\in Z} \frac{k!\,(q-k)!\,k'!\,(q-k')!}{d_1!\,d_2!\,d_3!\,d_4!}\, \Big(\frac14\Big)^{q-1} \int_{T_0}^{\infty} \kappa(s)\, ds, \qquad (21)$$
where Z is as in Lemma 5, setting r_1 = q - k, r_2 = k, r_3 = q - k', r_4 = k'.
Remarking that
$$\sup_{d}\ \frac{1}{d!\,(k-d)!} \le \frac{2^k}{k!},$$
it follows that \frac{k!(q-k)!k'!(q-k')!}{d_1!d_2!d_3!d_4!} in (21) is bounded
above by 2^q\,(k')!\,(q-k')! or 2^q\,k!\,(q-k)!, depending on the way we group the terms. As a consequence
it is also bounded above by 2^q\sqrt{(k')!\,(q-k')!\,k!\,(q-k)!}, and the right-hand side of (21) is bounded
above by
$$(\text{const})\sum_{k,k'=0}^{q} |g_{q-k}||a_k||g_{q-k'}||a_{k'}|\; q\, 2^q \Big(\frac14\Big)^{q-1} \sqrt{(k')!\,(q-k')!\,k!\,(q-k)!} \int_{T_0}^{\infty} \kappa(s)\, ds$$
$$\le (\text{const})\sum_{k,k'=0}^{q} |g_{q-k}||a_k||g_{q-k'}||a_{k'}|\; \sqrt{(k')!\,(q-k')!\,k!\,(q-k)!}, \qquad (22)$$
where we have used that the number of terms in Z is bounded by q.


On the other hand, the integration region in (20) corresponding to |t_1 - t_2| \le T_0 can be covered
by at most [t/T_0] squares of size 2T_0. Using Jensen's inequality, as we did for the proof of (14), we
obtain:
$$E\big((G^q_{2T_0})^2\big) \le (\text{const})\, T_0^2 \sum_{k=0}^{q} (q-k)!\, k!\, g_{q-k}^2\, a_k^2. \qquad (23)$$
Finally,
$$\frac{1}{t}\,E\big((G_t^q)^2\big) \le (\text{const}) \sum_{k=0}^{q} (q-k)!\, k!\, g_{q-k}^2\, a_k^2,$$
which is the general term of a convergent series. This also proves that \sigma^2 is finite.
Step 2. Let us prove that \sigma^2 > 0. It is sufficient to prove that \sigma^2(2) > 0. Recall that a_1 = 0,
so that
$$\sigma^2(2) = a_0^2 g_2^2 \int_0^{+\infty} E\big(H_2(X(0))H_2(X(s))\big)\, ds + a_2^2 g_0^2 \int_0^{+\infty} E\big(H_2(X'(0))H_2(X'(s))\big)\, ds + 2a_0 g_2 a_2 g_0 \int_0^{+\infty} E\big(H_2(X(0))H_2(X'(s))\big)\, ds. \qquad (24)$$
Using the Mehler formula,
$$\sigma^2(2) = 2a_0^2 g_2^2 \int_0^{+\infty} r^2(s)\, ds + 4a_0 g_2 a_2 g_0 \int_0^{+\infty} (r'(s))^2\, ds + 2a_2^2 g_0^2 \int_0^{+\infty} (r''(s))^2\, ds$$
$$= 4\pi \int_0^{+\infty} \big(a_0 g_2 + \lambda^2 a_2 g_0\big)^2 f^2(\lambda)\, d\lambda > 0. \qquad (25)$$

Step 3. We now define \psi(\cdot) = K\; 1\!\!1_{[-1/4,1/4]} * 1\!\!1_{[-1/4,1/4]} * 1\!\!1_{[-1/4,1/4]} * 1\!\!1_{[-1/4,1/4]}, where the constant K is chosen such that \psi(0) = 1. Then we define X_\varepsilon(t) using (10). The new definition of \psi(\cdot) ensures now that X_\varepsilon(t) is
differentiable. Define
$$F_t^{Q,\varepsilon} := \frac{1}{\sqrt t}\sum_{q=0}^{Q} G_t^{q,\varepsilon}, \qquad \text{with} \qquad G_t^{q,\varepsilon} := \sum_{k=0}^{q} g_{q-k}\, a_k \int_0^t H_{q-k}(X_\varepsilon(s))\, H_k\big((X_\varepsilon)'(s)\big)\, ds.$$
In this step, we prove that F_t^Q can be replaced, with an arbitrarily small error if \varepsilon is small enough,
by F_t^{Q,\varepsilon}. Since the expression of F_t^Q involves only a finite number of terms having the form
$$K^0_{q-k,k} := \frac{1}{\sqrt t}\int_0^t H_{q-k}(X(s))\, H_k\big(X'(s)\big)\, ds,$$
it suffices to show that, if \varepsilon is small enough, each such term can be replaced with an arbitrarily small error by
$$K^\varepsilon_{q-k,k} := \frac{1}{\sqrt t}\int_0^t H_{q-k}(X_\varepsilon(s))\, H_k\big((X_\varepsilon)'(s)\big)\, ds.$$

For that purpose we study
$$E\big(K^0_{q-k,k} - K^\varepsilon_{q-k,k}\big)^2 = 2\int_0^t \frac{t-s}{t}\Big[E\Big(H_{q-k}(X(0))H_k(X'(0))\,H_{q-k}(X(s))H_k(X'(s))\Big)$$
$$+\ E\Big(H_{q-k}(X_\varepsilon(0))H_k((X_\varepsilon)'(0))\,H_{q-k}(X_\varepsilon(s))H_k((X_\varepsilon)'(s))\Big) - 2\,E\Big(H_{q-k}(X(0))H_k(X'(0))\,H_{q-k}(X_\varepsilon(s))H_k((X_\varepsilon)'(s))\Big)\Big]\, ds. \qquad (26)$$
Consider the computation of terms of the kind
$$\int_0^t \frac{t-s}{t}\, E\Big(H_{q-k}(Y_1(0))H_k(Y_1'(0))\,H_{q-k}(Y_2(s))H_k(Y_2'(s))\Big)\, ds, \qquad (27)$$
where the processes Y_1(t) and Y_2(t) are chosen among \{X(t), X_\varepsilon(t)\}. It suffices to prove that all
these terms have the same limit as t \to +\infty and then \varepsilon \to 0, whatever the choice is.
Applying Lemma 5, the expectation in (27) is equal to
$$\sum_{(d_1,\ldots,d_4)\in Z} \frac{(q-k)!^2\; k!^2}{d_1!\,d_2!\,d_3!\,d_4!}\; (\gamma(s))^{d_1}(\gamma'(s))^{d_2}(\gamma'(s))^{d_3}(\gamma''(s))^{d_4}
\quad \text{(up to signs that play no role in the argument)},$$
where \gamma(\cdot) is the covariance function between the processes Y_1 and Y_2 and Z is defined as in
Lemma 5. Again, since the number of terms in Z is finite, it suffices to prove that
$$\lim_{\varepsilon\to 0}\ \lim_{t\to\infty}\ \int_0^t \frac{t-s}{t}\, (\gamma(s))^{d_1}(\gamma'(s))^{d_2+d_3}(\gamma''(s))^{d_4}\, ds,$$
where (d_1, \ldots, d_4) is chosen in Z, does not depend on the way to choose Y_1 and Y_2. \gamma is the
Fourier transform of (say) g(\lambda), which is taken among f(\lambda), f_\varepsilon(\lambda) or \sqrt{f(\lambda)\, f_\varepsilon(\lambda)}. Define
\hat g(\lambda) = i\lambda\, g(\lambda) and \hat{\hat g}(\lambda) = -\lambda^2 g(\lambda). Then (\gamma(s))^{d_1}(\gamma'(s))^{d_2+d_3}(\gamma''(s))^{d_4} is the Fourier transform
of the function
$$h(\lambda) = g^{*d_1}(\lambda) * \hat g^{*(d_2+d_3)}(\lambda) * \hat{\hat g}^{*d_4}(\lambda).$$
The continuity and boundedness of f imply that all the functions above are bounded and continuous. The Fubini theorem shows that
$$\int_0^t \frac{t-s}{t}\,(\gamma(s))^{d_1}(\gamma'(s))^{d_2+d_3}(\gamma''(s))^{d_4}\, ds = \int_{-\infty}^{+\infty} \frac{1-\cos\tau}{\tau^2}\; h\Big(\frac{\tau}{t}\Big)\, d\tau.$$
As t \to +\infty, the right-hand side converges, using dominated convergence, to
$$\int_{-\infty}^{+\infty} \frac{1-\cos\tau}{\tau^2}\; h(0)\, d\tau = \pi\, h(0).$$
The continuity of f now gives the result, as in Proposition 8.

Proof of Corollary 12:
Some attention must be paid to the fact that the coefficients
$$d_j(u) = \frac{1}{j!}\varphi(u)H_j(u)$$
do not satisfy \sum_{j=0}^{\infty} j!\, d_j^2(u) < \infty. They only satisfy the relation
$$j!\, d_j^2(u) \ \text{is bounded}. \qquad (28)$$
First, considering the bound given by the right-hand side of (22), we can improve it by reintroducing the factor q\,2^{-q} that had been bounded by 1. We get that in its new expression this right-hand
side is bounded by
$$(\text{const})\, q\, 2^{-q} \sum_{k,k'=0}^{q} |d_{q-k}(u)||a_k||d_{q-k'}(u)||a_{k'}|\; \sqrt{(k')!\,(q-k')!\,k!\,(q-k)!}$$
$$\le (\text{const})\, q^2\, 2^{-q} \sum_{k=0}^{q} d_{q-k}^2(u)\, a_k^2\; k!\,(q-k)! \ \le\ (\text{const})\, q^2\, 2^{-q} \sum_{k=0}^{q} a_k^2\, k! \ \le\ (\text{const})\, q^2\, 2^{-q}.$$
Second, we have to replace the bound (23). Since the series in (17) is convergent, E\big((G^q_{2T_0})^2\big)
is the term of a convergent series, and this is enough to conclude.

4  Applications and extensions

Mourareau [?] has extended the result of Corollary 12 to the case of a moving level u_T.
Theorem 13 Let u_T be a moving level that tends to infinity with T. Suppose that, instead of (18),
we assume only that
r(t), r'(t), r''(t) tend to 0 as t \to +\infty,
and that there exists \alpha > 2 such that
$$\limsup_{T\to\infty}\ \int_0^T |r^{(i)}(t)|\, dt\ \exp\Big(-\frac{u_T^2}{\alpha}\Big) < \infty, \qquad i = 0, 1, 2.$$
Set \mu(u) := E\big(U_u([0,1])\big) = \dfrac{e^{-u^2/2}}{2\pi}. Then
$$\frac{1}{\sqrt{T\,\mu(u_T)}}\Big(U_{u_T}(T) - T\,\mu(u_T)\Big) \to N(0, 1) \quad \text{in distribution}.$$
The main point is that the conditions on the process are weaker. In particular, processes with long
range dependence satisfy the conditions above as soon as the level increases sufficiently rapidly.
The second point is that the variance is now simple and explicit, and it corresponds to the
Poissonian limit (the variance is equal to the expectation) known from the Volkonskii–Rozanov
theorem.
Theorem 14 Assume the conditions of Theorem 11, except (18), which is now replaced by the very
weak Berman condition
$$r(\tau)\,\log(\tau) \to 0 \quad \text{as } \tau \to \infty.$$
Let u_T be a moving level such that E\big(U_{u_T}\big) \to \lambda, where \lambda is some constant. Then U_{u_T} converges to
a Poisson distribution with parameter \lambda.
This is a simplified version; the full one establishes a functional convergence of the point process
itself.

4.1  Random trigonometric polynomials

Let X(t) be the stochastic process with covariance
$$r(t) = \frac{\sin(t)}{t}.$$
Since the covariance is not summable in the Lebesgue sense, it does not satisfy strictly the conditions of Corollary 12. But in fact the integral
$$\int_{\mathbb{R}} r(t)\, dt$$
can be defined by passage to the limit, and it can be checked that the result still holds true.

Let X_N(t) be the sequence of random trigonometric polynomials given by
$$X_N(t) = \frac{1}{\sqrt N}\sum_{n=1}^{N} \big(a_n \sin nt + b_n \cos nt\big),$$
where the a_n, b_n's are independent standard normal variables.
It is easy to check that for each N, X_N(t) is a stationary Gaussian process with covariance:
$$r_{X_N}(\tau) := E\big[X_N(0)X_N(\tau)\big] = \frac{1}{N}\sum_{n=1}^{N} \cos n\tau = \frac{1}{N}\,\cos\Big(\frac{(N+1)\tau}{2}\Big)\,\frac{\sin(N\tau/2)}{\sin(\tau/2)}. \qquad (29)$$
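For intuition, the following sketch (ours; N, the level u, the grid and the number of replications are arbitrary choices) simulates X_N, counts its crossings of a level on one period and compares the empirical mean count with Rice's formula.

```python
import numpy as np

# Crossings of the random trigonometric polynomial X_N on [0, 2*pi].
# Rice: E N_u[0, 2 pi] = 2 sqrt(lambda_2) exp(-u^2/2), lambda_2 = (N+1)(2N+1)/6.
rng = np.random.default_rng(3)
N, u, n_rep = 50, 0.0, 2000
t = np.linspace(0.0, 2 * np.pi, 4001)
n = np.arange(1, N + 1)
S, C = np.sin(np.outer(t, n)), np.cos(np.outer(t, n))

counts = []
for _ in range(n_rep):
    a, b = rng.standard_normal(N), rng.standard_normal(N)
    X = (S @ a + C @ b) / np.sqrt(N)
    counts.append(np.sum((X[:-1] - u) * (X[1:] - u) < 0))

counts = np.array(counts)
lambda2 = (N + 1) * (2 * N + 1) / 6
print("empirical mean count:", counts.mean())
print("Rice formula        :", 2 * np.sqrt(lambda2) * np.exp(-u**2 / 2))
print("empirical variance  :", counts.var())
```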

We define the process
$$Y_N(t) = X_N(t/N),$$
with covariance
$$r_{Y_N}(\tau) = r_{X_N}(\tau/N).$$
The convergence of the Riemann sum to the integral implies that
$$r_{Y_N}(\tau) \to r(\tau) := \sin(\tau)/\tau \quad \text{as } N \to +\infty,$$
and we have the same type of control for the derivatives. The main argument of Azaïs and León
[1] is a construction of the processes X_N(t), as well as the limit X(t), on the same probability space,
so that the central limit theorem for the crossings of X(t) passes to those of X_N(t). This gives a
generalization of a paper by Granville and Wigman [8].
Theorem 15 With the notation above,
$$1.\quad \frac{1}{\sqrt N}\Big(N^{Y_N}_{[0,N\pi]}(u) - E\big(N^{Y_N}_{[0,N\pi]}(u)\big)\Big) \Rightarrow N\Big(0,\ \tfrac{1}{3}\,u^2\varphi^2(u) + \sum_{q=2}^{\infty}\sigma_q^2(u)\Big),$$
$$2.\quad \frac{1}{\sqrt{2N}}\Big(N^{Y_N}_{[0,2N\pi]}(u) - E\big(N^{Y_N}_{[0,2N\pi]}(u)\big)\Big) \Rightarrow N\Big(0,\ \tfrac{2}{3}\,u^2\varphi^2(u) + \sum_{q=2}^{\infty}\sigma_q^2(u)\Big),$$
where \Rightarrow denotes convergence in distribution as N \to \infty and \sigma_q^2(u) is the variance of the part in the
q-th chaos.

4.2  Specular points

A different case of central limit theorem is given by the number of specular points. These are
the points of the surface of the sea that appear bright on a photograph. We use a cylinder model: time
is fixed; the variation of the elevation of the sea W(x), as a function of the space variable x, is
modeled by a smooth stationary Gaussian process; as a function of the second space variable y,
the elevation of the sea is supposed to be constant.
Suppose that a source of light is located at (0, h_1) and that an observer is located at (0, h_2),
where h_1 and h_2 are large with respect to W(x) and x. Only the variable x has to be taken into
account, and the following approximation was introduced long ago by Longuet-Higgins [14]: the
point x is a specular point if
$$W'(x) \approx kx, \qquad \text{with} \qquad k := \frac{1}{2}\Big(\frac{1}{h_1} + \frac{1}{h_2}\Big).$$

This is a non-stationary case: there are more specular points underneath the observer. In particular,
if SP(I) is the number of specular points contained in the interval I,
$$E\big(SP(I)\big) = \int_I G\big(k, \sqrt{\lambda_4}\big)\; \frac{1}{\sqrt{\lambda_2}}\;\varphi\Big(\frac{kx}{\sqrt{\lambda_2}}\Big)\, dx, \qquad (30)$$
where \lambda_2, \lambda_4 are the spectral moments of order 2 and 4 respectively, which are assumed to be finite, and
G(\mu, \sigma) := E(|Z|), with Z of distribution N(\mu, \sigma^2).
An easy consequence of that formula is that
$$E(SP) := E\big(SP(\mathbb{R})\big) = \frac{G\big(k, \sqrt{\lambda_4}\big)}{k} \approx \sqrt{\frac{2\lambda_4}{\pi}}\ \frac{1}{k},$$
as k tends to 0.
As a consequence, the number of specular points is almost surely finite, and a Central Limit
Theorem can only arise in the limit k \to 0, i.e. when the locations of the observer and the
source of light are infinitely far from the surface of the sea.
The central limit theorem is then established using Lyapounov-type conditions for a Lindeberg-type
Central Limit Theorem for triangular arrays.
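Formula (30) is easy to evaluate numerically. The sketch below (ours; the spectral moments, the constant k and the integration interval are illustrative choices) uses the closed form G(μ, σ) = σ√(2/π) e^{−μ²/(2σ²)} + μ(2Φ(μ/σ) − 1) and compares E(SP(I)) with the small-k approximation √(2λ₄/π)/k.

```python
import numpy as np
from math import sqrt, pi, erf

def Phi(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def G(mu, sigma):                      # E|Z| for Z ~ N(mu, sigma^2)
    return sigma * sqrt(2.0 / pi) * np.exp(-mu**2 / (2 * sigma**2)) \
           + mu * (2.0 * Phi(mu / sigma) - 1.0)

def expected_specular_points(k, lam2, lam4, a, b, n=20001):
    x = np.linspace(a, b, n)
    density = np.exp(-(k * x)**2 / (2 * lam2)) / sqrt(2 * pi * lam2)   # phi(kx/sqrt(lam2))/sqrt(lam2)
    return np.trapz(G(k, sqrt(lam4)) * density, x)

k, lam2, lam4 = 1e-3, 1.0, 3.0          # illustrative values only
print("E SP([-5000, 5000]) ~", round(expected_specular_points(k, lam2, lam4, -5000, 5000), 2))
print("E SP(R) = G(k, sqrt(lam4))/k  =", round(G(k, sqrt(lam4)) / k, 2))
print("small-k approx sqrt(2*lam4/pi)/k =", round(sqrt(2 * lam4 / pi) / k, 2))
```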
Theorem 16 Under some conditions (see Azaïs, León and Wschebor [2] for details), as k \to 0,
$$\frac{SP - \sqrt{\dfrac{2\lambda_4}{\pi}}\,\dfrac1k}{\sqrt{\theta/k}} \ \to\ N(0, 1) \quad \text{in distribution},$$
where \theta is some (complicated) constant.

References

[1] Azaïs, J-M. and León, J. CLT for crossings of random trigonometric polynomials. Electronic Journal of Probability, 18 (2013), paper 68.
[2] Azaïs, J-M., León, J. and Wschebor, M. Rice formulae and Gaussian waves. Bernoulli, Vol. 17, No. 1 (2011), 170-193.
[3] Azaïs, J-M. and Wschebor, M. Level Sets and Extrema of Random Processes and Fields. Wiley (2009).
[4] Berman, S.M. Occupation times for stationary Gaussian processes. J. Applied Probability 7, 721-733 (1970).
[5] Geman, D. (1972). On the variance of the number of zeros of a stationary Gaussian process. Ann. Math. Statist., Vol. 43, No. 3, 977-982.
[6] Cuzick, J. A central limit theorem for the number of zeros of a stationary Gaussian process. The Annals of Probability, Vol. 4, No. 4 (1976), 547-556.
[7] Hoeffding, W. and Robbins, H. (1948). The central limit theorem for dependent random variables. Duke Math. J. 15, 773-780.
[8] Granville, A. and Wigman, I. The distribution of the zeros of random trigonometric polynomials. American Journal of Mathematics 133, 295-357 (2011).
[9] Kratz, M.F. (2006). Level crossings and other level functionals of stationary Gaussian processes. Probability Surveys, Vol. 3, 230-288.
[10] Kratz, M. and León, J.R. Hermite polynomial expansion for non-smooth functionals of stationary Gaussian processes: crossings and extremes. Stoch. Proc. Applic. 66, 237-252 (1997).
[11] Kratz, M. and León, J.R. Central limit theorems for level functionals of stationary Gaussian processes and fields. Journal of Theoretical Probability, Vol. 14, No. 3 (2001).
[12] León, J. (2006). A note on the Breuer-Major CLT for non-linear functionals of continuous time stationary Gaussian processes. Preprint.
[13] León, J. and Ortega, J. (1989). Weak convergence of different types of variation for biparametric Gaussian processes. Colloquia Math. Soc. J. Bolyai, No. 57, Limit Theorems in Probability and Statistics, Pécs.
[14] Longuet-Higgins, M.S. Reflection and refraction at a random surface. I, II, III. Journal of the Optical Society of America, Vol. 50, No. 9, 838-856 (1960).
[15] Malevich, T.L. Asymptotic normality of the number of crossings of the level zero by a Gaussian process. Theory Probab. Appl. 14, 287-295 (1969).
[16] Central limit theorem for the number of crossings of an increasing level. Unpublished manuscript.
[17] Nualart, D. The Malliavin Calculus and Related Topics. Springer-Verlag (2006).


Probab. Theory Relat. Fields (2000)


Digital Object Identifier (DOI) 10.1007/s004400000102

Jean-Marc Azas Mario Wschebor

On the regularity of the distribution


of the maximum of one-parameter
Gaussian processes
Received: 14 May 1999 / Revised version: 18 October 1999 / Published online: 14 December 2000. © Springer-Verlag 2000
Abstract. The main result in this paper states that if a one-parameter Gaussian process has
C 2k paths and satisfies a non-degeneracy condition, then the distribution of its maximum on
a compact interval is of class C k . The methods leading to this theorem permit also to give
bounds on the successive derivatives of the distribution of the maximum and to study their
asymptotic behaviour as the level tends to infinity.

1. Introduction and main results


Let X = {Xt : t [0, 1]} be a stochastic process with real values and continuous
paths defined on a probability space ( , , P ). The aim of this paper is to study
the regularity of the distribution function of the random variable M := max{Xt :
t [0, 1]}.
X is said to satisfy the hypothesis Hk , k a positive integer, if:
(1) X is Gaussian;
(2) a.s. X has C k sample paths;
(3) For every integer n 1 and any set t1 , ..., tn of pairwise different parameter
values, the distribution of the random vector:
(k)

(k)

Xt1 , ..., Xtn , Xt1 , ..., Xtn , ..., Xt1 , ..., Xtn

is non degenerate.
We denote by m(t) and r(s, t) the mean and covariance functions of X, that is,
m(t) := E(X_t), r(s, t) := E\big((X_s - m(s))(X_t - m(t))\big), and by r_{ij} := \frac{\partial^{i+j} r}{\partial s^i \partial t^j}\ (i, j = 0, 1, \ldots)
the partial derivatives of r, whenever they exist.
Our main results are the following:
Our main results are the following:

J.-M. Azas, M. Wschebor: Laboratoire de Statistique et Probabilites,


UMR-CNRS C55830, Universite Paul Sabatier, 118, route de Narbonne,
31062 Toulouse Cedex 4. France. e-mail: azais@cict.fr
M. Wschebor: Centro de Matematica, Facultad de Ciencias, Universidad de la Republica,
Calle Igua 4225, 11400 Montevideo, Uruguay. e-mail: wscheb@fcien.edu.uy
Mathematics Subject Classification (2000): 60G15, 60Gxx, 60E05
Key words or phrases: Extreme values Distribution of the maximum


Theorem 1.1. Let X = {Xt : t [0, 1]} be a stochastic process satisfying H2k .
Denote by F (u) = P (M u) the distribution function of M.
Then, F is of class C^k and its successive derivatives can be computed by repeated
application of Lemma 3.3.
Corollary 1.1. Let X be a stochastic process verifying H2k and assume also that
E(Xt ) = 0 and V ar(Xt ) = 1.
Then, as u \to +\infty, F^{(k)}(u) is equivalent to
$$(-1)^{k-1}\;\frac{u^k\, e^{-u^2/2}}{2\pi}\int_0^1 \sqrt{r_{11}(t, t)}\; dt. \qquad (1)$$

The regularity of the distribution of M has been the object of a number of


papers. For general results when X is Gaussian, one can mention:Ylvisaker (1968);
Tsirelson (1975); Weber (1985); Lifshits (1995); Diebolt and Posse (1996) and
references therein.
Theorem 1.1 appears to be a considerable extension, in the context of one-parameter Gaussian processes, of existing results on the regularity of the distribution
of the maximum which, as far as the authors know, do not go beyond a Lipschitz condition for the first derivative. For example, it implies that if the process is Gaussian
with C^\infty paths and satisfies the non-degeneracy condition for every k = 1, 2, \ldots,
then the distribution of the maximum is C^\infty. The same methods provide bounds for
the successive derivatives as well as their asymptotic behaviour as their argument
tends to +\infty (Corollary 1.1).
Except in Theorem 3.1, which contains a first upper bound for the density of
M, we will assume X to be Gaussian.
The proof of Theorem 1.1 is based upon the main Lemma 3.3. Before giving
the proofs we have stated Theorem 3.2 which presents the result of this Lemma in
the special case leading to the first derivative of the distribution function of M. As
applications one gets upper and lower bounds for the density of M under conditions that seem to be clearer and more general than in previous work (Diebolt
and Posse, 1996). Some extra work is needed to extend the implicit formula (9) to
non-Gaussian processes, but this seems to be feasible.
As for Theorem 1.1 for derivatives of order greater than 1, its statement and its
proof rely heavily on the Gaussian character of the process.
The main result of this paper has been exposed in the note by Azas and
Wschebor (1999).

2. Crossings
Our methods are based on well-known formulae for the moments of crossings of
the paths of stochastic processes with fixed levels, that have been obtained by a
variety of authors, starting from the fundamental work of S.O.Rice (19441945).
In this section we review without proofs some of these and related results.
Let f : I IR be a function defined on the interval I of the real numbers,
Cu (f ; I ) := {t I : f (t) = u}


Nu (f ; I ) =

Cu (f ; I )

denote respectively the set of roots of the equation f (t) = u on the interval I and
the number of these roots, with the convention Nu (f ; I ) = + if the set Cu is
infinite. Nu (f ; I ) is called the number of crossings of f with the level u on
the interval I .
In the same way, if f is a differentiable function the number of upcrossings
and downcrossings of f are defined by means of
Uu (f ; I ) := ({t I : f (t) = u, f (t) > 0})
Du (f ; I ) := ({t I : f (t) = u, f (t) < 0}).
For a more general definition of these quantities see Cramer and Leadbetter (1967).
In what follows, f p is the norm of f in Lp (I, ), 1 p +, denoting the Lebesgue measure. The joint density of the finite set of real-valued random
variables X1 , ...Xn at the point (x1 , ...xn ) will be denoted pX1 ,...,Xn (x1 , ...xn ) whenever it exists. (t) := (2)1/2 exp(t 2 /2) is the density of the standard normal
t
distribution, (t) := (u)du its distribution function.
The following proposition (sometimes called Kacs formula) is a common tool
to count crossings.
Proposition 2.1. Let f : I = [a, b] \to \mathbb{R} be of class C^1, f(a), f(b) \neq u. If f
does not have local extrema with value u on the interval I, then
$$N_u(f; I) = \lim_{\delta\to 0} \frac{1}{2\delta}\int_I \mathbf{1}_{\{|f(t)-u|<\delta\}}\, |f'(t)|\, dt.$$

For m and k, positive integers, k m, define the factorial kth power of m by


m[k] := m(m 1) (m k + 1).
For other real values of m and k we put m[k] := 0. If k is an integer k 1 and I an
interval in the real line, the diagonal of I k is the set:
Dk (I ) := {(t1 , ..., tk ) I k , tj = th for some pair (j, h), j = h}.
Finally, assume that X = \{X_t : t \in \mathbb{R}\} is a real valued stochastic process with C^1
paths. We set, for (t_1, \ldots, t_k) \in I^k \setminus D_k(I) and x_j \in \mathbb{R}\ (j = 1, \ldots, k):
$$A_{t_1,\ldots,t_k}(x_1, \ldots, x_k) := \int_{\mathbb{R}^k} \prod_{j=1}^{k} |x_j'|\; p_{X_{t_1},\ldots,X_{t_k},X'_{t_1},\ldots,X'_{t_k}}(x_1, \ldots, x_k, x_1', \ldots, x_k')\, dx_1'\ldots dx_k'$$
and
$$I_k(x_1, \ldots, x_k) := \int_{I^k} A_{t_1,\ldots,t_k}(x_1, \ldots, x_k)\, dt_1\ldots dt_k,$$
where it is understood that the density in the integrand of the definition of A_{t_1,\ldots,t_k}(x_1, \ldots, x_k)
exists almost everywhere and that the integrals above can take the value +\infty.


Proposition 2.2. Let k be a positive integer, u a real number and I a bounded


interval in the line. With the above notations and conditions, let us assume that the
process X also satisfies the following conditions:
1. the density
pXt1 ,...,Xtk ,Xs

,...,Xs

(x1 , ...xk , x1 , ...xk )

exists for (t1 , ...tk ), (s1 , ...sk ) I k \Dk (I ) and is a continuous function of
(t1 , ...tk ) and of x1 , ...xk at the point (u, ..., u).
2. the function
(t1 , ..., tk , x1 , ...xk ) At1 ,...tk (x1 , ...xk )
is continuous for (t1 , ..., tk ) I k \Dk (I ) and x1 , ...xk belonging to a neighbourhood of u.
3. (additional technical condition)
IR3

|x1 |k1 |x2 x3 |pXt

,...,Xtk ,Xs ,Xs ,Xt1 (x1 , ...xk , x1 , x2 , x3 )dx1 dx2 dx3 0


1

as |s2 t1 | 0, uniformly as (t1 , ..., tk ) varies in a compact subset of


I k \Dk (I ) and x1 , ..., xk in a fixed neighbourhood of u.
Then,
$$E\big((N_u(X, I))^{[k]}\big) = I_k(u, \ldots, u). \qquad (2)$$
Both members in (2) may be +\infty.


Remarks. (a) For k = 1 formula (2) becomes
$$E[N_u(X; I)] = \int_I dt \int_{-\infty}^{+\infty} |x'|\; p_{X_t, X'_t}(u, x')\, dx'. \qquad (3)$$
(b) Simple variations of (3), valid under the same hypotheses, are:
$$E[U_u(X; I)] = \int_I dt \int_0^{+\infty} x'\; p_{X_t, X'_t}(u, x')\, dx', \qquad (4)$$
$$E[D_u(X; I)] = \int_I dt \int_{-\infty}^{0} |x'|\; p_{X_t, X'_t}(u, x')\, dx'. \qquad (5)$$
In the same way one can obtain formulae for the factorial moments of marked
crossings, that is, crossings such that some additional condition holds true. For
example, if Y = \{Y_t : t \in \mathbb{R}\} is some other stochastic process with real values
such that for every t, (Y_t, X_t, X'_t) admits a joint density, -\infty \le a < b \le +\infty and
$$N_u^{a,b}(X, I) := \#\{t : t \in I, X_t = u, a < Y_t < b\},$$
then
$$E[N_u^{a,b}(X; I)] = \int_a^b dy \int_I dt \int_{-\infty}^{+\infty} |x'|\; p_{Y_t, X_t, X'_t}(y, u, x')\, dx'. \qquad (6)$$


In particular, if M^+_{a,b} is the number of strict local maxima of X(\cdot) on the interval I
such that the value of X(\cdot) lies in the interval (a, b), then M^+_{a,b} = D_0^{a,b}(X', I) and:
$$E[M^+_{a,b}] = \int_a^b dy \int_I dt \int_{-\infty}^{0} |x''|\; p_{X_t, X'_t, X''_t}(y, 0, x'')\, dx''. \qquad (7)$$

Sufficient conditions for the validity of (6) and (7) are similar to those for 3.
(c) Proofs of (2) for Gaussian processes satisfying certain conditions can be
found in Belayev (1966) and Cramer-Leadbetter (1967). Marcus (1977) contains
various extensions. The present statement of Proposition 2.2 is from Wschebor
(1985).
(d) It may be non trivial to verify the hypotheses of Proposition 2.2. However
some general criteria are available. For example if X is a Gaussian process with C1
paths and the densities
pXt1 ,...,Xtk ,Xs ,...,Xs
1

are non-degenerate for (t1 , ...tk ), (s1 , ...sk ) I k \Dk , then conditions 1, 2, 3 of
Proposition 2.2 hold true (cf Wschebor, 1985, p.37 for a proof and also for some
manageable sufficient conditions in non-Gaussian cases).
(e) Another point related to Rice formulae is the non existence of local extrema
at a given level. We mention here two well-known results:
Proposition 2.3 (Bulinskaya, 1961). Suppose that X has C1 paths and that for
every t I , Xt has a density pXt (x) bounded for x in a neighbourhood of u.
Then, almost surely, X has no tangencies at the level u, in the sense that if
TuX := {t I, Xt = u, Xt = 0},
then P (TuX = ) = 1.
Proposition 2.4 (Ylvisakers Theorem, 1968). Suppose that {Xt : t T } is a
real-valued Gaussian process with continuous paths, defined on a compact separable topological space T and that V ar(Xt ) > 0 for every t T . Then, for each
u IR, with probability 1, the function t Xt does not have any local extrema
with value u.
3. Proofs and related results
Let be a random variable with values in IRk with a distribution that admits a
density with respect to the Lebesgue measure . The density will be denoted by
p (.) . Further, suppose E is an event. It is clear that the measure
(B; E) := P ({ B} E)
defined on the Borel sets B of IRk , is also absolutely continuous with respect to .
We will denote the density of related to E the Radon derivative:
p (x; E) :=

d (.; E)
(x).
d

It is obvious that p (x; E) p (x) for -almost every x IRk .


Theorem 3.1. Suppose that X has C^2 paths, that X_t, X'_t, X''_t admit a joint density at
every time t, that for every t, X_t has a bounded density p_{X_t}(\cdot), and that the function
$$I(x, z) := \int_0^1 dt \int_{-\infty}^{0} |x''|\; p_{X_t, X'_t, X''_t}(x, z, x'')\, dx''$$
is uniformly continuous in z for (x, z) in some neighbourhood of (u, 0). Then the
distribution of M admits a density p_M(\cdot) satisfying a.e.
$$p_M(u) \le p_{X_0}(u; X'_0 < 0) + p_{X_1}(u; X'_1 > 0) + \int_0^1 dt \int_{-\infty}^{0} |x''|\; p_{X_t, X'_t, X''_t}(u, 0, x'')\, dx''. \qquad (8)$$

Proof . Let u IR and h > 0. We have


P (M u) P (M u h) = P (u h < M u)
P (u h < X0 u, X0 < 0) + P (u h < X1 u, X1 > 0)
+
+P (Muh,u
> 0),
+
+
= Muh,u
where Muh,u
(0, 1), since if u h < M u, then either the maximum
occurs in the interior of the interval [0, 1] or at 0 or 1, with the derivative taking
the indicated sign. Note that
+
+
> 0) E(Muh,u
).
P (Muh,u

Using Proposition 2.3, with probability 1, X (.) has no tangencies at the level 0,
thus an upper bound for this expectation follows from the Kacs formula:
1
0 2

+
= lim
Muh,u

1
0

1I{X(t)[uh,u]} 1I{X (t)[,]} 1I{X

(t)<0} |X

(t)|dt

a.s.

which together with Fatous lemma imply:


+
) lim inf
E(Muh,u
0

1
2

dz

u
uh

I (x, z)dx =

u
uh

I (x, 0)dx.

Combining this bound with the preceeding one, we get


P (M u) P (M u h)

uh

pX0 (x; X0 < 0) + pX1 (x; X1 > 0) + I (x, 0) dx,

which gives the result.


In spite of the simplicity of the proof, this theorem provides the best known
upper-bound for Gaussian processes. In fact, in this case, formula (8) is a simpler
expression of the bound of Diebolt and Posse (1996). More precisely, if we use
their parametrization by putting
m(t) = 0 ; r(s, t) =

(s, t)
,
(s) (t)


with
(t, t) = 1, 11 (t, t) = 1, 10 (t, t) = 0, 12 (t, t) = 0, 02 (t, t) = 1,
after some calculations, we get exactly their bound M(u) ( their formula (9)) for
the density of the maximum.
Let us illustrate formula (8) explicitly when the process is Gaussian, centered
with unit variance. By means of a deterministic time change, one can also assume
that the process has unit speed (Var(X'_t) \equiv 1). Let L be the length of the new
time interval. Clearly, for all t, m(t) = 0, r(t, t) = 1, r_{11}(t, t) = 1, r_{10}(t, t) = 0,
r_{12}(t, t) = 0, r_{02}(t, t) = -1. Note that
$$Z \sim N(\mu, \sigma^2) \ \Longrightarrow\ E(Z^-) = \sigma\,\varphi(\mu/\sigma) - \mu\,\Phi(-\mu/\sigma).$$
The formulae for regression imply that, conditionally on X_t = u, X'_t = 0, X''_t
has expectation -u and variance r_{22}(t, t) - 1. Formula (8) reduces to
$$p_M(u) \le p^+(u) := \varphi(u)\Big[1 + (2\pi)^{-1/2}\int_0^L \Big(C_g(t)\,\varphi\big(u/C_g(t)\big) + u\,\Phi\big(u/C_g(t)\big)\Big)\, dt\Big],$$
with C_g(t) := \sqrt{r_{22}(t, t) - 1}.
As x \to +\infty,
$$\Phi(x) = 1 - \frac{\varphi(x)}{x} + \frac{\varphi(x)}{x^3} + O\Big(\frac{\varphi(x)}{x^5}\Big).$$
This implies that
$$p^+(u) = \varphi(u)\Big[1 + L\,u\,(2\pi)^{-1/2} + (2\pi)^{-1/2}\, u^{-2}\int_0^L C_g^3(t)\,\varphi\big(u/C_g(t)\big)\, dt + O\big(u^{-4}\varphi(u/C^+)\big)\Big],$$
with C^+ := \sup_{t\in[0,L]} C_g(t).
Furthermore, the exact equivalent of p_M(u) as u \to +\infty is
$$(2\pi)^{-1}\, u\, L\, \exp(-u^2/2),$$
as we will see in Corollary 1.1.
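As a numerical illustration (ours, for a hypothetical unit-speed process with constant C_g(t) = c, i.e. r_{22}(t,t) = 1 + c²; the values of L, c and u are arbitrary), the bound p⁺(u) is easy to evaluate and can be compared with the exact equivalent above:

```python
from math import sqrt, pi, exp, erf

# For constant C_g(t) = c the integral in p^+(u) reduces to
# L * (c*phi(u/c) + u*Phi(u/c)); we compare p^+(u) with (2*pi)^(-1) u L e^(-u^2/2).
def phi(x):
    return exp(-x**2 / 2) / sqrt(2 * pi)

def Phi(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

L, c = 1.0, 1.5                         # illustrative values only

def p_plus(u):
    return phi(u) * (1.0 + L * (c * phi(u / c) + u * Phi(u / c)) / sqrt(2 * pi))

for u in (2.0, 3.0, 4.0, 5.0):
    equiv = u * L * exp(-u**2 / 2) / (2 * pi)
    print(f"u = {u}:  p+(u) = {p_plus(u):.3e}   asymptotic equivalent = {equiv:.3e}")
```

The two columns get closer as u grows, in agreement with the expansion of p⁺(u) above.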
The following theorem is a special case of Lemma 3.3. We state it separately
since we use it below to compare the results that follow from it with known results.
Theorem 3.2. Suppose that X is a Gaussian process satisfying H_2. Then M has a
continuous density p_M given for every u by
$$p_M(u) = p_{X_0}(u^-; M \le u) + p_{X_1}(u^-; M \le u) + \int_0^1 dt \int_{-\infty}^{0} |x''|\; p_{X_t, X'_t, X''_t}(u^-, 0, x''; M \le u)\, dx'', \qquad (9)$$
where p_{X_0}(u^-; M \le u) = \lim_{x\uparrow u} p_{X_0}(x; M \le u) exists and is a continuous
function of u, as well as p_{X_1}(u^-; M \le u) and p_{X_t, X'_t, X''_t}(u^-, 0, x''; M \le u).


Again, we obtain a simpler version of the expression by Diebolt and Posse


(1996).
In fact, the result 3.2 remains true if X is Gaussian with C2 paths and one
requires only that Xs , Xt , Xt , Xt admit a joint density for all s, t, s = t [0, 1].
If we replace the event {M u} respectively by {X0 < 0}, {X1 > 0} and in
each of the three terms in the right hand member in formula (9) we get the general
upper-bound given by (8).
To obtain lower bounds for pM (u), we use the following immediate inequalities:
P (M u/X0 = u) = P (M u, X0 < 0/X0 = u)
P X0 < 0/X0 = u
E(Uu [0, 1]1I{X <0} /X0 = u).
0

In the same way


P (M u/X1 = u) = P (M u, X1 > 0/X1 = u)
P X1 > 0/X1 = u
E(Du [0, 1]1I{X >0} /X1 = u)
1

and if x < 0 :
P (M u/Xt = u, Xt = 0, Xt = x )
1 E([Du ([0, t]) + Uu ([t, 1])] /Xt = u, Xt = 0, Xt = x ).
If we plug these lower bounds into Formula (9) and replace the expectations of
upcrossings and downcrossings by means of integral formulae of (4), (5) type, we
obtain the lower bound:
pM (u) pX0 (u; X0 < 0) + pX1 (u; X1 < 0)
1

+
0

dt
ds

0
1

dt

|x |pXt ,Xt ,Xt (u, 0, x )dx


+

dx
0

|x |

t
0

xs pXs ,Xs ,X0, X0 (u, xs , u, x )dxs


0

ds |x |pXs ,Xs ,Xt ,Xt ,Xt (u, x , u, 0, x )dx


1
+
+ t ds 0 x pXs ,Xs ,Xt ,Xt ,Xt (u, x , u, 0, x )dx

dx .
(10)

Simpler expressions for (10) also adapted to numerical computations, can be found
in Cierco (1996).
Finally, some sharper upperbounds for pM (u) are obtained when replacing the
event {M > u} by {X0 + X1 > 2u}, the probability of which can be expressed
using the conditionnal expectation and variance of X0 + X1 ; we are able only to
express these bounds in integral form.
We now turn to the proofs of our main results.


Lemma 3.1. (a) Let Z be a stochastic process satisfying Hk (k 2) and t a point


in [0, 1]. Define the Gaussian processes Z , Z , Z t by means of the orthogonal
decompositions:
(11)
Zs = a (s) Z0 + sZs s (0, 1] .
Zs = a (s) Z1 + (1 s) Zs
Zs = bt (s)Zt + ct (s) Zt +

(s t) t
Zs
2
2

s [0, 1) .
s [0, 1] s = t.

(12)
(13)

Then, the processes Z , Z , Z t can be extended defined at s = 0, s = 1, s = t


respectively so that they become pathwise continuous and satisfy Hk1 , Hk1 , Hk2
respectively.
(b) Let f be any function of class C k . When there is no ambiguity on the process Z, we will define f , f , f t in the same manner, putting f instead of Z in
(11), (12), (13), but still keeping the regression coefficients corresponding to Z.
Then f , f , f t can be extended by continuity in the same way to functions in
C k1 , C k1 , C k2 respectively.
(c) Let m be a positive integer, suppose Z satisfies H2m+1 and t1 , ..., tm belong
to [0, 1] { , }. Denote by Z t1 ,...,tm the process obtained by repeated application
of the operation of part (a) of this Lemma, that is
Zst1 ,...,tm = Z t1 ,...,tm1

tm
.
s

Denote by s1 , ..., sp (p m) the ordered p-tuple of the elements of t1 , ..., tm that


belong to [0, 1] (i.e. they are not or ). Then, a.s. for fixed values of the
symbols , the application:
s1 , ..., sp , s Zst1 ,...,tm , Z t1 ,...,tm

is continuous.
Proof . (a) and (b) follow in a direct way, computing the regression coefficients
a (s), a (s) , bt (s), ct (s) and substituting into formulae (11), (12), (13). Note
that (b) also follows from (a) by applying it to Z + f and to Z. We prove now (c)
which is a consequence of the following:
Suppose Z(t1 , ..., tk ) is a Gaussian field with C p sample paths (p 2) defined on
[0, 1]k with no degeneracy in the same sense that in the definition of hypothesis Hk
(3) for one-parameter processes. Then the Gaussian fields defined by means of:
Z (t1 , ..., tk ) = (tk )1 Z(t1 , ..., tk1 , tk ) a (t1 , ..., tk )Z(t1 , ..., tk1 , 0)
for tk = 0,
Z (t1 , ..., tk ) = (1 tk )1 Z(t1 , ..., tk1 , tk ) a (t1 , ..., tk )Z(t1 , ..., tk1 , 1)
for tk = 1,
Z(t1 , ..., tk , tk+1 ) = 2 (tk+1 tk )2 (Z(t1 , ..., tk1 , tk+1 )
b(t1 , ..., tk , tk+1 )Z(t1 , ..., tk )
Z
(t1 , ..., tk ))
for tk+1 = tk
c(t1 , ..., tk , tk+1 )
tk


can be extended to [0, 1]k (respectively [0, 1]k , [0, 1]k+1 ) into fields with paths in
C p1 (respectively C p1 , C p2 ). In the above formulae,
- a (t1 , ..., tk ) is the regression coefficient of Z(t1 , ..., tk ) on Z(t1 , ..., tk1 , 0),
- a (t1 , ..., tk ) is the regression coefficient of Z(t1 , ..., tk ) on Z(t1 , ..., tk1 , 1),
- b(t1 , ..., tk , tk+1 ), c(t1 , ..., tk , tk+1 ) are the regression coefficients of
Z
(t1 , ..., tk ) .
Z(t1 , ..., tk1 , tk+1 ) on the pair Z(t1 , ..., tk ), t
k
Let us prove the statement on Z. The other two are simpler. Denote by V the subZ
(t1 , ..., tk ) . Denote
space of L2 ( , , P ) generated by the pair Z(t1 , ..., tk ), t
k

by V the version of the orthogonal projection of L2 ( , , P ) on the orthogonal


complement of V , defined by means of.
V (Y )

:= Y bZ(t1 , ..., tk ) + c

Z
(t1 , ..., tk ) ,
tk

where b and c are the regression coefficients of Y on the pair


Z(t1 , ..., tk ),

Z
(t1 , ..., tk ).
tk

Note that if {Y : } is a random field with continuous paths and such that
Y is continuous in L2 ( , , P ) , then a.s.
, t1 , ..., tk )

V (Y )

is continuous.
From the definition:
Z(t1 , ..., tk , tk+1 ) = 2 (tk+1 tk )2

(Z(t1 , ..., tk1 , tk+1 )) .

On the other hand, by Taylors formula:


Z(t1 , ..., tk1 , tk+1 ) = Z(t1 , ..., tk )+(tk+1 tk )
with
R2 (t1 , ..., tk , tk+1 ) =

tk+1
tk

Z
(t1 , ..., tk )+R2 (t1 , ..., tk , tk+1 )
tk

2Z
(t1 , ..., tk1 , ) (tk+1 ) d
tk2

so that
Z(t1 , ..., tk , tk+1 ) =

2 (tk+1 tk )2 R2 (t1 , ..., tk , tk+1 ) .

(14)

It is clear that the paths of the random field Z are p 1 times continuously differentiable for tk+1 = tk . Relation (14) shows that they have a continuous extension
to [0, 1]k+1 with Z(t1 , ..., tk , tk ) =
V

2Z
(t , ..., tk )
tk2 1

. In fact,

2 (sk+1 sk )2 R2 (s1 , ..., sk , sk+1 )


= 2 (sk+1 sk )2

sk+1
sk


2Z
(s , ..., sk1 , )
tk2 1

(sk+1 ) d.

According to our choice of the version of the orthogonal projection V , a.s. the
integrand is a continuous function of the parameters therein so that, a.s.:
Z (s1 , ..., sk , sk+1 )

2Z
(t1 , ..., tk ) when (s1 , ..., sk , sk+1 )
tk2
(t1 , ..., tk , tk ).
V

This proves (c). In the same way, when p 3, we obtain the continuity of the
partial derivatives of Z up to the order p2.
The following lemma has its own interest besides being required in our proof
of Lemma 3.3. It is a slight improvement of Lemma 4.3, p. 76 in Piterbarg (1996)
in the case of one-parameter processes.
Lemma 3.2. Suppose that X is a Gaussian process with C3 paths and that for all
(2)
(3)
s = t, the distributions of Xs , Xs , Xt , Xt and of Xt , Xt , Xt , Xt do not degenerate. Then, there exists a constant K (depending on the process) such that
pXs ,Xt ,Xs ,Xt (x1 , x2 , x1 , x2 ) K(t s)4
for all x1 , x2 , x1 , x2 IR and all s, t, s = t [0, 1].
Proof .
pXs ,Xt ,Xs ,Xt (x1 , x2 , x1 , x2 ) (2)2 DetV ar(Xs , Xt , Xs , Xt )

1/2

where DetV ar stands for the determinant of the variance matrix. Since by hypothesis the distribution does not degenerate outside the diagonal s = t, the conclusion
of the lemma is trivially true on a set of the form {|s t| > }, > 0. By a compactness argument it is sufficient to prove it for s, t in a neighbourhood of (t0 , t0 )
for each t0 [0, 1]. For this last purpose we use a generalization of a technique
employed by Belyaev (1966). Since the determinant is invariant by adding linear
combination of rows (resp. columns) to another row (resp. column),
DetV ar(Xs , Xt , Xs , Xt ) = DetV ar(Xs , Xs , X s(2) , X s(3) ),
with
X s(2) = Xt Xs (t s)Xs
X s(3) = Xt Xs

2
X (2)
(t s) s

(t s)2 (2)
Xt0
2
(t s)2 (3)
Xt0 ,
6

The equivalence refers to (s, t) (t0 , t0 ). Since the paths of X are of class
(2)
(3)
C3 , Xs , Xs , (2(t s)2 )X s , (6(t s)2 )X s
tends almost surely to



(2)

(3)

Xt0 , Xt0 , Xt0 , Xt0 as (s, t) (t0 , t0 ). This implies the convergence of the
variance matrices. Hence
DetV ar(Xs , Xt , Xs , Xt )

(t s)8
(2)
(3)
DetV ar(Xt0 , Xt0 , Xt0 , Xt0 ),
144

which ends the proof.


Remark. the proof of Lemma 3.2 shows that the density of Xs , Xs , Xt , Xt exists
for |s t| sufficiently small as soon as the process has C3 paths and for every t
(3)
the distribution of Xt , Xt , Xt , Xt does not degenerate. Hence, under this only
hypothesis, the conclusion of the lemma holds true for 0 < |s t| < and some
> 0.
Lemma 3.3. Suppose Z = {Zt : t [0, 1]} is a stochastic process that verifies
H2 . Define:
Fv (u) = E v .1IAu
where
Au = Au (Z, ) = {Zt (t) u f or all t [0, 1]},
(.) is a real valued C 2 function defined on [0, 1],
v = G(Zt1 (t1 )v, ..., Ztm (tm )v) for some positive integer m, t1 , ..., tm
[0, 1] , v IR and some C function G : IRm IR having at most polynomial
growth at , that is, |G(x)| C(1 + x p ) for some positive constants C, p
and all x IRm ( . stands for Euclidean norm).
Then,
For each v IR, Fv is of class C 1 and its derivative is a continuous function
of the pair (u, v) that can be written in the form:
Fv (u) = (0)E v,u .1IAu (Z

+ (1)E v,u .1IAu (Z


1

) .pZ0 ( (0) .u)


,

) pZ1 ( (1) .u)

t
(t)E v,u
Ztt t (t).u 1IAu (Z t , t )

pZt ,Zt (t) .u, (t) .u dt,

(15)

where the processes Z , Z , Z t and the functions , , t are as in Lemma


t are given by:
3.1 and the random variables v,u , v,u , v,u
v,u = G t1 Zt1 (t1 ) u + (t1 ) (u v), ...
..., tm Ztm (tm ) u + (tm ) (u v)
v,u = G (1 t1 ) Zt1 (t1 ) u + (t1 ) (u v), ...
..., (1 tm ) Ztm (tm ) u + (tm ) (u v)


(t1 t)2 t
Zt1 t (t1 ) u + (t1 ) (u v), ...
2
(tm t)2 t
Ztm t (tm ) u + (tm ) (u v) .
...,
2

t
v,u
=G

Proof . We start by showing that the arguments of Theorem 3.1 can be extended to
our present case to establish that Fv is absolutely continuous. This proof already
contains a first approximation to the main ideas leading to the proof of the lemma.
Step 1 Assume - with no loss of generality - that u 0 and write for h > 0:
Fv (u) Fv (u h) = E v .1IAu \Auh E v .1IAuh \Au

(16)

Au \ Auh {(0)(u h) < Z0 (0)u, (0) > 0}


(1)
{(1)(u h) < Z1 (1)u, (1) > 0} Muh,u 1

(17)

Note that:

where:
(1)

Muh,u = {t : t (0, 1), (t) 0, the function Z(.) (.)(u h)


has a local maximum at t with value falling in the interval [0, (t)h]}.
Using the Markov inequality
(1)

(1)

P (Muh,u 1) E Muh,u ,
and the formula for the expectation of the number of local maxima applied to the
process t Zt (t)(u h) imply
|E v .1IAu \Auh |
1I{(0)>0}
+1I{(1)>0}
1

+
0

(0)u
(0)(uh)
(1)u

E (|v |/Z0 = x) pZo (x)dx


E (| v | /Z1 = x) pZ1 (x)dx

(1)(uh)
(t)h

1I{(t)>0} dt

E |v |(Zt (t)(u h)) /V2 = (x, 0)

.pV2 (x, 0)dx,

(18)

where V2 is the random vector


Zt (t)(u h), Zt (t)(u h) .
Now, the usual regression formulae and the form of v imply that
|E v .1IAu \Auh | (const).h
where the constant may depend on u but is locally bounded as a function of u.



(1)

An analogous computation replacing Muh,u by


(2)

Muh,u = {t : t (0, 1), (t) 0, the function Z(.) (.)u


has a local maximum at t, Zt (t)u [0, (t)h]}
leads to a similar bound for the second term in (16). It follows that
|Fv (u) Fv (u h)| (const).h
where the constant is locally bounded as a function of u. This shows that Fv is
absolutely continuous.
The proof of the Lemma is in fact a refinement of this type of argument. We
will replace the rough inclusion (17) and its consequence (18) by an equality.
In the two following steps we will assume the additional hypothesis that Z
verifies Hk for every k and (.) is a C function.
Step 2.
Notice that:
Au \ Auh = Au {(0)(u h) < Z0 (0)u, (0) > 0}
(1)
{(1)(u h) < Z1 (1)u, (1) > 0} {Muh,u 1} .

(19)

We use the obvious inequality, valid for any three events F1 , F2 , F3 :


3

1IFj 1I3 Fj 1IF1 F2 + 1IF2 F3 + 1IF3 F1


1

to write the first term in (16) as:


E v .1IAu \Auh = E v .1IAu 1I{(0)(uh)<Z0 (0)u} 1I{(0)>0}
+ E v .1IAu 1I{(1)(uh)<Z1 (1)u} 1I{(1)>0}
(1)

+ E v .1IAu Muh,u + R1 (h)

(20)

where
|R1 (h)| E |v |1I{(0)(uh)<Z0 (0)u,(1)(uh)<Z1 (1)u} 1I{(0)>0,(1)>0}
+ E |v |1I
+ E |v |1I

(1)

1I{(0)>0}

(1)

1I{(1)>0}

(0)(uh)<Z0 (0)u,Muh,u 1
(1)(uh)<Z1 (1)u,Muh,u 1
(1)

+ E |v | Muh,u 1IM (1)

uh,u 1

= T1 (h) + T2 (h) + T3 (h) + T4 (h)

Our first aim is to prove that R1 (h) = o(h) as h 0.


It is clear that T1 (h) = O(h2 ).


Let us consider T2 (h). Using the integral formula for the expectation of the
number of local maxima:
T2 (h) 1I{(0)>0}

1
0

1I{(t)0} dt

(0)h

(t)h

dz0

dz.

.E |v |(Zt (t)(u h)) /V3 = v3 pV3 (v3 ),


where V3 is the random vector
Z0 (0)(u h), Zt (t)(u h), Zt (t)(u h) ,
and v3 = (z0 , z, 0).
We divide the integral in the right-hand member into two terms, respectively the
integrals on [0, ] and [, 1] in the t-variable, where 0 < < 1. The first integral
can be bounded by

1I{(t)0} dt

(t)h
0

dz E |v |(Zt (t)(u h)) /V2 = (z, 0) pV2 (z, 0).

where the random vector V2 is the same as in (18). Since the conditional expectation as well as the density are bounded for u in a bounded set and 0 < h < 1, this
expression is bounded by (const)h.
As for the second integral, when t is between and 1 the Gaussian vector
Z0 (0)(u h), Zt (t)(u h), Zt (t)(u h)
has a bounded density so that the integral is bounded by C h2 , where C is a constant
depending on .
Since > 0 is arbitrarily small, this proves that T2 (h) = o(h). T3 (h) is similar
to T2 (h).
We now consider T4 (h). Put:
(4)

Z(.) (4) (.)(u h)

Eh =
where

stands

h1/4 |v | h1/4

for the sup-norm in [0, 1]. So,


(1)

(1)

(1)

T4 (h) E |v |1IEh Muh,u (Muh,u 1) + E |v |1IE C Muh,u

(21)

(E C denotes the complement of the event E).


The second term in (21) is bounded as follows:
(1)

E |v |1IE C Muh,u E |v |4 E
h

(1)

Muh,u

1/4

P (EhC )

1/2

The polynomial bound on G, plus the fact that Z has finite moments of all
orders, imply that E |v |4 is uniformly bounded.
(1)
Also, Muh,u D0 (Z(.) (.)(u h), [0, 1]) = D (recall that D0 (g; I ) denotes the number of downcrossings of level 0 by function g). A bound for E D 4


can be obtained on applying Lemma 1.2 in Nualart-Wschebor (1991). In fact, the


Gaussian process Z(.) (.)(u h) has uniformly bounded one-dimensional marginal densities and for every positive integer p the maximum over [0, 1] of its p-th
derivative has finite moments of all orders. From that Lemma it follows that E D 4
is bounded independently of h, 0 < h < 1.
Hence,
(1)

E |v |1IE C Muh,u
h

(4)

(const) P ( Z(.) (4) (.)(u h)


1/2

(const) C1 eC2 h

+ hq/4 E |v |q

1/4
) + P (|v |
> h
1/2

> h1/4 )

1/2

where C1 , C2 are positive constants and q any positive number. The bound on the
first term follows from the Landau-Shepp (1971) inequality (see also Fernique,
1974) since even though the process depends on h it is easy to see that the bound
is uniform on h, 0 < h < 1. The bound on the second term is simply the Markov
inequality. Choosing q > 8 we see that the second term in (21) is o(h).
For the first term in (21) one can use the formula for the second factorial moment
(1)
of Muh,u to write it in the form:
1
0

1I{(s)0,(t)0} dsdt

0
E(|v |1IEh (Zs

(s)h
0

(t)h

dz1

dz2

(s)(u h)) (Zt (t)(u h)) /V4 = v4 ).pV4 (v4 ),


(22)

where V4 is the random vector


Zs (s)(u h), Zt (t)(u h), Zs (s)(u h), Zt (t)(u h)
and v4 = (z1 , z2 , 0, 0).
Let s = t and Q be the - unique - polynomial of degree 3 such that
Q(s) = z1 , Q(t) = z2 , Q (s) = 0, Q (t) = 0. Check that
Q(y) = z1 + (z2 z1 )(y s)2 (3t 2y s)(t s)3
Q (t) = 6(z1 z2 )(t s)2
Q (s) = 6(z1 z2 )(t s)2 .
Denote, for each positive h,
(y) := Zy (y)(u h) Q(y).
Under the conditioning V4 = v4 in the integrand of (22), the C function (.)
verifies (s) = (t) = (s) = (t) = 0. So, there exist t1 , t2 (s, t) such that
(t1 ) = (t2 ) = 0 and for y [s, t]:
| (y)| =|

y
t1

( )d |=|

y
t1

t2

(4) ( )d |

(t s)2
2

(4)


Noting that a b a+b


for any pair of real numbers a, b, it follows that the
2
conditional expectation in the integrand of (22) is bounded by:
(4)

E(|v |.1IEh .(t s)4 ( Z(.) (4) (.)(u h)


(t s)4 .h1/2 .h1/4 = (t s)4 .h3/4 .

2
) /V4

= v4 )
(23)

On the other hand, applying Lemma 3.2 we have the inequality


pV4 (z1 , z2 , 0, 0) pZs ,Zt ,Zs ,Zt (0, 0, 0, 0) (const)(t s)4
the constant depending on the process but not on s, t.
Summing up, the expression in (22) is bounded by
(const).h2 .h3/4 = o(h).
(1)

Replacing now in (20) the expectation E v .1IAu Muh,u by the corresponding


integral formula:
E v .1IAu \Auh
= 1I{(0)>0} (0)

u
uh
u

+1I{(1)>0} (1)
+

E v .1IAu /Z0 = (0)x .pZ0 ((0)x)dx


E v .1IAu /Z1 = (1)x .pZ1 ((1)x)dx

uh
(t)h

1I{(t)0} dt
0
0
pV2 (z, 0) + o(h)
u

uh

dzE v .1IAu (Zt (t)(u h)) /V2 = (z, 0)

H1 (x, h)dx + o(h)

(24)

where:
H1 (x, h) = 1I{(0)>0} (0)E v .1IAu /Z0 = (0)x .pZ0 ((0)x)
+ 1I{(1)>0} (1)E v .1IAu /Z1 = (1)x .pZ1 ((1)x)
+

1I{(t)0}
0
E(v .1IAu (Zt

(t)(u h)) /Zt = (t)x, Zt = (t)(u h))


(25)
.pZt ,Zt ((t)x, (t)(u h))(t)dt.
Step 3. Our next aim is to prove that for each u the limit
lim
h0

Fv (u) Fv (u h)
h

exists and admits the representation (15) in the statement of the Lemma. For that
purpose, we will prove the existence of the limit
1
lim E v .1IAu \Auh .
h0 h

(26)


This will follow from the existence of the limit


lim

h0,uh<x<u

H1 (x, h).

Consider the first term in expression (25). We apply Lemma 3.1(a) and with the
same notations therein:
Zt = a (t) Z0 + tZt ,

t = a (t) (0) + tt

t [0, 1] .

For u h < x < u replacing in (25) we have:


E v .1IAu /Z0 = (0)x
= E G t1 (Zt1 (t1 )x) + (t1 )(x v), ..., tm (Ztm (tm )x)
+(tm )(x v) 1IB(u,x)
= E v,x .1IB(u,x)

(27)

where v,x is defined in the statement and


B(u, x) = tZt (t)u a (t) (0)x for all t [0, 1] .
For each such that 0 < 1 and a (s) > 0 if 0 s , we define:
B (u, x) = tZt (t)u a (t) (0)x for all t [, 1]
= Zt (t)u +

a (t) (0)(u x)
for all t [, 1] .
t

It is clear that since we consider the case (0) > 0, then


B(u, x) = B0+ (u, x) := lim B (u, x).
0

Introduce also the notations:


M[s,t] = sup Z ( )u : [s, t] ,
(x) = |u x| sup

|a (t) (0)|
: t [, 1] .
t

We prove that as x u,
E v,x .1IB(u,x) E v,u .1IB(u,u)

(28)

We have,
|E v,x .1IB(u,x) E v,u .1IB(u,u) |
E |v,x v,u | + |E v,u (1IB(u,x) 1IB(u,u) ) |.

(29)


From the definition of v,x it is immediate that the first term tends to 0 as x u.
For the second term it suffices to prove that
P (B(u, x) B(u, u)) 0 as x u.

(30)

Check the inclusion:


B(u, x) B (u, u) (x) M[,1] (x) M[,1] 0, M[0,] > 0
which implies that
P (B(u, x) B(u, u)) P (B(u, x) B (u, u)) + P (B (u, u) B(u, u))
P (|M[,1] | (x)) + 2.P (M[,1] 0, M[0,] > 0).
Let x u for fixed . Since (x) 0, we get:
lim sup P (B(u, x) B(u, u)) P (M[,1] = 0) + 2.P (M[,1] 0, M[0,] > 0).
xu

The first term is equal to zero because of Proposition 2.4. The second term
decreases to zero as 0 since M[,1] 0, M[0,] > 0 decreases to the empty
set.
It is easy to prove that the function
(u, v) E v,u .1IAu (Z

, )

is continuous. The only difficulty comes from the indicator function 1IAu (Z , ) although again the fact that the distribution function of the maximum of the process
Z(.) (.)u has no atoms implies the continuity in u in much the same way as
above.
So, the first term in the right-hand member of (25) has the continuous limit:
1I{(0)>0} (0)E v,u .1IAu (Z

) .pZ0 ((0).u).

With minor changes, we obtain for the second term the limit:
1I{(1)>0} (1)E v,u .1IAu (Z

) .pZ1 ((1).u),

where Z , are as in Lemma 3.1 and v,u as in the statement of Lemma 3.3.
The third term can be treated in a similar way. The only difference is that the regression must be performed on the pair (Zt , Zt ) for each t [0, 1], applying again
Lemma 3.1 (a),(b),(c). The passage to the limit presents no further difficulties, even
if the integrand depends on h.
Finally, note that conditionally on Zt = (t)u, Zt = (t)u one has
Zt (t)u = Ztt t (t)u
and

(Zt (t)u) 1IAu (Z,) = (Zt (t)u)1IAu (Z,) .


Adding up the various parts, we get:


1
lim E v .1IAu \Auh = 1I{(0)>0} (0)E v,u .1IAu (Z
h0 h

+1I{(1)>0} (1)E v,u .1IAu (Z


1

) .pZ0 ((0).u)
,

) .pZ1 ((1).u)

t
(t)1I{(t)>0} dtE v,u
(Ztt t (t).u)1IAu (Z t , t )

pZt ,Zt ((t)u, (t)u).


Similar computations that we will not perform here show an analogous result
for
1
lim E v .1IAuh \Au
h0 h
and replacing into (16) we have the result for processes Z with C paths.
Step 4. Suppose now that Z and (.) satisfy the hypotheses of the Lemma and
define:
Z (t) = ( Z)(t) + Y (t)

(t) = ( )(t)

and

where > 0, (t) = 1 ( 1 t), a non-negative C function with compact


+
support, (t)dt = 1 and Y is a Gaussian centered stationary process with C
paths and non-purely atomic spectrum, independent of Z. Proceeding as in Sec.
10.6 of Cramer-Leadbetter (1967), one can see that Y verifies Hk for every k. The
definition of Z implies that Z inherites this property. Thus for each positive ,
Z meets the conditions for the validity of Steps 2 and 3, so that the function
Fv (u) = E v 1IAu (Z

, )

where v = G(Zt1 (t1 )v, ..., Ztm (tm )v) is continuoustly differentiable
and its derivative verifies (15) with the obvious changes, that is:
Fv

(u) = (0)E

v,u

+ (1)E
1

v,u

(t)E

pZt ,(Z

)t

.1IAu

(Z ) ,( )

.1IAu
v,u

.pZ0 (0) .u

(Z ) ,( )

(t) .u,

t
t

.pZ1 (1) .u
t

(t) .u dt.

(t).u 1IAu ((Z

)t ,( )t )

(31)

Let 0. We prove next that (Fv ) (u) converges for fixed (u, v) to a limit
function Fv (u) that is continuous in (u, v). On the other hand, it is easy to see that
for fixed (u, v) Fv (u) Fv (u). Also, from (31) it is clear that for each v, there
exists 0 > 0 such that if (0, 0 ), (Fv ) (u) is bounded by a fixed constant when
u varies in a bounded set because of the hypothesis on the functions G and and
the non-degeneracy of the one and two-dimensional distribution of the process Z.

Regularity of the distribution of the maximum

21

So, it follows that Fv (u) = Fv (u) and the same computation implies that Fv (u)
satisfies (15).
Let us show how to proceed with the first term in the right-hand member of
(31). The remaining terms are similar.
Clearly, almost surely, as 0 one has Zt Zt , (Z )t Zt , (Z )t Zt
uniformly for t [0, 1], so that the definition of Z in (11) implies that (Z )t
Zt uniformly for t [0, 1], since the regression coefficient (a ) (t) converges to
a (t) uniformly for t [0, 1] (with the obvious notation).
Similarly, for fixed (u, v):
( )t t , (v,u ) v,u
uniformly for t [0, 1].
Let us prove that
E (v,u ) 1IAu

E v,u 1IAu (Z

(Z ) ,( )

) .

This is implied by

as

P Au
0. Denote, for

> 0, 0:

Cu, = Au

Au Z ,

0.

(32)

(t).u for every t [0, 1]

Eu, = Zt (t)u + for all t [0, 1] .


One has:
P (Cu,

Eu,0 ) P (Cu, \ Eu, ) + P (Eu, \ Cu, ) + P (Eu, \ Eu,0 ).

Let K be a compact subset of the real line and suppose u K. We denote:


D

sup

uK,t[0,1]

(t).u Zt (t).u |>

and
Fu, = sup

t[0,1]

Zt (t)u .

Fix > 0 and choose small enough so that:


P D

< .

Check the following inclusions:


Cu, \ Eu, D

, ,

Eu, \ Cu,

D c, Fu, ,

Eu, \ Eu,0 Fu,

which imply that if is small enough:


P (Cu,

Eu,0 ) 2. + 2.P Fu, .

22

J.-M. Azas, M. Wschebor

For each u, as 0 one has


P Fu, P

sup

t[0,1]

Zt (t)u = 0 = 0.

where the second equality follows again on applying Proposition 2.4.


This proves that as 0 the first term in the right-hand member of (31) tends
to the limit
(0)E v,u .1IAu (Z , ) .pZ0 ( (0) .u) .
It remains to prove that this is a continuous function of (u, v). It suffices to prove
the continuity of the function
E 1IAu (Z

) = P Au Z ,

as a function of u. For that purpose we use inequality:


| P Au+h Z ,
P

| sup

t[0,1]

P Au Z ,

Zt (t).u || h |

and as h 0 the right-hand member tends to P | supt[0,1] Zt (t).u |= 0


which is equal to zero by Propostion 2.4.
Proof of Theorem 1.1 We proceed by induction on k.
We will give some details for the first two derivatives including some implicit
formulae that will illustrate the procedure for general k.
We introduce the following additional notations. Put Yt := Xt (t)u and define, on the interval [0, 1], the processes X , X , Xt , Y , Y , Y t , and the functions
, , t , as in Lemma 3.1. Note that the regression coefficients corresponding
to the processes X and Y are the same, so that anyone of them may be used to define
the functions , , t . One can easily check that
Ys = Xs (s)u
Ys = Xs (s)u
Yst = Xst t (s)u.

For t1 , ..., tm [0, 1] { , } , m 2, we define by induction the stochastic


t
t
processes X t1 ,...,tm = Xt1 ,...,tm1 m , Y t1 ,...,tm = Y t1 ,...,tm1 m and the function
t
t1 ,...,tm = t1 ,...,tm1 m , applying Lemma 3.1 for the computations at each stage.
With the aim of somewhat reducing the size of the formulae we will express
the successive derivatives in terms of the processes Y t1 ,...,tm instead of Xt1 ,...,tm .
The reader must keep in mind that for each m-tuple t1 , ..., tm the results depend on
u through the expectation of the stochastic process Y t1 ,...,tm . Also, for a stochastic
process Z we will use the notation
A(Z) = A0 (Z, ) = {Zt 0 : for all t [0, 1]} .

Regularity of the distribution of the maximum

23

First derivative. Suppose that X satisfies H2 . We apply formula (15) in Lemma


3.3 for 1, Z = X and (.) 1 obtaining for the first derivative:
F (u) = E 1IA(Y ) pY0 (0) + E 1IA(Y ) pY1 (0)
1

E Ytt11 1IA

Y t1

pYt

,Yt1 (0, 0)dt1 .

(33)

This expression is exactly the expression in (9) with the indicated notational changes and after taking profit of the fact that the process is Gaussian, via the regression
on the conditionning in each term. Note that according to the definition of the
Y -process:
E 1IA(Y ) = E 1IAu (X , )
E 1IA(Y ) = E 1IAu (X
E Ytt11 1IA

= E Ytt11 1IAu (Xt1 , t1 ) .

Y t1

Second derivative. Suppose that X satisfies H4 . Then, X , X , Xt1 satisfy H3 , H3 ,


H2 respectively. Therefore Lemma 3.3 applied to these processes can be used to
show the existence of F (u) and to compute a similar formula, excepting for the
necessity of justifying differentiation under the integral sign in the third term. We
get the expression:
(1)

(1)

F (u) = E 1IA(Y ) pY0 (0) E 1IA(Y ) pY1 (0)


1

+
0

E Ytt11 1IA

Y t1

(1,0)
(0, 0)dt1
t1 ,Yt1

pY

+pY0 (0) (0)E 1IA(Y


1

(t2 )E Yt2 ,t2 1I

A Y

+pY1 (0) (0)E 1IA(Y


1

) pY0 (0) + (1)E 1IA(Y

(t2 )E Yt2 ,t2 1I


A

,t2

pY

t2 ,(Y

,t2

pY

t2 ,(Y

) pY1 (0)

)t2 (0, 0)dt2

t1 (t1 )E 1IA(Y t1 ) + t1 (0)E Ytt11 , 1IA

t ,
,Y (0, 0) + t1 (1)E Yt11 1I
1 t1
A

pYt

) pY1 (0)

)t2 (0, 0)dt2

) pY0 (0) + (1)E 1IA(Y

1
0

(1)
(1)
In this formula pYt , pYt
0
1

t1

(t2 )E

Y t1 ,

Y t1 ,

pY t1 (0)
0

pY t1 (1)
1

Ytt11 ,t2 Ytt21 ,t2 1IA(Y t1 ,t2 )

dt1 ,

pY t1 ,(Y t1 ) (0, 0)dt2


t2

t2

(34)

and pYt ,Yt (0, 0)(1,0) stand respectively for the deriv1
1
ative of pYt0 (.), the derivative of pYt1 (.) and the derivative with respect to the first
variable of (pYt ,Yt (., .)).
1
1
To validate the above formula, note that:

24

J.-M. Azas, M. Wschebor

The first two lines are obtained by differentiating with respect to u, the densities
pY0 (0) = pX0 (u), pY1 (0) = pX1 (u), pYt ,Yt (0, 0) = pXt ,Xt (u, 0).
1
1
1
1
Lines 3 and 4 come from the application of Lemma 3.3 to differentiate E(1IA(Y ) ).
The lemma is applied with Z = X , = , = 1.
Similarly, lines 5 and 6 contain the derivative of E(1IA(Y ) ).
The remaining corresponds to differentiate the function
E Ytt11 1IA(Y t1 ) = E Xtt11 t1 (t1 )u 1IAu (Xt1 , t1 )
in the integrand of the third term in (33). The first term in line 7 comes from the
simple derivative

E (Xtt11 t1 (t1 )v)1IAu (Xt1 , t1 ) = t1 (t1 )E(1IA(Y t1 ).


v
The other terms are obtained by applying Lemma 3.3 to compute

E (Xtt11 t1 (t1 )v)1IAu (Xt1 , t1 ) ,


u
putting Z = X t1 , = t1 , = Xtt11 t1 (t1 )v.
Finally, differentiation under the integral sign is valid since because of Lemma
3.1, the derivative of the integrand is a continuous function of (t1 , t2 , u) due
the regularity and non-degeneracy of the Gaussian distributions involved and
Proposition 2.4.
General case. With the above notation, given the mtuple t1 , ..., tm of elements
of [0, 1] { , } we will call the processes Y, Y t1 , Y t1 ,t2 , ..., Y t1 ,...,tm1 the ancestors of Y t1 ,...,tm . In the same way we define the ancestors of the function t1 ,...,tm .
Assume the following induction hypothesis: If X satisfies H2k then F is k
times continuously differentiable and F (k) is the sum of a finite number of terms
belonging to the class Dk which consists of all expressions of the form:
1

..

ds1 ..dsp Q(s1 , .., sp )E 1IA

Y t1 ,..,tm

K1 (s1 , .., sp )K2 (s1 , .., sp ) (35)

where:
1 m k.
t1 , ..., tm [0, 1] { , } , m 1.
s1 , .., sp , 0 p m, are the elements in {t1 , ..., tm } that belong to [0, 1] (that
is, which are neither nor ). When p = 0 no integral sign is present.
Q(s1 , .., sp ) is a polynomial in the variables s1 , .., sp .
is a product of values of Y t1 ,...,tm at some locations belonging to s1 , .., sp .
K1 (s1 , .., sp ) is a product of values of some ancestors of t1 ,...,tm at some
locations belonging to the set s1 , .., sp {0, 1} .
K2 (s1 , .., sp ) is a sum of products of densities and derivatives of densities of
the random variables Z at the point 0, or the pairs ( Z , Z ) at the point (0, 0)
where s1 , .., sp {0, 1} and the process Z is some ancestor of Y t1 ,...,tm .

Regularity of the distribution of the maximum

25

Note that K1 does not depend on u but K2 is a function of u.


It is clear that the induction hypothesis is verified for k = 1. Assume that it
is true up to the integer k and that X satisfies H2k+2 . Then F (k) can be written as
a sum of terms of the form (35). Consider a term of this form and note that the
variable u may appear in three locations:
1. In , where differentiation is simple given its product form, the fact that
t1 ,...,tq
= t1 ,...,tq (s), q m, s s1 , ..., sp and the boundedness of
u Ys
moments allowing to differentiate under the integral and expectation signs.
2. In K2 (s1 , .., sp ) which is clearly C as a function of u. Its derivative with
respect to u takes the form of a product of functions of the types K1 (s1 , .., sp )
and K2 (s1 , .., sp ) defined above.
3. In 1IA Y t1 ,..,tm . Lemma 3.3 shows that differentiation produces 3 terms depending upon the processes Y t1 ,...,tm ,tm+1 with tm+1 belonging to [0, 1] { , }.
Each term obtained in this way belongs to Dk+1 .
The proof is achieved by noting that, as in the computation of the second derivative, Lemma 3.1 implies that the derivatives of the integrands are continuous
functions of u that are bounded as functions of (s1 , .., sp , tm+1 , u) if u varies in a
bounded set.
The statement and proof of Theorem 1.1 can not, of course, be used to obtain
explicit expressions for the derivatives of the distribution function F . However, the
implicit formula for F (k) (u) as sum of elements of Dk can be transformed into explicit upper-bounds if one replaces everywhere the indicator functions 1IA(Y t1 ,..,tm ) )
by 1 and the functions t1 ,..,tm (.) by their absolute value.
On the other hand, Theorem 1.1 permits to have the exact asymptotic behaviour
of F (k) (u) as u + in case V ar(Xt ) is constant. Even though the number of
terms in the formula increases rapidly with k, there is exactly one term that is dominant. It turns out that as u +, F (k) (u) is equivalent to the k-th derivative of
the equivalent of F (u). This is Corollary 1.1.
Proof of Corollary 1.1. To prove the result for k = 1 note that under the hypothesis
of the Corollary, one has r(t, t) = 1, r01 (t, t) = 0, r02 (t, t) = r11 (t, t) and an
elementary computation of the regression (13) replacing Z by X, shows that:
bt (s) = r(s, t),
and
t (s) = 2

ct (s) =

r01 (s, t)
r11 (t, t)

1 r(s, t)
(t s)2

since we start with (t) 1.


This shows that for every t [0, 1] one has inf s[0,1] ( t (s)) > 0 because of the
non-degeneracy condition and t (t) = r02 (t, t) = r11 (t, t) > 0. The expression
for F becomes:
(36)
F (u) = (u)L(u),

26

J.-M. Azas, M. Wschebor

where
L(u) = L1 (u) + L2 (u) + L3 (u),
L1 (u) = P (Au (X , ),
L2 (u) = P (Au (X , ),
1

L3 (u) =
0

E (Xtt t (t)u)1IAu (Xt , t )

dt
.
(2 r11 (t, t))1/2

Since for each t [0, 1] the process X t is bounded it follows that


a.s. 1IAu (Xt , t ) 1 as u +.
A dominated convergence argument shows now that L3 (u) is equivalent to

u
(2)1/2

1
0

u
r02 (t, t)
dt =
1/2
(r11 (t, t))
(2 )1/2

r11 (t, t)dt.

Since L1 (u), L2 (u) are bounded by 1, (1) follows for k = 1.


For k 2, write
F (k) (u) = (k1) (u)L(u) +

h=k
h=2

k 1 (kh)
(u)L(h1) (u).

k1

(37)

As u +, for each j = 0, 1, ..., k 1, (j ) (u) (1)j uj (u) so that the


first term in (37) is equivalent to the expression in (1). Hence, to prove the Corollary
it suffices to show that the succesive derivatives of the function L are bounded. In
fact, we prove the stronger inequality
|L(j ) (u)| lj (

u
), j = 1, ..., k 1
aj

(38)

for some positive constants lj , aj , j = 1, ..., k 1.


We first consider the function L1 . One has:
(s) =
( ) (s) =

1 r(s, 0)
f or 0 < s 1, (0) = 0,
s

1 + r(s, 0) s.r10 (s, 0)


1
f or 0 < s 1, ( ) (0) = r11 (0, 0).
2
2
s

The derivative L1 (u) becomes


L1 (u) = (1)E[1IAu (X
1

(t)E (Xt

,t

,
,t

)]

pX ( (1)u)
1

(t)u)1IAu (X

,t , ,t )

pX

,(X )t (

(t)u, ( ) (t)u) dt.

Notice that (1) is non-zero so that the first term is bounded by a constant
times a non-degenerate Gaussian density. Even though (0) = 0, the second

Regularity of the distribution of the maximum

27

term is also bounded by a constant times a non-degenerate Gaussian density because the joint distribution of the pair (Xt , (X )t ) is non-degenerate and the pair
( (t), ( ) (t)) = (0, 0) for every t [0, 1].
Applying a similar argument to the succesive derivatives we obtain (38) with
L1 instead of L.
The same follows with no changes for
L2 (u) = P (Au (X , ).
For the third term
1

L3 (u) =
0

E (Xtt t (t)u)1IAu (Xt , t )

dt
(2 r11 (t, t))1/2

we proceed similarly, taking into account t (s) = 0 for every s [0, 1]. So (38)
follows and we are done.
Remark. Suppose that X satisfies the hypotheses of the Corollary with k 2.
Then, it is possible to refine the result as follows.
For j = 1, ..., k :
F (j ) (u) = (1)j 1 (j 1)!hj 1 (u)
1

1 + (2)1/2 .u.
0

(r11 (t, t))1/2 dt (u) + j (u)(u) (39)

1 (j )
where hj (u) = (1)
j ! ((u)) (u), is the standard j-th Hermite polynomial
(j = 0, 1, 2, ...) and
| j (u) | Cj exp(u2 )

where C1 , C2 , ... are positive constants and > 0 does not depend on j .
The proof of (39) consists of a slight modification of the proof of the Corollary.
Note first that from the above computation of (s) it follows that 1) if X0 < 0,
then if u is large enough Xs (s).u 0 for all s [0, 1] and 2) if X0 > 0,
then X0 (0).u > 0 so that:
L1 (u) = P (Xs (s).u 0) for all s [0, 1])

1
2

as u +.

On account of (38) this implies that if u 0:


0

1
L1 (u) =
2

+
u

L1 (v)dv D1 exp(1 u2 )

with D1 , 1 positive constants.


L2 (u) is similar. Finally:
1

L3 (u) =
0

E (Xtt t (t)u)

dt
(2 r11 (t, t))1/2

28

J.-M. Azas, M. Wschebor


1

E (Xtt t (t)u)1I(A

u (X

t , t ) C

dt
.
(2 r11 (t, t))1/2

(40)

The first term in (40) is equal to:


1

(2)1/2 .u.

(r11 (t, t))1/2 dt.

As for the second term in (40) denote # =

inf

s,t[0,1]

t (s) > 0 and let u > 0.

Then:
P

Au (Xt , t )

P ( s [0, 1] such that Xst > # .u) D3 exp(3 u2 )

with D3 , 3 are positive constants, the last inequality being a consequence of the
Landau-Shepp-Fernique inequality.
The remainder follows in the same way as the proof of the Corollary.
Acknowledgements. This work has received a support from CONICYT-BID-Uruguay, grant
91/94 and from ECOS program U97E02.

References
1. Adler, R.J.: An Introduction to Continuity, Extrema and Related Topics for General
Gaussian Processes, IMS, Hayward, Ca (1990)
2. Azas, J-M., Wschebor, M.: Regularite de la loi du maximum de processus gaussiens
reguliers, C.R. Acad. Sci. Paris, t. 328, serieI, 333336 (1999)
3. Belyaev, Yu.: On the number of intersections of a level by a Gaussian Stochastic process,
Theory Prob. Appl., 11, 106113 (1966)
4. Berman, S.M.: Sojourns and extremes of stochastic processes, The Wadworth and
Brooks, Probability Series (1992)
5. Bulinskaya, E.V.: On the mean number of crossings of a level by a stationary Gaussian
stochastic process, Theory Prob. Appl., 6, 435438 (1961)
6. Cierco, C.: Probl`emes statistiques lies a` la detection et a` la localisation dun g`ene a` effet
quantitatif. PHD dissertation. University of Toulouse.France (1996)
7. Cramer, H., Leadbetter, M.R.: Stationary and Related Stochastic Processes, J. Wiley &
Sons, New-York (1967)
8. Diebolt, J., Posse, C.: On the Density of the Maximum of Smooth Gaussian Processes,
Ann. Probab., 24, 11041129 (1996)
9. Fernique, X.: Regularite des trajectoires des fonctions aleatoires gaussiennes, Ecole
dEte de Probabilites de Saint Flour, Lecture Notes in Mathematics, 480, Springer-Verlag,New-York (1974)
10. Landau, H.J., Shepp, L.A.: On the supremum of a Gaussian process, Sankya Ser. A, 32,
369378 (1971)
11. Leadbetter, M.R., Lindgren, G., Rootzen, H.: Extremes and related properties of random
sequences and processes. Springer-Verlag, New-York (1983)
12. Lifshits, M.A.: Gaussian random functions. Kluwer, The Netherlands (1995)
13. Marcus, M.B.: Level Crossings of a Stochastic Process with Absolutely Continuous
Sample Paths, Ann. Probab., 5, 5271 (1977)

Regularity of the distribution of the maximum

29

14. Nualart, D., Vives, J.: Continuite absolue de la loi du maximum dun processus continu,
C. R. Acad. Sci. Paris, 307, 349354 (1988)
15. Nualart, D., Wschebor, M.: Integration par parties dans lespace de Wiener et approximation du temps local, Prob. Th. Rel. Fields, 90, 83109 (1991)
16. Piterbarg, V.I.: Asymptotic Methods in the Theory of Gaussian Processes and Fields,
American Mathematical Society. Providence, Rhode Island (1996)
17. Rice, S.O.: Mathematical Analysis of Random Noise, Bell System Technical J., 23,
282332, 24, 45156 (19441945)
18. Tsirelson, V.S.: The Density of the Maximum of a Gaussian Process, Th. Probab. Appl.,
20, 817856 (1975)
19. Weber, M.: Sur la densite du maximum dun processus gaussien, J. Math. Kyoto Univ.,
25, 515521 (1985)
20. Wschebor, M.: Surfaces aleatoires. Mesure geometrique des ensembles de niveau, Lecture Notes in Mathematics, 1147, Springer-Verlag (1985)
21. Ylvisaker, D.: A Note on the Absence of Tangencies in Gaussian Sample Paths, The
Ann. of Math. Stat., 39, 261262 (1968)

A general expression for the distribution of the maximum of a


Gaussian field and the approximation of the tail

arXiv:math/0607041v2 [math.PR] 8 Jan 2007

Jean-Marc Azas , azais@cict.fr

Mario Wschebor , wschebor@cmat.edu.uy

February 2, 2008

AMS subject classification: Primary 60G70 Secondary 60G15


Short Title: Distribution of the Maximum.
Key words and phrases: Gaussian fields, Rice Formula, Euler-Poincare Characteristic, Distribution of the Maximum, Density of the Maximum, Random Matrices.
Abstract
We study the probability distribution F (u) of the maximum of smooth Gaussian fields
defined on compact subsets of Rd having some geometric regularity.
Our main result is a general expression for the density of F . Even though this is an
implicit formula, one can deduce from it explicit bounds for the density, hence for the
distribution, as well as improved expansions for 1 F (u) for large values of u.
The main tool is the Rice formula for the moments of the number of roots of a random
system of equations over the reals.
This method enables also to study second order properties of the expected Euler Characteristic approximation using only elementary arguments and to extend these kind of results
to some interesting classes of Gaussian fields. We obtain more precise results for the direct method to compute the distribution of the maximum, using spectral theory of GOE
random matrices.

Introduction and notations

Let X = {X(t) : t S} be a real-valued random field defined on some parameter set S and
M := suptS X(t) its supremum.
The study of the probability distribution of the random variable M , i.e. the function
FM (u) := P{M u} is a classical problem in probability theory. When the process is Gaussian,
general inequalities allow to give bounds on 1 FM (u) = P{M > u} as well as asymptotic
results for u +. A partial account of this well established theory, since the founding paper
by Landau and Shepp [20] should contain - among a long list of contributors - the works of
Marcus and Shepp [24], Sudakov and Tsirelson [30], Borell [13] [14], Fernique [17], Ledoux and
Talagrand [22], Berman [11] [12], Adler[2], Talagrand [32] and Ledoux[21].
During the last fifteen years, several methods have been introduced with the aim of obtaining more precise results than those arising from the classical theory, at least under certain
restrictions on the process X , which are interesting from the point of view of the mathematical
theory as well as in many significant applications. These restrictions include the requirement

This work was supported by ECOS program U03E01.


Laboratoire de Statistique et Probabilites. UMR-CNRS C5583 Universite Paul Sabatier. 118, route de
Narbonne. 31062 Toulouse Cedex 4. France.

Centro de Matem
atica. Facultad de Ciencias. Universidad de la Rep
ublica. Calle Igua 4225. 11400 Montevideo. Uruguay.

the domain S to have certain finite-dimensional geometrical structure and the paths of the
random field to have a certain regularity.
Some examples of these contributions are the double sum method by Piterbarg [28]; the
Euler-Poincare Characteristic (EPC) approximation, Taylor, Takemura and Adler [34], Adler
and Taylor [3]; the tube method, Sun [31] and the well- known Rice method, revisited by Azas
and Delmas [5], Azas and Wschebor [6]. See also Rychlik [29] for numerical computations.
The results in the present paper are based upon Theorem 3 which is an extension of Theorem
3.1 in Azas and Wschebor [8] allowing to express the density pM of FM by means of a general
formula. Even though this is an exact formula, it is only implicit as an expression for the
density, since the relevant random variable M appears in the right-hand side. However, it can
be usefully employed for various purposes.
First, one can use Theorem 3 to obtain bounds for pM (u) and thus for P{M > u} for
every u by means of replacing some indicator function in (4) by the condition that the normal
derivative is extended outward (see below for the precise meaning). This will be called the
direct method. Of course, this may be interesting whenever the expression one obtains can
be handled, which is the actual situation when the random field has a law which is stationary
and isotropic. Our method relies on the application of some known results on the spectrum of
random matrices.
Second, one can use Theorem 3 to study the asymptotics of P{M > u} as u +. More
precisely, one wants to write, whenever it is possible
P{M > u} = A(u) exp

1 u2
2 2

+ B(u)

(1)

where A(u) is a known function having polynomially bounded growth as u +, 2 =


suptS Var(X(t)) and B(u) is an error bounded by a centered Gaussian density with variance
12 , 12 < 2 . We will call the first (respectively the second) term in the right-hand side of (1)
the first (resp second) order approximation of P{M > u}.
First order approximation has been considered in [3] [34] by means of the expectation of the
EPC of the excursion set Eu := {t S : X(t) > u}. This works for large values of u. The same
authors have considered the second order approximation, that is, how fast does the difference
between P{M > u} and the expected EPC tend to zero when u +.
We will address the same question both for the direct method and the EPC approximation method. Our results on the second order approximation only speak about the size of the
variance of the Gaussian bound. More precise results are only known to the authors in the
special case where S is a compact interval of the real line, the Gaussian process X is stationary
and satisfies a certain number of additional requirements (see Piterbarg [28] and Azas et al. [4]).
Theorem 5 is our first result in this direction. It gives a rough bound for the error B(u) as
u +, in the case the maximum variance is attained at some strict subset of the face in S
having the largest dimension. We are not aware of the existence of other known results under
similar conditions.
In Theorem 6 we consider processes with constant variance. This is close to Theorem 4.3
in [34]. Notice that Theorem 6 has some interest only in case suptS t < , that is, when
one can assure that 12 < 2 in (1). This is the reason for the introduction of the additional
hypothesis (S) < on the geometry of S, (see below (64) for the definition of (S)), which
is verified in some relevant situations (see the discussion before the statement of Theorem 6).
In Theorem 7, S is convex and the process stationary and isotropic. We compute the exact
asymptotic rate for the second order approximation as u + corresponding to the direct
2

method.
In all cases, the second order approximation for the direct method provides an upper bound
for the one arising from the EPC method.
Our proofs use almost no differential geometry, except for some elementary notions in Euclidean space. Let us remark also that we have separated the conditions on the law of the
process from the conditions on the geometry of the parameter set.
Third, Theorem 3 and related results in this paper, in fact refer to the density pM of
the maximum. On integration, they imply immediately a certain number of properties of the
probability distribution FM , such as the behaviour of the tail as u +.
Theorem 3 implies that FM has a density and we have an implicit expression for it. The
proof of this fact here appears to be simpler than previous ones (see Azas and Wschebor [8])
even in the case the process has 1-dimensional parameter (Azas and Wschebor [7]). Let us
remark that Theorem 3 holds true for non-Gaussian processes under appropriate conditions
allowing to apply Rice formula.
Our method can be exploited to study higher order differentiability of FM (as it has been
done in [7] for one-parameter processes) but we will not pursue this subject here.
This paper is organized as follows:
Section 2 includes an extension of Rice Formula which gives an integral expression for the
expectation of the weighted number of roots of a random system of d equations with d real
unknowns. A complete proof of this formula in a form which is adapted to our needs in this
paper, can be found in [9]. There is an extensive literature on Rice formula in various contexts
(see for example Belayiev [10] , Cramer-Leadbetter [15], Marcus [23], Adler [1], Wschebor [35].
In Section 3, we obtain the exact expression for the distribution of the maximum as a consequence of the Rice-like formula of the previous section. This immediately implies the existence
of the density and gives the implicit formula for it. The proof avoids unnecessary technicalities
that we have used in previous work, even in cases that are much simpler than the ones considered here.
In Section 4, we compute (Theorem 4) the first order approximation in the direct method
for stationary isotropic processes defined on a polyhedron, from which a new upper bound for
P{M > u} for all real u follows.
In Section 5, we consider second order approximation, both for the direct method and the
EPC approximation method. This is the content of Theorems 5, 6 and 7.
Section 6 contains some examples.

Assumptions and notations


X = {X(t) : t S} denotes a real-valued Gaussian field defined on the parameter set S. We
assume that S satisfies the hypothesis A1
A1 :
S is a compact subset of Rd

S is the disjoint union of Sd , Sd1 ..., S0 , where Sj is an orientable C 3 manifold of dimension


j without boundary. The Sj s will be called faces. Let Sd0 , d0 d be the non empty face
having largest dimension.
We will assume that each Sj has an atlas such that the second derivatives of the inverse
functions of all charts (viewed as diffeomorphisms from an open set in Rj to Sj ) are
bounded by a fixed constant. For t Sj , we denote Lt the maximum curvature of Sj at
the point t. It follows that Lt is bounded for t S.
Notice that the decomposition S = Sd ... S0 is not unique.
Concerning the random field we make the following assumptions A2-A5
A2 : X is in fact defined on an open set containing S and has C 2 paths
A3 : for every t S the distribution of X(t), X (t) does not degenerate; for every s, t S,
s = t, the distribution of X(s), X(t) does not degenerate.
A4 : Almost surely the maximum of X(t) on S is attained at a single point.
(t) denote respectively the derivative along S and the normal derivaFor t Sj , Xj (t) Xj,N
j
d
tive. Both quantities are viewed as vectors in R , and the density of their distribution will be
expressed respectively with respect to an orthonormal basis of the tangent space Tt,j of Sj at
the point t, or its orthogonal complement Nt,j . Xj (t) will denote the second derivative of X
along Sj , at the point t Sj and will be viewed as a matrix expressed in an orthogonal basis
of Tt,j . Similar notations will be used for any function defined on Sj .

A5 : Almost surely, for every j = 0, 1, . . . , d there is no point t in Sj such that Xj (t) = 0,


det(Xj (t)) = 0
Other notations and conventions will be as follows :
j is the geometric measure on Sj .
m(t) := E(X(t)), r(s, t) = Cov(X(s), X(t)) denote respectively the expectation and covariance of the process X ; r0,1 (s, t), r0,2 (s, t) are the first and the second derivatives of r
with respect to t. Analogous notations will be used for other derivatives without further
reference.
If is a random variable taking values in some Euclidean space, p (x) will denote the
density of its probability distribution with respect to the Lebesgue measure, whenever it
exists.
(x) = (2)1/2 exp(x2 /2) is the standard Gaussian density ; (x) :=

x
(y)dy.

Assume that the random vectors , have a joint Gaussian distribution, where has
values in some finite dimensional Euclidean space. When it is well defined,
E(f ()/ = x)
is the version of the conditional expectation obtained using Gaussian regression.
Eu := {t S : X(t) > u} is the excursion set above u of the function X(.) and Au :=
{M u} is the event that the maximum is not larger than u.
, , , denote respectively inner product and norm in a finite-dimensional real Euclidean
space; d is the Lebesgue measure on Rd ; S d1 is the unit sphere ; Ac is the complement
of the set A. If M is a real square matrix, M 0 denotes that it is positive definite.
4

If g : D C is a function and u C, we denote

Nug (D) := {t D : g(t) = u}

which may be finite or infinite.

Some remarks on the hypotheses


One can give simple sufficient additional conditions on the process X so that A4 and A5 hold
true.
If we assume that for each pair j, k = 0, . . . , d and each pair of distinct points s, t, s Sj , t
Sk , the distribution of the triplet
X(t) X(s), Xj (s), Xk (t))
does not degenerate in R Rj Rk , then A4 holds true.

This is well-known and follows easily from the next lemma (called Bulinskaya s lemma)
that we state without proof, for completeness.
Lemma 1 Let Z(t) be a stochastic process defined on some neighborhood of a set T embedded
in some Euclidean space. Assume that the Hausdorff dimension of T is smaller or equal than
the integer m and that the values of Z lie in Rm+k for some positive integer k . Suppose, in
addition, that Z has C 1 paths and that the density pZ(t) (v) is bounded for t T and v in some
neighborhood of u Rm+k . Then, a. s. there is no point t T such that Z(t) = u.
With respect to A5, one has the following sufficient conditions: Assume A1, A2, A3 and as
additional hypotheses one of the following two:
t

X(t) is of class C 3

sup
tS,x V (0)

P | det X (t) | < /X (t) = x 0,

as 0,

where V (0) is some neighborhood of zero.


Then A5 holds true. This follows from Proposition 2.1 of [8] and [16].

Rice formula for the number of weighted roots of random


fields

In this section we review Rice formula for the expectation of the number of roots of a random
system of equations. For proofs, see for example [8], or [9], where a simpler one is given.
Theorem 1 (Rice formula) Let Z : U Rd be a random field, U an open subset of Rd and
u Rd a fixed point in the codomain. Assume that:
(i) Z is Gaussian,
(ii) almost surely the function t
Z(t) is of class C 1 ,
(iii) for each t U , Z(t) has a non degenerate distribution (i.e. Var Z(t) 0),
(iv) P{t U, Z(t) = u, det Z (t) = 0} = 0
Then, for every Borel set B contained in U , one has
E NuZ (B) =

E | det(Z (t))|/Z(t) = u pZ(t) (u)dt.

If B is compact, then both sides in (2) are finite.


5

(2)

Theorem 2 Let Z be a random field that verifies the hypotheses of Theorem 1. Assume that

for each t U one has another random field Y t : W Rd , where W is some topological space,
verifying the following conditions:
a) Y t (w) is a measurable function of (, t, w) and almost surely, (t, w)
ous.

Y t (w) is continu-

Z(s), Y t (w) defined on U W is Gaussian.

b) For each t U the random process (s, w)

Moreover, assume that g : U C(W, Rd ) R is a bounded function, which is continuous when

one puts on C(W, Rd ) the topology of uniform convergence on compact sets. Then, for each
compact subset I of U , one has
g(t, Y t ) =

E
tI,Z(t)=u

E | det(Z (t)|g(t, Y t )/Z(t) = u).pZ(t) (u)dt.

(3)

Remarks:
1. We have already mentioned in the previous section sufficient conditions implying hypothesis (iv) in Theorem 1.
2. With the hypotheses of Theorem 1 it follows easily that if J is a subset of U , d (J) = 0,
then P{NuZ (J) = 0} = 1 for each u Rd .

The implicit formula for the density of the maximum

Theorem 3 Under assumptions A1 to A5, the distribution of M has the density


E 1IAx /X(t) = x pX(t) (x)

pM (x) =
tS0
d

+
j=1

Sj

E | det(Xj (t))| 1IAx /X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt),

(4)

Remark: One can replace | det(Xj (t))| in the conditional expectation by (1)j det(Xj (t)),
since under the conditioning and whenever M x holds true, Xj (t) is negative semi-definite.
Proof of Theorem 3
Let Nj (u), j = 0, . . . , d be the number of global maxima of X(.) on S that belong to Sj and are
larger than u. From the hypotheses it follows that a.s.
j=0,...,d Nj (u) is equal to 0 or 1, so
that
P{M > u} =
P{Nj (u) = 1} =
E(Nj (u)).
(5)
j=0,...,d

j=0,...,d

The proof will be finished as soon as we show that each term in (5) is the integral over (u, +)
of the corresponding term in (4).
This is self-evident for j = 0. Let us consider the term j = d. We apply the weighted Rice
formula of Section 2 as follows :
Z is the random field X defined on Sd .
For each t Sd , put W = S and Y t : S R2 defined as:
Y t (w) := X(w) X(t), X(t) .
Notice that the second coordinate in the definition of Y t does not depend on w.
6

In the place of the function g, we take for each n = 1, 2, . . . the function gn defined as
follows:
gn (t, f1 , f2 ) = gn (f1 , f2 ) = 1 Fn (sup f1 (w)) . 1 Fn (u f2 (w)) ,
wS

where w is any point in W and for n a positive integer and x 0, we define :


Fn (x) := F(nx) ;

with F(x) = 0 if 0 x 1/2 , F(x) = 1 if x 1 ,

(6)

and F monotone non-decreasing and continuous.


It is easy to check that all the requirements in Theorem 2 are satisfied, so that, for the value 0
instead of u in formula (3) we get:
gn (Y t ) =

E
tSd ,X (t)=0

Sd

E | det(X (t)|gn (Y t )/X (t) = 0).pX (t) (0)d (dt).

(7)

Notice that the formula holds true for each compact subset of Sd in the place of Sd , hence for
Sd itself by monotone convergence.
Let now n in (7). Clearly gn (Y t ) 1IX(s)X(t)0,sS . 1IX(t)u . The passage to the limit
does not present any difficulty since 0 gn (Y t ) 1 and the sum in the left-hand side is bounded

by the random variable N0X (Sd ), which is in L1 because of Rice Formula. We get
E(Nd (u)) =
Sd

E | det(X (t)| 1IX(s)X(t)0,sS 1IX(t)u /X (t) = 0).pX (t) (0)d (dt)

Conditioning on the value of X(t), we obtain the desired formula for j = d.


The proof for 1 j d 1 is essentially the same, but one must take care of the parameterization of the manifold Sj . One can first establish locally the formula on a chart of Sj , using
local coordinates.
It can be proved as in [8], Proposition 2.2 (the only modification is due to the term 1IAx )
that the quantity written in some chart as
E det(Y (s)) 1IAx /Y (s) = x, Y (s) = 0 pY (s),Yj (s) (x, 0)ds,
where the process Y (s) is the process X written in some chart of Sj ,
(Y (s) = X(1 (s))), defines a j-form. By a j-form we mean a mesure on Sj that does not
depend on the parameterization and which has a density with respect to the Lebesgue measure
ds in every chart. It can be proved also that the integral of this j-form on Sj gives the
expectation of Nj (u).
To get formula (2) it suffices to consider locally around a precise point t Sj the chart
given by the projection on the tangent space at t. In this case we obtain that at t
ds is in fact j (dt)
Y (s) is isometric to Xj (t)
where s = (t).
The first consequence of Theorem 3 is the next corollary. For the statement, we need to
introduce some further notations.
For t in Sj , j d0 we define Ct,j as the closed convex cone generated by the set of directions:
{ Rd :

= 1 ; sn S, (n = 1, 2, . . .) such that sn t,
7

t sn
as n +},
t sn

whenever this set is non-empty and Ct,j = {0} if it is empty. We will denote by Ct,j the dual
cone of Ct,j , that is:
Ct,j := {z Rd : z, 0 for all Ct,j }.
Notice that these definitions easily imply that Tt,j Ct,j and Ct,j Nt,j . Remark also that for
j = d0 , Ct,j = Nt,j .
We will say that the function X(.) has an extended outward derivative at the point t in
(t) C .
Sj , j d0 if Xj,N
t,j
Corollary 1 Under assumptions A1 to A5, one has :
(a) pM (x) p(x) where
E 1IX (t)Cbt,0 /X(t) = x pX(t) (x)+

p(x) :=
tS0
d0
Sj

j=1

E | det(Xj (t))| 1IX

j,N (t)Ct,j

/X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt). (8)

(b) P{M > u}

p(x)dx.
u

Proof
(a) follows from Theorem 3 and the observation that if t Sj , one has
(t) C }. (b) is an obvious consequence of (a).
{M X(t)} {Xj,N
t,j
The actual interest of this Corollary depends on the feasibility of computing p(x). It turns
out that it can be done in some relevant cases, as we will see in the remaining of this section.
+
Our result can be compared with the approximation of P{M > u} by means of u pE (x)dx
given by [3], [34] where
pE (x) :=

E 1IX (t)Cbt,0 /X(t) = x pX(t) (x)


tS0

d0

(1)j

+
j=1

Sj

E det(Xj (t)) 1IX

j,N (t)Ct,j

/X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt). (9)

Under certain conditions , u pE (x)dx is the expected value of the EPC of the excursion set
Eu (see [3]). The advantage of pE (x) over p(x) is that one can have nice expressions for it in
quite general situations. Conversely p(x) has the obvious advantage that it is an upper-bound
of the true density pM (x) and hence provides upon integrating once, an upper-bound for the
tail probability, for every u value. It is not known whether a similar inequality holds true for
pE (x).
On the other hand, under additional conditions, both provide good first order approximations
for pM (x) as x as we will see in the next section. In the special case in which the process
X is centered and has a law that is invariant under isometries and translations, we describe
below a procedure to compute p(x).

Computing p(x) for stationary isotropic Gaussian fields

For one-parameter centered Gaussian process having constant variance and satisfying certain
regularity conditions, a general bound for pM (x) has been computed in [8], pp.75-77. In the
two parameter case, Mercadier [26] has shown a bound for P{M > u}, obtained by means of a
method especially suited to dimension 2. When the parameter is one or two-dimensional, these
bounds are sharper than the ones below which, on the other hand, apply to any dimension but
to a more restricted context. We will assume now that the process X is centered Gaussian,
with a covariance function that can be written as
E X(s).X(t) = s t

(10)

where : R+ R is of class C 4 . Without loss of generality, we assume that (0) = 1.


Assumption (10) is equivalent to saying that the law of X is invariant under isometries (i.e.
linear transformations that preserve the scalar product) and translations of the underlying
parameter space Rd .
We will also assume that the set S is a polyhedron. More precisely we assume that each
Sj (j = 1, . . . , d) is a union of subsets of affine manifolds of dimension j in Rd .
The next lemma contains some auxiliary computations which are elementary and left to the
reader. We use the abridged notation : := (0), := (0)
Lemma 2 Under the conditions above, for each t U , i, i , k, k , j = 1, . . . , d:
1. E

X
ti (t).X(t)

2. E

X
X
ti (t). tk (t)

3. E

2X
ti tk (t).X(t)

4. E

2X
2X
ti tk (t). ti tk (t)

= 0,
= 2 ik and < 0,
= 2 ik , E

2X
X
ti tk (t). tj (t)

=0

= 24 ii .kk + i k .ik + ik i k ,

5. 2 0
6. If t Sj , the conditional distribution of Xj (t) given X(t) = x, Xj (t) = 0 is the same as
the unconditional distribution of the random matrix
Z + 2 xIj ,
where Z = (Zik : i, k = 1, . . . , j) is a symmetric j j matrix with centered Gaussian
entries, independent of the pair X(t), X (t) such that, for i k, i k one has :
E(Zik Zi k ) = 4 2 ii + ( 2 ) ik i k + 4 ii .kk (1 ik ) .
Let us introduce some additional notations:
Hn (x), n = 0, 1, . . . are the standard Hermite polynomials, i.e.
Hn (x) := ex

n x2

For the properties of the Hermite polynomials we refer to Mehta [25].


H n (x), n = 0, 1, . . . are the modified Hermite polynomials, defined as:
H n (x) := ex

2 /2

n x2 /2

We will use the following result:


Lemma 3 Let

Jn (x) :=

ey

2 /2

Hn ()dy, n = 0, 1, 2, . . .

(11)

where stands for the linear form = ay + bx where a, b are some real parameters that satisfy
a2 + b2 = 1/2. Then

Jn (x) := (2b)n 2 H n (x).


Proof :
It is clear that Jn is a polynomial having degree n. Differentiating in (11) under the integral
sign, we get:
Jn (x) = b

ey

2 /2

Hn ()dy = 2nb

Also:

Jn (0) =

ey

2 /2

Hn1 ()dy = 2n b Jn1 (x)

(12)

ey

2 /2

Hn (ay)dy,

so that Jn (0) = 0 if n is odd.


If n is even, n 2, using the standard recurrence relations for Hermite polynomials, we have:
+

Jn (0) =

ey

2 /2

= 2a2

ey

2ayHn1 (ay) 2(n 1)Hn2 (ay) dy


2 /2

(ay)dy 2(n 1)Jn2 (0)


Hn1

= 4b2 (n 1)Jn2 (0).

Equality (13) plus J0 (x) = 2 for all x R, imply that:

(2p)!
2.
J2p (0) = (1)p (2b)2p (2p 1)!! 2 = (2b2 )p
p!

(13)

(14)

Now we can go back to (12) and integrate successively for n = 1, 2, . . . on the interval [0, x]
using the initial value given by (14) when n = 2p and Jn (0) = 0 when n is odd, obtaining :

Jn (x) = (2b)n 2Qn (x),


where the sequence of polynomials Qn , n = 0, 1, 2, . . . verifies the conditions:
Q0 (x) = 1

(15)

Qn (x)

= nQn (x)

(16)

Qn (0) = 0 if n is odd

(17)

Qn (0) = (1)n/2 (n 1)!! if n is even.

(18)

It is now easy to show that in fact Qn (x) = H n (x) , n = 0, 1, 2, . . . using for example that:
x
H n (x) = 2n/2 Hn .
2

The integrals

In (v) =

2 /2

et

Hn (t)dt,

will appear in our computations. They are computed in the next Lemma, which can be proved
easily, using the standard properties of Hermite polynomials.
10

Lemma 4 (a)
[ n1
]
2
v2 /2

(n 1)!!
Hn12k (v)
(n 1 2k)!!

(n 1)!! 2 (x)

2k

In (v) = 2e

k=0
n

+ 1I{n even} 2 2
(b)

In () = 1I{n even} 2 2 (n 1)!!

(19)
(20)
(21)

Theorem 4 Assume that the process X is centered Gaussian, satisfies conditions A1-A5 with
a covariance having the form (10) and verifying the regularity conditions of the beginning of this
section. Moreover, let S be a polyhedron. Then, p(x) can be expressed by means of the following
formula:

d0

| | j/2
p(x) = (x)
H j (x) + Rj (x) gj
0 (t) +
,
(22)

j=1

tS0

where

gj is a geometric parameter of the face Sj defined by


j (t)j (dt),

gj =

(23)

Sj

where j (t) is the normalized solid angle of the cone Ct,j in Nt,j , that is:
j (t) =

dj1 (Ct,j S dj1 )


for j = 0, . . . , d 1,
dj1 (S dj1 )

d (t) = 1.

(24)
(25)

Notice that for convex or other usual polyhedra j (t) is constant for t Sj , so that gj is
equal to this constant multiplied by the j-dimensional geometric measure of Sj .
For j = 1, . . . d,
Rj (x) =

2
| |

j
2

((j + 1)/2

y2
dy
2

(26)

with := | |( )1/2

(27)

Tj (v) exp

where
v := (2)1/2 (1 2 )1/2 y x
and

j1

Tj (v) :=
k=0

Hk2 (v) v2 /2
Hj (v)
Ij1 (v).
e
j
2k k!
2 (j 1)!

(28)

where In is given in the previous Lemma.


For the proof of the theorem, we need some ingredients from random matrices theory.
Following Mehta [25], denote by qn () the density of eigenvalues of n n GOE matrices at the
point , that is, qn ()d is the probability of Gn having an eigenvalue in the interval (, + d).
The random nn real random matrix Gn is said to have the GOE distribution, if it is symmetric,
2 ) = 1/2 if i < k
with centered Gaussian entries gik , i, k = 1, . . . , n satisfying E(gii2 ) = 1, E(gik
11

and the random variables: {gik , 1 i k n} are independent.


It is well known that:
n1

2 /2

qn () = e

2 /2

c2k Hk2 ()
k=0
+

+ 1/2 (n/2)1/2 cn1 cn Hn1 ()

ey

+ 1I{n odd

2 /2

Hn (y)dy 2

ey

2 /2

Hn (y)dy

Hn1 ()
,
+ y 2 /2
e
H
(y)dy
n1

(29)

where ck := (2k k! )1/2 , k = 0, 1, . . ., (see Mehta [25], ch. 7.)


In the proof of the theorem we will use the following remark due to Fyodorov [18] that we
state as a Lemma
Lemma 5 Let Gn be a GOE n n matrix. Then, for R one has:
E | det(Gn In )| = 23/2 (n + 3)/2 exp( 2 /2)

qn+1 ()
,
n+1

(30)

Proof:
Denote by 1 , . . . , n the eigenvalues of Gn . It is well-known (Mehta [25], Kendall et al. [19])
that the joint density fn of the n-tuple of random variables (1 , . . . , n ) is given by the formula
n

n
2
i=1 i

fn (1 , . . . , n ) = cn exp

1i<kn

|k i | , with cn := (2)n/2 ((3/2))n

(1+i/2)
i=1

Then,
n

E | det(Gn In )| = E

i=1

|i |

=
Rn i=1

|i |cn exp(

= e

2 /2

cn
cn+1

Rn

n
2
i=1 i

)
1i<kn

|k i | d1 , . . . , dn

fn+1 (1 , . . . , n , )d1 , . . . , dn = e

2 /2

cn qn+1 ()
.
cn+1 n + 1

The remainder is plain.


Proof of Theorem 4:
We use the definition (8) given in Corollary 1 and the moment computations of Lemma 2 which
imply that:
pX(t) (x) = (x)

(31)
j/2

pX(t),Xj (t) (x, 0) = (x)(2)

j/2

(2 )

X (t) is independent of X(t)

Xj,N
(t)

is independent of

(33)

(Xj (t), X(t), Xj (t)).

Since the distribution of X (t) is centered Gaussian with variance 2 Id , it follows that :
E( 1IX (t)Cbt,0 /X(t) = x) = 0 (t)
12

(32)

if t S0 ,

(34)

and if t Sj , j 1:
E(| det(Xj (t))| 1IX

j,N (t)Ct,j

/X(t) = x, Xj (t) = 0)

= j (t) E(| det(Xj (t))|/X(t) = x, Xj (t) = 0)


= j (t) E(| det(Z + 2 xIj )|). (35)
In the formula above, j (t) is the normalized solid angle defined in the statement of the theorem
and the random j j real matrix Z has the distribution of Lemma 2 .
A standard moment computations shows that Z has the same distribution as the random matrix:
2 Ij ,

8 Gj + 2

where Gj is a j j GOE random matrix, is standard normal in R and independent of Gj .


So, for j 1 one has
+

E | det(Z + 2 xIj )| = (8 )j/2

E | det(Gj Ij )| (y)dy,

where is given by (27).


For the conditional expectation in (8) use this last expression in (35) and (5). For the density
in (8) use (32). Then Lemma 3 gives (22).

Remarks on the theorem


The principal term is
(x)

d0

0 (t) +
j=1

tS0

| |

j/2

H j (x) gj

(36)

which is the product of a standard Gaussian density times a polynomial with degree d0 .
Integrating once, we get -in our special case- the formula for the expectation of the EPC
of the excursion set as given by [3]
The complementary term given by
d0

Rj (x)gj ,

(x)

(37)

j=1

can be computed by means of a formula, as it follows from the statement of the theorem
above. These formulae will be in general quite unpleasant due to the complicated form of
Tj (v). However, for low dimensions they are simple. For example:

2 (v) v(1 (v)) ,

T2 (v) = 2 2(v),

T1 (v) =

T3 (v) =

3(2v 2 + 1)(v) (2v 2 3)v(1 (v)) .


2

13

(38)
(39)
(40)

Second order asymptotics for pM (x) as x + will be mainly considered in the next
section. However, we state already that the complementary term (37) is equivalent, as
x +, to
12

(x) gd0 Kd0 x2d0 4 e

2
3 2

x2

(41)

where the constant Kj , j = 1, 2, ... is given by:


Kj = 23j2

j+1

2
j/4
j/2
3 2
(2) (j 1)!

2j4

(42)

We are not going to go through this calculation, which is elementary but requires some
work. An outline of it is the following. Replace the Hermite polynomials in the expression
for Tj (v) given by (28) by the well-known expansion:
[j/2]

(1)i

Hj (v) = j!
i=0

(2v)j2i
i!(j 2i)!

(43)

and Ij1 (v) by means of the formula in Lemma 4.


Evaluating the term of highest degree in the polynomial part, this allows to prove that,
as v +, Tj (v) is equivalent to

v2
2j1
v 2j4 e 2 .
(j 1)!

(44)

Using now the definition of Rj (x) and changing variables in the integral in (26), one gets
for Rj (x) the equivalent:
12

Kj x2j4 e

2
3 2

x2

(45)

In particular, the equivalent of (37) is given by the highest order non-vanishing term in
the sum.
Consider now the case in which S is the sphere S d1 and the process satisfies the same
conditions as in the theorem. Even though the theorem can not be applied directly,
it is possible to deal with this example to compute p(x), only performing some minor
changes. In this case, only the term that corresponds to j = d 1 in (8) does not vanish,
= 1 for each t S d1 and one can use invariance
Ct,d1 = Nt,d1 , so that 1IX
b
(t)C
d1,N

t,d1

under rotations to obtain:


p(x) = (x)

d1 S d1
E | det(Z + 2 xId1 ) + (2| |)1/2 Id1 |
(2)(d1)/2

(46)

where Z is a (d 1) (d 1) centered Gaussian matrix with the covariance structure


of Lemma 2 and is a standard Gaussian real random variable, independent of Z. (46)
follows from the fact that the normal derivative at each point is centered Gaussian with

14

variance 2| | and independent of the tangential derivative. So, we apply the previous
computation, replacing x by x + (2| |)1/2 and obtain the expression:
p(x) = (x)
+

2 d/2
(d/2)
| | (d1)/2
H d1 (x + (2| |)1/2 y) + Rd1 (x + (2| |)1/2 y) (y)dy.

(47)

Asymptotics as x +

In this section we will consider the errors in the direct and the EPC methods for large values
of the argument x. Theses errors are:
p(x) pM (x) =

E 1IX (t)Cbt,0 . 1IM >x /X(t) = x pX(t) (x)


tS0

d0

+
j=1

Sj

E | det(Xj (t)| 1IX

j,N (t)Ct,j

pE (x) pM (x) =

. 1IM >x /X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt). (48)

E 1IX (t)Cbt,0 . 1IM >x /X(t) = x pX(t) (x)


tS0

d0

(1)j

+
j=1

Sj

E det(Xj (t) 1IX

j,N (t)Ct,j

. 1IM >x /X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt).


(49)

It is clear that for every real x,


|pE (x) pM (x)| p(x) pM (x)
so that the upper bounds for p(x) pM (x) will automatically be upper bounds for
|pE (x) pM (x)|. Moreover, as far as the authors know, no better bounds for |pE (x) pM (x)|
than for p(x) pM (x) are known. It is an open question to determine if there exist situations
in which pE (x) is better asymptotically than p(x).
Our next theorem gives sufficient conditions allowing to ensure that the error
p(x) pM (x)
is bounded by a Gaussian density having strictly smaller variance than the maximum variance
of the given process X , which means that the error is super- exponentially smaller than pM (x)
itself, as x +. In this theorem, we assume that the maximum of the variance is not attained
in S\Sd0 . This excludes constant variance or some other stationary-like condition that will be
addressed in Theorem 6. As far as the authors know, the result of Theorem 5 is new even for
one-parameter processes defined on a compact interval.
For parameter dimension d0 > 1, the only result of this type for non-constant variance
processes of which the authors are aware is Theorem 3.3 of [34].
Theorem 5 Assume that the process X satisfies conditions A1 -A5. With no loss of generality,
we assume that maxtS Var(X(t)) = 1. In addition, we will assume that the set Sv of points
t S where the variance of X(t) attains its maximal value is contained in Sd0 (d0 > 0) the
non-empty face having largest dimension and that no point in Sv is a boundary point of S\Sd0 .
Then, there exist some positive constants C, such that for every x > 0.
|pE (x) pM (x)| p(x) pM (x) C(x(1 + )),
where (.) is the standard normal density.
15

(50)

Proof :
Let W be an open neighborhood of the compact subset Sv of S such that dist(W, (S\Sd0 )) > 0
where dist denote the Euclidean distance in Rd . For t Sj W c , the density
pX(t),Xj (t) (x, 0)
can be written as the product of the density of Xj (t) at the point 0, times the conditional density
of X(t) at the point x given that Xj (t) = 0, which is Gaussian with some bounded expectation
and a conditional variance which is smaller than the unconditional variance, hence, bounded by
some constant smaller than 1. Since the conditional expectations in (48) are uniformly bounded
by some constant, due to standard bounds on the moments of the Gaussian law, one can deduce
that:
p(x) pM (x) =

W Sd0

E | det(Xd0 (t))| 1IX

d0 ,N (t)Ct,d0

.pX(t),Xd

. 1IM >x /X(t) = x, Xd 0 (t) = 0

(t) (x, 0)d0 (dt)

+ O(((1 + 1 )x)), (51)

as x +, for some 1 > 0. Our following task is to choose W such that one can assure
that the first term in the right hand-member of (51) has the same form as the second, with a
possibly different constant 1 .
To do this , for s S and t Sd0 , let us write the Gaussian regression formula of X(s) on the
pair (X(t), Xd 0 (t)):
X(s) = at (s)X(t) + bt (s), Xd 0 (t) +

ts
2

X t (s).

(52)

where the regression coefficients at (s), bt (s) are respectively real-valued and Rd0 -valued.
From now onwards, we will only be interested in those t W . In this case, since W does not
contain boundary points of S\Sd0 , it follows that
Ct,d0 = Nt,d0 and 1IX

d0 ,N (t)Ct,d0

= 1.

Moreover, whenever s S is close enough to t, necessarily, s Sd0 and one can show that
the Gaussian process {X t (s) : t W Sd0 , s S} is bounded, in spite of the fact that its
trajectories are not continuous at s = t. For each t, {X t (s) : s S} is a helix process, see [8]
for a proof of boundedness.
On the other hand, conditionally on X(t) = x, Xd 0 (t) = 0 the event {M > x} can be written as
{X t (s) > t (s) x, for some s S}
where
t (s) =

2(1 at (s))
.
ts 2

(53)

Our next goal is to prove that if one can choose W in such a way that
inf{ t (s) : t W Sd0 , s S, s = t} > 0,

(54)

then we are done. In fact, apply the Cauchy-Schwarz inequality to the conditional expectation
in (51). Under the conditioning, the elements of Xd0 (t) are the sum of affine functions of x
with bounded coefficients plus centered Gaussian variables with bounded variances, hence, the
absolute value of the conditional expectation is bounded by an expression of the form
Q(t, x)

1/2

X t (s)
>x
t
sS\{t} (s)
sup
16

1/2

(55)

where Q(t, x) is a polynomial in x of degree 2d0 with bounded coefficients. For each t W Sd0 ,
the second factor in (55) is bounded by
P sup

X t (s)
: t W Sd0 , s S, s = t > x
t (s)

1/2

Now, we apply to the bounded separable Gaussian process


X t (s)
: t W Sd0 , s S, s = t
t (s)
the classical Landau-Shepp-Fernique inequality [20], [17] which gives the bound
P sup

X t (s)
: t W Sd0 , s S, s = t > x C2 exp(2 x2 ),
t (s)

for some positive constants C2 , 2 and any x > 0. Also, the same argument above for the density
pX(t),Xd (t) (x, 0) shows that it is bounded by a constant times the standard Gaussian density.
0
To finish, it suffices to replace these bounds in the first term at the right-hand side of (51).
It remains to choose W for (54) to hold true. Consider the auxiliary process
Y (s) :=

X(s)
r(s, s)

, s S.

(56)

Clearly, Var(Y (s)) = 1 for all s S. We set


r Y (s, s ) := Cov(Y (s), Y (s )) , s, s S.
Let us assume that t Sv . Since the function s
Var(X(s)) attains its maximum value at

s = t, it follows that X(t), Xd0 (t) are independent, on differentiation under the expectation
sign. This implies that in the regression formula (52) the coefficients are easily computed and
at (s) = r(s, t) which is strictly smaller than 1 if s = t, because of the non-degeneracy condition.
Then
2(1 r(s, t))
2(1 r Y (s, t))
t (s) =

.
(57)
ts 2
ts 2

Since r Y (s, s) = 1 for every s S, the Taylor expansion of r Y (s, t) as a function of s, around
s = t takes the form:
Y
r Y (s, t) = 1 + s t, r20,d
(t, t)(s t) + o( s t 2 ),
0

(58)

where the notation is self-explanatory.


Also, using that Var(Y (s)) = 1 for s S, we easily obtain:
Y
(t, t) = Var(Yd0 (t)) = Var(Xd 0 (t))
r20,d
0,

(59)

where the last equality follows by differentiation in (56) and putting s = t. (59) implies that
Y
(t, t) is uniformly positive definite on t Sv , meaning that its minimum eigenvalue has
r20,d
0,
a strictly positive lower bound. This, on account of (57) and (58), already shows that
inf{ t (s) : t Sv , s S, s = t} > 0,

(60)

The foregoing argument also shows that


inf{ (at )d0 (t) : t Sv , S d0 1 , s = t} > 0,
17

(61)

since whenever t Sv , one has at (s) = r(s, t) so that


(at )d0 (t) = r20,d0 , (t, t).
To end up, assume there is no neighborhood W of Sv satisfying (54). In that case using a
compactness argument, one can find two convergent sequences {sn } S , {tn } Sd0 , sn s0 ,
tn t0 Sv such that
tn (sn ) 0.
may be .
t0 = s0 is not possible, since it would imply
=2

(1 at0 (s0 ))
= t0 (s0 ),
t0 s 0 2

which is strictly positive.


If t0 = s0 , on differentiating in (52) with respect to s along Sd0 we get:
Xd 0 (s) = (at )d0 (s)X(t) + (bt )d0 (s), Xd 0 (t) +

d0 t s
s
2

X t (s),

where (at )d0 (s) is a column vector of size d0 and (bt )d0 (s) is a d0 d0 matrix. Then, one must
have at (t) = 1, (at )d0 (t) = 0 . Thus
tn (sn ) = uTn (at0 )d0 (t0 )un + o(1),
where un := (sn tn )/ sn tn . Since t0 Sv we may apply (61) and the limit of tn (sn )
cannot be non-positive.
A straightforward application of Theorem 5 is the following
Corollary 2 Under the hypotheses of Theorem 5, there exists positive constants C, such that,
for every u > 0 :
+

pE (x)dx P(M > u)

+
u

p(x)dx P(M > u) CP( > u),

where is a centered Gaussian variable with variance 1


The precise order of approximation of p(x) pM (x) or pE (x) pM (x) as x + remains
2 respectively which
in general an open problem, even if one only asks for the constants d2 , E
govern the second order asymptotic approximation and which are defined by means of

and

1
:= lim 2x2 log p(x) pM (x)
x+
d2

(62)

1
lim 2x2 log pE (x) pM (x)
2 := x+
E

(63)

whenever these limits exist. In general, we are unable to compute the limits (62) or (63) or
even to prove that they actually exist or differ. Our more general results (as well as in [3], [34])
only contain lower-bounds for the liminf as x +. This is already interesting since it gives
some upper-bounds for the speed of approximation for pM (x) either by p(x) or pE (x). On the
other hand, in Theorem 7 below, we are able to prove the existence of the limit and compute
d2 for a relevant class of Gaussian processes.
18

For the next theorem we need an additional condition on the parameter set S. For S
verifying A1 we define
(S) = sup

sup

0jd0 tSj

sup
sS,s=t

dist (t s), Ct,j


st 2

(64)

where dist is the Euclidean distance in Rd .


One can show that (S) < in each one of the following classes of parameter sets S:
- S is convex, in which case (S) = 0.
- S is a C 3 manifold, with or without boundary.
- S verifies the following condition: For every t S there exists an open neighborhood V of
t in Rd and a C 3 diffeomorphism : V B(0, r) (where B(0, r) denotes the open ball in Rd
centered at 0 and having radius r, r > 0) such that
(V S) = C B(0, r), where C is a convex cone.
However, (S) < can fail in general. A simple example showing what is going on is the
following: take an orthonormal basis of R2 and put
S = {(, 0) : 0 1} {( cos , sin ) : 0 1}
where 0 < < , that is, S is the boundary of an angle of size . One easily checks that
(S) = +. Moreover it is known [3] that in this case the EPC approximation does not verify
a super- exponential inequality. More generally, sets S having whiskers have (S) = +.
Theorem 6 Let X be a stochastic process on S satisfying A1 -A5. Suppose in addition that
Var(X(t)) = 1 for all t S and that (S) < +.
Then
1
(65)
lim inf 2x2 log p(x) pM (x) 1 + inf 2
x+
tS + (t)2
t
t
with

Var X(s)/X(t), X (t)


(1 r(s, t))2
sS\{t}

t2 := sup
and
t := sup

dist 1
t r01 (s, t), Ct,j
1 r(s, t)

sS\{t}

(66)

where
t := Var(X (t))
(t) is the maximum eigenvalue of t
in (66), j is such that t Sj ,(j = 0, 1, . . . , d0 ).
The quantity in the right hand side of (65) is strictly bigger than 1.
Remark. In formula (65) it may happen that the denominator in the right-hand side is
identically zero, in which case we put + for the infimum. This is the case of the one-parameter
process X(t) = cos t + sin t where , are Gaussian standard independent random variables,
and S is an interval having length strictly smaller than .

19

Proof of Theorem 6
Let us first prove that suptS t < .
For each t S, let us write the Taylor expansions
r01 (s, t) = r01 (t, t) + r11 (t, t)(s t) + O( s t 2 )
= t (s t) + O( s t 2 )

where O is uniform on s, t S, and


1 r(s, t) = (s t)T t (s t) + O( s t 2 ) L2 s t 2 ,
where L2 is some positive constant. It follows that for s S, t Sj , s = t, one has:
dist 1
t r01 (s, t), Ct,j

L3

1 r(s, t)

dist (t s), Ct,j


st 2

+ L4 ,

(67)

where L3 and L4 are positive constants. So,


dist 1
t r01 (s, t), Ct,j
which implies suptS t < .

L3 (S) + L4 .

1 r(s, t)

With the same notations as in the proof of Theorem 5, using (4) and (8), one has:

p(x) pM (x) = (x)

E 1IX (t)Cbt,0 . 1IM >x /X(t) = x


tS0

d0

+
j=1

Sj

E | det(Xj (t))| 1IX

j,N (t)Ct,j .

1IM >x /X(t) = x, Xj (t) = 0

(2)j/2 (det(Var(Xj (t))))1/2 j (dt) . (68)


Proceeding in a similar way to that of the proof of Theorem 5, an application of the Holder
inequality to the conditional expectation in each term in the right-hand side of (68) shows that
the desired result will follow as soon as we prove that:

lim inf 2x2 log P {Xj,N


Ct,j } {M > x}/X(t) = x, Xj (t) = 0
x+

t2

1
,
+ (t)2t

for each j = 0, 1, . . . , d0 , where the liminf has some uniformity in t.


Let us write the Gaussian regression of X(s) on the pair (X(t), X (t))
X(s) = at (s)X(t) + bt (s), X (t) + Rt (s).
Since X(t) and X (t) are independent, one easily computes :
at (s) = r(s, t)
bt (s) = 1
t r01 (s, t).
Hence, conditionally on X(t) = x, Xj (t) = 0, the events

T
{M > x} and {Rt (s) > (1 r(s, t))x r01
(s, t)1
t Xj,N (t) for some s S}

20

(69)

coincide.
(t)|X (t) = 0) the regression of X (t) on X (t) = 0. So, the probability in
Denote by (Xj,N
j
j
j,N
(69) can written as

Cbt,j

P{ t (s) > x

T (s, t)1 x
r01

for some s S}pXj,N


(t)|Xj (t)=0 (x )dx
1 r(s, t)

(70)

where
t (s) :=

Rt (s)
1 r(s, t)

dx is the Lebesgue measure on Nt,j . Remember that Ct,j Nt,j .

If 1
t r01 (s, t) Ct,j one has

T
r01
(s, t)1
t x 0

for every x Ct,j , because of the definition of Ct,j .


/ Ct,j , since Ct,j is a closed convex cone, we can write
If 1
t r01 (s, t)

1
t r01 (s, t) = z + z

with z Ct,j , z z and z = dist(1


t r01 (s, t), Ct,j ).

So, if x Ct,j :
T (s, t)1 x
r01
z T x + z T x
t
=
t x
1 r(s, t)
1 r(s, t)

using that z T x 0 and the Cauchy-Schwarz inequality. It follows that in any case, if x Ct,j
the expression in (70) is bounded by
Cbt,j

P t (s) > x t x for some s S pXj,N


(t)|Xj (t)=0 (x )dx .

(71)

To obtain a bound for the probability in the integrand of (71) we will use the classical
inequality for the tail of the distribution of the supremum of a Gaussian process with bounded
paths.
The Gaussian process (s, t))
t (s), defined on (S S)\{s = t} has continuous paths. As
the pair (s, t) approches the diagonal of S S, t (s) may not have a limit but, almost surely,
it is bounded (see [8] for a proof). (For fixed t, t (.) is a helix process with a singularity at
s = t, a class of processes that we have already met above).
We set
mt (s) := E( t (s)) (s = t)
m := sups,tS,s=t |mt (s)|
:= E | sups,tS,s=t t (s) mt (s) | .
The almost sure boundedness of the paths of t (s) implies that m < and < . Applying the
Borell-Sudakov-Tsirelson type inequality (see for example Adler [2] and references therein) to
the centered process s
t (s)mt (s) defined on S\{t} , we get whenever xt x m > 0:
P{ t (s) > x t x for some s S}

P{ t (s) mt (s) > x t x m for some s S}


2 exp
21

(x t x m )2
.
2t2

The Gaussian density in the integrand of (71) is bounded by


(2j (t))

jd
2

x mj,N (t)

exp

2j (t)

(t)|X (t))
where j (t) and j (t) are respectively the minimum and maximum eigenvalue of Var(Xj,N
j

and mj,N (t) is the conditional expectation E(Xj,N (t)|Xj (t) = 0). Notice that j (t), j (t), mj,N (t)
are bounded, j (t) is bounded below by a positive constant and j (t) (t).

Replacing into (71) we have the bound :

P {Xj,N
Ct,j } {M > x}/X(t) = x, Xj (t) = 0

(2j (t))

jd
2

Cbt,j {xt x m>0}

exp

x mj,N (t) 2
(x t x m )2
dx
+
2t2
2(t)
xm

+ P Xj,N
(t)|Xj (t) = 0
, (72)
t

where it is understood that the second term in the right-hand side vanishes if t = 0.
Let us consider the first term in the right-hand side of (72). We have:
x mj,N (t)
(x t x m )2
+
2t2
2(t)

2
(x t x m )2 ( x mj,N (t) )
+
2t2
2(t)
(x m t mj,N (t) )2
2
,
= A(t) x + B(t)(x m ) + C(t) +
2t2 + 2(t)2t

where the last inequality is obtained after some algebra, A(t), B(t), C(t) are bounded functions
and A(t) is bounded below by some positive constant.
So the first term in the right-hand side of (72) is bounded by :
2.(2j )

jd
2

exp

(x m t mj,N (t))2
2t2 + 2(t)2t

Rdj

exp A(t) x + B(t)(x m ) + C(t)


L|x|dj1 exp

dx

(x m t mj,N (t) )2
2t2 + 2(t)2t

(73)

where L is some constant. The last inequality follows easily using polar coordinates.
Consider now the second term in the right-hand side of (72). Using the form of the conditional

density pXj,N
(t)/Xj (t)=0 (x ), it follows that it is bounded by
P

(Xj,N
(t)/Xj (t)

= 0)

mj,N (t)

x m t mj,N (t)
t

L1 |x|dj2 exp

(x m t mj,N (t) )2
2(t)2t

where L1 is some constant. Putting together (73) and (74) with (72), we obtain (69).
The following two corollaries are straightforward consequences of Theorem 6:
22

(74)

Corollary 3 Under the hypotheses of Theorem 6 one has


lim inf 2x2 log |pE (x) pM (x)| 1 + inf

tS 2
t

x+

1
.
+ (t)2t

Corollary 4 Let X a stochastic process on S satisfying A1 -A5. Suppose in addition that


E(X(t)) = 0, E(X 2 (t)) = 1, Var(X (t) = Id for all t S.
Then
+
1
pE (x)dx 1 + inf 2
.
lim inf 2u2 log P(M > u)
u+
tS t + 2
u
t
and

d0

(1)j (2)j/2 gj H j (x) (x).

p (x) =
j=0

where gj is given by (23) and H j (x) has been defined in Section 4.


The proof follows directly from Theorem 6 the definition of pE (x) and the results in [1].

Examples

1) A simple application of Theorem 5 is the following. Let X be a one parameter real-valued


centered Gaussian process with regular paths, defined on the interval [0, T ] and satisfying an
adequate non-degeneracy condition. Assume that the variance v(t) has a unique maximum, say
1 at the interior point t0 , and k = min{j : v (2j) (t0 ) = 0} < . Notice that v (2k) (t0 ) < 0. Then,
one can obtain the equivalent of pM (x) as x which is given by:
pM (x)

1 v (t0 )/2
1/k
kCk

E || 2k 1 x11/k (x),

(75)

1
v (2k) (t0 ) + 14 [v (t0 )]2 1Ik=2 . The
where is a standard normal random variable and Ck = (2k)!
proof is a direct application of the Laplace method. The result is new for the density of the
maximum, but if we integrate the density from u to +, the corresponding bound for P{M > u}
is known under weaker hypotheses (Piterbarg [28]).

2) Let the process X be centered and satisfy A1-A5. Assume that the the law of the process
is isotropic and stationary, so that the covariance has the form (10) and verifies the regularity
condition of Section 4. We add the simple normalization = (0) = 1/2. One can easily
check that
1 2 ( s t 2 ) 42 ( s t 2 ) s t 2
(76)
t2 = sup
[1 ( s t 2 )]2
sS\{t}
Furthermore if
(x) 0 for x 0

(77)

one can show that the sup in (76) is attained as s t 0 and is independent of t. Its value
is
t2 = 12 1.
The proof is elementary (see [4] or [34]).
Let S be a convex set. For t Sj , s S:
dist r01 (s, t), Ct,j = dist 2 ( s t 2 )(t s), Ct,j .
23

(78)

The convexity of S implies that (t s) Ct,j . Since Ct,j is a convex cone and 2 ( s t 2 ) 0,
one can conclude that r01 (s, t) Ct,j so that the distance in (78) is equal to zero. Hence,
t = 0 for every t S
and an application of Theorem 6 gives the inequality
lim inf
x+

1
2
.
log p(x) pM (x) 1 +
2

x
12 1

(79)

A direct consequence is that the same inequality holds true when replacing p(x) pM (x) by
|pE (x) pM (x)| in (79), thus obtainig the main explicit example in Adler and Taylor [3], or in
Taylor et al. [34].
Next, we improve (79). In fact, under the same hypotheses, we prove that the liminf is an
ordinary limit and the sign is an equality sign. We state this as
Theorem 7 Assume that X is centered, satisfies hypotheses A1-A5, the covariance has the
form (10) with (0) = 1/2, (x) 0 f or x 0. Let S be a convex set, and d0 = d 1.
Then
1
2
.
(80)
lim log p(x) pM (x) = 1 +
x+ x2
12 1
Remark Notice that since S is convex, the added hypothesis that the maximum dimension d0
such that Sj is not empty is equal to d is not an actual restriction.
Proof of Theorem 7
In view of (79), it suffices to prove that
lim sup
x+

1
2
log p(x) pM (x) 1 +
.
2

x
12 1

(81)

Using (4) and the definition of p(x) given by (8), one has the inequality
p(x) pM (x) (2)d/2 (x)

Sd

E | det(X (t))| 1IM >x /X(t) = x, X (t) = 0)d (dt),

(82)

where our lower bound only contains the term corresponding to the largest dimension and we
have already replaced the density pX(t),X (t) (x, 0) by its explicit expression using the law of the
process. Under the condition {X(t) = x, X (t) = 0} if v0T X (t)v0 > 0 for some v0 S d1 , a
Taylor expansion implies that M > x. It follows that
E | det(X (t))| 1IM >x /X(t) = x, X (t) = 0

E | det(X (t))| 1I

sup v T X (t)v > 0

/X(t) = x, X (t) = 0 . (83)

vS d1

We now apply Lemma 2 which describes the conditional distribution of X (t) given X(t) =
x, X (t) = 0 . Using the notations of this lemma, we may write the right-hand side of (83) as :
E | det(Z xId)| 1I

sup v T Zv > x

vS d1

which is obviously bounded below by


E | det(Z xId)| 1IZ11 >x

=
x

E | det(Z xId)|/Z11 = y (2)1/2 1 exp


24

y2
dy, (84)
2 2

where 2 := Var(Z11 ) = 12 1. The conditional distribution of Z given Z11 = y is easily


deduced from Lemma 2. It can be represented by the random d d real symmetric matrix

y
Z12
... ...
Z1d

2 + y Z23 . . .
Z2d

Z :=
,
..

.
d + y

where the random variables {2 , . . . , d , Zik , 1 i < k d} are independent centered Gaussian
with
Var(Zik ) = 4 (1 i < k d) ; Var(i ) =

4 1
16 (8 1)
(i = 2, . . . , d) ; =

12 1
12 1

Observe that 0 < < 1.


Choose now 0 such that (1+0 ) < 1. The expansion of det(Z xId) shows that if x(1+0 )
y x(1 + 0 ) + 1 and x is large enough, then
E | det(Z xId)| L 0 (1 (1 + 0 ))d1 xd ,

where L is some positive constant. This implies that

1
2

exp(
x

L
y2
)E | det(ZxId)| dy
2
2
2

x(1+0 )+1

exp(
x(1+0 )

y2
)0 (1(1+0 ))d1 xd dy
2 2

for x large enough. On account of (82),(83),(84), we conclude that for x large enough,
p(x) pM (x) L1 xd exp

x2 (x(1 + 0 ) + 1)2
+
.
2
2 2

for some new positive constant L1 . Since 0 can be chosen arbitrarily small, this implies (81).
3) Consider the same processes of Example 2, but now defined on the non-convex set {a
t b}, 0 < a < b. The same calculations as above show that t = 0 if a < t b and
t = max

2a (2a2 (1 cos ))(1 cos)


2 (z 2 )z
,
sup
,
2
1 (2a2 (1 cos ))
z[2a,a+b] 1 (z ) [0,]
sup

for t = a.
4) Let us keep the same hypotheses as in Example 2 but without assuming that the covariance is decreasing as in (77). The variance is still given by (76) but t is not necessarily equal
to zero. More precisely, relation (78) shows that
t sup 2
sS\{t}

( s t 2 )+ s t
1 ( s t 2 )

The normalization: = 1/2 implies that the process X is identity speed, that is
Var(X (t)) = Id so that (t) = 1. An application of Theorem 6 gives
lim inf
x+

where

2
log p(x) pM (x) 1 + 1/Z .
x2

(85)
2

4 (z 2 )+ z
1 2 (z 2 ) 42 (z 2 )z 2
+
max
,
[1 (z 2 )]2
z(0,] [1 (z 2 )]2
z(0,]

Z := sup
and is the diameter of S.
5) Suppose that

25

the process X is stationary with covariance (t) := Cov(X(s), X(s + t)) that satisfies
(s1 , . . . , sd ) = i=1,...,d i (si ) where 1 , ..., d are d covariance functions on R which are
monotone, positive on [0, +) and of class C 4 ,
S is a rectangle

S=

[ai , bi ] , ai < bi .
i=1,...,d

Then, adding an appropriate non-degeneracy condition, conditions A2-A5 are fulfilled and Theorem 6 applies
It is easy to see that

1 (s1 t1 )2 (s2 t2 ) . . . d (sd td )

..
r0,1 (s, t) =

.
1 (s1 t1 ) . . . d1 (sd1 td1 ).d (sd td )

belongs to Ct,j for every s S. As a consequence t = 0 for all t S. On the other hand,
standard regressions formulae show that
2
2
2
2
2
1 21 . . . 2d 2
Var X(s)/X(t), X (t)
1 2 . . . d 1 . . . d1 d
=
,
(1 r(s, t))2
(1 1 . . . d )2

where i stands for i (si ti ). Computation and maximisation of t2 should be performed


numerically in each particular case.

References
[1] Adler, R.J. (1981). The Geometry of Random Fields. Wiley, New York.
[2] Adler, R.J. (1990). An Introduction to Continuity, Extrema and Related Topics for General
Gaussian Processes. IMS, Hayward, Ca.
[3] Adler, R.J. and Taylor J. E.(2005). Random fields and geometry. Book to appear.
[4] Azas J-M., Bardet J-M. and Wschebor M. (2002). On the Tails of the distribution of the
maximum of a smooth stationary Gaussian Process. Esaim: P. and S., 6,177-184.
[5] Azas, J-M. and Delmas, C. (2002). Asymptotic expansions for the distribution of the
maximum of a Gaussian random fields. Extremes (2002)5(2), 181-212.
[6] Azas, J-M and Wschebor, M. (2002). The Distribution of the Maximum of a Gaussian
Process: Rice Method Revisited, In and out of equilibrium: probability with a physical
flavour, Progress in Probability, 321-348, Birkha
user.
[7] Azas J-M. and Wschebor M (2001). On the regularity of the distribution of the Maximum
of one parameter Gaussian processes Probab. Theory Relat. Fields, 119, 70-98.
[8] Azas J-M. and Wschebor M (2005). On the Distribution of the Maximum of a Gaussian
Field with d Parameters. Annals Applied Probability, 15 (1A), 254-278.
[9] Azas J-M. and Wschebor, M. (2006). A self contained proof of the Rice formula for random
fields. Preprint available at http://www.lsp.ups-tlse.fr/Azais/publi/completeproof.pdf.
[10] Belyaev, Y. (1966). On the number of intersections of a level by a Gaussian Stochastic
process. Theory Prob. Appl., 11, 106-113.
26

[11] Berman, S.M. (1985a). An asymptotic formula for the distribution of the maximum of a
Gaussian process with stationary increments. J. Appl. Prob., 22,454-460.
[12] Berman, S.M. (1992). Sojourns and extremes of stochastic processes, The Wadworth and
Brooks, Probability Series.
[13] Borell, C. (1975). The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30,
207-216.
[14] Borell, C. (2003). The Ehrhard inequality. C.R. Acad. Sci. Paris, Ser. I, 337, 663-666.
[15] Cramer, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes, J.
Wiley & Sons, New-York.
[16] Cucker, F. and Wschebor M. (2003). On the Expected Condition Number of Linear Programming Problems, Numer. Math., 94, 419-478.
[17] Fernique, X.(1975). Regularite des trajectoires des fonctions aleatoires gaussiennes. Ecole
dEte de Probabilites de Saint Flour (1974). Lecture Notes in Mathematics, 480, SpringerVerlag, New-York.
[18] Fyodorov, Y. (2006). Complexity of Random Energy Landscapes, Glass Transition and
Absolute Value of Spectral Determinant of Random Matrices Physical Review Letters v. 92
(2004), 240601 (4pages); Erratum: ibid. v.93 (2004),149901(1page)
[19] Kendall, M.G., Stuart,A. and Ord, J.K. (1987). The Advanced Theory of Statistics, Vol. 3.
[20] Landau, H.J. and Shepp, L.A (1970). On the supremum of a Gaussian process. Sankya
Ser. A 32, 369-378.
[21] Ledoux, M. (2001). The Concentration of Measure Phenomenon. American Math. Soc.,
Providence, RI.
[22] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces, Springer-Verlag,
New-York.
[23] Marcus, M.B. (1977). Level Crossings of a Stochastic Process with Absolutely Continuous
Sample Paths, Ann. Probab., 5, 52-71.
[24] Marcus, M.B. and Shepp, L.A. (1972). Sample behaviour of Gaussian processes. Proc.
Sixth Berkeley Symp. Math. Statist. Prob., 2, 423-442.
[25] Mehta,M.L. (2004). Random matrices, 3d-ed. Academic Press.
[26] Mercadier, C. (2006). Numerical bounds for the distribution of the maximum of one- and
two-dimensional processes, to appear in Advances in Applied Probability, 38, (1).
[27] Piterbarg, V; I. (1981). Comparison of distribution functions of maxima of Gaussian
processes. Th, Proba. Appl., 26, 687-705.
[28] Piterbarg, V. I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and
Fields. American Mathematical Society. Providence. Rhode Island.
[29] Rychlik, I. (1990). New bounds for the first passage, wave-length and amplitude densities.
Stochastic Processes and their Applications, 34, 313-339.
[30] Sudakov, V.N. and Tsirelson, B.S. (1974). Extremal properties of half spaces for spherically
invariant measures (in Russian). Zap. Nauchn. Sem. LOMI, 45, 75-82.
27

[31] Sun, J. (1993). Tail Probabilities of the Maxima of Gaussian Random Fields, Ann. Probab.,
21, 34-71.
[32] Talagrand, M. (1996). Majorising measures: the general chaining. Ann. Probab., 24,
1049-1103.
[33] Taylor, J.E. and Adler, R. J. (2003). Euler characteristics for Gaussian fields on manifolds.
Ann. Probab., 31, 533-563.
[34] Taylor J.E., Takemura A. and Adler R.J. (2005). Validity of the expected Euler Characteristic heuristic. Ann. Probab., 33, 4, 1362-1396.
[35] Wschebor, M. (1985). Surfaces aleatoires. Mesure geometrique des ensembles de niveau.
Lecture Notes in Mathematics, 1147, Springer-Verlag.

28

On the tails of the distribution of the maximum of a


smooth stationary Gaussian process
Jean-Marc Azas 1 , Jean-Marc Bardet 1, Mario Wschebor
1

Laboratoire de Statistique et de Probabilites, UMR CNRS 5583,


Universite Paul Sabatier, 118 route de Narbonne,
31062 Toulouse Cedex 4, France.

Centro de Matematica, Facultad de Ciencias, Universidad de la Rep


ublica,
Calle Igua 4225, 11400 Montevideo, Uruguay.
E-mail : azais@cict.fr,

bardet@cict.fr,

wscheb@fcien.edu.uy

February 7, 2002

Abstract
We study the tail of the distribution of the maximum of a stationary Gaussian
process on a bounded interval of the real line. Under regularity conditions, we give
an additional term for this asymptotics.

Mathematics Subject Classification (1991): 60Gxx, 60E05, 60G15, 65U05.


Key words: Tail of Distribution of the Maximum, Stationary Gaussian processes
Short Title: Distribution of the Maximum.

This work was supported from ECOS program U97E02.


Corresponding author.

Introduction and main result

X = {X(t), t [0, T ]}, T > 0 is a real-valued centered stationary Gaussian process.


r(t) := IE X(s)X(s + t) denotes its covariance function, its spectral measure and
k (k = 0, 1, 2, ...) its spectral moments whenever they exist.
Throughout this paper we will assume that 8 < ) and for every pair of parameter
values s and t , 0 s = t T , the six-dimensional random vector
(X(s), X (s), X (s), X(t), X (t), X (t)) has a non degenerate distribution. With no loss
of generality we will also assume 0 = r(0) = 1.
We are interested in the distribution of the random variable
M := max X(t).
t[0,T ]

With the usual notations


x2
1
(x) = exp( ) and (x) =
2
2

(t)dt,

Piterbarg (1981, Theorem 2.2.) proved (under the weaker condition 4 < instead of
8 < ) that for each T > 0 and any u R:
1 (u) +

2
u2 (1 + )
T (u) P (M > u) B exp
2
2

(1)

for some positive constants B and . It is easy to see (see for example Miroshin, 1974)
that the expression inside the modulus is non-negative, so that in fact:
0 1 (u) +

2
u2 (1 + )
T (u) P (M > u) B exp
2
2

(2)

The problem of improving relation (2) does not seem to have been solved in a satisfactory manner until now. A crucial step has been done by Piterbarg in the same paper
(Theorem 3.1) in which he proved that if T is small enough, then as u +:

T
2
4
3 3(4 22 )9/2
P (M > u) = 1(u)+
u [1 + o(1)] .
T (u)

9/2
5
2
4 22
22 (2 6 24 ) u
(3)
The same result has been obtained by other methods (Azas and Bardet, 2000; see also
Azas et al., 1999).
However Piterbarg equivalent (3) is of limited interest for applications since it contains
no information on the meaning of the expression T small enough.
The aim of this paper is to show that formula (3) is in fact valid for any length T
under appropriate conditions that will be described below.
Consider the function F (t) defined by
2

2 1 r(t)
F (t) :=
2 1 r 2 (t) r 2 (t)

Lemma 1 The even function F is well defined, has a continuous extension at zero and
F (0) =

22
;
4 22

F (0) = 0;
0 < F (0) =

2 (2 6 24 )
< .
9(4 22 )

Proof:
The denominator of F (t) is equal to 1 r 2 (t) Var X (0)|X(0), X(t) thus non zero
due to the non degeneracy hypothesis.
A direct Taylor expansion gives the value of F (0).
The expression of F (t) below shows that F (0) = 0 and gives the value of F (0)

F (t) =

22 1 r(t) r (t) r 2 (t) 2 r (t) 1 r(t)


2 1 r 2 (t) r 2 (t)

(4)

Note that 4 22 can vanish only if there exists some real such that ({ }) =
({}) = 1/2. Similarly, 2 6 24 can vanish only if there exists some real and
p 0 such that ({ }) = ({}) = p, ({0}) = 1 2p. These cases are excluded
by the non degeneracy hypothesis.

We will say that the function F satisfies hypothesis (H) if it has a unique minimum at
t = 0. The next proposition contains some sufficient conditions for this to take place.
Proposition 1 (a) If r (t) < 0 for 0 < t T then (H) is satisfied.
(b) Suppose that X is defined on the whole line and that
4 > 222 ;
r(t), r (t) 0 as t ;
there exists no local maximum of r(t) (other than at t = 0) with value greater or
4 222
equal to
.
4
Then (H) is satisfied for every T > 0.

An example of a process satisfying condition (b) but not condition (a) is given by the
covariance
1 + cos(t) t2 /2
e
r(t) :=
2
if we choose sufficiently small. In fact, a direct computation gives 2 = 1 + 2 /2 ;
4 = 3 + 3 2 + 4 /2 so that
1 + 2
4 222
=
.
4
3 + 3 2 + 4 /2
2
,
,

On [0, ), the covariance attains its second largest local maximum in the interval
so that its value is smaller than exp

2
. Hence, choosing is sufficiently small the
2 2

last condition in (b) is satisfied.

The main result of this article is the following :


Theorem 1 If the process X satisfies hypothesis (H), then (3) holds true.

Proofs

Notations :
p (x) is the density (when it exists) of the random variable at the point x IRn .
1IC denotes the indicator function of the event C.
Uu ([a, b]), u IR is the number of upscrossings on the interval [a, b] of the level u
by the process X defined as follows:
Uu ([a, b]) = #{t [a, b], X(t) = u, X (t) > 0}.
For k a positive integer, k (u, [a, b]) is the kth order factorial moment of Uu ([a, b])
k (u, [a, b]) = IE

Uu ([a, b]) Uu ([a, b]) 1 .... Uu ([a, b]) k + 1

We define also
k (u, [a, b]) = IE

Uu ([a, b]) Uu ([a, b]) 1 .... Uu ([a, b]) k + 1 1I{X(0)u} .

a+ = a 0 denotes the positive part of the real number a.


(const) denotes a positive constant whose value may vary from one occurrence to
another.

We will repeatedly use the following Lemma.


Lemma 2 Let f (respectively g) be a real-valued function of class C 2 (respectively C k for
some integer k 1) defined on the interval [0, T ] of the real line verifying the conditions:
1. f has a unique minimum on [0, T ] at the point t = t , and f (t ) = 0, f (t ) > 0.
2. Let k = inf j : g (j)(t ) = 0 .
Define

h(u) =
0

1
g(t) exp u2 f (t)
2

dt.

Then, as u :
h(u)

g (k) (t )
k!

1
xk exp f (t )x2
4
J

dx

1 2
exp

u f (t )
uk+1
2

(5)

where J = [0, +) , J = (, 0] or J = (, +) according as t = 0, t = T or 0 <


t < T respectively.
Proof: Assume 0 < t < T . The other two cases are similar.
Choose > 0 such that for all t [t , t + ] [0, T ] and |f (t)| C if |t t | <
for some positive constant C. Put m =
inf f (t) > f (t ).
[0,T ]{|tt |}

Write:
t+

h(u) =
t

1
g(t) exp u2 f (t)
2

dt +
[0,T ]{|tt |}

1
g(t) exp u2 f (t)
2

dt

The second term is bounded by


1
(const) T exp u2 m
2
As for the first, write the Taylor expansions of f (up to the second order) and g (up to
the kth order) around the point t = t , make the change of variables t = t + ux and pass
to the limit as u + on applying dominated convergence. The result follows.
We will use the following well-known expansion as (Abramovitz and Stegun 1972 p. 932).
For each a0 > 0 as u +

1
exp ay 2 dy =
2

for all a a0 where O

1
u7

1
3
1
2 3 + 3 5 +O
au a u
au

1
u7

1
exp au2 .
2

should be interpreted as bounded by

depending only on a0 .

(6)

K
, K a constant
u7

Proof of Theorem 1 :
Step 1 : The proof is based on an extension of Piterbargs result to intervals of any
length. Let > 0, the following relation is clear
P (M[0, ] > u) = P (X(0) > u) + P (Uu ([0, ]).1I{X(0)u} 1)
= 1 (u) + P (Uu ([0, ]) 1) P (Uu ([0, ]).1I{X(0)>u} 1).
In the sequel a term will be called negligible if it is O u6 exp
u +. We use the following relations to be proved later:

1 4 u 2
2 4 22

as

(i) P (Uu ([0, T ]).1I{X(0)>u} 1) is negligible.


(ii) Let 2 T . Then P ({Uu ([0, ])Uu ([, 2 ]) 1})is negligible.
With these relations, for 2 T , we have
P (M[0,2 ] > u) 1 (u) = P (Uu ([0, 2 ]) 1) + N1
= P (Uu ([0, ]) 1) + P (Uu ([, 2 ]) 1) + N2 = 2P (Uu ([0, ]) 1) + N3

= 2P (M[0, ] > u) 2 1 (u) + N4 , (7)

N1 N4 being negligible. Applying (7) repeatedly and on account of Piterbargs theorem


that states that (3) is valid if T is small enough, one gets the result.
Step 2 : Proof of (i). Using Markovs inequality:
P (Uu ([0, T ]).1I{X(0)>u} 1) 1 (u, [0, T ])
In Azas et al. 1999, 1 is evaluated using the Rice formula (Cramer and Leadbetter,
1967)
+

1 (u, [0, T ]) =

IE X + (t)|X(0) = x, X(t) = u pX(0),X(t) (x, u)dt

dx
u

(8)

Also if Z is a real-valued random variable with a Normal-(m, 2 ) distribution,


IE(Z + ) =

m
m
+ m
,

and plugging into (8) one obtains:

1
(u) T
2
dt
2 F
e 2 F y dy
2 0
u

2
r 2 F y 2
r F
(1 r)u2
exp


exp

2(1 + r)
22 (1 r 2 )
2 (1 r 2 )
u

1 (u, [0, T ]) =

B(t, u)dt,
0

where r, r and F stand for r(t), r (t) and F (t) respectively. Clearly, since r (0) = 2 <
0, there exists T0 such that r < 0 on (0, T0 ]. Divide the integral into two parts: [0, T0 ]
and [T0 , T ]. Using formula (6) on [0, T0 ] we get
(u)
2

B(t, u) =

2 F 5/2

2 (1 r)2 3
u + O u5 (u) ,
r 2

and since, as t 0, (1 r)2 r 2 = O(t2 ), Lemma 2 shows that


T0

B(t, u)dt = O u6 exp

4 u 2
2(4 22 )

On the other hand, since inf t[T0 ,T ] F (t) is strictly larger than F (0), it follows easily from

exp

ay 2
2

1
au2
dy (const) exp
2
a

that

a > 0 , u 0,

B(t, u)dt
T0

is negligible.
Step 3 : Proof of (ii). Use once more Markovs inequality:
P (Uu ([0, ])Uu ([, 2 ]) 1 ) IE (Uu ([0, ])Uu ([, 2 ])) .
Because of Rice formula (Cramer and Leadbetter, 1967):

IE (Uu ([0, ])Uu ([, 2 ])) =

At2 t1 (u)dt2 dt1 =


0

(t (2 t))At (u)dt,

(9)

with
At (u) = E X + (0)X + (t)|X(0) = X(t) = u pX(0),X(t) (u, u).
It is proved in Azas et al. (1999) that
At (u) =

1
2
1 r2

u
1+r

[T1 (t, u) + T2 (t, u) + T3 (t, u)] ,

with

T1 (t, u) = 1 + (b)(kb),
+
2

T2 (t, u) = 2( )

(kx)(x)dx
b

T3 (t, u) = 2(kb)(b)
= (t, u) = IE X (0)|X(0) = X(t) = u) =

r
u
1+r

(10)

2 = 2 (t) = Var X (0)|X(0), X(t)) = 2


= (t) = Cor X (0), X (t)|X(0), X(t)) =

r (1 r 2 ) rr 2
2 (1 r 2 ) r 2

1+
; b = b(t, u) = /,
1

k = k(t) =
z

(z) =

r 2
1 r2

(v)dv,
0

r, r , r again stand for r(t), r (t), r (t).


As in Step 2, we divide the integral (9) into two parts : [0, T0 ] and [T0 , 2 ]. For t < T0 ,
b(t, u) and k(t) are positive, thus using expansion (6), we get the formula p. 119 of Azas
et al. (1999) :
2 2
4 2
(kb)(1 + )(b) 3 (kb)(b) (1 + ) 2 k(kb)(b)
b
b
2
k
k3
1
4
2
2 3
.
+ 2 k (kb)(b) + 2 k(kb)(b) + O (kb)(b) 7 + 6 + 4
b
b
b
b

T1 (u) + T2 (u) + T3 (u) =

Since T1 (u) + T2 (u) + T3 (u) is non negative, majorizing (kb) and (kb) by 1 we get
2
1 r 2 (t)

At (u) (const)

(1 + )
k
1
k
k3
+ k3 + 2 + 7 + 6 + 4
b
b
b
b
b

exp

1
1 + F (t) u2 .
2

Now it is easy to see that, as t 0


2 (const)t2 , (1 + ) (const)t2 ,

1 r 2 (t) (const)t , b (const)u,

so that
2 (1 + )
t3
(const) ;
u
b 1 r 2 (t)
2 3
k
(const)t4 ;
2
1 r (t)
2 k
(const)t2 u2 ;
2
2
b 1 r (t)
and also that the other terms are negligible. Then, applying Lemma 2:
T0
0

(t (2 t))At (u)dt (const)u6 exp

thus negligible.

4 u 2
2(4 22 )

For t T0 remark that T1 (u)+T2(u)+T3(u) does not change when (and consequently
b) change of sign. Thus and b can supposed to be non negative. Forgetting negative
terms in formula (10) and majorizing by 1; 1 (b) by (const)(b) and by (const)u,
we get:
At (u) (const)2

u
1+r

(b)(1 + u) = (const)(1 + u) exp

1
1 + F (t) u2 .
2

We conclude as in Step 2.

Proof of Proposition 1 : Let us prove statement (a). The expression (4) of F


shows that it is positive for 0 < t T , since r (t) < 0 and
r 2 (t) (2 r (t))(1 r(t)) =
1
= Var X(t) X(0) Var X (t) + X (0) Cov2 X(t) X(0), X (t) + X (0)
4

< 0.
(11)

Thus the minimum is attained at zero.


22
< 1 = F (+). If F has a local minimum at t = t , (4)
(b) Note that F (0) =
2
4 2
shows that r has a local maximum at t = t so that
F (t ) =

22
1 r(t )
>
1 + r(t )
4 22

due to the last condition in (b). This proves (b).

Remark : The proofs above show that even if hypothesis (H) is not satisfied, it is
still possible to improve inequality (2). In fact it remains true for every such that
< min F (t).
t[0,T ]

Acknowledgment. The authors thank Professors P. Carmona and C. Delmas for useful
talks on the subject of this paper.

References

Abramowitz, M. and Stegun, I. A. (1972). Handbook of Mathematical functions with


Formulas, graphs and mathematical Tables. Dover, New-York.
Azas-Bardet (2000) Unpublished manuscript.

Azas, J-M., Cierco-Ayrolles, C. and Croquette, A. (1999). Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process.
ESAIM Probab. Statist., 3, 107-129.
Cramer, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes.
J. Wiley & Sons, New-York.
Miroshin, R.N. (1974). Rice series in the theory of random functions. Vestnik Leningrad
Univ. Math., 1, 143-155.
Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian
processes. Th. Prob. Appl., 26, 687-705.

10

ESAIM: Probability and Statistics

September 1999, Vol. 3, p. 107129

URL: http://www.emath.fr/ps/

BOUNDS AND ASYMPTOTIC EXPANSIONS FOR THE DISTRIBUTION


OF THE MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

Jean-Marc Azas 1 , Christine Cierco-Ayrolles 1, 2 and Alain Croquette 1


Abstract. This paper uses the Rice method [18] to give bounds to the distribution of the maximum
of a smooth stationary Gaussian process. We give simpler expressions of the first two terms of the Rice
series [3,13] for the distribution of the maximum. Our main contribution is a simpler form of the second
factorial moment of the number of upcrossings which is in some sense a generalization of Steinberg
et al.s formula ([7] p. 212). Then, we present a numerical application and asymptotic expansions that
give a new interpretation of a result by Piterbarg [15].

R
esum
e. Dans cet article nous utilisons la methode de Rice (Rice, 1944-1945) pour trouver un encadrement de la fonction de repartition du maximum dun processus Gaussien stationnaire regulier.
Nous derivons des expressions simplifiees des deux premiers termes de la serie de Rice (Miroshin, 1974,
Azas et Wschebor, 1997) suffisants pour lencadrement cherche. Notre contribution principale est la
donnee dune forme plus simple du second moment factoriel du nombre de franchissements vers le
haut, ce qui est, en quelque sorte, une generalisation de la formule de Steinberg et al. (Cramer and
Leadbetter, 1967, p. 212). Nous presentons ensuite une application numerique et des developpements
asymptotiques qui fournissent une nouvelle interpretation dun resultat de Piterbarg (1981).

AMS Subject Classification. 60Exx, 60Gxx, 60G10, 60G15, 60G70, 62E17, 65U05.
Received June 4, 1998. Revised June 8, 1999.

1. Introduction
1.1. Framework
Many statistical models involve nuisance parameters. This is the case for example for mixture models [10],
gene detection models [5,6], projection pursuit [20]. In such models, the distributions of test statistics are those
of the maximum of stochastic Gaussian processes (or their squares). Dacunha-Castelle and Gassiat [8] give for
example a theory for the so-called locally conic models.
Thus, the calculation of threshold or power of such tests leads to the calculation of the distribution of the
maximum of Gaussian processes. This problem is largely unsolved [2].
Keywords and phrases: Asymptotic expansions, extreme values, stationary Gaussian process, Rice series, upcrossings.

This paper is dedicated to Mario Wschebor in the occasion of his 60th birthday.

Laboratoire de Statistique et Probabilit


es, UMR C55830 du CNRS, Universite Paul Sabatier, 118 route de Narbonne, 31062
Toulouse Cedex 4, France.
2 Institut National de la Recherche Agronomique, Unit
e de Biom
etrie et Intelligence Artificielle, BP. 27, Chemin de Borde-Rouge,
31326 Castanet-Tolosan Cedex, France; e-mail: azais@cict.fr, cierco@toulouse.inra.fr, croquett@cict.fr
c EDP Sciences, SMAI 1999

108

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

Miroshin [13] expressed the distribution function of this maximum as a sum of a series, so-called the Rice
series. Recently, Azas and Wschebor [3, 4] proved the convergence of this series under certain conditions and
proposed a method giving the exact distribution of the maximum for a class of processes including smooth
stationary Gausian processes with real parameter.
The formula given by the Rice series is rather complicated, involving multiple integrals with complex expressions. Fortunatly, for some processes, the convergence is very fast, so the present paper studies the bounds
given by the first two terms that are in some cases sufficient for application.
We give identities that yield simpler expressions of these terms in the case of stationary processes. Generalization to other processes is possible using our techniques but will not be detailed for shortness and simplicity.
For other processes, the calculation of more than two terms of the Rice series is necessary. In such a case,
the identities contained in this paper (and other similar) give a list of numerical tricks used by a program under
construction by Croquette.
We then use Maple to derive asymptotic expansions of some terms involved in these bounds. Our bounds
are shown to be sharp and our expansions are made for a fixed time interval and a level tending to infinity.
Other approaches can be found in the literature [12]. For example, Kratz and Rootzen [11] propose asymptotic
expansions for a size of time interval and a level tending jointly to infinity.
We consider a real valued centred stationary Gaussian process with continuous paths X = {Xt ; t [0, T ] R}.
We are interested in the random variables
X = sup Xt or X

= sup |Xt | .

t[0,T ]

t[0,T ]

For shortness and simplicity, we will focus attention on the variable X ; the necessary modifications for adapting
our method to X are easy to establish [5].
We denote by dF () the spectral measure of the process X and p the spectral moment of order p when it
exists. The spectral measure is supposed to have a finite second moment and a continuous component. This
implies ([7] p. 203) that the process is differentiable in quadratic mean and that for all pairwise different time
points t1 , . . . , tn in [0, T ], the joint distribution of Xt1 , . . . , Xtn , Xt1 , . . . , Xtn is non degenerated.
For simplicity, we will assume that moreover the process admits C 1 sample paths. We will denote by r(.) the
covariance function of X and, without loss of generality, we will suppose that 0 = r(0) = 1.
Let u be a real number, the number of upcrossings of the level u by X, denoted by Uu is defined as follows:
Uu = # {t [0, T ], Xt = u, Xt > 0}
For k N , we denote by k (u, T ) the factorial moment of order k of Uu and by k (u, T ) the factorial moment of
order k of Uu 11{X0 u} . We also define k (u, T ) = k (u, T ) k (u, T ). These factorial moments can be calculated
by Rice formulae. For example:

T 2 u2 /2
1 (u, T ) = E (Uu ) =
e
2
T

and 2 (u, T ) = E (Uu (Uu 1)) =

Ast (u) ds dt
0

with Ast (u) = E (Xs )+ (Xt )+ |Xs = Xt = u ps,t (u, u), where (X )+ is the positive part of X and ps,t the
joint density of (Xs , Xt ).
These two formulae are proved to hold under our hypotheses ( [7], p. 204). See also Wschebor [21],
Chapter 3, for the case of more general processes.
We will denote by the density of the standard Gaussian distribution. In order to have simpler expressions
x

of rather complicated formulae, we will use the folllowing three functions: (x) =
x

and (x) =
0

1
(y)dy = (x) .
2

(y)dy, (x) = 1 (x)

109

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

1.2. Main inequalities


Since the pioneering works of Rice [18], the most commonly used upper bound for the distribution of the
maximum is the following:
P (X > u) P (X0 > u) + P (Uu > 0) P (X0 > u) + E (Uu ) .
2
(u).
2
One can also see the works by [9, 15, 16].
We propose here a slight refinement of this inequality, but also a lower bound using the second factorial
moment of the number of upcrossings. Our results are based on the following remark which is easy to check: if
is a non-negative integer valued random variable, then

That is: P (X > u) (u) + T

1
E () E (( 1)) P ( > 0) E () .
2
Noting that P almost surely, {X > u} = {X0 > u} {X0 u, Uu > 0} and that E Uu (Uu 1)11{X0 u} 2 ,
we get:
P (X0 > u) + 1 (u, T )

2 (u, T )
P X u
2

P (X0 > u) + 1 (u, T ),

(1.1)

with 1 (u, T ) = E Uu 11{X0 u} .


Using the same technique as for calculating E (Uu ) and E (Uu (Uu 1)), one gets
T

1 (u, T ) =

dt
0

dx

y p0,t;t (x, u; y)dy,


0

where p0,t;t stands for the density of the vector (X0 , Xt , Xt ).


Azas and Wschebor [3, 4] have proved, under certain conditions, the convergence of the Rice series [13]
+

P X u = P (X0 > u) +

(1)m+1
m=1

m (u, T )
m!

(1.2)

and the envelopping property of this series:


n
m (u, T )
(1)m+1
if we set Sn = P (X0 > u) +
, then, for all n > 0:
m!
m=1
S2n P X u S2n1 .

(1.3)

Using relation (1.3) with n = 1 gives


P (X0 > u) + 1 (u, T )

2 (u, T )
P X u P (X0 > u) + 1 (u, T ).
2

Since 2 (u, T ) 2 (u, T ), we see that, except this last modification which gives a simpler expression, Main
inequality (1.1) is relation (1.3) with n = 1.

110

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

Remark 1.1. In order to calculate these bounds, we are interested in the quantity 1 (u, T ). For asymptotic
calculations and to compare our results with Piterbargs ones, we will also consider the quantity k (u, T ). From
a numerical point of view, k (u, T ) and k (u, T ) are worth being distinguished because they are not of same
order of magnitude as u +. In the following sections, we will work with 1 (u, T ).

2. Some identities
First, let us introduce some notations that will be used in the rest of the paper. We set:
r (t)
u,
(t) = E (X0 |X0 = Xt = u) =
1 + r(t)
r 2 (t)
2 (t) = V ar (X0 |X0 = Xt = u) = 2
,
1 r2 (t)
r (t) 1 r2 (t) r(t)r 2 (t)
(t) = Cor (X0 , Xt |X0 = Xt = u) =
.
2 (1 r2 (t)) r 2 (t)
1 + (t)

and b(t) = (t).


1 (t)

Note that, since the spectrum of the process X admits a continuous component, |(t)| = 1.
In the sequel, the variable t will be omitted when it is not confusing and we will write r, r , , , , k, b instead
of r(t), r (t), (t), (t), (t), k(t), b(t).

We also define k(t) =

Proposition 2.1. (i) If (X, Y ) has a centred normal bivariate distribution with covariance matrix

1
1

then a R+
a

1
P (X > a, Y > a) = arctan

1+
(x)
2
1
0
1+
x (x) dx
1

=2

(ii) 1 (u, T ) = (u)


0
T

2 (T t)

(iii) 2 (u, T ) =
0

1 r 2
u
1+r

1
2
1 r2 (t)

1+
x
1

1r
r
u (b)
1+r
1 r2

u
1 + r(t)

dx

dt

[T1 (t) + T2 (t) + T3 (t)] dt

with:
T1 (t) = 2 (t)

1 2 (t) (b(t)) (k(t) b(t)) ,

(2.1)

T2 (t) = 2 (2 (t)(t) 2 (t))

(k(t) x) (x) dx,

(2.2)

b(t)

T3 (t) = 2 (t) (t) (k(t) b(t)) (b(t)) .

(2.3)

(iv) A second expression for T2 (t) is:


T2 (t) = ( (t)(t) (t))
2

1
arctan (k(t)) 2

b(t)

(k(t) x) (x) dx .
0

(2.4)

111

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

Remark 2.2.
p. 27:

1. Formula (i) is analogous to the formula (2.10.4) given in Cramer and Leadbetters [7],

P (X > a, Y > a) = (a)(a) +


0

1
a2

exp
1z
2 1 z 2

dz.

Our formula is easier to prove and is more adapted to numerical application because, when t 0,
(t) 1 and the integrand in Cramer and Leadbetters formula tends to infinity.
2. Utility of these formulae:
these formulae permit a computation of Main inequality (1.1), at the cost of a double integral with
finite bounds. This is a notable reduction of complexity with respect to the original form. The form
(2.4) is more adapted to effective computation, because it involves an integral on a bounded interval;
this method has been implemented in a S+ program that needs about one second of Cpu to run an
example. It has been applied to a genetical problem in Cierco and Azas [6].
The form (iii) has some consequences both for numerical and theoretical purposes. The calculation of 2 (u, T )
yields some numerical difficulties around t = 0. The sum of the three terms is infinitly small with respect to
each term. To discard the diagonal from the computation, we use formula (iii) and Maple to calculate the
equivalent of the integrand in the neighbourhood of t = 0 at fixed u.
T

Recall that we have set 2 (u, T ) =

Ast (u) ds dt. The following proposition gives the Taylor expansion
0

of A at zero.

Proposition 2.3. Assume that 8 is finite. Then, as t 0:


3/2

At (u) =

1
(2 6 4 )
1 4
exp
u2
1296 (4 2 )1/2 2 2
2 4 22
2
2

t4 + O(t5 ).

Piterbarg [17] or Wschebor [21] proved that At (u) = O ( (u(1 + ))) for some 0. Our result is more precise.
Our formulae give some asymptotic expansions as u + for 1 (u, T ) and 2 (u, T ) for small T .
Proposition 2.4. Assume that 8 is finite. Then, there exists a value T0 such that, for every T < T0
11/2

4 22
27
1 (u, T ) =

4 5 (2 6 2 )3/2
2
4

4
u
4 22

u6

1+O

1
u

9/2
4 22
3 3T
2 (u, T ) =

9/2 (2 6 2 )
2
4

4
u
4 22

u5

1+O

1
u

as u +.

3. A numerical example
In the following example, we show how the upper and lower bounds (1.1) permit to evaluate the distribution
of X with an error less than 104 .
We consider the centered stationary Gaussian process with covariance (t) := exp(t2 /2) on the interval
I = [0, 1], and the levels u = 3, 2.5, . . . , 3. The term P (X0 u) is evaluated by the S -plus function P norm,
1 and 2 using Proposition 2.1 and the Simpson method. Though it is rather difficult to assess the exact
precision of these evaluations, it is clear that it is considerably smaller than 104 . So, the main source of error

112

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

is due to the difference between the upper and lower bounds in (1.1).
u
P (X0 u)
3
0.00135
2.5
0.00621
2
0.02275
1.5
0.06681
1
0.15866
0.5
0.30854
0
0.50000
0.5
0.69146
1
0.84134
1.5
0.93319
2
0.97725
2.5
0.99379
3
0.99865

1
0.00121
0.00518
0.01719
0.04396
0.08652
0.13101
0.15272
0.13731
0.09544
0.05140
0.02149
0.00699
0.00177

2
lower bound upper bound
0
0.00014
0.00014
0
0.00103
0.00103
0
0.00556
0.00556
0.00001
0.02285
0.02285
0.00002
0.07213
0.07214
0.00004
0.17753
0.17755
0.00005
0.34728
0.34731
0.00004
0.55415
0.55417
0.00002
0.74591
0.74592
0.00001
0.88179
0.88180
0
0.95576
0.95576
0
0.98680
0.98680
0
0.99688
0.99688

The calculation demands 14 s on a Pentium 100 MHz.


The corresponding program is available sending an e-mail to croquett@cict.fr.

4. Proofs
Proof of Proposition 2.1
Proof of point (i). We first search P (X > a, Y > a).
Put = cos(), [0, [, and use the orthogonal decomposition Y = X +
a X
Then {Y > a} = Z >
. Thus:
1 2
+

P (X > a, Y > a) =

a x

(x)

(x)(z) dx dz,

dx =

1 2

1 2 Z.

1
where D is the domain located between the two half straight lines starting from the point a, a
1+

with angle and .


2
2

Using a symmetry with respect to the straight line with angle passing through the origin, we get:
2
+

P (X > a, Y > a) = 2

(x)
a

1
x
1+

dx.

(4.1)

Now,
P (X > a, Y > a) = (a) P (X > a, Y < a) = (a) P (X > a, (Y ) > a) .
Applying relation (4.1) to (X, Y ) yields
+

P (X > a, Y > a) = (a) 2

(x)
a

1+
x
1

dx = 2

and

1+
x
1

(x) dx.

113

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

Now, using polar coordinates, it is easy to establish that


+

(k x) (x) dx =
0

1
arctan(k)
2

which yields the first expression.


Proof of point (ii). Conditionally to (X0 = x, Xt = u), Xt is Gaussian with:
r (t)(x r(t)u)
mean m(t) =
,
1 r2 (t)
2
variance (t) already defined.
It is easy to check that, if Z is a Gaussian random variable with mean m and variance 2 , then
m
m
+ m

E Z + =

These two remarks yield 1 (u, T ) = I1 + I2 , with:


T
+
r (x r u)
I1 =
dt

p0,t (x, u) dx
(1 r2 )
0
u
T
+
r (x r u)
r (x r u)
I2 =
dt

p0,t (x, u) dx.


2)
(1

r
(1 r2 )
0
u
T

I1 can be written under the following form: I1 = (u)


0

parts leads to

I2 = (u)
0

Finally, noticing that 2 +

1 (u, T ) =

22

2
(u)
2

r
1r

u (b)

2
1r
1+r

2 1 r
r2

u
+

1+r
22 (1 r2 )

1r
u
1+r

dt. Integrating I2 by

dt.

r2
= 2 , we obtain:
1 r2
T

1r
u
1+r

dt + (u)
0

1 r2

1r
u
1+r

(b) dt.

Proof of point (iii). We set:


(x b)2 2(x b)(y + b) + (y + b)2
v(x, y) =
2(1 2 )
for (i, j) {(0, 0); (1, 0); (0, 1); (1, 1); (2, 0); (0, 2)}
+

Jij =
0

xi y j
2

1 2

exp (v(x, y)) dydx.

We first calculate the values of Jij . The following relation is clear


+

J10 J01 (1 + )bJ00

1 2
0

1 2 (k b) (b).

exp (v(x, y))


v(x, y)
dx
x
2 1 2

dy
(4.2)

114

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

Symmetrically, replacing x with y and b with b in (4.2) yields


J01 J10 + (1 + )bJ00 = 1 2 (k b) (b).

(4.3)

In the same way, multiplying the integrand by y, we get


J11 J02 (1 + ) b J01 = 1 2

3/2

(k b) k b (k b) (b).

(4.4)

[ (k b) + k b (k b)] (b).

(4.5)

And then, multiplying the integrand by x leads to


J11 J20 + (1 + ) b J10 = 1 2
+

Finally, J20 J11 (1 + ) b J10 = (1 2 )

x
0

parts

3/2

exp (v(x, y))

v(x, y)
dx dy. Then, integrating by
x
2 1 2

J20 J11 (1 + ) b J10 = (1 2 ) J00 .

(4.6)

Multiplying equation (4.6) by and adding (4.5) gives:


J11 = b J10 + J00 +

1 2 [ (k b) + k b (k b)] (b).

Multiplying equation (4.3) by and adding equation (4.2) yields:


J10 = b J00 + (k b) + (k b) (b).
+

And, by formula (i), J00 = 2

(k x) (x) dx. Finally, gathering the pieces, it comes:


b

J11 = J11 (b, ) =

1 2 2

b
1

(b) + 2 b2

(k x) (x) dx + 2 b (k b) (b).
b

The final result is obtained remarking that


+

E (X0 ) (Xt ) |X0 = Xt = u = 2 (t) J11 (b(t), (t)) .


Proof of point (iv). Expression (2.4) is obtained simply using the second expression of J00 .
Note 4.1. In the following proofs, some expansions are made as t 0, some as u + and some as
(t, u) (0, +).
We define the uniform Landau symbol OU as a(t, u) = OU (b(t, u)) if there exists T0 and u0 such that for
t < T0 < T and u > u0 ,
a(t, u) (const) b(t, u).
We also define the symbol

as a(t, u)

b(t, u)

a(t, u) = OU (b(t, u))

b(t, u) = OU (a(t, u))

Note 4.2. Many results of this section are based on tedious Taylor expansions. These expansions have been
made or checked by a computer algebra system (Maple). They are not detailed in the proofs.

115

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

1 + (t)
= O(t) is small,
1 (t)

Proof of Proposition 2.3. Use form (iii) and remark that, when t is small, k(t) =
1
and, since () =
2

3
6

+ O 5 as 0, we get:
b(t)

b(t)

k(t)
arctan(k(t))
k 3 (t)

x(x)dx +
x3 (x)dx + O(t5 )
2
2 0
6 2 0

1
k(t)
2 arctan(k(t)) 2 ((0) (b(t)))
+ O(t5 ).
= 2 2 (t)(t) 2 (t)

k 3 (t)
2
+
2(0) b (t) + 2 (b(t))
6 2
In the same way:
2(t)(t)
k 3 (t) 3
b (t) + O(t5 ).
T3 (t) =
(b(t)) k(t)b(t)
6
2
And then, assuming 8 finite, use Maple to get the result.
T2 (t) = 2 2 (t)(t) 2 (t)

Proof of Proposition 2.4. We first prove the following two lemmas.


Lemma 4.3. Let l be a real positive function of class C 2 satisfying l(t) = ct + O(t2 ) as t 0, c > 0. Suppose
that 8 is finite, with the above definitions of k(t) and b(t), we have as u +:
arctan

(p+1)

(i) Ip =

t (k(t) b(t)) (l(t) u) dt = (c u)


0

1 Mp+1

2 2

( dc )
p

(cos ) d 1 + O
0

1
u

22 6 2 24
p+1
and Mp+1 = E |Z|
where Z is a standard Gaussian random variable.
4 22
T
Mp
1
(ii) Jp =
tp (l(t) u) dt = (c u)(p+1)
1+O

2
u
0

with d =

1
6

Proof of Lemma 4.3. Since the derivative of l at zero is non zero, l is invertible in some neighbourghood of zero
1
1
and its inverse l1 satisfies l1 (t) = t + O(t2 ), l1 (t) = + O(t).
c
c
We first consider Ip and use the change of variable y = l(t)u, then
l(T )u

Ip =

y
u

l1

(kb) l1

y
u

(y) l1

y
u

dy

From the expressions of k(t) and b(t), we know that


(kb)(t) =
Thus (kb) l1

y
d
= y + u OU
u
c

1
6
y2
u2

and

l(T )u

(p+1)

yp

Ip = (c u)

We use the following lemma.

22 6 2 24
t u + u O(t3 ) = d u t + u O(t3 ).
4 22

d
y + u OU
c

y2
u2

(y) 1 + OU

y
u

dy.

116

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

Lemma 4.4. Let h be a real function such that h(t) = O t2 as t


0, then there exists T0 such that for
0 t T0
(u(t + h(t))) = (t u) [1 + OU (t)] .
Proof of Lemma 4.4. Taking T0 sufficiently small, we can assume that h(t)
A = | (u(t + h(t))) (t u)| u |h(t)|

tu
2

t
. Then
2

(const) u t2

tu
2

We want to prove that, in every case,


A (const) t (t u)

(4.7)

when tu 1, (t u) tu(1) and A (const) u t2 (0), thus (4.7) holds.


ut
when tu > 1, (t u) > (1) and A (const) t2 u
and (4.7) holds again.
2
End of proof of Lemma 4.3.
Due to Lemma 4.4,
l(T )u

(p+1)

yp

Ip = (c u)

0
l(T )u

yp

Put Kp (u) =
0

d
y
c

d
y
c

(y) 1 + OU

y
u

dy.

(4.8)

(y) dy. It is easy to see that, when u +,


+

yp

Kp (u) =
0

d
y
c

(y) dy + O un for every integer n > 0.


d

+
c y yp
y2 + z 2
d
y (y) dy =
exp
dz dy. Then, using polar coorc
2
2
0
0
0
d
1 Mp+1 arctan( c )
p
dinates, we derive that Kp () =
(cos ) d. So we can see that the contribution of the
2 2
0
y
term OU
in formula (4.8) is O u(p+2) which gives the desired result for Ip .
u

Moreover, Kp () =

yp

The same kind of proof gives the expression of Jp .


Proof of the equivalent of 1 (u, T ). We set
A1 (t) = (u)

2 (1 r)
u
2 (1 + r)

1r
u
1+r

r
(b)
1 r2

Then, 1 (u, T ) =

A1 (t) dt.
0

It is well known ([1], p. 932) that, as z tends to infinity,


(z) = (z)

1
1
3
3 + 5 + O(z 7 ) .
z
z
z

(4.9)

117

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

2 (1 r(t))
u for the first term and z = b(t)
2 (t)(1 + r(t))

We use this expansion for both terms of 1 (u, T ), with z =


for the second one.
Besides, remarking that

2 (1 r)
u
2 (1 + r)

1r
u
1+r

(b) ,

we get:

2 (1 + r) 1

2 (1 r) u

2 (1 r)
2 (1 + r)

u
+ OU

2
(1 + r)

2 (1 r)

1
r
1
+
3 + OU
2
b
1r b

(u)
A1 (t) =
2

3/2

2 (1 + r)
2 (1 r)
5/2

1
u5

1
b5

u3


From Taylor expansion made by Maple assuming 8 finite, we know that:


5/2

4
2
exp 2(u4
2)
1 4 2
2
t2 + O(t4 ).
A1 (t) =

7/2
8
u3 2
2

To use Lemma 4.3 point (ii) to calculate 1 (u, T ), it is necessary to have a Taylor expansion of the coefficient
22
2 (1 r)
2 (1 r(t))
of u in
u
.
We
have
lim
=
, therefore, we set:
t0 2 (t)(1 + r(t))
2 (1 + r)
4 22
2 (1 r)
22
.

2 (1 + r)
4 22

l(t) =

From Taylor expansion made by Maple assuming 8 finite, we get


1
l(t) =
6

2 (2 6 24 )
t + O(t2 ).
4 22

And, according to Lemma 4.3 point (ii),


T

1
t (l(t) u) dt =
2

1
6

Finally, remarking that (u)

2
4

22

2 (2 6 24 )
u
4 22

1
=
2
11/2

4 22
27
1 (u, T ) =

4 5 (2 6 2 )3/2
2
4

1+O

1
u

4
u , we get the equivalent for 1 (u, T ).
4 22
4
u
4 22

u6

1+O

1
u

118

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

Proof of the equivalent of 2 (u, T ). Remember that


T

2 (T t)

2 (u, T ) =
0

1
2
1 r2 (t)

u
1 + r(t)

[T1 (t) + T2 (t) + T3 (t)] dt.

(4.10)

We first calculate an expansion of term T2 = 2 2 ( b2 )

(x) (k x) dx.

b
The function x x2 1 (x) being bounded, we have
(kx) = (k b) + k (k b) (x b)

1 3
2
3
k b (k b) (x b) + OU k 3 (x b) ,
2

(4.11)

where the Landaus symbol has here the same meaning as in Lemma 4.3.
Moreover, using the expansion of given in formula (4.9), it is easy to check that as z +,
+

(z)
(z)
(z)
3 4 +O
2
z
z
z6
z
+
(z)
(z)
2
(x z) (x) dx = 2 3 + O
z
z5
z
+
(z)
3
(x z) (x) dx = O
.
z4
z
(x z) (x) dx =

Therefore, multiplying formula (4.11) by (x), integrating on [b; +[ and applying formula (4.9) once again
yield:

3
1 k2
3
1
1

+
+ k (k b) (b)
4

(k b) (b)

b b3 b5
b2
b

(k b) (b)
k
2
2
T2 = 2 b
+O
(k
b)
(b)

+
O

b7
b6

3
3

k
k

(k b) (b) + O
(b)

+O
b4
b4

Note that the penultimate term can be forgotten. Then, remarking that, as u +, b =
u, t and

k t, we obtain:
T2

2
2
= 2 2 b (k b) (b) + 2
(k b) (b) + 2
(k b) (b)
b2
b
2

2 3 (k b) (b) 6 3 (k b) (b) + 2 2 k 3 (k b) (b)


b
b
2 k
2 k
2 2 k (k b) (b) + 2
(k b) (b) + 6 2 (k b) (b)
2
b
b
+ OU t2 u5 (k b) (b) + OU t3 u4 (k b) (b) + OU t5 u2 (b)

Remark 4.5. As it will be seen later on, Lemma 4.3 shows that the contribution of the remainder to the
1
integral (4.10) can be neglected since the degrees in t and of each term are greater than 5. So, in the sequel,
u
we will denote the sum of these terms (and other terms that will appear later) by Remainder and we set:
T2 = U1 + U2 + U3 + U4 + U5 + U6 + U7 + U8 + U9 + Remainder.

119

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

Now, we have

U1 + T 3 = 0
1 2 2 k = (1 + ) k so that U7 + T1 = (1 + ) 2 k (k b) (b)
2
U2 + U3 = 2
(1 + ) (k b) (b)
b
2

U4 + U5 = 4 3 (k b) (b) 1 + O t2
b
2
U8 + U9 = 4 2 k (k b) (b) 1 + O t2
b
since = 1 + O t2 .

By the same remark as Remark 4.5 above, the term O t2 can be neglected. Consequently,

T1 + T2 + T3

= 2

2
2
(1 + ) (k b) (b) 4 3 (k b) (b)
b
b

(1 + ) 2 k (k b) (b) + 2 2 k 3 (k b) (b) + 4

2
k (k b) (b)
b2

+ Remainder.

Therefore, we are leaded to use Lemma 4.3 in order to calculate the following integrals:

(T t)

0
T

(T t)
0
T

(T t)
0
T

(T t)
0

u
2u
(kb) (b) dt = (T t) m1 (t) (kb) b2 +
dt
1+r
1
+r
0

2
2u
m2 (t) (k b) b2 +
dt
1+r

2
2u
dt
m3 (t) b2 (1 + k 2 ) +
1+r

2
2u
dt
m4 (t) b2 (1 + k 2 ) +
1+r

2
2u
dt
m5 (t) b2 (1 + k 2 ) +
1+r

(T t) m1 (t) exp
0

120

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

with:
m1 (t)

=
=

2
2
1
(t) (1 + (t))
1 r2 (t) b
4 22 3
1 2 6 24
t + O t5
5/2
36
u
2

m2 (t)
m3 (t)

=
=

m4 (t)

=
=

m5 (t)

=
=

5/2

4 22
2
1
(t)
=

t + O t3
7/2
1 r2 (t) b3
u3 2
1
1

(1 + (t)) 2 (t) k(t)
2 1 r2 (t)

3/2
2 2 6 24

t4 + O t6
864 22 4 22 3/2
2
1

2 (t) k 3 (t)
2 1 r2 (t)
3/2
2 4
1 2 6 24
t + O t6
864 22 4 22 3/2
4
1
( 1998).2

(t) k(t)
b2
2 1 r2 (t)
3/2
2 6 24 4 22
2 2
1
t + O t4 .
12
32 3/2 u2

4
=

Lemma 4.3 shows that we can neglect the terms issued from the t part of the factor T t in formula (4.10).

We now consider the argument of in Lemma 4.3. We have:


2
b2
4

lim 2 +
=
t0 u
1+r
4 22
b2
2
4

lim 2 1 + k 2 +
=

t0 u
1+r
4 22
Therefore, we set:
2 2 6 24

l1 (t)

b2 (t)
2
4
+

=
2
u
1 + r(t)
4 22

l2 (t)

b2 (t)
2
4
1 + k 2 (t) +

u2
1+r
4 22

2 2 6 24
2

12 (4 22 )

t + O t3

5/2

18 (4 22 )

t + O t3 .

Then, with the notations of Lemma 4.3, we obtain:

2 = T exp

4 u2
2 (4 22 )

4 22
4 22
1 2 6 24

3
5/2
7/2
36
2 u
u3 2

3/2

2 6 24 4 22
2
1
+
J2
12
32 3/2 u2

I1

1+O

1
u


where I_1 and I_3 (resp. J_2) are defined as in Lemma 4.3 point (i) (resp. (ii)) with l(t) = l_1(t) (resp. l(t) = l_2(t)).

Noting that \int_0^{\arctan(\sqrt2/2)} \cos^3\theta\, d\theta = \frac{8\sqrt3}{27} and that \int_0^{\arctan(\sqrt2/2)} \cos\theta\, d\theta = \frac{\sqrt3}{3}, we find

I_3 = \frac{144\sqrt3\,(\lambda_4-\lambda_2^2)^4}{\sqrt{2\pi}\,\lambda_2^2(\lambda_2\lambda_6-\lambda_4^2)^2}\, u^{-4}\Big(1+O\big(\tfrac1u\big)\Big),
\qquad
I_1 = \frac{3\sqrt3\,(\lambda_4-\lambda_2^2)^2}{\sqrt{2\pi}\,\lambda_2(\lambda_2\lambda_6-\lambda_4^2)}\, u^{-2}\Big(1+O\big(\tfrac1u\big)\Big),
\qquad
J_2 = \frac{12\sqrt3\,(\lambda_4-\lambda_2^2)^3}{\lambda_2^{3/2}(\lambda_2\lambda_6-\lambda_4^2)^{3/2}}\, u^{-3}\Big(1+O\big(\tfrac1u\big)\Big).
Finally, gathering the pieces, we obtain the desired expression of \nu_2(u, T).

5. Discussion
Using the general relation (1.3) with n = 1, we get
P X u P (X0 > u) 1 (u, T ) +

2 (u, T ) 3 (u, T )
2 (u, T )

2
2
6

A conjecture is that the orders of magnitude of 2 (u, T ) and 3 (u, T ) are considerably smaller than those of
1 (u, T ) and 2 (u, T ). Admitting this conjecture, Proposition 2.4 implies that for T small enough

9/2
4 22
T 2
3 3T
P X u = (u) +
(u)

2 9/2 (2 6 2 )
2
4
2

4
u
4 22

u5

1+O

1
u

which is Piterbarg's theorem with a better remainder ([15], Th. 3.1, p. 703). Piterbarg's theorem is, as far as we know, the most precise expansion of the distribution of the maximum of smooth Gaussian processes. Moreover, very tedious calculations would give extra terms of the Taylor expansion.
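For orientation, the leading behaviour of such expansions is easy to evaluate numerically. The following sketch (a minimal illustration, not part of the original derivation) computes the two classical leading terms, the tail P(X_0 > u) plus the expected number of u-upcrossings T sqrt(lambda_2) e^{-u^2/2}/(2 pi) given by Rice's formula, for an assumed value of lambda_2 and a small window T; according to the expansion above, the correction term is of the smaller order phi(u) u^{-5}.

```python
import numpy as np
from scipy.stats import norm

lambda2 = 1.0        # assumed second spectral moment (illustrative)
T = 0.5              # assumed small window, as required by the proposition
for u in (2.0, 3.0, 4.0):
    tail = norm.sf(u)                                              # P(X_0 > u)
    rice = T * np.sqrt(lambda2) / (2 * np.pi) * np.exp(-u**2 / 2)  # expected number of u-upcrossings
    print(f"u={u:.1f}  P(X_0>u)={tail:.3e}  Rice term={rice:.3e}  sum={tail + rice:.3e}")
```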

References
[1] M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables.
Dover, New York (1972).
[2] R.J. Adler, An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes, IMS, Hayward, Ca
(1990).
[3] J.-M. Azaïs and M. Wschebor, Une formule pour calculer la distribution du maximum d'un processus stochastique. C. R. Acad. Sci. Paris Sér. I Math. 324 (1997) 225-230.
[4] J.-M. Azaïs and M. Wschebor, The Distribution of the Maximum of a Stochastic Process and the Rice Method, submitted.
[5] C. Cierco, Problèmes statistiques liés à la détection et à la localisation d'un gène à effet quantitatif. Ph.D. dissertation, University of Toulouse, France (1996).
[6] C. Cierco and J.-M. Azaïs, Testing for Quantitative Gene Detection in Dense Map, submitted.
[7] H. Cramér and M.R. Leadbetter, Stationary and Related Stochastic Processes. J. Wiley & Sons, New York (1967).
[8] D. Dacunha-Castelle and E. Gassiat, Testing in locally conic models, and application to mixture models. ESAIM: Probab.
Statist. 1 (1997) 285-317.
[9] R.B. Davies, Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64 (1977) 247-254.
[10] J. Ghosh and P. Sen, On the asymptotic performance of the log-likelihood ratio statistic for the mixture model and related
results, in Proc. of the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer, Le Cam L.M. and Olshen R.A., Eds.
(1985).


[11] M.F. Kratz and H. Rootzén, On the rate of convergence for extremes of mean square differentiable stationary normal processes. J. Appl. Probab. 34 (1997) 908-923.
[12] M.R. Leadbetter, G. Lindgren and H. Rootzén, Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York (1983).
[13] R.N. Miroshin, Rice series in the theory of random functions. Vestnik Leningrad Univ. Math. 1 (1974) 143-155.
[14] M.B. Monagan, et al. Maple V Programming guide. Springer (1998).
[15] V.I. Piterbarg, Comparison of distribution functions of maxima of Gaussian processes. Theory Probab. Appl. 26 (1981) 687-705.
[16] V.I. Piterbarg, Large deviations of random processes close to gaussian ones. Theory Probab. Appl. 27 (1982) 504-524.
[17] V.I. Piterbarg, Asymptotic Methods in the Theory of Gaussian Processes and Fields. American Mathematical Society. Providence, Rhode Island (1996).
[18] S.O. Rice, Mathematical Analysis of Random Noise. Bell System Tech. J. 23 (1944) 282-332; 24 (1945) 45-156.
[19] SPLUS, Statistical Sciences, S-Plus Programmers Manual, Version 3.2, Seattle: StatSci, a division of MathSoft, Inc. (1993).
[20] J. Sun, Significance levels in exploratory projection pursuit. Biometrika 78 (1991) 759-769.
[21] M. Wschebor, Surfaces aléatoires. Mesure géométrique des ensembles de niveau. Lecture Notes in Mathematics 1147, Springer-Verlag, New York (1985).


TAYLOR EXPANSIONS BY MAPLE

GENERAL FORMULAE
>

phi:=t->exp(-t*t/2)/sqrt(2*pi);
2

e(1/2 t )

2
We introduce mu4 = lambda4 - lambda2^2 and mu6 = lambda2*lambda6 - lambda4^2 to make the outputs clearer.
>
assume(t>0);
>
assume(lambda2 > 0);
>
assume(mu4 > 0);
>
assume(mu6>0);
>
interface(showassumed=2);
>
Order:=12;
:= t

>

Order := 12
r:=t->1-lambda2*t^2/2!+lambda4*t^4/4!-lambda6*t^6/6!+lambda8*t^8/8!;
1
1
1
1
2 t2 +
4 t4
6 t6 +
8 t8
2
24
720
40320
siderels:= {lambda4=mu4+lambda2^2,lambda2*lambda6-lambda4^2=mu6}:
I_r2:=t->1-r(t)*r(t);
r := t 1

>
>

I r2 := t 1 r(t)2
>

simplify(simplify(series(I_r2(t),t=0,8),siderels));

>

1
1
1
1
1
2 t2 + ( 22
4) t4 + (
6 +
2 4 +
23 ) t6 + O(t8 )
3
12
360
24
24
with assumptions on t, 2 and 4
rp:=t->diff(r(t),t);
rp := t diff(r(t), t)

>

eval(rp(t));
1
1
1
4 t3
6 t5 +
8 t7
6
120
5040
with assumptions on 2 and t

2 t +
>

rs:=t->diff(r(t),t$2);
rs := t

>

2
r(t)
t2

eval(rs(t));
1
1
1
4 t2
6 t4 +
8 t6
2
24
720
with assumptions on 2 and t

2 +



>

mu:=t->-u*rp(t)/(1+r(t));
:= t

>

u rp(t)
1 + r(t)

sig2:=t->lambda2-rp(t)*rp(t)/I_r2(t);
sig2 := t 2

>

rp(t)2
I r2(t)

simplify(taylor(sig2(t),t=0,8),siderels);
1
1 6 22 4 3 42 2 6 4
4 t2 +
t + O(t6 )
4
144
2
with assumptions on t, 4, 2 and 6

>

sigma:=t->sqrt(sig2(t));
:= t

>

simplify(taylor(sigma(t),t=0,6),siderels);
1
2

>

sig2(t)

1 6 22 4 3 42 2 6 3

t + O(t5 )
144
4 2
with assumptions on t, 4, 2 and 6

4 t +

b:=t->mu(t)/sigma(t);
b := t

>

(t)
(t)

simplify(taylor(b(t),t=0,6),siderels);
u 2
1
1 u 6
+ ( u 4 +
) t2 + O(t4 )
8
36 4(3/2)
4
with assumptions on 2, 4, t and 6

>

sig2rho:=t->-rs(t)-r(t)*rp(t)*rp(t)/I_r2(t);
sig2rho := t rs(t)

>

r(t) rp(t)2
I r2(t)

simplify(taylor(sig2rho(t),t=0,8),siderels);
1
1 6 22 4 + 3 42 + 4 6 4
4 t2 +
t + O(t6 )
4
144
2
with assumptions on t, 4, 2 and 6

>

rho:=t->sig2rho(t)/sig2(t);
:= t

>

sig2rho(t)
sig2(t)

simplify(taylor(rho(t),t=0,8),siderels);
1 6 2
t + O(t4 )
18 2 4
with assumptions on t, 6, 2 and 4

1 +


PROOF OF PROPOSITION 2.3


>

k2:=t->(1+rho(t))/(1-rho(t));
k2 := t

>

1 + (t)
1 (t)

sk2:=simplify(taylor(k2(t),t=0),siderels);
1
1 6 2
t +
(3 26 4 + 9 24 42 + 9 22 43 2 6 22 4 3 8 22 4
36 2 4
2160
1
+ 3 44 + 13 6 42 + 5 62 ) (22 42 )t4 +
(147 28 42
907200
+ 175 6 26 4 273 26 43 + 63 24 44 + 196 6 24 42 + 120 8 24 42
+ 357 22 45 + 707 6 22 43 195 8 22 43 175 8 22 6 4 + 168 46

sk2 :=

+ 518 62 42 + 686 6 44 + 175 63) (23 43 )t6 + O(t8 )


with assumptions on t, 6, 2 and 4
>

k:=t->taylor(sqrt(sk2),t=0);

k := t taylor( sk2 , t = 0)

>

simplify(taylor(k(t),t=0,3),siderels);
1
6

>

6
t + O(t3 )
2 4
with assumptions on t, 6, 2 and 4

sqrtI_rho2:=t->k(t)*(1-rho(t));
sqrtI rho2 := t k(t) (1 (t))

>

T1:=t->sig2(t)*sqrtI_rho2(t)*phi(b(t))*phi(k(t)*b(t));
T1 := t sig2(t) sqrtI rho2(t) (b(t)) (k(t) b(t))

>

simplify(simplify(series(T1(t),t=0,6),siderels),power);
1
24

u2 22
6 4 e(1/2 4 ) 3
1

t
((5 62 22 u2 + 3 22 42 8 3 26 42 9 24 43
2880
2
9 22 44 15 6 22 42 u2 18 6 22 42 3 45 + 5 62 4 3 6 43 )

e(1/2

u2 22
4

) ( 6 4(3/2) 2(3/2) )t5 + O(t7 )

with assumptions on t, 6, 4 and 2


>
>

T2 := t->2*sig2(t)*(rho(t)-(b(t))^2)*(arctan(k(t))/(2*pi)
-k(t)/sqrt(2*pi)*(phi(0)-phi(b(t))-k(t)^2/6*(2*phi(0)-((b(t))^2+2)*phi(b(t)))));
T2 := t 2sig2(t) ((t) b(t)2 )

1
2
2
k(t)
((0)

(b(t))

k(t)
(2
(0)

(b(t)
+
2)
(b(t))))

1 arctan(k(t))
6



>

simplify(simplify(series(T2(t),t=0,6),siderels),power);
1

24

u2 22
6 (u2 22 + 4) e(1/2 4 ) 3
t + O(t5 )

4 2
with assumptions on t, 6, 2 and 4

>

T3:=t->(2*sig2(t)*(k(t)*b(t)^2))/sqrt(2*pi)*(1-(k(t)*b(t))^2/6)*phi(b(t));

>

1
sig2(t) k(t) b(t)2 (1 k(t)2 b(t)2 ) (b(t))
6
T3 := t 2
2
simplify(simplify(series(T3(t),t=0,6),siderels),power);
u2 22
1
1 e(1/2 4 ) 6 2(3/2) u2 3

t
2 u2 (27 8 22 42 + 35 62 22 u2
24
25920
4
27 26 42 81 24 43 81 22 44 162 6 22 42 135 6 22 42 u2

27 45 45 62 4 + 243 6 43)e(1/2

u2 22
4

( 6 4(5/2) )t5 + O(t7 )

with assumptions on t, 2, 4 and 6


>

A:=t->((phi(u/sqrt((1+r(t)))))^2/sqrt(I_r2(t)))*(T1(t)+T2(t)+T3(t));
(
A := t

>

u
)2 (T1(t) + T2(t) + T3(t))
1 + r(t)
I r2(t)

simplify(simplify(series(A(t),t=0,6),siderels),power);
O(t4 )
with assumptions on t

PROOF OF THE EQUIVALENT OF NU1


>

Cphib:=t->phi(t)/t-phi(t)/t^3;
Cphib := t

>

sq:=t->sqrt((1-r(t))/(1+r(t)));
sq := t

>

(t) (t)
3
t
t

1 r(t)
1 + r(t)

simplify(simplify(series(sq(t),t=0,4),siderels),power);
1 2 22 + 4 3
1

2 t
t + O(t5 )
2
48
2
with assumptions on t, 2 and 4

>

nsigma:=t->sigma(t)/sqrt(lambda2);
(t)
nsigma := t
2



>
>

A1:=t->(1/sqrt(2*pi))*phi(u)*phi(sq(t)*u/nsigma(t))*((nsigma(t)/(sq(t)*u)
-(nsigma(t)/(sq(t)*u))^3)*sqrt(lambda2)+(1/b(t)-1/b(t)^3)*rp(t)/sqrt(I_r2(t)));

1
1
(

)
rp(t)
3
nsigma(t) nsigma(t)

sq(t) u
b(t) b(t)3

(u) (
) 2 +
)
(

3
3
nsigma(t)
sq(t) u
sq(t) u
I r2(t)

2
SA1:=simplify(simplify(series(A1(t),t=0,6),siderels),power);
A1 := t

>

u2 (4+22 )
)
4
1 2 e(1/2
4(5/2) 2
SA1 :=
t + O(t4 )
16
2(7/2) (3/2) u3
with assumptions on t, 4 and 2

Expansion of the exponent for using Lemma 4.3 (ii), p=2


>

L2:= t->(1-r(t))/((1+r(t))*nsigma(t)^2)-(lambda4-mu4)/mu4;

>

4 4
1 r(t)

(1 + r(t)) nsigma(t)2
4
SL2:=simplify(simplify(series(L2(t),t=0,6),siderels),power);
L2 := t

1 2 6 2
t + O(t4 )
18 42
with assumptions on t, 2, 6 and 4
We define c as the square root of the coefficient of t2
c:=sqrt(op(1,SL2))

1 2 2 6
c :=
6
4
with assumptions on 2, 6 and 4
>
nu1b:=(sqrt(2*pi))*op(1,SA1)*(c^(-3)*u^(-3)/2);
SL2 :=

u2 (4+22 )
)
4
27 2 e(1/2
4(11/2)
nu1b :=
8
2(7/2) u6 (2 6)(3/2)
with assumptions on 4, 2 and 6
PROOF OF THE EQUIVALENT OF NU2
>

m1:=t->(1+rho(t))*2*sigma(t)^2/(pi*b(t)*sqrt(I_r2(t)));
m1 := t 2

>

(1 + (t)) (t)2
b(t) I r2(t)

sm1:=simplify(simplify(series(m1(t),t=0,8),siderels),power);

1 6 4 3
sm1 :=
t + O(t5 )
36 2(5/2) u
with assumptions on t, 6, 4 and 2



>

m2:=t->(-4/pi)*sigma(t)^2*b(t)^(-3)/sqrt(I_r2(t));

>

(t)2
b(t)3 I r2(t)
sm2:=simplify(simplify(series(m2(t),t=0,6),siderels),power);

>

4(5/2)
t + O(t3 )
u3 2(7/2)
with assumptions on t, 4 and 2
m3:=t->-(1+rho(t))*sigma(t)^2*k(t)/(pi*sqrt((2*pi)*I_r2(t)));

m2 := t 4

sm2 :=

(1 + (t)) (t)2 k(t)


2 I r2(t)
sm3:=simplify(simplify(series(m3(t),t=0,6),siderels),power);

1
6(3/2) 2

sm3 :=
t4 + O(t6 )
864 22 4 (3/2)
with assumptions on t, 6, 2 and 4
m4:=t->(2/pi)*sigma(t)^2*k(t)^3/sqrt(2*pi*I_r2(t));
m3 := t

>

>

m4 := t 2

>

>

(t)2 k(t)3

2 I r2(t)
sm4:=simplify(simplify(series(m4(t),t=0,6),siderels),power);

1
6(3/2) 2

sm4 :=
t4 + O(t6 )
864 22 4 (3/2)
with assumptions on t, 6, 2 and 4
m5:=t->(4/pi)*sigma(t)^2*k(t)*b(t)^(-2)/sqrt(2*pi*I_r2(t));

>

(t)2 k(t)
b(t)2 2 I r2(t)
sm5:=simplify(simplify(series(m5(t),t=0,6),siderels),power);

1 6 4(3/2) 2 2
sm5 :=
t + O(t4 )
12 23 (3/2) u2
with assumptions on t, 6, 4 and 2
l12:=t-> (b(t)/u)^2 + 2/(1+r(t))-lambda4/mu4;

>

b(t)2
1
4
+2

u2
1 + r(t) 4
simplify(simplify(series(l12(t),t=0,8),siderels),power);

m5 := t 4

>

l12 := t

1 2 6 2
t + O(t4 )
18 42
with assumptions on t, 2, 6 and 4
>

l22:=t-> ((b(t)/u)^2 )*(1+k(t)^2)+2/(1+r(t))-lambda4/mu4;


l22 := t

b(t)2 (1 + k(t)2 )
1
4
+2

u2
1 + r(t) 4



>

simplify(simplify(series(l22(t),t=0,8),siderels),power);
1 2 6 2
t + O(t4 )
12 42
with assumptions on t, 2, 6 and 4

>

simplify(int( cos(t)^3, t=0..arctan(sqrt(2)/2)),power);


8
3
27

>

opm1:=op(1,sm1);

1 6 4
36 2(5/2) u
with assumptions on 6, 4 and 2

opm1 :=
>

opm2:=op(1,sm2);
4(5/2)
u3 2(7/2)
with assumptions on 4 and 2

opm2 :=
>

>

>

>

>

>

opm5:=op(1,sm5);

1 6 4(3/2) 2
opm5 :=
12 23 (3/2) u2
with assumptions on 6, 4 and 2
c1:=144*sqrt(3)*mu4^4*u^(-4)/(sqrt(2*pi)*lambda2^2*mu6^2);

3 44 2
c1 := 72 4
u 22 62
with assumptions on 4, 2 and 6
c2:=3*sqrt(3)*mu4^2*u^(-2)/(sqrt(2*pi)*lambda2*mu6);

3
3 42 2

c2 :=
2 u2 2 6
with assumptions on 4, 2 and 6
c5:=12*sqrt(3)*mu4^3*u^(-3)/(lambda2^(3/2)*mu6^(3/2));

3 43
c5 := 12 3 (3/2) (3/2)
u 2
6
with assumptions on 4, 2 and 6
B:=opm1*c1+opm2*c2+opm5*c5;

3 4(9/2) 3 2
B :=
2 (3/2) u5 2(9/2) 6
with assumptions on 4, 2 and 6
simplify(B);

3 4(9/2) 3 2
2 (3/2) u5 2(9/2) 6
with assumptions on 4, 2 and 6


A general expression for the distribution of the maximum of a Gaussian field and the approximation of the tail

arXiv:math/0607041v2 [math.PR] 8 Jan 2007

Jean-Marc Azaïs, azais@cict.fr

Mario Wschebor , wschebor@cmat.edu.uy

February 2, 2008

AMS subject classification: Primary 60G70 Secondary 60G15


Short Title: Distribution of the Maximum.
Key words and phrases: Gaussian fields, Rice Formula, Euler-Poincare Characteristic, Distribution of the Maximum, Density of the Maximum, Random Matrices.
Abstract
We study the probability distribution F (u) of the maximum of smooth Gaussian fields
defined on compact subsets of Rd having some geometric regularity.
Our main result is a general expression for the density of F . Even though this is an
implicit formula, one can deduce from it explicit bounds for the density, hence for the
distribution, as well as improved expansions for 1 F (u) for large values of u.
The main tool is the Rice formula for the moments of the number of roots of a random
system of equations over the reals.
This method also enables one to study second order properties of the expected Euler Characteristic approximation using only elementary arguments, and to extend these kinds of results to some interesting classes of Gaussian fields. We obtain more precise results for the direct method to compute the distribution of the maximum, using the spectral theory of GOE random matrices.

Introduction and notations

Let X = {X(t) : t S} be a real-valued random field defined on some parameter set S and
M := suptS X(t) its supremum.
The study of the probability distribution of the random variable M , i.e. the function
FM (u) := P{M u} is a classical problem in probability theory. When the process is Gaussian,
general inequalities allow to give bounds on 1 FM (u) = P{M > u} as well as asymptotic
results for u +. A partial account of this well established theory, since the founding paper
by Landau and Shepp [20] should contain - among a long list of contributors - the works of
Marcus and Shepp [24], Sudakov and Tsirelson [30], Borell [13] [14], Fernique [17], Ledoux and
Talagrand [22], Berman [11] [12], Adler[2], Talagrand [32] and Ledoux[21].
During the last fifteen years, several methods have been introduced with the aim of obtaining more precise results than those arising from the classical theory, at least under certain
restrictions on the process X , which are interesting from the point of view of the mathematical
theory as well as in many significant applications. These restrictions include the requirement

This work was supported by ECOS program U03E01.


Laboratoire de Statistique et Probabilites. UMR-CNRS C5583 Universite Paul Sabatier. 118, route de
Narbonne. 31062 Toulouse Cedex 4. France.

Centro de Matem
atica. Facultad de Ciencias. Universidad de la Rep
ublica. Calle Igua 4225. 11400 Montevideo. Uruguay.

the domain S to have certain finite-dimensional geometrical structure and the paths of the
random field to have a certain regularity.
Some examples of these contributions are the double sum method by Piterbarg [28]; the Euler-Poincaré Characteristic (EPC) approximation, Taylor, Takemura and Adler [34], Adler and Taylor [3]; the tube method, Sun [31]; and the well-known Rice method, revisited by Azaïs and Delmas [5], Azaïs and Wschebor [6]. See also Rychlik [29] for numerical computations.
The results in the present paper are based upon Theorem 3 which is an extension of Theorem
3.1 in Azas and Wschebor [8] allowing to express the density pM of FM by means of a general
formula. Even though this is an exact formula, it is only implicit as an expression for the
density, since the relevant random variable M appears in the right-hand side. However, it can
be usefully employed for various purposes.
First, one can use Theorem 3 to obtain bounds for pM (u) and thus for P{M > u} for
every u by means of replacing some indicator function in (4) by the condition that the normal
derivative is extended outward (see below for the precise meaning). This will be called the
direct method. Of course, this may be interesting whenever the expression one obtains can
be handled, which is the actual situation when the random field has a law which is stationary
and isotropic. Our method relies on the application of some known results on the spectrum of
random matrices.
Second, one can use Theorem 3 to study the asymptotics of P{M > u} as u +. More
precisely, one wants to write, whenever it is possible
P\{M > u\} = A(u)\,\exp\Big(-\frac{u^2}{2\sigma^2}\Big) + B(u) \qquad (1)

where A(u) is a known function having polynomially bounded growth as u \to +\infty, \sigma^2 = \sup_{t\in S}\mathrm{Var}(X(t)), and B(u) is an error bounded by a centered Gaussian density with variance \sigma_1^2, \sigma_1^2 < \sigma^2. We will call the first (respectively the second) term in the right-hand side of (1) the first (resp. second) order approximation of P\{M > u\}.
First order approximation has been considered in [3] [34] by means of the expectation of the
EPC of the excursion set Eu := {t S : X(t) > u}. This works for large values of u. The same
authors have considered the second order approximation, that is, how fast does the difference
between P{M > u} and the expected EPC tend to zero when u +.
We will address the same question both for the direct method and the EPC approximation method. Our results on the second order approximation only speak about the size of the
variance of the Gaussian bound. More precise results are only known to the authors in the
special case where S is a compact interval of the real line, the Gaussian process X is stationary
and satisfies a certain number of additional requirements (see Piterbarg [28] and Azas et al. [4]).
Theorem 5 is our first result in this direction. It gives a rough bound for the error B(u) as
u +, in the case the maximum variance is attained at some strict subset of the face in S
having the largest dimension. We are not aware of the existence of other known results under
similar conditions.
In Theorem 6 we consider processes with constant variance. This is close to Theorem 4.3
in [34]. Notice that Theorem 6 has some interest only in case suptS t < , that is, when
one can assure that 12 < 2 in (1). This is the reason for the introduction of the additional
hypothesis \kappa(S) < \infty on the geometry of S (see below (64) for the definition of \kappa(S)), which
is verified in some relevant situations (see the discussion before the statement of Theorem 6).
In Theorem 7, S is convex and the process stationary and isotropic. We compute the exact
asymptotic rate for the second order approximation as u + corresponding to the direct
2

method.
In all cases, the second order approximation for the direct method provides an upper bound
for the one arising from the EPC method.
Our proofs use almost no differential geometry, except for some elementary notions in Euclidean space. Let us remark also that we have separated the conditions on the law of the
process from the conditions on the geometry of the parameter set.
Third, Theorem 3 and related results in this paper, in fact refer to the density pM of
the maximum. On integration, they imply immediately a certain number of properties of the
probability distribution FM , such as the behaviour of the tail as u +.
Theorem 3 implies that FM has a density and we have an implicit expression for it. The
proof of this fact here appears to be simpler than previous ones (see Azas and Wschebor [8])
even in the case the process has 1-dimensional parameter (Azas and Wschebor [7]). Let us
remark that Theorem 3 holds true for non-Gaussian processes under appropriate conditions
allowing to apply Rice formula.
Our method can be exploited to study higher order differentiability of FM (as it has been
done in [7] for one-parameter processes) but we will not pursue this subject here.
This paper is organized as follows:
Section 2 includes an extension of Rice Formula which gives an integral expression for the
expectation of the weighted number of roots of a random system of d equations with d real
unknowns. A complete proof of this formula in a form which is adapted to our needs in this
paper, can be found in [9]. There is an extensive literature on Rice formula in various contexts
(see for example Belayiev [10], Cramér-Leadbetter [15], Marcus [23], Adler [1], Wschebor [35]).
In Section 3, we obtain the exact expression for the distribution of the maximum as a consequence of the Rice-like formula of the previous section. This immediately implies the existence
of the density and gives the implicit formula for it. The proof avoids unnecessary technicalities
that we have used in previous work, even in cases that are much simpler than the ones considered here.
In Section 4, we compute (Theorem 4) the first order approximation in the direct method
for stationary isotropic processes defined on a polyhedron, from which a new upper bound for
P{M > u} for all real u follows.
In Section 5, we consider second order approximation, both for the direct method and the
EPC approximation method. This is the content of Theorems 5, 6 and 7.
Section 6 contains some examples.

Assumptions and notations


X = {X(t) : t S} denotes a real-valued Gaussian field defined on the parameter set S. We
assume that S satisfies the hypothesis A1
A1 :
S is a compact subset of Rd

S is the disjoint union of Sd , Sd1 ..., S0 , where Sj is an orientable C 3 manifold of dimension


j without boundary. The Sj s will be called faces. Let Sd0 , d0 d be the non empty face
having largest dimension.
We will assume that each Sj has an atlas such that the second derivatives of the inverse
functions of all charts (viewed as diffeomorphisms from an open set in Rj to Sj ) are
bounded by a fixed constant. For t Sj , we denote Lt the maximum curvature of Sj at
the point t. It follows that Lt is bounded for t S.
Notice that the decomposition S = Sd ... S0 is not unique.
Concerning the random field we make the following assumptions A2-A5
A2 : X is in fact defined on an open set containing S and has C 2 paths
A3 : for every t S the distribution of X(t), X (t) does not degenerate; for every s, t S,
s = t, the distribution of X(s), X(t) does not degenerate.
A4 : Almost surely the maximum of X(t) on S is attained at a single point.
For t \in S_j, X'_j(t) and X'_{j,N}(t) denote respectively the derivative along S_j and the normal derivative. Both quantities are viewed as vectors in R^d, and the density of their distribution will be expressed respectively with respect to an orthonormal basis of the tangent space T_{t,j} of S_j at the point t, or its orthogonal complement N_{t,j}. X''_j(t) will denote the second derivative of X along S_j, at the point t \in S_j, and will be viewed as a matrix expressed in an orthogonal basis of T_{t,j}. Similar notations will be used for any function defined on S_j.

A5 : Almost surely, for every j = 0, 1, . . . , d there is no point t in S_j such that X'_j(t) = 0 and det(X''_j(t)) = 0.
Other notations and conventions will be as follows :
j is the geometric measure on Sj .
m(t) := E(X(t)), r(s, t) = Cov(X(s), X(t)) denote respectively the expectation and covariance of the process X ; r0,1 (s, t), r0,2 (s, t) are the first and the second derivatives of r
with respect to t. Analogous notations will be used for other derivatives without further
reference.
If is a random variable taking values in some Euclidean space, p (x) will denote the
density of its probability distribution with respect to the Lebesgue measure, whenever it
exists.
\varphi(x) = (2\pi)^{-1/2}\exp(-x^2/2) is the standard Gaussian density; \Phi(x) := \int_{-\infty}^{x}\varphi(y)\,dy.

Assume that the random vectors , have a joint Gaussian distribution, where has
values in some finite dimensional Euclidean space. When it is well defined,
E(f ()/ = x)
is the version of the conditional expectation obtained using Gaussian regression.
Eu := {t S : X(t) > u} is the excursion set above u of the function X(.) and Au :=
{M u} is the event that the maximum is not larger than u.
\langle\cdot,\cdot\rangle and \|\cdot\| denote respectively the inner product and norm in a finite-dimensional real Euclidean space; \lambda_d is the Lebesgue measure on R^d; S^{d-1} is the unit sphere; A^c is the complement of the set A. If M is a real square matrix, M \succ 0 denotes that it is positive definite.

If g : D \to C is a function and u \in C, we denote
N_u^g(D) := \#\{t \in D : g(t) = u\},
which may be finite or infinite.

Some remarks on the hypotheses


One can give simple sufficient additional conditions on the process X so that A4 and A5 hold
true.
If we assume that for each pair j, k = 0, . . . , d and each pair of distinct points s, t, s \in S_j, t \in S_k, the distribution of the triplet
(X(t) - X(s),\, X'_j(s),\, X'_k(t))
does not degenerate in R \times R^j \times R^k, then A4 holds true.

This is well known and follows easily from the next lemma (called Bulinskaya's lemma)
that we state without proof, for completeness.
Lemma 1 Let Z(t) be a stochastic process defined on some neighborhood of a set T embedded
in some Euclidean space. Assume that the Hausdorff dimension of T is smaller or equal than
the integer m and that the values of Z lie in Rm+k for some positive integer k . Suppose, in
addition, that Z has C 1 paths and that the density pZ(t) (v) is bounded for t T and v in some
neighborhood of u Rm+k . Then, a. s. there is no point t T such that Z(t) = u.
With respect to A5, one has the following sufficient conditions: Assume A1, A2, A3 and as
additional hypotheses one of the following two:
- t \mapsto X(t) is of class C^3;
- \sup_{t\in S,\, x\in V(0)} P\{ |\det(X''(t))| < \delta \,/\, X'(t) = x \} \to 0 as \delta \to 0, where V(0) is some neighborhood of zero.


Then A5 holds true. This follows from Proposition 2.1 of [8] and [16].

Rice formula for the number of weighted roots of random fields

In this section we review Rice formula for the expectation of the number of roots of a random
system of equations. For proofs, see for example [8], or [9], where a simpler one is given.
Theorem 1 (Rice formula) Let Z : U \to R^d be a random field, U an open subset of R^d and u \in R^d a fixed point in the codomain. Assume that:
(i) Z is Gaussian,
(ii) almost surely the function t \mapsto Z(t) is of class C^1,
(iii) for each t \in U, Z(t) has a non-degenerate distribution (i.e. Var(Z(t)) \succ 0),
(iv) P\{\exists t \in U,\ Z(t) = u,\ \det(Z'(t)) = 0\} = 0.
Then, for every Borel set B contained in U, one has
E\big(N_u^Z(B)\big) = \int_B E\big(|\det(Z'(t))| \,/\, Z(t) = u\big)\, p_{Z(t)}(u)\, dt. \qquad (2)
If B is compact, then both sides in (2) are finite.
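As a quick sanity check of Theorem 1 in the simplest situation (d = 1, Z = X stationary with unit variance), formula (2) reduces to the classical Rice formula E(N_u([0, T])) = (T/pi) sqrt(lambda_2) exp(-u^2/2) for the number of crossings of the level u, lambda_2 being the second spectral moment. The Monte Carlo sketch below is only illustrative: the trigonometric process, its frequencies and its weights are assumptions chosen so that Var(X(t)) = 1.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([0.7, 1.3, 2.1])         # assumed frequencies
sig2 = np.array([0.36, 0.36, 0.28])   # assumed weights; they sum to 1, so Var(X(t)) = 1
lam2 = np.sum(sig2 * w**2)            # second spectral moment of this process
u, T, n_rep = 1.0, 10.0, 5000
t = np.linspace(0.0, T, 2001)

crossings = 0
for _ in range(n_rep):
    xi = rng.standard_normal(3)
    eta = rng.standard_normal(3)
    # X(t) = sum_k sigma_k (xi_k cos(w_k t) + eta_k sin(w_k t)), a stationary Gaussian process
    x = np.sqrt(sig2) @ (xi[:, None] * np.cos(np.outer(w, t)) + eta[:, None] * np.sin(np.outer(w, t)))
    crossings += np.sum(np.diff(np.sign(x - u)) != 0)   # crossings of the level u on the grid

print("Monte Carlo  E(N_u):", crossings / n_rep)
print("Rice formula E(N_u):", T / np.pi * np.sqrt(lam2) * np.exp(-u**2 / 2))
```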

Theorem 2 Let Z be a random field that verifies the hypotheses of Theorem 1. Assume that for each t \in U one has another random field Y^t : W \to R^{d'}, where W is some topological space, verifying the following conditions:
a) Y^t(w) is a measurable function of (\omega, t, w) and, almost surely, (t, w) \mapsto Y^t(w) is continuous.
b) For each t \in U the random process (s, w) \mapsto (Z(s), Y^t(w)) defined on U \times W is Gaussian.
Moreover, assume that g : U \times C(W, R^{d'}) \to R is a bounded function, which is continuous when one puts on C(W, R^{d'}) the topology of uniform convergence on compact sets. Then, for each compact subset I of U, one has
E\Big( \sum_{t \in I,\, Z(t)=u} g(t, Y^t) \Big) = \int_I E\big( |\det(Z'(t))|\, g(t, Y^t) \,/\, Z(t) = u \big)\, p_{Z(t)}(u)\, dt. \qquad (3)

Remarks:
1. We have already mentioned in the previous section sufficient conditions implying hypothesis (iv) in Theorem 1.
2. With the hypotheses of Theorem 1 it follows easily that if J is a subset of U , d (J) = 0,
then P{NuZ (J) = 0} = 1 for each u Rd .

The implicit formula for the density of the maximum

Theorem 3 Under assumptions A1 to A5, the distribution of M has the density
p_M(x) = \sum_{t\in S_0} E\big( 1\!\mathrm{I}_{A_x} \,/\, X(t) = x \big)\, p_{X(t)}(x)
+ \sum_{j=1}^{d} \int_{S_j} E\big( |\det(X''_j(t))|\, 1\!\mathrm{I}_{A_x} \,/\, X(t) = x,\ X'_j(t) = 0 \big)\, p_{X(t),X'_j(t)}(x, 0)\, \sigma_j(dt). \qquad (4)

Remark: One can replace |\det(X''_j(t))| in the conditional expectation by (-1)^j \det(X''_j(t)), since under the conditioning and whenever M \le x holds true, X''_j(t) is negative semi-definite.
Proof of Theorem 3
Let N_j(u), j = 0, . . . , d be the number of global maxima of X(\cdot) on S that belong to S_j and are larger than u. From the hypotheses it follows that a.s. \sum_{j=0,\dots,d} N_j(u) is equal to 0 or 1, so that
P\{M > u\} = \sum_{j=0,\dots,d} P\{N_j(u) = 1\} = \sum_{j=0,\dots,d} E(N_j(u)). \qquad (5)

The proof will be finished as soon as we show that each term in (5) is the integral over (u, +)
of the corresponding term in (4).
This is self-evident for j = 0. Let us consider the term j = d. We apply the weighted Rice
formula of Section 2 as follows :
Z is the random field X defined on Sd .
For each t \in S_d, put W = S and Y^t : S \to R^2 defined as:
Y^t(w) := \big( X(w) - X(t),\ X(t) \big).
Notice that the second coordinate in the definition of Y t does not depend on w.
6

In the place of the function g, we take for each n = 1, 2, . . . the function gn defined as
follows:
g_n(t, f_1, f_2) = g_n(f_1, f_2) = \big(1 - F_n(\sup_{w\in S} f_1(w))\big)\cdot\big(1 - F_n(u - f_2(w))\big),
where w is any point in W and, for n a positive integer and x \ge 0, we define:
F_n(x) := F(nx); \quad F(x) = 0 if 0 \le x \le 1/2,\ F(x) = 1 if x \ge 1, \qquad (6)
and F monotone non-decreasing and continuous.


It is easy to check that all the requirements in Theorem 2 are satisfied, so that, for the value 0
instead of u in formula (3) we get:
E\Big( \sum_{t\in S_d,\, X'(t)=0} g_n(Y^t) \Big) = \int_{S_d} E\big( |\det(X''(t))|\, g_n(Y^t) \,/\, X'(t) = 0 \big)\, p_{X'(t)}(0)\, \sigma_d(dt). \qquad (7)

Notice that the formula holds true for each compact subset of Sd in the place of Sd , hence for
Sd itself by monotone convergence.
Let now n \to \infty in (7). Clearly g_n(Y^t) \to 1\!\mathrm{I}_{X(s)-X(t)\le 0,\,\forall s\in S}\cdot 1\!\mathrm{I}_{X(t)\ge u}. The passage to the limit does not present any difficulty since 0 \le g_n(Y^t) \le 1 and the sum in the left-hand side is bounded by the random variable N_0^{X'}(S_d), which is in L^1 because of the Rice formula. We get
E(N_d(u)) = \int_{S_d} E\big( |\det(X''(t))|\, 1\!\mathrm{I}_{X(s)-X(t)\le 0,\,\forall s\in S}\, 1\!\mathrm{I}_{X(t)\ge u} \,/\, X'(t) = 0 \big)\, p_{X'(t)}(0)\, \sigma_d(dt).

Conditioning on the value of X(t), we obtain the desired formula for j = d.


The proof for 1 j d 1 is essentially the same, but one must take care of the parameterization of the manifold Sj . One can first establish locally the formula on a chart of Sj , using
local coordinates.
It can be proved as in [8], Proposition 2.2 (the only modification is due to the term 1IAx )
that the quantity written in some chart \phi as
E\big( |\det(Y''(s))|\, 1\!\mathrm{I}_{A_x} \,/\, Y(s) = x,\ Y'(s) = 0 \big)\, p_{Y(s),Y'(s)}(x, 0)\, ds,
where the process Y(s) is the process X written in the chart (Y(s) = X(\phi^{-1}(s))), defines a j-form. By a j-form we mean a measure on S_j that does not depend on the parameterization and which has a density with respect to the Lebesgue measure
ds in every chart. It can be proved also that the integral of this j-form on Sj gives the
expectation of Nj (u).
To get formula (2) it suffices to consider locally around a precise point t Sj the chart
given by the projection on the tangent space at t. In this case we obtain that at t
ds is in fact j (dt)
Y (s) is isometric to Xj (t)
where s = (t).
The first consequence of Theorem 3 is the next corollary. For the statement, we need to
introduce some further notations.
For t in S_j, j \le d_0, we define C_{t,j} as the closed convex cone generated by the set of directions:
\{ \lambda \in R^d : \|\lambda\| = 1;\ \exists\, s_n \in S\ (n = 1, 2, \dots) such that s_n \to t and \frac{t - s_n}{\|t - s_n\|} \to \lambda as n \to +\infty \},
whenever this set is non-empty, and C_{t,j} = \{0\} if it is empty. We will denote by \hat C_{t,j} the dual cone of C_{t,j}, that is:
\hat C_{t,j} := \{ z \in R^d : \langle z, \lambda\rangle \ge 0 \ \text{for all}\ \lambda \in C_{t,j} \}.
Notice that these definitions easily imply that T_{t,j} \subset C_{t,j} and \hat C_{t,j} \subset N_{t,j}. Remark also that for j = d_0, \hat C_{t,j} = N_{t,j}.
We will say that the function X(\cdot) has an extended outward derivative at the point t in S_j, j \le d_0, if X'_{j,N}(t) \in \hat C_{t,j}.
Corollary 1 Under assumptions A1 to A5, one has:
(a) p_M(x) \le p(x), where
p(x) := \sum_{t\in S_0} E\big( 1\!\mathrm{I}_{X'(t)\in\hat C_{t,0}} \,/\, X(t) = x \big)\, p_{X(t)}(x)
+ \sum_{j=1}^{d_0} \int_{S_j} E\big( |\det(X''_j(t))|\, 1\!\mathrm{I}_{X'_{j,N}(t)\in\hat C_{t,j}} \,/\, X(t) = x,\ X'_j(t) = 0 \big)\, p_{X(t),X'_j(t)}(x, 0)\, \sigma_j(dt). \qquad (8)

(b) P\{M > u\} \le \int_u^{+\infty} p(x)\, dx.

Proof
(a) follows from Theorem 3 and the observation that if t \in S_j, one has \{M \le X(t)\} \subset \{X'_{j,N}(t) \in \hat C_{t,j}\}. (b) is an obvious consequence of (a).
t,j
The actual interest of this Corollary depends on the feasibility of computing p(x). It turns
out that it can be done in some relevant cases, as we will see in the remaining of this section.
Our result can be compared with the approximation of P\{M > u\} by means of \int_u^{+\infty} p^E(x)\, dx given by [3], [34], where
p^E(x) := \sum_{t\in S_0} E\big( 1\!\mathrm{I}_{X'(t)\in\hat C_{t,0}} \,/\, X(t) = x \big)\, p_{X(t)}(x)
+ \sum_{j=1}^{d_0} (-1)^j \int_{S_j} E\big( \det(X''_j(t))\, 1\!\mathrm{I}_{X'_{j,N}(t)\in\hat C_{t,j}} \,/\, X(t) = x,\ X'_j(t) = 0 \big)\, p_{X(t),X'_j(t)}(x, 0)\, \sigma_j(dt). \qquad (9)

Under certain conditions, \int_u^{+\infty} p^E(x)\, dx is the expected value of the EPC of the excursion set
Eu (see [3]). The advantage of pE (x) over p(x) is that one can have nice expressions for it in
quite general situations. Conversely p(x) has the obvious advantage that it is an upper-bound
of the true density pM (x) and hence provides upon integrating once, an upper-bound for the
tail probability, for every u value. It is not known whether a similar inequality holds true for
pE (x).
On the other hand, under additional conditions, both provide good first order approximations
for pM (x) as x as we will see in the next section. In the special case in which the process
X is centered and has a law that is invariant under isometries and translations, we describe
below a procedure to compute p(x).

Computing p(x) for stationary isotropic Gaussian fields

For one-parameter centered Gaussian process having constant variance and satisfying certain
regularity conditions, a general bound for pM (x) has been computed in [8], pp.75-77. In the
two parameter case, Mercadier [26] has shown a bound for P{M > u}, obtained by means of a
method especially suited to dimension 2. When the parameter is one or two-dimensional, these
bounds are sharper than the ones below which, on the other hand, apply to any dimension but
to a more restricted context. We will assume now that the process X is centered Gaussian,
with a covariance function that can be written as
E\big(X(s)\cdot X(t)\big) = \rho\big( \|s - t\|^2 \big) \qquad (10)
where \rho : R^+ \to R is of class C^4. Without loss of generality, we assume that \rho(0) = 1.


Assumption (10) is equivalent to saying that the law of X is invariant under isometries (i.e.
linear transformations that preserve the scalar product) and translations of the underlying
parameter space Rd .
We will also assume that the set S is a polyhedron. More precisely we assume that each
Sj (j = 1, . . . , d) is a union of subsets of affine manifolds of dimension j in Rd .
The next lemma contains some auxiliary computations which are elementary and left to the
reader. We use the abridged notation: \rho' := \rho'(0), \rho'' := \rho''(0).
Lemma 2 Under the conditions above, for each t U , i, i , k, k , j = 1, . . . , d:
1. E

X
ti (t).X(t)

2. E

X
X
ti (t). tk (t)

3. E

2X
ti tk (t).X(t)

4. E

2X
2X
ti tk (t). ti tk (t)

= 0,
= 2 ik and < 0,
= 2 ik , E

2X
X
ti tk (t). tj (t)

=0

= 24 ii .kk + i k .ik + ik i k ,

5. 2 0
6. If t Sj , the conditional distribution of Xj (t) given X(t) = x, Xj (t) = 0 is the same as
the unconditional distribution of the random matrix
Z + 2 xIj ,
where Z = (Zik : i, k = 1, . . . , j) is a symmetric j j matrix with centered Gaussian
entries, independent of the pair X(t), X (t) such that, for i k, i k one has :
E(Zik Zi k ) = 4 2 ii + ( 2 ) ik i k + 4 ii .kk (1 ik ) .
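The covariance computations behind Lemma 2 can be checked mechanically by differentiating E(X(s)X(t)) = rho(||s - t||^2) at s = t. The SymPy sketch below (illustrative, not from the paper) does this for d = 2 with the assumed example rho(x) = exp(-x/2), so that rho'(0) = -1/2 and rho''(0) = 1/4; only a few representative mixed derivatives are printed, with the corresponding general expressions noted in comments.

```python
import sympy as sp

s1, s2, t1, t2 = sp.symbols('s1 s2 t1 t2', real=True)
r = sp.exp(-((s1 - t1)**2 + (s2 - t2)**2) / 2)      # rho(x) = exp(-x/2) applied to ||s - t||^2

def on_diag(expr):
    # evaluate a derivative of the covariance on the diagonal s = t
    return sp.simplify(expr.subs({s1: t1, s2: t2}))

print(on_diag(sp.diff(r, s1, t1)))           # E(X_1'(t) X_1'(t))     -> 1   (= -2 rho'(0))
print(on_diag(sp.diff(r, s1, s1)))           # E(X_11''(t) X(t))      -> -1  (=  2 rho'(0))
print(on_diag(sp.diff(r, s1, s1, t1, t1)))   # E(X_11''(t) X_11''(t)) -> 3   (= 12 rho''(0))
print(on_diag(sp.diff(r, s1, s1, t2, t2)))   # E(X_11''(t) X_22''(t)) -> 1   (=  4 rho''(0))
```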
Let us introduce some additional notations:
H_n(x), n = 0, 1, \dots are the standard Hermite polynomials, i.e.
H_n(x) := e^{x^2}\Big(-\frac{d}{dx}\Big)^n e^{-x^2}.
For the properties of the Hermite polynomials we refer to Mehta [25].
\bar H_n(x), n = 0, 1, \dots are the modified Hermite polynomials, defined as:
\bar H_n(x) := e^{x^2/2}\Big(-\frac{d}{dx}\Big)^n e^{-x^2/2}.
We will use the following result:


Lemma 3 Let
J_n(x) := \int_{-\infty}^{+\infty} e^{-y^2/2}\, H_n(\ell)\, dy, \quad n = 0, 1, 2, \dots, \qquad (11)
where \ell stands for the linear form \ell = ay + bx, and a, b are real parameters satisfying a^2 + b^2 = 1/2. Then
J_n(x) = \sqrt{2\pi}\,(2b)^n\, \bar H_n(x).


Proof :
It is clear that Jn is a polynomial having degree n. Differentiating in (11) under the integral
sign, we get:
Jn (x) = b

ey

2 /2

Hn ()dy = 2nb

Also:

Jn (0) =

ey

2 /2

Hn1 ()dy = 2n b Jn1 (x)

(12)

ey

2 /2

Hn (ay)dy,

so that Jn (0) = 0 if n is odd.


If n is even, n 2, using the standard recurrence relations for Hermite polynomials, we have:
+

Jn (0) =

ey

2 /2

= 2a2

ey

2ayHn1 (ay) 2(n 1)Hn2 (ay) dy


2 /2

(ay)dy 2(n 1)Jn2 (0)


Hn1

= 4b2 (n 1)Jn2 (0).

Equality (13) plus J0 (x) = 2 for all x R, imply that:

(2p)!
2.
J2p (0) = (1)p (2b)2p (2p 1)!! 2 = (2b2 )p
p!

(13)

(14)

Now we can go back to (12) and integrate successively for n = 1, 2, . . . on the interval [0, x]
using the initial value given by (14) when n = 2p and Jn (0) = 0 when n is odd, obtaining :

Jn (x) = (2b)n 2Qn (x),


where the sequence of polynomials Qn , n = 0, 1, 2, . . . verifies the conditions:
Q0 (x) = 1

(15)

Qn (x)

= nQn (x)

(16)

Qn (0) = 0 if n is odd

(17)

Qn (0) = (1)n/2 (n 1)!! if n is even.

(18)

It is now easy to show that in fact Q_n(x) = \bar H_n(x), n = 0, 1, 2, \dots, using for example that \bar H_n(x) = 2^{-n/2} H_n(x/\sqrt2).
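Numerically, Lemma 3 is easy to check. The sketch below (illustrative only) compares J_n(x), computed by quadrature, with sqrt(2 pi) (2b)^n Hbar_n(x), using NumPy's physicists' and probabilists' Hermite evaluators (the latter coincide with the modified polynomials Hbar_n); the values of a and x are arbitrary assumptions.

```python
import numpy as np
from numpy.polynomial.hermite import hermval          # physicists' H_n
from numpy.polynomial.hermite_e import hermeval       # probabilists' He_n = Hbar_n
from scipy.integrate import quad

def H(n, z):
    return hermval(z, [0]*n + [1])

def Hbar(n, z):
    return hermeval(z, [0]*n + [1])

a, x = 0.3, 1.7                      # assumed parameters
b = np.sqrt(0.5 - a**2)              # so that a^2 + b^2 = 1/2
for n in range(5):
    J, _ = quad(lambda y: np.exp(-y**2 / 2) * H(n, a*y + b*x), -np.inf, np.inf)
    print(n, J, np.sqrt(2*np.pi) * (2*b)**n * Hbar(n, x))
```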

The integrals
I_n(v) := \int_v^{+\infty} e^{-t^2/2}\, H_n(t)\, dt
will appear in our computations. They are computed in the next Lemma, which can be proved easily, using the standard properties of Hermite polynomials.

Lemma 4 (a)
I_n(v) = 2\,e^{-v^2/2} \sum_{k=0}^{[\frac{n-1}{2}]} 2^k\, \frac{(n-1)!!}{(n-1-2k)!!}\, H_{n-1-2k}(v) \;+\; 1\!\mathrm{I}_{\{n\ \mathrm{even}\}}\, 2^{n/2}\,(n-1)!!\, \sqrt{2\pi}\,\big(1-\Phi(v)\big). \qquad (19)
(b)
I_n(-\infty) = 1\!\mathrm{I}_{\{n\ \mathrm{even}\}}\, 2^{n/2}\,(n-1)!!\, \sqrt{2\pi}. \qquad (20)
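The closed form (19) can be verified numerically against the defining integral; the sketch below is purely illustrative, the helper dfact implements the double factorial with the convention (-1)!! = 0!! = 1, and the value of v is an arbitrary assumption.

```python
import numpy as np
from numpy.polynomial.hermite import hermval
from scipy.integrate import quad
from scipy.stats import norm

def H(n, z):
    return hermval(z, [0]*n + [1])

def dfact(m):                      # double factorial, with (-1)!! = 0!! = 1
    out = 1
    while m > 1:
        out *= m
        m -= 2
    return out

def I_closed(n, v):                # formula (19)
    s = sum(2**k * dfact(n-1) / dfact(n-1-2*k) * H(n-1-2*k, v) for k in range((n-1)//2 + 1))
    even = 2**(n/2) * dfact(n-1) * np.sqrt(2*np.pi) * norm.sf(v) if n % 2 == 0 else 0.0
    return 2 * np.exp(-v**2 / 2) * s + even

v = 0.8
for n in range(1, 6):
    I_num, _ = quad(lambda t: np.exp(-t**2 / 2) * H(n, t), v, np.inf)
    print(n, I_num, I_closed(n, v))
```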

Theorem 4 Assume that the process X is centered Gaussian, satisfies conditions A1-A5 with
a covariance having the form (10) and verifying the regularity conditions of the beginning of this
section. Moreover, let S be a polyhedron. Then, p(x) can be expressed by means of the following
formula:

d0

| | j/2
p(x) = (x)
H j (x) + Rj (x) gj
0 (t) +
,
(22)

j=1

tS0

where

gj is a geometric parameter of the face Sj defined by


j (t)j (dt),

gj =

(23)

Sj

where j (t) is the normalized solid angle of the cone Ct,j in Nt,j , that is:
j (t) =

dj1 (Ct,j S dj1 )


for j = 0, . . . , d 1,
dj1 (S dj1 )

d (t) = 1.

(24)
(25)

Notice that for convex or other usual polyhedra j (t) is constant for t Sj , so that gj is
equal to this constant multiplied by the j-dimensional geometric measure of Sj .
For j = 1, . . . d,
Rj (x) =

2
| |

j
2

((j + 1)/2

y2
dy
2

(26)

with := | |( )1/2

(27)

Tj (v) exp

where
v := (2)1/2 (1 2 )1/2 y x
and

j1

Tj (v) :=
k=0

Hk2 (v) v2 /2
Hj (v)
Ij1 (v).
e
j
2k k!
2 (j 1)!

(28)

where In is given in the previous Lemma.


For the proof of the theorem, we need some ingredients from random matrix theory.
Following Mehta [25], denote by q_n(\lambda) the density of eigenvalues of n \times n GOE matrices at the point \lambda, that is, q_n(\lambda)\,d\lambda is the probability of G_n having an eigenvalue in the interval (\lambda, \lambda + d\lambda). The random n \times n real matrix G_n is said to have the GOE distribution if it is symmetric, with centered Gaussian entries g_{ik}, i, k = 1, \dots, n, satisfying E(g_{ii}^2) = 1, E(g_{ik}^2) = 1/2 if i < k, and the random variables \{g_{ik}, 1 \le i \le k \le n\} are independent.


It is well known that:
n1

2 /2

qn () = e

2 /2

c2k Hk2 ()
k=0
+

+ 1/2 (n/2)1/2 cn1 cn Hn1 ()

ey

+ 1I{n odd

2 /2

Hn (y)dy 2

ey

2 /2

Hn (y)dy

Hn1 ()
,
+ y 2 /2
e
H
(y)dy
n1

(29)

where ck := (2k k! )1/2 , k = 0, 1, . . ., (see Mehta [25], ch. 7.)


In the proof of the theorem we will use the following remark due to Fyodorov [18] that we
state as a Lemma
Lemma 5 Let G_n be a GOE n \times n matrix. Then, for \lambda \in R one has:
E\big( |\det(G_n - \lambda I_n)| \big) = 2^{3/2}\, \Gamma\big(\tfrac{n+3}{2}\big)\, \exp(\lambda^2/2)\, \frac{q_{n+1}(\lambda)}{n+1}. \qquad (30)

Proof:
Denote by \lambda_1, \dots, \lambda_n the eigenvalues of G_n. It is well known (Mehta [25], Kendall et al. [19]) that the joint density f_n of the n-tuple of random variables (\lambda_1, \dots, \lambda_n) is given by the formula
f_n(\lambda_1, \dots, \lambda_n) = c_n \exp\Big(-\sum_{i=1}^n \lambda_i^2/2\Big) \prod_{1\le i<k\le n} |\lambda_k - \lambda_i|, \quad \text{with}\ c_n := (2\pi)^{-n/2}\, \big(\Gamma(3/2)\big)^n \Big/ \prod_{i=1}^n \Gamma(1 + i/2).

Then,
E\big( |\det(G_n - \lambda I_n)| \big) = E\Big( \prod_{i=1}^n |\lambda_i - \lambda| \Big)
= \int_{R^n} \prod_{i=1}^n |\lambda_i - \lambda|\; c_n \exp\Big(-\sum_{i=1}^n \lambda_i^2/2\Big) \prod_{1\le i<k\le n} |\lambda_k - \lambda_i|\, d\lambda_1 \dots d\lambda_n
= e^{\lambda^2/2}\, \frac{c_n}{c_{n+1}} \int_{R^n} f_{n+1}(\lambda_1, \dots, \lambda_n, \lambda)\, d\lambda_1 \dots d\lambda_n
= e^{\lambda^2/2}\, \frac{c_n}{c_{n+1}}\, \frac{q_{n+1}(\lambda)}{n+1}.

The remainder is plain.
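Lemma 5 lends itself to a direct Monte Carlo check. The sketch below (illustrative; the sample size, the bandwidth and the values of n and lambda are arbitrary assumptions) simulates GOE matrices with the normalization above, estimates q_{n+1}(lambda) by a simple histogram count of eigenvalues around lambda, and compares both sides of (30) for a small n.

```python
import numpy as np
from math import gamma

rng = np.random.default_rng(1)

def goe(m):
    # symmetric matrix with E(g_ii^2) = 1 and E(g_ik^2) = 1/2 for i < k
    a = rng.standard_normal((m, m)) / np.sqrt(2)
    return (a + a.T) / np.sqrt(2)

n, lam, n_rep = 2, 0.5, 50000

lhs = np.mean([abs(np.linalg.det(goe(n) - lam * np.eye(n))) for _ in range(n_rep)])

eigs = np.concatenate([np.linalg.eigvalsh(goe(n + 1)) for _ in range(n_rep)])
h = 0.05
q = np.sum(np.abs(eigs - lam) < h / 2) / (n_rep * h)      # estimate of q_{n+1}(lambda)
rhs = 2**1.5 * gamma((n + 3) / 2) * np.exp(lam**2 / 2) * q / (n + 1)

print("E|det(G_n - lam I)| (Monte Carlo):", lhs)
print("right-hand side of (30)          :", rhs)
```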


Proof of Theorem 4:
We use the definition (8) given in Corollary 1 and the moment computations of Lemma 2, which imply that:
p_{X(t)}(x) = \varphi(x), \qquad (31)
p_{X(t),X'_j(t)}(x, 0) = \varphi(x)\,(2\pi)^{-j/2}\,(-2\rho')^{-j/2}, \qquad (32)
X'(t) is independent of X(t); X'_{j,N}(t) is independent of (X''_j(t), X(t), X'_j(t)). \qquad (33)
Since the distribution of X'(t) is centered Gaussian with variance -2\rho' I_d, it follows that:
E\big( 1\!\mathrm{I}_{X'(t)\in\hat C_{t,0}} \,/\, X(t) = x \big) = \hat\sigma_0(t) \quad \text{if}\ t \in S_0, \qquad (34)

and if t Sj , j 1:
E(| det(Xj (t))| 1IX

j,N (t)Ct,j

/X(t) = x, Xj (t) = 0)

= j (t) E(| det(Xj (t))|/X(t) = x, Xj (t) = 0)


= j (t) E(| det(Z + 2 xIj )|). (35)
In the formula above, j (t) is the normalized solid angle defined in the statement of the theorem
and the random j j real matrix Z has the distribution of Lemma 2 .
A standard moment computation shows that Z has the same distribution as the random matrix
\sqrt{8\rho''}\; G_j + 2\sqrt{\rho'' - \rho'^2}\; \xi\, I_j,
where G_j is a j \times j GOE random matrix, \xi is standard normal in R and independent of G_j.


So, for j 1 one has
+

E | det(Z + 2 xIj )| = (8 )j/2

E | det(Gj Ij )| (y)dy,

where is given by (27).


For the conditional expectation in (8) use this last expression in (35) and (5). For the density
in (8) use (32). Then Lemma 3 gives (22).

Remarks on the theorem


The principal term is
\varphi(x)\Big[ \sum_{t\in S_0} \hat\sigma_0(t) + \sum_{j=1}^{d_0} \Big(\frac{|\rho'|}{\pi}\Big)^{j/2} \bar H_j(x)\, g_j \Big], \qquad (36)

which is the product of a standard Gaussian density times a polynomial with degree d0 .
Integrating once, we get -in our special case- the formula for the expectation of the EPC
of the excursion set as given by [3]
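To make this concrete, the sketch below evaluates the integral of the principal term on (u, +infinity) — using the standard identity that the integral of phi(x) Hbar_j(x) over (u, +infinity) equals Hbar_{j-1}(u) phi(u) — for the cube S = [0, T]^d. For the cube, the geometric coefficients reduce (this is an assumed, standard computation, not spelled out at this point of the text) to g_j = binom(d, j) T^j, and the vertex contributions sum to 1; the covariance rho(x) = exp(-x/2), i.e. |rho'(0)| = 1/2, and all numerical values are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from scipy.stats import norm
from math import comb, pi

def Hbar(n, x):                       # modified Hermite polynomial of the text (probabilists')
    return hermeval(x, [0]*n + [1])

def first_order_tail(u, d, T, abs_rho_prime):
    tail = norm.sf(u)                 # contribution of the 2^d vertices (normalized angles sum to 1)
    for j in range(1, d + 1):
        g_j = comb(d, j) * T**j       # assumed value of g_j for the cube [0, T]^d
        tail += g_j * (abs_rho_prime / pi)**(j / 2) * Hbar(j - 1, u) * norm.pdf(u)
    return tail

print(first_order_tail(u=3.0, d=2, T=1.0, abs_rho_prime=0.5))
```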
The complementary term given by
\varphi(x) \sum_{j=1}^{d_0} R_j(x)\, g_j, \qquad (37)

can be computed by means of a formula, as it follows from the statement of the theorem
above. These formulae will be in general quite unpleasant due to the complicated form of
Tj (v). However, for low dimensions they are simple. For example:

T_1(v) = \sqrt{2\pi}\,\big[\varphi(v) - v(1-\Phi(v))\big], \qquad (38)
T_2(v) = 2\sqrt{2\pi}\,\varphi(v), \qquad (39)
T_3(v) = \frac{\sqrt{2\pi}}{2}\,\big[ 3(2v^2+1)\varphi(v) - (2v^2-3)\,v\,(1-\Phi(v)) \big]. \qquad (40)
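These three closed forms can be checked directly against the definition (28); the short sketch below (illustrative, with an arbitrary value of v) evaluates both.

```python
import numpy as np
from numpy.polynomial.hermite import hermval
from scipy.integrate import quad
from scipy.stats import norm
from math import factorial, sqrt, pi

def H(n, z):
    return hermval(z, [0]*n + [1])

def T_def(j, v):                       # definition (28), with I_{j-1}(v) computed by quadrature
    I, _ = quad(lambda t: np.exp(-t**2 / 2) * H(j - 1, t), v, np.inf)
    s = sum(H(k, v)**2 / (2**k * factorial(k)) for k in range(j))
    return s * np.exp(-v**2 / 2) - H(j, v) * I / (2**j * factorial(j - 1))

v = 0.9
phi, Phibar = norm.pdf(v), norm.sf(v)
closed = [sqrt(2*pi) * (phi - v*Phibar),
          2*sqrt(2*pi) * phi,
          sqrt(2*pi)/2 * (3*(2*v**2 + 1)*phi - (2*v**2 - 3)*v*Phibar)]
for j, c in enumerate(closed, start=1):
    print(j, T_def(j, v), c)
```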

Second order asymptotics for pM (x) as x + will be mainly considered in the next
section. However, we state already that the complementary term (37) is equivalent, as
x +, to
12

(x) gd0 Kd0 x2d0 4 e

2
3 2

x2

(41)

where the constant Kj , j = 1, 2, ... is given by:


Kj = 23j2

j+1

2
j/4
j/2
3 2
(2) (j 1)!

2j4

(42)

We are not going to go through this calculation, which is elementary but requires some
work. An outline of it is the following. Replace the Hermite polynomials in the expression
for Tj (v) given by (28) by the well-known expansion:
H_j(v) = j! \sum_{i=0}^{[j/2]} (-1)^i\, \frac{(2v)^{j-2i}}{i!\,(j-2i)!}, \qquad (43)

and Ij1 (v) by means of the formula in Lemma 4.


Evaluating the term of highest degree in the polynomial part, this allows one to prove that, as v \to +\infty, T_j(v) is equivalent to
\frac{2^{j-1}}{(j-1)!}\, v^{2j-4}\, e^{-v^2/2}. \qquad (44)

Using now the definition of Rj (x) and changing variables in the integral in (26), one gets
for Rj (x) the equivalent:
12

Kj x2j4 e

2
3 2

x2

(45)

In particular, the equivalent of (37) is given by the highest order non-vanishing term in
the sum.
Consider now the case in which S is the sphere S d1 and the process satisfies the same
conditions as in the theorem. Even though the theorem can not be applied directly,
it is possible to deal with this example to compute p(x), only performing some minor
changes. In this case, only the term that corresponds to j = d 1 in (8) does not vanish,
= 1 for each t S d1 and one can use invariance
Ct,d1 = Nt,d1 , so that 1IX
b
(t)C
d1,N

t,d1

under rotations to obtain:


p(x) = (x)

d1 S d1
E | det(Z + 2 xId1 ) + (2| |)1/2 Id1 |
(2)(d1)/2

(46)

where Z is a (d 1) (d 1) centered Gaussian matrix with the covariance structure


of Lemma 2 and is a standard Gaussian real random variable, independent of Z. (46)
follows from the fact that the normal derivative at each point is centered Gaussian with


variance 2| | and independent of the tangential derivative. So, we apply the previous
computation, replacing x by x + (2| |)1/2 and obtain the expression:
p(x) = (x)
+

2 d/2
(d/2)
| | (d1)/2
H d1 (x + (2| |)1/2 y) + Rd1 (x + (2| |)1/2 y) (y)dy.

(47)

Asymptotics as x \to +\infty

In this section we will consider the errors in the direct and the EPC methods for large values
of the argument x. These errors are:
p(x) - p_M(x) = \sum_{t\in S_0} E\big( 1\!\mathrm{I}_{X'(t)\in\hat C_{t,0}}\, 1\!\mathrm{I}_{M>x} \,/\, X(t) = x \big)\, p_{X(t)}(x)
+ \sum_{j=1}^{d_0} \int_{S_j} E\big( |\det(X''_j(t))|\, 1\!\mathrm{I}_{X'_{j,N}(t)\in\hat C_{t,j}}\, 1\!\mathrm{I}_{M>x} \,/\, X(t) = x,\ X'_j(t) = 0 \big)\, p_{X(t),X'_j(t)}(x, 0)\, \sigma_j(dt), \qquad (48)

p^E(x) - p_M(x) = \sum_{t\in S_0} E\big( 1\!\mathrm{I}_{X'(t)\in\hat C_{t,0}}\, 1\!\mathrm{I}_{M>x} \,/\, X(t) = x \big)\, p_{X(t)}(x)
+ \sum_{j=1}^{d_0} (-1)^j \int_{S_j} E\big( \det(X''_j(t))\, 1\!\mathrm{I}_{X'_{j,N}(t)\in\hat C_{t,j}}\, 1\!\mathrm{I}_{M>x} \,/\, X(t) = x,\ X'_j(t) = 0 \big)\, p_{X(t),X'_j(t)}(x, 0)\, \sigma_j(dt). \qquad (49)

It is clear that for every real x,
|p^E(x) - p_M(x)| \le p(x) - p_M(x),
so that the upper bounds for p(x) - p_M(x) will automatically be upper bounds for |p^E(x) - p_M(x)|. Moreover, as far as the authors know, no better bounds for |p^E(x) - p_M(x)| than for p(x) - p_M(x) are known. It is an open question to determine if there exist situations in which p^E(x) is better asymptotically than p(x).
Our next theorem gives sufficient conditions ensuring that the error p(x) - p_M(x) is bounded by a Gaussian density having strictly smaller variance than the maximum variance of the given process X, which means that the error is super-exponentially smaller than p_M(x) itself, as x \to +\infty. In this theorem, we assume that the maximum of the variance is not attained
in S\Sd0 . This excludes constant variance or some other stationary-like condition that will be
addressed in Theorem 6. As far as the authors know, the result of Theorem 5 is new even for
one-parameter processes defined on a compact interval.
For parameter dimension d0 > 1, the only result of this type for non-constant variance
processes of which the authors are aware is Theorem 3.3 of [34].
Theorem 5 Assume that the process X satisfies conditions A1 -A5. With no loss of generality,
we assume that maxtS Var(X(t)) = 1. In addition, we will assume that the set Sv of points
t S where the variance of X(t) attains its maximal value is contained in Sd0 (d0 > 0) the
non-empty face having largest dimension and that no point in Sv is a boundary point of S\Sd0 .
Then, there exist some positive constants C, \delta such that for every x > 0,
|p^E(x) - p_M(x)| \le p(x) - p_M(x) \le C\,\varphi\big(x(1+\delta)\big), \qquad (50)
where \varphi(\cdot) is the standard normal density.

Proof :
Let W be an open neighborhood of the compact subset Sv of S such that dist(W, (S\Sd0 )) > 0
where dist denote the Euclidean distance in Rd . For t Sj W c , the density
pX(t),Xj (t) (x, 0)
can be written as the product of the density of Xj (t) at the point 0, times the conditional density
of X(t) at the point x given that Xj (t) = 0, which is Gaussian with some bounded expectation
and a conditional variance which is smaller than the unconditional variance, hence, bounded by
some constant smaller than 1. Since the conditional expectations in (48) are uniformly bounded
by some constant, due to standard bounds on the moments of the Gaussian law, one can deduce
that:
p(x) pM (x) =

W Sd0

E | det(Xd0 (t))| 1IX

d0 ,N (t)Ct,d0

.pX(t),Xd

. 1IM >x /X(t) = x, Xd 0 (t) = 0

(t) (x, 0)d0 (dt)

+ O(((1 + 1 )x)), (51)

as x +, for some 1 > 0. Our following task is to choose W such that one can assure
that the first term in the right hand-member of (51) has the same form as the second, with a
possibly different constant 1 .
To do this, for s \in S and t \in S_{d_0}, let us write the Gaussian regression formula of X(s) on the pair (X(t), X'_{d_0}(t)):
X(s) = a^t(s)\, X(t) + \langle b^t(s),\, X'_{d_0}(t) \rangle + \frac{\|t - s\|^2}{2}\, X^t(s). \qquad (52)

where the regression coefficients at (s), bt (s) are respectively real-valued and Rd0 -valued.
From now onwards, we will only be interested in those t W . In this case, since W does not
contain boundary points of S\Sd0 , it follows that
Ct,d0 = Nt,d0 and 1IX

d0 ,N (t)Ct,d0

= 1.

Moreover, whenever s S is close enough to t, necessarily, s Sd0 and one can show that
the Gaussian process {X t (s) : t W Sd0 , s S} is bounded, in spite of the fact that its
trajectories are not continuous at s = t. For each t, {X t (s) : s S} is a helix process, see [8]
for a proof of boundedness.
On the other hand, conditionally on X(t) = x, Xd 0 (t) = 0 the event {M > x} can be written as
{X t (s) > t (s) x, for some s S}
where
t (s) =

2(1 at (s))
.
ts 2

(53)

Our next goal is to prove that if one can choose W in such a way that
inf{ t (s) : t W Sd0 , s S, s = t} > 0,

(54)

then we are done. In fact, apply the Cauchy-Schwarz inequality to the conditional expectation
in (51). Under the conditioning, the elements of Xd0 (t) are the sum of affine functions of x
with bounded coefficients plus centered Gaussian variables with bounded variances, hence, the
absolute value of the conditional expectation is bounded by an expression of the form
Q(t, x)

1/2

X t (s)
>x
t
sS\{t} (s)
sup

1/2

(55)

where Q(t, x) is a polynomial in x of degree 2d0 with bounded coefficients. For each t W Sd0 ,
the second factor in (55) is bounded by
P sup

X t (s)
: t W Sd0 , s S, s = t > x
t (s)

1/2

Now, we apply to the bounded separable Gaussian process


X t (s)
: t W Sd0 , s S, s = t
t (s)
the classical Landau-Shepp-Fernique inequality [20], [17] which gives the bound
P sup

X t (s)
: t W Sd0 , s S, s = t > x C2 exp(2 x2 ),
t (s)

for some positive constants C2 , 2 and any x > 0. Also, the same argument above for the density
pX(t),Xd (t) (x, 0) shows that it is bounded by a constant times the standard Gaussian density.
0
To finish, it suffices to replace these bounds in the first term at the right-hand side of (51).
It remains to choose W for (54) to hold true. Consider the auxiliary process
Y (s) :=

X(s)
r(s, s)

, s S.

(56)

Clearly, Var(Y (s)) = 1 for all s S. We set


r Y (s, s ) := Cov(Y (s), Y (s )) , s, s S.
Let us assume that t Sv . Since the function s
Var(X(s)) attains its maximum value at

s = t, it follows that X(t), Xd0 (t) are independent, on differentiation under the expectation
sign. This implies that in the regression formula (52) the coefficients are easily computed and
at (s) = r(s, t) which is strictly smaller than 1 if s = t, because of the non-degeneracy condition.
Then
2(1 r(s, t))
2(1 r Y (s, t))
t (s) =

.
(57)
ts 2
ts 2

Since r Y (s, s) = 1 for every s S, the Taylor expansion of r Y (s, t) as a function of s, around
s = t takes the form:
Y
r Y (s, t) = 1 + s t, r20,d
(t, t)(s t) + o( s t 2 ),
0

(58)

where the notation is self-explanatory.


Also, using that Var(Y (s)) = 1 for s S, we easily obtain:
Y
(t, t) = Var(Yd0 (t)) = Var(Xd 0 (t))
r20,d
0,

(59)

where the last equality follows by differentiation in (56) and putting s = t. (59) implies that
Y
(t, t) is uniformly positive definite on t Sv , meaning that its minimum eigenvalue has
r20,d
0,
a strictly positive lower bound. This, on account of (57) and (58), already shows that
inf{ t (s) : t Sv , s S, s = t} > 0,

(60)

The foregoing argument also shows that


inf{ (at )d0 (t) : t Sv , S d0 1 , s = t} > 0,

(61)

since whenever t Sv , one has at (s) = r(s, t) so that


(at )d0 (t) = r20,d0 , (t, t).
To end up, assume there is no neighborhood W of Sv satisfying (54). In that case using a
compactness argument, one can find two convergent sequences {sn } S , {tn } Sd0 , sn s0 ,
tn t0 Sv such that
tn (sn ) 0.
may be .
t0 = s0 is not possible, since it would imply
=2

(1 at0 (s0 ))
= t0 (s0 ),
t0 s 0 2

which is strictly positive.


If t0 = s0 , on differentiating in (52) with respect to s along Sd0 we get:
Xd 0 (s) = (at )d0 (s)X(t) + (bt )d0 (s), Xd 0 (t) +

d0 t s
s
2

X t (s),

where (at )d0 (s) is a column vector of size d0 and (bt )d0 (s) is a d0 d0 matrix. Then, one must
have at (t) = 1, (at )d0 (t) = 0 . Thus
tn (sn ) = uTn (at0 )d0 (t0 )un + o(1),
where un := (sn tn )/ sn tn . Since t0 Sv we may apply (61) and the limit of tn (sn )
cannot be non-positive.
A straightforward application of Theorem 5 is the following
Corollary 2 Under the hypotheses of Theorem 5, there exists positive constants C, such that,
for every u > 0 :
+

pE (x)dx P(M > u)

+
u

p(x)dx P(M > u) CP( > u),

where is a centered Gaussian variable with variance 1


The precise order of approximation of p(x) pM (x) or pE (x) pM (x) as x + remains
2 respectively which
in general an open problem, even if one only asks for the constants d2 , E
govern the second order asymptotic approximation and which are defined by means of

and

1
:= lim 2x2 log p(x) pM (x)
x+
d2

(62)

1
lim 2x2 log pE (x) pM (x)
2 := x+
E

(63)

whenever these limits exist. In general, we are unable to compute the limits (62) or (63) or
even to prove that they actually exist or differ. Our more general results (as well as in [3], [34])
only contain lower-bounds for the liminf as x +. This is already interesting since it gives
some upper-bounds for the speed of approximation for pM (x) either by p(x) or pE (x). On the
other hand, in Theorem 7 below, we are able to prove the existence of the limit and compute
d2 for a relevant class of Gaussian processes.

For the next theorem we need an additional condition on the parameter set S. For S
verifying A1 we define
\kappa(S) := \sup_{0\le j\le d_0}\ \sup_{t\in S_j}\ \sup_{s\in S,\, s\ne t}\ \frac{\mathrm{dist}\big( (t-s),\, C_{t,j} \big)}{\|s - t\|^2}, \qquad (64)

where dist is the Euclidean distance in Rd .


One can show that \kappa(S) < \infty in each one of the following classes of parameter sets S:
- S is convex, in which case \kappa(S) = 0.
- S is a C^3 manifold, with or without boundary.
- S verifies the following condition: For every t \in S there exists an open neighborhood V of t in R^d and a C^3 diffeomorphism \psi : V \to B(0, r) (where B(0, r) denotes the open ball in R^d centered at 0 and having radius r, r > 0) such that \psi(V \cap S) = C \cap B(0, r), where C is a convex cone.
However, \kappa(S) < \infty can fail in general. A simple example showing what is going on is the following: take an orthonormal basis of R^2 and put
S = \{(\lambda, 0) : 0 \le \lambda \le 1\} \cup \{(\lambda\cos\theta, \lambda\sin\theta) : 0 \le \lambda \le 1\},
where 0 < \theta < \pi, that is, S is the boundary of an angle of size \theta. One easily checks that \kappa(S) = +\infty. Moreover it is known [3] that in this case the EPC approximation does not verify a super-exponential inequality. More generally, sets S having whiskers have \kappa(S) = +\infty.
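The divergence of \kappa(S) for the angle can also be seen numerically: taking t on the horizontal edge and s on the oblique edge, the cone C_{t,1} computed from the definition above is the line spanned by the edge direction, and the ratio in (64) blows up as both points approach the vertex. The sketch below is illustrative only; the angle and the sequence of points are assumptions.

```python
import numpy as np

theta = 2.0                                    # assumed angle, 0 < theta < pi
e1 = np.array([1.0, 0.0])                      # direction of the horizontal edge
for lam in (0.1, 0.01, 0.001):
    t = lam * e1                               # point on the horizontal edge
    s = lam * np.array([np.cos(theta), np.sin(theta)])   # point on the oblique edge
    d = t - s
    dist_to_cone = abs(d[1])                   # distance of (t - s) to the line spanned by e1 (= C_{t,1})
    print(lam, dist_to_cone / np.linalg.norm(d)**2)      # grows like 1/lam, so kappa(S) = +infinity
```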
Theorem 6 Let X be a stochastic process on S satisfying A1 -A5. Suppose in addition that
Var(X(t)) = 1 for all t \in S and that \kappa(S) < +\infty.
Then
1
(65)
lim inf 2x2 log p(x) pM (x) 1 + inf 2
x+
tS + (t)2
t
t
with

Var X(s)/X(t), X (t)


(1 r(s, t))2
sS\{t}

t2 := sup
and
t := sup

dist 1
t r01 (s, t), Ct,j
1 r(s, t)

sS\{t}

(66)

where
t := Var(X (t))
(t) is the maximum eigenvalue of t
in (66), j is such that t Sj ,(j = 0, 1, . . . , d0 ).
The quantity in the right hand side of (65) is strictly bigger than 1.
Remark. In formula (65) it may happen that the denominator in the right-hand side is
identically zero, in which case we put +\infty for the infimum. This is the case of the one-parameter process X(t) = \xi\cos t + \eta\sin t, where \xi, \eta are standard independent Gaussian random variables, and S is an interval having length strictly smaller than \pi.


Proof of Theorem 6
Let us first prove that suptS t < .
For each t S, let us write the Taylor expansions
r01 (s, t) = r01 (t, t) + r11 (t, t)(s t) + O( s t 2 )
= t (s t) + O( s t 2 )

where O is uniform on s, t S, and


1 r(s, t) = (s t)T t (s t) + O( s t 2 ) L2 s t 2 ,
where L2 is some positive constant. It follows that for s S, t Sj , s = t, one has:
dist 1
t r01 (s, t), Ct,j

L3

1 r(s, t)

dist (t s), Ct,j


st 2

+ L4 ,

(67)

where L3 and L4 are positive constants. So,


dist 1
t r01 (s, t), Ct,j
which implies suptS t < .

L3 (S) + L4 .

1 r(s, t)

With the same notations as in the proof of Theorem 5, using (4) and (8), one has:

p(x) pM (x) = (x)

E 1IX (t)Cbt,0 . 1IM >x /X(t) = x


tS0

d0

+
j=1

Sj

E | det(Xj (t))| 1IX

j,N (t)Ct,j .

1IM >x /X(t) = x, Xj (t) = 0

(2)j/2 (det(Var(Xj (t))))1/2 j (dt) . (68)


Proceeding in a similar way to that of the proof of Theorem 5, an application of the Hölder
inequality to the conditional expectation in each term in the right-hand side of (68) shows that
the desired result will follow as soon as we prove that:

lim inf 2x2 log P {Xj,N


Ct,j } {M > x}/X(t) = x, Xj (t) = 0
x+

t2

1
,
+ (t)2t

for each j = 0, 1, . . . , d0 , where the liminf has some uniformity in t.


Let us write the Gaussian regression of X(s) on the pair (X(t), X (t))
X(s) = at (s)X(t) + bt (s), X (t) + Rt (s).
Since X(t) and X (t) are independent, one easily computes :
at (s) = r(s, t)
bt (s) = 1
t r01 (s, t).
Hence, conditionally on X(t) = x, Xj (t) = 0, the events

T
{M > x} and {Rt (s) > (1 r(s, t))x r01
(s, t)1
t Xj,N (t) for some s S}

20

(69)

coincide.
(t)|X (t) = 0) the regression of X (t) on X (t) = 0. So, the probability in
Denote by (Xj,N
j
j
j,N
(69) can written as

Cbt,j

P{ t (s) > x

T (s, t)1 x
r01

for some s S}pXj,N


(t)|Xj (t)=0 (x )dx
1 r(s, t)

(70)

where
t (s) :=

Rt (s)
1 r(s, t)

dx is the Lebesgue measure on Nt,j . Remember that Ct,j Nt,j .

If 1
t r01 (s, t) Ct,j one has

T
r01
(s, t)1
t x 0

for every x Ct,j , because of the definition of Ct,j .


/ Ct,j , since Ct,j is a closed convex cone, we can write
If 1
t r01 (s, t)

1
t r01 (s, t) = z + z

with z Ct,j , z z and z = dist(1


t r01 (s, t), Ct,j ).

So, if x Ct,j :
T (s, t)1 x
r01
z T x + z T x
t
=
t x
1 r(s, t)
1 r(s, t)

using that z T x 0 and the Cauchy-Schwarz inequality. It follows that in any case, if x Ct,j
the expression in (70) is bounded by
Cbt,j

P t (s) > x t x for some s S pXj,N


(t)|Xj (t)=0 (x )dx .

(71)

To obtain a bound for the probability in the integrand of (71) we will use the classical
inequality for the tail of the distribution of the supremum of a Gaussian process with bounded
paths.
The Gaussian process (s, t))
t (s), defined on (S S)\{s = t} has continuous paths. As
the pair (s, t) approches the diagonal of S S, t (s) may not have a limit but, almost surely,
it is bounded (see [8] for a proof). (For fixed t, t (.) is a helix process with a singularity at
s = t, a class of processes that we have already met above).
We set
mt (s) := E( t (s)) (s = t)
m := sups,tS,s=t |mt (s)|
:= E | sups,tS,s=t t (s) mt (s) | .
The almost sure boundedness of the paths of t (s) implies that m < and < . Applying the
Borell-Sudakov-Tsirelson type inequality (see for example Adler [2] and references therein) to
the centered process s
t (s)mt (s) defined on S\{t} , we get whenever xt x m > 0:
P{ t (s) > x t x for some s S}

P{ t (s) mt (s) > x t x m for some s S}


2 exp
21

(x t x m )2
.
2t2

The Gaussian density in the integrand of (71) is bounded by


(2j (t))

jd
2

x mj,N (t)

exp

2j (t)

(t)|X (t))
where j (t) and j (t) are respectively the minimum and maximum eigenvalue of Var(Xj,N
j

and mj,N (t) is the conditional expectation E(Xj,N (t)|Xj (t) = 0). Notice that j (t), j (t), mj,N (t)
are bounded, j (t) is bounded below by a positive constant and j (t) (t).

Replacing into (71) we have the bound :

P {Xj,N
Ct,j } {M > x}/X(t) = x, Xj (t) = 0

(2j (t))

jd
2

Cbt,j {xt x m>0}

exp

x mj,N (t) 2
(x t x m )2
dx
+
2t2
2(t)
xm

+ P Xj,N
(t)|Xj (t) = 0
, (72)
t

where it is understood that the second term in the right-hand side vanishes if t = 0.
Let us consider the first term in the right-hand side of (72). We have:
x mj,N (t)
(x t x m )2
+
2t2
2(t)

2
(x t x m )2 ( x mj,N (t) )
+
2t2
2(t)
(x m t mj,N (t) )2
2
,
= A(t) x + B(t)(x m ) + C(t) +
2t2 + 2(t)2t

where the last inequality is obtained after some algebra, A(t), B(t), C(t) are bounded functions
and A(t) is bounded below by some positive constant.
So the first term in the right-hand side of (72) is bounded by :
2.(2j )

jd
2

exp

(x m t mj,N (t))2
2t2 + 2(t)2t

Rdj

exp A(t) x + B(t)(x m ) + C(t)


L|x|dj1 exp

dx

(x m t mj,N (t) )2
2t2 + 2(t)2t

(73)

where L is some constant. The last inequality follows easily using polar coordinates.
Consider now the second term in the right-hand side of (72). Using the form of the conditional

density pXj,N
(t)/Xj (t)=0 (x ), it follows that it is bounded by
P

(Xj,N
(t)/Xj (t)

= 0)

mj,N (t)

x m t mj,N (t)
t

L1 |x|dj2 exp

(x m t mj,N (t) )2
2(t)2t

where L1 is some constant. Putting together (73) and (74) with (72), we obtain (69).
The following two corollaries are straightforward consequences of Theorem 6:
22

(74)

Corollary 3 Under the hypotheses of Theorem 6 one has


lim inf 2x2 log |pE (x) pM (x)| 1 + inf

tS 2
t

x+

1
.
+ (t)2t

Corollary 4 Let X a stochastic process on S satisfying A1 -A5. Suppose in addition that


E(X(t)) = 0, E(X 2 (t)) = 1, Var(X (t) = Id for all t S.
Then
+
1
pE (x)dx 1 + inf 2
.
lim inf 2u2 log P(M > u)
u+
tS t + 2
u
t
and

d0

(1)j (2)j/2 gj H j (x) (x).

p (x) =
j=0

where gj is given by (23) and H j (x) has been defined in Section 4.


The proof follows directly from Theorem 6 the definition of pE (x) and the results in [1].

Examples

1) A simple application of Theorem 5 is the following. Let X be a one parameter real-valued


centered Gaussian process with regular paths, defined on the interval [0, T ] and satisfying an
adequate non-degeneracy condition. Assume that the variance v(t) has a unique maximum, say
1 at the interior point t0 , and k = min{j : v (2j) (t0 ) = 0} < . Notice that v (2k) (t0 ) < 0. Then,
one can obtain the equivalent of pM (x) as x which is given by:
pM (x)

1 v (t0 )/2
1/k
kCk

E || 2k 1 x11/k (x),

(75)

1
v (2k) (t0 ) + 14 [v (t0 )]2 1Ik=2 . The
where is a standard normal random variable and Ck = (2k)!
proof is a direct application of the Laplace method. The result is new for the density of the
maximum, but if we integrate the density from u to +, the corresponding bound for P{M > u}
is known under weaker hypotheses (Piterbarg [28]).

2) Let the process X be centered and satisfy A1-A5. Assume that the the law of the process
is isotropic and stationary, so that the covariance has the form (10) and verifies the regularity
condition of Section 4. We add the simple normalization = (0) = 1/2. One can easily
check that
1 2 ( s t 2 ) 42 ( s t 2 ) s t 2
(76)
t2 = sup
[1 ( s t 2 )]2
sS\{t}
Furthermore if
(x) 0 for x 0

(77)

one can show that the sup in (76) is attained as s t 0 and is independent of t. Its value
is
t2 = 12 1.
The proof is elementary (see [4] or [34]).
Let S be a convex set. For t Sj , s S:
dist r01 (s, t), Ct,j = dist 2 ( s t 2 )(t s), Ct,j .
23

(78)

The convexity of S implies that (t s) Ct,j . Since Ct,j is a convex cone and 2 ( s t 2 ) 0,
one can conclude that r01 (s, t) Ct,j so that the distance in (78) is equal to zero. Hence,
t = 0 for every t S
and an application of Theorem 6 gives the inequality
lim inf
x+

1
2
.
log p(x) pM (x) 1 +
2

x
12 1

(79)

A direct consequence is that the same inequality holds true when replacing p(x) pM (x) by
|pE (x) pM (x)| in (79), thus obtainig the main explicit example in Adler and Taylor [3], or in
Taylor et al. [34].
Next, we improve (79). In fact, under the same hypotheses, we prove that the liminf is an
ordinary limit and the sign is an equality sign. We state this as
Theorem 7 Assume that X is centered, satisfies hypotheses A1-A5, the covariance has the
form (10) with (0) = 1/2, (x) 0 f or x 0. Let S be a convex set, and d0 = d 1.
Then
1
2
.
(80)
lim log p(x) pM (x) = 1 +
x+ x2
12 1
Remark Notice that since S is convex, the added hypothesis that the maximum dimension d0
such that Sj is not empty is equal to d is not an actual restriction.
Proof of Theorem 7
In view of (79), it suffices to prove that
lim sup
x+

1
2
log p(x) pM (x) 1 +
.
2

x
12 1

(81)

Using (4) and the definition of p(x) given by (8), one has the inequality
p(x) pM (x) (2)d/2 (x)

Sd

E | det(X (t))| 1IM >x /X(t) = x, X (t) = 0)d (dt),

(82)

where our lower bound only contains the term corresponding to the largest dimension and we
have already replaced the density pX(t),X (t) (x, 0) by its explicit expression using the law of the
process. Under the condition {X(t) = x, X (t) = 0} if v0T X (t)v0 > 0 for some v0 S d1 , a
Taylor expansion implies that M > x. It follows that
E | det(X (t))| 1IM >x /X(t) = x, X (t) = 0

E | det(X (t))| 1I

sup v T X (t)v > 0

/X(t) = x, X (t) = 0 . (83)

vS d1

We now apply Lemma 2 which describes the conditional distribution of X (t) given X(t) =
x, X (t) = 0 . Using the notations of this lemma, we may write the right-hand side of (83) as :
E | det(Z xId)| 1I

sup v T Zv > x

vS d1

which is obviously bounded below by


E | det(Z xId)| 1IZ11 >x

=
x

E | det(Z xId)|/Z11 = y (2)1/2 1 exp


24

y2
dy, (84)
2 2

where 2 := Var(Z11 ) = 12 1. The conditional distribution of Z given Z11 = y is easily


deduced from Lemma 2. It can be represented by the random d d real symmetric matrix

y
Z12
... ...
Z1d

2 + y Z23 . . .
Z2d

Z :=
,
..

.
d + y

where the random variables {2 , . . . , d , Zik , 1 i < k d} are independent centered Gaussian
with
Var(Zik ) = 4 (1 i < k d) ; Var(i ) =

4 1
16 (8 1)
(i = 2, . . . , d) ; =

12 1
12 1

Observe that 0 < < 1.


Choose now 0 such that (1+0 ) < 1. The expansion of det(Z xId) shows that if x(1+0 )
y x(1 + 0 ) + 1 and x is large enough, then
E | det(Z xId)| L 0 (1 (1 + 0 ))d1 xd ,

where L is some positive constant. This implies that

1
2

exp(
x

L
y2
)E | det(ZxId)| dy
2
2
2

x(1+0 )+1

exp(
x(1+0 )

y2
)0 (1(1+0 ))d1 xd dy
2 2

for x large enough. On account of (82),(83),(84), we conclude that for x large enough,
p(x) pM (x) L1 xd exp

x2 (x(1 + 0 ) + 1)2
+
.
2
2 2

for some new positive constant L1 . Since 0 can be chosen arbitrarily small, this implies (81).
3) Consider the same processes of Example 2, but now defined on the non-convex set {a
t b}, 0 < a < b. The same calculations as above show that t = 0 if a < t b and
t = max

2a (2a2 (1 cos ))(1 cos)


2 (z 2 )z
,
sup
,
2
1 (2a2 (1 cos ))
z[2a,a+b] 1 (z ) [0,]
sup

for t = a.
4) Let us keep the same hypotheses as in Example 2 but without assuming that the covariance is decreasing as in (77). The variance is still given by (76) but t is not necessarily equal
to zero. More precisely, relation (78) shows that
t sup 2
sS\{t}

( s t 2 )+ s t
1 ( s t 2 )

The normalization: = 1/2 implies that the process X is identity speed, that is
Var(X (t)) = Id so that (t) = 1. An application of Theorem 6 gives
lim inf
x+

where

2
log p(x) pM (x) 1 + 1/Z .
x2

(85)
2

4 (z 2 )+ z
1 2 (z 2 ) 42 (z 2 )z 2
+
max
,
[1 (z 2 )]2
z(0,] [1 (z 2 )]2
z(0,]

Z := sup
and is the diameter of S.
5) Suppose that

25

the process X is stationary with covariance (t) := Cov(X(s), X(s + t)) that satisfies
(s1 , . . . , sd ) = i=1,...,d i (si ) where 1 , ..., d are d covariance functions on R which are
monotone, positive on [0, +) and of class C 4 ,
S is a rectangle

S=

[ai , bi ] , ai < bi .
i=1,...,d

Then, adding an appropriate non-degeneracy condition, conditions A2-A5 are fulfilled and Theorem 6 applies
It is easy to see that

1 (s1 t1 )2 (s2 t2 ) . . . d (sd td )

..
r0,1 (s, t) =

.
1 (s1 t1 ) . . . d1 (sd1 td1 ).d (sd td )

belongs to Ct,j for every s S. As a consequence t = 0 for all t S. On the other hand,
standard regressions formulae show that
2
2
2
2
2
1 21 . . . 2d 2
Var X(s)/X(t), X (t)
1 2 . . . d 1 . . . d1 d
=
,
(1 r(s, t))2
(1 1 . . . d )2

where i stands for i (si ti ). Computation and maximisation of t2 should be performed


numerically in each particular case.

References
[1] Adler, R.J. (1981). The Geometry of Random Fields. Wiley, New York.
[2] Adler, R.J. (1990). An Introduction to Continuity, Extrema and Related Topics for General
Gaussian Processes. IMS, Hayward, Ca.
[3] Adler, R.J. and Taylor J. E.(2005). Random fields and geometry. Book to appear.
[4] Azas J-M., Bardet J-M. and Wschebor M. (2002). On the Tails of the distribution of the
maximum of a smooth stationary Gaussian Process. Esaim: P. and S., 6,177-184.
[5] Azas, J-M. and Delmas, C. (2002). Asymptotic expansions for the distribution of the
maximum of a Gaussian random fields. Extremes (2002)5(2), 181-212.
[6] Azas, J-M and Wschebor, M. (2002). The Distribution of the Maximum of a Gaussian
Process: Rice Method Revisited, In and out of equilibrium: probability with a physical
flavour, Progress in Probability, 321-348, Birkha
user.
[7] Azas J-M. and Wschebor M (2001). On the regularity of the distribution of the Maximum
of one parameter Gaussian processes Probab. Theory Relat. Fields, 119, 70-98.
[8] Azas J-M. and Wschebor M (2005). On the Distribution of the Maximum of a Gaussian
Field with d Parameters. Annals Applied Probability, 15 (1A), 254-278.
[9] Azas J-M. and Wschebor, M. (2006). A self contained proof of the Rice formula for random
fields. Preprint available at http://www.lsp.ups-tlse.fr/Azais/publi/completeproof.pdf.
[10] Belyaev, Y. (1966). On the number of intersections of a level by a Gaussian Stochastic
process. Theory Prob. Appl., 11, 106-113.
26

[11] Berman, S.M. (1985a). An asymptotic formula for the distribution of the maximum of a
Gaussian process with stationary increments. J. Appl. Prob., 22,454-460.
[12] Berman, S.M. (1992). Sojourns and extremes of stochastic processes, The Wadworth and
Brooks, Probability Series.
[13] Borell, C. (1975). The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30,
207-216.
[14] Borell, C. (2003). The Ehrhard inequality. C.R. Acad. Sci. Paris, Ser. I, 337, 663-666.
[15] Cramer, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes, J.
Wiley & Sons, New-York.
[16] Cucker, F. and Wschebor M. (2003). On the Expected Condition Number of Linear Programming Problems, Numer. Math., 94, 419-478.
[17] Fernique, X.(1975). Regularite des trajectoires des fonctions aleatoires gaussiennes. Ecole
dEte de Probabilites de Saint Flour (1974). Lecture Notes in Mathematics, 480, SpringerVerlag, New-York.
[18] Fyodorov, Y. (2006). Complexity of Random Energy Landscapes, Glass Transition and
Absolute Value of Spectral Determinant of Random Matrices Physical Review Letters v. 92
(2004), 240601 (4pages); Erratum: ibid. v.93 (2004),149901(1page)
[19] Kendall, M.G., Stuart,A. and Ord, J.K. (1987). The Advanced Theory of Statistics, Vol. 3.
[20] Landau, H.J. and Shepp, L.A (1970). On the supremum of a Gaussian process. Sankya
Ser. A 32, 369-378.
[21] Ledoux, M. (2001). The Concentration of Measure Phenomenon. American Math. Soc.,
Providence, RI.
[22] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces, Springer-Verlag,
New-York.
[23] Marcus, M.B. (1977). Level Crossings of a Stochastic Process with Absolutely Continuous
Sample Paths, Ann. Probab., 5, 52-71.
[24] Marcus, M.B. and Shepp, L.A. (1972). Sample behaviour of Gaussian processes. Proc.
Sixth Berkeley Symp. Math. Statist. Prob., 2, 423-442.
[25] Mehta,M.L. (2004). Random matrices, 3d-ed. Academic Press.
[26] Mercadier, C. (2006). Numerical bounds for the distribution of the maximum of one- and
two-dimensional processes, to appear in Advances in Applied Probability, 38, (1).
[27] Piterbarg, V; I. (1981). Comparison of distribution functions of maxima of Gaussian
processes. Th, Proba. Appl., 26, 687-705.
[28] Piterbarg, V. I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and
Fields. American Mathematical Society. Providence. Rhode Island.
[29] Rychlik, I. (1990). New bounds for the first passage, wave-length and amplitude densities.
Stochastic Processes and their Applications, 34, 313-339.
[30] Sudakov, V.N. and Tsirelson, B.S. (1974). Extremal properties of half spaces for spherically
invariant measures (in Russian). Zap. Nauchn. Sem. LOMI, 45, 75-82.
27

[31] Sun, J. (1993). Tail Probabilities of the Maxima of Gaussian Random Fields, Ann. Probab.,
21, 34-71.
[32] Talagrand, M. (1996). Majorising measures: the general chaining. Ann. Probab., 24,
1049-1103.
[33] Taylor, J.E. and Adler, R. J. (2003). Euler characteristics for Gaussian fields on manifolds.
Ann. Probab., 31, 533-563.
[34] Taylor J.E., Takemura A. and Adler R.J. (2005). Validity of the expected Euler Characteristic heuristic. Ann. Probab., 33, 4, 1362-1396.
[35] Wschebor, M. (1985). Surfaces aleatoires. Mesure geometrique des ensembles de niveau.
Lecture Notes in Mathematics, 1147, Springer-Verlag.

28

1,2

rqsrtsrqrqssq r
sqrrrrsrtqsrqsr rr
qrq

rssqrrsqsrqq
s

qsqqsqr qqsrqqsrrs q
rrqqrs srq rqrsqtqrq qqqsqs
qqs rqr q tqsqr srqqsqqr
r rrqsq t rqr

s
q

sq rrr q [0, L] qrq


rqrrqr r qq qsqss rqr q rq qsrs qsrsqsqsrqr q
d
srq dqsrr q
d q [0,L] rs r rd qsrqq qqsrrr s qsqqqqrq q
s srrrq

s
q

q
s
q

s
q

s
q

n
qsrqrqrr rq sqsqMqrqqrsqrsrrqq
rrqqrqsq rqqsq
q r srq srqsr sq
s q r q r qs qs rqrqq r s
rqqssr qsqq rqrqsqsrqq srqsqrqstqq
qqqrrsqsq sqsqqsrsqsqs qrqrrqsq
r r s r(t)
= exp ( |t|) > 0 = 2 rq

s
q

qs r qsrq qs

rs rqsqrrrqqrqs qqqr rqq


Mq r qqqsqstq rqs r rqqM q
qqqqsqsrq sq
rqqsqrsqrrrqqsrrq rqs srqr

s
q
q

rrqqqsqqrq q r qqsqsq qs
s sqqr rsqrrrq qr q srsqsqqqrsrsqsqrq

s
q

s
q

rrqr sq
qs qsrqrrq
qsq qq rqsr
qsqq qs

d0
1

srqr rqqss qqsqs qq rrrqrqrqqsqqqs


rsrqrq sq
qrqq rrq rqsqr r
sq rr qqqq
rqrrqqs
qsrqrrrq
qsqqq sqs



qs qsr rs qsq qsrrq rqrqAqsr
B
A
sqqsrrqss (A B)A
Brrqrq nqq

A rrr r q
rqsqs A r B rrrrq
srrrrqsrq qsrqsrqsqqrqq r rqqsqsqr rrr
r ssq M q Lr n r
rrqrqs n n r (d ) ; d < d < . . . < d
sqqrq qsqqqqqsrrqsrrqs

rq qs rrr d s
sqrqqqs A r

rrrq q rrr d s
sqrqqqs B r
qrqqq q qrr rssqs rqrqsr rqsq
r B qsr qrqrqsqrqArr qsqsqrrqsqsq s qqsr
rr qsqrt srr rrqs q d drqsqsrr
|d d |

1 exp(2 |d d |)

i,n i=1,...,Mn

1,n

2,n

Mn ,n

i,n

i,n

R(d, d ) =

s
q

rr d [0, L] rq q Y
q
k rqrr qsqr qqr
q r

th

Yk = + Xk (d0 ) a/2 + k ,

q
a rq

k rrq q
X (d ) =
k rqs q
sr r qssr
q ( )
s
qsqr rrq r (F )
n qq
d Cn
rqq d := inf d
qssqqstr qqsq C qqrrqqrq r
M
n
qqr

qs ,
a = a = n
qssr q
{ Y , X (d ), ..., X (d
) , k = 1, ..., n}.

= a = n
, rqs d [0, L] qs
r q qsq r q q qs rr
qssqrqqrr q qqqqsqqqr a r rq qr
d
q rr rsrqqr qds qsrq aqrqq
qsrqrqsr
Y
Y
S (d) :=

th

1
1

th

1kn

1iMn

i+1,n

i,n

1/2

1,n

Mn ,n

1/2

k [Xk (d)=1]

n,1

k=1
n

k [Xk (d)=1]

k=1
n

[Xk (d)=1]

k=1

[Xk (d)=1]

k=1

s rqqrqrrqrrqs {X(d)r=s}q rrq


ssq sq n2 qrrsqsqq
rqsqsrq rqr qrqqrqqrq Y qs
2
2
S (d) :=

.
Y Y
Y Y
n
n
rq d rrqqr X (d) rqqs
qsqssr qrr q

[Xk (d)=]

n,2

[Xk (d)=1]

k=1

[Xk (d)=1]

k=1

d
n

d+
n

d
n = sup {di,n : di,n < d; i = 1, . . . , Mn } ,

qr a rq rqsr q rqsrrr rqs S(d)qs q


S (d )
rd S d (d) dr d Ssqq (S (d)) rqr qsqsq
rqs r rqqsqq ssrqq
r rqs qt qrr r S t(d) S(d ) qsqr r r
rrrqsqq r qs rqsq r sr r q qsq
(F )
d+
n = inf {di,n : di,n > d; i = 1, . . . , Mn } .

n,2

+
n

1,n

Mn ,n

n,2

n,2

n,2

d[0,L]

n,2

n,2

n,2

+
n

Xn (d) := Sn,2 (d) V ar(Sn,2 (d))

21

s Varqqqqqqsrrqs rr qqrqrqsq q qst


(X(d))
r(t) := cov (X(d), X(d + t)) = exp(2|t|)

m(d) =
exp(2|d d|)
2



qsqrs rqqsqsqsr T = sup |X (d)| r
q qsqsrrq rqsq qsqrqqrqq

d[0,L]

d[0,L]

r sqsqqqrq rqqrqssq qqs q


rqqs =0 = 0 rsq rsr qqrq
ssrq
rqrqqrq d q
d
q sqsq rq rqqq (X (d))
rr qqrr q q qqqsq rr

(X (d))

d[0,L]
2

d[0,L]

X (u) (d u)du = (X ) (d).


qsr qsqsqs T = sup |X (d)|
qq
qsrqrqq rrqsqsqr rq rrqqq (X(d))r (X(d))

m (d) = (m ) (d) =
Xn (d) =

d[0,L]

d0 22 d

r (d) =

d[0,L]

d[0,L]

exp 2(d0 + 2 + d)

+ exp 2(d0 + 2 d)

d0 + 22 d

(u) (d v)r(u v)dudv =


2

d 42

= exp(2(22 d))

2 (d v) r(v) dv

+ exp(2(22 + d))

(22 ) 2

qqsqsqrqs (t) =

d + 42
1

(22 ) 2

1 (u)du

= 1

qsq rqsr (Y(d))


q r
qsrqsrqrqqrqqssrqsrrqsqq (X(d))

q q qrq qsq
qsqrqq (Y(d))

t , t ; s , s [0, L] rqr YC(t ), Y (t ); Y (s ), Y (s ) q

d[0,L]

d[0,L]

d[0,L]

srqs q
r qsYqrqrrq qq ps (Y(d)
rqqr r q sss qs qsr(Y(d)))

r |Y | = sup |Y (d)| qrr t r rq


rq qr rq stqr q q

u 0, ({|Y | > u}) = ({|Y (0)| > u} {|Y (0)| u ; (U + D ) 1}) ,


qqsqrqsq U Dqq sqr s q qsqrq
u
u
Y
[0, L]

t1 ,t2 ;s1 ,s2

d[0,L]

d[0,L]

Uu = # {d [0, L] ; Y (d) = u; Y (d) > 0}

q s

rq

() 21 [ (( 1))] ( 1) () .
r (2) = (U + D )
[(U + D )(U + D 1)]
(U + D )

(|Y | > u) (U + D )
[(U + D ) (U + D 1)] = (U (U 1))+ (D (D 1))+2 (U D ) .
rqrt r r qr
q q
U
= I (u) =
dt
dx
yp (x, u; y)dy
|y|p (x, u; y)dy
dx
dt
D
= I (u) =
(U (U 1)) = I (u) =
dt dt
yyp
(u, u; y , y )dy dy
(D (D 1)) = I (u) =
dt dt
|y y | p
(u, u; y , y )dy dy

Du = # {d [0, L] ; Y (d) = u; Y (d) < 0} .

|Y (0)|u

|Y (0)|u

|Y (0)|u

u |Y (0)|u

0,t;t

u |Y (0)|u

0,t;t

1 2 t1 ,t2 ;t1 ,t2

1 2

t1 ,t2 ;t1 ,t2

|y y | p
(u, u; y , y )dy dy ,
dt dt
(U D ) = I (u) =
rq psrq srqs (Y(0), Y (t),
Y (t)) qsqqqr

ts rrt
(|Y (0)| > u) + I (u) + I (u) I (u) + I (u)2 + 2I (u) (|Y | > u)
(|Y (0)| > u) + I (u) + I (u).

rq s
r q q
s Y = sup Y (d) qs (3) qsr
(Y > u) (Y (0) > u) + (U 1) (Y (0) > u) + (U ) .
rq qrrr
qqq s qs r|Y| rq Y {|Y
(0)| u}
qrrqssqsqrq
u r qsrqrqqqs
rrqsqs s r rqsrrq
rqtr r srrqqq
rqrq

qrqqsq (Y (d))
Y Y

qrqssr qqsrr
2 [ (Y (0) > u) + I (u)] I (u) I (u) (|Y | > u) 2 [ (Y (0) > u) + I (u)] .


qqsrq sqqsqsq rqr qqs tr
sq rqrr r rsqsqrsqqr q
qsqsqsqrq rt qrq
L

1 2

t1 ,t2 ;t1 ,t2

0,t;t

d[0,L]

d[0,L]

sq
rr qs


r



q



qr ==100



qr = 10

q

rsr = 0
rrssrr = 10
= 10

(X (d))
= 0 (X (d))

r
r
r 10

s
q

s
q

qrq rrqs
qs qsqq rqrqrr q
rqqqr qqs
qssqsq rq qr

qqsq qsr r tr
r r s rr rs rqts rq

s
q

q qsqqsqsrrqqsqsqsq rqsqst sqqsr


qsrqqsqsr rsrrqqsqr
rqqqsr qqsrqrqqqrrqsrq
=6
sq dr=qs0.4rqqs rrt
2

d[0,L]

d[0,L]

rqs qqqsrqsq sqrrqqrrs r q


10 qsrrqq
qqqsqrrqrqqsrr
rtrqr rqsr s sq qsr rt r
q rt
rq
qssqq
10
10





> u)

(U (|Y+ D(0)|

2
(U (U 1))

2
(D (D 1)) 10 10

2
10 10

(U D )
r

r 71.37 0.88 72.53 0.87 68.99 0.91



= 6

d = 0.4 r r


r r

10

u
2

|Y (0)|u

q srsrqqrqsrq qsqsr
rqs qqq qr
= 10
qsrqsq sqr rqsqrqqsrsqq
qss r rqrq rr q r q q qsr qsrqs rqsr
= 10 , 10

qrqsrqr = 10 , 10 r rqqs

rqsr s sqrqq qqsrq qrr qsqs rr
10
=

sqr qqs r r


rrqsq s s srqst q
qsrrq


r




r r

10

sqrrqr
r rqsqrrq srrsr
qsrsqq rqsr
qsqrrrqq rq
q rr rq q q rqrrrqsq qr = 10 , 10
q = 10 , 10 r rqrq
2

rqrqqqs

q r qsqsqqqs qst sr r rqsrqr r

qq qsrqs rrqq r rqrr r sqqsq


q qqs qqq r qsr
q q qqs qsqq r qs r


q qq srqsrq ts q qsq q qsqr
sqsq
rrqsr r s
r q qs r sq qs

q r qs r r rrq
srqr ts q qrqqsq q r rqsrr

r s
rq qsr q q rqrqq
rq qsr q q rqrqq
qq s qqsr s r rq t s qq

qsrqsqs r r qsr r qqqs q q r


r q rq q r rq
r q rrq qqsr

rsq rrq q rq r

q qrq q r r r sqs r
r sr s s

r r qs srq
q q s qqsrq qsrs qq qsr rq

r r sr sq
r qq rq
r qq rq
rqsrqsrrq r
q sqrsqq srt
srr q qrq

A self contained proof of the Rice formula for random fields


Jean-Marc Azas , azais@cict.fr

Mario Wschebor , wschebor@cmat.edu.uy

December 22, 2006

AMS subject classification: Primary 60G70 Secondary 60G15


Key words and phrases: Gaussian fields, Rice Formula,
After an elementary proof of the Area Formula, we give a proof of Rice Formula for the
expectation of the number of roots of a random system of equations. We provide a complete
proof which is new and quite elementary, and in any case shorter than previous ones (see for
example [1]).
Similar formulae hold true for higher order factorial moments of the number of roots (Theorem 2). Theorem 3 provides a formula for the expectation of the total weight, when random
weights are put in each root.

The Area formula

We begin with a proof of the so-called Area formula, under conditions that will be sufficient for
our main purpose. One can find this formula in its full generality in Federer [3] Th 3.2.5
For any function f , we denote Nuf (T ) the number of roots of the equation f (t) = u that
belong to the subset T of the domain of f .
Proposition 1 (Area formula) Let f be a C 1 function defined on an open subset U of Rd
taking values in Rd . Assume that the set of critical values of f has zero Lebesgue measure.
Let g: Rd R be continuous and bounded. Then

Rd

g(u)Nuf (T )du =

| det(f (t))|g(f (t))dt.

(1)

for any Borel subset T of U , whenever the integral in the right-hand side is well defined.
Proof. Notice first that, due to standard extension arguments, it suffices to prove (1) for
non-negative g and for T a compact parallelotope contained in U . Second, if T is a compact
parallelotope, since f is C 1 , the set of boundary values of f , that is, f (T ) has Lebesgue measure
zero.
We next define an auxiliary function (u) for u Rd in the following way:
If u is neither a critical value nor a boundary value and n := Nuf (T ) is non zero, we denote
by x(1) , . . . , x(n) the roots of f (x) = u belonging to T . Using the local inversion theorem, we
know that there exists some > 0 and n neighborhoods U1 , ..., Un of of x(1) , . . . , x(n) such that:

This work was supported by ECOS program U03E01.


Laboratoire de Statistique et Probabilites. UMR-CNRS C5583 Universite Paul Sabatier. 118, route de
Narbonne. 31062 Toulouse Cedex 4. France.

Centro de Matem
atica. Facultad de Ciencias. Universidad de la Rep
ublica. Calle Igua 4225. 11400 Montevideo. Uruguay.

1. f is a C 1 diffeomorphism Ui B(u; ), the ball centered at u with radius .


2. U1 , ..., Un are pairwise disjoint and included in T .
3. if t
/

n
i=1 Ui ,

then f (t)
/ B(u; ).

The compactness implies that n is finite.


In this case, we define
(u) := sup{ > 0 : (1), (2), (3) hold true for all }.
If u is a critical value or a boundary value we set (u) := 0.
If Nuf (T ) = 0, we put
(u) := sup{ > 0 : f (T ) B(u; ) = }.
It is clear that in this case (u) > 0.
The function (u) is Lipschitz. In fact, let u be a value of f which is not a critical value nor
a boundary value, if u belongs to B(u; (u)), then B(u ; (u) u u ) B(u; (u)) and as
a consequence (u ) (u) u u . Exchanging the roles of u and u , we get
|(u ) (u)| u u .
The Lipschitz condition is easily checked in the other two cases.
Let now F be a real-valued monotone continuous function defined on R+ such that
F 0 on [0, 1/2],
F 1 on [1 + ).

(2)

Let (u) > 0 and 0 < < (u). Using the change of variable formula we have
n

| det(f (t))| 1I
T

f (t)u

| det(f (t))|dt = V ()n,

< dt =
i=1

Ui

where V () is the volume of the ball with radius in Rd . Thus, we have an exact counter for
Nuf (T ) when it is non-zero, which obviously holds true also when Nuf (T ) = 0 for < (u)
Let g : Rd R continuous, bounded and non-negative and 0 > 0. For every < 0 /2 we
have:
Rd

g(u)Nuf (T )F

(u)
0

du =

(u)
0

g(u) F
Rd

du

1
V ( )

| det(f (t))| 1I
T

f (t)u <

dt

Applying Fubinis Theorem we see that the expression above is equal to:
A0 , :=

| det(f (t))| dt
T

1
V ( )

F
B(f (t); )

(u)
0

g(u)du.

A0 , in fact does not depend on so it is equal to its limit as 0 which is, because of the
continuity of the function u

(u)
0

g(u), equal to

| det(f (t))|F
T

(f (t))
0

g(f (t))dt.

Let now 0 tend to zero and use monotone convergence. For the left-hand side, we take into
account that the set of critical values and the set of boundary values have measure zero. For
2

the right-hand side, we use the definition of F, that the boundary of T has Lebesgue measure
zero and the integrand is zero if t is a critical point of f .
Remarks:
1: By standard extension arguments the continuous function g can be replaced by the
indicator function of a Borel set say B. Formula (1) can be rewritten as
h(t, u)du =
Rd

| det(f (t))|h((t, f (t))dt,

(3)

Rd

tZ 1 (u)

where h is the function (t, u)


1ItT g(u). Again by a standard approximation argument (3)
holds true for every bounded Borel function h such that the right-hand side of (3) is well-defined.
2:
2. Notice that the hypothesis that the set of critical values has zero Lebesgue measure is
unnecessary, since this follows from the fact that f is C1. The statement above is sufficient to
prove Rice formula, but the interested reader can prove this as an exercise. On the other hand,
one can prove the result under weaker hypotheses: it suffices f to be Lipschitz ([3]).

Rice formulae

Theorem 1 (Rice formula) Let Z : U Rd be a random field, U an open subset of Rd and


u Rd a fixed point in the codomain. Assume that:
(i) Z is Gaussian,
(ii) almost surely the function t
Z(t) is of class C 1 ,
(iii) for each t U , Z(t) has a non degenerate distribution (i.e. Var Z(t)
0),
(iv) P{t U, Z(t) = u, det Z (t) = 0} = 0
Then, for every Borel set B contained in U , one has
E NuZ (B) =

E | det(Z (t))|/Z(t) = u pZ(t) (u)dt.

(4)

If B is compact both sides in (4) are finite.


Theorem 2 Let k, k 2 be an integer. Assume the same hypotheses as in Theorem 1 except
for (iii) that is replaced by
(iii) for t1 , ..., tk U pairwise different values of the parameter, the distribution of
Z(t1 ), ..., Z(tk )
does not degenerate in (Rd )k . Then for every Borel set B contained in U , one has
E

NuZ (B) NuZ (B) 1 ... NuZ (B) k + 1


k

E
Bk

| det Z (tj ) |/Z(t1 ) = ... = Z(tk ) = u


j=1

pZ(t1 ),...,Z(tk ) (u, ..., u)dt1 ...dtk , (5)


where both members may be infinite.
Theorem 3 Let Z be a random field that verifies the hypotheses of Theorem 1. Assume that
for each t U one has another random field Y t : W Rd , where W is some topological space,
verifying the following conditions:

a) Y t (w) is a measurable function of (, t, w) and almost surely, (t, w)


ous.

Y t (w) is continu-

Z(s), Y t (w) defined on U W is Gaussian.

b) For each t U the random process (s, w)

Moreover, assume that g : U C(W, Rd ) R is a bounded function, which is continuous when


one puts on C(W, Rd ) the topology of uniform convergence on compact sets. Then, for each
compact subset I of U , one has
g(t, Y t ) =

tI,Z(t)=u

E | det(Z (t)|g(t, Y t )/Z(t) = u).pZ(t) (u)dt.

(6)

Sufficient conditions for (iv) are given in the following proposition


Proposition 2 Let Z : U Rd , U a compact subset of Rd be a random field with paths of
class C 1 and u Rd . Assume that
pZ(t) (x) C for all t U and x in some neighborhood of u.
at least one of the two following hypotheses is satisfied:
a) a.s. t

Z(t) is of class C 2

b)
() =

sup

P | det(Z (t))| < /Z(t) = x 0

tU,xV (u)

as 0, where V (u) is some neighborhood of u.


Then (iv) holds true.
Proof. If condition a) holds true, the result is Lemma 5 in Cucker and Wschebor [2].
To prove it under condition b), assume with no loss of generality that I = [0, 1]d and that
u = 0. Put GI = t I, Z(t) = 0, det Z (t) = 0 Choose > 0, > 0; there exists a positive
number M such that
P(EM ) = P sup Z (t) > M .
tI

Denote by det the modulus of continuity of | det(X (.))| and choose m large enough so that

d
P(Fm, ) = P det (
) .
m
Consider the partition of I into md small cubes with sides of length 1/m. Let Ci1 ...id be such a
cube and ti1 ...id its centre (1 i1 , ..., id m). Then
c
c
P GCi1 ...id EM
Fm,

P(GI ) P(EM ) + P(Fm, ) +


1i1 ...id m

When the event in the term corresponding to i1 ...id of the last sum occurs, we have:
|Zj (ti1 ...id )|

M
d j = 1, ..., d
m

where Zj denotes the j-th coordinate of Z, and:


det Z (ti1 ...id )

< .

(7)

So, if m is chosen sufficiently large so that V (0) contains the ball centered at 0 with radius
M d
m , one has:
2M d
P(GI ) 2 + md (
d) C()
m
Since and are arbitrarily small, the result follows.
Remark:
With the hypotheses of Theorem 1 it follows easily that if J is a subset of U , d (J) = 0,
then P{NuZ (J) = 0} = 1 for each u Rd .
Proof of Theorem 1
Let F : R+ [0, 1] be the function defined in (2), For m, n positive integers and x 0, define:
Fm (x) := F(mx) ; Gn (x) := 1 F(x/n).

(8)

A standard extension argument says that it is enough to prove the theorem when B is a
compact rectangle included in U . So we assume that this is the case. Let us introduce some
more notations:
(t) := | det(Z (t))| (t U )
For n, m positive integers and u Rd :
Cum (B) :=

Fm ((s)).

(9)

sB:Z(s)=u
m
m
Qn,m
u (B) := Cu (B)Gn (Cu (B)).

(10)

In (9) when the summation index set is empty, we put Cum (B) = 0. Let g : Rd R be
continuous with compact support . We apply the area formula (3) for the function
h(t, u) = Fm ((t))Gn (Cum (B))g(u) 1ItB
to get:
Rd

g(u)Qn,m
u (B)du =

m
(t) Fm ((t)) Gn (CZ(t)
(B)) g(Z(t))dt.

Taking expectations on both sides :

Rd

g(u) E(Qn,m
u (B))du =

g(u)du
Rd

E (t) Fm ((t))Gn (Cum (B))/Z(t) = u pZ(t) (u)dt.

Since this equality holds for any g continuous with bounded support, it follows that
E(Qn,m
u (B)) =

E (t)Fm ((t)Gn (Cum (B))/Z(t) = u pZ(t) (u)dt,

(11)

holds true for almost every u Rd .


Let us prove that the left-hand side of (11) is a continuous function of u. Fix u Rd . Outside
the compact set
{t B : (t) 1/2m},
the contribution to the sum (9) defining Cvm (B) is zero, for any v Rd . Using the local inversion
theorem, the number of points t B such that Z(t) = u; (t) 1/2m, say k, is finite. Notice
that almost surely there is no such point in the boundary of B.
If k is non-zero, Z(t) is locally invertible in k neighborhoods V1 , . . . , Vk B around these k
points. For v in some (random) neighborhood of u, there is exactly one root of Z(s) = v in
each V1 , ..., Vk and the contribution to Cvm (B) of these points can be made arbitrarily close to
5

the one corresponding to v = u. Outside the union of V1 , ..., Vk , Z(t) u is bounded away from
zero in B, so that the contribution to Cvm (B) vanishes if v is sufficiently close to u.
This shows that a.s., the function v
Qn,m
is continuous at v = u. On the other hand, it is
v
obvious from its definition that Qn,m
v (B) n and an application of the Lebesgue dominated
convergence theorem implies the continuity of E(Qn,m
u (B)) as a function of u.
Let us now write the regression formulae for fixed t B :
Z(s) = at (s)Z(t) + Z t (s)
Z (s) = (at ) (s)Z(t) + (Z t ) (s),

(12)

where denotes the derivative with respect to s and the pair Z t (s), (Z t ) (s) is independent
from Z(t) for all s U .
Then, we write the conditional expectation on the right-hand side of (11) as the unconditional
expectation :
E tu (t)Fm (tu (t))Gn (Cum (B)) ,
(13)
where we use the notations
tu (s) := | det(Zut ) (s)|
Zut (s) := at (s)u + Z t (s)
Fm tu (s) .

Cum (B) :=
sB,Zut (s)=u

Now, observe that (11) implies that for almost every u Rd one has the inequality
E(Qn,m
u (B))

E (t)/Z(t) = u pZ(t) (u)dt,

(14)

which is in fact true for all u Rd since both sides are continuous functions of u.
The remainder of the proof consists in proving the converse inequality. Let us fix n, m, u
and t. Let K be the compact set
K := {s B : tu (s) 1/4m}
If v varies in a sufficient small (random) neighborhood of u, the points outside K do not
contribute to the sum defining Cvm (B).
Let k the almost surely finite number of roots of Zut (s) = u lying in the set K. Assume that
k does not vanish and denote these roots by s1 , . . . , sk . Consider the equation
Zvt (s) v = 0.

(15)

in a neighborhood of each one of the pairs s = si , v = u. Applying the Implicit Function


Theorem, one can find k pairwise disjoint open sets sets V1 , ...Vk such that if v is sufficiently
close to u, equation (15) has exactly one root si = si (v) in Vi , 1 = 1, . . . , k. These roots vary
continuously with v and si (u) = si . On the other hand on the compact set K\(V1 ... Vk )
the quantity Zut (s) u is bounded away from zero so Zvt (s) v does not vanishes if v is
sufficiently close to u. As a conclusion, we have that
lim sup Cvm (B) Cum (B)
vu

where the inequality arises from the fact that some of the points si (v) may not belong to B and
hence, dont contribute to the sum defining Cvm (B). Now since (11) holds for a.e. u, one can
find a sequence {uN , N = 1, 2, . . .} converging to u such that (11) holds true for u = uN and
6

all N = 1, 2, .... Using the continuity -already proved- of u E(Qn,m


u (B)), Fatous Lemma and
the fact that Gn is non-increasing, we have :
E Qn,m
u (B) =

lim E Qn,m
uN (B)

N +

lim

N + B

E tuN (t)Fm (tuN (t)Gn CumN (B) pZ(t) (uN )dt

E tu (t)Fm (tu (t)Gn Cum (B) pZ(t) (u)dt.

Since Cum (B) is a.s. finite, we can now pass to the limit as n +, m + in that order
and applying Beppo-Levis Theorem, conclude the proof.

Proof of Theorem 2:
For each > 0, define the domain
Dk, (B) = {(t1 , ..., tk ) B k , ti tj if i = j, i, j = 1, ..., k}
and the process Z
(t1 , ..., tk ) Dk, (B)

Z(t1 , ..., tk ) = Z(t1 ), ..., Z(tk ) .

It is clear that Z satisfies the hypotheses of Theorem 1 for every value (u, ..., u) (Rd )k . So,
e

Z
E N(u,...,u)
Dk, (B)
k

E
Dk, (B)

| det Z (tj ) |/Z(t1 ) = ... = Z(tk ) = u pZ(t1 ),...,Z(tk ) (u, ..., u)dt1 ...dtk (16)
j=1

To finish, let 0, note that


NuZ (B) NuZ (B) 1 ... NuZ (B) k + 1
is the monotone limit of

Z
N(u,...,u)
Dk, (B) ,

and that the diagonal Dk (B) = (t1 , ..., tk ) B k , ti = tj for some pair i, j, i = j has zero Lebesgue
measure in (Rd )k .

Proof of Theorem 3:
The proof is essentially the same. It suffices to consider instead of Cum (B) the quantity
Cum (I) :=

Fm ((s)).gs (s, Y s ).
sI:Z(s)=u

(17)

References
[1] Azas J-M. and Wschebor M (2005). On the Distribution of the Maximum of a Gaussian
Field with d Parameters. Annals Applied Probability, 15 (1A), 254-278.
[2] Cucker, F. and Wschebor M. (2003). On the Expected Condition Number of Linear Programming Problems, Numer. Math., 94, 419-478.
[3] Federer, H. (1969). Geometric measure theory. Springer-Verlag, New York

A self contained proof of the Rice formula for random fields


Jean-Marc Azas , azais@cict.fr

Mario Wschebor , wschebor@cmat.edu.uy

December 22, 2006

AMS subject classification: Primary 60G70 Secondary 60G15


Key words and phrases: Gaussian fields, Rice Formula,
After an elementary proof of the Area Formula, we give a proof of Rice Formula for the
expectation of the number of roots of a random system of equations. We provide a complete
proof which is new and quite elementary, and in any case shorter than previous ones (see for
example [1]).
Similar formulae hold true for higher order factorial moments of the number of roots (Theorem 2). Theorem 3 provides a formula for the expectation of the total weight, when random
weights are put in each root.

The Area formula

We begin with a proof of the so-called Area formula, under conditions that will be sufficient for
our main purpose. One can find this formula in its full generality in Federer [3] Th 3.2.5
For any function f , we denote Nuf (T ) the number of roots of the equation f (t) = u that
belong to the subset T of the domain of f .
Proposition 1 (Area formula) Let f be a C 1 function defined on an open subset U of Rd
taking values in Rd . Assume that the set of critical values of f has zero Lebesgue measure.
Let g: Rd R be continuous and bounded. Then

Rd

g(u)Nuf (T )du =

| det(f (t))|g(f (t))dt.

(1)

for any Borel subset T of U , whenever the integral in the right-hand side is well defined.
Proof. Notice first that, due to standard extension arguments, it suffices to prove (1) for
non-negative g and for T a compact parallelotope contained in U . Second, if T is a compact
parallelotope, since f is C 1 , the set of boundary values of f , that is, f (T ) has Lebesgue measure
zero.
We next define an auxiliary function (u) for u Rd in the following way:
If u is neither a critical value nor a boundary value and n := Nuf (T ) is non zero, we denote
by x(1) , . . . , x(n) the roots of f (x) = u belonging to T . Using the local inversion theorem, we
know that there exists some > 0 and n neighborhoods U1 , ..., Un of of x(1) , . . . , x(n) such that:

This work was supported by ECOS program U03E01.


Laboratoire de Statistique et Probabilites. UMR-CNRS C5583 Universite Paul Sabatier. 118, route de
Narbonne. 31062 Toulouse Cedex 4. France.

Centro de Matem
atica. Facultad de Ciencias. Universidad de la Rep
ublica. Calle Igua 4225. 11400 Montevideo. Uruguay.

1. f is a C 1 diffeomorphism Ui B(u; ), the ball centered at u with radius .


2. U1 , ..., Un are pairwise disjoint and included in T .
3. if t
/

n
i=1 Ui ,

then f (t)
/ B(u; ).

The compactness implies that n is finite.


In this case, we define
(u) := sup{ > 0 : (1), (2), (3) hold true for all }.
If u is a critical value or a boundary value we set (u) := 0.
If Nuf (T ) = 0, we put
(u) := sup{ > 0 : f (T ) B(u; ) = }.
It is clear that in this case (u) > 0.
The function (u) is Lipschitz. In fact, let u be a value of f which is not a critical value nor
a boundary value, if u belongs to B(u; (u)), then B(u ; (u) u u ) B(u; (u)) and as
a consequence (u ) (u) u u . Exchanging the roles of u and u , we get
|(u ) (u)| u u .
The Lipschitz condition is easily checked in the other two cases.
Let now F be a real-valued monotone continuous function defined on R+ such that
F 0 on [0, 1/2],
F 1 on [1 + ).

(2)

Let (u) > 0 and 0 < < (u). Using the change of variable formula we have
n

| det(f (t))| 1I
T

f (t)u

| det(f (t))|dt = V ()n,

< dt =
i=1

Ui

where V () is the volume of the ball with radius in Rd . Thus, we have an exact counter for
Nuf (T ) when it is non-zero, which obviously holds true also when Nuf (T ) = 0 for < (u)
Let g : Rd R continuous, bounded and non-negative and 0 > 0. For every < 0 /2 we
have:
Rd

g(u)Nuf (T )F

(u)
0

du =

(u)
0

g(u) F
Rd

du

1
V ( )

| det(f (t))| 1I
T

f (t)u <

dt

Applying Fubinis Theorem we see that the expression above is equal to:
A0 , :=

| det(f (t))| dt
T

1
V ( )

F
B(f (t); )

(u)
0

g(u)du.

A0 , in fact does not depend on so it is equal to its limit as 0 which is, because of the
continuity of the function u

(u)
0

g(u), equal to

| det(f (t))|F
T

(f (t))
0

g(f (t))dt.

Let now 0 tend to zero and use monotone convergence. For the left-hand side, we take into
account that the set of critical values and the set of boundary values have measure zero. For
2

the right-hand side, we use the definition of F, that the boundary of T has Lebesgue measure
zero and the integrand is zero if t is a critical point of f .
Remarks:
1: By standard extension arguments the continuous function g can be replaced by the
indicator function of a Borel set say B. Formula (1) can be rewritten as
h(t, u)du =
Rd

| det(f (t))|h((t, f (t))dt,

(3)

Rd

tZ 1 (u)

where h is the function (t, u)


1ItT g(u). Again by a standard approximation argument (3)
holds true for every bounded Borel function h such that the right-hand side of (3) is well-defined.
2:
2. Notice that the hypothesis that the set of critical values has zero Lebesgue measure is
unnecessary, since this follows from the fact that f is C1. The statement above is sufficient to
prove Rice formula, but the interested reader can prove this as an exercise. On the other hand,
one can prove the result under weaker hypotheses: it suffices f to be Lipschitz ([3]).

Rice formulae

Theorem 1 (Rice formula) Let Z : U Rd be a random field, U an open subset of Rd and


u Rd a fixed point in the codomain. Assume that:
(i) Z is Gaussian,
(ii) almost surely the function t
Z(t) is of class C 1 ,
(iii) for each t U , Z(t) has a non degenerate distribution (i.e. Var Z(t)
0),
(iv) P{t U, Z(t) = u, det Z (t) = 0} = 0
Then, for every Borel set B contained in U , one has
E NuZ (B) =

E | det(Z (t))|/Z(t) = u pZ(t) (u)dt.

(4)

If B is compact both sides in (4) are finite.


Theorem 2 Let k, k 2 be an integer. Assume the same hypotheses as in Theorem 1 except
for (iii) that is replaced by
(iii) for t1 , ..., tk U pairwise different values of the parameter, the distribution of
Z(t1 ), ..., Z(tk )
does not degenerate in (Rd )k . Then for every Borel set B contained in U , one has
E

NuZ (B) NuZ (B) 1 ... NuZ (B) k + 1


k

E
Bk

| det Z (tj ) |/Z(t1 ) = ... = Z(tk ) = u


j=1

pZ(t1 ),...,Z(tk ) (u, ..., u)dt1 ...dtk , (5)


where both members may be infinite.
Theorem 3 Let Z be a random field that verifies the hypotheses of Theorem 1. Assume that
for each t U one has another random field Y t : W Rd , where W is some topological space,
verifying the following conditions:

a) Y t (w) is a measurable function of (, t, w) and almost surely, (t, w)


ous.

Y t (w) is continu-

Z(s), Y t (w) defined on U W is Gaussian.

b) For each t U the random process (s, w)

Moreover, assume that g : U C(W, Rd ) R is a bounded function, which is continuous when


one puts on C(W, Rd ) the topology of uniform convergence on compact sets. Then, for each
compact subset I of U , one has
g(t, Y t ) =

tI,Z(t)=u

E | det(Z (t)|g(t, Y t )/Z(t) = u).pZ(t) (u)dt.

(6)

Sufficient conditions for (iv) are given in the following proposition


Proposition 2 Let Z : U Rd , U a compact subset of Rd be a random field with paths of
class C 1 and u Rd . Assume that
pZ(t) (x) C for all t U and x in some neighborhood of u.
at least one of the two following hypotheses is satisfied:
a) a.s. t

Z(t) is of class C 2

b)
() =

sup

P | det(Z (t))| < /Z(t) = x 0

tU,xV (u)

as 0, where V (u) is some neighborhood of u.


Then (iv) holds true.
Proof. If condition a) holds true, the result is Lemma 5 in Cucker and Wschebor [2].
To prove it under condition b), assume with no loss of generality that I = [0, 1]d and that
u = 0. Put GI = t I, Z(t) = 0, det Z (t) = 0 Choose > 0, > 0; there exists a positive
number M such that
P(EM ) = P sup Z (t) > M .
tI

Denote by det the modulus of continuity of | det(X (.))| and choose m large enough so that

d
P(Fm, ) = P det (
) .
m
Consider the partition of I into md small cubes with sides of length 1/m. Let Ci1 ...id be such a
cube and ti1 ...id its centre (1 i1 , ..., id m). Then
c
c
P GCi1 ...id EM
Fm,

P(GI ) P(EM ) + P(Fm, ) +


1i1 ...id m

When the event in the term corresponding to i1 ...id of the last sum occurs, we have:
|Zj (ti1 ...id )|

M
d j = 1, ..., d
m

where Zj denotes the j-th coordinate of Z, and:


det Z (ti1 ...id )

< .

(7)

So, if m is chosen sufficiently large so that V (0) contains the ball centered at 0 with radius
M d
m , one has:
2M d
P(GI ) 2 + md (
d) C()
m
Since and are arbitrarily small, the result follows.
Remark:
With the hypotheses of Theorem 1 it follows easily that if J is a subset of U , d (J) = 0,
then P{NuZ (J) = 0} = 1 for each u Rd .
Proof of Theorem 1
Let F : R+ [0, 1] be the function defined in (2), For m, n positive integers and x 0, define:
Fm (x) := F(mx) ; Gn (x) := 1 F(x/n).

(8)

A standard extension argument says that it is enough to prove the theorem when B is a
compact rectangle included in U . So we assume that this is the case. Let us introduce some
more notations:
(t) := | det(Z (t))| (t U )
For n, m positive integers and u Rd :
Cum (B) :=

Fm ((s)).

(9)

sB:Z(s)=u
m
m
Qn,m
u (B) := Cu (B)Gn (Cu (B)).

(10)

In (9) when the summation index set is empty, we put Cum (B) = 0. Let g : Rd R be
continuous with compact support . We apply the area formula (3) for the function
h(t, u) = Fm ((t))Gn (Cum (B))g(u) 1ItB
to get:
Rd

g(u)Qn,m
u (B)du =

m
(t) Fm ((t)) Gn (CZ(t)
(B)) g(Z(t))dt.

Taking expectations on both sides :

Rd

g(u) E(Qn,m
u (B))du =

g(u)du
Rd

E (t) Fm ((t))Gn (Cum (B))/Z(t) = u pZ(t) (u)dt.

Since this equality holds for any g continuous with bounded support, it follows that
E(Qn,m
u (B)) =

E (t)Fm ((t)Gn (Cum (B))/Z(t) = u pZ(t) (u)dt,

(11)

holds true for almost every u Rd .


Let us prove that the left-hand side of (11) is a continuous function of u. Fix u Rd . Outside
the compact set
{t B : (t) 1/2m},
the contribution to the sum (9) defining Cvm (B) is zero, for any v Rd . Using the local inversion
theorem, the number of points t B such that Z(t) = u; (t) 1/2m, say k, is finite. Notice
that almost surely there is no such point in the boundary of B.
If k is non-zero, Z(t) is locally invertible in k neighborhoods V1 , . . . , Vk B around these k
points. For v in some (random) neighborhood of u, there is exactly one root of Z(s) = v in
each V1 , ..., Vk and the contribution to Cvm (B) of these points can be made arbitrarily close to
5

the one corresponding to v = u. Outside the union of V1 , ..., Vk , Z(t) u is bounded away from
zero in B, so that the contribution to Cvm (B) vanishes if v is sufficiently close to u.
This shows that a.s., the function v
Qn,m
is continuous at v = u. On the other hand, it is
v
obvious from its definition that Qn,m
v (B) n and an application of the Lebesgue dominated
convergence theorem implies the continuity of E(Qn,m
u (B)) as a function of u.
Let us now write the regression formulae for fixed t B :
Z(s) = at (s)Z(t) + Z t (s)
Z (s) = (at ) (s)Z(t) + (Z t ) (s),

(12)

where denotes the derivative with respect to s and the pair Z t (s), (Z t ) (s) is independent
from Z(t) for all s U .
Then, we write the conditional expectation on the right-hand side of (11) as the unconditional
expectation :
E tu (t)Fm (tu (t))Gn (Cum (B)) ,
(13)
where we use the notations
tu (s) := | det(Zut ) (s)|
Zut (s) := at (s)u + Z t (s)
Fm tu (s) .

Cum (B) :=
sB,Zut (s)=u

Now, observe that (11) implies that for almost every u Rd one has the inequality
E(Qn,m
u (B))

E (t)/Z(t) = u pZ(t) (u)dt,

(14)

which is in fact true for all u Rd since both sides are continuous functions of u.
The remainder of the proof consists in proving the converse inequality. Let us fix n, m, u
and t. Let K be the compact set
K := {s B : tu (s) 1/4m}
If v varies in a sufficient small (random) neighborhood of u, the points outside K do not
contribute to the sum defining Cvm (B).
Let k the almost surely finite number of roots of Zut (s) = u lying in the set K. Assume that
k does not vanish and denote these roots by s1 , . . . , sk . Consider the equation
Zvt (s) v = 0.

(15)

in a neighborhood of each one of the pairs s = si , v = u. Applying the Implicit Function


Theorem, one can find k pairwise disjoint open sets sets V1 , ...Vk such that if v is sufficiently
close to u, equation (15) has exactly one root si = si (v) in Vi , 1 = 1, . . . , k. These roots vary
continuously with v and si (u) = si . On the other hand on the compact set K\(V1 ... Vk )
the quantity Zut (s) u is bounded away from zero so Zvt (s) v does not vanishes if v is
sufficiently close to u. As a conclusion, we have that
lim sup Cvm (B) Cum (B)
vu

where the inequality arises from the fact that some of the points si (v) may not belong to B and
hence, dont contribute to the sum defining Cvm (B). Now since (11) holds for a.e. u, one can
find a sequence {uN , N = 1, 2, . . .} converging to u such that (11) holds true for u = uN and
6

all N = 1, 2, .... Using the continuity -already proved- of u E(Qn,m


u (B)), Fatous Lemma and
the fact that Gn is non-increasing, we have :
E Qn,m
u (B) =

lim E Qn,m
uN (B)

N +

lim

N + B

E tuN (t)Fm (tuN (t)Gn CumN (B) pZ(t) (uN )dt

E tu (t)Fm (tu (t)Gn Cum (B) pZ(t) (u)dt.

Since Cum (B) is a.s. finite, we can now pass to the limit as n +, m + in that order
and applying Beppo-Levis Theorem, conclude the proof.

Proof of Theorem 2:
For each > 0, define the domain
Dk, (B) = {(t1 , ..., tk ) B k , ti tj if i = j, i, j = 1, ..., k}
and the process Z
(t1 , ..., tk ) Dk, (B)

Z(t1 , ..., tk ) = Z(t1 ), ..., Z(tk ) .

It is clear that Z satisfies the hypotheses of Theorem 1 for every value (u, ..., u) (Rd )k . So,
e

Z
E N(u,...,u)
Dk, (B)
k

E
Dk, (B)

| det Z (tj ) |/Z(t1 ) = ... = Z(tk ) = u pZ(t1 ),...,Z(tk ) (u, ..., u)dt1 ...dtk (16)
j=1

To finish, let 0, note that


NuZ (B) NuZ (B) 1 ... NuZ (B) k + 1
is the monotone limit of

Z
N(u,...,u)
Dk, (B) ,

and that the diagonal Dk (B) = (t1 , ..., tk ) B k , ti = tj for some pair i, j, i = j has zero Lebesgue
measure in (Rd )k .

Proof of Theorem 3:
The proof is essentially the same. It suffices to consider instead of Cum (B) the quantity
Cum (I) :=

Fm ((s)).gs (s, Y s ).
sI:Z(s)=u

(17)

References
[1] Azas J-M. and Wschebor M (2005). On the Distribution of the Maximum of a Gaussian
Field with d Parameters. Annals Applied Probability, 15 (1A), 254-278.
[2] Cucker, F. and Wschebor M. (2003). On the Expected Condition Number of Linear Programming Problems, Numer. Math., 94, 419-478.
[3] Federer, H. (1969). Geometric measure theory. Springer-Verlag, New York

On the Distribution of the Maximum of a Gaussian


Field with d Parameters.
Jean-Marc Azas , azais@cict.fr
Mario Wschebor , wscheb@fcien.edu.uy
November 10, 2003

AMS subject classification: 60G15, 60G70.


Short Title: Distribution of the Maximum.
Key words and phrases: Gaussian fields, Rice Formula, Regularity of the Distribution of the Maximum.
Abstract
Let I be a compact d-dimensional manifold, X : I R a Gaussian
process with regular paths and FI (u) , u R the probability distribution
function of suptI X(t).
We prove that under certain regularity and non-degeneracy conditions, FI
is a C 1 -function and FI is absolutely continuous, and that FI FI satisfy
certain implicit equations that permit to give bounds for their values and to
compute their asymptotic behaviour as u +. This is a partial extension
of previous results by the authors in the case d = 1.
Our methods use strongly the so-called Rice formulae for the moments of
the number of roots of an equation of the form Z(t) = x, where Z : I Rd
is a random field and x a fixed point in Rd . We also give proofs for this kind
of formulae, which have their own interest beyond the present application.

This work was supported by ECOS program U97E02.


Laboratoire de Statistique et Probabilites. UMR-CNRS C5583 Universite Paul Sabatier. 118,
route de Narbonne. 31062 Toulouse Cedex 4. France.

Centro de Matematica. Facultad de Ciencias. Universidad de la Rep


ublica. Calle Igua 4225.
11400 Montevideo. Uruguay.

Introduction and notations.

Let I be a d-dimensional compact manifold and X : I R a Gaussian process with


regular paths defined on some probability space (, A, P). Define MI = sup X(t)
tI

and FI (u) = P{MI u}, u R the probability distribution function of the random
variable MI . Our aim is to study the regularity of the function FI when d > 1.
There exist a certain number of general results on this subject, starting from
the papers by Ylvisaker (1968) and Tsirelson (1975) (see also Weber (1985), Lifshits
(1995), Diebolt and Posse (1996) and references therein). The main purpose of this
paper is to extend to d > 1 some of the results about the regularity of the function
u
FI (u) in Azas & Wschebor (2001), which concern the case d = 1.
Our main tool here is Rice Formula for the moments of the number of roots
NuZ (I) of the equation Z(t) = u on the set I, where {Z(t) : t I} is an Rd -valued
Gaussian field, I is a subset of Rd and u a given point in Rd . For d > 1, even
though it has been used in various contexts, as far as the authors know, a full proof
of Rice Formula for the moments of NuZ (I) seems to have only been published by R.
Adler (1981) for the first moment of the number of critical points of a real-valued
stationary Gaussian process with a d-dimensional parameter, and extended by Azas
and Delmas (2002) to the case of processes with constant variance. Caba
na (1985)
contains related formulae for random fields; see also the PHD thesis of Konakov
cited by Piterbarg (1996b). In the next section we give a more general result which
has an interest that goes beyond the application of the present paper. At the same
time the proof appears to be simpler than previous ones. We have also included
the proof of the formula for higher moments, which in fact follows easily from the
first moment. Both extend with no difficulties to certain classes of non-Gaussian
processes.
It should be pointed out that the validity of Rice Formula for Lebesgue-almost
every u Rd is easy to prove (Brillinger, 1972) but this is insufficient for a certain
number of standard applications. For example, assume X : I
R is a real-valued
random process and one is willing to compute the moments of the number of critical
points of X. Then, we must take for Z the random field Z(t) = X (t) and the
formula one needs is for the precise value u = 0 so that a formula for almost every
u does not solve the problem.
We have added Rice Formula for processes defined on smooth manifolds. Even
though Rice Formula is local, this is convenient for various applications. We will
need a formula of this sort to state and prove the implicit formulae for the derivatives
of the distribution of the maximum (see Section 3).
2

The results on the differentiation of FI are partial extensions of Azas & Wschebor (2001). They concern only the first two derivatives and remain quite far away
from what is known for d = 1. The main result in that paper states that if X is
a real-valued Gaussian process defined on a certain compact interval I of the real
line, has C 2k paths (k integer, k 1) and satisfies a non-degeneracy condition, then
the distribution of MI is of class C k .
For Gaussian fields defined on a d-dimensional regular manifold (d > 1) and
possessing regular paths we obtain some improvements with respect to classical
and general results due to Tsirelson (1975) for Gaussian sequences. An example is
Corollary 6.1, that provides an asymptotic formula for FI (u) as u + which is
explicit in terms of the covariance of the process and can be compared with Theorem
4 in Tsirelson (1975) where an implicit expression depending on the function F itself
is given.
We use the following notations:
If Z is a smooth function U
Rd , U a subset of Rd , its successive derivatives
are denoted Z , Z ,...Z (k) and considered respectively as linear, bilinear, ..., klinear
forms on Rd . For example, X (3) (t){v1 , v2 , v3 } is the value of the third derivative at
point t applied to the triplet (v1 , v2 , v3 ). The same notation is used for a derivative
on a C manifold.
I and I are respectively the interior, the boundary and the closure of the set
I,
I. If is a random vector with values in Rd , whenever they exist, we denote by
p (x) the value of the density of at the point x, by E() its expectation and by
Var() its variance-covariance matrix. is Lebesgue measure.
If u, v are points in Rd , u, v denotes their usual scalar product and u the
Euclidean norm of u.
For M a d×d real matrix, we denote
‖M‖ = sup_{‖x‖=1} ‖M x‖.
We put λ₁² = λ²_min, ..., λ_d² for the eigenvalues of M Mᵀ, 0 ≤ λ₁ ≤ ... ≤ λ_d. Then
‖M‖ = λ_d and, if M is non-singular, ‖M⁻¹‖ = 1/λ₁ = 1/λ_min(M).
Also, for symmetric M, M ≻ 0 (respectively M ≺ 0) denotes that M is positive
definite (resp. negative definite).
\binom{m}{n} is the usual combinatorial number, i.e. \binom{m}{n} = m!/(n!(m−n)!) if m, n are non-negative
integers, m ≥ n, and \binom{m}{n} = 0 otherwise.
A^c denotes the complement of the set A. For real x, x⁺ = sup(x, 0), x⁻ = sup(−x, 0).
2 Rice formulae

Our main results in this section are the following:

Theorem 2.1 Let Z : I → Rᵈ, I a compact subset of Rᵈ, be a random field and u ∈ Rᵈ.
Assume that:
A0: Z is Gaussian,
A1: t ↦ Z(t) is a.s. of class C¹,
A2: for each t ∈ I, Z(t) has a non-degenerate distribution (i.e. Var(Z(t)) ≻ 0),
A3: P{∃t ∈ I, Z(t) = u, det(Z′(t)) = 0} = 0,
A4: λ(∂I) = 0.
Then

E( N_u^Z(I) ) = ∫_I E( |det(Z′(t))| / Z(t) = u ) p_{Z(t)}(u) dt,   (1)

and both members are finite.
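As a quick sanity check of formula (1) in the simplest case d = 1, the following sketch (an illustration added here, not part of the paper; the covariance, grid sizes, level u and the spectral synthesis on a truncated frequency grid are all assumptions of the sketch) simulates a centred stationary Gaussian process with covariance r(τ) = exp(−τ²/2) on [0, T]. Then λ₂ = −r″(0) = 1 and (1) reduces to the classical Rice formula E N_u([0, T]) = (T/π) e^{−u²/2}.

```python
import numpy as np

# Hedged numerical sketch (not from the paper): Monte Carlo check of formula (1)
# for d = 1.  X is centred, stationary, Var X(t) = 1, covariance r(t) = exp(-t^2/2),
# spectral density f(l) = exp(-l^2/2)/sqrt(2*pi); then lambda_2 = -r''(0) = 1 and
# E N_u([0, T]) = (T/pi) * sqrt(lambda_2/lambda_0) * exp(-u^2/(2*lambda_0)).
rng = np.random.default_rng(0)
T, u, n_rep = 10.0, 1.0, 500
t = np.linspace(0.0, T, 2001)

lam = np.linspace(-6.0, 6.0, 401)                       # truncated frequency grid
dlam = lam[1] - lam[0]
amp = np.sqrt(np.exp(-lam ** 2 / 2) / np.sqrt(2 * np.pi) * dlam)
cos_mat, sin_mat = np.cos(np.outer(lam, t)), np.sin(np.outer(lam, t))

counts = []
for _ in range(n_rep):
    xi, eta = rng.standard_normal(lam.size), rng.standard_normal(lam.size)
    x = (amp * xi) @ cos_mat + (amp * eta) @ sin_mat    # approximate sample path
    counts.append(np.sum(np.diff(np.sign(x - u)) != 0)) # crossings of the level u
counts = np.array(counts)

print("Monte Carlo estimate of E N_u :", counts.mean())
print("Rice formula (T/pi) e^{-u^2/2}:", T / np.pi * np.exp(-u ** 2 / 2))
```

The two printed values should agree up to Monte Carlo and discretization error.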


Theorem 2.2 Let k, k ≥ 2, be an integer. Assume the same hypotheses as in
Theorem 2.1 except for A2, which is replaced by
A′2: for t₁, ..., t_k ∈ I pairwise different values of the parameter, the distribution of
(Z(t₁), ..., Z(t_k)) does not degenerate in (Rᵈ)^k.
Then

E[ N_u^Z(I) (N_u^Z(I) − 1) ⋯ (N_u^Z(I) − k + 1) ]
   = ∫_{I^k} E( ∏_{j=1}^k |det(Z′(t_j))| / Z(t₁) = ... = Z(t_k) = u ) p_{Z(t₁),...,Z(t_k)}(u, ..., u) dt₁...dt_k,   (2)

where both members may be infinite.
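For k = 2 and d = 1, the left-hand side of (2) is the second factorial moment of the number of crossings; it can be estimated empirically by reusing the `counts` array produced in the sketch following Theorem 2.1 (this only illustrates the quantity being computed, since the right-hand side of (2) has no simple closed form here).

```python
# Empirical second factorial moment E[N_u (N_u - 1)], i.e. the left-hand side of (2)
# for k = 2 in the d = 1 level-crossing setting of the previous sketch.
second_factorial_moment = np.mean(counts * (counts - 1.0))
print("Monte Carlo estimate of E[N_u (N_u - 1)]:", second_factorial_moment)
```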
Remark.
Note that Theorem 2.1 (resp. 2.2) remains valid, except for the finiteness of the
expectation in Theorem 2.1, if I is open and hypotheses A0, A1, A2 (resp. A′2) and
A3 are verified. This follows immediately from the above statements. A standard extension argument shows that (1) holds true if one replaces I by any Borel subset of I.
Sufficient conditions for hypothesis A3 to hold are given by the next proposition.
Proposition 2.1 Let Z : I → Rᵈ, I a compact subset of Rᵈ, be a random field with
paths of class C¹ and u ∈ Rᵈ. Assume that
• p_{Z(t)}(x) ≤ C for all t ∈ I and x in some neighbourhood of u;
• at least one of the two following hypotheses is satisfied:
a) a.s. t ↦ Z(t) is of class C²,
b) δ(ε) = sup_{t∈I, x∈V(u)} P( |det(Z′(t))| < ε / Z(t) = x ) → 0
as ε → 0, where V(u) is some neighbourhood of u.
Then A3 holds true.
Proof. If condition a) holds true, the result is Lemma 5 in Cucker and Wschebor
(2003).
To prove it under condition b), assume with no loss of generality that I = [0, 1]ᵈ
and that u = 0. Put G_I = {∃t ∈ I, Z(t) = 0, det(Z′(t)) = 0}. Choose ε > 0, δ > 0;
there exists a positive number M such that
P(E_M) = P( sup_{t∈I} ‖Z′(t)‖ > M ) ≤ δ.
Denote by ω_det the modulus of continuity of |det(Z′(·))| and choose m large enough
so that
P(F_{m,ε}) = P( ω_det(√d / m) ≥ ε ) ≤ δ.
Consider the partition of I into mᵈ small cubes with sides of length 1/m. Let C_{i₁...i_d}
be such a cube and t_{i₁...i_d} its centre (1 ≤ i₁, ..., i_d ≤ m). Then
P(G_I) ≤ P(E_M) + P(F_{m,ε}) + Σ_{1≤i₁,...,i_d≤m} P( G_{C_{i₁...i_d}} ∩ E_M^c ∩ F_{m,ε}^c ).   (3)
When the event in the term corresponding to i₁...i_d of the last sum occurs, we have
|Z_j(t_{i₁...i_d})| ≤ (M √d)/m,  j = 1, ..., d,
where Z_j denotes the j-th coordinate of Z, and
|det(Z′(t_{i₁...i_d}))| < ε.
So, if m is chosen sufficiently large so that V(0) contains the ball centred at 0 with
radius (M √d)/m, one has
P(G_I) ≤ 2δ + mᵈ ( (2M √d)/m )ᵈ C δ(ε).
Since δ and ε are arbitrarily small, the result follows.


Lemma 2.1 With the notations of Theorem 2.1, suppose that A1 and A4 hold
true and that
p_{Z(t)}(x) ≤ C for all t ∈ I and x in some neighbourhood of u.
Then P( N_u^Z(∂I) ≠ 0 ) = 0.
Proof: We use the notation of Proposition 2.1, with the same definition of E_M
except that we do not suppose that I = [0, 1]ᵈ.
Since ∂I has zero measure, for each positive integer m it can be covered by h(m)
cubes C₁, ..., C_{h(m)} with centres t₁, ..., t_{h(m)} and side lengths s₁, ..., s_{h(m)} smaller than
1/m, such that
Σ_{i=1}^{h(m)} (s_i)ᵈ → 0 as m → +∞.
So,
P( N_u^Z(∂I) ≠ 0 ) ≤ P(E_M) + Σ_{i=1}^{h(m)} P( {N_u^Z(C_i) ≠ 0} ∩ E_M^c )
≤ δ + Σ_{i=1}^{h(m)} P( |Z_j(t_i) − u_j| ≤ (M √d s_i)/2, j = 1, ..., d ) ≤ δ + C Σ_{i=1}^{h(m)} (√d M s_i)ᵈ.
This gives the result.
Lemma 2.2 Let Z : I → Rᵈ, I a compact subset of Rᵈ, be a C¹ function and u a
point in Rᵈ. Assume that
a) inf_{t∈Z⁻¹({u})} λ_min(Z′(t)) ≥ Δ > 0,
b) ω_{Z′}(η) < Δ/d,
where ω_{Z′} is the continuity modulus of Z′, defined as the maximum of the continuity moduli of its entries, and η is a positive number.
Then, if t₁, t₂ are two distinct roots of the equation Z(t) = u such that the
segment [t₁, t₂] is contained in I, the Euclidean distance between t₁ and t₂ is greater
than η.
Recall that λ_min(Z′(t)) is the inverse of ‖(Z′(t))⁻¹‖.
Proof: Set η′ = ‖t₁ − t₂‖, v = (t₁ − t₂)/‖t₁ − t₂‖. Using the mean value theorem, for
i = 1, ..., d, there exists ξ_i ∈ [t₁, t₂] such that
( Z′(ξ_i) v )_i = 0.
Thus
|( Z′(t₁) v )_i| = |( Z′(t₁) v )_i − ( Z′(ξ_i) v )_i| ≤ Σ_{k=1}^d |Z′(t₁)_{ik} − Z′(ξ_i)_{ik}| |v_k| ≤ ω_{Z′}(η′) Σ_{k=1}^d |v_k| ≤ ω_{Z′}(η′) √d.
In conclusion,
Δ ≤ λ_min(Z′(t₁)) ≤ ‖Z′(t₁) v‖ ≤ ω_{Z′}(η′) d,
which implies η′ > η.
Proof of Theorem 2.1: Consider a continuous non-decreasing function F such
that
F(x) = 0 for x ≤ 1/2,  F(x) = 1 for x ≥ 1.
Let Δ and η be positive real numbers. Define the random function
α_{Δ,η}(u) = F( (1/(2Δ)) inf_{s∈I} [ λ_min(Z′(s)) + ‖Z(s) − u‖ ] ) · ( 1 − F( (d/Δ) ω_{Z′}(η) ) )   (4)
and the set I_η = {t ∈ I : ‖t − s‖ ≥ η for all s ∉ I}. If α_{Δ,η}(u) > 0 and N_u^Z(I_η)
does not vanish, conditions a) and b) in Lemma 2.2 are satisfied. Hence, in each
ball with diameter η/2 centred at a point in I_η there is at most one root of the
equation Z(t) = u, and a compactness argument shows that N_u^Z(I_η) is bounded by
a constant C(η, I), depending only on η and on the set I.
Take now any real-valued non-random continuous function f : Rᵈ → R with
compact support. Because of the co-area formula (Federer, 1969, Th. 3.2.3), since
a.s. Z is Lipschitz and α_{Δ,η}(u) f(u) is integrable:
∫_{Rᵈ} f(u) N_u^Z(I_η) α_{Δ,η}(u) du = ∫_{I_η} |det(Z′(t))| f(Z(t)) α_{Δ,η}(Z(t)) dt.
Taking expectations on both sides,
∫_{Rᵈ} f(u) E[ N_u^Z(I_η) α_{Δ,η}(u) ] du = ∫_{Rᵈ} f(u) du ∫_{I_η} E( |det(Z′(t))| α_{Δ,η}(u) / Z(t) = u ) p_{Z(t)}(u) dt.
It follows that the two functions
(i)  u ↦ E[ N_u^Z(I_η) α_{Δ,η}(u) ],
(ii) u ↦ ∫_{I_η} E( |det(Z′(t))| α_{Δ,η}(u) / Z(t) = u ) p_{Z(t)}(u) dt,
coincide Lebesgue-almost everywhere as functions of u.
Let us prove that both functions are continuous, hence they are equal for every
u ∈ Rᵈ.
Fix u = u₀ and let us show that the function in (i) is continuous at u = u₀.
Consider the random variable inside the expectation sign in (i). Almost surely, there
is no point t in Z⁻¹({u₀}) such that det(Z′(t)) = 0. By the local inversion theorem,
Z(·) is invertible in some neighbourhood of each point belonging to Z⁻¹({u₀}) and
the distance from Z(t) to u₀ is bounded below by a positive number for t ∈ I_η
outside of the union of these neighbourhoods. This implies that, a.s., as a function of
u, N_u^Z(I_η) is constant in some (random) neighbourhood of u₀. On the other hand, it
is clear from its definition that the function u ↦ α_{Δ,η}(u) is continuous and bounded.
We may now apply dominated convergence as u → u₀, since N_u^Z(I_η) α_{Δ,η}(u) is
bounded by a constant that does not depend on u.
For the continuity of (ii), it is enough to prove that, for each t ∈ I_η, the conditional
expectation in the integrand is a continuous function of u. Note that the random
variable |det(Z′(t))| α_{Δ,η}(u) is a functional defined on {(Z(s), Z′(s)) : s ∈ I}. Perform a Gaussian regression of (Z(s), Z′(s)) : s ∈ I with respect to the random
variable Z(t), that is, write
Z(s) = Yᵗ(s) + αᵗ(s) Z(t),
Z′_j(s) = Y_jᵗ(s) + α_jᵗ(s) Z(t),  j = 1, ..., d,
where Z′_j(s) (j = 1, ..., d) denote the columns of Z′(s), Yᵗ(s) and Y_jᵗ(s) are Gaussian
vectors, independent of Z(t) for each s ∈ I, and the regression matrices αᵗ(s), α_jᵗ(s)
(j = 1, ..., d) are continuous functions of s, t (take into account A2). Replacing in
the conditional expectation we are now able to get rid of the conditioning, and using
the fact that the moments of the supremum of an a.s. bounded Gaussian process
are finite, the continuity in u follows by dominated convergence.
So, now we fix u ∈ Rᵈ and let η → 0 and Δ → 0, in that order, both in (i) and (ii).
For (i) one can use Beppo Levi's Theorem. Note that, almost surely,
N_u^Z(I_η) ↑ N_u^Z(I̊) = N_u^Z(I),
where the last equality follows from Lemma 2.1. On the other hand, the same
Lemma 2.1 plus A3 imply together that, almost surely,
inf_{s∈I} [ λ_min(Z′(s)) + ‖Z(s) − u‖ ] > 0,
so that the first factor in the right-hand member of (4) increases to 1 as Δ decreases
to zero. Hence, by Beppo Levi's Theorem:
lim_{Δ→0} lim_{η→0} E[ N_u^Z(I_η) α_{Δ,η}(u) ] = E[ N_u^Z(I) ].
For (ii), one can proceed in a similar way after de-conditioning, obtaining (1). To
finish the proof, remark that standard Gaussian calculations show the finiteness of
the right-hand member of (1).
Proof of Theorem 2.2: For each δ > 0, define the domain
D_{k,δ}(I) = { (t₁, ..., t_k) ∈ I^k : ‖t_i − t_j‖ ≥ δ if i ≠ j, i, j = 1, ..., k }
and the process Z̃, defined on D_{k,δ}(I), by
Z̃(t₁, ..., t_k) = ( Z(t₁), ..., Z(t_k) ).
It is clear that Z̃ satisfies the hypotheses of Theorem 2.1 for every value (u, ..., u) ∈ (Rᵈ)^k. So,
E[ N_{(u,...,u)}^{Z̃}( D_{k,δ}(I) ) ] = ∫_{D_{k,δ}(I)} E( ∏_{j=1}^k |det(Z′(t_j))| / Z(t₁) = ... = Z(t_k) = u ) p_{Z(t₁),...,Z(t_k)}(u, ..., u) dt₁...dt_k.   (5)
To finish, let δ → 0, note that N_u^Z(I) (N_u^Z(I) − 1) ⋯ (N_u^Z(I) − k + 1) is the monotone
limit of N_{(u,...,u)}^{Z̃}( D_{k,δ}(I) ), and that the diagonal
D_k(I) = { (t₁, ..., t_k) ∈ I^k : t_i = t_j for some pair i ≠ j }
has zero Lebesgue measure in (Rᵈ)^k.
Remark. Even though we will not use this in the present paper, we point out
that it is easy to adapt the proofs of Theorems 2.1 and 2.2 to certain classes of
non-Gaussian processes.
For example, the statement of Theorem 2.1 remains valid if one replaces hypotheses A0 and A2 respectively by the following B0 and B2:
B0: Z(t) = H(Y(t)) for t ∈ I, where
Y : I → Rⁿ is a Gaussian process with C¹ paths such that for each t ∈ I, Y(t) has
a non-degenerate distribution, and H : Rⁿ → Rᵈ is a C¹ function.
B2: for each t ∈ I, Z(t) has a density p_{Z(t)} which is continuous as a function of
(t, u).
Note that B0 and B2 together imply that n ≥ d. The only change to be introduced in the proof of the theorem is in the continuity of (ii), where the regression is
performed on Y(t) instead of Z(t).
Similarly, the statement of Theorem 2.2 remains valid if we replace A0 by B0 and
add the requirement that the joint density of Z(t₁), ..., Z(t_k) be a continuous function
of t₁, ..., t_k, u for pairwise different t₁, ..., t_k.
Now consider a process X from I to R and define
M_{u,1}^X(I) = #{ t ∈ I : X(·) has a local maximum at the point t, X(t) > u },
M_{u,2}^X(I) = #{ t ∈ I : X′(t) = 0, X(t) > u }.
The problem of writing Rice formulae for the factorial moments of these random
variables can be considered as a particular case of the previous one and the proofs are
the same, mutatis mutandis. For further use, we state as a theorem Rice formula
for the expectation. For brevity we do not state the analogue of Theorem 2.2, which
holds true similarly.
Theorem 2.3 Let X : I → R, I a compact subset of Rᵈ, be a random field. Let
u ∈ R and define M_{u,i}^X(I), i = 1, 2, as above. For each d×d real symmetric matrix M,
we put φ₁(M) := |det(M)| 1I_{M≺0}, φ₂(M) := |det(M)|.
Assume:
A0: X is Gaussian,
A1: a.s. t ↦ X(t) is of class C²,
A2: for each t ∈ I, (X(t), X′(t)) has a non-degenerate distribution in R¹ × Rᵈ,
A3: either
a.s. t ↦ X(t) is of class C³,
or
δ(ε) = sup_{t∈I, x′∈V(0)} P( |det(X″(t))| < ε / X′(t) = x′ ) → 0
as ε → 0, where V(0) denotes some neighbourhood of 0,
A4: ∂I has zero Lebesgue measure.
Then, for i = 1, 2:
E[ M_{u,i}^X(I) ] = ∫_u^{+∞} dx ∫_I E( φ_i(X″(t)) / X(t) = x, X′(t) = 0 ) p_{X(t),X′(t)}(x, 0) dt,
and both members are finite.
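To make the quantities of Theorem 2.3 concrete in dimension d = 1, one can again rely on simulation. For the stationary process of the sketch following Theorem 2.1 (whose objects are reused below, an assumption of this illustration), critical points are the zeros of X′ and, letting u → −∞, E M_{u,2}^X([0, T]) should approach the classical value (T/π)√(λ₄/λ₂); here λ₄ = r⁗(0) = 3 and λ₂ = 1.

```python
# Hedged illustration of M^X_{u,1} and M^X_{u,2} from Theorem 2.3 for d = 1,
# reusing rng, lam, amp, cos_mat, sin_mat, T, u, n_rep from the earlier sketch.
# For u -> -infinity the mean number of critical points should approach
# (T/pi) * sqrt(lambda_4 / lambda_2) = (T/pi) * sqrt(3) for r(t) = exp(-t^2/2).
n_max_above_u, n_crit = [], []
for _ in range(n_rep):
    xi, eta = rng.standard_normal(lam.size), rng.standard_normal(lam.size)
    x = (amp * xi) @ cos_mat + (amp * eta) @ sin_mat
    local_max = (x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])       # discrete local maxima
    local_min = (x[1:-1] < x[:-2]) & (x[1:-1] < x[2:])       # discrete local minima
    n_max_above_u.append(np.sum(local_max & (x[1:-1] > u)))  # approximates M^X_{u,1}
    n_crit.append(np.sum(local_max) + np.sum(local_min))     # approximates M^X_{-inf,2}
print("mean # local maxima above u   :", np.mean(n_max_above_u))
print("mean # critical points        :", np.mean(n_crit))
print("(T/pi)*sqrt(3) for comparison :", T / np.pi * np.sqrt(3))
```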

2.1 Processes defined on a smooth manifold
Let U be a differentiable manifold (by differentiable we mean infinitely differentiable)
of dimension d. We suppose that U is orientable in the sense that there exists a
non-vanishing differentiable d-form Ω on U. This is equivalent to assuming that
there exists an atlas {(U_i, φ_i) : i ∈ I} such that for any pair of intersecting charts
(U_i, φ_i), (U_j, φ_j), the Jacobian of the map φ_i ∘ φ_j⁻¹ is positive.
We consider a Gaussian stochastic process with real values and C² paths X =
{X(t) : t ∈ U} defined on the manifold U. In this subsection, our aim is to write
Rice formulae for this kind of process under various geometric settings for U.
More precisely we will consider three cases: first, when U is a manifold without any
additional structure on it; second, when U has a Riemannian metric; third, when it
is embedded in a Euclidean space. We will make use of these formulae in Section
3, but they have an interest in themselves (see Taylor and Adler (2002) for other
details or similar results).
We will assume that in every chart X(t) and DX(t) have a non-degenerate joint
distribution and that hypothesis A3 is verified. For S a Borel subset of U, the
following quantities are well defined and measurable: M_{u,1}^X(S), the number of local
maxima, and M_{u,2}^X(S), the number of critical points.
2.1.1 Abstract manifold
Proposition 2.2 For k = 1, 2, the quantity which is expressed in every chart φ with
coordinates s₁, ..., s_d as
∫_u^{+∞} dx E( φ_k(Y″(s)) / Y(s) = x, Y′(s) = 0 ) p_{Y(s),Y′(s)}(x, 0) ⋀_{i=1}^d ds_i,   (6)
where Y(s) is the process X written in the chart (Y = X ∘ φ⁻¹), defines a d-form
Ω_k on U and, for every Borel set S ⊂ U,
∫_S Ω_k = E[ M_{u,k}^X(S) ].
Proof: Note that a d-form is a measure on U whose image in each chart is
absolutely continuous with respect to Lebesgue measure ⋀_{i=1}^d ds_i. To prove that (6)
defines a d-form, it is sufficient to prove that its density with respect to ⋀_{i=1}^d ds_i
satisfies locally the change-of-variable formula. Let (U₁, φ₁), (U₂, φ₂) be two intersecting
charts and set
U₃ := U₁ ∩ U₂ ;  Y₁ := X ∘ φ₁⁻¹ ;  Y₂ := X ∘ φ₂⁻¹ ;  H := φ₂ ∘ φ₁⁻¹.
Denote by s¹_i and s²_i, i = 1, ..., d, the coordinates in each chart. We have
∂Y₁/∂s¹_i = Σ_{i′} (∂Y₂/∂s²_{i′}) (∂H_{i′}/∂s¹_i),
∂²Y₁/∂s¹_i ∂s¹_j = Σ_{i′,j′} (∂²Y₂/∂s²_{i′} ∂s²_{j′}) (∂H_{i′}/∂s¹_i)(∂H_{j′}/∂s¹_j) + Σ_{i′} (∂Y₂/∂s²_{i′}) (∂²H_{i′}/∂s¹_i ∂s¹_j).
Thus at every point
Y₁′(s¹) = H′(s¹)ᵀ Y₂′(s²),
p_{Y₁(s¹),Y₁′(s¹)}(x, 0) = p_{Y₂(s²),Y₂′(s²)}(x, 0) |det(H′(s¹))|⁻¹,
and at a singular point
Y₁″(s¹) = H′(s¹)ᵀ Y₂″(s²) H′(s¹).
On the other hand, by the change-of-variable formula,
⋀_{i=1}^d ds¹_i = |det(H′(s¹))|⁻¹ ⋀_{i=1}^d ds²_i.
Replacing in the integrand in (6), one checks the desired result.
For the second part it suffices again to prove it locally for an open subset S
included in a unique chart. Let (S, φ) be a chart and let again Y(s) be the process
written in this chart; it suffices to check that
E[ M_{u,k}^X(S) ] = ∫_{φ(S)} λ(ds) ∫_u^{+∞} dx E( φ_k(Y″(s)) / Y(s) = x, Y′(s) = 0 ) p_{Y(s),Y′(s)}(x, 0).   (7)
Since M_{u,k}^X(S) is equal to M_{u,k}^Y(φ(S)), the result is a direct consequence of Theorem 2.3.

2.1.2 Riemannian manifold
The form in (6) is intrinsic (in the sense that it does not depend on the parametrization) but the terms inside the integrand are not. It is possible to give a completely
intrinsic expression in the case when U is equipped with a Riemannian metric.
When such a Riemannian metric is not given, it is always possible to use the metric
g induced by the process itself (see Taylor and Adler, 2002) by setting
g_s(Y, Z) = E( Y(X) Z(X) )
for Y, Z belonging to the tangent space T(s) at s ∈ U. Y(X) (resp. Z(X)) denotes
the action of the tangent vector Y (resp. Z) on the function X. This metric leads
to very simple expressions for centred variance-1 Gaussian processes.
The main point is that at a singular point of X the second order derivative D²X
is intrinsic, since it defines locally the Taylor expansion. Given the Riemannian
metric g_s, the second differential can be represented by an endomorphism that will
be denoted ∇²X(s):
D²X(s){Y, Z} = Y(Z(X)) = Z(Y(X)) = g_s( ∇²X(s) Y, Z ).   (8)
In fact, at a singular point the definition given by formula (8) coincides with the
definition of the Hessian read in an orthonormal basis. This endomorphism is
intrinsic, and so of course is its determinant. So, in a chart,
det( ∇²X(s) ) = det( D²X(s) ) det(g_s)⁻¹,   (9)
and ∇²X(s) is negative definite if and only if D²X(s) is. Hence
φ_k( ∇²X(s) ) = φ_k( D²X(s) ) det(g_s)⁻¹  (k = 1, 2).
We turn now to the density in (6). The gradient at some location s is defined
as the unique vector ∇X(s) ∈ T(s) such that g_s( ∇X(s), Y ) = DX(s){Y}. In a
chart, the vector of coordinates of the gradient in the basis ∂/∂x_i, i = 1, ..., d, is given
by g_s⁻¹ DX(s), where DX(s) is now the vector of coordinates of the derivative in
the basis dx_i, i = 1, ..., d. The joint density at (x, 0) of (X(s), ∇X(s)) is intrinsic only
if read in an orthonormal basis of the tangent space. In that case the vector of
coordinates is given by
∇̃X(s) = g_s^{1/2} ∇X(s) = g_s^{−1/2} DX(s).
By the change-of-variable formula,
p_{X(s),∇̃X(s)}(x, 0) = p_{X(s),DX(s)}(x, 0) √(det(g_s)).
Remembering that the Riemannian volume Vol satisfies
Vol = √(det(g_s)) ⋀_{i=1}^d ds_i,
we can rewrite expression (6) as
∫_u^{+∞} dx E( φ_k(∇²X(s)) / X(s) = x, ∇X(s) = 0 ) p_{X(s),∇X(s)}(x, 0) Vol,   (10)
where we have omitted the tilde above ∇X(s) for simplicity. This is the intrinsic
Riemannian expression.
2.1.3 Embedded manifold
In most practical applications, U is naturally embedded in a Euclidean space Rᵐ.
Examples of such situations are given by U being a sphere or the boundary of a
domain in Rᵐ. In such a case we look for an expression for (10) as a function of
the natural derivative in Rᵐ. The manifold is equipped with the metric induced
by the Euclidean metric in Rᵐ. Considering the form (10), clearly the Riemannian
volume is just the geometric measure on U.
Following Milnor (1965), we assume that the process X_t is defined on an open
neighbourhood of U, so that the ordinary derivatives X′(s) and X″(s) are well defined
for s ∈ U. Denoting the projectors onto the tangent and normal spaces by P_{T(s)} and
P_{N(s)}, we have
∇X(s) = P_{T(s)}( X′(s) ).
We now define the second fundamental form II of U embedded in Rᵐ, which in our
simple case can be defined as the bilinear application (see Kobayashi and Nomizu,
199?, T. 2, chap. 7 for details)
Y, Z ∈ T(s) ↦ II{Y, Z} = P_{N(s)}( ∇_Y Z ),
where ∇ is the Levi-Civita connection on Rᵐ. The next formula is well known,
or easy to check at a singular point, and gives the expression of the Hessian on U:
Y, Z ∈ T(s) ↦ ∇²X(s){Y, Z} = X″(s){Y, Z} + ⟨ II{Y, Z}, X′(s) ⟩.   (11)
The determinant of the bilinear form given by (11), expressed in an orthonormal
basis, gives the value of det( ∇²X(s) ). As a conclusion we get the expression of
every term involved in (10).
Examples:
Codimension 1: with a given orientation we get
∇²X = X″_T + II · X′_N,
where X″_T is the tangent projection of the second derivative and X′_N the normal
component of the gradient.
Sphere: when U is a sphere of radius r > 0 in R^{d+1} oriented towards the inside,
∇²X = X″_T + (1/r) Id_d X′_N.   (12)
Curve: when the manifold is a curve parametrized by arc length,
E[ M_{u,k}^X(U) ] = ∫_u^{+∞} dx ∫_U dt E( φ_k( X″_T(t) + C(t) X′_N(t) ) / X(t) = x, X′_T(t) = 0 ) p_{X(t),X′_T(t)}(x, 0),   (13)
where C(t) is the curvature at location t and X′_N(t) is the derivative taken in
the direction of the normal to the curve at point t.
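The curvature corrections in the codimension-one and curve formulas can be checked numerically on a toy example chosen here (a linear function restricted to a circle of radius r in R²; the example and all numerical values are assumptions of this sketch, not taken from the paper). For X(x) = ⟨a, x⟩ the Euclidean Hessian vanishes, so at a critical point the whole second derivative along the curve must come from the term C(t) X′_N(t).

```python
import numpy as np

# Toy check of the curvature term in (13): X(x) = <a, x> on the circle of radius r.
# At the critical point x0 = -r*a/|a| the Euclidean Hessian X'' is zero, so the
# Hessian along the curve (curvature 1/r) should equal (1/r) * X'_N.
a, r = np.array([2.0, 1.0]), 3.0
theta0 = np.arctan2(-a[1], -a[0])        # angle of the critical point x0 = -r*a/|a|
h = 1e-4
vals = [a @ (r * np.array([np.cos(theta0 + k * h), np.sin(theta0 + k * h)]))
        for k in (-1, 0, 1)]
second_arc = (vals[0] - 2 * vals[1] + vals[2]) / (r * h) ** 2  # d^2 X / ds^2, s = arc length
normal_in = -np.array([np.cos(theta0), np.sin(theta0)])        # inward unit normal at x0
curvature_term = (1.0 / r) * (a @ normal_in)                   # C(t) * X'_N(t)
print("finite-difference Hessian along the curve:", second_arc)
print("curvature term (1/r) * X'_N              :", curvature_term)
```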
Remark: One can consider a number of variants of Rice formulae, in which we
may be interested in computing the moments of the number of roots of the equation
Z(t) = u under some additional conditions. This has been the case in the statement
of Theorem 2.3, in which we have given formulae for the first moment of the number
of zeroes of X′ at which X is bigger than u (i = 2) and at which, additionally, the
real-valued process X has a local maximum (i = 1).
We just consider below two additional examples of variants that we state here
for further reference. We limit the statements to random fields defined on subsets
of Rᵈ. Similar statements hold true when the parameter set is a general smooth
manifold. Proofs are essentially the same as the previous ones.
Variant 1: Assume that Z₁, Z₂ are Rᵈ-valued random fields defined on compact
subsets I₁, I₂ of Rᵈ and suppose that (Z_i, I_i) (i = 1, 2) satisfy the hypotheses of
Theorem 2.1 and that for every s ∈ I₁ and t ∈ I₂, the distribution of (Z₁(s), Z₂(t))
does not degenerate. Then, for each pair u₁, u₂ ∈ Rᵈ:
E[ N_{u₁}^{Z₁}(I₁) N_{u₂}^{Z₂}(I₂) ]
= ∫_{I₁×I₂} dt₁ dt₂ E( |det(Z₁′(t₁))| |det(Z₂′(t₂))| / Z₁(t₁) = u₁, Z₂(t₂) = u₂ ) p_{Z₁(t₁),Z₂(t₂)}(u₁, u₂).   (14)
Variant 2: Let Z, I be as in Theorem 2.1 and ξ a real-valued bounded random
variable which is measurable with respect to the σ-algebra generated by the process
Z. Assume that for each t ∈ I there exists a continuous Gaussian process {Yᵗ(s) :
s ∈ I}, for each s, t ∈ I a non-random function αᵗ(s) : Rᵈ → Rᵈ and a Borel-measurable
function g : C → R, where C is the space of real-valued continuous functions
on I equipped with the supremum norm, such that:
1. ξ = g( Yᵗ(·) + αᵗ(·) Z(t) ),
2. Yᵗ(·) and Z(t) are independent,
3. for each u₀ ∈ Rᵈ, almost surely the function
u ↦ g( Yᵗ(·) + αᵗ(·) u )
is continuous at u = u₀.
Then the formula
E[ ξ N_u^Z(I) ] = ∫_I E( ξ |det(Z′(t))| / Z(t) = u ) p_{Z(t)}(u) dt
holds true.
We will be particularly interested in the function ξ = 1I_{M_I < v} for some v ∈ R. We
will see later on that it satisfies the above conditions under certain hypotheses
on the process Z.

3 First Derivative, First Form

Our main goals in this and the next section are to prove existence and regularity of
the derivatives of the function u ↦ F_I(u) and, at the same time, that they satisfy
some implicit formulae that can be used to provide bounds on them. In what follows
we assume that I is a d-dimensional C^∞ manifold embedded in R^N, N ≥ d. σ and σ̃
are respectively the geometric measures on I̊ and ∂I. Unless explicitly stated
otherwise, the topology on I will be the relative topology.
In this section we prove formula (17) for F′_I(u) - which we call the first form - which
is valid for λ-almost every u, under strong regularity conditions on the paths of the
process X. In fact, the hypothesis that X is Gaussian is only used in Rice formula
itself and in Lemma 3.1, which gives a bound for the joint density
p_{X(s),X(t),X′(s),X′(t)}.
In both places, one can substitute Gaussianity by appropriate conditions that permit
one to obtain similar results.
More generally, it is easy to see that inequality (15) below is valid under quite
general non-Gaussian conditions and implies the following upper bound for the
density of the distribution of the random variable M_I:
F′_I(u) ≤ ∫_{I̊} E( φ₁(X″(t)) / X(t) = u, X′(t) = 0 ) p_{X(t),X′(t)}(u, 0) σ(dt)
+ ∫_{∂I} E( φ₁(X̄″(t)) / X(t) = u, X̄′(t) = 0 ) p_{X(t),X̄′(t)}(u, 0) σ̃(dt),   (15)
where the function φ₁ has been defined in the statement of Theorem 2.3 and X̄
denotes the restriction of X to the boundary ∂I.
Even for d = 1 (one-parameter processes) and X Gaussian and stationary, inequality (15) provides reasonably good upper bounds for F′_I(u) (see Diebolt and
Posse (1996), Azaïs and Wschebor (2001)). We will see an example for d = 2 at the
end of this section.
In the next section, we are able to prove that F_I(u) is a C¹ function and that
formula (17) can be essentially simplified by getting rid of the conditional expectation, thus obtaining the second form for the derivative. This is done under weaker
regularity conditions, but the assumption that X is Gaussian becomes essential.
In case the dimension d of the parameter is equal to 1, this is the starting point
to continue the differentiation procedure, and under hypotheses (H_{2k}) one is able to
prove that F_I is a C^k function and to obtain implicit formulae for F_I^{(k)} (see Azaïs &
Wschebor, 2001).
When d > 1, a certain number of difficulties arise and it is not clear that the
process can continue beyond k = 2. With the purpose of establishing such a formula
for F″_I, we introduce in Section 4 the helix-processes, which appear in a natural
way in these formulae and have paths possessing singularities of a certain form that
will be described precisely in that section.
Definition 3.1 Let X : I → R be a real-valued stochastic process defined on a
subset I of Rᵈ. We will say that X satisfies condition (H_k), k a positive integer, if
the following three conditions hold true:
• X is Gaussian;
• a.s. the paths of X are of class C^k;
• for any choice of pairwise different values of the parameter t₁, ..., t_n, the joint
distribution of the random variables
X(t₁), ..., X(t_n), X′(t₁), ..., X′(t_n), ....., X^{(k)}(t₁), ..., X^{(k)}(t_n)   (16)
has maximum rank. Note that the number of distinct real-valued Gaussian
variables belonging to this set (16), on account of the exchangeability of the order
of differentiation, is equal to
n [ 1 + \binom{d}{d−1} + \binom{d+1}{d−1} + ⋯ + \binom{k+d−1}{d−1} ].
The next proposition shows that there exist processes that satisfy (H_k) for every k.
Proposition 3.1 Let X = {X(t) : t ∈ Rᵈ} be a centred stationary Gaussian process having continuous spectral density f^X. Assume that f^X(x) > 0 for every x ∈ Rᵈ
and that, for any α > 0, f^X(x) ≤ C ‖x‖^{−α} holds true for some constant C and all
x ∈ Rᵈ.
Then X satisfies (H_k) for every k = 1, 2, ...

Proof: The proof is an adaptation of the proof of a related result for d = 1
(Cramér & Leadbetter (1967), p. 203).
It is well known that the hypothesis implies that the paths are C^k for every
k = 1, 2, ... As for the non-degeneracy condition, let t₁, ..., t_n be pairwise different
values of the parameter. Denote by ∂^{k₁,k₂,...,k_d} X the partial derivative of X, k₁ times
with respect to the first coordinate, k₂ times with respect to the second, ..., k_d times
with respect to the d-th coordinate. We want to prove that, for any k = 1, 2, ..., the
centred Gaussian joint distribution of the random variables
∂^{k₁,k₂,...,k_d} X(t_h),
where the d-tuple (k₁, ..., k_d) varies over the set of non-negative integers such that
k₁ + ... + k_d ≤ k and t_h varies in the set {t₁, ..., t_n}, is non-degenerate. For this purpose,
it suffices to show that if we put
Z = Σ_{h=1}^n Σ_k γ_{k₁,k₂,...,k_d,h} ∂^{k₁,k₂,...,k_d} X(t_h),
where Σ_k denotes summation over all the d-tuples of non-negative integers k₁, k₂, ..., k_d
such that k₁ + k₂ + ... + k_d ≤ k and the γ_{k₁,k₂,...,k_d,h} are complex numbers, then E(|Z|²) = 0
implies γ_{k₁,k₂,...,k_d,h} = 0 for any choice of the indices k₁, k₂, ..., k_d, h in the sum. Using
the spectral representation, and denoting x = (x₁, ..., x_d),
E(|Z|²) = Σ_{h,h′=1}^n Σ γ_{k₁,...,k_d,h} γ̄_{k₁′,...,k_d′,h′} ∫_{Rᵈ} (ix₁)^{k₁}...(ix_d)^{k_d} (−ix₁)^{k₁′}...(−ix_d)^{k_d′} exp[ i ⟨x, t_h − t_{h′}⟩ ] f^X(x) dx,
where the inner sum is over all 2d-tuples of non-negative integers k₁, ..., k_d, k₁′, ..., k_d′
such that k₁ + k₂ + ... + k_d ≤ k and k₁′ + k₂′ + ... + k_d′ ≤ k. Hence,
E(|Z|²) = ∫_{Rᵈ} | Σ_{h=1}^n Σ_k γ_{k₁,...,k_d,h} (ix₁)^{k₁}...(ix_d)^{k_d} exp[ i ⟨x, t_h⟩ ] |² f^X(x) dx.
The hypothesis on f^X implies that if E(|Z|²) = 0, then
Σ_{h=1}^n Σ_k γ_{k₁,...,k_d,h} (ix₁)^{k₁}...(ix_d)^{k_d} exp[ i ⟨x, t_h⟩ ] = 0 for all x ∈ Rᵈ.
The result follows from the fact that the set of functions x₁^{k₁}...x_d^{k_d} exp[ i ⟨x, t_h⟩ ], where
k₁, k₂, ..., k_d, h vary as above, is linearly independent.

Theorem 3.1 (First derivative, first form) Let X : I → R be a Gaussian
process, I a C^∞ compact d-dimensional manifold.
Assume that X verifies (H_k) for every k = 1, 2, ...
Then, the function u ↦ F_I(u) is absolutely continuous and its Radon-Nikodym
derivative is given for almost every u by:
F′_I(u) = (−1)ᵈ ∫_{I̊} E( det(X″(t)) 1I_{M_I ≤ u} / X(t) = u, X′(t) = 0 ) p_{X(t),X′(t)}(u, 0) σ(dt)
+ (−1)^{d−1} ∫_{∂I} E( det(X̄″(t)) 1I_{M_I ≤ u} / X(t) = u, X̄′(t) = 0 ) p_{X(t),X̄′(t)}(u, 0) σ̃(dt).   (17)

Proof : For u < v and S (respectively S) a subset of I (resp. I), let us denote
Mu,v (S) =

{t S : u < X(t) v, X (t) = 0, X (t) 0}

Mu,v (S) =

t S : u < X(t) v, X (t) = 0, X (t) 0

Step 1. Let h > 0 and consider the increment


FI (u) FI (u h) = P {MI u}

1 Muh,u (I) 1
Muh,u (I)

Let us prove that


1, Muh,u (I) 1 = o(h) as h 0.
P Muh,u (I)

(18)

In fact, for > 0 :


1, Muh,u (I) 1
P Muh,u (I)
E Muh,u (I )Muh,u (I) + E (Muh,u (I \ I )) (19)
The first term in the right-hand member of (19) can be computed by means of
a Rice-type Formula, and it can be expressed as:
u

(dt)(dt)
I I

dxdx
uh

E 1 (X (t)) 1 (X (t))/X(t) = x, X(t) = x, X (t) = 0, X (t) = 0


pX(t),X(t),X (t),X (t) (x, x, 0, 0),
20

where the function 1 has been defined in Theorem 2.3.


Since in this integral t t , the integrand is bounded and the integral is
O(h2 ).
For the second term in (19) we apply Rice formula again. Taking into account
that the boundary of I is smooth and compact, we get:
E (Muh,u (I \ I )}
u

E 1 (X (t))/X(t) = x, X (t) = 0 pX(t),X (t) (x, 0) dx

(dt)
I\I

uh

(const) h (I \ I ) (const) h.,


where the constant does not depend on h and . Since > 0 can be chosen arbitrarily
small, (18) follows and we may write:
FI (u) FI (u h)
1 + P MI u, Muh,u (I) 1 + o(h)
= P MI u, Muh,u (I)
as h 0.
Note that the foregoing argument also implies that FI is absolutely continuous
with respect to Lebesgue measure and that the density is bounded above by the
right-hand member of (17). In fact:
1 + P Muh,u (I) 1
FI (u) FI (u h) P Muh,u (I)
+ E Muh,u (I)
E Muh,u (I)
and it is enough to apply Rice Formula to each one of the expectations on the
right-hand side.
The delicate part of the proof consists in showing that we have equality in (17).
Step 2. For g : I R we put
g

= sup |g(t)|
tI

and if k is a non-negative integer,


g

,k

sup
k1 +k2 +..+kd k

21

k1 ,k2 ...,kd g

For fixed > 0 (to be chosen later on) and h > 0,we denote by Eh the event:
Eh =

,4

Because of the Landau-Shepp-Fernique inequality (see Landau-Shepp, 1970 or Fernique, 1975) there exist positive constants C1 , C2 such that
P(EhC ) C1 exp C2 h2 = o(h) as h 0
so that to have (17) it suffices to show that, as h 0 :
E
E

1I
Muh,u (I)
1IMI u 1IEh = o(h)

Muh,u (I)1

(20)

Muh,u (I) 1IMuh,u (I)1 1IMI u 1IEh = o(h)

(21)

We prove (20). (21) can be proved in a similar way.


We have:
Put Muh,u = Muh,u (I).
E

Muh,u 1IMuh,u 1 1IMI u 1IEh E (Muh,u (Muh,u 1) 1IEh )


u

(s) (t)
II

dx1 dx2
uh

E 1 (X (s)) 1 (X (t)) 1IEh /X(s) = x1 , X(t) = x2 , X (s) = 0, X (t) = 0


.pX(s),X(t),X (s),X (t) (x1, x2 , 0, 0), (22)
on applying Rice formula for the second factorial moment.
Our goal is to prove that the integrand in the right member of (22), that is:
u

dx1 dx2

As,t =
uh

E |det(X (s) det(X (t)| 1IX

(s)0,X (t)0

1IEh /X(s) = x1 , X(t) = x2 , X (s) = 0, X (t) = 0


.pX(s),X(t),X (s),X (t) (x1, x2 , 0, 0), (23)

is o(h) as h 0 uniformly on s, t. Note that when s, t vary in a domain of the form


D := {t, s I : t s > } for some > 0, then the Gaussian distribution in (23)
is non-degenerate and As,t is bounded by (const)h2 , the constant depending on the
minimum of the determinant:
det Var (X(s), X(t), X (s), X (t) ,
22

for s, t D .
So it is enough to prove that As,t = o(h) for t s small, and we may assume
that s and t are in the same chart (U, ). Writing the process in this chart we may
assume that I is a ball or a half ball in Rd . Let s, t two such points, define the
process Y = Y s,t by
Y ( ) = X s + (t s)

; [0, 1].

Under the conditioning one has:


Y (0) = x1 ,

Y (1) = x2 ,

Y (0) = Y (1) = 0

Y (0) = X (s)[(t s), (t s)] ; Y (1) = X (t)[(t s), (t s)].


Consider the interpolation polynomial Q of degree 3 such that
Q(0) = x1 ,

Q(1) = x2 ,

Q (0) = Q (1) = 0

Check that
Q(y) = x1 + (x2 x1 ) y 2 (3 2y), Q (0) = Q (1) = 6(x2 x1 )
Denote
Z( ) = Y ( ) Q( ) 0 1.
Under the conditioning, one has:
Z(0) = Z(1) = Z (0) = Z (1) = 0
and if also the event Eh occurs, an elementary calculation shows that for 0 1 :
|Z (4) ( )|
|Y (4) ( )|
= sup
(const) t s 4 h .
2!
2!
[0,1]
[0,1]

|Z ( )| sup

(24)

On the other hand, check that if A is a positive semi-definite symmetric d d


real matrix and v1 is a vector of Euclidean norm equal to 1, then the inequality
det(A) Av1 , v1

det(B)

holds true, where B is the (d 1) (d 1) matrix


B = (( Avj , vk ))j,k=2,...,d
23

(25)

and {v1 , v2 , ..., vd } an orthonormal basis of Rd containing v1 .


Assume X(s) is negative definite, and that the event Eh occurs. We can apply
(25) to the matrix A = X(s) and the unit vector
ts
.
ts

v1 =

Note that in that case, the elements of matrix B are of the form X(s)vj , vk
hence bounded by (const)h . So,

det [X (s)] X (s)v1 , v1 Cd h(d1) = Cd [Y (0)]

ts

2 (d1)

the constant Cd depending only on the dimension d.


Similarly, if X(t) is negative definite, and the event Eh occurs, then:
det [X (t)] Cd [Y (1)]

ts

2 (d1)

Hence, if C is the condition {X(s) = x1 , X(t) = x2 , X (s) = 0, X (t) = 0}:


E |det(X (s)) det(X (t))| 1IX
Cd2 h2(d1) t s

Cd2

Cd2

2(d1)

2(d1)

(s)0,X (t)0

1IEh /C

E [Y (0)] [Y (1)]

ts

Y (0) + Y (1)
2

ts

Z (0) + Z (1)
2

(const) Cd2 h2d t s

1IEh /C
1IEh /C
1IEh /C

We now turn to the density in (22) using the following Lemma which is similar
to Lemma 4.3., p. 76, in Piterbarg (1996).
Lemma 3.1 For all s, t ∈ I:
‖t − s‖^{d+3} p_{X(s),X(t),X′(s),X′(t)}(0, 0, 0, 0) ≤ D,   (26)
where D is a constant.
Proof. Assume that (26) does not hold, i.e., that there exist two convergent
sequences {sn }, {tn } in I , sn s , tn t such that
tn sn

d+3

pX(sn ),X(tn ),X (sn ),X (tn ) (0, 0, 0, 0) +


24

(27)

If s = t , (27) can not hold, since the non degeneracy condition assures that this
sequence has the finite limit t s d+3 pX(s ),X(t ),X (s ),X (t ) (0, 0, 0, 0). So, s = t .
Since one can assume with no loss of generality that I is a ball or a half ball, the
n
segment [sn , tn ] is contained in I. Denote the unit vector e1,n = ttnn s
,complete
sn
d
it to an orthonormal basis {e1,n , e2,n , ..., ed,n } of R and take a subsequence of the
integers {nk } so that ej,nk ej as k + for j = 1, ..., d. In what follows, without
loss of generality, we assume that {nk } is the sequence of all positive integers. For
each Rd we denote 1,n , ..., d,n the coordinates of in the basis {e1,n , ..., ed,n }.
Note that tn sn has coordinates (t1,n s1,n , 0, ..., 0) = ( tn sn , 0, ..., 0).
Also, we denote 1 , ..., d the coordinates of in the basis {e1 , ..., ed }
The following computation is similar to the proof of Lemma 3.2. in Azas &
Wschebor (2001). We have:
n = det Var (X(sn ), X(tn ), X (sn ), X (tn ))
X
X
X
X
= det Var X(sn ), X(tn ),
(sn ),
(tn ), ...,
(sn ),
(tn )
1,n
1,n
d,n
d,n
X
X
X
= det Var X(sn ),
(sn ), Y1,n , Z1,n ,
(sn ), Z2,n , ...,
(sn ), Zd,n
1,n
2,n
d,n
where

X
(sn )(t1,n s1,n )
1,n
X
X
2
Z1,n =
(tn )
(sn )
Y1,n
1,n
1,n
t1,n s1,n
X
X
X
X
(tn )
(sn ), ....., Zd,n =
(tn )
(sn )
Z2,n =
2,n
2,n
d,n
d,n
Using now Taylor expansions and taking into account the integrability of the supremum of bounded Gaussian process, we have:
Y1,n = X(tn ) X(sn )

Y1,n =
Z1,n

(t1,n s1,n )2 2 X
(sn ) + 1,n (t1,n s1,n )3
2
2
1,n

(t1,n s1,n )2 3 X
=
(sn ) + n (t1,n s1,n )3
3
6
1,n

2X
(sn ) + 2,n (t1,n s1,n )2 , ......,
2,n 1,n
2X
= (t1,n s1,n )
(sn ) + d,n (t1,n s1,n )2
d,n 1,n

Z2,n = (t1,n s1,n )


Zd,n

25

where the random variables 1,n , 2,n , ..., d,n , n are uniformly bounded in L2 of the
underlying probability space.
Substituting into n it follows that:
144 (t1,n s1,n )[8+2(d1)] n
X
2X
3X
2X
X
X

det Var X(s ), (s ), ...,


(s
),
(s )
(s
),
(s
),
(s
),
...,
1
2
2 1
d
d 1
(1 )3
and this limit is bounded below by a positive constant, independent of s , because
of the non-degeneracy assumption. Since t1,n s1,n = tn sn , this contradicts
(27) and finishes the proof of the Lemma.
Returning to the proof of Theorem 3.1.
To bound the expression in (22) we use Lemma 3.1 and the bound on the conditional expectation, thus obtaining
E (Muh,u (Muh,u 1)1IEh )
(const)Cd2 h2d D

ts

d+1

ds dt

II

dx1 dx2
uh

(const) h22d
since the function (s, t)
t s d+1 is Lebesgue-integrable in I I. The last
constant depends only on the dimension d and the set I, Taking small enough
(20) follows.
An example: Let {X(s, t)} be a real-valued two-parameter Gaussian, centred
stationary isotropic process with covariance . Assume that its spectral measure
is absolutely continuous with density
(ds, dt) = f ()dsdt,
So that

= (s2 + t2 ) 2 .

f ()d = 1.
0

Assume further that Jk = 0 k f ()d < , for 1 k 5. Our aim is to give an


explicit upper bound for the density of the probability distribution of MI where I
is the unit disc i. e.
I = {(s, t) : s2 + t2 1}
26

Using (15) which is a consequence of Theorem 3.1 and the invariance of the law of
the process, we have
FI (u) E 1 (X (0, 0))/X(0, 0) = u, X (0, 0) = (0, 0) pX(0,0),X (0,0) (u, (0, 0))
(1, 0))/X(1, 0) = u, X
(1, 0) = 0 p
+ 2E 1 (X
(1,0) (u, 0) = I1 + I2 . (28)
X(1,0),X
We denote by X, X , X the value of the different processes at some point (s, t);
by Xss , Xst , Xtt the entries of the matrix X and by and the standard normal
density and distribution.
One can easily check that:
X is independent of X and X , and has variance J3 Id
Xst is independent of X, X Xss and Xtt , and has variance 4 J5
Conditionally on X = u, the random variables Xss and Xtt have
expectation: J3
J (J3 )2
variance: 3
4 5
covariance: 4 J5 (J3 )2 .
Using an elementary computation we get that the expectation of the negative part
of a Gaussian variable with expectation and variance 2 is equal to

( ) (
).

We obtain
I2 =

2
(u)
J3

3
J5 (J3 )2
4

with

1
2

(bu) + J3 u(bu) ,

J3

b=

3
J
4 5

(J3 )2

1
2

As for I1 we remark that, conditionally on X = u, Xss + Xtt and Xss Xtt are
independent, so that a direct computation gives:
I1 =

1
(u)E 1 2J3 u
8J3

J5 2
(2 + 32 )
4

1I{ < 2J u} 1I
1
3
{ 1 2J3 u
27

, (29)
J5 2
(2 + 32 ) > 0}
4

Where 1 , 2 , 3 are standard independent normal random variables and 2 = 2J5


4 2 J32 . Finally we get

2
I1 =
(u)
(2 +a2 c2 x2 )(acx)+[2a2 (acx)](acx) x(x)dx,
8J3
0
with a = 2J3 u, c =

J5
.
4

4 First derivative, second form

We choose, once and for all in this section, a finite atlas A for I. Then, to every t ∈ I
it is possible to associate a fixed chart that will be denoted (U_t, φ_t). When t ∈ ∂I,
φ_t(U_t) can be chosen to be a half ball with φ_t(t) belonging to the hyperplane limiting
this half ball. For t ∈ I, let V_t be an open neighbourhood of t whose closure is included
in U_t, and ψ_t a C^∞ function such that
ψ_t ≡ 1 on V_t,   (30)
ψ_t ≡ 0 on U_t^c.   (31)
For every t ∈ I̊ and s ∈ I we define the normalization n(t, s) in the following
way:
• for s ∈ V_t, we set in the chart (U_t, φ_t)
n₁(t, s) = (1/2) ‖s − t‖².   (32)
By "in the chart" we mean that ‖s − t‖ is in fact ‖φ_t(t) − φ_t(s)‖.
• for general s we set
n(t, s) = ψ_t(s) n₁(t, s) + 1 − ψ_t(s).
Note that in the flat case (d = N) the simpler definition n(t, s) = (1/2) ‖s − t‖²
works.
For every t ∈ ∂I and s ∈ I, we set instead of formula (32)
n₁(t, s) = |(s − t)_N| + (1/2) ‖s − t‖²,
where (s − t)_N is the normal component of (s − t) with respect to the hyperplane
delimiting the half ball φ_t(U_t). The rest of the definition is the same.

Definition 4.1 We will say that f is a helix-function - or an h-function - on I
with pole t ∈ I̊, satisfying hypothesis H_{t,k}, k an integer, k > 1, if
• f is a bounded C^k function on I\{t};
• f̄(s) := n(t, s) f(s) can be prolonged as a function of class C^k on I.
Definition 4.2 In the same way, Z is called an h-process with pole t ∈ I, satisfying
hypothesis H_{t,k}, k an integer, k > 1, if
• Z is a Gaussian process with C^k paths on I\{t};
• for t ∈ I̊: Z̄(s) := n(t, s) Z(s) can be prolonged as a process of class C^k on I,
with Z̄(t) = 0, Z̄′(t) = 0. If s₁, ..., s_m are pairwise different points of I\{t},
then the distribution of
( Z̄″(t), ..., Z̄^{(k)}(t), Z(s₁), ..., Z^{(k)}(s₁), ..., Z^{(k)}(s_m) )
does not degenerate;
• for t ∈ ∂I: Z̄(s) := n(t, s) Z(s) can be prolonged as a process of class C^k on I
with Z̄(t) = 0, Z̄′(t) = 0, and if s₁, ..., s_m are pairwise different points of I\{t},
then the distribution of
( Z̄′_N(t), Z̄″(t), ..., Z̄^{(k)}(t), Z(s₁), ..., Z^{(k)}(s₁), ..., Z^{(k)}(s_m) )
does not degenerate. Z̄′_N(t) is the derivative normal to the boundary of I at t.
We use the terms h-function and h-process since the function and the paths
of the process need not extend to a continuous function at the point t. However, the
definition implies the existence of radial limits at t, so the process may take the
form of a helix around t.
Lemma 4.1 Let X be a process satisfying (H_k), k ≥ 2, and let f be a C^k function I → R.
For s ∈ I, s ≠ t, set:
(A) For t ∈ I̊,
X(s) = a_{ts} X(t) + ⟨ b_{ts}, X′(t) ⟩ + n(t, s) Xᵗ(s),
where a_{ts} and b_{ts} are the regression coefficients.
In the same way, set
f(s) = a_{ts} f(t) + ⟨ b_{ts}, f′(t) ⟩ + n(t, s) fᵗ(s),
using the regression coefficients associated to X.
(B) For t ∈ ∂I and s ∈ I, s ≠ t, set
X(s) = ã_{ts} X(t) + ⟨ b̃_{ts}, X̄′(t) ⟩ + n(t, s) Xᵗ(s)
and
f(s) = ã_{ts} f(t) + ⟨ b̃_{ts}, f̄′(t) ⟩ + n(t, s) fᵗ(s).
Then s ↦ Xᵗ(s) and s ↦ fᵗ(s) are respectively an h-process and an h-function with
pole t satisfying H_{t,k}.
Proof: We give the proof in the case t ∈ I̊, the other one being similar. In fact,
the quantity denoted by n(t, s) Xᵗ(s) is just X(s) − a_{ts} X(t) − ⟨ b_{ts}, X′(t) ⟩. On L²(Ω, P),
let π be the projector onto the orthogonal complement of the subspace generated by
X(t), X′(t). Using a Taylor expansion,
X(s) = X(t) + ⟨ (s − t), X′(t) ⟩ + ‖t − s‖² ∫₀¹ X″( (1 − α) t + α s ){v, v} (1 − α) dα,
with v = (s − t)/‖s − t‖. This implies that
Xᵗ(s) = ( ‖t − s‖² / n(t, s) ) ∫₀¹ π[ X″( (1 − α) t + α s ) ]{v, v} (1 − α) dα,   (33)
which gives the result, due to the non-degeneracy condition.


We state now an extension of Ylvisaker's Theorem (1968) on the regularity of
the distribution of the maximum of a Gaussian process, which we will use in the
proof of Theorem 4.2 and which might have some interest in itself.
Theorem 4.1 Let Z : T → R be a Gaussian separable process on some parameter
set T and denote by M^Z = sup_{t∈T} Z(t), which is a random variable taking values in
R ∪ {+∞}. Assume that there exist σ₀ > 0, m₋ > −∞ such that
m(t) = E(Z_t) ≥ m₋,  σ²(t) = Var(Z_t) ≥ σ₀²
for every t ∈ T. Then the distribution of the random variable M^Z is the sum of an
atom at +∞ and a (possibly defective) probability measure on R which has a locally
bounded density.
Proof: Suppose first that X : T → R is a Gaussian separable process satisfying
Var(X_t) = 1 ;  E(X_t) ≥ 0,
for every t ∈ T. A close look at Ylvisaker's proof (1968) shows that the distribution
of the supremum M^X has a density p_{M^X} that satisfies
p_{M^X}(u) ≤ ψ(u) := exp(−u²/2) / ∫_u^{+∞} exp(−v²/2) dv  for every u ∈ R.   (34)
Let now Z satisfy the hypotheses of the theorem. For given a, b ∈ R, a < b,
choose A ∈ R⁺ so that |a| < A and consider the process
X(t) = ( Z(t) − a ) / σ(t) + ( |m₋| + A ) / σ₀.
Clearly, for every t ∈ T:
E( X(t) ) = ( m(t) − a ) / σ(t) + ( |m₋| + A ) / σ₀ ≥ −( |m₋| + |a| ) / σ₀ + ( |m₋| + A ) / σ₀ ≥ 0,
and
Var( X(t) ) = 1,
so that (34) holds for the process X.
On the other hand:
{ a < M^Z ≤ b } ⊂ { ( |m₋| + A ) / σ₀ < M^X ≤ ( |m₋| + A ) / σ₀ + ( b − a ) / σ₀ }.
And it follows that
P( a < M^Z ≤ b ) ≤ ∫_{(|m₋|+A)/σ₀}^{(|m₋|+A)/σ₀ + (b−a)/σ₀} ψ(u) du = ∫_a^b (1/σ₀) ψ( (v − a + |m₋| + A) / σ₀ ) dv,
which shows the statement.
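For numerical purposes it may help to note that the bound ψ in (34) is, up to the normalization of the Gaussian density, the standard Gaussian hazard function; a short sketch (using scipy, an assumption of this illustration) evaluates it at a few points.

```python
import numpy as np
from scipy.stats import norm

# psi(u) = exp(-u^2/2) / int_u^infty exp(-v^2/2) dv.  Both numerator and denominator
# omit the factor 1/sqrt(2*pi), so psi equals the Gaussian hazard phi(u)/(1 - Phi(u)).
for u in (-2.0, 0.0, 1.0, 2.0, 3.0):
    psi = norm.pdf(u) / norm.sf(u)
    print(f"u = {u:+.1f}   psi(u) = {psi:.4f}")
```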


Set now f(s) ≡ 1. The key point is that, due to the regression formulae, under the
condition {X(t) = u, X′(t) = 0} the event
A_u(X, 1) := { X(s) ≤ u, ∀s ∈ I }
coincides with the event
A_u(Xᵗ, βᵗ) := { Xᵗ(s) ≤ βᵗ(s) u, ∀s ∈ I\{t} },
where Xᵗ and βᵗ are the h-process and the h-function defined in Lemma 4.1, the
latter obtained by taking f ≡ 1.

Theorem 4.2 (First derivative, second form) Let X : I → R be a Gaussian
process, I a C^∞ compact manifold contained in Rᵈ.
Assume that X has paths of class C² and that for s ≠ t the triplet (X(s), X(t), X′(t))
in R × R × Rᵈ has a non-degenerate distribution.
Then, the result of Theorem 3.1 is valid and the derivative F′_I(u) given by relation
(17) can be written as
F′_I(u) = (−1)ᵈ ∫_{I̊} E( det( X̄ᵗ″(t) − β̄ᵗ″(t) u ) 1I_{A_u(Xᵗ, βᵗ)} ) p_{X(t),X′(t)}(u, 0) σ(dt)
+ (−1)^{d−1} ∫_{∂I} E( det( X̄ᵗ″(t) − β̄ᵗ″(t) u ) 1I_{A_u(Xᵗ, βᵗ)} ) p_{X(t),X̄′(t)}(u, 0) σ̃(dt),   (35)
and this expression is continuous as a function of u.
The notation X̄ᵗ″(t) should be understood in the sense that we first define X̄ᵗ and
then calculate its second derivative along I.
Proof: As a first step, assume that the process X satisfies the hypotheses of
theorem 3.1, which are stronger that those in the present theorem.
We prove that the first term in (17) can be rewritten as the first term in (35).
One can proceed in a similar way with the second term, mutatis mutandis. For that
purpose, use the remark just before the statement of Theorem 4.2 and the fact that
under the condition
X(t) = u, X (t) = 0
, X (t) is equal to
X t (t) t (t)u.
Replacing in the conditional expectation in (17) and on account of the Gaussianity of the process, we get rid of the conditioning and obtain the first term in
(35).
We now study the continuity of u
FI (u). The variable u appears at three
locations
in the density pX(t),X (t) (u, 0) which is clearly continuous
in
E det X t (t) t (t)u 1IAu (X t , t )
where it occurs twice: in the first factor and in the indicator function.

32

Due to the integrability of the supremum of bounded Gaussian processes, it is


easy to prove that this expression is continuous as a function of the first u.
As for the u in the indicator function, set
v := det X t (t) t (t)v

(36)

and, for h > 0, consider the quantity


E v 1IAu (X t , t ) E v 1IAuh (X t , t )
which is equal to
E v 1IAu (X t , t )\Auh (X t , t ) E v 1IAuh (X t , t )\Au (X t , t )

(37)

Apply Schwarzs inequality to the first term in (37).


E v 1IAu (X t , t )\Auh (X t , t ) E(v2 )P{Au (X t , t )\Auh (X t , t )}

1/2

The event Au (X t , t )\Auh (X t , t ) can be described as


s I\{t} : X t (s) t (s)u 0 ; s0 I\{t} : X t (s0 ) t (s0 )(u h) > 0
This implies that t (s0 ) > 0 and that
t

sup X t (s) t (s)u 0.


sI\{t}

Now, observe that our improved version of Ylvisakers theorem (Theorem 4.1),applies
to the process s
X t (s) t (s)u defined on I\{t}. This implies that the first term
in (37) tends to zero as h 0. An analogous argument applies to the second term.
Finally, the continuity of FI (u) follows from the fact that one can pass to the limit
under the integral sign in (35).
To finish the proof we still have to show that the added hypotheses are in fact
unnecessary for the validity of the conclusion. Suppose now that the process X
satisfies only the hypotheses of the theorem and define
X (t) = Z (t) + Y (t)

(38)

where for each > 0, Z is a real-valued Gaussian process defined on I, measurable with respect to the -algebra generated by {X(t) : t I}, possessing C
33

paths and such that almost surely Z (t), Z (t), Z (t) converge uniformly on I to
X(t), X (t), X (t) respectively as 0. One standard form to construct such an approximation process Z is to use a C partition of the unity on I and to approximate
locally the composition of a chart with the function X by means of a convolution
with a C kernel.
In (38), Y denotes the restriction to I of a Gaussian centred stationary process
satisfying the hypotheses of proposition 3.1, defined on RN , and independent of
X. Clearly X satisfies condition (Hk ) for every k, since it has C paths and
the independence of both terms in (38) ensures that X inherits from Y the nondegeneracy condition in Definition 3.1. So, if
MI = max X (t) and FI (u) = P{MI u}
tI

one has
FI (u) = 1
+ 1

d1

E det X

(t)

E det X

(t)

(t)u 1IAu (X

(t)u 1IAu (X

t , ,t )

t , t )

pX

pX

(t),X (t) (u, 0)(dt)

(dt),
(t) (u, 0)
(t),X

(39)

We want to pass to the limit as 0 in (39). We prove that the right-hand member
is bounded if is small enough and converges to a continuous function of u as 0.
Since MI MI , this implies that the limit is continuous and coincides with FI (u)
by a standard argument on convergence of densities. We consider only the first term
in (39), the second is similar.
The convergence of X and its first and second derivative, together with the
non-degeneracy hypothesis imply that uniformly on t I, as 0 :
pX

(t),X (t) (u, 0)

pX(t),X (t) (u, 0).

The same kind of argument can be used for


det X

(t)

(t)u ,

on account of the form of the regression coefficients and the definitions of X t and
t . The only difficulty is to prove that, for fixed u:
P{C C} 0 as

0,

where
C = Au (X t , t )
34

(40)

C = Au (X t , t )
We prove that
a. s. 1IC 1IC as

0,

(41)

which implies (40).


First of all, note that the event
L=

sup

X t (s) t (s)u = 0

sI\{t}

has zero probability, as already mentioned.


Second, from the definition of X t (s) and the hypothesis, it follows that , as 0,
X ,t (s), ,t (s) converge to X t (s), t (s) uniformly on I\{t}. Now, if
/ C, there
exists s = s() I\{t} such that
X t (
s) t (
s)u > 0
and for

> 0 small enough, one has


X t (
s) t (
s)u > 0,

which implies that


/C.
On the other hand, let C\L. This implies that
sup

X t (s) t (s)u < 0.

sI\{t}

From the above mentioned uniform convergence, it follows that if


enough, then
sup X t (s) t (s)u < 0,

> 0 is small

sI\{t}

hence C . (41) follows.


So, we have proved that the limit as 0 of the first term in (39) is equal to the
first term in (35). Since,if > 0 is small enough the integrand is bounded for t I
and u in a compact interval of the real line.
It remains only to prove that the first term in (35) is a continuous function of u.
For this purpose, it suffices to show that the function
u

P{Au (X t , t )}.

35

is continuous. This is a consequence of the inequality


P{Au+h (X t , t )}P{Au (X t , t )} P

sup

X t (s) t (s)u

|h| sup | t (s)|

sI\{t}

sI\{t}

and of Theorem 4.1, applied once again to the process s ↦ Xᵗ(s) − βᵗ(s) u defined
on I\{t}.

5 Second derivative

Theorem 5.1 Suppose now that I is a d-dimensional smooth manifold without
boundary and that the process X satisfies hypothesis (H₄). Then the distribution of
M_I admits a density which is absolutely continuous and its derivative satisfies:

FI (u) = 1

(1,0)

E det X t (t) t (t)u 1IAu pX(t),X (t) (u, 0)dt


n

t (t)

E
I

i,j

i,j=1

ds t (s)

dt
I

Ci,j (u) 1IAu pX(t),X (t) (u, 0)dt

I
t

E det X (s) (s)u det X (t) (t)u 1IAu /X t (s) = t (s)u, X t (s) = t (s)u
pX t (s),X t (s) t (s)u, t (s)u pX(t),X (t) u, 0 +
S d1

(dw)E det X tT (w) tT (w)x v 1IAu /X tN (w) = tN (w)x, X tN T (w) = tN T (w)x


pX tN (w),X tN T (w) tN (w)x, tN T (w)x

(42)

(1,0)

Where Au stands for Au (X t , t ) and pX(t),X (t) , Ci,j (u), X tT , X tN , X tN T , tT , tN and


tN T , are defined in the proof.
Proof: We have to check that the expression given in Theorem 4.2 that now
takes the form
FI (u) = 1

d
I

E det X t (t) t (t)u 1IAu (X t , t ) pX(t),X (t) (u, 0)(dt)

(43)

is differentiable with respect to u. A sufficient condition is the integrand itself to be


differentiable with a derivative integrable in t, u , t I, u in a compact interval.
36

The derivative of the integrand in (43) is the sum of the three derivatives corresponding to the three locations where the variable u appears, namely :
in the density pX(t),X (t) (u, 0) which is clearly differentiable with bounded
derivative :
(1,0)
pX(t),X (t) (u, 0).
This gives the first term in (42).
In the derivative with respect to the first occurrence of u in
E det X t (t) t (t)u 1IAu (X t , t ) .
The derivative of which is
d

t (t)

i,j

Ci,j (u) 1IAu (X t , t ) ,

i,j=1

where Ci,j (u) is the cofactor of location (i, j) in the matrix X t (t) t (t)u.
This quantity is uniformly bounded when u varies in a compact interval, which
follows easily from an expression of the type (33). This gives the second term
in (42).
in the derivative with respect to the second occurrence of u in
E det X t (t) t (t)u 1IAu (X t , t ) .
To evaluate this derivative define v as in (36) and set for sufficiently small:
I := I\B(t, ) ; Au = Au (X t , t ) := X t (s) t (s)u, s I ,
B(t, ) being the ball with center t and radius in the chart (t , Ut ). By dominated
convergence
E v 1IAu+h (X t , t ) 1IAu (X t , t )

= lim E v ( 1IAu+h (X t , t ) 1IAu (X t , t )


0

On I , X t is a process satisfying H4. In the same manner as in Lemma 3.3 of


Azas and Wschebor (2001), we can generalize the proof of Theorem 3.1 to the case

37

of a non constant function t and a non constant random variable v to obtain


E v ( 1IAu+h 1IAu )
u+h

dx
I

(ds)(1)d t (s)E det Y t (s) v 1IAx /X t (s) = t (s)x, X t (s) = t (s)x


pX t (s),X t (s) t (s)x, t (s)x

u+h

dx
u

S(t,)

t (s) = t (s)x

(ds)(1)d1 t (s)E det Y t (s) v 1IAx /X t (s) = t (s)x, X


pX t (s),X t (s) t (s)x, t (s)x = I1 + I2 , (44)

where S(t, ) is the sphere with centre t and radius , Y t (s) = X t (s) t (s)x ,
t (s) t (s)x.
Y t (s) = X
Let us prove that the first integral converges as 0. The only problem is the
behaviour around t. So it is sufficient to prove the convergence locally around t in
the chart (t , Ut ) with s in Vt which implies that n(s, t) = 12 t s 2 . Without loss
of generality we may assume that the representation of t in this chart is the point 0
in Rd . To study the behaviour of the integrand as s 0, we choose an orthonormal
basis with ss as first vector and set s = (, 0, ...0)T . At s = 0 the process X t and
its derivative have the following expansions (for short, derivatives are indicated by
sub-indices).
1 2
3
4
X 11 + X 111 + X 1111 + o(4 )
2
6
24
2
3
X t

(s) = X t1 (s) = X 11 + X 111 + X 1111 + o(3 )


s1
2
6
2 t
2
X

(s) = X t11 (s) = X 11 + X 111 + X 1111 + o(2 )


2
s1
2
t
X
(s) = X tj (s) = X 1j + O(2 ) (j = 1),
sj
X t (s) =

where X ij = X tij (0), X 111 =

3X t
(0)
3 s1

and X 1111 =

4X t
(0)
4 s1

Since

X t (s) = m(s).X t
with
m(s) := 1/n(s, 0) =

s21

2
4
m(s)
(, 0...0) = 3
;
2
+ ... + sd
s1

38

(45)
(46)
(47)
(48)

m(s)
2 m(s)
12
(, 0...0) = 0 (i = 1) ;
(, 0...0) = 4
2
si
s1

4
2 m(s)
2 m(s)
(,
0...0)
=
(i
=
1)
;
(, 0...0) = 0 (i = j).
s2i
4
si sj
Using derivation rules, we get

X t (s) = X 11 + X 111 + O(2 )


3
X t
1
(s) = X1t (s) =
X + O()
s1
3 111
2X t
1
t
(s) =
(s) = X11
+ O()
X
2
s1
6 1111
X t
2
(s) = Xjt (s) =
X + O(1) (j = 1)
sj
1j
2X t
2
(s) = 2 X ij + O(1 ) (j = 1) (i = 1) (i = j)
si sj

(49)
(50)
(51)
(52)
(53)

from that we deduce that


pX t (s),X t (s) t (s)x, t (s)x (const)(d1)
Since if Ax occurs and X t (s) = t (s)x, then a.s. the matrix X t (s) t (s)x is
definite negative, and using relation (25), we have
d

det X t (s) t (s)x

|Xiit (s) iit (s)x|

i=1

where the notation iit (s) has an obvious meaning. The condition
C(s) = X t (s) = t (s)x ; X t (s) = t (s)x
converges as s 0 to the condition
X 11 = 11 x ; X 111 = 111 x ; X ti (0) = ti (0)x (i = 2, d)
which is non singular (again the notations 11 , 111 , ti are obvious). Consider
a Gaussian variable which is measurable with respect to the process and which is
39

bounded in probability. then its distribution conditional to C(s) remains bounded


in probability. Since for i = 1,
Xiit (s) iit (s)x =

1
2
2
1
X
+
X

x = Op (2 ),
ii
11
2
2
2 ii
2 11

this variable has the same order of magnitude under C(s).


On the other hand, under C(s)
X11 (s) = Op (1) ; 11 (s) = O(1)
Finally we get:
E

det X t (s) t (s)x v 1IA


= O(2(d1)
x

Since ( t (s) is bounded we see that the integrand is O(1d ) which ensures
convergence of I1 as 0. One easily check that the bound for the integrand is
uniform in t.
We consider now the limit of I2 as 0. It is enough to prove that for each
x R the expression
u+h

d1

(1)

(ds) t (s)

dx
u

S(t,)

t
t
t (s) = t (s)x p t t
E det X (s) v 1IAx /X t (s) = t (s)x, X
X (s),X (s) (s)x, (s)x .
(54)

converges boundedly as 0. Make in (54), the change of variable s = t + w,


w S d1 , and it becomes
u+h

(1)d1
u

2(d1)

(dw) t (t + w)

dx
S d1

t (t+w) v 1IA /X t (t+w) = t (t+w)x, X


t (t+w) = t (t+w)x
det X
x
p t
t (t + w)x, t (t + w)x , (55)
t
X (t+w), X (t+w)

where we have used that


PX,Y (x, u) = d1 PX,Y (x, u)
if (X, Y ) is a random vector in R Rd1 and > 0.
Now consider the following decomposition in block of the matrix X t (t) written
in a basis with first vector equal to w S d1
40

X tN (w) is the second derivative of X t in the direction w : wT X t w .


X tT (w) is the (d 1) (d 1) matrix that consists of the second derivatives
of X t in the direction that are orthogonal to w.
X tN T (w) is the (d 1) vector that consist of the cross second derivative X t ,
one in the direction w one in the d 1 direction orthogonal to w.
We have that the expression of X t (t) in the new basis is
X tN (w)
X tN T (w)

X tN T (w)
X tT (w)

We make the same decomposition with obvious notations for t (t).


Relations (49) to (52) imply that as 0
2(d1) pX t (t+w),X t (t+w) t (t+w)x, t (t+w)x pX tN (w),X tN T (w) tN (w)x, tN T (w)x
(56)
and (53) implies that
2 t
2 t
(t + w) tT (w).
X (t + w) X tT (w) ,
2
2
Noting that 1IAx 1IAx we get that as 0:
2(d1)
t (t+w) v 1IA /X t (t+w) = t (t+w)x, X
t (t+w) = t (t+w)x
E det X
x
2(d1)
t
E det X tT (w) tT (w)x v 1IAx /X N
(w) = tN (w)x, X tN T (w) = tN T (w)x .
Remarking that the integrand is uniformly bounded, we are ready to pass to the
limit and get the result.

6 Asymptotic expansion of F′(u) for large u

Corollary 6.1 Suppose that the process X satisfies the conditions of Theorem 4.2
and that in addition E(X_t) = 0 and Var(X_t) = 1.
Then, as u → +∞, F′_I(u) is equivalent to
( uᵈ / (2π)^{(d+1)/2} ) e^{−u²/2} ∫_I ( det(Λ(t)) )^{1/2} dt,   (57)
where Λ(t) is the variance-covariance matrix of X′(t).
Proof: Set r(s, t) := E( X(s) X(t) ), and for i, j = 1, ..., d,
r_{i;}(s, t) := ∂r(s, t)/∂s_i ;  r_{ij;}(s, t) := ∂²r(s, t)/∂s_i ∂s_j ;  r_{i;j}(s, t) := ∂²r(s, t)/∂s_i ∂t_j.
For every t, i and j,
r_{i;}(t, t) = 0,  Λ_{ij}(t) = r_{i;j}(t, t) = −r_{ij;}(t, t).
Thus X(t) and X′(t) are independent. Regression formulae imply that
a_{ts} = r(s, t),  βᵗ(s) = ( 1 − r(t, s) ) / n(t, s).
This implies that β̄ᵗ″(t) = Λ(t) and that the possible limit values of βᵗ(s) as s → t
are in the set { vᵀ Λ(t) v : v ∈ S^{d−1} }. Due to the non-degeneracy condition these
quantities are bounded below by a positive constant. On the other hand, for s ≠ t,
βᵗ(s) > 0. This shows that for every t ∈ I one has inf_{s∈I} βᵗ(s) > 0. Since for every
t ∈ I the process Xᵗ is bounded, it follows that
a.s. 1I_{A_u(Xᵗ, βᵗ)} → 1 as u → +∞.
Also
det( X̄ᵗ″(t) − β̄ᵗ″(t) u ) ∼ (−1)ᵈ det( Λ(t) ) uᵈ.
A dominated convergence argument shows that the first term in (35) is equivalent
to
∫_I uᵈ det( Λ(t) ) (2π)^{−1/2} e^{−u²/2} (2π)^{−d/2} ( det(Λ(t)) )^{−1/2} dt = ( uᵈ / (2π)^{(d+1)/2} ) e^{−u²/2} ∫_I ( det(Λ(t)) )^{1/2} dt.
The same kind of argument shows that the second term is O( u^{d−1} e^{−u²/2} ), which
completes the proof.
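A direct numerical reading of (57), in a toy setting whose ingredients (d = 2, unit variance, Λ(t) ≡ λ₂ I₂, I the unit disc) are assumptions of this sketch and not computations from the paper: then ∫_I (det Λ(t))^{1/2} dt = π λ₂ and the leading term of F′_I(u) can be evaluated for a few levels u.

```python
import numpy as np

# Hedged evaluation of the equivalent (57) in an assumed toy setting:
# d = 2, E X = 0, Var X = 1, Lambda(t) = lambda2 * I_2, I = unit disc, so that
# int_I det(Lambda(t))^{1/2} dt = lambda2 * pi and
#   F'_I(u) ~ u^2 * exp(-u^2/2) / (2*pi)^{3/2} * lambda2 * pi   as u -> +infinity.
lambda2 = 1.0
integral_term = lambda2 * np.pi
for u in (2.0, 3.0, 4.0, 5.0):
    lead = u ** 2 * np.exp(-u ** 2 / 2) / (2 * np.pi) ** 1.5 * integral_term
    print(f"u = {u:.0f}   leading term of F'_I(u) ~ {lead:.3e}")
```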

References

Adler, R.J. (1990). An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes. IMS, Hayward, CA.
Azaïs, J-M. and Delmas, C. (2002). Asymptotic expansions for the distribution of the maximum of a Gaussian random field. To appear in Extremes.
Azaïs, J-M. and Wschebor, M. (1999). Régularité de la loi du maximum de processus gaussiens réguliers. C.R. Acad. Sci. Paris, t. 328, série I, 333-336.
Azaïs, J-M. and Wschebor, M. (2001). On the regularity of the distribution of the maximum of one-parameter Gaussian processes. Probab. Theory Relat. Fields, 119, 70-98.
Brillinger, D. R. (1972). On the number of solutions of systems of random equations. The Annals of Math. Statistics, 43, 534-540.
Cabaña, E. M. (1985). Esperanzas de integrales sobre conjuntos de nivel aleatorios (Spanish). Actas del segundo Congreso latinoamericano de probabilidades y estadística matemática, Caracas, 65-81.
Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
Cucker, F. and Wschebor, M. (2003). On the Expected Condition Number of Linear Programming Problems. To appear in Numerische Mathematik.
Diebolt, J. and Posse, C. (1996). On the Density of the Maximum of Smooth Gaussian Processes. Ann. Probab., 24, 1104-1129.
Federer, H. (1969). Geometric Measure Theory. Springer-Verlag, New York.
Fernique, X. (1975). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de Saint-Flour. Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
Kobayashi, S. and Nomizu, K. (199?). Foundations of Differential Geometry, T. 2. J. Wiley & Sons, New York.
Landau, H.J. and Shepp, L.A. (1970). On the supremum of a Gaussian process. Sankhyā Ser. A, 32, 369-378.
Lifshits, M.A. (1995). Gaussian Random Functions. Kluwer, The Netherlands.
Milnor, J. W. (1965). Topology from the Differentiable Viewpoint. The University Press of Virginia, Charlottesville.
Piterbarg, V. I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and Fields. American Mathematical Society, Providence, Rhode Island.
Piterbarg, V. I. (1996b). Rice's Method for Large Excursions of Gaussian Random Fields. Technical Report No. 478, University of North Carolina. Translation of Rice's method for Gaussian random fields.
Taylor, J.E. and Adler, R. (2002). Euler characteristics for Gaussian fields on manifolds. Preprint.
Tsirelson, V.S. (1975). The Density of the Maximum of a Gaussian Process. Th. Probab. Appl., 20, 847-856.
Weber, M. (1985). Sur la densité du maximum d'un processus gaussien. J. Math. Kyoto Univ., 25, 515-521.
Ylvisaker, D. (1968). A Note on the Absence of Tangencies in Gaussian Sample Paths. The Ann. of Math. Stat., 39, 261-262.

44

Citations
From References: 1
From Reviews: 0

Article

MR2478201 (Review) 60-02 (60E15 60G05 60G15 60G60 60G70)


Azas, Jean-Marc (F-TOUL3-SPM); Wschebor, Mario (UR-UREPS-CM)
Level sets and extrema of random processes and fields.
John Wiley & Sons, Inc., Hoboken, NJ, 2009. xii+393 pp. $110.00. ISBN 978-0-470-40933-6
This book presents modern developments on the following two subjects: understanding the properties of level sets of a given random field X = (Xt , t T ) and analysis and computation of the
distribution function of the random variable MT = suptT X(t), provided that X is real-valued.
Chapter 1 of the book contains a number of fundamental classical results on stochastic processes,
for example, Kolmogorovs consistency theorem and the 0-1 law for Gaussian processes, but
a particular emphasis is placed on sufficient conditions for continuity, Holder continuity and
differentiability of trajectories of stochastic processes. Most of the results on path regularity are
not restricted to the Gaussian case, and many apply to the multiparameter (i.e. random field)
setting. The last section of this chapter contains Bulinskayas sufficient condition for a oneparameter process not to have almost surely critical points in a given level set, plus an extension
of Ylvisakers theorem in the Gaussian case. Specifically, it is shown here that when the mean
of the Gaussian process is bounded from below and its variance is bounded away from zero, the
supremum of the process over a given fixed parameter set has probability distribution equal to
the sum of an atom at infinity and a (possibly degenerate) probability measure on the reals with a
locally bounded density. The end-of-chapter exercises include derivation of regularity properties
of the paths of fractional Brownian motion and Brownian local time.
Chapter 2 opens with the proof of the latest (2002) refinement of the Slepian inequalities,
due to W. V. Li and Q. M. Shao [Probab. Theory Related Fields 122 (2002), no. 4, 494508;
MR1902188 (2003b:60034)], where the difference between the cumulative distribution functions
of two centered Gaussian n-dimensional vectors (with variances normalized to one and arbitrary
n 2) both evaluated at a given point a Rn is bounded above by the following sum:
1
2

X
(arcsin rij
1i<jn

Y +
arcsin rij
) exp

a2i + a2j

,
2(1 + ij )

X
Y
and rij
are covariances between Xi and Xj and between Yi and Yj , respectively, and
where rij
X
Y
ij = max(|rij |, |rij
|). Two more related comparison lemmas are stated. One of these is the
well-known Sudakov-Fernique inequality showing that if variances of arbitrary increments of a
Gaussian process X are less than or equal to variances of similar increments of a Gaussian process
Y then the mean of the supremum of X is less than or equal to the mean of the supremum of
Y , provided that the two Gaussian processes are separable centered with almost surely bounded
paths. Next the authors present the proof due to C. Borell of Ehrhards inequality [C. R. Math.
Acad. Sci. Paris 337 (2003), no. 10, 663666; MR2030108 (2004k:60102)] valid for general Borel
subsets of Rn (with no restrictions on the convexity of those sets). Namely, let n be the standard
Gaussian probability measure on Rn . Then for any pair A and B of Borel sets in Rn and all

(0, 1), the following inequality holds:


1 (n (A + (1 )B)) 1 (n (A)) + (1 )1 (n (B)).
The authors then derive a version of a Gaussian isoperimetric inequality and use it to prove the
Borell-Sudakov-Tsirelson inequality, which gives an exponential bound for P (|MT (MT )| >
x), where MT is the supremum of a Gaussian process over [0, T ] and (MT ) is the median
of distribution of MT . The next inequality for the tails of the distribution of the supremum is
similar but involves the mean of MT rather than the median and is due to Ibragimov, who proved
the inequality using stochastic analysis tools. Chapter 2 concludes with the proof of Dudleys
inequality, which establishes an upper bound on the mean of the supremum (of a possibly nonGaussian process) in terms of an integral of a square-root of the logarithm of covering numbers.
Chapter 3 is entirely devoted to the treatment of Rice formulas for one-parameter processes
and centers on integral representations of moments of the number of (up- and down-) crossings
for both Gaussian and non-Gaussian processes having continuously differentiable sample paths.
Formal proofs of these results are preceded by nice intuitive discussions, whereas at the end of the
chapter the authors suggest a number of useful exercises.
Chapter 4 starts with the application of Rice formulas to derive bounds for the tails of the distribution of the maximum of one-parameter Gaussian processes with continuously differentiable
sample paths and, in the stationary case, to subsequently characterize the asymptotic behavior of
P (MT > x) as x . This chapter also contains two detailed examples of statistical applications of the distribution of the maximum to genetics and to the study of mixtures of Gaussian
distributions. In the first case the problem is that of testing that a given putative gene has no influence on a given quantitative trait within the classical framework of a linear model with i.i.d.
errors. In the second case the problem is that of testing
H0 : Y N (, 2 )
versus
H1 : Y pN (1 , 2 ) + (1 p)N (2 , 2 ),
first under the assumption that = 1 = 0 and 2 R while 2 = 1 (which corresponds to a
simple Gaussian mixture model), next under no additional assumptions on the means but 2 =
1 (i.e. test of one population versus two when variance is known), and finally with no additional
assumptions on either means or variance (i.e. test of one population versus two when variance is
unknown). Since the distribution of the likelihood ratio test (LRT) statistic is related to that of the
maximum of a rather regular Gaussian process, the authors use the Rice formulas to address the
question of whether the power of the LRT is influenced significantly by the size of the interval(s)
in which the parameters live and whether the LRT is more powerful than the hypothesis tests based
on moments (the answer to the latter question turns out to be negative).
The next chapter focuses on both theoretical and numerical analysis of the Rice series, which are
representations of the distribution function of the maximum of a given stochastic process in terms
of series of factorial moments of the number of up-crossings of the underlying process. The authors
prove two key results. The first is applicable to both non-Gaussian and Gaussian cases but assumes
that the underlying process X has C sample paths and establishes a general sufficient condition
on the distribution of X and its derivatives such that the following Rice series representation of

the cumulative distribution function FMT of the maximum of X in terms of factorial moments m
of the number of up-crossings of X of a given level u, starting below u at time 0, holds:

()

1 FMT = P (X(0) > u) +

(1)m+1
m=1

m
.
m!

Moreover, when the infinite series is truncated, the error bound for the resulting approximation is
also given. The second key result shows that for a Gaussian centered and stationary process on R
with covariance such that (0) = 1 and has a Taylor expansion at zero which is absolutely
convergent at t = 2T , the conditions of the above general Rice series theorem are satisfied and thus
representation () is valid. Much of the remainder of the chapter is devoted to efficient numerical
computation of the factorial moments of up-crossings, which is important for applications of the
Rice series. In particular it is shown that the Rice series approach is a priori better than the Monte
Carlo method (in terms of comparison of the complexities of the computation of the distribution
of the maximum) and, for standard error bounds, allows one to compute the desired distribution
with just a few terms of the Rice series. Chapter 5 concludes with a modification of the general
Rice series theorem discussed earlier to include continuous processes that do not have sufficiently
differentiable paths, which is achieved by employing in the series the factorial moments of upcrossings of an -mollified version (with > 0) of the underlying process and then taking to
0.
Chapter 6 revisits the subject of Rice formulas but in a much richer multiparameter setting.
The authors start by proving the area formula, then establish Rice formulas for the moments
of multiparameter Gaussian random fields (from a domain in Rd to Rd ) having continuously
differentiable trajectories, and also prove a closely related result on the expected number of
weighted roots corresponding to a given level set. Next, Rice formulas for the expected number of
local maxima and the expected number of critical points of a Gaussian random field with domain
D are established, where D is a C 2 -manifold (at first, the manifold has no additional structure,
then the results are further specialized to the cases when D has a Riemannian metric and when D
is embedded in a Euclidean space). Analogous results are subsequently also proved for the case of
Gaussian random fields from Rd to Rd but now d > d .
Chapter 7 is devoted to the analysis of regularity of the distribution of the maximum of Gaussian
random fields. The key result here is the representation formula for the density of the maximum
of a Gaussian real-valued field with C 2 -paths defined on an open set containing S, where S is a
compact subset of Rd which can be written as the disjoint union of a finite number of orientable
C 3 manifolds Sj of dimension j without boundary (where j = 0, . . . , d). Moreover, under certain
nondegeneracy conditions, this density of the maximum is shown to be continuous. On the other
hand, restricting attention to the one-parameter case allows the authors to derive subtler results on
the degree of smoothness of the distribution of the maximum. Namely, if a Gaussian process on
[0, 1] has paths in C 2k then the cumulative distribution function of the maximum is shown to be of
class C k .
Chapter 8 generally studies tails of the distribution of the maximum of a random field and is
divided into two parts. In the first part the authors focus solely on the case of one-parameter
Gaussian processes and analyze the asymptotic behavior of the successive derivatives of the

distribution of the maximum as well as the tails of the distribution of the maximum of certain
unbounded Gaussian processes. In the latter case the probability q that the supremum is finite is
strictly less than one, and the aim is to understand the speed at which P (MT u) converges to q
as u grows to +. In the second part the authors establish bounds for the density of the maximum
of a multiparameter Gaussian random field and subsequently analyze the asymptotic behavior of
the maximum given by
P (M > u) = A(u) exp(u2 /(2 2 )) + B(u),
where A(u) is a known function with polynomially bounded growth as u +, 2 =
supt Var(X(t)), and B(u) is an error bounded by a centered Gaussian density with variance
smaller than 2 .
Chapter 9 develops an efficient method, based on record times, for the numerical computation of
the distribution of the maximum of one- and two-parameter Gaussian random fields. The authors
first consider the parameter space [0, 1] and prove that if X is a Gaussian process with C 1 -paths,
then the maximum M = max{X(t), t [0, 1]} has a distribution with tails of the form
() P (M > u) =
1

E[(X (t)+ )1{tR} |X(t) = u]pX(t) (u)dt,

P (X(0) > u) +
0

where pX(t) () is the probability density of X(t) and R is the set of record times, i.e. R = {t
[0, 1]: X(s) < X(t), s [0, t)}. The latter result is derived from Rychliks formula, which in
turn is based on the idea that
P (M u) = P (X(0) > u) + P (t R: X(t) = u) =
P (X(0) > u) + E[#{t R: X(t) = u}],
since the number of record times t such that X(t) = u is either 0 or 1. Then, upon using a
discretization of the condition {X(s) < X(t), s [0, t)}, one can use formula () to obtain
explicit upper bounds on P (M > u):
P (X(0) > u)
1

E (X (t)+ )1{X(0)<u,...,X(t(n1)/n)<u} |X(t) = u pX(t) (u)dt.

+
0

On the other hand, a similar time discretization provides the trivial lower bound
P (M > u) 1 P (X(0) u, . . . , X((n 1)/n) u),
where (at least for n up to 100) the integrals in the above upper and lower bounds can be easily
computed using the Matlab toolbox MAGP developed by Mercadier (2005). Subsequently this
record method is adapted by the authors to deal with the case of a two-parameter Gaussian random
field.
Chapter 10 presents asymptotic results for one-parameter stationary Gaussian processes on time
intervals whose size tends to infinity. First, provided that the level u tends to infinity jointly with
the size of the time interval so that the expectation of the number of up-crossings remains constant
and under the assumption of some local regularity (given by Gemans condition) and some mixing
(given by Bermans condition) of the underlying process, the Volkonski-Rozanov theorem [V. A.

Volkonski and Yu. A. Rozanov, Teor. Veroyatnost. i Primenen. 6 (1961), 202215; MR0137141
(25 #597)] is proved, showing that the asymptotic distribution of the number of up-crossings
is Poisson. The latter in turn implies that the suitably renormalized maximum of the process
converges to a Gumbel distribution. On the other hand, when the level u is fixed, under certain
conditions, the number of (up-)crossings is shown to satisfy a central limit theorem. In terms of
extensions of these results to a multiparameter setting, the authors quote Piterbargs theorem [V. I.
Piterbarg, Asymptotic methods in the theory of Gaussian processes and fields, Translated from the
Russian by V. V. Piterbarg, Amer. Math. Soc., Providence, RI, 1996; MR1361884 (97d:60044)] for
a multiparameter analogue of the Volkonski-Rozanov theorem. The multiparameter extensions of
the central limit type results for up-crossings are not directly developed in the book, but several
useful references are provided.
Chapter 11 deals with applications of Rice formulas to the study of some geometric characteristics of random sea surfaces. The random sea surface is modeled as a Gaussian stationary
3-parameter field which is the limit of the superposition of infinitely many elementary sea waves.
Namely, if one considers a moving incompressible fluid in a domain of infinite depth, then the
classical Euler equations, after some approximations, imply that the sea level X(t, x, y), where t
is time and (x, y) are spatial variables, satisfies
X(t, x, y) = f cos(t t + x x + y y + ),
where f and are the amplitude and phase, and the pulsations t , x and y are some parameters
2
satisfying the Airy relation 2x + 2y = gt , where g is the acceleration of gravity. If units are chosen
so that g = 1 and if f and g are independent random variables with f having Rayleigh distribution
and being uniform on [0, ], then X(t, x, y) is the Gaussian sine-cosine process of the form
X(t, x, y) = 1 sin(t t + x x + y y) + 2 cos(t t + x x + y y),
where 1 and 2 are independent standard normal random variables. The Rice formula is used
to derive from the directional spectrum of the sea various properties of the distribution of such
geometric characteristics like length of crests and velocities of contours. In addition, two nonGaussian generalizations of the above Gaussian sea surface model are also briefly discussed.
Chapter 12 is devoted to the application of the Rice formula to the study of the number of real
roots of a system of random equations, with a particular emphasis placed on large polynomial
systems with random coefficients. The authors start by proving the Shub-Smale theorem [M. Shub
and S. J. Smale, in Computational algebraic geometry (Nice, 1992), 267285, Progr. Math., 109,
Birkhauser Boston, Boston, MA, 1993; MR1230872 (94m:68086)] showing that if N X equals the
number of roots of the system of equations Xi (t) = 0 for all i = 1, . . . , m, where
(i)

aj1 ,...,jm tj11 tjmm ,

Xi (t) :=
j1 ++jm di
(i)

with coefficients {aj1 ,...,jm : i = 1, . . . , m; j1 + + jm di } being centered independent Gauss(i)

di !
ian random variables with variances Var(aj1 ,...,jm ) = j1 !jm !(di (j
, then E(N X ) =
++j
))!
1
m

d1 dm . Next, assuming that di = d for all i = 1, . . . , m, where 2 d d0 < for some


constant d0 independent of m, the authors establish the asymptotic behavior as m of the

variance of N X / dm . Namely, it is shown that for d = 2 the asymptotic variance of N X / dm


3 log(m)
is log(m)
2m , for d = 3 the asymptotic variance is 2m2 , while for d 4 the asymptotic variance is
Kd
for certain known constants Kd . Further extensions of the Shub-Smale result to other sysm3(d2)
tems that are invariant under the orthogonal group of the underlying Euclidean space Rm and to
certain systems with noncentered random coefficients are also developed.
The last chapter (Chapter 13) of the book is devoted to the application of the Rice formula to
the study of condition numbers of random matrices. Condition numbers arise when one wants
to understand how the solution x Rn of a linear system of equations Ax = b is affected by
perturbations in the input (A, b), in which case the condition number is defined as k(A) = A
A1 , where A denotes the usual operator norm. The meaning of k(A) is that of a bound for
the amplification of the relative error between output and input when the input is small. This type
of application is a new field aiming to further the understanding of algorithm complexity via the
randomization of the problems that the algorithms are designed to solve.
The book is a very valuable addition to the literature on Gaussian processes, random fields and
extreme value theory. It is well written and self-contained and presents a significant number of
detailed and original applications to genomics, oceanography, the study of systems of random
equations and condition numbers of random matrices. In comparison with another recent book [R.
J. Adler and J. E. Taylor, Random fields and geometry, Springer, New York, 2007; MR2319516
(2008m:60090)] (with which it has some overlap in the material on the Rice formula and Rice
series and on tails of the distribution of the maximum), this book has a distinct analytic rather
than geometric flavor, making it more accessible to audiences with no background in differential
geometry (albeit at the expense of omitting some beautiful results on the geometry of excursion
sets, for example). Since the approaches adopted in these two books are very different and there
is generally little overlap in the material, the two books complement each other well. Another
valuable feature of the book under review, both from the self-study point of view and for its use
as a textbook in graduate classes, is the inclusion of end-of-chapter exercises. The latter not only
reinforce the material presented but also expose readers to a variety of new topics and ideas.
Reviewed by Anna Amirdjanova
c Copyright American Mathematical Society 2010

arXiv:1102.0389v2 [math.PR] 14 Mar 2011

Bernoulli 17(1), 2011, 170193


DOI: 10.3150/10-BEJ265

Rice formulae and Gaussian waves


R. LEON
2 and MARIO WSCHEBOR3
JEAN-MARC AZAS1 , JOSE
1

Universite de Toulouse, IMT, LSP, F31062 Toulouse Cedex 9, France. E-mail: azais@cict.fr
Escuela de Matem
atica, Facultad de Ciencias, Universidad Central de Venezuela, A.P. 47197,
Los Chaguaramos, Caracas 1041-A, Venezuela. E-mail: jose.leon@ciens.ucv.ve
3
Centro de Matem
atica, Facultad de Ciencias, Universidad de la Rep
ublica, Calle Igu
a 4225,
11400 Montevideo, Uruguay. E-mail: wschebor@cmat.edu.u
2

We use Rice formulae in order to compute the moments of some level functionals which are
linked to problems in oceanography and optics: the number of specular points in one and two
dimensions, the distribution of the normal angle of level curves and the number of dislocations
in random wavefronts. We compute expectations and, in some cases, also second moments of
such functionals. Moments of order greater than one are more involved, but one needs them
whenever one wants to perform statistical inference on some parameters in the model or to test
the model itself. In some cases, we are able to use these computations to obtain a central limit
theorem.
Keywords: dislocations of wavefronts; random seas; Rice formulae; specular points

1. Introduction
Many problems in applied mathematics require estimations of the number of points, the
length, the volume and so on, of the level sets of a random function {W (x) : x Rd }, or
of some functionals defined on them. Let us mention some examples which illustrate this
general situation:
1. A first example in dimension one is the number of times that a random process
{X(t) : t R} crosses the level u:
NAX (u) = #{s A : X(s) = u}.
Generally speaking, the probability distribution of the random variable NAX (u) is unknown, even for simple models of the underlying process. However, there exist some
formulae to compute E(NAX ) and also higher order moments; see, for example, [6].
2. A particular case is the number of specular points of a random curve or a random
surface. Consider first the case of a random curve. A light source placed at (0, h1 ) emits a
ray that is reflected at the point (x, W (x)) of the curve and the reflected ray is registered

This is an electronic reprint of the original article published by the ISI/BS in Bernoulli,
2011, Vol. 17, No. 1, 170193. This reprint differs from the original in pagination and
typographic detail.
1350-7265

2011 ISI/BS

Rice formulae and Gaussian waves

171

by an observer placed at (0, h2 ). Using the equality between the angles of incidence and
reflection with respect to the normal vector to the curve (i.e., N (x) = (W (x), 1)), an
elementary computation gives
W (x) =

2 r1 1 r2
,
x(r2 r1 )

(1)

where i := hi W (x) and ri := x2 + 2i , i = 1, 2. The points (x, W (x)) of the curve


such that x is a solution of (1) are called specular points. For each Borel subset A of
the real line, we denote by SP 1 (A) the number of specular points belonging to A. One
of our aims is to study the probability distribution of SP 1 (A).
3. The following approximation, which turns out to be very accurate in practice for
ocean waves, was introduced some time ago by Longuet-Higgins ([10, 11]; see also [9]).
If we suppose that h1 and h2 are large with respect to W (x) and x, then ri = i +
x2 /(2i ) + O(h3
i ). (1) can then be approximated by
W (x)

x 1 + 2 x h1 + h2

= kx,
2 1 2
2 h1 h2

where k :=

1 1
1
.
+
2 h1 h2

(2)

Set Y (x) := W (x) kx and let SP 2 (A) denote the number of roots of Y (x) belonging to
the set A, an approximation of SP 1 (A) under this asymptotic. The first part of Section
2 below will be devoted to obtaining some results on the distribution of the random
variable SP 2 (R).

4. Let W : Q Rd Rd with d > d be a random field and define the level set
W
CQ
(u) = {x Q : W (x) = u}.

Under certain general conditions, this set is a (d d )-dimensional manifold, but, in any
case, its (d d )-dimensional Hausdorff measure is well defined. We denote this measure
by dd . Our interest will be in computing the mean of the dd -measure of this level
W
set, that is, E[dd (CQ
(u))], as well as its higher moments. It will also be of interest to
compute
E

W (u)
CQ

Y (s) ddd (s) ,

where Y (s) is some random field defined on the level set. One can find formulae of this
type, as well as a certain number of applications, in [5, 14] (d = 1), [3], Chapter 6, and
[1].
5. Another set of interesting problems is related to phase singularities of random wavefronts. These correspond to lines of darkness in light propagation, or threads of silence in
sound propagation [4]. In a mathematical framework, they can be defined as the locations
of points where the amplitudes of waves vanish. If we represent a wave as
W (x, t) = (x, t) + i(x, t),

x Rd ,

172

J.-M. Azas, J.R. Le


on and M. Wschebor

where , are independent homogenous Gaussian random fields, then the dislocations
are the intersections of the two random surfaces (x, t) = 0, (x, t) = 0. Here, we only
consider the case d = 2. At fixed time, say t = 0, we will compute the expectation of the
random variable #{x S : (x, 0) = (x, 0) = 0}.
The aim of this paper is threefold: (a) to re-formulate some known results in a modern
language; (b) to prove a certain number of new results, both for the exact and approximate models, especially variance computations in cases in which only first moments have
been known until now, thus contributing to improve the statistical methods derived from
the probabilistic results; (c) in some cases, to prove a central limit theorem.
Rice formulae are our basic tools. For statements and proofs, we refer to the recent book
[3]. On the other hand, we are not giving full proofs since the required computations are
quite long and involved; one can find details and some other examples that we do not treat
here in [2]. For numerical computations, we use MATLAB programs which are available
at the site http://www.math.univ-toulouse.fr/~azais/prog/programs.html.
In what follows, d denotes the Lebesgue measure in Rd , d (B) the d -dimensional
Hausdorff measure of a Borel set B and M T the transpose of a matrix M . (const) is
a positive constant whose value may change from one occurrence to another. p (x) is
the density of the random variable or vector at the point x, whenever it exists. If not
otherwise stated, all random fields are assumed to be Gaussian and centered.

2. Specular points in dimension one


2.1. Expectation of the number of specular points
We first consider the Longuet-Higgins approximation (2) of the number of SP (x, W (x)),
that is,
SP 2 (I) = #{x I : Y (x) = W (x) kx = 0}.
We assume that {W (x) : x R} has C 2 paths and is stationary. The Rice formula for the
first moment ([3], Theorem 3.2) then applies and gives
E(|Y (x)||Y (x) = 0)pY (x) (0) dx =

E(SP 2 (I)) =
I

G(k,

=
I

1
kx
4 )
2
2

kx
1
E(|Y (x)|)

2
2
I

dx
(3)

dx,

where 2 and 4 are the spectral moments of W and


G(, ) := E(|Z|),

Z N (, 2 ) = [2(/) 1] + 2(/),

(4)

where () and () are respectively the density and cumulative distribution functions
of the standard Gaussian distribution.

Rice formulae and Gaussian waves

173

If we look at the total number of specular points over the whole line, we get

1 k2
1 k4
24 1
G(k, 4 )
1+
+
+ ,

E(SP 2 (R)) =
k
k
2 4 24 24

(5)

which is the result given in [10],


part II, formula (2.14), page 846. Note that this quantity

4
is an increasing function of k .
We now turn to the computation of the expectation of the number of specular points
SP 1 (I) defined by (1). It is equal to the number of zeros of the process {Z(x) := W (x)
m1 (x, W (x)) : x R}, where
m1 (x, w) =

x2 (h1 w)(h2 w) + [x2 + (h1 w)2 ][x2 + (h2 w)2 ]


.
x(h1 + h2 2w)

Assume that the process {W (x) : x R} is Gaussian, centered and stationary, with 0 = 1.
The process Z is not Gaussian, so we use [3], Theorem 3.4, to get
b

E(SP 1 ([a, b])) =

dx
a

E(|Z (x)||Z(x) = 0, W (x) = w)

(6)

2
2
1
1
em1 (x,w)/(22 ) dw.
ew /2
22
2

For the conditional expectation in (6), note that


Z (x) = W (x)

m1
m1
(x, W (x))
(x, W (x))W (x)
x
w

so that under the condition {Z(x) = 0, W (x) = w}, we get


Z (x) = W (x) K(x, w),

where K(x, w) =

m1
m1
(x, w) +
(x, w)m1 (x, w).
x
w

Once again, using Gaussian regression, we can write (6) in the form
E(SP 1 ([a, b])) =

1
2

4 22
2

dx
a

G(m, 1) exp

m2 (x, w)
1
w2 + 1
2
2

dw, (7)

where m = m(x, w) = (2 w + K(x, w))/ 4 22 and G is defined in (4). In (7), the


integral is convergent as a , b + and this formula is well adapted to numerical
approximation.
We have performed some numerical computations to compare the exact expectation
given by (7) with the approximation (3) in the stationary case. The result depends on
h1 , h2 , 4 and 2 , and, after scaling, we can assume that 2 = 1. When h1 h2 , the approximation (3) is very sharp. For example, if h1 = 100, h2 = 100, 4 = 3, the expectation
of the total number of specular points over R is 138.2; using the approximation (5), the

174

J.-M. Azas, J.R. Le


on and M. Wschebor

Figure 1. Intensity of specular points in the case h1 = 100, h2 = 300, 4 = 3. Solid line corresponds to the exact formula, dashed line corresponds to the approximation (3).

result with the exact formula is around 2.102 larger (this is the same order as the error
in the computation of the integral). For h1 = 90, h2 = 110, 4 = 3, the results are 136.81
and 137.7, respectively. If h1 = 100, h2 = 300, 4 = 3, the results differ significantly and
Figure 1 displays the densities in the integrand of (6) and (3) as functions of x.

2.2. Variance of the number of specular points


We assume that the covariance function E(W (x)W (y)) = (x y) has enough regularity
to perform the computations below, the precise requirements being given in the statement
of Theorem 1.
Writing, for short, S = SP 2 (R), we have
Var(S) = E(S(S 1)) + E(S) [E(S)]2 .

(8)

Using [3], Theorem 3.2, we have


E(S(S 1)) =

R2

E(|W (x) k||W (y) k||W (x) = kx, W (y) = ky)

pW (x),W (y) (kx, ky) dx dy,


where
pW (x),W (y) (kx, ky)

(9)

Rice formulae and Gaussian waves

175

1 k 2 (2 x2 + 22 (x y)xy + 2 y 2 )
,
=
exp
2
2
22 2 (x y)
2 2 2 (x y)
1

(10)

under the condition that the density (10) does not degenerate for x = y.
For the conditional expectation in (9), we perform a Gaussian regression of W (x)
(resp., W (y)) on the pair (W (x), W (y)). Putting z = x y, we obtain
W (x) = y (x) + ay (x)W (x) + by (x)W (y),
ay (x) =

(z) (z)
,
22 2 (z)

by (x) =

2 (z)
,
2 (z)

22

where y (x) is Gaussian centered, independent of (W (x), W (y)). The regression of


W (y) is obtained by permuting x and y.
The conditional expectation in (9) can now be rewritten as an unconditional expectation:
E

y (x) k (z) 1 +

(z)x + 2 y
22 2 (z)

x (y) k (z) 1 +

(z)y + 2 x
22 2 (z)

. (11)

Note that the singularity on the diagonal x = y is removable since a Taylor expansion
shows that for z 0,
(z) 1 +

1 4
(z)x + 2 y
=
x(z + O(z 3 )).
22 2 (z)
2 2

(12)

It can be checked that


2 (z) = E((y (x))2 ) = E((x (y))2 ) = 4
E(y (x)x (y)) = (4) (z) +

2 (z) (z)
.
22 2 (z)

2 2 (z)
,
22 2 (z)

(13)
(14)

Moreover, if 6 < +, we can show that as z 0, we have


2 (z)

1 2 6 24 2
z
4
2

(15)

and it follows that the singularity on the diagonal of the integrand in the right-hand side
of (9) is also removable.
We will make use of the following auxiliary statement that we state as a lemma for
further reference. The proof requires some calculations, but is elementary, so we omit it.
The value of H(; 0, 0) can be found in, for example, [6], pages 211212.
Lemma 1. Let
H(; , ) = E(| + || + |),

176

J.-M. Azas, J.R. Le


on and M. Wschebor

where the pair (, ) is centered Gaussian, E( 2 ) = E( 2 ) = 1, E() = .


Then, if 2 + 2 1 and 0 1,
H(; , ) = H(; 0, 0) + R2 (; , ),
where
H(; 0, 0) =

1 2 +

2
arctan

1 2

and

|R2 (; , )| 3(2 + 2 ).

In the next theorem, we compute the equivalent of the variance of the number of specular points, under certain hypotheses on the random process W and with the LonguetHiggins asymptotic. This result is new and useful for estimation purposes since it implies
that, as k 0, the coefficient of variation of the random variable S tends to zero at
a known speed. Moreover, it will also appear in a natural way when normalizing S to
obtain a central limit theorem.
Theorem 1. Assume that the centered Gaussian stationary process W = {W (x) : x R}
is -dependent, that is, (z) = 0 if |z| > , and that it has C 4 -paths. Then, as k 0, we
have
1
Var(S) = + O(1),
(16)
k
where
=
(z) =

J
+
2

24
24
3
,

J=

2 (z)H((z); 0, 0))
2(2 + (z))

dz,

(z)2 (z)
1
(4)

(z)
+
,
2 (z)
22 2 (z)

2 (z) is defined in (13) and H is defined in Lemma 1. Moreover, as k 0, we have


Var(S)
k.
E(S)
Remarks.
(1) The -dependence hypothesis can be replaced by some weaker mixing condition,
such as
|(i) (z)| (const)(1 + |z|)

(0 i 4)

for some > 1, in which case the value of should be


=

1
24
+

2 (z)H((z); 0, 0) 1 4
dz.

2
2 2 + (z)

Rice formulae and Gaussian waves

177

The proof of this extension can be constructed along the same lines as the one we
give below, with some additional computations.
(2) The above computations complete the study done in [10] (Theorem 4). In [9], the
random variable SP 2 (I) is expanded in the WienerHermite chaos. The aforementioned expansion yields the same formula for the expectation and also allows a
formula to be obtained for the variance. However, this expansion is difficult to
manipulate in order to get the result of Theorem 1.
Proof of Theorem 1. We use the notation and the computations preceding the statement of the theorem.
Divide the integral on the right-hand side of (9) into two parts, corresponding to
|x y| > and |x y| , that is,
E(S(S 1)) =

|xy|>

|xy|

= I1 + I2 .

(17)

In the first term, the -dependence of the process implies that one can factorize the
conditional expectation and the density in the integrand. Taking into account that for
each x R, the random variables W (x) and W (x) are independent, we obtain for I1
I1 =
|xy|>

E(|W (x) k|)E(|W (y) k|)pW (x) (kx)pW (y) (ky) dx dy.

On the other hand, we know that W (x) (resp., W (x)) is centered normal with variance
2 (resp., 4 ). Hence,
4 )]

I1 = [G(k,

2
|xy|>

1
1 k 2 (x2 + y 2 )
dx dy.
exp
22
2
2

To compute the integral on the right-hand side, note that the integral over the whole x, y
plane is equal to 1/k 2 so that it suffices to compute the integral over the set |x y| .
Changing variables, this last integral is equal to
+

x+

dx

1 k 2 (x2 + y 2 )
1
dy =
exp
+ O(1),
22
2
2
k 2

where the last term is bounded if k is bounded (remember that we are considering an
approximation in which k 0). Therefore, we can conclude that
|xy|>

1
1 k 2 (x2 + y 2 )

1
dx dy = 2
exp
+ O(1),
22
2
2
k
k 2

from which we deduce, performing a Taylor expansion, that


I1 =

24 1

+ O(1) .
k 2 k 2

(18)

178

J.-M. Azas, J.R. Le


on and M. Wschebor

Let us now turn to I2 . Using Lemma 1 and the equivalences (12) and (15), whenever
|z| = |x y| , the integrand on the right-hand side of (9) is bounded by
(const )[H((z); 0, 0) + k 2 (x2 + y 2 )].
We divide the integral I2 into two parts.
First, on the set {(x, y) : |x| 2, |x y| }, the integral is clearly bounded by some
constant.
Second, we consider the integral on the set {(x, y) : x > 2, |x y| }. (The symmetric
case, replacing x > 2 by x < 2, is similar that is the reason for the factor 2 in what
follows.) We have (recall that z = x y)
2 (z)[H((z); 0, 0) + R2 ((z); , )]

I2 = O(1) + 2
|xy|,x>2

1
2 22 2 (z)

exp

1 k 2 (2 x2 + 2 (x y)xy + 2 y 2 )
dx dy,
2
22 2 (x y)

which can be rewritten as

2 (z)[H((z); 0, 0) + R2 ((z); , )]

I2 = O(1) + 2

k2 z 2
2
1
1
1

exp
(z) + (z)

2
2(2 + (z))
2
2
+

1
2(2 (z))

exp k 2

dz

(x z/2)2
dx.
2 (z)

Changing variables, the inner integral becomes


1

k 2

+
0

1
1
1 1
exp 2 d =
+ O(1),
2
2
2 2k

(19)

where 0 = 2 2k(2 z/2)/ 2 (z).


Substituting this into I2 , we obtain
J
I2 = O(1) + .
k 2
To finish, combine (20) with (18), (17), (8) and (5).

(20)

Rice formulae and Gaussian waves

179

2.3. Central limit theorem


Theorem 2. Assume that the process W satisfies the hypotheses of Theorem 1. In addition, we assume that the fourth moment of the number of approximate specular points
on an interval having length equal to 1 is uniformly bounded in k, that is, for all a R
and 0 < k < 1,
4

E([SP 2 ([a, a + 1])] ) (const ).

(21)

Then, as k 0,
S

24 /1/k
/k

N (0, 1)

in distribution.

Remarks. One can give conditions for the additional hypothesis (21) to hold true. Even
though they are not nice, they are not costly from the point of view of physical models.
For example, either one of the following conditions implies (21):
(i) the paths x W (x) are of class C 11 (use [3], Theorem 3.6, with m = 4, applied to
the random process {W (x) : x R});
(ii) the paths x W (x) are of class C 9 and the support of the spectral measure has
an accumulation point (apply [3], Example 3.4, Proposition 5.10 and Theorem 3.4, to
show that the fourth moment of the number of zeros of W (x) is bounded).
Note that the asymptotic here differs from other ones existing in the literature on related
subjects (compare with, e.g., [7] and [12]).
Proof of Theorem 2. Let and be real numbers satisfying the conditions 1/2 <
< 1, + > 1, 2 + < 2. It suffices to prove the convergence as k takes values on a
sequence of positive numbers tending to 0. To keep in mind that the parameter is k, we
use the notation S(k) := S = SP 2 (R).
Choose k small enough so that k > 2 and define the sets of disjoint intervals, for
j = 0, 1, . . ., [k ] ([] denotes integer part),
Ujk = ((j 1)[k ] + /2, j[k ] /2),
Ijk = [j[k ] /2, j[k ] + /2].

Each interval Ujk has length [k ] and two neighboring intervals Ujk are separated
by an interval of length . So, the -dependence of the process implies that the random
variables SP 2 (Ujk ), j = 0, 1, . . . , [k ], are independent. A similar argument applies to
SP 2 (Ijk ), j = 0, 1, . . ., [k ].
We write
SP 2 (Ujk ),

T (k) =
|j|[k ]

where the equivalence is due to Theorem 1.

Vk = (Var(S(k)))1/2

k/,

180

J.-M. Azas, J.R. Le


on and M. Wschebor

The proof is performed in two steps, which easily imply the statement. In the first, it
is proved that Vk [S(k) T (k)] tends to 0 in the L2 of the underlying probability space.
In the second step, we prove that Vk T (k) is asymptotically standard normal.
Step 1. We first prove that Vk [S(k) T (k)] tends to 0 in L1 . Since it is non-negative,
it suffices to show that its expectation tends to zero. We have
S(k) T (k) =

SP 2 (Ijk ) + Z1 + Z2 ,
|j|<[k ]

where Z1 = SP 2 (, [k ] [k ] + /2), Z2 = SP
2 ([k ] [k ] /2, +)).
k
Using the fact that E(SP 2 (I)) (const) I (kx/ 2 ) dx, we can show that
+

Vk E(S(k) T (k)) (const )k 1/2

=0

[k ]k

(kx/ 2 ) dx ,

+
[k ][k ]

which tends to zero as a consequence of the choice of and . It suffices to prove that
Vk2 Var(S(k) T (k)) 0 as k 0. Using independence, we have
Var(S(k) T (k)) =

Var(SP 2 (Ijk )) + Var(Z1 ) + Var(Z2 )


|j|<[k ]

|j|<[k ]

E(SP 2 (Ijk )(SP 2 (Ijk ) 1))

+ E(Z1 (Z1 1)) + E(Z2 (Z2 1)) + E(S(k) T (k)).


We already know that Vk2 E(S(k) T (k)) 0. Since each Ijk can be covered by a fixed
number of intervals of size one, we know that E(SP 2 (Ijk )(SP 2 (Ijk ) 1)) is bounded by a
constant which does not depend on k and j. Therefore,
Vk2
|j|<[k ]

E(SP 2 (Ijk )(SP 2 (Ijk ) 1)) (const )k 1 ,

which tends to zero because of the choice of . The remaining two terms can be bounded
in a similar form as in the proof of Theorem 1.
Step 2. T (k) is a sum of independent, but not equidistributed, random variables. To
prove that it satisfies a central limit theorem, we will use a Lyapunov condition based of
fourth moments. Set
Mjm := E{[SP 2 (Ujk ) E(SP 2 (Ujk ))]m }.
For the Lyapunov condition, it suffices to verify that
4
|j|[k ]

Mj4 0

as k 0, where 2 :=

Mj2 .
|j|[k ]

(22)

Rice formulae and Gaussian waves

181

To prove (22), we divide each interval Ujk into p = [k ] 1 intervals I1 , . . . , Ip of equal


size . We have
E(SP 1 + + SP p )4 =

E(SP i1 SP i2 SP i3 SP i4 ),

(23)

1i1 ,i2 ,i3 ,i4 p

where SP i stands for SP 2 (Ii ) E(SP 2 (Ii )). Since the size of all intervals is equal
to , given the finiteness of fourth moments in the hypothesis, it follows that
E(SP i1 SP i2 SP i3 SP i4 ) is bounded.
On the other hand, the number of terms which do not vanish in the sum of the righthand side of (23) is O(p2 ). In fact, if one of the indices in (i1 , i2 , i3 , i4 ) differs by more
than 1 from all the others, then E(SP i1 SP i2 SP i3 SP i4 ) = 0. Hence,
E[SP 2 (Ujk ) E(SP 2 (Ujk ))]4 (const)k 2
so that |j|[k ] Mj4 = O(k 2 k ). The inequality 2 + < 2 implies the Lyapunov
condition.

3. Specular points in two dimensions.


Longuet-Higgins approximation
We consider, at fixed time, a random surface depending on two space variables x and y.
The source of light is placed at (0, 0, h1) and the observer is at (0, 0, h2). The point (x, y)
is a specular point if the normal vector n(x, y) = (Wx , Wy , 1) to the surface at (x, y)
satisfies the following two conditions:
the angles with the incident ray I = (x, y, h1 W ) and the reflected ray R =
(x, y, h2 W ) are equal (to simplify notation, the argument (x, y) has been removed);
it belongs to the plane generated by I and R.
Setting i = hi W and ri = x2 + y 2 + i , i = 1, 2, as in the one-parameter case, we
have
x
2 r1 1 r2
2 r1 1 r2
y
Wx = 2
,
Wy = 2
.
(24)
2
2
x +y
r2 r1
x +y
r2 r1
When h1 and h2 are large, the system above can be approximated by
Wx = kx,

Wy = ky,

(25)

under the same conditions as in dimension one.


Next, we compute the expectation of SP 2 (Q), the number of approximate specular
points, in the sense of (25), that are in a domain Q. In the remainder of this paragraph, we
limit our attention to this approximation and to the case in which {W (x, y) : (x, y) R2 }
is a centered Gaussian stationary random field.

182

J.-M. Azas, J.R. Le


on and M. Wschebor

Let us define
Wx (x, y) kx
Wy (x, y) ky

Y(x, y) :=

(26)

Under very general conditions, for example, on the spectral measure of {W (x, y) : x, y
R}, the random field {Y (x, y) : x, y R} satisfies the conditions of [3], Theorem 6.2, and
we can write
E(| det Y (x, y)|)pY(x,y) (0) dx dy

E(SP 2 (Q)) =

(27)

since for fixed (x, y), the random matrix Y (x, y) and the random vector Y(x, y) are
independent so that the condition in the conditional expectation can be eliminated. The
density in the right-hand side of (27) has the expression
pY(x,y) (0) = p(Wx ,Wy ) (kx, ky)
1
=
2

1
20 02 211

k2
(02 x2 211 xy + 20 y 2 ) .
exp
2(20 02 211 )

(28)

To compute the expectation of the absolute value of the determinant in the righthand side of (27), which does not depend on x, y, we use the method of [4]. Set
2
:= det Y (x, y) = (Wxx k)(Wyy k) Wxy
.
We have
+

E(||) = E

1 cos(t)
dt .
t2

(29)

Define
2
h(t) := E[exp(it[(Wxx k)(Wyy k) Wxy
])].

Then
E(||) =

+
0

1 Re[h(t)]
dt .
t2

We now proceed to give a formula for Re[h(t)]. Define

0 1/2 0
A = 1/2 0
0
0
0 1
and denote by the variance matrix of (Wxx , Wyy , Wx,y )

40
:= 22
31

22
04
13

31
13 .
22

(30)

Rice formulae and Gaussian waves

183

Let 1/2 A1/2 = P diag(1 , 2 , 3 )P T , where P is orthogonal. Then


2

h(t) = eitk E(exp[it((1 Z12 k(s11 + s21 )Z1 ) + (2 Z22 k(s12 + s22 )Z2 )
+ (3 Z32 k(s13 + s23 )Z3 ))]),

(31)

where (Z1 , Z2 , Z3 ) is standard normal and sij are the entries of 1/2 P T .
One can check that if is a standard normal variable and , are real constants, > 0,
then
2

E(ei (+) ) = (1 2i )1/2 ei

/(12i )

2
1
2
+
i

+
exp
1 + 4 2
1 + 4 2
(1 + 4 2 )1/4

where = 12 arctan(2 ), 0 < < /4. Substituting this into (31), we obtain
3

dj (t, k)

Re[h(t)] =

1 + 42j t2

j=1

(j (t) + k 2 tj (t)) ,

cos

(32)

j=1

where, for j = 1, 2, 3:
dj (t, k) = exp

k 2 t2 (s1j + s2j )2
;
2 1 + 42j t2

j (t) =

1
arctan(2j t),
2

j (t) =

1
(s1j + s2j )2 j
.
t2
3
1 + 42j t2

0 < j < /4;

Introducing these expressions into (30) and using (28), we obtain a new formula which
has the form of a rather complicated integral. However, it is well adapted to numerical
evaluation. On the other hand, this formula allows us to compute the equivalent as
k 0 of the expectation of the total number of specular points under the LonguetHiggins approximation. In fact, a first-order expansion of the terms in the integrand
gives a somewhat more accurate result, one that we now state as a theorem.
Theorem 3.
E(SP 2 (R2 )) =

m2
+ O(1),
k2

(33)

where
+

m2 =
0
+

=
0

1[

3
2 2 1/2
cos(
j=1 (1 + 4j t )]
2
t

1 23/2 [

3
j=1 (Aj

3
j=1 j (t))

dt

1 + Aj )](1 B1 B2 B2 B3 B3 B1 )
t2

dt,

(34)

184

J.-M. Azas, J.R. Le


on and M. Wschebor

Figure 2. Intensity function of the specular points for the Jonswap spectrum.

Aj = Aj (t) = (1 + 42j t2 )1/2 ,

Bj = Bj (t) =

(1 Aj )/(1 + Aj ).

Note that m2 depends only on the eigenvalues 1 , 2 , 3 and is easily computed


numerically. We have performed a numerical computation using a standard sea model
with a Jonswap spectrum and spread function cos(2). It corresponds to the default
parameters of the Jonswap function of the toolbox WAFO [13]. The variance matrix of
the gradient and the matrix are, respectively,

9 3 0
114
0
,
= 104 3 11 0 .
104
0 81
0 0 3

The integrand in (27) is displayed in Figure 2 as a function of the two space variables
x, y. The value of the asymptotic parameter m2 is 2.527103.
We now consider the variance of the total number of specular points in two dimensions,
looking for analogous results to the one-dimensional case (i.e., Theorem 1), in view of
their interest for statistical applications. It turns out that the computations become

Rice formulae and Gaussian waves

185

much more complicated. The statements on variance and speed of convergence to zero of
the coefficient of variation that we give below include only the order of the asymptotic
behavior in the Longuet-Higgins approximation, but not the constant. However, we still
consider them to be useful. If one refines the computations, rough bounds can be given on
the generic constants in Theorem 4 on the basis of additional hypotheses on the random
field.
We assume that the real-valued, centered, Gaussian stationary random field {W (x) : x
R2 } has paths of class C 3 , the distribution of W (0) does not degenerate (i.e., Var(W (0))
is invertible). Moreover, let us consider W (0), expressed in the reference system xOy
of R2 as the 2 2 symmetric centered Gaussian random matrix
W (0) =

Wxx (0)
Wxy (0)

Wxy (0)
Wyy (0)

The function
z

(z) = det[Var(W (0)z)],

defined on z = (z1 , z2 )T R2 , is a non-negative homogeneous polynomial of degree 4 in


the pair z1 , z2 . We will assume the non-degeneracy condition
min{(z) : z = 1} = > 0.

(35)

Theorem 4. Let us assume that {W (x) : x R2 } satisfies the above conditions and that
it is also -dependent, > 0, that is, E(W (x)W (y)) = 0 whenever x y > . Then, for
k small enough,
Var(SP 2 (R2 ))

L
,
k2

(36)

where L is a positive constant depending on the law of the random field.


Moreover, for k small enough, by using the result of Theorem 3 and (36), we get
Var(SP 2 (R2 ))
L1 k,
E(SP 2 (R2 ))
where L1 is a new positive constant.
Proof. To simplify notation, let us denote T = SP 2 (R2 ). We have
Var(T ) = E(T (T 1)) + E(T ) [E(T )]2 .

(37)

We have already computed the equivalents as k 0 of the second and third term in
the right-hand side of (37). Our task in what follows is to consider the first term.
The proof is performed along the same lines as the one of Theorem 1, but instead of
applying a Rice formula for the second factorial moment of the number of crossings of a

186

J.-M. Azas, J.R. Le


on and M. Wschebor

one-parameter random process, we need [3], Theorem 6.3, for the factorial moments of a
2-parameter random field. We have
E(T (T 1)) =

E(| det Y (x)|| det Y (y)||Y(x) = 0, Y(y) = 0)


R2 R2

pY(x),Y(y) (0, 0) dx dy
=
xy >

dx dy +

xy

dx dy = J1 + J2 .

For J1 , we proceed as in the proof of Theorem 1, using the -dependence and the
evaluations leading to the statement of Theorem 3. We obtain
J1 =

m22 O(1)
+ 2 .
k4
k

(38)

One can show that under the hypotheses of the theorem, for small k, one has
J2 =

O(1)
.
k2

(39)

We refer the reader to [2] for the lengthy computations leading to this inequality. In view
of (37), (33) and (38), this suffices to prove the theorem.

4. The distribution of the normal to the level curve


Let us consider a modeling of the sea W (x, y, t) as a function of two space variables and
one time variable. Usual models are centered Gaussian stationary with a particular form
of the spectral measure that is presented, for example, in [3]. We denote the covariance
by (x, y, t) = E(W (0, 0, 0)W (x, y, t)).
In practice, one is frequently confronted with the following situation: several pictures of
the sea on time over an interval [0, T ] are stocked and some properties or magnitudes are
observed. If the time T and the number of pictures are large, and if the process is ergodic
in time, then the frequency of pictures that satisfy a certain property will converge to
the probability of this property happening at a fixed time.
Let us illustrate this with the angle of the normal to the level curve at a point chosen
at random. We first consider the number of crossings of a level u by the process W (, y, t)
for fixed t and y, defined as
W (,y,t)

N[0,M1 ] (u) = #{x : 0 x M1 ; W (x, y, t) = u}.


We are interested in computing the total number of crossings per unit time when integrating over y [0, M2 ], that is,
1
T

M2

dt
0

W (,y,t)

N[0,M1 ] (u) dy.

(40)

Rice formulae and Gaussian waves

187

If the ergodicity assumption in time holds true, then we can conclude that a.s.
M2

1
T

W (,0,0)

W (,y,t)

N[0,M1 ] (u) dy M1 E(N[0,M1 ] (u)) =

dt
0

M1 M2

200 1/2u2 /000


,
e
000

where
abc =
R3

ax by ct d(x , y , t )

are the spectral moments of W . Hence, on the basis of the quantity (40), for large T ,
one can make inference about the value of certain parameters of the law of the random
field. In this example, these are the spectral moments 200 and 000 .
If two-dimensional level information is available, one can work differently because there
exists an interesting relationship with Rice formulae for level curves that we explain in
what follows. We can write (x = (x, y))
W (x, t) = W (x, t) (cos (x, t), sin (x, t))T .
Using a Rice formula, more precisely, under conditions of [3], Theorem 6.10,
M2

200 u2 /(2000 )
e
,
000
0
CQ (0,u)
(41)
where Q = [0, M1 ] [0, M2 ]. We have a similar formula when we consider sections of the
set [0, M1 ] [0, M2 ] in the other direction. In fact, (41) can be generalized to obtain the
Palm distribution of the angle .
Set h1 ,2 = I[1 ,2 ] and, for 1 < 2 , define
W (,y,0)

| cos (x, 0)| d1 =

N[0,M1 ] (u) dy = E

2 (Q)

F (2 ) F (1 ) := E(1 ({x Q : W (x, 0) = u; 1 (x, s) 2 }))


=E

h1 ,2 ((x, s)) d1 (x) ds

(42)

CQ (u,s)

= 2 (Q)E h1 ,2

2
y W
1/2 exp(u /(200 ))

((x W )2 + (y W )2 )
.
x W
2000

Defining = 200 020 110 and assuming 2 (Q) = 1 for ease of notation, we readily
obtain
F (2 ) F (1 )
2

eu /(2000 )

(2)3/2 ()1/2 000


2

h1 ,2 () x2 + y 2 e(1/(2))(02 x
R2

eu /(200 )

(2)3/2 (+ )1/2 000

211 xy+20 y 2 )

dx dy

188

J.-M. Azas, J.R. Le


on and M. Wschebor
+

2
1

2 exp

2
(+ cos2 ( ) + sin2 ( )) d d,
2+

where + are the eigenvalues of the covariance matrix of the random vector
(x W (0, 0, 0), y W (0, 0, 0)) and is the angle of the eigenvector associated with + .
Noting that the exponent in the integrand can be written as 1/ (1 2 sin2 ( ))
with 2 := 1 + / and that
+
0

2 exp

H2
2

,
2H

it is easy to obtain that


2

F (2 ) F (1 ) = (const )

(1 2 sin2 ( ))

1/2

d.

From this relation, we get the density g() of the Palm distribution, simply by dividing
by the total mass:
g() =

(1 2 sin2 ( ))1/2

2
2
1/2 d
(1 sin ( ))

(1 2 sin2 ( ))1/2
.
4K( 2 )

(43)

Here, K is the complete elliptic integral of the first kind. This density characterizes
the distribution of the angle of the normal at a point chosen at random on the level
curve. In the case of a random field which is isotropic in (x, y), we have 200 = 020
and, moreover, 110 = 0, so that g turns out to be the uniform density over the circle
(Longuet-Higgins says that over the contour, the distribution of the angle is uniform
(cf. [11], page 348)). We have performed the numerical computation of the density (43)
for an anisotropic process with = 0.5, = /4. Figure 3 displays the densities of the
Palm distribution of the angle showing a large departure from the uniform distribution.
Let us turn to ergodicity. For a given subset Q of R2 and each t, let us define At =
{W (x, y, t) : > t; (x, y) Q} and consider the -algebra of t-invariant events A = At .
We assume that for each pair (x, y), (x, y, t) 0 as t +. It is well known that under
this condition, the -algebra A is trivial, that is, it only contains events having probability
zero or one (see, e.g., [6], Chapter 7). This has the following important consequence in
our context. Assume that the set Q has a smooth boundary and, for simplicity, unit
Lebesgue measure. Let us consider
Z(t) =

H(x, t) d1 (x)

(44)

CQ (u,t)

with H(x, t) = H(W (x, t), W (x, t)), where W = (Wx , Wy ) denotes the gradient in the
space variables and H is some measurable function such that the integral is well defined.
This is exactly our case in (42). The process {Z(t) : t R} is strictly stationary and,

Rice formulae and Gaussian waves

189

Figure 3. Density of the Palm distribution of the angle of the normal to the level curve in the
case = 0.5 and = /4.

in our case, has a finite mean and is Riemann-integrable. By the BirkhoffKhintchine


ergodic theorem ([6], page 151), a.s. as T +,
1
T

T
0

Z(s) ds EB [Z(0)],

where B is the -algebra of t-invariant events associated with the process Z(t). Since for
each t, Z(t) is At -measurable, it follows that B A so that EB [Z(0)] = E[Z(0)]. On the
other hand, the Rice formula yields (taking into account the fact that stationarity of W
implies that W (0, 0) and W (0, 0) are independent)
E[Z(0)] = E[H(u, W (0, 0)) W (0, 0) ]pW (0,0)(u).
We consider now the central limit theorem. Let us define
Z(t) =

1
t

t
0

[Z(s) E(Z(0))] ds.

(45)

To compute the variance of Z(t), one can again use the Rice formula for the first moment
of integrals over level sets, this time applied to the R2 -valued random field with parameter
in R4 , {(W (x1 , s1 ), W (x2 , s2 ))T : (x1 , x2 ) Q Q, s1, s2 [0, t]} at the level (u, u). We get
Var Z(t) =

2
t

t
0

s
I(u, s) ds,
t

190

J.-M. Azas, J.R. Le


on and M. Wschebor

where
I(u, s) =
Q2

E[H(x1 , 0)H(x2 , s) W (x1 , 0)

W (x2 , s) |W (x1 , 0) = u; W (x2 , s) = u]


2

pW (x1 ,0),W (x2 ,s) (u, u) dx1 dx2 (E[H(u, W (0, 0)) W (0, 0) ]pW (0,0)(u)) .
Assuming that the given random field is time--dependent, that is, (x, y, t) = 0 (x, y)
whenever t > , we readily obtain

t Var Z(t) 2

I(u, s) ds := 2 (u)

as t .

(46)

Now, using a variant of the HoeffdingRobbins theorem [8] for sums of -dependent
random variables, we can establish the following theorem.
Theorem 5. Assume that the random field W and the function H satisfy the conditions
of [3], Theorem 6.10. Assume, for simplicity, that Q has Lebesgue measure. Then:
(i) if the covariance (x, y, t) tends to zero as t + for every value of (x, y) Q,
we have
1
T

T
0

Z(s) ds E[H(u, W (0, 0)) W (0, 0) ]pW (0,0)(u),

where Z(t) is defined by (44).


(ii) if the random field W is -dependent in the sense above, we have

tZ(t) = N (0, 2 (u)),


where Z(t) is defined by (45) and 2 (u) by (46).

5. Application to dislocations of wavefronts


In this section, we follow the article [4] by Berry and Dennis. Dislocations are lines in space
or points in the plane where the phase of the complex scalar wave (x, t) = (x, t)ei(x,t)
is undefined. With respect to light, they are lines of darkness; with respect to sound,
threads of silence. Here, we only consider two-dimensional space variables x = (x1 , x2 ).
It is convenient to express by means of its real and imaginary parts:
(x, t) = (x, t) + i(x, t).
Thus, dislocations are the intersection of the surfaces (x, t) = 0 and (x, t) = 0.
Let us quote the authors of [4]: Interest in optical dislocations has recently revived,
largely as a result of experiments with laser fields. In low-temperature physics, (x, t)
could represent the complex order parameter associated with quantum flux lines in a
superconductor or quantized vortices in a superfluid (cf. [4] and the references therein).

Rice formulae and Gaussian waves

191

In what follows, we assume an isotropic Gaussian model. This means that we will
consider the wavefront as an isotropic Gaussian field
(x, t) =
R2

exp (i[ k x c|k|t])

(|k|)
|k|

1/2

dW (k),

where k = (k1 , k2 ), |k| = k12 + k22 , (k) is the isotropic spectral density and W = (W1 +
iW2 ) is a standard complex orthogonal Gaussian measure on R2 with unit variance. We
are only interested in t = 0 and we put (x) := (x, 0) and (x) := (x, 0). We have,
setting k = |k|,
R2

(k)
k

1/2

cos( k x )

R2

(k)
k

1/2

cos( k x )

(x) =
(x) =

dW1 (k)

R2

(k)
k

1/2

sin( k x )

R2

(k)
k

1/2

sin( k x )

dW2 (k) +

dW2 (k), (47)


dW1 (k). (48)

The covariances are


E[(x)(x )] = E[(x)(x )] = (|x x |) :=

J0 (k|x x |)(k) dk,

(49)

where J (x) is the Bessel function of the first kind of order . Moreover, E[(r1 )(r2 )] = 0.

5.1. Mean number of dislocation points


Let us denote by {Z(x) : x R2 } a random field having values in R2 , with coordinates
(x), (x), which are two independent Gaussian stationary isotropic random fields with
the same distribution. We are interested in the expectation of the number of dislocation
points
d2 := E[#{x S : (x) = (x) = 0}],
where S is a subset of the parameter space having area equal to 1.
Without loss of generality, we may assume that Var((x)) = Var((x)) = 1 and for the
derivatives, we set 2 = Var(i (x)) = Var(i (x)), i = 1, 2. Then, according to the Rice
formula,
d2 = E[| det(Z (x))|/Z(x) = 0]pZ(x) (0).
An easy Gaussian computation gives d2 = 2 /(2) ([4], formula (4.6)).

5.2. Variance
Again, let S be a measurable subset of R2 having Lebesgue measure equal to 1. We have
Var(NSZ (0)) = E(NSZ (0)(NSZ (0) 1)) + d2 d22

192

J.-M. Azas, J.R. Le


on and M. Wschebor

and for the first term, we use the Rice formula for the second factorial moment ([3],
Theorem 6.3), that is,
E(NSZ (0)(NSZ (0) 1)) =

A(s1 , s2 ) ds1 ds2 ,


S2

where
A(s1 , s2 ) = E[| det Z (s1 ) det Z (s2 )||Z(s1 ) = Z(s2 ) = 02 ]pZ(s1 ),Z(s2 ) (04 ).
Here, 0p denotes the null vector in dimension p.
Taking into account the fact that the law of the random field Z is invariant under
translations and orthogonal transformations of R2 , we have
A(s1 , s2 ) = A((0, 0), (r, 0)) = A(r)

with r = s1 s2 .

The Rice function A(r) has two intuitive interpretations. First, it can be viewed as
A(r) = lim

1
E[N (B((0, 0), )) N (B((r, 0), ))].
2 4

Second, it is the density of the Palm distribution, a generalization of the horizontal


window conditioning of the number of zeros of Z per unit surface, locally around the
point (r, 0), conditionally on the existence of a zero at (0, 0) (see [6]). In [4], A(r)/d22 is
called the correlation function.
To compute A(r), we denote by 1 , 2 , 1 , 2 the partial derivatives of , with respect
to first and second coordinate. Therefore,
A(r) = E[| det Z (0, 0) det Z (r, 0)||Z(0, 0) = Z(r, 0) = 02 ]pZ(0,0),Z(r,0) (04 )
= E[|(1 2 2 1 )(0, 0)(1 2 2 1 )(r, 0)||Z(0, 0) = Z(r, 0) = 02 ]

(50)

pZ(0,0),Z(r,0)(04 ).
The density is easy to compute:
pZ(0,0),Z(r,0) (04 ) =

1
,
(2)2 (1 2 (r))

where (r) =

J0 (kr)(k) dk.

The conditional expectation turns out to be more difficult to calculate, requiring a long
computation (we again refer to [2] for the details). We obtain the following formula (that
can be easily compared to the formula in [4] since we are using the same notation):
A(r) =

A1
3
4 (1 C 2 )

1
(Z2 2Z12 t2 )
1
1

dt,
t2
(1 + t2 ) Z2 (Z2 Z12 t2 )

Rice formulae and Gaussian waves

193

where we have defined


C := (r),
A1 = F0 F0
Z =
Z1 =

E = (r),
E2
,
1 C2

H = E/r,
A2 =

E2C
F02 H 2
1 F
2
F0
1 C2
A2
,
1 + Zt2

Z2 =

1 + t2
.
1 + Zt2

F = (r),

H F (1 C 2 ) E 2 C
,
F0 F0 (1 C 2 ) E 2

F0

E2
1 C2

F0 = (0),

Acknowledgement
This work has received financial support from the European Marie Curie Network
SEAMOCS.

References
[1] Azas, J.-M., Le
on, J. and Ortega, J. (2005). Geometrical characteristic of gaussian sea
waves. J. Appl. Probab. 42 119. MR2145485
[2] Azas, J.-M., Le
on, J. and Wschebor, M. (2009). Some applications of Rice formulas to
waves. Available at ArXiv:0910.0763v1 [math.PR].
[3] Azas, J.-M. and Wschebor, M. (2009). Level Sets and Extrema of Random Processes and
Fields. Hoboken, NJ: Wiley. MR2478201
[4] Berry, M.V. and Dennis, M.R. (2000). Phase singularities in isotropic random waves. Proc.
R. Soc. Lond. Ser. A 456 20592079. MR1794716
[5] Caba
na, E. (1985). Esperanzas de Integrales sobre Conjuntos de Nivel aleatorios. In Actas
del 2o. Congreso Latinoamericano de Probabilidad y Estadistica Matem
atica, Spanish
6582. Caracas, Venezuela: Regional Latino americana de la Soc. Bernoulli.
[6] Cramer, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. New
York: Wiley. MR0217860
[7] Cuzick, J.A. (1976). Central limit theorem for the number of zeros of a stationary Gaussian
process. Ann. Probab. 4 547556. MR0420809
[8] Hoeffding, W. and Robbins, H. (1948). The central limit theorem for dependent random
variables. Duke Math. J. 15 773780. MR0026771
[9] Kratz, M. and Le
on, J.R. (2009). Level curves crossings and applications for Gaussian
models. Extremes. DOI: 10.1007/s10687-009-0090-x.
[10] Longuet-Higgins, M.S. (1960). Reflection and refraction at a random surface, I, II, III. J.
Optical Soc. Amer. 50 838856. MR0113489
[11] Longuet-Higgins, M.S. (1962). The statistical geometry of random surfaces. In Proc. Symp.
Appl. Math. Vol. XIII 105143. Providence, RI: Amer. Math. Soc. MR0140175
[12] Piterbarg, V. and Rychlik, I. (1999). Central limit theorem for wave functionals of Gaussian
processes. Adv. in Appl. Probab. 31 158177. MR1699666

194

J.-M. Azas, J.R. Le


on and M. Wschebor

[13] WAFO-group (2000). WAFO A Matlab Toolbox for Analysis of Random


Waves and Loads. Lund Univ., Sweden: Math. Stat. Center. Available at
http://www.maths.lth.se/matstat/wafo.
[14] Wschebor, M. (1985). Surfaces Aleatoires. Lecture Notes in Math. 1147. Berlin: Springer.
MR0871689
Received April 2009 and revised January 2010

arXiv:0910.0763v1 [math.PR] 5 Oct 2009

Some applications of Rice formulas to waves


Jean-Marc Azas

Jose R. Leon

Mario Wschebor

October 5, 2009

Abstract
We use Rices formulas in order to compute the moments of some
level functionals which are linked to problems in oceanography and optics. For instance, we consider the number of specular points in one or
two dimensions, the number of twinkles, the distribution of normal angle
of level curves and the number or the length of dislocations in random
wavefronts. We compute expectations and in some cases, also second moments of such functionals. Moments of order greater than one are more
involved, but one needs them whenever one wants to perform statistical
inference on some parameters in the model or to test the model itself.
In some cases we are able to use these computations to obtain a Central
Limit Theorem.

AMS Subject Classification: Primary 60G15; Secondary 60G60 78A10 78A97


86A05
Keywords: Rice formula, specular points, dislocations of wavefronts, random
seas.

Introduction

Many problems in applied mathematics require to estimate the number of points,


the length, the volume and so on, of the level sets of a random function W (x),
where x Rd , so that one needs to compute the value of certain functionals of
the probability distribution of the size of the random set
W
CA
(u, ) := {x A : W (x, ) = u},

for some given u.


Let us mention some examples which illustrate this general situation:
Universit
e de

Toulouse, IMT, LSP, F31062 Toulouse Cedex 9, France. Email: azais@cict.fr


de Matem
atica. Facultad de Ciencias. Universidad Central de Venezuela. A.P.
47197, Los Chaguaramos, Caracas 1041-A, Venezuela. Email: jose.leon@ciens.ucv.ve
Centro de Matem
atica. Facultad de Ciencias. Universidad de la Rep
ublica. Calle Igu
a
4225. 11400. Montevideo. Uruguay. wschebor@cmat.edu.uy
Escuela

The number of times that a random process {X(t) : t R} crosses the


level u:
NAX (u) = #{s A : X(s) = u}.
Generally speaking, the probability distribution of the random variable
NAX (u) is unknown, even for the simplest models of the underlying process.
However, there exist some formulas to compute E(NAX ) and also higher
order moments.
A particular case is the number of specular points of a random curve or a
random surface.
Consider first the case of a random curve. We take cartesian coordinates
Oxz in the plane. A light source placed at (0, h1 ) emits a ray that is
reflected at the point (x, W (x)) of the curve and the reflected ray is registered by an observer placed at (0, h2 ).
Using the equality between the angles of incidence and reflexion with respect to the normal vector to the curve - i.e. N (x) = (W (x), 1) - an
elementary computation gives:
W (x) =

2 r1 1 r2
x(r2 r1 )

(1)

x2 + 2i , i=1,2.

where i := hi W (x) and ri :=

The points (x, W (x)) of the curve such that x is a solution of (1) are called
specular points. We denote by SP1 (A) the number of specular points
such that x A, for each Borel subset A of the real line. One of our aims
in this paper is to study the probability distribution of SP1 (A).
The following approximation, which turns out to be very accurate in practice for ocean waves, was introduced long ago by Longuet-Higgins (see [13]
and [14]):
Suppose that h1 and h2 are big with respect to W (x) and x, then ri =
i + x2 /(2i ) + O(h3
i ). Then, (1) can be approximated by
W (x)

x h1 + h2
x 1 + 2

= kx,
2 1 2
2 h1 h2

where
k :=

(2)

1 1
1
+
.
2 h1
h2

Denote Y (x) := W (x) kx and SP2 (A) the number of roots of Y (x)
belonging to the set A, an approximation of SP1 (A) under this asymptotic.
The first part of Section 3 below will be devoted to obtain some results
on the distribution of the random variable SP2 (R).

Consider now the same problem as above, but adding a time variable t,
that is, W becomes a random function parameterized by the pair (x, t).
We denote Wx , Wt , Wxt , ... the partial derivatives of W .
We use the Longuet-Higgins approximation (2), so that the approximate
specular points at time t are (x, W (x, t)) where
Wx (x, t) = kx.
Generally speaking, this equation defines a finite number of points which
move with time. The implicit function theorem, when it can be applied,
shows that the x-coordinate of a specular point moves at speed
dx
Wxt
=
.
dt
Wxx k
The right-hand side diverges whenever Wxx k = 0, in which case a flash
appears and the point is called a twinkle. We are interested in the
(random) number of flashes lying in a set A of space and in an interval
[0, T ] of time. If we put:
Wx (x, t) kx
Wxx (x, t) k

Y(x, t) :=

(3)

then, the number of twinkles is:


T W(A, T ) := {(x, t) A [0, T ] : Y(x, t) = 0}

Let W : Q Rd Rd with d > d be a random field and let us define


the level set
W
CQ
(u) = {x Q : W (x) = u}.

Under certain general conditions this set is a (dd )-dimensional manifold


but in any case, its (d d )-dimensional Hausdorff measure is well defined.
We denote this measure by dd . Our interest will be to compute the
W
(u))] as well
mean of the dd -measure of this level set i.e. E[dd (CQ
as its higher moments. It will be also of interest to compute:
E[

W (u)
CQ

Y (s)ddd (s)].

where Y (s) is some random field defined on the level set. Caba
na [7],
Wschebor [19] (d = 1) Azas and Wschebor [4] and, in a weak form,
Z
ahle [20] have studied these types of formulas. See Theorems 5 and 6.
Another interesting problem is the study of phase singularities, dislocations of random wavefronts. They correspond to lines of darkness, in light

propagation, or threads of silence in sound [6]. In a mathematical framework they can be define as the loci of points where the amplitude of waves
vanishes. If we represent the wave as
W (x, t) = (x, t) + i(x, t), where x Rd
where , are independent homogenous Gaussian random fields the dislocations are the intersection of the two random surfaces (x, t) = 0, (x, t) =
0. We consider a fixed time, for instance t = 0. In the case d = 2 we will
study the expectation of the following random variable
#{x S : (x, 0) = (x, 0) = 0}.
In the case d = 3 one important quantity is the length of the level curve
L{x S : (x, 0) = (x, 0) = 0}.
All these situations are related to integral geometry. For a general treatment
of the basic theory, the classical reference is Federers Geometric Measure Theory [9].
The aims of this paper are: 1) to re-formulate some known results in a
modern language or in the standard form of probability theory; 2) to prove
new results, such as computations in the exact models, variance computations
in cases in which only first moments have been known, thus improving the
statistical methods and 3) in some case, obtain Central Limit Theorems.
The structure of the paper is the following: In Section 2 we review without
proofs some formulas for the moments of the relevant random variables. In
Section 3 we study expectation, variance and asymptotic behavior of specular
points. Section 4 is devoted to the study of the distribution of the normal to the
level curve. Section 5 presents three numerical applications. Finally, in Section
6 we study dislocations of wavefronts following a paper by Berry & Dennis [6].

Some additional notation and hypotheses


d is Lebesgue measure in Rd , d (B) the d -dimensional Hausdorff measure of a
Borel set B and M T the transpose of a matrix M . (const) is a positive constant
whose value may change from one occurrence to another.
If not otherwise stated, all random fields are assumed to be Gaussian and centered.

Rice formulas

We give here a quick account of Rice formulas, which allow to express the
expectation and the higher moments of the size of level sets of random fields
by means of some integral formulas. The simplest case occurs when both the
4

dimension of the domain and the range are equal to 1, for which the first results
date back to Rice [17] (see also Cramer and Leadbetters book [8]). When
the dimension of the domain and the range are equal but bigger than 1, the
formula for the expectation is due to Adler [1] for stationary random fields. For
a general treatment of this subject, the interested reader is referred to the book
[4], Chapters 3 and 6, where one can find proofs and details.
Theorem 1 (Expectation of the number of crossings, d = d = 1) Let W =
{W (t) : t I} , I an interval in the real line, be a Gaussian process having C 1 paths. Assume that Var(W (t)) = 0 for every t I.
Then:
(4)
E NIW (u) = E |W (t)| W (t) = u pW (t) (u)dt.
I

Theorem 2 (Higher moments of the number of crossings, d = d = 1) Let


m 2 be an integer. Assume that W satisfies the hypotheses of Theorem 1 and
moreover, for any choice of pairwise different parameter values t1 , ..., tm I
the joint distribution of the k-random vector (W (t1 ), ..., W (tm )) has a density
(which amounts to saying that its variance matrix is non-singular). Then:
E NIW (u)(NIW (u) 1)...(NIW (u) m + 1)
m

E
Im

j=1

|W (tj )| W (t1 ) = ... = W (tm ) = u pW (t1 ),...,W (tm ) (u, ..., u)dt1 ...dtm .
(5)

Under certain conditions, the formulas in Theorems 1 and 2 can be extended


to non-Gaussian processes.
Theorem 3 (Expectation, d = d > 1) Let W : A Rd Rd be a Gaussian

random field, A an open set of Rd , u a fixed point in Rd . Assume that


the sample paths of W are continuously differentiable
for each t A the distribution of W (t) does not degenerate
P({t A : W (t) = u , det(W (t)) = 0}) = 0
Then for every Borel set B included in A
E NBW (u) =

E[| det W (t)| W (t) = u]pW (t) (u)dt.


B

If B is compact, both sides are finite.


The next proposition provides sufficient conditions (which are mild) for the third
hypothesis in the above theorem to be verified (see again [4], Proposition 6.5).

Proposition 1 Under the same conditions of the above theorem one has
P({t A : W (t) = u , det(W (t)) = 0}) = 0
if
pX(t) (x) C for all x in some neighborhood of u,
at least one of the two following conditions is satisfied
a) the trajectories of W are twice continuously differentiable
b)
() = sup P{| det W (t)| < W (t) = x} 0
xV (u)

as 0 where V (u) is some neighborhood of u.


Theorem 4 (m-th factorial moment d = d > 1) Let m 2 be an integer.
Assume the same hypotheses as in Theorem 3 except for (iii) that is replaced by
(iii) for t1 , ..., tm A distinct values of the parameter, the distribution of
W (t1 ), ..., W (tm )
does not degenerate in (Rd )m .
Then for every Borel set B contained in A, one has
E

NBW (u) NBW (u) 1 ... NBW (u) m + 1


m

E
Bm

j=1

| det W (tj ) | W (t1 ) = ... = W (tm ) = u


pW (t1 ),...,W (tm ) (u, ..., u)dt1 ...dtm , (6)

where both sides may be infinite.


When d > d we have the following formula :
Theorem 5 (Expectation of the geometric measure of the level set. d > d )

Let W : A Rd be a Gaussian random field, A an open subset of Rd , d > d

and u Rd a fixed point. Assume that:


Almost surely the function t

W (t) is of class C 1 .

For each t A, W (t) has a non-degenerate distribution.


P{t A, W (t) = u, W (t) does not have full rank} = 0

Then, for every Borel set B contained in A, one has


E (dd (W, B)) =

det W (t)(W (t))T

1/2

W (t) = u

pW (t) (u)dt.

(7)

If B is compact, both sides in (7) are finite.


The same kind of result holds true for integrals over the level set, as stated
in the next theorem.
Theorem 6 (Expected integral on the level set) Let W be a random field
that verifies the hypotheses of Theorem 5. Assume that for each t A one has
another random field Y t : V Rn , where V is some topological space, verifying
the following conditions:
Y t (v) is a measurable function of (, t, v) and almost surely, (t, v)
Y t (v) is continuous.
For each t A the random process (s, v) W (s), Y t (v) defined on
W V is Gaussian.

Moreover, assume that g : A C(V, Rn ) R is a bounded function, which is


continuous when one puts on C(V, Rn ) the topology of uniform convergence on
compact sets. Then, for each compact subset B of A, one has
g(t, Y t )dd (W, dt)

E
BW 1 (u)

E [det(W (t)(W (t))T )]1/2 g(t, Y t ) Z(t) = u .pZ(t) (u)dt.

(8)

Specular points and twinkles

3.1

Number of roots

Let W (t) : Rd Rd be a zero mean stationary Gaussian field. If W satisfies


the conditions of Theorem 3 one has:
E NAW (u) = |A|E[| det(W (0))|]pW (0) (u).
where |A| denotes the Lebesgue measure of A.
For d = 1, NAW (u) is the number of crossings of the level u and the formula
becomes
W
E N[0,T
] (u) =

where

i =

i d()

u2
2 2
0 ,
e
0

i = 0, 2, 4, . . . ,

being the spectral mesure of W .


Formula (9) is in fact the one S.O. Rice wrote in the 40s see [17].

(9)

3.2

Number of specular points

We consider first the one-dimensional static case with the longuet-Higgins approximation (2) for the number of specular points, that is:
SP2 (I) = #{x I : Y (x) = W (x) kx = 0}

We assume that the Gaussian process {W (x) : x R} has C 2 paths and


Var(W (x)) is constant equal to, say, v 2 > 0. (This condition can always be
obtained by means of an appropriate non-random time change, the unit speed
transformation) . Then Theorem 1 applies and
1 kx
E(|Y (x)|) ( )dx
v
v
I
1 kx
= G(k, (x)) ( )dv, (10)
v
v
I

E(|Y (x)| Y (x) = 0)pY (x) (0)dx =

E(SP2 (I)) =
I

where 2 (x) is the variance of W (x) and G(, ) := E(|Z|), Z with distribution
N (, 2 ).
For the second equality in (10), in which we have erased the condition in the
conditional expectation, take into account that since Var(W (x)) is constant,
for each x the random variables W (x) and W (x) are independent (differentiate under the expectation sign and use the basic properties of the Gaussian
distribution).
An elementary computation gives:
G(, ) = [2(/) 1] + 2(/),
where (.) and (.) are respectively the density and the cumulative distribution
functions of the standard Gaussian distribution.
When the process W (x) is also stationary, v 2 = 2 and 2 (x) is constant
equal to 4 . If we look at the total number of specular points over the whole
line, we get

G(k, 4 )
(11)
E(SP2 (R)) =
k
which is the result given by [14] (part II, formula (2.14) page 846). Note that
this quantity is an increasing function of k4 ) .
Since in the longuet-Higgins approximation k 0, one can write a Taylor
expansion having the form:
E(SP2 (R))

24 1
1 k2
1 k4
1+
+
+ ...
k
2 4
24 24

(12)

Let us turn to the variance of the number of specular points, under some
additional restrictions. First of all, we assume for this computation that the
8

given process {W (x) : x R} is stationary with covariance function


E(W (x)W (y)) = (x y). is assumed to have enough regularity as to perform the computations below, the precise requirements on it being given in the
statement of Theorem 7.
Putting for short S = SP2 (R), we have:
Var(S) = E(S(S 1)) + E(S) [E(S)]2

(13)

The first term can be computed using Theorem 2:


E(S(S 1)) =

R2

E |W (x) k||W (y) k| W (x) = kx, W (y) = ky

.pW (x),W (y) (kx, ky) dxdy

(14)

where
1 k 2 (2 x2 + 22 (x y)xy + 2 y 2 )
,
2
22 2 (x y)
2 22 2 (x y)
(15)
under the additional condition that the density (15) does not degenerate for
x = y.
For the conditional expectation in (14) we perform a Gaussian regression of
W (x) (resp. W (y)) on the pair (W (x), W (y)). Putting z = x y, we obtain:
1

pW (x),W (y) (kx, ky) =

exp

W (x) = y (x) + ay (x)W (x) + by (x)W (y)


(z) (z)
22 2 (z)
2 (z)
,
by (x) = 2
2 2 (z)

ay (x) =

(16)

where y (x) is Gaussian centered, independent of (W (x), W (y)). The regression of W (y) is obtained by permuting x and y.
The conditional expectation in (14) can now be rewritten as an unconditional
expectation:
(z)x + 2 y
22 2 (z)

(z)y + 2 x
22 2 (z)
(17)
Notice that the singularity on the diagonal x = y is removable, since a Taylor
expansion shows that for z 0:
E y (x) k (z) 1 +

(z) 1 +

x (y) k (z) 1 +

(z)x + 2 y
1 4
=
x z + O(z 3 ) .
22 2 (z)
2 2

(18)

One can check that


2 (z) = E (y (x))2 = E (x (y))2 = 4
9

2 2 (z)
22 2 (z)

(19)

and
E y (x)x (y) = (4) (z) +

2 (z) (z)
.
22 2 (z)

(20)

Moreover, if 6 < +, performing a Taylor expansion one can show that as


z 0 one has
1 2 6 24 2
z
(21)
2 (z)
4
2
and it follows that the singularity at the diagonal of the integrand in the righthand side of (14) is also removable.
We will make use of the following auxiliary statement that we state as a
lemma for further reference.The proof requires some calculations, but is elementary and we skip it. The value of H(; 0, 0) below can be found for example
in [8], p. 211-212.
Lemma 1 Let
H(; , ) = E(| + || + |)
where the pair (, ) is centered Gaussian, E( 2 ) = E( 2 ) = 1, E() = .
Then,
H(; , ) = H(; 0, 0) + R2 (; , )
where

H(; 0, 0) =

1 2 +

2
arctan

and

1 2

|R2 (; , )| 3(2 + 2 )

if 2 + 2 1 and 0 1.

In the next theorem we compute the equivalent of the variance of the number
of specular points, under certain hypotheses on the random process and with
the longuet-Higgins asymptotic. This result is new and useful for estimation
purposes since it implies that, as k 0, the coefficient of variation of the
random variable S tends to zero at a known speed. Moreover, it will also appear
in a natural way when normalizing S to obtain a Central Limit Theorem.
Theorem 7 Assume that the centered Gaussian stationary process W = {W (x) :
x R} is dependent, that is, (z) = 0 if |z| > , and that it has C 4 -paths.
Then, as k 0 we have:
Var(S) =

1
+ O(1).
k

where
J
= +
2

24
24
,
3

2
10

(22)

J=

2 (z)H (z); 0, 0)
2(2 + (z))

dz,

(23)

the functions H and 2 (z) have already been defined above, and
(z) =

(z)2 (z)
1
(4)

(z)
+
.
2 (z)
22 2 (z)

Remarks on the statement.


The assumption that the paths of the process are of class C 4 imply that
8 < . This is well-known for Gaussian stationary processes (see for
example [8]).
Notice that since the process is -dependent, it is also -dependent for any
> . It is easy to verify that when computing with such a instead of
one gets the same value for .
One can replace the -dependence by some weaker mixing condition, such
as
(i) (z) (const)(1 + |z|) (0 i 4)
for some > 1, in which case the value of should be replaced by:
=

1
24
+

2 (z)H((z); 0, 0)
2

2 +

(z)

1 4

dz.
2

The proof of this extension can be performed following the same lines as
the one we give below, with some additional computations.
Proof of the Theorem: We use the notations and computations preceding
the statement of the theorem.
Divide the integral on the right-hand side of (14) into two parts, according as
|x y| > or |x y| , i.e.
E(S(S 1)) =

... = I1 + I2 .

... +

(24)

|xy|

|xy|>

In the first term, the dependence of the process implies that one can
factorize the conditional expectation and the density in the integrand. Taking
into account that for each x R, the random variables W (x) and W (x) are
independent, we obtain for I1 :
I1 =
|xy|>

E |W (x) k| E |W (y) k| pW (x) (kx)pW (y) (ky)dxdy.

On the other hand, we know that W (x) (resp. W (x)) is centered normal with
variance 2 (resp. 4 ). Hence:
I1 = G(k,

4 )

2
|xy|>

1 k 2 (x2 + y 2 )
1
exp
dxdy,
22
2
2
11

To compute the integral on the right-hand side, notice that the integral over
the whole x, y plane is equal to 1/k 2 so that it suffices to compute the integral
over the set |x y| . Changing variables, this last one is equal to
+

x+

dx

x
+

1
=
2k 2

1
1 k 2 (x2 + y 2 )
dy
exp
22
2
2
1

e 2 u du

u+ k

e 2 v dv

u k

+ O(1),
=
k 2
where the last term is bounded if k is bounded (in fact, remember that we are
considering an approximation in which k 0). So, we can conclude that:
|xy|>

1
1
1 k 2 (x2 + y 2 )

dxdy = 2
exp
+ O(1)
22
2
2
k
k 2

Replacing in the formula for I1 and performing a Taylor expansion, we get:


I1 =

24 1

+ O(1) .
2
k
k 2

(25)

Let us now turn to I2 .


Using Lemma 1 and the equivalences (18) and (21), whenever |z| = |x y|
, the integrand on the right-hand side of (14) is bounded by
(const) H((z); 0, 0) + k 2 (x2 + y 2 ) .
We divide the integral I2 into two parts:
First, on the set {(x, y) : |x| 2, |x y| } the integral is clearly bounded
by some constant.
Second, we consider the integral on the set {(x, y) : x > 2, |x y| }.
(The symmetric case, replacing x > 2 by x < 2 is similar,that is the reason
for the factor 2 in what follows).
We have (recall that z = x y):
2 (z) H (z); 0, 0 + R2 (z); ,

I2 = O(1) + 2
|xy|,x>2

22 2 (z)

exp

12

1 k 2 (2 x2 + 2 (x y)xy + 2 y 2 )
dxdy
2
22 2 (x y)

which can be rewritten as:

I2 = O(1) + 2

2 (z) H (z); 0, 0 + R2 (z); ,

1
2(2 + (z))
+

exp

1
2(2

(z))

k2 z 2
2
1
1

2 2 (z) 2 + (z) 2

exp k 2

dz

(x z/2)2
dx
2 (z))

In the inner integral we perform the change of variables

2k(x z/2)
=
2 (z)
so that it becomes:
+
1
1 1
1
1
exp 2 d =

+ O(1)
2
k 2 0
2
2 2k

where 0 = 2 2k(2 z/2)/ 2 (z).

(26)

Notice that O(1) in (26) is uniformly bounded, independently of k and z, since


the hypotheses on the process imply that 2 (z) is bounded below by a
positive number, for all z.
We can now replace in the expression for I2 and we obtain
J
I2 = O(1) + .
k 2

(27)

To finish, put together (27) with (25), (24), (13) and (12).
Corollary 1 Under the conditions of Theorem 7, as k 0:
Var(S)
k.
E(S)
The proof follows immediately from the Theorem and the value of the expectation.
The computations made in this section are in close relation with the two
results of Theorem 4 in Kratz and Leon [12]. In this paper the random variable
SP2 (I) is expanded in the Wiener-Hermite Chaos. The aforementioned expansion yields the same formula for the expectation and allows obtaining also a
formula for the variance. However, this expansion is difficult to manipulate in
order to get the result of Theorem 7.
Let us now turn to the Central Limit Theorem.
13

Theorem 8 Assume that the process W satisfies the hypotheses of Theorem 7.


In addition, we will assume that the fourth moment of the number of approximate specular points on an interval having length equal to 1 is bounded uniformly
in k, that is
4
E SP2 ([0, 1])
(const)
(28)
Then, as k 0,

24 1
k

/k

N (0, 1),

where denotes convergence in distribution.


Remark.
One can give conditions for the added hypothesis (28) to hold true, which
require some additional regularity for the process. Even though they are not
nice, they are not costly from the point of view of physical models. For example,
either one of the following conditions imply (28):
The paths x
W (x) are of class C 11 . (Use Theorem 3.6 of [4] with m = 4,
applied to the random process {W (x) : x R}. See also [16]).
The paths x
W (x) are of class C 9 and the support of the spectral
measure has an accumulation point: apply Exercice 3.4 of [4] to get the
non-degeneracy condition, Proposition 5.10 of [4] and Rice formula (Theorem 2) to get that the fourth moment of the number of zeros of W (x)
is bounded.
Proof of the Theorem. Let and be real numbers satisfying the conditions
1/2 < < 1, + > 1, 2 + < 2. It suffices to prove the convergence as k
takes values on a sequence of positive numbers tending to 0. To keep in mind
that the parameter is k, we use the notation
S(k) := S = SP2 (R)
Choose k small enough, so that k > 2 and define the sets of disjoint
intervals, for j = 0, 1, . . . , [k ]:
Ujk = (j 1)[k ] + /2, j[k ] /2 ,
Ijk = j[k ] /2, j[k ] + /2 .

[.] denotes integer part.


Notice that each interval Ujk has length [k ] and that two neighboring
intervals Ujk are separated by an interval of length . So, the -dependence of
the process implies that the random variables SP2 (Ujk ), j = 0, 1, . . . , [k ]
14

are independent. A similar argument applies to SP2 (Ijk ), j = 0, 1, . . . , [k ].


We denote:
SP2 (Ujk ),

T (k) =
|j|[k ]

Denote
Vk = Var(S(k))

1/2

where the equivalence is due to Theorem 7.

k/

We give the proof in two steps, which easily imply the statement. In the
first one, we prove that
Vk [S(k) T (k)]

tends to 0 in the L2 of the underlying probability space.


In the second step we prove that
Vk T (k)
is asymptotically standard normal.

Step 1. We prove first that Vk [S(k) T (k)] tends to 0 in L1 . Since it is


non-negative, it suffices to show that its expectation tends to zero. We have:
S(k) T (k) =

SP2 (Ijk ) + Z1 + Z2
|j|<[k ]

where
Z1 = SP2 , [k ].[k ] + /2 ,
Z2 = SP2 [k ].[k ] /2, +) .
Using the fact that E SP2k (I) (const)
+

Vk E(S(k)T (k)) (const)k 1/2

=0

(kx/ 2 )dx, one can show that

[k ]k

+
2

(kx/

2 )dx .

[k ][k ]

which tends to zero as a consequence of the choice of and .


It suffices to prove that Vk2 Var S(k) T (k) 0 as k 0. Using independence:
Var S(k) T (k) =

Var SP2 (Ijk ) + Var(Z1 ) + Var(Z2 )


|j|<[k ]

|j|<[k ]

E SP2 (Ijk )(SP2 (Ijk ) 1)

+ E(Z1 (Z1 1)) + E(Z2 (Z2 1)) + E S(k) T (k) .


15

(29)

We already know that Vk2 E S(k) T (k) 0. Using the hypotheses of the
theorem, since each Ijk can be covered by a fixed number of intervals of size one,
we know that E SP2 (Ijk )(SP2 (Ijk ) 1) is bounded by a constant which does
not depend on k and j. We can write
Vk2
|j|<[k ]

E SP2 (Ijk )(SP2 (Ijk ) 1) (const)k 1

which tends to zero because of the choice of . The remaining two terms can
be bounded by calculations similar to those of the proof of Theorem 7.
Step 2.
T (k) is a sum of independent but not equi-distributed random
variables. To prove it satisfies a Central Limit Theorem, we use a Lyapunov
condition based of fourth moments. Set:
Mjm := E

SP2 (Ujk ) E SP2 (Ujk )

For the Lyapunov condition it suffices to verify that


4
|j|[k ]

Mj4 0 as k 0,

(30)

where
2 :=

Mj2 .
|j|[k ]

To prove (30), let us partition each interval Ujk into p = [k ] 1 intervals


I1 , ...Ip of equal size . We have
E SP1 + + SPp )4 =

E SPi1 SPi2 SPi3 SPi4 ,

(31)

1i1 ,i2 ,i3 ,i4 p

where SPi stands for SP2 (Ii ) E SP2 (Ii ) Since the size of all the intervals
is equal to and given the finiteness of fourth moments in the hypothesis, it
follows that E SPi1 SPi2 SPi3 SPi4 is bounded.
On the other hand, notice that the number of terms which do not vanish in
the sum of the right-hand side of (31) is O(p2 ). In fact, if one of the indices in
(i1 , i2 , i3 , i4 ) differs more than 1 from all the other, then E SPi1 SPi2 SPi3 SPi4
vanishes. Hence,
E SP2 (Ujk ) E SP2 (Ujk )

(const)k 2

so that |j|[k ] Mj4 = O(k 2 k ). The inequality 2 + < 2 implies Lyapunov condition.

16

3.3

Number of specular points without approximation

We turn now to the computation of the expectation of the number of specular


points SP1 (I) defined by (1). This number of specular points is equal to the
number of zeros of the process
Z(x) := W (x) m1 (x, W (x)) = 0,
where
m1 (x, w) =

x2 (h1 w)(h2 w) + [x2 + (h1 w)2 ][x2 + (h2 w)2 ]


.
x(h1 + h2 2w)

Assume that the process {W (x) : x R} is Gaussian, centered, stationary, with


0 = 1. The process Z(t) is not Gaussian and we must use a generalization of
Theorem 1, namely Theorem 3.2 of [4] to get
b

E SP1 ([a, b]) =

dx
a

E |Z (x)| Z(x) = 0, W (x) = w


m2 (x,w)
w2
1
1
12
2
e
dw.
. e 2
22
2

(32)

For the conditional expectation in (32), notice that


Z (x) = W (x)

m1
m1
(x, W (x))
(x, W (x))W (x),
x
w

so that under the condition,


Z (x) = W (x)K(x, w), where K(x, w) =

m1
m1
(x, w))+
(x, w))m1 (x, w).
x
w

Using that for each x, W (x) and W (x) are independent random variables
and performing a Gaussian regression of W (x) on W (x), we can write (32) in
the form:

E SP1 ([a, b])


b

dx
a

E | 2 w K(x, w)|

1
1
m2 (x, w)
exp (w2 + 1
) dw.
2
2
2 2
(33)

where is centered Gaussian with variance 4 22 . Formula (33) can still be


rewritten as:
E SP1 ([a, b])
=

1
2

4 22
2

dx
a

m2 (x, w)
1
) dw,
G(m, 1) exp (w2 + 1
2
2
17

(34)

where
m = m(x, w) =

2 w + K(x, w)
4 22

Notice that in (34), the integral is convergent as a , b + and


that this formula is well-adapted to numerical approximation.

3.4

Number of twinkles

We give a proof of a result stated in [14] (part III pages 852-853).


We consider Y(x, t) defined by (3) and we limit ourselves to the case in which
W (x, t) is centered and stationary. If Y satisfies the conditions of Theorem 3,
by stationarity we get
E T W(I, T ) = T

E | det Y (x, t)| Y(x, t) = 0 pY(x,t) (0)dx.

(35)

Since Wxx and Wx are independent with respective variances


+

40 =

4 (d, d )

20 =

2 (d, d ),

where is the spectral measure of the stationary random field W (x, t). The
density in (35) satisfies
pY(x,t) (0) = (20 )1/2 kx(20 )1/2 (40 )1/2 k(40 )1/2 .
On the other hand
Y (x, t) =

Wxx (x, t) k
Wxxx (x, t)

Wxt (x, t)
Wxxt (x, t)

Under the condition Y(x, t) = 0, one has


| det(Y (x, t))| = |Wxt (x, t)Wxxx (x, t)|.
Computing the regression it turns out that the conditional distribution of the
pair (Wxt (x, t), Wxxx (x, t)) under the same condition, is the one of two independent centered gaussian random variables, with the following parameters:
31
k and variance 22
40
40
kx and variance 60
expectation
20
expectation

231
, for the first coordinate
40
240
, for the second coordinate
20

(36)
(37)

It follows that:
E | det(Y (x, t))| Y(x, t) = 0 = G

31
k,
40
18

22

40
231
.G
kx,
40
20

60

240
20

Summing up:
1
E T W(R, T ) =
T
1

40
=

40

k
40

setting 6 := 60
III page 853).

3.5

31
k,
40

240
20

22
31
k,
40

231
40

G
R

22

40
kx,
20

231 1
40 k

60

1
240

20
20

6 20 + 40
6

20 40
6 + 240

kx

20
(38)

This result is equivalent to formula (4.7) of [14] (part

Specular points in two dimensions

We consider at fixed time a random surface depending on two space variables x


and y. The source of light is placed at (0, 0, h1 ) and the observer is at (0, 0, h2 ).
The point (x, y) is a specular point if the normal vector n(x, y) = (Wx , Wy , 1)
to the surface at (x, y) satisfies the following two conditions:
the angles with the incident ray I = (x, y, h1 W ) and the reflected
ray R = (x, y, h2 W ) are equal (for short the argument (x, y) has
been removed),
it belongs to the plane generated by I and R.
Setting i = hi W and ri =
case we have:

x2 + y 2 + i , i = 1, 2, as in the one-parameter

x
2 r1 1 r2
,
2
+y
r2 r1
y
2 r1 1 r2
Wy = 2
.
x + y 2 r2 r1

Wx =

x2

(39)

When h1 and h2 are large, the system above can be approximated by


Wx = kx
Wy = ky,

(40)

under the same conditions as in dimension 1.


Next, we compute the expectation of SP2 (Q), the number of approximate
specular points in the sense of (40) that are in a domain Q. In the remaining of
this paragraph we limit our attention to this approximation and to the case in
which {W (x, y) : (x, y) R2 } is a centered Gaussian stationary random field.

19

Let us define:

Wx (x, y) kx
Wy (x, y) ky

Y(x, y) :=

(41)

Under very general conditions, for example on the spectral measure of {W (x, y) :
x, y R} the random field {Y (x, y) : x, y R} satisfies the conditions of Theorem 3, and we can write:
E SP2 (Q) =
Q

E | det Y (x, y)| pY(x,y) (0) dxdy,

(42)

since for fixed (x, y) the random matrix Y (x, y) and the random vector Y (x, y)
are independent, so that the condition in the conditional expectation can be
erased.
The density in the right hand side of (42) has the expression
pY(x,y) (0) = p(Wx ,Wy ) (kx, ky)
k2
02 x2 211 xy + 20 y 2 .
2(20 02 211 )
20 02 211
(43)
To compute the expectation of the absolute value of the determinant in the right
hand side of (42), which does not depend on x, y, we use the method of [6]. Set
2
:= det Y (x, y) = (Wxx k)(Wyy k) Wxy
.
=

1
2

exp

We have
E(||) = E

1 cos(t)
dt .
t2

(44)

Define
2
h(t) := E exp it[(Wxx k)(Wyy k) Wxy
]

Then
E(||) =

+
0

1 Re[h(t)]
dt .
t2

To compute h(t) we define

0 1/2 0
0
A = 1/2 0
0
0 1

and the variance matrix of Wxx , Wyy , Wx,y

40 22 31
:= 22 04 13 .
31 13 22

20

.
(45)

Let1/2 A1/2 = P diag(1 , 2 , 3 )P T where P is orthogonal. Then by a


diagonalization argument
h(t) = eitk

E exp it (1 Z12 k(s11 +s21 )Z1 )+(2 Z22 k(s12 +s22 )Z2 )+(3 Z32 k(s13 +s23 )Z3 )
where (Z1 , Z2 , Z3 ) is standard normal and sij are the entries of 1/2 P T .
One can check that if is a standard normal variable and , are real
constants, > 0:

E ei (+)

i 2

= (12i )1/2 e (12i ) =

2
2
1
+i
+
exp
1 + 4 2
1 + 4 2
(1 + 4 2 )1/4

where

1
arctan(2 ), 0 < < /4.
2
Replacing in (46), we obtain for Re[h(t)] the formula:
=

Re[h(t)] =
j=1

dj (t, k)
1+

42j t2

j (t) + k 2 tj (t)

cos

(47)

j=1

where, for j = 1, 2, 3:
dj (t, k) = exp

k 2 t2 (s1j + s2j )2
,
2
1 + 42j t2

j (t) =

1
arctan(2j t), 0 < j < /4,
2

j (t) =

(s1j + s2j )2 j
1
.
t2
3
1 + 42j t2

Introducing these expressions in (45) and using (43) we obtain a new formula
which has the form of a rather complicated integral. However, it is well adapted
to numerical evaluation.
On the other hand, this formula allows us to compute the equivalent as
k 0 of the expectation of the total number of specular points under the
longuet-Higgins approximation. In fact, a first order expansion of the terms in
the integrand gives a somewhat more accurate result, that we state as a theorem:
Theorem 9
E SP2 (R2 ) =

21

m2
+ O(1)
k2

(48)

(46)

where
+

m2 =

3
j=1 (1

+ 42j t2 )

cos

3
j=1

j (t)

t2

1/2

1 23/2

3
j=1

Aj

1 + Aj
t2

dt

1 B1 B2 B2 B3 B3 B1

dt,
(49)

where
Aj = Aj (t) = 1 + 42j t2

1/2

, Bj = Bj (t) =

(1 Aj )/(1 + Aj ).

Notice that m2 only depends on the eigenvalues 1 , 2 , 3 and is easily


computed numerically.
In Flores and Leon [10] a different approach was followed in search of a formula for the expectation of the number of specular points in the two-dimensional
case, but their result is only suitable for Montecarlo approximation.
We now consider the variance of the total number of specular points in
two dimensions, looking for analogous results to the one-dimensional case (i.e.
Theorem 7 and its Corollary 1), in view of their interest for statistical applications. It turns out that the computations become much more involved. The
statements on variance and speed of convergence to zero of the coefficient of
variation that we give below include only the order of the asymptotic behavior
in the longuet-Higgins approximation, but not the constant. However, we still
consider them to be useful. If one refines the computations one can give rough
bounds on the generic constants in Theorem 10 and Corollary 2 on the basis of
additional hypotheses on the random field.
We assume that the real-valued, centered, Gaussian stationary random field
{W (x) : x R2 } has paths of class C 3 , the distribution of W (0) does not
degenerate (that is Var(W (0)) is invertible). Moreover, let us consider W (0),
expressed in the reference system xOy of R2 as the 2 2 symmetric centered
Gaussian random matrix:
W (0) =

Wxx (0) Wxy (0)


Wxy (0) Wyy (0)

The function
z

(z) = det Var W (0)z ,

defined on z = (z1 , z2 )T R2 , is a non-negative homogeneous polynomial of


degree 4 in the pair z1 , z2 . We will assume the non-degeneracy condition:
min{(z) : z = 1} = > 0.

(50)

Theorem 10 Let us assume that {W (x) : x R2 } satisfies the above condtions


and that it is also -dependent, > 0, that is, E W (x)W (y) = 0 whenever
22

x y > .
Then, for k small enough:
Var SP2 (R2 )

L
,
k2

where L is a positive constant depending upon the law of the random field.
A direct consequence of Theorems 9 and 10 is the following:
Corollary 2 Under the same hypotheses of Theorem 10, for k small enough,
one has:
Var SP2 (R2 )
L1 k
E SP2 (R2 )
where L1 is a new positive constant.
Proof of Theorem 10. For short, let us denote T = SP2 (R2 ). We have:
Var(T ) = E(T (T 1)) + E(T ) [E(T )]2

(51)

We have already computed the equivalents as k 0 of the second and third


term in the right-hand side of (51). Our task in what follows is to consider the
first term.
The proof is performed along the same lines as the one of Theorem 7, but
instead of applying Rice formula for the second factorial moment of the number of crossings of a one-parameter random process, we need Theorem 4 for
dimension d = 2. We write the factorial moment of order m = 2 in the form:
E(T (T 1))
=
R2 R2

E | det Y (x)|| det Y (y)| Y(x) = 0, Y(y) = 0 pY(x),Y(y) (0, 0) dxdy

... dxdy +
xy >

... dxdy = J1 + J2 .
xy

For J1 we proceed as in the proof of Theorem 7, using the -dependence and


the evaluations leading to the statement of Theorem 9. We obtain:
J1 =

O(1)
m22
+ 2 .
4
k
k

(52)

Let us show that for small k,


O(1)
.
k2
In view of (51), (48) and (52) this suffices to prove the theorem.
J2 =

23

(53)

We do not perform all detailed computations. The key point consists in


evaluating the behavior of the integrand that appears in J2 near the diagonal
x = y, where the density pY(x),Y(y) (0, 0) degenerates and the conditional expectation tends to zero.
For the density, using the invariance under translations of the law of W (x) :
x R2 , we have:
pY(x),Y(y) (0, 0) = pW (x),W (y) (kx, ky)
= pW (0),W (yx) (kx, ky)
= pW (0),[W (yx)W (0)] (kx, k(y x)).
Perform the Taylor expansion, for small z = y x R2 :
W (z) = W (0) + W (0)z + O( z 2 ).
Using the non-degeneracy assumption (50) and the fact that W (0) and W (0)
are independent, we can show that for x, z R2 , z :
C1
exp C2 k 2 ( x C3 )2
z 2

pY(x),Y(y) (0, 0)

where C1 , C2 , C3 are positive constants.


Let us consider the conditional expectation. For each pair x, y of different
points in R2 , denote by the unit vector (y x)/ y x and n a unit vector orthogonal to . We denote respectively by Y, Y, n Y the first and
second partial derivatives of the random field in the directions given by and n.
Under the condition
Y(x) = 0, Y(y) = 0
we have the following simple bound on the determinant, based upon its definition
and Rolles Theorem applied to the segment [x, y] = {x + (1 )y}:
det Y (x) Y(x)

n Y(x) y x

sup

Y(s)

n Y(x)

(54)

s[x,y]

So,
E | det Y (x)|| det Y (y)| Y(x) = 0, Y(y) = 0
y x 2E
= z 2E

sup

Y(s)

n Y(x)

n Y(y) W (x) = kx, W (y) = ky

s[x,y]

sup

Y(s)

n Y(0)

s[0,z]

24

n Y(z) W (0) = kx,

W (z) W (0)
= k ,
z

where the last equality is again a consequence of the stationarity of the random
field {W (x) : x R2 }.
At this point, we perform a Gaussian regression on the condition. For the
condition, use again Taylor expansion, the non-degeneracy hypothesis and the
independence of W (0) and W (0). Then, use the finiteness of the moments of
the supremum of bounded Gaussian processes (see for example [4], Ch. 2), take
into account that z to get the inequality:
E | det Y (x)|| det Y (y)| Y(x) = 0, Y(y) = 0 C4 z

1+k x

(55)

where C4 is a positive constant. Summing up, we have the following bound for
J2 :
J2 C1 C4 2

1+k x

R2
+

= C1 C4 2 2 2

1 + k
0

exp C2 k 2 ( x C3 )2 dx
4

(56)

exp C2 k 2 ( C3 )2 d

Performing the change of variables w = k, (53) follows.

The distribution of the normal to the level


curve

Let us consider a modeling of the sea W (x, y, t) as a function of two space variables and one time variable. Usual models are centered Gaussian stationary
with a particular form of the spectral measure that we discuss briefly below.
We denote the covariance by (x, y, t) = E(W (0, 0, 0)W (x, y, t)).
In practice, one is frequently confronted with the following situation: several
pictures of the sea on time over an interval [0, T ] are stocked and some properties
or magnitudes are observed. If the time T and the number of pictures are large,
and if the process is ergodic in time, the frequency of pictures that satisfy a
certain property will converge to the probability of this property to happen at
a fixed time.
Let us illustrate this with the angle of the normal to the level curve at a
point chosen at random. We consider first the number of crossings of a level
u by the process W (, y, t) for fixed t and y, defined as
W (,y,t)

N[0,M1 ] (u) = #{x : 0 x M1 ; W (x, y, t) = u}.


We are interested in computing the total number of crossings per unit time when
integrating over y [0, M2 ] i.e.
1
T

M2

dt
0

W (,y,t)

N[0,M1 ] (u) dy.


25

(57)

If the ergodicity assumption in time holds true, we can conclude that a.s.:
1
T

M2

W (,y,t))

N[0,M1 ]

dt
0

W (,0,0))

(u) dy M1 E N[0,M1 ]

(u) =

M1 M2

200 21 u2
000 ,
e
000

where
abc =
R3

ax by ct d(x , y , t )

are the spectral moments.


Hence, on the basis of the quantity (57) for large T , one can make inference
about the value of certain parameters of the law of the random field. In this
example these are the spectral moments 200 and 000 .
If two-dimensional level information is available, one can work differently because there exists an interesting relationship with Rice formula for level curves
that we explain in what follows.
We can write (x = (x, y)):
W (x, t) = ||W (x, t)||(cos (x, t), sin (x, t))T .
Instead of using Theorem 1, we can use Theorem 6, to write
M2

E
0

W (,y,0)

N[0,M1 ] (u) dy = E
=

CQ (0,u)

2 (Q)

| cos (x, 0)| d1

200 2u2
000 ,
e
000

(58)

where Q = [0, M1 ] [0, M2 ]. We have a similar formula when we consider sections of the set [0, M1 ] [0, M2 ] in the other direction. In fact (58) can be
generalized to obtain the Palm distribution of the angle .
Set h1 ,2 = 1I[1 , 2 ] , and for 1 < 2 define
F (2 ) F (1 ) : = E 1 ({x Q : W (x, 0) = u ; 1 (x, s) 2 })
=E

h1 ,2 ((x, s))d1 (x)ds


CQ (u,s)

(59)
u2

exp(
)
y W
)((x W )2 + (y W )2 )1/2 ] 200 .
= 2 (Q)E[h1 ,2 (
x W
2000
Denoting = 200 020 110 and assuming 2 (Q) = 1 for ease of notation, we

26

readily obtain
F (2 ) F (1 )

u2

e 2000

=
(2)3/2 ()1/2 000

R2

h1 ,2 () x2 + y 2 e 2 (02 x

211 xy+20 y 2 )

dxdy

u2

e 200

=
(2)3/2 (+ )1/2 000
2 exp(

2
(+ cos2 ( ) + sin2 ( )))dd
2+

where + are the eigenvalues of the covariance matrix of the random vector
(x W (0, 0, 0), y W (0, 0, 0)) and is the angle of the eigenvector associated to
+ . Remarking that the exponent in the integrand can be written as
1/ (1 2 sin2 ( )) with 2 := 1 + / and that
+
0

2 exp

H2
2

2H

it is easy to get that


2

F (2 ) F (1 ) = (const)

1 2 sin2 ( )

1/2

d.

From this relation we get the density g() of the Palm distribution, simply by
dividing by the total mass:
g() =

1 2 sin2 ( )

1 2 sin2 ( )

1/2
1/2

=
d.

1 2 sin2 ( )
4K( 2 )

1/2

(60)

Here K is the complete elliptic integral of the first kind. This density characterizes the distribution of the angle of the normal at a point chosen at random
on the level curve.
In the case of a random field which is isotropic in (x, y), we have 200 = 020
and moreover 110 = 0, so that g turns out to be the uniform density over
the circle (Longuet-Higgins says that over the contour the distribution of the
angle is uniform (cf. [15], pp. 348)).
Let now W = {W (x, t) : t R+ , x = (x, y) R2 } be a stationary zero
mean Gaussian random field modeling the height of the sea waves. It has the
following spectral representation:
ei(1 x+2 y+t)

W (x, y, t) =

27

f (1 , 2 , )dM (1 , 2 , ),

where is the manifold {21 + 22 = 4 } (assuming that the acceleration of


gravity g is equal to 1) and M is a random Gaussian orthogonal measure defined
on (see [13]). This leads to the following representation for the covariance
function
(x, y, t) =
=

ei(1 x+2 y+t) f (1 , 2 , )2 (dV )

ei(

x cos + 2 y sin +t)

G(, )dd,

where, in the second equation, we made the change of variable 1 = 2 cos ,


2 = 2 sin and G(, ) = f ( 2 cos , 2 sin , )2 3 . The function G is
called the directional spectral function. If G does not depend of the random field W is isotropic in x, y.
Let us turn to ergodicity. For a given subset Q of R2 and each t, let us define
At = {W (x, y, t) : > t ; (x, y) Q}
and consider the -algebra of t-invariant events A = At . We assume that
for each pair (x, y), (x, y, t) 0 as t +. It is well-known that under
this condition, the -algebra A is trivial, that is, it only contains events having
probability zero or one (see for example [8], Ch. 7).
This has the following important consequence in our context. Assume further
that the set Q has a smooth boundary and for simplicity, unit Lebesgue measure.
Let us consider
Z(t) =
H x, t d1 (x),
CQ (u,t)

where H x, t = H W (x, t), W (x, t) , where W = (Wx , Wy ) denotes gradient in the space variables and H is some measurable function such that the integral is well-defined. This is exactly our case in (59). The process {Z(t) : t R} is
strictly stationary, and in our case has a finite mean and is Riemann-integrable.
By the Birkhoff-Khintchine ergodic theorem ([8] page 151), a.s. as T +,
1
T

T
0

Z(s)ds EB [Z(0)],

where B is the -algebra of t-invariant events associated to the process Z(t).


Since for each t, Z(t) is At -measurable, it follows that B A, so that EB [Z(0)] =
E[Z(0)]. On the other hand, Rices formula yields (take into account that stationarity of W implies that W (0, 0) and W (0, 0) are independent):
E[Z(0)] = E[H u, W (0, 0) ||W (0, 0)||]pW (0,0) (u).

28

We consider now the CLT. Let us define


Z(t) =

1
t

Z(s) E(Z(0)) ds,

In order to compute second moments, we use Rice formula for integrals over
level sets (cf. Theorem 6), applied to the vector-valued random field
X(x1 , x2 , s1 , s2 ) = (W (x1 , s1 ), W (x2 , s2 ))T .
The level set can be written as:
CQ2 (u, u) = {(x1 , x2 ) QQ : X(x1 , x2 , s1 , s2 ) = (u, u)}
So, we get
Var Z(t) =

2
t

t
0

for 0 s1 t, 0 s2 t.

s
(1 )I(u, s)ds,
t

where
I(u, s) =
Q2

E H(x1 , 0)H(x2 , s) W (x1 , 0)

W (x2 , s)

W (x1 , 0) = u ; W (x2 , s) = u

pW (x1 ,0),W (x2 ,s) (u, u)dx1 dx2 E[H u, W (0, 0) ||W (0, 0)||]pW (0,0) (u)
Assuming that the given random field is time--dependent, that is,
(x, y, t) = 0 (x, y), whenever t > , we readily obtain

t Var Z(t) 2

I(u, s)ds := 2 (u) as t .

Using now a variant of the Hoeffding-Robbins Theorem [11] for sums of dependent random variables, we get the CLT:

tZ(t) N (0, 2 (u)).

Numerical computations

Validity of the approximation for the number of specular


points
In the particular case of stationary processes we have compared the exact expectation given by (32) with the approximation (10).
In full generality the result depends on h1 , h2 , 4 and 2 . After scaling, we
can assume for example that 2 = 1.
The main result is that, when h1 h2 , the approximation (10) is very sharp.
For example with the value (100, 100, 3) for (h1 , h2 , 4 ), the expectation of the
total number of specular points over R is 138.2; using the approximation (11)
29

the result with the exact formula is around 2.102 larger but it is almost hidden
by the precision of the computation of the integral.
If we consider the case (90, 110, 3), the results are respectively 136.81 and
137.7.
In the case (100, 300, 3), the results differ significantly and Figure 1 displays
the densities (32) and (10)

0.7

0.6

0.5

0.4

0.3

0.2

0.1

100

200

300

400

500

Figure 1: Intensity of specular points in the case h1 = 100, h2 = 300, 4 = 3.


In solid line exact formula, in dashed line approximation (10)

Effect of anisotropy on the distribution of the angle of the


normal to the curve
We show the values of the density given by (60) in the case of anisotropic
processes = 0.5 and = /4. Figure 2 displays the densities of the Palm distribution of the angle showing a large departure from the uniform distribution.

Specular points in dimension 2


We use a standard sea model with a Jonswap spectrum and spread function
cos(2). It corresponds to the default parameters of the Jonswap function of
the toolbox WAFO [18]. The variance matrix of the gradient is equal to
104

114 0
0 81

30

0.55
0.5
0.45
0.4
0.35
0.3
0.25
0.2
0.15
0.1
0.05
2

1.5

0.5

0.5

1.5

Figure 2: Density of the Palm distribution of the angle of the normal to the
level curve in the case = 0.5 and = /4
and the matrix of Section 3.5 is

3 0
11 0
0 3

9
= 104 3
0

The spectrum is presented in Figure 3


Directional Spectrum
Level curves at:
2
4
6
8
10
12

90

0.8

120

60
0.6
0.4

150

30

0.2

180

330

210

300

240
270

Figure 3: Directional Jonswap spectrum as obtained using the default options


of Wafo

31

The integrand in (42) is displayed in Figure 4 as a function of the two space


variables x, y. The value of the asymptotic parameter m2 defining the expansion
on the expectation of the numbers of specular points, see(48), is 2.527103.

Figure 4: Intensity function of the specular points for the Jonswap spectrum

The Matlab programs used for these computations are available at

\protect\vrule width0pt\protect\href{http://www.math.univ-toulouse.fr/\string~azais/prog/pro

Application to dislocations of wavefronts

In this section we follow the article by Berry and Dennis [6]. As these authors, we
are interested in dislocations of wavefronts. These are lines in space or points in
the plane where the phase , of the complex scalar wave (x, t) = (x, t)ei(x,t) ,
is undefined, (x = (x1 , x2 )) is a two dimensional space variable). With respect
to light they are lines of darkness; with respect to sound, threads of silence.

32

It will be convenient to express by means of its real and imaginary parts:


(x, t) = (x, t) + i(x, t).
Thus the dislocations are the intersection of the two surfaces
(x, t) = 0

(x, t) = 0.

We assume an isotropic Gaussian model. This means that we will consider the
wavefront as an isotropic Gaussian field
(x, t) =
R2

exp (i[ k x c|k|t])(

(|k|) 1/2
) dW (k),
|k|

where, k = (k1 , k2 ), |k| =


k12 + k22 , (k) is the isotropic spectral density
and W = (W1 + iW2 ) is a standard complex orthogonal Gaussian measure
on R2 , with unit variance. Here we are interested only in t = 0 and we put
(x) := (x, 0) and (x) := (x, 0).
We have, setting k = |k|
cos( kx )(

(x) =
R2

(k) 1/2
) dW1 (k)
k

R2

(k) 1/2
) dW2 (k)+
k

R2

sin( kx )(

(k) 1/2
) dW2 (k) (61)
k

sin( kx )(

(k) 1/2
) dW1 (k) (62)
k

and
cos( kx )(

(x) =
R2

The covariances are


E [(x)(x )] = E [(x)(x )] = (|x x |) :=

J0 (k|x x |)(k)dk

where J (x) is the Bessel function of the first kind of order .


E [(r1 )(r2 )] = 0.

(63)

Moreover

Three dimensional model


In the case of a three dimensional Gaussian field, we have x = (x1 , x2 , x3 ),
k = (k1 , k2 , k3 ),k = |k| = k12 + k22 + k32 and
(x) =
R3

exp i[ k x ] (

(k) 1/2
) dW (k).
k2

In this case, we write the covariances in the form:

E [(r1 )(r2 )] = 4
0

sin(k|r1 r2 |)
(k)dk.
k|r1 r2 |

(64)

The same formula holds true for the process and also E [(r1 )(r2 )] = 0 for
any r1 , r2 , showing that the two coordinates are independent Gaussian fields .
33

6.1 Mean length of dislocation curves, mean number of dislocation points

Dimension 2: Let us denote by {Z(x) : x ∈ R²} a random field with values in R², with coordinates ξ(x), η(x), which are two independent Gaussian stationary isotropic random fields with the same distribution. We are interested in the expectation of the number of dislocation points

d2 := E[#{x ∈ S : ξ(x) = η(x) = 0}],

where S is a subset of the parameter space having area equal to 1.
Without loss of generality we may assume that Var(ξ(x)) = Var(η(x)) = 1, and for the derivatives we set λ2 = Var(∂_i ξ(x)) = Var(∂_i η(x)), i = 1, 2. Then, using stationarity and the Rice formula (Theorem 3), we get

d2 = E[|det(Z′(x))| | Z(x) = 0] p_{Z(x)}(0).

The stationarity implies independence between Z(x) and Z′(x), so that the conditional expectation above is in fact an ordinary expectation. The entries of Z′(x) are four independent centered Gaussian variables with variance λ2, so that, up to the factor λ2, |det(Z′(x))| is the area of the parallelogram generated by two independent standard Gaussian vectors in R². One can easily show that the distribution of this area is the product of independent square roots of a χ²(2) and a χ²(1) distributed random variables. An elementary calculation then gives E[|det(Z′(x))|] = λ2. Finally, we get

d2 = λ2 / (2π).

This quantity is equal to K2/(4π) in Berry and Dennis [6] notation, giving their formula (4.6).
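The identity E[|det Z′(x)|] = λ2 used above is easy to confirm by simulation; in the sketch below the four entries of Z′(x) are drawn as independent N(0, λ2) variables, with an arbitrary value of λ2.

```python
import numpy as np

rng = np.random.default_rng(0)
lam2 = 0.7                      # arbitrary value of lambda_2 for the check
n = 10**6
Zp = rng.normal(scale=np.sqrt(lam2), size=(n, 2, 2))   # entries ~ N(0, lambda_2)
dets = np.abs(Zp[:, 0, 0] * Zp[:, 1, 1] - Zp[:, 0, 1] * Zp[:, 1, 0])
print(dets.mean(), lam2)        # the two numbers should be close
```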

Dimension 3: In this case, our aim is to compute

d3 = E[L{x ∈ S : ξ(x) = η(x) = 0}],

where S is a subset of R³ having volume equal to 1 and L is the length of the curve. Note that d3 is denoted by d in [6]. We use the same notations and remarks, except that the form of the Rice formula is now (cf. Theorem 5)

d3 = (1/(2π)) E[(det Z′(x)Z′(x)^T)^{1/2}].

Again,

E[(det(Z′(x)Z′(x)^T))^{1/2}] = λ2 E(V),

where V is the area of the parallelogram generated by two standard Gaussian vectors in R³. A similar method to compute the expectation of this random area gives

E(V) = E(√(χ²(3))) · E(√(χ²(2))) = (4/√(2π)) · √(π/2) = 2,

leading eventually to

d3 = λ2 / π.
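The value E(V) = 2 used in this computation can be checked by a short simulation of the area of the parallelogram spanned by two independent standard Gaussian vectors in R³.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10**6
u = rng.standard_normal((n, 3))
v = rng.standard_normal((n, 3))
areas = np.linalg.norm(np.cross(u, v), axis=1)   # |u x v| = area of the parallelogram
print(areas.mean())                              # should be close to 2
```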

In Berry and Dennis' notation [6] this last quantity is denoted k2/(3π), giving their formula (4.5).

6.2 Variance

In this section we limit ourselves to dimension 2. Let S again be a measurable subset of R² having Lebesgue measure equal to 1. The computation of the variance of the number of dislocation points is performed using Theorem 4 to express

E[N_S^Z(0)(N_S^Z(0) − 1)] = ∫∫_{S²} A(s1, s2) ds1 ds2.

We assume that {Z(x) : x ∈ R²} satisfies the hypotheses of Theorem 4 for m = 2. Then we use

Var(N_S^Z(0)) = E[N_S^Z(0)(N_S^Z(0) − 1)] + d2 − d2².

Taking into account that the law of the random field is invariant under translations and orthogonal transformations of R², we have

A(s1, s2) = A((0, 0), (r, 0)) =: A(r)    with r = ‖s1 − s2‖.

Rice's function A(r) has two intuitive interpretations. First, it can be viewed as

A(r) = lim_{ε↓0} (1/(π²ε⁴)) E[N^{B((0,0),ε)} N^{B((r,0),ε)}].

Second, it is the density of the Palm distribution (a generalization of the horizontal window conditioning of [8]) of the number of zeros of Z per unit of surface, locally around the point (r, 0), given that there is a zero at (0, 0). A(r)/d2² is called the correlation function in [6].
To compute A(r), we put ξ1, ξ2, η1, η2 for the partial derivatives of ξ, η with respect to the first and second coordinates. Then

A(r) = E[|det Z′(0,0) det Z′(r,0)| | Z(0,0) = Z(r,0) = 0_2] p_{Z(0,0),Z(r,0)}(0_4)
     = E[|(ξ1η2 − ξ2η1)(0,0) · (ξ1η2 − ξ2η1)(r,0)| | Z(0,0) = Z(r,0) = 0_2] p_{Z(0,0),Z(r,0)}(0_4),    (65)

where 0_p denotes the null vector in dimension p.

The density is easy to compute:

p_{Z(0,0),Z(r,0)}(0_4) = 1 / ((2π)² (1 − ρ²(r))),    where ρ(r) = ∫_0^∞ J_0(kr) Π(k) dk.

We now use the same device as above to compute the conditional expectation of the modulus of the product of determinants, that is, we write

|w| = (1/π) ∫_{−∞}^{+∞} (1 − cos(wt)) t^{−2} dt,    (66)

and also the same notations as in [6]:

C := ρ(r),  E := ρ′(r),  H := −E/r,  F := −ρ″(r),  F0 := −ρ″(0).

The regression formulas imply that the conditional variance matrix of the vector

W = (ξ1(0), ξ1(r,0), ξ2(0), ξ2(r,0), η1(0), η1(r,0), η2(0), η2(r,0)),

given Z(0,0) = Z(r,0) = 0_2, is

Σ = Diag(A, B, A, B),

with

A = ( F0 − E²/(1−C²)    F − E²C/(1−C²) ;  F − E²C/(1−C²)    F0 − E²/(1−C²) ),
B = ( F0    H ;  H    F0 ).

Using formula (66), the expectation we have to compute is equal to

(1/π²) ∫_{−∞}^{+∞} dt1 ∫_{−∞}^{+∞} dt2  t1^{−2} t2^{−2} [ 1 − ½T(t1, 0) − ½T(−t1, 0) − ½T(0, t2) − ½T(0, −t2)
    + ¼T(t1, t2) + ¼T(−t1, t2) + ¼T(t1, −t2) + ¼T(−t1, −t2) ],    (67)

where

T(t1, t2) := E[exp(i(w1 t1 + w2 t2))]

with

w1 = (ξ1η2 − ξ2η1)(0, 0) = W1 W7 − W3 W5,
w2 = (ξ1η2 − ξ2η1)(r, 0) = W2 W8 − W4 W6.

Now T(t1, t2) = E[exp(i W^T H W)], where W has the distribution N(0, Σ) and

H = ( 0   0   0   D ;  0   0   −D   0 ;  0   −D   0   0 ;  D   0   0   0 ),    D = ½ ( t1   0 ;  0   t2 ).

A standard diagonalization argument shows that

T(t1, t2) = E[exp(i W^T H W)] = E[exp(i Σ_{j=1}^{8} λ_j ζ_j²)],

where the ζ_j are independent with standard normal distribution and the λ_j are the eigenvalues of Σ^{1/2} H Σ^{1/2}. Using the characteristic function of the χ²(1) distribution,

E[exp(i W^T H W)] = Π_{j=1}^{8} (1 − 2iλ_j)^{−1/2}.    (68)
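Formula (68) is the standard expression for the characteristic function of a Gaussian quadratic form; the following sketch checks it by Monte Carlo for an arbitrary covariance Σ and an arbitrary symmetric H (both random stand-ins, not the specific matrices of the text).

```python
import numpy as np

rng = np.random.default_rng(2)
d = 8
# random covariance Sigma and random symmetric H (illustrative stand-ins)
G = rng.standard_normal((d, d)); Sigma = G @ G.T / d + np.eye(d)
H = rng.standard_normal((d, d)); H = (H + H.T) / 2

# eigenvalues of Sigma^{1/2} H Sigma^{1/2}
w_, V = np.linalg.eigh(Sigma)
Sig_half = V @ np.diag(np.sqrt(w_)) @ V.T
lam = np.linalg.eigvalsh(Sig_half @ H @ Sig_half)
closed_form = np.prod((1 - 2j * lam) ** -0.5)

# Monte Carlo estimate of E exp(i W^T H W), W ~ N(0, Sigma)
W = rng.multivariate_normal(np.zeros(d), Sigma, size=200_000)
mc = np.mean(np.exp(1j * np.einsum('ij,jk,ik->i', W, H, W)))
print(closed_form, mc)   # the two complex numbers should be close
```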

Clearly,

Σ^{1/2} = Diag(A^{1/2}, B^{1/2}, A^{1/2}, B^{1/2})

and

Σ^{1/2} H Σ^{1/2} = ( 0   0   0   M ;  0   0   −M^T   0 ;  0   −M   0   0 ;  M^T   0   0   0 ),    with M = A^{1/2} D B^{1/2}.


Let Δ be an eigenvalue of Σ^{1/2}HΣ^{1/2}. It is easy to check that Δ² is an eigenvalue of MM^T. Conversely, if Δ1² and Δ2² are the eigenvalues of MM^T, those of Σ^{1/2}HΣ^{1/2} are ±Δ1 (each twice) and ±Δ2 (each twice).
Note that Δ1² and Δ2² are the eigenvalues of MM^T = A^{1/2}DBDA^{1/2} or, equivalently, of DBDA. Using (68),

E[exp(i W^T H W)] = [1 + 4(Δ1² + Δ2²) + 16 Δ1² Δ2²]^{−1} = [1 + 4 tr(DBDA) + 16 det(DBDA)]^{−1},

where

DBDA = (1/4) ( t1² F0 (F0 − E²/(1−C²)) + t1 t2 H (F − E²C/(1−C²))      t1² F0 (F − E²C/(1−C²)) + t1 t2 H (F0 − E²/(1−C²)) ;
               t1 t2 H (F0 − E²/(1−C²)) + t2² F0 (F − E²C/(1−C²))      t1 t2 H (F − E²C/(1−C²)) + t2² F0 (F0 − E²/(1−C²)) ).

So,

4 tr(DBDA) = (t1² + t2²) F0 (F0 − E²/(1−C²)) + 2 t1 t2 H (F − E²C/(1−C²)),    (69)

16 det(DBDA) = t1² t2² (F0² − H²) [ (F0 − E²/(1−C²))² − (F − E²C/(1−C²))² ],    (70)

giving

T(t1, t2) = E[exp(i W^T H W)]
 = [ 1 + (t1² + t2²) F0 (F0 − E²/(1−C²)) + 2 t1 t2 H (F − E²C/(1−C²))
     + t1² t2² (F0² − H²) ( (F0 − E²/(1−C²))² − (F − E²C/(1−C²))² ) ]^{−1}.    (71)

Performing the change of variables t_i ↦ t_i/√A1, i = 1, 2, where A1 := F0(F0 − E²/(1−C²)), the integral (67) becomes

(A1/π²) ∫ dt1 ∫ dt2  t1^{−2} t2^{−2} [ 1 − 1/(1 + t1²) − 1/(1 + t2²)
     + 1/(2[1 + (t1² + t2²) − 2A2 t1 t2 + t1² t2² Z]) + 1/(2[1 + (t1² + t2²) + 2A2 t1 t2 + t1² t2² Z]) ]
 = (A1/π²) ∫ dt1 ∫ dt2  t1^{−2} t2^{−2} [ 1 − 1/(1 + t1²) − 1/(1 + t2²)
     + (1 + (t1² + t2²) + t1² t2² Z) / ( (1 + (t1² + t2²) + t1² t2² Z)² − 4 A2² t1² t2² ) ],    (72)

where

A2 = H (F(1 − C²) − E²C) / ( F0 (F0(1 − C²) − E²) ),
Z = ((F0² − H²)/F0²) · ( 1 − (F − E²C/(1−C²))² / (F0 − E²/(1−C²))² ).

In this form, and up to a sign change, this result is equivalent to formula (4.43) of [6] (note that A2² = Y in [6]).
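As a consistency check of the chain (68)-(71), the sketch below builds Σ = Diag(A, B, A, B) and the block matrix H for arbitrary symmetric positive definite 2×2 matrices A, B (stand-ins for the conditional covariances of the text) and verifies numerically that the product over the eight eigenvalues coincides with [1 + 4 tr(DBDA) + 16 det(DBDA)]^{−1}.

```python
import numpy as np

rng = np.random.default_rng(3)

def spd(rng):
    """A random 2x2 symmetric positive definite matrix (stand-in for A or B)."""
    G = rng.standard_normal((2, 2))
    return G @ G.T + np.eye(2)

A, B = spd(rng), spd(rng)
t1, t2 = 0.7, -1.3
D = 0.5 * np.diag([t1, t2])

Z2 = np.zeros((2, 2))
Sigma = np.block([[A, Z2, Z2, Z2], [Z2, B, Z2, Z2], [Z2, Z2, A, Z2], [Z2, Z2, Z2, B]])
H = np.block([[Z2, Z2, Z2, D], [Z2, Z2, -D, Z2], [Z2, -D, Z2, Z2], [D, Z2, Z2, Z2]])

w_, V = np.linalg.eigh(Sigma)
Sig_half = V @ np.diag(np.sqrt(w_)) @ V.T
lam = np.linalg.eigvalsh(Sig_half @ H @ Sig_half)
lhs = np.prod((1 - 2j * lam) ** -0.5)            # product over the eight eigenvalues

DBDA = D @ B @ D @ A
rhs = 1.0 / (1 + 4 * np.trace(DBDA) + 16 * np.linalg.det(DBDA))
print(lhs, rhs)                                  # the (real) values agree
```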
In order to compute the integral (72), first we obtain

∫_{−∞}^{+∞} (1/t2²) [ 1 − 1/(1 + t2²) ] dt2 = π.
We split the other term into two integrals; for the first one we have

(1/2) ∫ (1/t2²) [ 1/(1 + (t1² + t2²) − 2A2 t1 t2 + t1² t2² Z) − 1/(1 + t1²) ] dt2
 = −(1/(2(1 + t1²))) ∫ (1/t2²) · ((1 + t1²Z) t2² − 2A2 t1 t2) / (1 + t1² − 2A2 t1 t2 + (1 + t1²Z) t2²) dt2
 = −(1/(2(1 + t1²))) ∫ (1/t2²) · (t2² − 2Z1 t1 t2) / (t2² − 2Z1 t1 t2 + Z2) dt2 =: I1,

where Z2 = (1 + t1²)/(1 + Zt1²) and Z1 = A2/(1 + Zt1²).
Similarly, for the second integral we get

(1/2) ∫ (1/t2²) [ 1/(1 + (t1² + t2²) + 2A2 t1 t2 + t1² t2² Z) − 1/(1 + t1²) ] dt2
 = −(1/(2(1 + t1²))) ∫ (1/t2²) · (t2² + 2Z1 t1 t2) / (t2² + 2Z1 t1 t2 + Z2) dt2 =: I2.

Hence

I1 + I2 = −(1/(2(1 + t1²))) ∫ (1/t2²) [ (t2² − 2Z1 t1 t2)/(t2² − 2Z1 t1 t2 + Z2) + (t2² + 2Z1 t1 t2)/(t2² + 2Z1 t1 t2 + Z2) ] dt2
 = −(1/(1 + t1²)) ∫ (t2² + (Z2 − 4Z1² t1²)) / (t2⁴ + 2(Z2 − 2Z1² t1²) t2² + Z2²) dt2
 = −(π/(1 + t1²)) · (Z2 − 2Z1² t1²) / (Z2 √(Z2 − Z1² t1²)).

In the third line we have used the formula provided by the method of residues: if the polynomial X² + SX + P, with P > 0, has no root in [0, +∞), then

∫_{−∞}^{+∞} (t² + δ) / (t⁴ + St² + P) dt = π (δ + √P) / (√P √(S + 2√P)).

In our case δ = Z2 − 4Z1² t1², S = 2(Z2 − 2Z1² t1²) and P = Z2².
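This residue formula is easy to verify numerically; in the sketch below the triplets (δ, S, P) are arbitrary values satisfying the stated condition.

```python
import numpy as np
from scipy import integrate

def lhs(delta, S, P):
    f = lambda t: (t**2 + delta) / (t**4 + S * t**2 + P)
    val, _ = integrate.quad(f, -np.inf, np.inf)
    return val

def rhs(delta, S, P):
    return np.pi * (delta + np.sqrt(P)) / (np.sqrt(P) * np.sqrt(S + 2 * np.sqrt(P)))

# X^2 + S X + P with P > 0 and no root in [0, infinity)
for delta, S, P in [(0.3, 1.0, 2.0), (-0.5, 0.2, 4.0), (1.0, 3.0, 1.0)]:
    print(lhs(delta, S, P), rhs(delta, S, P))    # each pair should agree
```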
Therefore we get

A(r) = (A1 / (4π³(1 − C²))) ∫_{−∞}^{+∞} (1/t1²) [ 1 − (Z2 − 2Z1² t1²) / ((1 + t1²) Z2 √(Z2 − Z1² t1²)) ] dt1.

Acknowledgement
This work has received financial support from the European Marie Curie Network SEAMOCS.

References
[1] R. J. Adler, The Geometry of Random Fields, Wiley (1981).
[2] R. J. Adler and J. Taylor, Random Fields and Geometry, Springer (2007).
[3] J-M. Azaïs, J. León and J. Ortega, Geometrical characteristics of Gaussian sea waves, Journal of Applied Probability, 42, 1-19 (2005).
[4] J-M. Azaïs and M. Wschebor, Level Sets and Extrema of Random Processes and Fields, Wiley (2009).
[5] J-M. Azaïs and M. Wschebor, On the distribution of the maximum of a Gaussian field with d parameters, Annals of Applied Probability, 15 (1A), 254-278 (2005).
[6] M. V. Berry and M. R. Dennis, Phase singularities in isotropic random waves, Proc. R. Soc. Lond. A, 456, 2059-2079 (2000).
[7] E. Cabaña, Esperanzas de integrales sobre conjuntos de nivel aleatorios (in Spanish), Actas del 2º Congreso Latinoamericano de Probabilidad y Estadística Matemática, Sociedad Bernoulli, sección de Latinoamérica, Caracas, 65-82 (1985).
[8] H. Cramér and M. R. Leadbetter, Stationary and Related Stochastic Processes, Wiley (1967).
[9] H. Federer, Geometric Measure Theory, Springer (1969).
[10] E. Flores and J. R. León, Random seas, level sets and applications, Preprint (2009).
[11] W. Hoeffding and H. Robbins, The central limit theorem for dependent random variables, Duke Math. J., 15, 773-780 (1948).
[12] M. Kratz and J. R. León, Level curves crossings and applications for Gaussian models, Extremes, DOI 10.1007/s10687-009-0090-x (2009).
[13] P. Krée and C. Soize, Mécanique Aléatoire, Dunod (1983).
[14] M. S. Longuet-Higgins, Reflection and refraction at a random surface. I, II, III, Journal of the Optical Society of America, 50, No. 9, 838-856 (1960).
[15] M. S. Longuet-Higgins, The statistical geometry of random surfaces, Proc. Symp. Appl. Math., Vol. XIII, AMS, Providence, RI, 105-143 (1962).
[16] D. Nualart and M. Wschebor, Intégration par parties dans l'espace de Wiener et approximation du temps local, Prob. Th. Rel. Fields, 90, 83-109 (1991).
[17] S. O. Rice, Mathematical analysis of random noise, Bell System Tech. J., 23, 282-332; 24, 45-156 (1944-1945).
[18] WAFO-group, WAFO - A Matlab Toolbox for Analysis of Random Waves and Loads - A Tutorial, Math. Stat., Center for Math. Sci., Lund Univ., Lund, Sweden. URL http://www.maths.lth.se/matstat/wafo (2000).
[19] M. Wschebor, Surfaces Aléatoires, Lecture Notes in Math. 1147, Springer (1985).
[20] U. Zähle, A general Rice formula, Palm measures, and horizontal-window conditioning for random fields, Stoch. Processes and their Applications, 17, 265-283 (1984).






 



 
















CHAPTER 1

CLASSICAL RESULTS ON THE REGULARITY OF PATHS

This initial chapter contains a number of elements that are used repeatedly in the
book and constitute necessary background. We will need to study the paths of
random processes and fields; the analytical properties of these functions play a
relevant role. This raises a certain number of basic questions, such as whether the
paths belong to a certain regularity class of functions, what one can say about
their global or local extrema and about local inversion, and so on. A typical
situation is that the available knowledge on the random function is given by
its probability law, so one is willing to know what one can deduce from this
probability law about these kinds of properties of paths. Generally speaking,
the result one can expect is the existence of a version of the random function
having good analytical properties. A version is a random function which, at
each parameter value, coincides almost surely with the one given. These are the
contents of Section 1.4, which includes the classical theorems due to Kolmogorov
and the results of Bulinskaya and Ylvisaker about the existence of critical points
or local extrema having given values. The essence of all this has been well known
for a long time, and in some cases proofs are only sketched. In other cases we
give full proofs and some refinements that will be necessary for further use.
As for the earlier sections, Section 1.1 contains starting notational conventions and a statement of the Kolmogorov extension theorem of measure theory, and Sections 1.2 and 1.3 provide a quick overview of the Gaussian distribution and some connected results. Even though this is completely elementary, we call the reader's attention to Proposition 1.2, the Gaussian regression formula, which


will appear now and again in the book and can be considered as the basis of
calculations using the Gaussian distribution.

1.1. KOLMOGOROV'S EXTENSION THEOREM

Let (Ω, A, P) be a probability space and (F, F) a measurable space. For any measurable function
Y : (Ω, A) → (F, F),
that is, a random variable with values in F, the image measure
Q(A) = P(Y⁻¹(A)),  A ∈ F,
is called the distribution of Y.
Except for explicit statements to the contrary, we assume that probability spaces are complete; that is, every subset of a set that has zero probability is measurable. Let us recall that if (F, F, μ) is a measure space, one can always define its completion (F, F1, μ1) by setting
F1 = {A : there exist B, C with A = B ∪ C, B ∈ F, C ⊂ D ∈ F, μ(D) = 0},    (1.1)
and for A ∈ F1, μ1(A) = μ(B), whenever A admits the representation in (1.1). One can check that (F, F1, μ1) is a complete measure space and μ1 an extension of μ.
A real-valued stochastic process indexed by the set T is a collection of random variables {X(t) : t ∈ T} defined on a probability space (Ω, A, P). In what follows we assume that the process is bi-measurable. This means that we have a σ-algebra T of subsets of T and a Borel-measurable function of the pair (t, ω) to the reals:
X : (T × Ω, T ⊗ A) → (R, B_R)
(B_R denotes the Borel σ-algebra in R), so that
X(t)(ω) = X(t, ω).
Let T be a set and R^T = {g : T → R} the set of real-valued functions defined on T. (In what follows in this section, one may replace R by R^d, d > 1.) For n = 1, 2, . . . , t1, t2, . . . , tn distinct elements of T, and B1, B2, . . . , Bn Borel sets in R, we denote
C(t1, t2, . . . , tn; B1, B2, . . . , Bn) = {g ∈ R^T : g(tj) ∈ Bj, j = 1, 2, . . . , n}
and C the family of all sets of the form C(t1, t2, . . . , tn; B1, B2, . . . , Bn). These are usually called cylinder sets depending on t1, t2, . . . , tn. The smallest σ-algebra


of parts of R^T containing C will be called the Borel σ-algebra of R^T and denoted by σ(C).
Consider now a family of probability measures
Consider now a family of probability measures
{Pt1 ,t2 ,...,tn }t1 ,t2 ,...,tn T ;

n=1,2,...

(1.2)

as follows: For each n = 1, 2, . . . and each n-tuple t1, t2, . . . , tn of distinct elements of T, P_{t1,t2,...,tn} is a probability measure on the Borel sets of the product space X_{t1} × X_{t2} × ··· × X_{tn}, where X_t = R for each t ∈ T (so that this product space is canonically identified with R^n).
We say that the probability measures (1.2) satisfy the consistency condition if for any choice of n = 1, 2, . . . and distinct t1, . . . , tn, tn+1 ∈ T, we have
P_{t1,...,tn,tn+1}(B × R) = P_{t1,...,tn}(B)
for any Borel set B in X_{t1} × ··· × X_{tn}. The following is the basic Kolmogorov extension theorem, which we state but do not prove here.
extension theorem, which we state but do not prove here.
Theorem 1.1 (Kolmogorov). The probability measures {P_{t1,t2,...,tn}}, t1, t2, . . . , tn ∈ T, n = 1, 2, . . ., satisfy the consistency condition if and only if there exists one and only one probability measure P on σ(C) such that
P(C(t1, . . . , tn; B1, . . . , Bn)) = P_{t1,...,tn}(B1 × ··· × Bn)    (1.3)
for any choice of n = 1, 2, . . ., distinct t1, . . . , tn ∈ T and Bj Borel sets in X_{tj}, j = 1, . . . , n.
It is clear that if there exists a probability measure P on σ(C) satisfying (1.3), the consistency conditions must hold, since
C(t1, . . . , tn, tn+1; B1, . . . , Bn, X_{tn+1}) = C(t1, . . . , tn; B1, . . . , Bn).
So the problem is how to prove the converse. This can be done in two steps: (1) define P on the family of cylinders C using (1.3) and show that the definition is unambiguous (note that each cylinder has more than one representation); and (2) apply Carathéodory's theorem on the extension of measures to prove that this P can be extended in a unique form to σ(C).
Remarks
1. Theorem 1.1 is interesting when T is an infinite set. The purpose is to be
able to measure the probability of sets of functions from T to R (i.e., subsets
of RT ) which cannot be defined by means of a finite number of coordinates,
which amounts to looking only at the values of the functions at a finite number
of t-values.


Notice that in the case of cylinders, if one wants to know whether a given
function g : T R belongs to C(t1 , . . . , tn ; B1 , . . . , Bn ), it suffices to look at
the values of g at the finite set of points t1 , . . . , tn and check if g(tj ) Bj
for j = 1, . . . , n. However, if one takes, for example, T = Z (the integers) and
considers the sets of functions
A = {g : T → R : lim_{t→+∞} g(t) exists and is finite}
or
B = {g : T → R : sup_{t∈T} |g(t)| ≤ 1},
it is clear that these sets are in σ(C) but are not cylinders (they depend on an infinite number of coordinates).
2. In general, σ(C) is strictly smaller than the family of all subsets of R^T. To see this, one can check that
σ(C) = {A ⊂ R^T : there exist T_A ⊂ T, T_A countable, and B_A a Borel set in R^{T_A}, such that g ∈ A if and only if g|_{T_A} ∈ B_A}.    (1.4)
The proof of (1.4) follows immediately from the fact that the right-hand side is a σ-algebra containing C. Equation (1.4) says that a subset of R^T is a Borel set if and only if it depends only on a countable set of parameter values. Hence, if T is uncountable, the set
{g ∈ R^T : g is a bounded function}
or
{g ∈ R^T : g is a bounded function, |g(t)| ≤ 1 for all t ∈ T}
does not belong to σ(C). Another simple example is the following: If T = [0, 1], then
{g ∈ R^T : g is a continuous function}
is not a Borel set in R^T, since it is obvious that there does not exist a countable subset of [0, 1] having the determining property in (1.4). These examples lead to the notion of separable process that we introduce later.
3. In the special case when
Ω = R^T,  A = σ(C),  and X(t)(ω) = ω(t),
4. We say that the stochastic process {Y (t) : t T } is a version of the process
{X(t) : t T } if P(X(t) = Y (t)) = 1 for each t T .


1.2. REMINDER ON THE NORMAL DISTRIBUTION

Let μ be a probability measure on the Borel subsets of R^d. Its Fourier transform μ̂ : R^d → C is defined as
μ̂(z) = ∫_{R^d} exp(i⟨z, x⟩) μ(dx),
where ⟨·,·⟩ denotes the usual scalar product in R^d.
We use Bochner's theorem (see, e.g., Feller, 1966): μ̂ is the Fourier transform of a Borel probability measure on R^d if and only if the following three conditions hold true:
1. μ̂(0) = 1.
2. μ̂ is continuous.
3. μ̂ is positive semidefinite; that is, for any n = 1, 2, . . . and any choice of the complex numbers c1, . . . , cn and of the points z1, . . . , zn, one has
Σ_{j,k=1}^{n} μ̂(zj − zk) cj c̄k ≥ 0.
The random vector ξ with values in R^d is said to have the normal distribution, or the Gaussian distribution, with parameters (m, Σ) [m ∈ R^d and Σ a d × d positive semidefinite matrix] if the Fourier transform of the probability distribution μ_ξ of ξ is equal to
μ̂_ξ(z) = exp( i⟨m, z⟩ − ½⟨Σz, z⟩ ),  z ∈ R^d.
When m = 0 and Σ = I_d, the identity d × d matrix, the distribution of ξ is called standard normal in R^d. For d = 1 we use the notation
φ(x) = (1/√(2π)) e^{−(1/2)x²}  and  Φ(x) = ∫_{−∞}^{x} φ(y) dy
for the density and the cumulative distribution function of a standard normal random variable, respectively.
If Σ is nonsingular, ξ is said to be nondegenerate and one can verify that it has a density with respect to Lebesgue measure given by
μ_ξ(dx) = (2π)^{−d/2} (det Σ)^{−1/2} exp( −½ (x − m)^T Σ^{−1} (x − m) ) dx;
x^T denotes the transpose of x. One can check that
m = E(ξ)  and  Σ = Var(ξ) = E((ξ − m)(ξ − m)^T)
are, respectively, the mean and the variance of ξ.


From the definition above it follows that if the random vector ξ with values in R^d has a normal distribution with parameters (m, Σ), A is a real matrix with n rows and d columns, and b is a nonrandom element of R^n, then the random vector Aξ + b with values in R^n has a normal distribution with parameters (Am + b, AΣA^T). In particular, if Σ is nonsingular, the coordinates of the random vector Σ^{−1/2}(ξ − m) are independent random variables with standard normal distribution on the real line.
Assume now that we have a pair ξ and η of random vectors in R^d and R^{d'}, respectively, having finite moments of order 2. We define the d × d' covariance matrix as
Cov(ξ, η) = E((ξ − E(ξ))(η − E(η))^T).
It follows that if the distribution of the random vector (ξ, η) in R^{d+d'} is normal and Cov(ξ, η) = 0, the random vectors ξ and η are independent. A consequence of this is the following useful formula, which is standard in statistics and gives a version of the conditional expectation of a function of ξ given the value of η.
Proposition 1.2. Let ξ and η be two random vectors with values in R^d and R^{d'}, respectively, and assume that the distribution of (ξ, η) in R^{d+d'} is normal and Var(η) is nonsingular. Then, for any bounded function f : R^d → R, we have
E(f(ξ) | η = y) = E(f(ζ + Cy))    (1.5)
for almost every y, where
C = Cov(ξ, η)[Var(η)]^{−1}    (1.6)
and ζ is a random vector with values in R^d, having a normal distribution with parameters
( E(ξ) − C E(η),  Var(ξ) − Cov(ξ, η)[Var(η)]^{−1}[Cov(ξ, η)]^T ).    (1.7)
Proof. The proof consists of choosing the matrix C so that the random vector ζ = ξ − Cη becomes independent of η. For this purpose, we need
Cov(ξ − Cη, η) = 0,
and this leads to the value of C given by (1.6). The parameters (1.7) follow immediately.
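A quick Monte Carlo illustration of the Gaussian regression formula (1.5), in dimension d = d' = 1 and with arbitrary illustrative parameters: the conditional expectation is estimated by keeping the samples with η close to y and compared with E(f(ζ + Cy)).

```python
import numpy as np

rng = np.random.default_rng(4)

# a joint Gaussian pair (xi, eta) in R x R (illustrative parameters)
m_xi, m_eta = 1.0, -0.5
var_xi, var_eta, cov = 2.0, 1.5, 0.8
Sigma = np.array([[var_xi, cov], [cov, var_eta]])

f = lambda x: np.cos(x) + x**2          # a test function
y = 0.7                                 # the conditioning value

# left-hand side of (1.5): keep samples with eta near y
xi, eta = rng.multivariate_normal([m_xi, m_eta], Sigma, size=2_000_000).T
mask = np.abs(eta - y) < 0.01
lhs = f(xi[mask]).mean()

# right-hand side of (1.5): E f(zeta + C y), with zeta as in (1.7)
C = cov / var_eta
zeta = rng.normal(m_xi - C * m_eta, np.sqrt(var_xi - cov**2 / var_eta), size=2_000_000)
rhs = f(zeta + C * y).mean()
print(lhs, rhs)   # the two estimates should be close
```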
In what follows, we call the version of the conditional expectation given by
formula (1.5), Gaussian regression. To close this brief list of basic properties,


we mention that a useful property of the Gaussian distribution is stability under


passage to the limit (see Exercise 1.5).
Let r : T × T → R be a positive semidefinite function and m : T → R a function. In this more general context, saying that r is a positive semidefinite function means that for any n = 1, 2, . . . and any choice of distinct t1, . . . , tn ∈ T, the matrix ((r(tj, tk)))_{j,k=1,...,n} is positive semidefinite. [This is consistent with the previous definition, which corresponds to saying that r(s, t) = Γ(s − t), s, t ∈ R^d, is positive semidefinite.]
Take now for P_{t1,...,tn} the Gaussian probability measure in R^n with mean
m_{t1,...,tn} := (m(t1), . . . , m(tn))^T
and variance matrix
Σ_{t1,...,tn} := ((r(tj, tk)))_{j,k=1,...,n}.
It is easily verified that the set of probability measures {P_{t1,...,tn}} verifies the consistency condition, so that Kolmogorov's theorem applies and there exists a unique probability measure P on the measurable space (R^T, σ(C)) which, restricted to the cylinder sets depending on t1, . . . , tn, is P_{t1,...,tn} for any choice of distinct parameter values t1, . . . , tn. P is called the Gaussian measure generated by the pair (m, r). If {X(t) : t ∈ T} is a real-valued stochastic process with distribution P, one verifies that:
For any choice of distinct parameter values t1, . . . , tn, the joint distribution of the random variables X(t1), . . . , X(tn) is Gaussian with mean m_{t1,...,tn} and variance Σ_{t1,...,tn}.
E(X(t)) = m(t) for t ∈ T.
Cov(X(s), X(t)) = E((X(s) − m(s))(X(t) − m(t))) = r(s, t) for s, t ∈ T.
A class of examples that appears frequently in applications is the d-parameter real-valued Gaussian processes which are centered and stationary, which means that
T = R^d,  m(t) = 0,  r(s, t) = Γ(t − s).
A general definition of strictly stationary processes is given in Section 10.2.
If the function Γ is continuous and Γ(0) ≠ 0, one can write
Γ(τ) = ∫_{R^d} exp(i⟨τ, x⟩) μ(dx),
where μ is a Borel measure on R^d with total mass equal to Γ(0). μ is called the spectral measure of the process. We usually assume that Γ(0) = 1: that is, that μ is a probability measure, which is obtained simply by replacing the original process {X(t) : t ∈ R^d} by the process {X(t)/(Γ(0))^{1/2} : t ∈ R^d}.


Example 1.1 (Trigonometric Polynomials). An important example of stationary Gaussian processes is the following. Suppose that μ is a purely atomic symmetric probability measure on the real line; that is, there exists a sequence {x_n}_{n=1,2,...} of positive real numbers such that
μ({x_n}) = μ({−x_n}) = ½ c_n  for n = 1, 2, . . . ;    μ({0}) = c_0;    Σ_{n=0}^{∞} c_n = 1.
Then a centered Gaussian process having μ as its spectral measure is
X(t) = c_0^{1/2} ξ_0 + Σ_{n=1}^{∞} c_n^{1/2} ( ξ_n cos(t x_n) + ξ_{−n} sin(t x_n) ),  t ∈ R,    (1.8)
where {ξ_n}_{n∈Z} is a sequence of independent identically distributed random variables, each having a standard normal distribution. In fact, the series in (1.8) converges in L²(Ω, F, P) and
E(X(t)) = 0,  E(X(s)X(t)) = c_0 + Σ_{n=1}^{∞} c_n cos[(t − s)x_n] = Γ(t − s).
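The representation (1.8) and its covariance are easy to illustrate by simulation; the atoms x_n and the weights c_n below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(5)

# an atomic symmetric spectral measure: atoms at +-x_n with masses c_n/2, plus mass c_0 at 0
x = np.array([0.5, 1.3, 2.2, 4.0])
c = np.array([0.15, 0.40, 0.25, 0.15, 0.05])    # c_0, c_1, ..., c_4, summing to 1
c0, cn = c[0], c[1:]

def sample_paths(t, n_paths):
    """Simulate (1.8) at the times t: sqrt(c0) xi_0 + sum sqrt(c_n)(xi_n cos(t x_n) + xi_{-n} sin(t x_n))."""
    xi0 = rng.standard_normal(n_paths)
    xi_c = rng.standard_normal((n_paths, len(x)))
    xi_s = rng.standard_normal((n_paths, len(x)))
    t = np.atleast_1d(t)
    return (np.sqrt(c0) * xi0[:, None]
            + (np.sqrt(cn) * xi_c) @ np.cos(np.outer(x, t))
            + (np.sqrt(cn) * xi_s) @ np.sin(np.outer(x, t)))

s, t = 0.4, 1.9
X = sample_paths(np.array([s, t]), 500_000)
emp_cov = np.mean(X[:, 0] * X[:, 1])
theo_cov = c0 + np.sum(cn * np.cos((t - s) * x))
print(emp_cov, theo_cov)    # empirical covariance vs Gamma(t - s)
```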

We use the notation
λ_k := ∫ x^k μ(dx),  k = 0, 1, 2, . . . ,    (1.9)
whenever the integral exists; λ_k is the kth spectral moment of the process.
An extension of the preceding class of examples is the following. Let (T, T, μ) be a measure space, H = L²_R(T, T, μ) the Hilbert space of real-valued square-integrable functions on it, and {φ_n(t)}_{n=1,2,...} an orthonormal sequence in H. We assume that each function φ_n : T → R is bounded and denote M_n = sup_{t∈T} |φ_n(t)|. In addition, let {c_n}_{n=1,2,...} be a sequence of positive numbers such that
Σ_{n=1}^{∞} c_n < ∞,  Σ_{n=1}^{∞} c_n M_n² < ∞,
and {ξ_n}_{n=1,2,...} a sequence of independent identically distributed (i.i.d.) random variables, each with standard normal distribution in R.
Then the stochastic process
X(t) = Σ_{n=1}^{∞} c_n^{1/2} ξ_n φ_n(t)    (1.10)


is Gaussian, centered, with covariance
r(s, t) = E(X(s)X(t)) = Σ_{n=1}^{∞} c_n φ_n(s) φ_n(t).
Formulas (1.8) and (1.10) are simple cases of spectral representations of Gaussian processes, which is an important subject both for theoretical purposes and for applications. A compact presentation of this subject, including the Karhunen-Loève representation and the connection with reproducing kernel Hilbert spaces, may be found in Fernique's lecture notes (1974).

1.3. 0-1 LAW FOR GAUSSIAN PROCESSES

We will prove a 0-1 law for Gaussian processes in this section without attempting full generality. This will be sufficient for our requirements in what follows. For a more general treatment, see Fernique (1974).
Definition 1.3. Let X = {X(t) : t ∈ T} and Y = {Y(t) : t ∈ S} be real-valued stochastic processes defined on some probability space (Ω, A, P). X and Y are said to be independent if for any choice of the parameter values t1, . . . , tn ∈ T; s1, . . . , sm ∈ S, n, m ≥ 1, the random vectors
(X(t1), . . . , X(tn)), (Y(s1), . . . , Y(sm))
are independent.
Proposition 1.4. Let the processes X and Y be independent and E (respectively, F) belong to the σ-algebra generated by the cylinders in R^T (respectively, R^S). Then
P(X(·) ∈ E, Y(·) ∈ F) = P(X(·) ∈ E) P(Y(·) ∈ F).    (1.11)
Proof. Equation (1.11) holds true for cylinders. Uniqueness in the extension theorem provides the result.
Theorem 1.5 (0-1 Law for Gaussian Processes). Let X = {X(t) : t ∈ T} be a real-valued centered Gaussian process defined on some probability space (Ω, A, P) and (E, E) a measurable space, where E is a linear subspace of R^T and the σ-algebra E has the property that for any choice of the scalars a, b ∈ R, the function (x, y) ↦ ax + by defined on E × E is measurable with respect to the product σ-algebra. We assume that the function X : Ω → E defined as X(ω) = X(·, ω) is measurable (Ω, A) → (E, E). Then, if L is a measurable subspace of E, one has
P(X(·) ∈ L) = 0 or 1.


Proof. Let {X^(1)(t) : t ∈ T} and {X^(2)(t) : t ∈ T} be two independent processes, each having the same distribution as that of the given process {X(t) : t ∈ T}. For each θ, 0 < θ < π/2, consider a new pair of stochastic processes, defined for t ∈ T by
Z_θ^(1)(t) = X^(1)(t) cos θ + X^(2)(t) sin θ,
Z_θ^(2)(t) = −X^(1)(t) sin θ + X^(2)(t) cos θ.    (1.12)
Each of the processes Z_θ^(i)(t) (i = 1, 2) has the same distribution as X. In fact, E(Z_θ^(1)(t)) = 0 and, since E(X^(1)(s)X^(2)(t)) = 0, we have E(Z_θ^(1)(s)Z_θ^(1)(t)) = cos²θ E(X^(1)(s)X^(1)(t)) + sin²θ E(X^(2)(s)X^(2)(t)) = E(X(s)X(t)). A similar computation holds for Z_θ^(2).
Also, the processes Z_θ^(1) and Z_θ^(2) are independent. To prove this, note that for any choice of t1, . . . , tn; s1, . . . , sm, n, m ≥ 1, the random vectors
(Z_θ^(1)(t1), . . . , Z_θ^(1)(tn)), (Z_θ^(2)(s1), . . . , Z_θ^(2)(sm))
have a joint Gaussian distribution, so it suffices to show that
E(Z_θ^(1)(t) Z_θ^(2)(s)) = 0
for any choice of s, t ∈ T to conclude that they are independent. This is easily checked.
Now, if we put q = P(X(·) ∈ L), independence implies that for any θ,
q(1 − q) = P(E_θ),  where E_θ = {Z_θ^(1) ∈ L, Z_θ^(2) ∉ L}.
If θ, θ' ∈ (0, π/2), θ ≠ θ', the events E_θ and E_θ' are disjoint. In fact, the matrix
( cos θ   sin θ ;  cos θ'   sin θ' )
is nonsingular, and (1.12) implies that if at the same time Z_θ^(1) ∈ L and Z_θ'^(1) ∈ L, then X^(1)(·), X^(2)(·) ∈ L also, since X^(1)(·), X^(2)(·) are linear combinations of Z_θ^(1) and Z_θ'^(1). Hence, Z_θ^(2), Z_θ'^(2) ∈ L and E_θ, E_θ' cannot occur simultaneously.
To finish, the only way in which we can have an infinite family {E_θ}_{0<θ<π/2} of pairwise disjoint events with equal probability is for this probability to be zero. That is, q(1 − q) = 0, so that q = 0 or 1.
In case the parameter set T is countable, the above shows directly that
any measurable linear subspace of RT has probability 0 or 1 under a centered
Gaussian law. If T is a σ-compact topological space, E the set of real-valued


continuous functions defined on T, and E the σ-algebra generated by the topology of uniform convergence on compact sets, one can conclude, for example, that the subspace of E of bounded functions has probability 0 or 1 under a centered Gaussian measure. The theorem can be applied in a variety of situations similar to standard function spaces. For example, put a measure on the space (E, E) and take for L an L^p space of this measure space.

1.4. REGULARITY OF PATHS


1.4.1. Conditions for Continuity of Paths

Theorem 1.6 (Kolmogorov). Let Y = {Y(t) : t ∈ [0, 1]} be a real-valued stochastic process that satisfies the condition
(K) For each pair t and t + h ∈ [0, 1],
P{|Y(t + h) − Y(t)| ≥ g(h)} ≤ q(h),
where g and q are even real-valued functions defined on [−1, 1], increasing on [0, 1], that verify
Σ_{n=1}^{∞} g(2^{−n}) < ∞,  Σ_{n=1}^{∞} 2^n q(2^{−n}) < ∞.
Then there exists a version X = {X(t) : t ∈ [0, 1]} of the process Y such that the paths t ↦ X(t) are continuous on [0, 1].
Proof. For n = 1, 2, . . . ; k = 0, 1, . . . , 2^n − 1, let
E_{k,n} = { |Y((k + 1)/2^n) − Y(k/2^n)| ≥ g(2^{−n}) },  E_n = ∪_{k=0}^{2^n−1} E_{k,n}.
From the hypothesis, P(E_n) ≤ 2^n q(2^{−n}), so that Σ_{n=1}^{∞} P(E_n) < ∞. The Borel-Cantelli lemma implies that P(lim sup_n E_n) = 0, where
lim sup_n E_n = {ω : ω belongs to infinitely many E_n's}.
In other words, if ω ∉ lim sup_n E_n, one can find n_0(ω) such that if n ≥ n_0(ω), one has
|Y((k + 1)/2^n) − Y(k/2^n)| < g(2^{−n})  for all k = 0, 1, . . . , 2^n − 1.


Denote by Y^(n) the function whose graph is the polygonal with vertices (k/2^n, Y(k/2^n)), k = 0, 1, . . . , 2^n; that is, if k/2^n ≤ t ≤ (k + 1)/2^n, one has
Y^(n)(t) = (k + 1 − 2^n t) Y(k/2^n) + (2^n t − k) Y((k + 1)/2^n).
The function t ↦ Y^(n)(t) is continuous. Now, if ω ∉ lim sup_n E_n, one easily checks that there exists some integer n_0(ω) such that
‖Y^(n+1) − Y^(n)‖_∞ ≤ g(2^{−(n+1)})  for n + 1 ≥ n_0(ω)
(here ‖·‖_∞ denotes the sup norm on [0, 1]). Since Σ_{n=1}^{∞} g(2^{−(n+1)}) < ∞ by the hypothesis, the sequence of functions {Y^(n)} converges uniformly on [0, 1] to a continuous limit function that we denote X(t), t ∈ [0, 1].
We set X(t) ≡ 0 when ω ∈ lim sup_n E_n. To finish the proof, it suffices to show that for each t ∈ [0, 1], P(X(t) = Y(t)) = 1.
If t is a dyadic point, say t = k/2^n, then given the definition of the sequence of functions Y^(n), it is clear that Y^(m)(t) = Y(t) for m ≥ n. Hence, for ω ∉ lim sup_n E_n, one has X(t) = lim_{m→∞} Y^(m)(t) = Y(t). The result follows from P((lim sup_n E_n)^C) = 1 (A^C is the complement of the set A).
If t is not a dyadic point, for each n = 1, 2, . . . , let k_n be an integer such that |t − k_n/2^n| ≤ 2^{−n}, k_n/2^n ∈ [0, 1]. Set
F_n = { |Y(t) − X(k_n/2^n)| ≥ g(2^{−n}) }.
We have the inequalities
P(F_n) ≤ P{ |Y(t) − Y(k_n/2^n)| ≥ g(2^{−n}) } ≤ q(2^{−n}),
and a new application of the Borel-Cantelli lemma gives P(lim sup_n F_n) = 0. So if ω ∉ [lim sup_n E_n] ∪ [lim sup_n F_n], we have at the same time X(k_n/2^n)(ω) → X(t)(ω) as n → ∞ because t ↦ X(t) is continuous, and X(k_n/2^n)(ω) → Y(t)(ω) because |Y(t) − X(k_n/2^n)| < g(2^{−n}) for n ≥ n_1(ω), for some integer n_1(ω).
This proves that X(t)(ω) = Y(t)(ω) for almost every ω.
Corollary 1.7. Assume that the process Y = {Y(t) : t ∈ [0, 1]} satisfies one of the following conditions for t, t + h ∈ [0, 1]:
(a)
E(|Y(t + h) − Y(t)|^p) ≤ K |h| / |log |h||^{1+r},    (1.13)
where p, r, and K are positive constants, p < r.
(b) Y is Gaussian, m(t) := E(Y(t)) is continuous, and
Var(Y(t + h) − Y(t)) ≤ C / |log |h||^a    (1.14)
for all t, sufficiently small h, C some positive constant, and a > 3.
Then the conclusion of Theorem 1.6 holds.
Proof
(a) Set
g(h) = 1 / |log |h||^b,  q(h) = |h| / |log |h||^{1+r−bp},  with 1 < b < r/p,
and check condition (K) using a Markov inequality.
(b) Since the expectation is continuous, it can be subtracted from Y(t), so that we may assume that Y is centered. To apply Theorem 1.6, take
g(h) = 1 / |log |h||^b  with 1 < b < (a − 1)/2,  and  q(h) = exp( −(1/(4C)) |log |h||^{a−2b} ).
Then
P(|Y(t + h) − Y(t)| ≥ g(h)) = P( |ζ| ≥ g(h) / √(Var(Y(t + h) − Y(t))) ),
where ζ stands for a standard normal variable. We use the following usual bound for Gaussian tails, valid for u > 0:
P(|ζ| ≥ u) = 2P(ζ ≥ u) = √(2/π) ∫_u^{∞} e^{−(1/2)x²} dx ≤ √(2/π) (1/u) e^{−(1/2)u²}.
With the foregoing choice of g(·) and q(·), if |h| is small enough, one has g(h)/√(Var(Y(t + h) − Y(t))) > 1 and
P(|Y(t + h) − Y(t)| ≥ g(h)) ≤ (const) q(h),
where (const) denotes a generic constant that may vary from line to line. On the other hand, Σ_{n≥1} g(2^{−n}) < ∞ and Σ_{n≥1} 2^n q(2^{−n}) < ∞ are easily verified.


Some Examples

1. Gaussian stationary processes. Let {Y(t) : t ∈ R} be a real-valued Gaussian centered stationary process with covariance Γ(τ) = E(Y(t)Y(t + τ)). Then condition (1.14) is equivalent to
Γ(0) − Γ(τ) ≤ C / |log |τ||^a
for sufficiently small |τ|, with the same meaning for C and a.
2. Wiener process. Take T = R_+. The function r(s, t) = s ∧ t is positive semidefinite. In fact, if 0 ≤ s1 < ··· < sn and x1, . . . , xn ∈ R, one has
Σ_{j,k=1}^{n} (sj ∧ sk) xj xk = Σ_{k=1}^{n} (sk − s_{k−1})(xk + ··· + xn)² ≥ 0,    (1.15)
where we have set s0 = 0.
Then, according to Kolmogorov's extension theorem, there exists a centered Gaussian process {Y(t) : t ∈ R_+} such that E(Y(s)Y(t)) = s ∧ t for s, t ≥ 0. One easily checks that this process satisfies the hypothesis in Corollary 1.7(b), since the random variable Y(t + h) − Y(t), h ≥ 0, has the normal distribution N(0, h) because of the simple computation
E([Y(t + h) − Y(t)]²) = t + h − 2t + t = h.
It follows from Corollary 1.7(b) that this process has a continuous version on every interval of the form [n, n + 1]. The reader will verify that one can also find a version with continuous paths defined on all of R_+. This version, called the Wiener process, is denoted {W(t) : t ∈ R_+}.
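A short simulation illustrating the two computations above for the Wiener process (covariance s ∧ t and increment variance h); the particular values of s and t are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
s, t, n = 0.3, 0.8, 1_000_000

# W(s) ~ N(0, s); W(t) = W(s) + an independent N(0, t - s) increment
Ws = rng.normal(0.0, np.sqrt(s), size=n)
Wt = Ws + rng.normal(0.0, np.sqrt(t - s), size=n)

print(np.mean(Ws * Wt), min(s, t))          # E[W(s)W(t)] = s ∧ t
print(np.mean((Wt - Ws) ** 2), t - s)       # Var(W(t) - W(s)) = h = t - s
```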
3. Itô integrals. Let {W(t) : t ≥ 0} be a Wiener process on a probability space (Ω, A, P). We define the filtration {F_t : t ≥ 0} as F_t = σ{W(s) : s ≤ t}, where the notation means the σ-algebra generated by the set of random variables {W(s) : s ≤ t} (i.e., the smallest σ-algebra with respect to which these random variables are all measurable), completed with respect to the probability measure P.
Let {a_t : t ≥ 0} be a stochastic process adapted to the filtration {F_t : t ≥ 0}. This means that a_t is F_t-measurable for each t ≥ 0. For simplicity we assume that {a_t : t ≥ 0} is uniformly locally bounded in the sense that for each T > 0 there exists a constant C_T such that |a_t(ω)| ≤ C_T for every ω and all t ∈ [0, T]. For each t > 0, one can define the stochastic Itô integral
Y(t) = ∫_0^t a_s dW(s)
as the limit in L² = L²(Ω, A, P) of the Riemann sums
S_Q = Σ_{j=0}^{m−1} ã_{t_j} (W(t_{j+1}) − W(t_j))


when N_Q = sup{(t_{j+1} − t_j) : 0 ≤ j ≤ m − 1} tends to 0. Here Q denotes the partition 0 = t_0 < t_1 < ··· < t_m = t of the interval [0, t] and {ã_t : t ≥ 0} an adapted stochastic process, bounded by the same constant as {a_t : t ≥ 0} and such that
Σ_{j=0}^{m−1} ã_{t_j} 1I_{{t_j ≤ s < t_{j+1}}}
tends to {a_s : 0 ≤ s ≤ t} in the space L²([0, t] × Ω, λ ⊗ P) as N_Q → 0; λ is Lebesgue measure on the line.
Of course, the statements above should be proved to be able to define Y(t) in this way (see, e.g., McKean, 1969). Our aim here is to prove that the process {Y(t) : t ≥ 0} thus defined has a version with continuous paths. With no loss of generality, we assume that t varies on the interval [0, 1] and apply Corollary 1.7(a) with p = 4.
We will prove that
E((Y(t + h) − Y(t))^4) ≤ (const) h².
For this, it is sufficient to see that if Q is a partition of the interval [t, t + h], h > 0,
E(S_Q^4) ≤ (const) h²,    (1.16)
where (const) does not depend on t, h, and Q, and then apply Fatou's lemma when N_Q → 0.
Let us compute the left-hand side of (1.16). Set Δ_j = W(t_{j+1}) − W(t_j). We have
E(S_Q^4) = Σ_{j1,j2,j3,j4=0}^{m−1} E( ã_{t_{j1}} ã_{t_{j2}} ã_{t_{j3}} ã_{t_{j4}} Δ_{j1} Δ_{j2} Δ_{j3} Δ_{j4} ).    (1.17)
If one of the indices, say j4, satisfies j4 > j1, j2, j3, the corresponding term becomes
E( Π_{h=1}^{4} ã_{t_{j_h}} Δ_{j_h} ) = E( E( Π_{h=1}^{4} ã_{t_{j_h}} Δ_{j_h} | F_{t_{j4}} ) )
 = E( Π_{h=1}^{3} (ã_{t_{j_h}} Δ_{j_h}) ã_{t_{j4}} E(Δ_{j4} | F_{t_{j4}}) ) = 0,
since E(Δ_j | F_{t_j}) = E(Δ_j) = 0 and Π_{h=1}^{3} (ã_{t_{j_h}} Δ_{j_h}) ã_{t_{j4}} is F_{t_{j4}}-measurable.


In a similar way, if j4 < j1 = j2 = j3 (and similarly, if any one of the indices is strictly smaller than the others and these are all equal), the corresponding term vanishes, since in this case
E( Π_{h=1}^{4} ã_{t_{j_h}} Δ_{j_h} ) = E( E( ã_{t_{j1}}³ ã_{t_{j4}} Δ_{j4} Δ_{j1}³ | F_{t_{j1}} ) )
 = E( ã_{t_{j1}}³ ã_{t_{j4}} Δ_{j4} E(Δ_{j1}³ | F_{t_{j1}}) ) = 0,
because
E(Δ_j³ | F_{t_j}) = E(Δ_j³) = 0.
The terms with j1 = j2 = j3 = j4 give the sum
Σ_{j=0}^{m−1} E( ã_{t_j}^4 Δ_j^4 ) ≤ C_1^4 Σ_{j=0}^{m−1} 3(t_{j+1} − t_j)² ≤ 3 C_1^4 h².
Finally, we have the sum of the terms corresponding to 4-tuples of indices j1, j2, j3, and j4 such that for some permutation (i1, i2, i3, i4) of (1, 2, 3, 4) one has j_{i1}, j_{i2} < j_{i3} = j_{i4}. This is
6 Σ_{j3=1}^{m−1} Σ_{0≤j1,j2<j3} E( ã_{t_{j1}} ã_{t_{j2}} ã_{t_{j3}}² Δ_{j1} Δ_{j2} Δ_{j3}² ).
Conditioning on F_{t_{j3}} in each term yields for this sum
6 Σ_{j3=1}^{m−1} Σ_{0≤j1,j2<j3} (t_{j3+1} − t_{j3}) E( ã_{t_{j1}} ã_{t_{j2}} ã_{t_{j3}}² Δ_{j1} Δ_{j2} )
 = 6 E( Σ_{j3=1}^{m−1} (t_{j3+1} − t_{j3}) ã_{t_{j3}}² ( Σ_{j=0}^{j3−1} ã_{t_j} Δ_j )² )
 ≤ 6 C_1² Σ_{j3=1}^{m−1} (t_{j3+1} − t_{j3}) E( ( Σ_{j=0}^{j3−1} ã_{t_j} Δ_j )² )
 = 6 C_1² Σ_{j3=1}^{m−1} (t_{j3+1} − t_{j3}) Σ_{j=0}^{j3−1} E(ã_{t_j}²)(t_{j+1} − t_j) ≤ 3 C_1^4 h².

Using (1.17), one obtains (1.16), and hence the existence of a version of the Ito
integral possessing continuous paths.


Separability. Next, we consider the separability of stochastic processes. The separability condition is shaped to avoid the measurability problems that we have already mentioned and to use, without further reference, versions of stochastic processes having good path properties. We begin with a definition.

Definition 1.8. We say that a real-valued stochastic process {X(t) : t ∈ T}, T a topological space, is separable if there exists a fixed countable subset D of T such that, with probability 1,
sup_{t∈V∩D} X(t) = sup_{t∈V} X(t)  and  inf_{t∈V∩D} X(t) = inf_{t∈V} X(t)  for all open sets V.

A consequence of Theorem 1.6 is the following:

Proposition 1.9. Let {Y(t) : t ∈ I}, I an interval in the line, be a separable random process that satisfies the hypotheses of Theorem 1.6. Then, almost surely (a.s.), its paths are continuous.

Proof. Denote by D the countable set in the definition of separability. With no loss of generality, we may assume that D is dense in I. The theorem states that there exists a version {X(t) : t ∈ I} that has continuous paths, so that
P(X(t) = Y(t) for all t ∈ D) = 1.
Let
E = {X(t) = Y(t) for all t ∈ D}
and
F = ∩_{J⊂I, J=(r1,r2), r1,r2∈Q} { sup_{t∈J∩D} Y(t) = sup_{t∈J} Y(t)  and  inf_{t∈J∩D} Y(t) = inf_{t∈J} Y(t) }.
Since P(E ∩ F) = 1, it is sufficient to prove that if ω ∈ E ∩ F, then X(s)(ω) = Y(s)(ω) for all s ∈ I.
So, let ω ∈ E ∩ F and s ∈ I. For any ε > 0, choose r1, r2 ∈ Q such that s − ε < r1 < s < r2 < s + ε. Then, setting J = (r1, r2),
Y(s)(ω) ≤ sup_{t∈J} Y(t)(ω) = sup_{t∈J∩D} Y(t)(ω) = sup_{t∈J∩D} X(t)(ω) ≤ sup_{t∈J} X(t)(ω).


Letting ε → 0, it follows that
Y(s)(ω) ≤ lim sup_{t→s} X(t)(ω) = X(s)(ω),
since t ↦ X(t)(ω) is continuous.
In a similar way, one proves that Y(s)(ω) ≥ X(s)(ω).
The separability condition is usually met when the paths have some minimal
regularity (see Exercise 1.7). For example, if {X(t) : t R} is a real-valued
process having a.s. càdlàg paths (i.e., paths that are right-continuous with left
limits), it is separable. All processes considered in the sequel are separable.
Some Additional Remarks and References. A reference for Kolmogorovs
extension theorem and the regularity of paths, at the level of generality we have
considered here, is the book by Cramer and Leadbetter (1967), where the reader
can find proofs that we have skipped as well as related results, examples, and
details. For d-parameter Gaussian processes, a subject that we consider in more
detail in Chapter 6, in the stationary case, necessary and sufficient conditions
to have continuous paths are due to Fernique (see his St. Flour 1974 lecture
notes) and to Talagrand (1987) in the general nonstationary case. In the Gaussian stationary case, Belayev (1961) has shown that either with probability 1 the paths are continuous, or with probability 1 the supremum (respectively, the infimum) on every interval is +∞ (respectively, −∞). General references on Gaussian
processes are the books by Adler (1990) and Lifshits (1995).

1.4.2. Sample Path Differentiability and Hölder Conditions

In this section we state some results, without detailed proofs. These follow the lines of the preceding section.
Theorem 1.10. Let Y = {Y(t) : t ∈ [0, 1]} be a real-valued stochastic process that satisfies the hypotheses of Theorem 1.6 and, additionally, for any triplet t − h, t, t + h ∈ [0, 1], one has
P(|Y(t + h) + Y(t − h) − 2Y(t)| ≥ g1(h)) ≤ q1(h),
where g1 and q1 are two even functions, increasing for h > 0 and such that
Σ_{n=1}^{∞} 2^n g1(2^{−n}) < ∞,  Σ_{n=1}^{∞} 2^n q1(2^{−n}) < ∞.
Then there exists a version X = {X(t) : t ∈ [0, 1]} of the process Y such that almost surely the paths of X are of class C¹.


Sketch of the Proof. Consider the sequence {Y^(n)(t) : t ∈ [0, 1]}_{n=1,2,...} of polygonal processes introduced in the proof of Theorem 1.6. We know that a.s. this sequence converges uniformly to X = {X(t) : t ∈ [0, 1]}, a continuous version of Y. Define:
Y^(n)′(t) := Y^(n)′(t−)  for 0 < t ≤ 1 (left derivative),
Y^(n)′(0) := Y^(n)′(0+)  (right derivative).
One can show that the hypotheses imply:
1. Almost surely, as n → ∞, Y^(n)′(·) converges uniformly on [0, 1] to a function X̃(·).
2. Almost surely, as n → ∞, sup_{t∈[0,1]} |Y^(n)′(t+) − Y^(n)′(t−)| → 0.
To complete the proof, check that the function t ↦ X̃(t) a.s. is continuous and coincides with the derivative of X(t) at every t ∈ [0, 1].
Example 1.2 (Stationary Gaussian Processes). Let Y = {Y(t) : t ∈ R} be a centered stationary Gaussian process with covariance of the form
Γ(τ) = E(Y(t)Y(t + τ)) = Γ(0) − ½ λ2 τ² + O( τ² / |log |τ||^a )
with λ2 > 0, a > 3. Then there exists a version of Y with paths of class C¹. For the proof, apply Theorem 1.10.
A related result is the following. The proof is left to the reader.
Proposition 1.11 (Hölder Conditions). Assume that
E(|Y(t + h) − Y(t)|^p) ≤ K|h|^{1+r}  for t, t + h ∈ [0, 1],    (1.18)
where K, p, and r are positive constants, r ≤ p. Then there exists a version of the process Y = {Y(t) : t ∈ [0, 1]} with paths that satisfy a Hölder condition with exponent α for any α such that 0 < α < r/p.
Note that, for example, this proposition can be applied to the Wiener process (Brownian motion) with r = (p − 2)/2, showing that it satisfies a Hölder condition for every exponent α < ½.


1.4.3. Higher Derivatives

Let X = {X(t) : t ∈ R} be a stochastic process and assume that for each t ∈ R one has X(t) ∈ L²(Ω, A, P).
Definition 1.12. X is differentiable in quadratic mean (q.m.) if for all t ∈ R,
(X(t + h) − X(t))/h
converges in quadratic mean as h → 0 to some limit that will be denoted X′(t).
The stability of Gaussian random variables under passage to the limit implies that the derivative in q.m. of a Gaussian process remains Gaussian.
Proposition 1.13. Let X = {X(t) : t ∈ R} be a stochastic process with mean m(t) and covariance r(s, t), and suppose that m is C¹ and that r is C². Then X is differentiable in quadratic mean.
Proof. We use the following result, which is easy to prove: a sequence Z1, . . . , Zn, . . . of real random variables converges in q.m. if and only if there exists a constant C such that E(Z_m Z_n) → C as the pair (m, n) tends to infinity.
Since m(t) is differentiable, it can be subtracted from X(t) without changing its differentiability, so we can assume that the process is centered. Then for all real h and k,
E( (X(t + h) − X(t))/h · (X(t + k) − X(t))/k )
 = (1/(hk)) [ r(t + h, t + k) − r(t, t + k) − r(t, t + h) + r(t, t) ] → r11(t, t)  as (h, k) → (0, 0),
where r11(s, t) := ∂²r(s, t)/∂s∂t. This shows differentiability in q.m.
We assume, using the remark in the proof above, that X is centered and satisfies the conditions of the proposition. It is easy to prove that
E(X(s)X′(t)) = r01(s, t) := ∂r(s, t)/∂t,
and similarly, that the covariance of X′ = {X′(t) : t ∈ R} is r11(s, t). Now let X be a Gaussian process and X′ its derivative in quadratic mean. If this satisfies, for example, the criterion in Corollary 1.7(b), it admits a continuous version Y′ = {Y′(t) : t ∈ R}. Set
Y(t) := X(0) + ∫_0^t Y′(s) ds.
Clearly, Y has C¹ paths and E(X(s)Y(s)) = r(s, 0) + ∫_0^s r01(s, t) dt = r(s, s). In the same way, E(Y(s)²) = r(s, s), so that E([X(s) − Y(s)]²) = 0. As a consequence, X admits a version with C¹ paths.
Using this construction inductively, one can prove the following:
Using this construction inductively, one can prove the following:

Let X be a Gaussian process with mean C k and covariance C 2k and such that
its kth derivative in quadratic mean satisfies the weak condition of Corollary
1.7(b). Then X admits a version with paths of class C k .
If X is a Gaussian process with mean of class C and covariance of class
C , X admits a version with paths of class C .

In the converse direction, regularity of the paths implies regularity of the


expectation and of the covariance function. For example, if X has continuous
sample paths, the mean and the variance are continuous. In fact, if tn , n = 1, 2, . . .
converges to t, then X(tn ) converges a.s. to X(t), hence also in distribution.
Using the form of the Fourier transform of the Gaussian distribution, one easily
proves that this implies convergence of the mean and the variance. Since for
Gaussian variables, all the moments are polynomial functions of the mean and
the variance, they are also continuous. If the process has differentiable sample
paths, in a similar way one shows the convergence
m(t + h) m(t)
E(X (t))
h
as h 0, showing that the mean is differentiable.
For the covariance, restricting ourselves to stationary Gaussian processes defined on the real line, without loss of generality we may assume that the process is centered. Put Γ(t) = r(s, s + t). The convergence in distribution of (X(h) − X(0))/h to X′(0), plus the Gaussianity, imply that Var((X(h) − X(0))/h) has a finite limit as h → 0. On the other hand,
Var( (X(h) − X(0))/h ) = 2 ∫ (1 − cos hx)/h² μ(dx),
where μ is the spectral measure.
Letting h → 0 and applying Fatou's lemma, it follows that
λ2 = ∫ x² μ(dx) ≤ lim inf_{h→0} Var( (X(h) − X(0))/h ) < ∞.
Using the result in Exercise 1.4, Γ is of class C².


This argument can be used in a similar form to show that if the process
has paths of class C k , the covariance is of class C 2k . As a conclusion, roughly
speaking, for Gaussian stationary processes, the order of differentiability of the
sample paths is half of the order of differentiability of the covariance.


1.4.4. More General Tools

In this section we consider the case when the parameter of the process lies in R^d or, more generally, in some general metric space. We begin with an extension of Theorem 1.6.
Theorem 1.14. Let Y = {Y(t) : t ∈ [0, 1]^d} be a real-valued random field that satisfies the condition
(K_d) For each pair t, t + h ∈ [0, 1]^d,
P{|Y(t + h) − Y(t)| ≥ g(‖h‖)} ≤ q(‖h‖),
where h = (h1, . . . , hd), ‖h‖ = sup_{1≤i≤d} |hi|, and g, q are even real-valued functions defined on [−1, 1], increasing on [0, 1], which verify
Σ_{n=1}^{∞} g(2^{−n}) < ∞,  Σ_{n=1}^{∞} 2^{dn} q(2^{−n}) < ∞.
Then there exists a version X = {X(t) : t ∈ [0, 1]^d} of the process Y such that the paths t ↦ X(t) are continuous on [0, 1]^d.
Proof. The main change with respect to the proof of Theorem 1.6 is that we replace the polygonal approximation, adapted to one-variable functions, by another interpolating procedure. Denote by D_n the set of dyadic points of order n in [0, 1]^d; that is,
D_n = { t = (t1, . . . , td) : ti = ki/2^n, ki integers, 0 ≤ ki ≤ 2^n, i = 1, . . . , d }.
Let f : [0, 1]^d → R be a function. For each n = 1, 2, . . ., one can construct a function f^(n) : [0, 1]^d → R with the following properties:
f^(n) is continuous.
f^(n)(t) = f(t) for all t ∈ D_n.
‖f^(n+1) − f^(n)‖_∞ = max_{t∈D_{n+1}\D_n} |f(t) − f^(n)(t)|, where ‖·‖_∞ denotes the sup-norm on [0, 1]^d.
A way to define f^(n) is the following: Let us consider a cube C_{t,n} of the nth-order partition of [0, 1]^d; that is,
C_{t,n} = t + [0, 1/2^n]^d,
where t ∈ D_n, with the obvious notation for the sum. For each vertex τ of C_{t,n}, set f^(n)(τ) = f(τ).
Now, for each permutation σ of {1, 2, . . . , d}, let S_σ be the simplex
S_σ = { t + s : s = (s1, . . . , sd), 0 ≤ s_{σ(1)} ≤ ··· ≤ s_{σ(d)} ≤ 1/2^n }.
It is clear that C_{t,n} is the union of the S_σ over all permutations. In a unique way, extend f^(n) to S_σ as an affine function. It is then easy to verify the aforementioned properties and that
‖f^(n+1) − f^(n)‖_∞ ≤ sup_{s,t∈D_{n+1}, |t−s|=2^{−(n+1)}} |f(s) − f(t)|.
The remainder of the proof is essentially similar to that of Theorem 1.6.


From this we deduce easily:
Corollary 1.15. Assume that the process Y = {Y(t) : t ∈ [0, 1]^d} verifies one of the two conditions:
(a)
E(|Y(t + h) − Y(t)|^p) ≤ K_d ‖h‖^d / |log ‖h‖|^{1+r},    (1.19)
where p, r, and K_d are positive constants, p < r.
(b) Y is Gaussian, m(t) = E(Y(t)) is continuous, and
Var(Y(t + h) − Y(t)) ≤ C / |log ‖h‖|^a    (1.20)
for all t and sufficiently small ‖h‖, and a > 3.
Then the process has a version with continuous paths.
Note that the case of processes with values in R^d need not be considered separately, since continuity can be addressed coordinate by coordinate. For Hölder regularity we have:
Proposition 1.16. Let Y = {Y(t) : t ∈ [0, 1]^d} be a real-valued stochastic process with continuous paths such that for some q > 1, α > 0,
E(|Y(s) − Y(t)|^q) ≤ (const) ‖s − t‖^{d+α}.
Then almost surely, Y has Hölder paths with exponent α/(2q).


Until now, we have deliberately chosen elementary methods that apply to general random processes, not necessarily Gaussian. In the Gaussian case, even when the parameter varies in a set that does not have a restricted geometric structure, the question of continuity can be addressed using specific methods. As we have remarked several times already, we only need to consider centered processes.
Let {X(t) : t ∈ T} be a centered Gaussian process taking values in R. We assume that T is some metric space with distance denoted by τ. On T we define the canonical distance d,
d(s, t) := ( E(X(t) − X(s))² )^{1/2}.
In fact, d is a pseudodistance, because two distinct points can be at d-distance zero. A first point is that when the covariance function r(s, t) is τ-continuous, which is the only relevant case (otherwise there is no hope of having continuous paths), d-continuity and τ-continuity are equivalent. The reader is referred to Adler (1990) for complements and proofs.
Definition 1.17. Let (T, d) be a metric space. For ε > 0 denote by N(ε) = N(T, d, ε) the minimum number of closed balls of radius ε with which we can cover T (the value of N can be +∞).
We have the following theorem:
Theorem 1.18 (Dudley, 1973). A sufficient condition for {X(t) : t ∈ T} to have continuous sample paths is
∫_0^{+∞} (log N(ε))^{1/2} dε < ∞.
log N(ε) is called the entropy of the set T.
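As an illustration of Dudley's condition, consider T = [0, 1] with the canonical distance of the Wiener process, d(s, t) = |t − s|^{1/2}: a d-ball of radius ε is an interval of Euclidean length 2ε², so N(ε) is of order 1/(2ε²) until it drops to 1, and the entropy integral is finite. The sketch below evaluates it numerically (the small lower cutoff only avoids a division by zero and has negligible effect).

```python
import numpy as np
from scipy import integrate

def sqrt_log_N(eps):
    """sqrt(log N(eps)) for T = [0, 1] with d(s, t) = |t - s|**0.5, N(eps) ~ 1/(2 eps^2)."""
    N = max(1.0 / (2.0 * eps**2), 1.0)
    return np.sqrt(np.log(N))

val, _ = integrate.quad(sqrt_log_N, 1e-9, 1.0, limit=200)
print(val)   # finite, so Dudley's condition holds and the paths are continuous
```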


A very important fact is that this condition is necessary in some relevant cases:
Theorem 1.19 (Fernique, 1974). Let {X(t) : t T }, T compact, a subset of Rd ,
be a stationary Gaussian process. Then the following three statements are equivalent:

Almost surely, X() is bounded.


Almost surely, X() is continuous.
+

log(N ())
0

1/2

d < .


This condition can be compared with Kolmogorovs theorem. The reader can
check that Theorem 1.19 permits us to weaken the condition of Corollary 1.7(b)
to a > 1. On the other hand, one can construct counterexamples (i.e., processes
not having continuous paths) such that (1.14) holds true with a = 1. This shows
that the condition of Corollary 1.7(b) is nearly optimal and sufficient for most
applications. When the Gaussian process is no longer stationary, M. Talagrand
has given necessary and sufficient conditions for sample path continuity in terms
of the existence of majorizing measures (see Talagrand, 1987).
The problem of differentiability can be addressed in the same manner as for
d = 1. A sufficient condition for a Gaussian process to have a version with C k
sample paths is for its mean to be C k , its covariance C 2k , and its kth derivative
in quadratic mean to satisfy some of the criteria of continuity above.

1.4.5. Tangencies and Local Extrema

In this section we give two classical results that are used several times in the book.
The first gives a simple sufficient condition for a one-parameter random process
not to have a.s. critical points at a certain specified level. The second result
states that under mild conditions, a Gaussian process defined on a quite general
parameter set with probability 1 does not have local extrema at a given level.
We will use systematically the following notation: if ξ is a random variable with values in R^d and its distribution has a density with respect to Lebesgue measure, this density is denoted by
p_ξ(x),  x ∈ R^d.
Proposition 1.20 (Bulinskaya, 1961). Let {X(t) : t ∈ I} be a stochastic process with paths of class C¹ defined on the interval I of the real line. Assume that for each t ∈ I the random variable X(t) has a density p_{X(t)}(x) which is bounded as t varies in a compact subset of I and x in a neighborhood v of u ∈ R. Then
P(T_u^X ≠ ∅) = 0,
where T_u^X = {t : t ∈ I, X(t) = u, X′(t) = 0} is the set of critical points with value u of the random path X(·).
Proof. It suffices to prove that P(T_u^X ∩ J ≠ ∅) = 0 for any compact subinterval J of I. Let ℓ be the length of J and t_0 < t_1 < ··· < t_m a uniform partition of J (i.e., t_{j+1} − t_j = ℓ/m for j = 0, 1, . . . , m − 1). Denote by ω_{X′}(δ, J) the modulus of continuity of X′ on the interval J and by E_{δ,ε} the event
E_{δ,ε} = { ω_{X′}(δ, J) ≥ ε }.


Let ε > 0 be given; choose δ > 0 so that P(E_{δ,ε}) < ε, and m so that ℓ/m < δ and [u − ℓε/m, u + ℓε/m] ⊂ v. We have
P(T_u^X ∩ J ≠ ∅) ≤ P(E_{δ,ε}) + Σ_{j=0}^{m−1} P( {T_u^X ∩ [t_j, t_{j+1}] ≠ ∅} ∩ E_{δ,ε}^C )
 ≤ ε + Σ_{j=0}^{m−1} P( |X(t_j) − u| ≤ εℓ/m )
 = ε + Σ_{j=0}^{m−1} ∫_{|x−u|≤εℓ/m} p_{X(t_j)}(x) dx.
If C is an upper bound for p_{X(t)}(x), t ∈ J, |x − u| ≤ ℓε/m, we obtain
P(T_u^X ∩ J ≠ ∅) ≤ ε + 2Cℓε.
Since ε > 0 is arbitrary, the result follows.
The second result is an extension of Ylvisaker's theorem, which has the following statement:
Theorem 1.21 (Ylvisaker, 1968). Let {Z(t) : t ∈ T} be a real-valued Gaussian process indexed on a compact separable topological space T, having continuous paths and Var(Z(t)) > 0 for all t ∈ T. Then, for fixed u ∈ R, one has P(E_u^Z ≠ ∅) = 0, where E_u^Z is the set of local extrema of Z(·) having value equal to u.
The extension is the following:
Theorem 1.22. Let {Z(t) : t ∈ T} be a real-valued Gaussian process on some parameter set T and denote by M^Z = sup_{t∈T} Z(t) its supremum (which takes values in R ∪ {+∞}). We assume that there exists a nonrandom countable set D, D ⊂ T, such that a.s. M^Z = sup_{t∈D} Z(t). Assume further that there exist σ_0² > 0, m_* > −∞ such that
m(t) = E(Z(t)) ≥ m_*,  σ²(t) = Var(Z(t)) ≥ σ_0²  for every t ∈ T.
Then the distribution of the random variable M^Z is the sum of an atom at +∞ and a (possibly defective) probability measure on R which has a locally bounded density.
Proof. Step 1. Suppose first that {X(t) : t ∈ T} satisfies the hypotheses of the theorem and, moreover,
Var(X(t)) = 1,  E(X(t)) ≥ 0

for every t ∈ T. We prove that the supremum M^X has a density p_{M^X} which satisfies the inequality
p_{M^X}(u) ≤ ψ(u) := exp(−u²/2) / ∫_u^{+∞} exp(−v²/2) dv  for every u ∈ R.    (1.21)
Let D = {t_k}_{k=1,2,...}. Almost surely, M^X = sup{X(t_1), . . . , X(t_n), . . .}. We set
M_n := sup_{1≤k≤n} X(t_k).
Since the joint distribution of X(t_k), k = 1, . . . , n, is Gaussian, for any choice of k, l = 1, . . . , n, k ≠ l, the probability P{X(t_k) = X(t_l)} is equal to 0 or 1. Hence, possibly excluding some of these random variables, we may assume that these probabilities are all equal to 0 without changing the value of M_n on a set of probability 1. Then the distribution of the random variable M_n has a density g_n(·) that can be written as
g_n(x) = Σ_{k=1}^{n} P( X(t_j) < x, j = 1, . . . , n, j ≠ k | X(t_k) = x ) (1/√(2π)) e^{−(1/2)(x−m(t_k))²} = φ(x) G_n(x),
where φ denotes the standard normal density and
G_n(x) = Σ_{k=1}^{n} P( Y_j < x − m(t_j), j = 1, . . . , n, j ≠ k | Y_k = x − m(t_k) ) e^{x m(t_k) − (1/2)m²(t_k)},    (1.22)
with
Y_j = X(t_j) − m(t_j),  j = 1, . . . , n.
Let us prove that x ↦ G_n(x) is an increasing function.
Since m(t) ≥ 0, it is sufficient that the conditional probability in each term of (1.22) be increasing as a function of x. Write the Gaussian regression
Y_j = (Y_j − c_{jk} Y_k) + c_{jk} Y_k  with  c_{jk} = E(Y_j Y_k),
where the random variables Y_j − c_{jk} Y_k and Y_k are independent. Then the conditional probability becomes
P( Y_j − c_{jk} Y_k < x − m(t_j) − c_{jk}(x − m(t_k)), j = 1, . . . , n, j ≠ k ).

REGULARITY OF PATHS

37

This probability increases with x because 1 cj k 0, due to the Cauchy


Schwarz inequality. Now, if a, b R, a < b, since Mn M X ,
P{a < M X b} = lim P(a < Mn b).
n

Using the monotonicity of Gn , we obtain


+

Gn (b)

(x) dx

Gn (x)(x) dx =

gn (x) dx 1,

so that
b

P{a < Mn b} =
a

(x) dx
a
1

gn (x) dx Gn (b)
(x) dx

(x) dx

This proves (1.21).


Step 2. Now let Z satisfy the hypotheses of the theorem without assuming the
added ones in step 1. For given a, b R, a < b, choose A R+ so that |a| < A
and consider the process
X(t) =

|m | + A
Z(t) a
.
+
(t)
0

Clearly, for every t T ,


E X(t) =

|m | + |a| |m | + A
m(t) a
|m | + A

+
0
+
(t)
0
0
0

and

Var X(t) = 1,

so that (1.21) holds for the process X.


On the other hand,
{a < M Z b} {1 < M X 2 },
where
1 =

|m | + A
,
0

2 =

|m | + A b a
+
.
0
0

It follows that
P a < MZ b

2
1

which proves the statement.

(u) du =
a

v a + |m | + A
1
dv,

0
0

38

CLASSICAL RESULTS ON THE REGULARITY OF PATHS

Theorem 1.21 follows directly from Theorem 1.22, since under the hypotheses
of Theorem 1.21, we can write
{EuX = }

({MU = u} {mU = u}),


U F

where MU (respectively, mu ) is the maximum (respectively, the minimum) of


the process on the set U and F denotes a countable family of open sets being a
basis for the topology of T .
Remark. We come back in later chapters to the subject of the regularity properties of the probability distribution of the supremum of a Gaussian process.

EXERCISES

1.1. Let T = N be the set of natural numbers. Prove that the following sets
belong to (C).
(a) c0 (the set of real-valued sequences {an } such that an 0). Suggestion:

Note that c0 =
nm {|an | < 1/k}.
k=1
m=1
2
(b)
(the set of real-valued sequences {an } such that n |an |2 < ).
(c) The set of real-valued sequences {an } such that limn an 1.
1.2. Take T = R, T = BR . Then if for each
t

the function
(1.23)

X(t, ),

the path corresponding to , is a continuous function, the process is


bi-measurable. In fact, check that
X(t, ) = lim X(n) (t, ),
n+

where for n = 1, 2, . . . , X(n) is defined by


k=+

X(n) (t, ) =

Xk/2n ()1I{k/2n

t <(k+1)/2n } ,

k=

which is obviously measurable as a function of the pair (t, ). So the limit


function X has the required property. If one replaces the continuity of the
path (1.23) by some other regularity properties such as right continuity,
bi-measurability follows in a similar way.

EXERCISES

39

1.3. Let U be a random variable defined on some probability space ( , A, P),


having uniform distribution on the interval [0, 1]. Consider the two stochastic processes
Y (t) = 1It=U
X(t) 0.
The process Y (t) is sometimes called the random parasite.
(a) Prove that for all t [0, 1], a.s. X(t) = Y (t).
(b) Deduce that the processes X(t) and Y (t) have the same probability
distribution P on R[0,1] equipped with its Borel -algebra.
(c) Notice that for each in the probability space, supt[0,1] Y (t) = 1 and
supt[0,1] X(t) = 0, so that the suprema of both processes are completely different. Is there a contradiction with the previous point?
1.4. Let be a Borel probability measure on the real line and
transform; that is,
( ) =

its Fourier

exp(i x)(dx).

(a) Prove that if


k =

|x|k (dx) <

for some positive integer k, the covariance


(k)

( ) =

() is of class C k and

(ix)k exp(i x)(dx).

(b) Prove that if k is even, k = 2p, the reciprocal is true: If


C 2p , then 2p is finite and
(t) = 1 2

is of class

t2
t4
t 2p
+ 4 + + (1)2p 2p
+ o(t 2p ).
2!
4!
(2p)!

Hint: Using induction on p and supposing that k is infinite, then for


every A > 0, one can find some M > 0 such that
M
M

x k (dx) A.

40

CLASSICAL RESULTS ON THE REGULARITY OF PATHS

Show that it implies that


(1)k

k!
tk

(t) 1 2

t2
t k2
+ + (1)k2 k2
2!
(k 2)!

has a limit, when t tends to zero, greater than A, which contradicts


differentiability.
(c) When k is odd, the result is false [see Feller, 1966, Chap. XVII,
example (c)].
1.5. Let {n }n=1,2,... be a sequence of random vectors defined on some probability space taking values in Rd , and assume that n in probability
for some random vector . Prove that if each n is Gaussian, is also
Gaussian.
1.6. Prove the following statements on the process defined by (1.10).
(a) For each t T the series (1.10) converges a.s.
(b) Almost surely, the function t
X(t) is in H and X() 2H =

2
n=1 cn n .
(c) {n }n=1,2,... are eigenfunctionswith eigenvalues {cn }n=1,2,... ,
respectivelyof the linear operator A : H H defined by
(Af )(s) =

r(s, t)f (t)(dt).


T

1.7. Let {X(t) : t T } be a stochastic process defined on some separable topological space T .
(a) Prove that if X(t) has continuous paths, it is separable.
(b) Let T = R. Prove that if the paths of X(t) are c`ad-l`ag, X(t) is separable.
1.8. Let {X(t) : t Rd } be a separable stochastic process defined on some
(complete) probability space ( , A, P).
(a) Prove that the subset of {X() is continuous} is in A.
(b) Prove that the conclusion in part (a) remains valid if one replaces continuous by upper continuous, lower continuous, or continuous on
the right [a real-valued function f defined on Rd is said to be continuous on the right if for each t, f (t) is equal to the limit of f (s) when
each coordinate of s tends to the corresponding coordinate of t on its
right].
1.9. Show that in the case of the Wiener process, condition (1.18) holds for
every p 2, with r = p/2 1. Hence, the proposition implies that a.s.,
the paths of the Wiener process satisfy a Holder condition with exponent
, for every < 12 .

EXERCISES

41

1.10. (Wiener integral ) Let {W1 (t) : t 0} and {W2 (t) : t 0} be two independent Wiener processes defined on some probability space ( , A, P), and
denote by {W (t) : t R} the process defined as
W (t) = W1 (t) if t 0 and W (t) = W2 (t) if t 0.
L2 (R, ) denotes the standard L2 -space of real-valued measurable
functions on the real line with respect to Lebesgue measure and
1
L2 ( , A, P) the L2 of the probability space. CK
(R) denotes the subspace
2
1
of L (R, ) of C -functions with compact support. Define the function
1
I : CK
(R) L2 ( , A, P) as
I (f ) =

f (t)W (t) dt
R

(1.24)

1
for each nonrandom f CK
(R). Equation (1.24) is well defined for each
since the integrand is a continuous function with compact support.

(a) Prove that I is an isometry, in the sense that R f 2 (t) dt = E I 2 (f ) .


(b) Show that for each f , I (f ) is a centered Gaussian random variable.
1
(R), the joint distribution
Moreover, for any choice of f1 , . . . , fp CK
of (I (f1 ), . . . ., I (fp )) is centered Gaussian. Compute its covariance
matrix.
(c) Prove that I admits a unique isometric extension I to L2 (R, ) such
that:
(1) I(f ) is a centered Gaussian random variable with variance equal
to R f 2 (t) dt; similarly for joint distributions.
(2)

f (t)g(t) dt = E I(f )I(g) .

Comment: I(f ) is called the Wiener integral of f .


1.11. (Fractional Brownian motion) Let H be a real number, 0 < H < 1. We
use the notation and definitions of Exercise 1.10.
(a) For t 0, define the function Kt : R R:
Kt (u) = (t u)H 1/2 (u)H 1/2 1Iu<0 + (t u)H 1/2 1I0<u<t .
Prove that Kt L2 (R, ).
(b) For t 0, define the Wiener integral I(Kt ), and for s, t 0, prove the
formula
CH 2H
E I(Ks )I(Kt ) =
s + t 2H |t s|2H ,
2
where CH is a positive constant depending only on H . Compute CH .

42

CLASSICAL RESULTS ON THE REGULARITY OF PATHS

1/2

(c) Prove that the stochastic process {CH I(Kt ) : t 0} has a version
with continuous paths. This normalized version with continuous paths
is usually called the fractional Brownian motion with Hurst exponent
H and is denoted {WH (t) : t 0}.
(d) Show that if H = 12 , then {WH (t) : t 0} is the standard Wiener process.
(e) Prove that for any > 0, almost surely the paths of the fractional
Brownian motion with Hurst exponent H satisfy a Holder condition
with exponent H .
1.12. (Local time) Let {W (t) : t 0} be a Wiener process defined in a probability
space ( , A, P). For u R, I an interval I [0, +] and > 0, define
(u, I ) =

1
2

1I|W (t)u|< dt =

1
({t I : |W (t) u| < }).
2

(a) Prove that for fixed u and I , (u, I ) converges in L2 ( , A, P) as


0. Denote the limit by 0 (u, I ). Hint: Use Cauchys criterion.
(b) Denote Z(t) = 0 (u, [0, t]). Prove that the random process
Z(t) : t 0 has a version with continuous paths. We call this version
the local time of the Wiener process at the level u, and denote it by
LW (u, t).
(c) For fixed u, LW (u, t) is a continuous increasing function of t 0.
Prove that a.s. it induces a measure on R+ that is singular with respect
to Lebesgue measure; that is, its support is contained in a set of
Lebesgue measure zero.
(d) Study the Holder continuity properties of LW (u, t). For future reference, with a slight abuse of notation, we will write, for any interval
I = [t1 , t2 ], 0 t1 t2 :
LW (u, I ) = LW (u, t2 ) LW (u, t1 ).

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series

Computing the maximum of random processes


and series
Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State

University ; Cecile
Mercadier Lyon, France and Mario Wschebor
Universite de Toulouse

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

1 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series

Introduction

MCQMC computations of Gaussian integrals


Reduction of variance
MCQMC

Maxima of Gaussian processes

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

2 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Introduction

The lynx data


7000

6000

5000

4000

3000

2000

1000

20

40

60

80

100

120

Annual record of the number of the Canadian lynx trapped in the


Mackenzie River district of the North-West Canada for the period
1821 - 1934, (Elton and Nicholson, 1942)
Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

3 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Introduction

After passage to the log and centering


3

4
0

20

40

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

60

80

100

120

4 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Introduction

Testing
The maximum of absolute value of the series is 3.0224. An estimation
of the covariance with WAFO gives
2

1.5

0.5

0.5

1.5
0

20

40

60

80

100

120

Can we judge the significativity of this quantity ?


Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

5 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Introduction

We assume the series is Gaussian.


Let the Gaussian density in R114 . We have to compute
3.0224

(x1 , . . . , x114 )dx1 , . . . , dx114


3.0224

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

6 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

MCQMC computations of Gaussian integrals


Reduction of variance

Let us consider our problem in a general setting. is a n n


covariance matrix
u
u
1

I :=
l1

(x)dx

(1)

ln

By conditioning or By Choleski decomposition we can write


x1 = T11 z1
x2 = T12 z1 + T22 z2
.....................................
Where the Zi s are independent standard. Integral I becomes
u1 /T11

I :=

(z1 )dz1
l1 /T11

u2 T12 z1
T22
(z2 )dz2
l2 T12 z1
T22

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

(2)

7 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

MCQMC computations of Gaussian integrals


Reduction of variance

Now making the change of variables ti = (zi )


1

1 (u1 /T11 )

I :=

dt1
1 (l1 /T11 )

u2 T12 1 (t1 )
T22
dt2
l2 T12 1 (t1 )
T22

(3)

And by a final scaling this integral can be written as an integral on the


hypercube [0, 1]n .
I :=

h(t)dt.

(4)

[0,1]n

At this stage, if form (4) is evaluated by MC it corresponds to an


important reduction of variance (102 , 103 ) with respect to the form
(1). The transformation up to there is elementary but efficient.

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

8 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


MCQMC computations of Gaussian integrals
MCQMC

QMC
In the form (4) the MC evaluation is based on
M

h(ti )

I = 1/M
i=1

it is well known that its convergence is slow : O(M 1/2 ).


The Quasi Monte Carlo Method is based on the of searching
sequences that are more random than random. A popular method is
based on lattice rules. Let Z1 be a nice integer sequence in Nn , the
rule consist of choosing
i.z
ti =
,
M
where the notation
means that we have taken the fractional part
componentwise. M is chosen prime.
Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

9 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


MCQMC computations of Gaussian integrals
MCQMC

Theorem
(Nuyens and cools, 2006) Assume that h is the tensorial product of
periodic functions that belong to a Koborov space (RKHS). Then the
minimax sequence and the worst error can be calculated by a
polynomial algorithm. Numerical results show that the convergence is
roughly O(M 1 ).
This result concerns the worst case so it is not so relevant

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

10 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


MCQMC computations of Gaussian integrals
MCQMC

A meta theorem

If h does not satisfies the conditions of the preceding theorem we can


still hope QMC to be faster than MC

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

11 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


MCQMC computations of Gaussian integrals
MCQMC

MCQMC
Let (ti , i) be the lattice sequence, the way of estimating the integral
can be turn to be random but exactly unbiased by setting
M

I = 1/M

ti + U

i=1

where U is uniform on [0, 1]n .


By the meta theorem I has small variance.
So we can make N independent replications of this calculation and
construct Student-type confidence intervals. It is correct whatever the
properties of the function h are.
N must be chosen small : in practical 12.

Conclusion : At the cost of a small loss in speed ( 12 ) we have a


reliable estimation of error.
Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

12 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


MCQMC computations of Gaussian integrals
MCQMC

This method has been used to construct confidence bands for


electrical load curves prediction. Azas, Bercu, Fort, Lagnoux Le
(2009)
Individual 11
2.5
2
1.5
1
0.5
0
0.5
1
1.5
2

50

100

150

200

250

300

350

In this figure the weekly cycle of a firm is decomposed on a Fourier


basis. Each coefficient is learned on a large learning sample The
error is supposed Gaussian and its variance structure can be
computed by linear models formulas.

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

13 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

Do processes exist ?
In this part X(t) is a Gaussian process defined on a compact interval
[0, T].
Since such a process is always observed in a finite set of times and
since the previous method work with say n = 1000, is it relevant to
consider continuous case ?
Answer yes : random process occur as limit statistics. Consider for
example the simple mixture model
H0 : Y N(0, 1)
H1 : Y pN(0, 1) + (1 p)N(, 1) p [0, 1], M R

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

(5)

14 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

Theorem (Asymptotic distribution of the LRT)


Under some conditions the LRT of H0 against H1 has , under H0 , the
distribution of the random variable
1
sup {Z 2 (t)},
2 tM

(6)

where Z(.) is a centered Gaussian process covariance function


r(s, t) =

est 1
es2 1

et2 1

In this case there is no discretization.

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

15 / 22

Computing the maximum of random processes and series

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Maxima of Gaussian processes

The record method

IE X (t)+ 1IX(s)u,s<t X(t) = u)pX(t) (u)dt

P{M > u} = P{X(0) > u} +


0

(7)

after discretization of [0, T], Dn = {0, T/n, 2T/n, . . . , T} Then


P{sup X(t) > u} P{M > u} P{X(0) > u}
tDn
T

IE X (t)+ 1IX(s)u,s<t,sDn X(t) = u)pX(t) (u)dt

(8)

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

16 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

Now the integral is replaced by a trapezoidal rule using the same


discretization. Error of the trapezoidal rule is easy to evaluate .
Moreover that the different terms involved can be computed in a
recursive way.

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

17 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

An example

Using MGP written by Genz, let us consider the centered stationary


Gaussian process with covariance exp(t2 /2)
[ pl, pu, el, eu, en, eq ] = MGP( 100000, 0.5, 50,
@(t)exp(-t.2 /2), 0, 4);
pu upper bound with
eu = estimate for total error,
en = estimate for discretization error, and
eq = estimate for MCQMC error ;
pl lower bound
el = error estimate (MCQMC)

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

18 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

Extensions

Treat all the cases : maximum of the absolute value, non centered,
non-stationary. In each case some tricks have to be used.
A great challenge is to use such formulas for fields .

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

19 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

References

Level Sets and Extrema


of Random Processes
and Fields
Jean-Marc Azas and Mario Wschebor

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

20 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

Azas Genz, A. (2009),Computation of the distribution


of the maximum of stationary Gaussian sequences and
processes. In preparation
Allan Genz web site
http ://www.math.wsu.edu/faculty/genz/homepage
Mercadier,C. (2005). MAGP tooolbox,
http ://math.univ-lyon1.fr/ mercadier/
Mercadier, C. (2006), Numerical Bounds for the
Distribution of the Maximum of Some One- and
Two-Parameter Gaussian Processes, Adv. in Appl.
Probab. 38, pp. 149170.
Nuyens, D., and Cools, R. (2006), Fast algorithms for
component-by-component construction of rank-1 lattice
rules in shift-invariant reproducing kernel Hilbert
spaces, Math. Comp 75, pp. 903920.
Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

21 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

THANK-YOU
MERCI
GRACIAS

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

22 / 22

Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes

Computing the maximum of random processes and series


Maxima of Gaussian processes

THANK-YOU
MERCI
GRACIAS

Jean-Marc A ZAI S Joint work with : Alan Genz ,Washington State University

; Cecile
Mercadier Lyon, France and Mario
Wschebor (Universite de Toulouse )

22 / 22

ESAIM: Probability and Statistics

September 1999, Vol. 3, p. 107129

URL: http://www.emath.fr/ps/

BOUNDS AND ASYMPTOTIC EXPANSIONS FOR THE DISTRIBUTION


OF THE MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

Jean-Marc Azas 1 , Christine Cierco-Ayrolles 1, 2 and Alain Croquette 1


Abstract. This paper uses the Rice method [18] to give bounds to the distribution of the maximum
of a smooth stationary Gaussian process. We give simpler expressions of the first two terms of the Rice
series [3,13] for the distribution of the maximum. Our main contribution is a simpler form of the second
factorial moment of the number of upcrossings which is in some sense a generalization of Steinberg
et al.s formula ([7] p. 212). Then, we present a numerical application and asymptotic expansions that
give a new interpretation of a result by Piterbarg [15].

R
esum
e. Dans cet article nous utilisons la methode de Rice (Rice, 1944-1945) pour trouver un encadrement de la fonction de repartition du maximum dun processus Gaussien stationnaire regulier.
Nous derivons des expressions simplifiees des deux premiers termes de la serie de Rice (Miroshin, 1974,
Azas et Wschebor, 1997) suffisants pour lencadrement cherche. Notre contribution principale est la
donnee dune forme plus simple du second moment factoriel du nombre de franchissements vers le
haut, ce qui est, en quelque sorte, une generalisation de la formule de Steinberg et al. (Cramer and
Leadbetter, 1967, p. 212). Nous presentons ensuite une application numerique et des developpements
asymptotiques qui fournissent une nouvelle interpretation dun resultat de Piterbarg (1981).

AMS Subject Classification. 60Exx, 60Gxx, 60G10, 60G15, 60G70, 62E17, 65U05.
Received June 4, 1998. Revised June 8, 1999.

1. Introduction
1.1. Framework
Many statistical models involve nuisance parameters. This is the case for example for mixture models [10],
gene detection models [5,6], projection pursuit [20]. In such models, the distributions of test statistics are those
of the maximum of stochastic Gaussian processes (or their squares). Dacunha-Castelle and Gassiat [8] give for
example a theory for the so-called locally conic models.
Thus, the calculation of threshold or power of such tests leads to the calculation of the distribution of the
maximum of Gaussian processes. This problem is largely unsolved [2].
Keywords and phrases: Asymptotic expansions, extreme values, stationary Gaussian process, Rice series, upcrossings.

This paper is dedicated to Mario Wschebor in the occasion of his 60th birthday.

Laboratoire de Statistique et Probabilit


es, UMR C55830 du CNRS, Universite Paul Sabatier, 118 route de Narbonne, 31062
Toulouse Cedex 4, France.
2 Institut National de la Recherche Agronomique, Unit
e de Biom
etrie et Intelligence Artificielle, BP. 27, Chemin de Borde-Rouge,
31326 Castanet-Tolosan Cedex, France; e-mail: azais@cict.fr, cierco@toulouse.inra.fr, croquett@cict.fr
c EDP Sciences, SMAI 1999

108

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

Miroshin [13] expressed the distribution function of this maximum as a sum of a series, so-called the Rice
series. Recently, Azas and Wschebor [3, 4] proved the convergence of this series under certain conditions and
proposed a method giving the exact distribution of the maximum for a class of processes including smooth
stationary Gausian processes with real parameter.
The formula given by the Rice series is rather complicated, involving multiple integrals with complex expressions. Fortunatly, for some processes, the convergence is very fast, so the present paper studies the bounds
given by the first two terms that are in some cases sufficient for application.
We give identities that yield simpler expressions of these terms in the case of stationary processes. Generalization to other processes is possible using our techniques but will not be detailed for shortness and simplicity.
For other processes, the calculation of more than two terms of the Rice series is necessary. In such a case,
the identities contained in this paper (and other similar) give a list of numerical tricks used by a program under
construction by Croquette.
We then use Maple to derive asymptotic expansions of some terms involved in these bounds. Our bounds
are shown to be sharp and our expansions are made for a fixed time interval and a level tending to infinity.
Other approaches can be found in the literature [12]. For example, Kratz and Rootzen [11] propose asymptotic
expansions for a size of time interval and a level tending jointly to infinity.
We consider a real valued centred stationary Gaussian process with continuous paths X = {Xt ; t [0, T ] R}.
We are interested in the random variables
X = sup Xt or X

= sup |Xt | .

t[0,T ]

t[0,T ]

For shortness and simplicity, we will focus attention on the variable X ; the necessary modifications for adapting
our method to X are easy to establish [5].
We denote by dF () the spectral measure of the process X and p the spectral moment of order p when it
exists. The spectral measure is supposed to have a finite second moment and a continuous component. This
implies ([7] p. 203) that the process is differentiable in quadratic mean and that for all pairwise different time
points t1 , . . . , tn in [0, T ], the joint distribution of Xt1 , . . . , Xtn , Xt1 , . . . , Xtn is non degenerated.
For simplicity, we will assume that moreover the process admits C 1 sample paths. We will denote by r(.) the
covariance function of X and, without loss of generality, we will suppose that 0 = r(0) = 1.
Let u be a real number, the number of upcrossings of the level u by X, denoted by Uu is defined as follows:
Uu = # {t [0, T ], Xt = u, Xt > 0}
For k N , we denote by k (u, T ) the factorial moment of order k of Uu and by k (u, T ) the factorial moment of
order k of Uu 11{X0 u} . We also define k (u, T ) = k (u, T ) k (u, T ). These factorial moments can be calculated
by Rice formulae. For example:

T 2 u2 /2
1 (u, T ) = E (Uu ) =
e
2
T

and 2 (u, T ) = E (Uu (Uu 1)) =

Ast (u) ds dt
0

with Ast (u) = E (Xs )+ (Xt )+ |Xs = Xt = u ps,t (u, u), where (X )+ is the positive part of X and ps,t the
joint density of (Xs , Xt ).
These two formulae are proved to hold under our hypotheses ( [7], p. 204). See also Wschebor [21],
Chapter 3, for the case of more general processes.
We will denote by the density of the standard Gaussian distribution. In order to have simpler expressions
x

of rather complicated formulae, we will use the folllowing three functions: (x) =
x

and (x) =
0

1
(y)dy = (x) .
2

(y)dy, (x) = 1 (x)

109

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

1.2. Main inequalities


Since the pioneering works of Rice [18], the most commonly used upper bound for the distribution of the
maximum is the following:
P (X > u) P (X0 > u) + P (Uu > 0) P (X0 > u) + E (Uu ) .
2
(u).
2
One can also see the works by [9, 15, 16].
We propose here a slight refinement of this inequality, but also a lower bound using the second factorial
moment of the number of upcrossings. Our results are based on the following remark which is easy to check: if
is a non-negative integer valued random variable, then

That is: P (X > u) (u) + T

1
E () E (( 1)) P ( > 0) E () .
2
Noting that P almost surely, {X > u} = {X0 > u} {X0 u, Uu > 0} and that E Uu (Uu 1)11{X0 u} 2 ,
we get:
P (X0 > u) + 1 (u, T )

2 (u, T )
P X u
2

P (X0 > u) + 1 (u, T ),

(1.1)

with 1 (u, T ) = E Uu 11{X0 u} .


Using the same technique as for calculating E (Uu ) and E (Uu (Uu 1)), one gets
T

1 (u, T ) =

dt
0

dx

y p0,t;t (x, u; y)dy,


0

where p0,t;t stands for the density of the vector (X0 , Xt , Xt ).


Azas and Wschebor [3, 4] have proved, under certain conditions, the convergence of the Rice series [13]
+

P X u = P (X0 > u) +

(1)m+1
m=1

m (u, T )
m!

(1.2)

and the envelopping property of this series:


n
m (u, T )
(1)m+1
if we set Sn = P (X0 > u) +
, then, for all n > 0:
m!
m=1
S2n P X u S2n1 .

(1.3)

Using relation (1.3) with n = 1 gives


P (X0 > u) + 1 (u, T )

2 (u, T )
P X u P (X0 > u) + 1 (u, T ).
2

Since 2 (u, T ) 2 (u, T ), we see that, except this last modification which gives a simpler expression, Main
inequality (1.1) is relation (1.3) with n = 1.

110

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

Remark 1.1. In order to calculate these bounds, we are interested in the quantity 1 (u, T ). For asymptotic
calculations and to compare our results with Piterbargs ones, we will also consider the quantity k (u, T ). From
a numerical point of view, k (u, T ) and k (u, T ) are worth being distinguished because they are not of same
order of magnitude as u +. In the following sections, we will work with 1 (u, T ).

2. Some identities
First, let us introduce some notations that will be used in the rest of the paper. We set:
r (t)
u,
(t) = E (X0 |X0 = Xt = u) =
1 + r(t)
r 2 (t)
2 (t) = V ar (X0 |X0 = Xt = u) = 2
,
1 r2 (t)
r (t) 1 r2 (t) r(t)r 2 (t)
(t) = Cor (X0 , Xt |X0 = Xt = u) =
.
2 (1 r2 (t)) r 2 (t)
1 + (t)

and b(t) = (t).


1 (t)

Note that, since the spectrum of the process X admits a continuous component, |(t)| = 1.
In the sequel, the variable t will be omitted when it is not confusing and we will write r, r , , , , k, b instead
of r(t), r (t), (t), (t), (t), k(t), b(t).

We also define k(t) =

Proposition 2.1. (i) If (X, Y ) has a centred normal bivariate distribution with covariance matrix

1
1

then a R+
a

1
P (X > a, Y > a) = arctan

1+
(x)
2
1
0
1+
x (x) dx
1

=2

(ii) 1 (u, T ) = (u)


0
T

2 (T t)

(iii) 2 (u, T ) =
0

1 r 2
u
1+r

1
2
1 r2 (t)

1+
x
1

1r
r
u (b)
1+r
1 r2

u
1 + r(t)

dx

dt

[T1 (t) + T2 (t) + T3 (t)] dt

with:
T1 (t) = 2 (t)

1 2 (t) (b(t)) (k(t) b(t)) ,

(2.1)

T2 (t) = 2 (2 (t)(t) 2 (t))

(k(t) x) (x) dx,

(2.2)

b(t)

T3 (t) = 2 (t) (t) (k(t) b(t)) (b(t)) .

(2.3)

(iv) A second expression for T2 (t) is:


T2 (t) = ( (t)(t) (t))
2

1
arctan (k(t)) 2

b(t)

(k(t) x) (x) dx .
0

(2.4)

111

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

Remark 2.2.
p. 27:

1. Formula (i) is analogous to the formula (2.10.4) given in Cramer and Leadbetters [7],

P (X > a, Y > a) = (a)(a) +


0

1
a2

exp
1z
2 1 z 2

dz.

Our formula is easier to prove and is more adapted to numerical application because, when t 0,
(t) 1 and the integrand in Cramer and Leadbetters formula tends to infinity.
2. Utility of these formulae:
these formulae permit a computation of Main inequality (1.1), at the cost of a double integral with
finite bounds. This is a notable reduction of complexity with respect to the original form. The form
(2.4) is more adapted to effective computation, because it involves an integral on a bounded interval;
this method has been implemented in a S+ program that needs about one second of Cpu to run an
example. It has been applied to a genetical problem in Cierco and Azas [6].
The form (iii) has some consequences both for numerical and theoretical purposes. The calculation of 2 (u, T )
yields some numerical difficulties around t = 0. The sum of the three terms is infinitly small with respect to
each term. To discard the diagonal from the computation, we use formula (iii) and Maple to calculate the
equivalent of the integrand in the neighbourhood of t = 0 at fixed u.
T

Recall that we have set 2 (u, T ) =

Ast (u) ds dt. The following proposition gives the Taylor expansion
0

of A at zero.

Proposition 2.3. Assume that 8 is finite. Then, as t 0:


3/2

At (u) =

1
(2 6 4 )
1 4
exp
u2
1296 (4 2 )1/2 2 2
2 4 22
2
2

t4 + O(t5 ).

Piterbarg [17] or Wschebor [21] proved that At (u) = O ( (u(1 + ))) for some 0. Our result is more precise.
Our formulae give some asymptotic expansions as u + for 1 (u, T ) and 2 (u, T ) for small T .
Proposition 2.4. Assume that 8 is finite. Then, there exists a value T0 such that, for every T < T0
11/2

4 22
27
1 (u, T ) =

4 5 (2 6 2 )3/2
2
4

4
u
4 22

u6

1+O

1
u

9/2
4 22
3 3T
2 (u, T ) =

9/2 (2 6 2 )
2
4

4
u
4 22

u5

1+O

1
u

as u +.

3. A numerical example
In the following example, we show how the upper and lower bounds (1.1) permit to evaluate the distribution
of X with an error less than 104 .
We consider the centered stationary Gaussian process with covariance (t) := exp(t2 /2) on the interval
I = [0, 1], and the levels u = 3, 2.5, . . . , 3. The term P (X0 u) is evaluated by the S -plus function P norm,
1 and 2 using Proposition 2.1 and the Simpson method. Though it is rather difficult to assess the exact
precision of these evaluations, it is clear that it is considerably smaller than 104 . So, the main source of error

112

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

is due to the difference between the upper and lower bounds in (1.1).
u
P (X0 u)
3
0.00135
2.5
0.00621
2
0.02275
1.5
0.06681
1
0.15866
0.5
0.30854
0
0.50000
0.5
0.69146
1
0.84134
1.5
0.93319
2
0.97725
2.5
0.99379
3
0.99865

1
0.00121
0.00518
0.01719
0.04396
0.08652
0.13101
0.15272
0.13731
0.09544
0.05140
0.02149
0.00699
0.00177

2
lower bound upper bound
0
0.00014
0.00014
0
0.00103
0.00103
0
0.00556
0.00556
0.00001
0.02285
0.02285
0.00002
0.07213
0.07214
0.00004
0.17753
0.17755
0.00005
0.34728
0.34731
0.00004
0.55415
0.55417
0.00002
0.74591
0.74592
0.00001
0.88179
0.88180
0
0.95576
0.95576
0
0.98680
0.98680
0
0.99688
0.99688

The calculation demands 14 s on a Pentium 100 MHz.


The corresponding program is available sending an e-mail to croquett@cict.fr.

4. Proofs
Proof of Proposition 2.1
Proof of point (i). We first search P (X > a, Y > a).
Put = cos(), [0, [, and use the orthogonal decomposition Y = X +
a X
Then {Y > a} = Z >
. Thus:
1 2
+

P (X > a, Y > a) =

a x

(x)

(x)(z) dx dz,

dx =

1 2

1 2 Z.

1
where D is the domain located between the two half straight lines starting from the point a, a
1+

with angle and .


2
2

Using a symmetry with respect to the straight line with angle passing through the origin, we get:
2
+

P (X > a, Y > a) = 2

(x)
a

1
x
1+

dx.

(4.1)

Now,
P (X > a, Y > a) = (a) P (X > a, Y < a) = (a) P (X > a, (Y ) > a) .
Applying relation (4.1) to (X, Y ) yields
+

P (X > a, Y > a) = (a) 2

(x)
a

1+
x
1

dx = 2

and

1+
x
1

(x) dx.

113

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

Now, using polar coordinates, it is easy to establish that


+

(k x) (x) dx =
0

1
arctan(k)
2

which yields the first expression.


Proof of point (ii). Conditionally to (X0 = x, Xt = u), Xt is Gaussian with:
r (t)(x r(t)u)
mean m(t) =
,
1 r2 (t)
2
variance (t) already defined.
It is easy to check that, if Z is a Gaussian random variable with mean m and variance 2 , then
m
m
+ m

E Z + =

These two remarks yield 1 (u, T ) = I1 + I2 , with:


T
+
r (x r u)
I1 =
dt

p0,t (x, u) dx
(1 r2 )
0
u
T
+
r (x r u)
r (x r u)
I2 =
dt

p0,t (x, u) dx.


2)
(1

r
(1 r2 )
0
u
T

I1 can be written under the following form: I1 = (u)


0

parts leads to

I2 = (u)
0

Finally, noticing that 2 +

1 (u, T ) =

22

2
(u)
2

r
1r

u (b)

2
1r
1+r

2 1 r
r2

u
+

1+r
22 (1 r2 )

1r
u
1+r

dt. Integrating I2 by

dt.

r2
= 2 , we obtain:
1 r2
T

1r
u
1+r

dt + (u)
0

1 r2

1r
u
1+r

(b) dt.

Proof of point (iii). We set:


(x b)2 2(x b)(y + b) + (y + b)2
v(x, y) =
2(1 2 )
for (i, j) {(0, 0); (1, 0); (0, 1); (1, 1); (2, 0); (0, 2)}
+

Jij =
0

xi y j
2

1 2

exp (v(x, y)) dydx.

We first calculate the values of Jij . The following relation is clear


+

J10 J01 (1 + )bJ00

1 2
0

1 2 (k b) (b).

exp (v(x, y))


v(x, y)
dx
x
2 1 2

dy
(4.2)

114

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

Symmetrically, replacing x with y and b with b in (4.2) yields


J01 J10 + (1 + )bJ00 = 1 2 (k b) (b).

(4.3)

In the same way, multiplying the integrand by y, we get


J11 J02 (1 + ) b J01 = 1 2

3/2

(k b) k b (k b) (b).

(4.4)

[ (k b) + k b (k b)] (b).

(4.5)

And then, multiplying the integrand by x leads to


J11 J20 + (1 + ) b J10 = 1 2
+

Finally, J20 J11 (1 + ) b J10 = (1 2 )

x
0

parts

3/2

exp (v(x, y))

v(x, y)
dx dy. Then, integrating by
x
2 1 2

J20 J11 (1 + ) b J10 = (1 2 ) J00 .

(4.6)

Multiplying equation (4.6) by and adding (4.5) gives:


J11 = b J10 + J00 +

1 2 [ (k b) + k b (k b)] (b).

Multiplying equation (4.3) by and adding equation (4.2) yields:


J10 = b J00 + (k b) + (k b) (b).
+

And, by formula (i), J00 = 2

(k x) (x) dx. Finally, gathering the pieces, it comes:


b

J11 = J11 (b, ) =

1 2 2

b
1

(b) + 2 b2

(k x) (x) dx + 2 b (k b) (b).
b

The final result is obtained remarking that


+

E (X0 ) (Xt ) |X0 = Xt = u = 2 (t) J11 (b(t), (t)) .


Proof of point (iv). Expression (2.4) is obtained simply using the second expression of J00 .
Note 4.1. In the following proofs, some expansions are made as t 0, some as u + and some as
(t, u) (0, +).
We define the uniform Landau symbol OU as a(t, u) = OU (b(t, u)) if there exists T0 and u0 such that for
t < T0 < T and u > u0 ,
a(t, u) (const) b(t, u).
We also define the symbol

as a(t, u)

b(t, u)

a(t, u) = OU (b(t, u))

b(t, u) = OU (a(t, u))

Note 4.2. Many results of this section are based on tedious Taylor expansions. These expansions have been
made or checked by a computer algebra system (Maple). They are not detailed in the proofs.

115

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

1 + (t)
= O(t) is small,
1 (t)

Proof of Proposition 2.3. Use form (iii) and remark that, when t is small, k(t) =
1
and, since () =
2

3
6

+ O 5 as 0, we get:
b(t)

b(t)

k(t)
arctan(k(t))
k 3 (t)

x(x)dx +
x3 (x)dx + O(t5 )
2
2 0
6 2 0

1
k(t)
2 arctan(k(t)) 2 ((0) (b(t)))
+ O(t5 ).
= 2 2 (t)(t) 2 (t)

k 3 (t)
2
+
2(0) b (t) + 2 (b(t))
6 2
In the same way:
2(t)(t)
k 3 (t) 3
b (t) + O(t5 ).
T3 (t) =
(b(t)) k(t)b(t)
6
2
And then, assuming 8 finite, use Maple to get the result.
T2 (t) = 2 2 (t)(t) 2 (t)

Proof of Proposition 2.4. We first prove the following two lemmas.


Lemma 4.3. Let l be a real positive function of class C 2 satisfying l(t) = ct + O(t2 ) as t 0, c > 0. Suppose
that 8 is finite, with the above definitions of k(t) and b(t), we have as u +:
arctan

(p+1)

(i) Ip =

t (k(t) b(t)) (l(t) u) dt = (c u)


0

1 Mp+1

2 2

( dc )
p

(cos ) d 1 + O
0

1
u

22 6 2 24
p+1
and Mp+1 = E |Z|
where Z is a standard Gaussian random variable.
4 22
T
Mp
1
(ii) Jp =
tp (l(t) u) dt = (c u)(p+1)
1+O

2
u
0

with d =

1
6

Proof of Lemma 4.3. Since the derivative of l at zero is non zero, l is invertible in some neighbourghood of zero
1
1
and its inverse l1 satisfies l1 (t) = t + O(t2 ), l1 (t) = + O(t).
c
c
We first consider Ip and use the change of variable y = l(t)u, then
l(T )u

Ip =

y
u

l1

(kb) l1

y
u

(y) l1

y
u

dy

From the expressions of k(t) and b(t), we know that


(kb)(t) =
Thus (kb) l1

y
d
= y + u OU
u
c

1
6
y2
u2

and

l(T )u

(p+1)

yp

Ip = (c u)

We use the following lemma.

22 6 2 24
t u + u O(t3 ) = d u t + u O(t3 ).
4 22

d
y + u OU
c

y2
u2

(y) 1 + OU

y
u

dy.

116

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

Lemma 4.4. Let h be a real function such that h(t) = O t2 as t


0, then there exists T0 such that for
0 t T0
(u(t + h(t))) = (t u) [1 + OU (t)] .
Proof of Lemma 4.4. Taking T0 sufficiently small, we can assume that h(t)
A = | (u(t + h(t))) (t u)| u |h(t)|

tu
2

t
. Then
2

(const) u t2

tu
2

We want to prove that, in every case,


A (const) t (t u)

(4.7)

when tu 1, (t u) tu(1) and A (const) u t2 (0), thus (4.7) holds.


ut
when tu > 1, (t u) > (1) and A (const) t2 u
and (4.7) holds again.
2
End of proof of Lemma 4.3.
Due to Lemma 4.4,
l(T )u

(p+1)

yp

Ip = (c u)

0
l(T )u

yp

Put Kp (u) =
0

d
y
c

d
y
c

(y) 1 + OU

y
u

dy.

(4.8)

(y) dy. It is easy to see that, when u +,


+

yp

Kp (u) =
0

d
y
c

(y) dy + O un for every integer n > 0.


d

+
c y yp
y2 + z 2
d
y (y) dy =
exp
dz dy. Then, using polar coorc
2
2
0
0
0
d
1 Mp+1 arctan( c )
p
dinates, we derive that Kp () =
(cos ) d. So we can see that the contribution of the
2 2
0
y
term OU
in formula (4.8) is O u(p+2) which gives the desired result for Ip .
u

Moreover, Kp () =

yp

The same kind of proof gives the expression of Jp .


Proof of the equivalent of 1 (u, T ). We set
A1 (t) = (u)

2 (1 r)
u
2 (1 + r)

1r
u
1+r

r
(b)
1 r2

Then, 1 (u, T ) =

A1 (t) dt.
0

It is well known ([1], p. 932) that, as z tends to infinity,


(z) = (z)

1
1
3
3 + 5 + O(z 7 ) .
z
z
z

(4.9)

117

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

2 (1 r(t))
u for the first term and z = b(t)
2 (t)(1 + r(t))

We use this expansion for both terms of 1 (u, T ), with z =


for the second one.
Besides, remarking that

2 (1 r)
u
2 (1 + r)

1r
u
1+r

(b) ,

we get:

2 (1 + r) 1

2 (1 r) u

2 (1 r)
2 (1 + r)

u
+ OU

2
(1 + r)

2 (1 r)

1
r
1
+
3 + OU
2
b
1r b

(u)
A1 (t) =
2

3/2

2 (1 + r)
2 (1 r)
5/2

1
u5

1
b5

u3


From Taylor expansion made by Maple assuming 8 finite, we know that:


5/2

4
2
exp 2(u4
2)
1 4 2
2
t2 + O(t4 ).
A1 (t) =

7/2
8
u3 2
2

To use Lemma 4.3 point (ii) to calculate 1 (u, T ), it is necessary to have a Taylor expansion of the coefficient
22
2 (1 r)
2 (1 r(t))
of u in
u
.
We
have
lim
=
, therefore, we set:
t0 2 (t)(1 + r(t))
2 (1 + r)
4 22
2 (1 r)
22
.

2 (1 + r)
4 22

l(t) =

From Taylor expansion made by Maple assuming 8 finite, we get


1
l(t) =
6

2 (2 6 24 )
t + O(t2 ).
4 22

And, according to Lemma 4.3 point (ii),


T

1
t (l(t) u) dt =
2

1
6

Finally, remarking that (u)

2
4

22

2 (2 6 24 )
u
4 22

1
=
2
11/2

4 22
27
1 (u, T ) =

4 5 (2 6 2 )3/2
2
4

1+O

1
u

4
u , we get the equivalent for 1 (u, T ).
4 22
4
u
4 22

u6

1+O

1
u

118

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

Proof of the equivalent of 2 (u, T ). Remember that


T

2 (T t)

2 (u, T ) =
0

1
2
1 r2 (t)

u
1 + r(t)

[T1 (t) + T2 (t) + T3 (t)] dt.

(4.10)

We first calculate an expansion of term T2 = 2 2 ( b2 )

(x) (k x) dx.

b
The function x x2 1 (x) being bounded, we have
(kx) = (k b) + k (k b) (x b)

1 3
2
3
k b (k b) (x b) + OU k 3 (x b) ,
2

(4.11)

where the Landaus symbol has here the same meaning as in Lemma 4.3.
Moreover, using the expansion of given in formula (4.9), it is easy to check that as z +,
+

(z)
(z)
(z)
3 4 +O
2
z
z
z6
z
+
(z)
(z)
2
(x z) (x) dx = 2 3 + O
z
z5
z
+
(z)
3
(x z) (x) dx = O
.
z4
z
(x z) (x) dx =

Therefore, multiplying formula (4.11) by (x), integrating on [b; +[ and applying formula (4.9) once again
yield:

3
1 k2
3
1
1

+
+ k (k b) (b)
4

(k b) (b)

b b3 b5
b2
b

(k b) (b)
k
2
2
T2 = 2 b
+O
(k
b)
(b)

+
O

b7
b6

3
3

k
k

(k b) (b) + O
(b)

+O
b4
b4

Note that the penultimate term can be forgotten. Then, remarking that, as u +, b =
u, t and

k t, we obtain:
T2

2
2
= 2 2 b (k b) (b) + 2
(k b) (b) + 2
(k b) (b)
b2
b
2

2 3 (k b) (b) 6 3 (k b) (b) + 2 2 k 3 (k b) (b)


b
b
2 k
2 k
2 2 k (k b) (b) + 2
(k b) (b) + 6 2 (k b) (b)
2
b
b
+ OU t2 u5 (k b) (b) + OU t3 u4 (k b) (b) + OU t5 u2 (b)

Remark 4.5. As it will be seen later on, Lemma 4.3 shows that the contribution of the remainder to the
1
integral (4.10) can be neglected since the degrees in t and of each term are greater than 5. So, in the sequel,
u
we will denote the sum of these terms (and other terms that will appear later) by Remainder and we set:
T2 = U1 + U2 + U3 + U4 + U5 + U6 + U7 + U8 + U9 + Remainder.

119

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

Now, we have

U1 + T 3 = 0
1 2 2 k = (1 + ) k so that U7 + T1 = (1 + ) 2 k (k b) (b)
2
U2 + U3 = 2
(1 + ) (k b) (b)
b
2

U4 + U5 = 4 3 (k b) (b) 1 + O t2
b
2
U8 + U9 = 4 2 k (k b) (b) 1 + O t2
b
since = 1 + O t2 .

By the same remark as Remark 4.5 above, the term O t2 can be neglected. Consequently,

T1 + T2 + T3

= 2

2
2
(1 + ) (k b) (b) 4 3 (k b) (b)
b
b

(1 + ) 2 k (k b) (b) + 2 2 k 3 (k b) (b) + 4

2
k (k b) (b)
b2

+ Remainder.

Therefore, we are leaded to use Lemma 4.3 in order to calculate the following integrals:

(T t)

0
T

(T t)
0
T

(T t)
0
T

(T t)
0

u
2u
(kb) (b) dt = (T t) m1 (t) (kb) b2 +
dt
1+r
1
+r
0

2
2u
m2 (t) (k b) b2 +
dt
1+r

2
2u
dt
m3 (t) b2 (1 + k 2 ) +
1+r

2
2u
dt
m4 (t) b2 (1 + k 2 ) +
1+r

2
2u
dt
m5 (t) b2 (1 + k 2 ) +
1+r

(T t) m1 (t) exp
0

120

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

with:
m1 (t)

=
=

2
2
1
(t) (1 + (t))
1 r2 (t) b
4 22 3
1 2 6 24
t + O t5
5/2
36
u
2

m2 (t)
m3 (t)

=
=

m4 (t)

=
=

m5 (t)

=
=

5/2

4 22
2
1
(t)
=

t + O t3
7/2
1 r2 (t) b3
u3 2
1
1

(1 + (t)) 2 (t) k(t)
2 1 r2 (t)

3/2
2 2 6 24

t4 + O t6
864 22 4 22 3/2
2
1

2 (t) k 3 (t)
2 1 r2 (t)
3/2
2 4
1 2 6 24
t + O t6
864 22 4 22 3/2
4
1
( 1998).2

(t) k(t)
b2
2 1 r2 (t)
3/2
2 6 24 4 22
2 2
1
t + O t4 .
12
32 3/2 u2

4
=

Lemma 4.3 shows that we can neglect the terms issued from the t part of the factor T t in formula (4.10).

We now consider the argument of in Lemma 4.3. We have:


2
b2
4

lim 2 +
=
t0 u
1+r
4 22
b2
2
4

lim 2 1 + k 2 +
=

t0 u
1+r
4 22
Therefore, we set:
2 2 6 24

l1 (t)

b2 (t)
2
4
+

=
2
u
1 + r(t)
4 22

l2 (t)

b2 (t)
2
4
1 + k 2 (t) +

u2
1+r
4 22

2 2 6 24
2

12 (4 22 )

t + O t3

5/2

18 (4 22 )

t + O t3 .

Then, with the notations of Lemma 4.3, we obtain:

2 = T exp

4 u2
2 (4 22 )

4 22
4 22
1 2 6 24

3
5/2
7/2
36
2 u
u3 2

3/2

2 6 24 4 22
2
1
+
J2
12
32 3/2 u2

I1

1+O

1
u

121

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

Where I1 and I3 (resp. J2 ) are defined as in Lemma


4.3 point (i) (resp. (ii) ) with l(t) = l1 (t) (resp. l(t) = l2 (t)).

2
2

2
2
8 3
3
3
(cos ) d =
and that
cos d =
, we find
Noting that
27
3
0
0

4
144 3 4 22
1
I3 =
u4 1 + O
2
u
2 22 (2 6 24 )

2 2
3 3 4 2
1
I1 =
u2 1 + O
2
u
2 2 (2 6 4 )

3
12 3 4 22
1
J2 =
u3 1 + O

2
2
u
2 (2 6 4 ) 2 (2 6 4 )
Finally, gathering the pieces, we obtain the desired expression of 2 .

5. Discussion
Using the general relation (1.3) with n = 1, we get
P X u P (X0 > u) 1 (u, T ) +

2 (u, T ) 3 (u, T )
2 (u, T )

2
2
6

A conjecture is that the orders of magnitude of 2 (u, T ) and 3 (u, T ) are considerably smaller than those of
1 (u, T ) and 2 (u, T ). Admitting this conjecture, Proposition 2.4 implies that for T small enough

9/2
4 22
T 2
3 3T
P X u = (u) +
(u)

2 9/2 (2 6 2 )
2
4
2

4
u
4 22

u5

1+O

1
u

which is Piterbargs theorem with a better remainder ([15], Th. 3.1, p. 703). Piterbargs theorem is, as far as we
know, the most precise expansion of the distribution of the maximum of smooth Gaussian processes. Moreover,
very tedious calculations would give extra terms of the Taylor expansion.

References
[1] M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables.
Dover, New York (1972).
[2] R.J. Adler, An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes, IMS, Hayward, Ca
(1990).
[3] J.-M. Azas and M. Wschebor, Une formule pour calculer la distribution du maximum dun processus stochastique. C.R. Acad.
Sci. Paris Ser. I Math. 324 (1997) 225-230.
[4] J-M. Azas and M. Wschebor, The Distribution of the Maximum of a Stochastic Process and the Rice Method, submitted.
[5] C. Cierco, Probl`
emes statistiques li
es a
` la d
etection et a
` la localisation dun g`
ene `
a effet quantitatif. PHD dissertation.
University of Toulouse, France (1996).
[6] C. Cierco and J.-M. Azas, Testing for Quantitative Gene Detection in Dense Map, submitted.
[7] H. Cram
er and M.R. Leadbetter, Stationary and Related Stochastic Processes, J. Wiley & Sons, New-York (1967).
[8] D. Dacunha-Castelle and E. Gassiat, Testing in locally conic models, and application to mixture models. ESAIM: Probab.
Statist. 1 (1997) 285-317.
[9] R.B. Davies, Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64 (1977) 247-254.
[10] J. Ghosh and P. Sen, On the asymptotic performance of the log-likelihood ratio statistic for the mixture model and related
results, in Proc. of the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer, Le Cam L.M. and Olshen R.A., Eds.
(1985).

122

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE

[11] M.F. Kratz and H. Rootz


en, On the rate of convergence for extreme of mean square differentiable stationary normal processes.
J. Appl. Prob. 34 (1997) 908-923.
[12] M.R. Leadbetter, G. Lindgren and H. Rootz
en, Extremes and Related Properties of Random Sequences and Processes. SpringerVerlag, New-York (1983).
[13] R.N. Miroshin, Rice series in the theory of random functions. Vestnik Leningrad Univ. Math. 1 (1974) 143-155.
[14] M.B. Monagan, et al. Maple V Programming guide. Springer (1998).
[15] V.I. Piterbarg, Comparison of distribution functions of maxima of Gaussian processes. Theory Probab. Appl. 26 (1981) 687-705.
[16] V.I. Piterbarg, Large deviations of random processes close to gaussian ones. Theory Probab. Appl. 27 (1982) 504-524.
[17] V.I. Piterbarg, Asymptotic Methods in the Theory of Gaussian Processes and Fields. American Mathematical Society. Providence, Rhode Island (1996).
[18] S.O. Rice, Mathematical Analysis of Random Noise. Bell System Tech. J. 23 (1944) 282-332; 24 (1945) 45-156.
[19] SPLUS, Statistical Sciences, S-Plus Programmers Manual, Version 3.2, Seattle: StatSci, a division of MathSoft, Inc. (1993).
[20] J. Sun, Significance levels in exploratory projection pursuit. Biometrika 78 (1991) 759-769.
[21] M. Wschebor, Surfaces aleatoires. Mesure geometrique des ensembles de niveau. Springer-Verlag, New-York, Lecture Notes in
Mathematics 1147 (1985).

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

TAYLOR EXPANSIONS BY MAPLE

GENERAL FORMULAE
>

phi:=t->exp(-t*t/2)/sqrt(2*pi);
2

e(1/2 t )

2
We introduce mu4=lambda4-lambda22 and mu6= lambda2*lambda6-lambda4^2
to make the outputs clearer.
>
assume(t>0);
>
assume(lambda2 > 0);
>
assume(mu4 > 0);
>
assume(mu6>0);
>
interface(showassumed=2);
>
Order:=12;
:= t

>

Order := 12
r:=t->1-lambda2*t^2/2!+lambda4*t^4/4!-lambda6*t^6/6!+lambda8*t^8/8!;
1
1
1
1
2 t2 +
4 t4
6 t6 +
8 t8
2
24
720
40320
siderels:= {lambda4=mu4+lambda2^2,lambda2*lambda6-lambda4^2=mu6}:
I_r2:=t->1-r(t)*r(t);
r := t 1

>
>

I r2 := t 1 r(t)2
>

simplify(simplify(series(I_r2(t),t=0,8),siderels));

>

1
1
1
1
1
2 t2 + ( 22
4) t4 + (
6 +
2 4 +
23 ) t6 + O(t8 )
3
12
360
24
24
with assumptions on t, 2 and 4
rp:=t->diff(r(t),t);
rp := t diff(r(t), t)

>

eval(rp(t));
1
1
1
4 t3
6 t5 +
8 t7
6
120
5040
with assumptions on 2 and t

2 t +
>

rs:=t->diff(r(t),t$2);
rs := t

>

2
r(t)
t2

eval(rs(t));
1
1
1
4 t2
6 t4 +
8 t6
2
24
720
with assumptions on 2 and t

2 +

123

124

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE


>

mu:=t->-u*rp(t)/(1+r(t));
:= t

>

u rp(t)
1 + r(t)

sig2:=t->lambda2-rp(t)*rp(t)/I_r2(t);
sig2 := t 2

>

rp(t)2
I r2(t)

simplify(taylor(sig2(t),t=0,8),siderels);
1
1 6 22 4 3 42 2 6 4
4 t2 +
t + O(t6 )
4
144
2
with assumptions on t, 4, 2 and 6

>

sigma:=t->sqrt(sig2(t));
:= t

>

simplify(taylor(sigma(t),t=0,6),siderels);
1
2

>

sig2(t)

1 6 22 4 3 42 2 6 3

t + O(t5 )
144
4 2
with assumptions on t, 4, 2 and 6

4 t +

b:=t->mu(t)/sigma(t);
b := t

>

(t)
(t)

simplify(taylor(b(t),t=0,6),siderels);
u 2
1
1 u 6
+ ( u 4 +
) t2 + O(t4 )
8
36 4(3/2)
4
with assumptions on 2, 4, t and 6

>

sig2rho:=t->-rs(t)-r(t)*rp(t)*rp(t)/I_r2(t);
sig2rho := t rs(t)

>

r(t) rp(t)2
I r2(t)

simplify(taylor(sig2rho(t),t=0,8),siderels);
1
1 6 22 4 + 3 42 + 4 6 4
4 t2 +
t + O(t6 )
4
144
2
with assumptions on t, 4, 2 and 6

>

rho:=t->sig2rho(t)/sig2(t);
:= t

>

sig2rho(t)
sig2(t)

simplify(taylor(rho(t),t=0,8),siderels);
1 6 2
t + O(t4 )
18 2 4
with assumptions on t, 6, 2 and 4

1 +

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS

PROOF OF PROPOSITION 2.3


>

k2:=t->(1+rho(t))/(1-rho(t));
k2 := t

>

1 + (t)
1 (t)

sk2:=simplify(taylor(k2(t),t=0),siderels);
1
1 6 2
t +
(3 26 4 + 9 24 42 + 9 22 43 2 6 22 4 3 8 22 4
36 2 4
2160
1
+ 3 44 + 13 6 42 + 5 62 ) (22 42 )t4 +
(147 28 42
907200
+ 175 6 26 4 273 26 43 + 63 24 44 + 196 6 24 42 + 120 8 24 42
+ 357 22 45 + 707 6 22 43 195 8 22 43 175 8 22 6 4 + 168 46

sk2 :=

+ 518 62 42 + 686 6 44 + 175 63) (23 43 )t6 + O(t8 )


with assumptions on t, 6, 2 and 4
>

k:=t->taylor(sqrt(sk2),t=0);

k := t taylor( sk2 , t = 0)

>

simplify(taylor(k(t),t=0,3),siderels);
1
6

>

6
t + O(t3 )
2 4
with assumptions on t, 6, 2 and 4

sqrtI_rho2:=t->k(t)*(1-rho(t));
sqrtI rho2 := t k(t) (1 (t))

>

T1:=t->sig2(t)*sqrtI_rho2(t)*phi(b(t))*phi(k(t)*b(t));
T1 := t sig2(t) sqrtI rho2(t) (b(t)) (k(t) b(t))

>

simplify(simplify(series(T1(t),t=0,6),siderels),power);
1
24

u2 22
6 4 e(1/2 4 ) 3
1

t
((5 62 22 u2 + 3 22 42 8 3 26 42 9 24 43
2880
2
9 22 44 15 6 22 42 u2 18 6 22 42 3 45 + 5 62 4 3 6 43 )

e(1/2

u2 22
4

) ( 6 4(3/2) 2(3/2) )t5 + O(t7 )

with assumptions on t, 6, 4 and 2


>
>

T2 := t->2*sig2(t)*(rho(t)-(b(t))^2)*(arctan(k(t))/(2*pi)
-k(t)/sqrt(2*pi)*(phi(0)-phi(b(t))-k(t)^2/6*(2*phi(0)-((b(t))^2+2)*phi(b(t)))));
T2 := t 2sig2(t) ((t) b(t)2 )

1
2
2
k(t)
((0)

(b(t))

k(t)
(2
(0)

(b(t)
+
2)
(b(t))))

1 arctan(k(t))
6

125

126

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE


>

simplify(simplify(series(T2(t),t=0,6),siderels),power);
1

24

u2 22
6 (u2 22 + 4) e(1/2 4 ) 3
t + O(t5 )

4 2
with assumptions on t, 6, 2 and 4

>

T3:=t->(2*sig2(t)*(k(t)*b(t)^2))/sqrt(2*pi)*(1-(k(t)*b(t))^2/6)*phi(b(t));

>

1
sig2(t) k(t) b(t)2 (1 k(t)2 b(t)2 ) (b(t))
6
T3 := t 2
2
simplify(simplify(series(T3(t),t=0,6),siderels),power);
u2 22
1
1 e(1/2 4 ) 6 2(3/2) u2 3

t
2 u2 (27 8 22 42 + 35 62 22 u2
24
25920
4
27 26 42 81 24 43 81 22 44 162 6 22 42 135 6 22 42 u2

27 45 45 62 4 + 243 6 43)e(1/2

u2 22
4

( 6 4(5/2) )t5 + O(t7 )

with assumptions on t, 2, 4 and 6


>

A:=t->((phi(u/sqrt((1+r(t)))))^2/sqrt(I_r2(t)))*(T1(t)+T2(t)+T3(t));
(
A := t

>

u
)2 (T1(t) + T2(t) + T3(t))
1 + r(t)
I r2(t)

simplify(simplify(series(A(t),t=0,6),siderels),power);
O(t4 )
with assumptions on t

PROOF OF THE EQUIVALENT OF NU1


>

Cphib:=t->phi(t)/t-phi(t)/t^3;
Cphib := t

>

sq:=t->sqrt((1-r(t))/(1+r(t)));
sq := t

>

(t) (t)
3
t
t

1 r(t)
1 + r(t)

simplify(simplify(series(sq(t),t=0,4),siderels),power);
1 2 22 + 4 3
1

2 t
t + O(t5 )
2
48
2
with assumptions on t, 2 and 4

>

nsigma:=t->sigma(t)/sqrt(lambda2);
(t)
nsigma := t
2

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS


>
>

A1:=t->(1/sqrt(2*pi))*phi(u)*phi(sq(t)*u/nsigma(t))*((nsigma(t)/(sq(t)*u)
-(nsigma(t)/(sq(t)*u))^3)*sqrt(lambda2)+(1/b(t)-1/b(t)^3)*rp(t)/sqrt(I_r2(t)));

1
1
(

)
rp(t)
3
nsigma(t) nsigma(t)

sq(t) u
b(t) b(t)3

(u) (
) 2 +
)
(

3
3
nsigma(t)
sq(t) u
sq(t) u
I r2(t)

2
SA1:=simplify(simplify(series(A1(t),t=0,6),siderels),power);
A1 := t

>

u2 (4+22 )
)
4
1 2 e(1/2
4(5/2) 2
SA1 :=
t + O(t4 )
16
2(7/2) (3/2) u3
with assumptions on t, 4 and 2

Expansion of the exponent for using Lemma 4.3 (ii), p=2


>

L2:= t->(1-r(t))/((1+r(t))*nsigma(t)^2)-(lambda4-mu4)/mu4;

>

4 4
1 r(t)

(1 + r(t)) nsigma(t)2
4
SL2:=simplify(simplify(series(L2(t),t=0,6),siderels),power);
L2 := t

1 2 6 2
t + O(t4 )
18 42
with assumptions on t, 2, 6 and 4
We define c as the square root of the coefficient of t2
c:=sqrt(op(1,SL2))

1 2 2 6
c :=
6
4
with assumptions on 2, 6 and 4
>
nu1b:=(sqrt(2*pi))*op(1,SA1)*(c^(-3)*u^(-3)/2);
SL2 :=

u2 (4+22 )
)
4
27 2 e(1/2
4(11/2)
nu1b :=
8
2(7/2) u6 (2 6)(3/2)
with assumptions on 4, 2 and 6
PROOF OF THE EQUIVALENT OF NU2
>

m1:=t->(1+rho(t))*2*sigma(t)^2/(pi*b(t)*sqrt(I_r2(t)));
m1 := t 2

>

(1 + (t)) (t)2
b(t) I r2(t)

sm1:=simplify(simplify(series(m1(t),t=0,8),siderels),power);

1 6 4 3
sm1 :=
t + O(t5 )
36 2(5/2) u
with assumptions on t, 6, 4 and 2

127

128

J.-M. AZAS, C. CIERCO-AYROLLES AND A. CROQUETTE


>

m2:=t->(-4/pi)*sigma(t)^2*b(t)^(-3)/sqrt(I_r2(t));

>

(t)2
b(t)3 I r2(t)
sm2:=simplify(simplify(series(m2(t),t=0,6),siderels),power);

>

4(5/2)
t + O(t3 )
u3 2(7/2)
with assumptions on t, 4 and 2
m3:=t->-(1+rho(t))*sigma(t)^2*k(t)/(pi*sqrt((2*pi)*I_r2(t)));

m2 := t 4

sm2 :=

(1 + (t)) (t)2 k(t)


2 I r2(t)
sm3:=simplify(simplify(series(m3(t),t=0,6),siderels),power);

1
6(3/2) 2

sm3 :=
t4 + O(t6 )
864 22 4 (3/2)
with assumptions on t, 6, 2 and 4
m4:=t->(2/pi)*sigma(t)^2*k(t)^3/sqrt(2*pi*I_r2(t));
m3 := t

>

>

m4 := t 2

>

>

(t)2 k(t)3

2 I r2(t)
sm4:=simplify(simplify(series(m4(t),t=0,6),siderels),power);

1
6(3/2) 2

sm4 :=
t4 + O(t6 )
864 22 4 (3/2)
with assumptions on t, 6, 2 and 4
m5:=t->(4/pi)*sigma(t)^2*k(t)*b(t)^(-2)/sqrt(2*pi*I_r2(t));

>

(t)2 k(t)
b(t)2 2 I r2(t)
sm5:=simplify(simplify(series(m5(t),t=0,6),siderels),power);

1 6 4(3/2) 2 2
sm5 :=
t + O(t4 )
12 23 (3/2) u2
with assumptions on t, 6, 4 and 2
l12:=t-> (b(t)/u)^2 + 2/(1+r(t))-lambda4/mu4;

>

b(t)2
1
4
+2

u2
1 + r(t) 4
simplify(simplify(series(l12(t),t=0,8),siderels),power);

m5 := t 4

>

l12 := t

1 2 6 2
t + O(t4 )
18 42
with assumptions on t, 2, 6 and 4
>

l22:=t-> ((b(t)/u)^2 )*(1+k(t)^2)+2/(1+r(t))-lambda4/mu4;


l22 := t

b(t)2 (1 + k(t)2 )
1
4
+2

u2
1 + r(t) 4

MAXIMUM OF A SMOOTH STATIONARY GAUSSIAN PROCESS


>

simplify(simplify(series(l22(t),t=0,8),siderels),power);
1 2 6 2
t + O(t4 )
12 42
with assumptions on t, 2, 6 and 4

>

simplify(int( cos(t)^3, t=0..arctan(sqrt(2)/2)),power);


8
3
27

>

opm1:=op(1,sm1);

1 6 4
36 2(5/2) u
with assumptions on 6, 4 and 2

opm1 :=
>

opm2:=op(1,sm2);
4(5/2)
u3 2(7/2)
with assumptions on 4 and 2

opm2 :=
>

>

>

>

>

>

opm5:=op(1,sm5);

1 6 4(3/2) 2
opm5 :=
12 23 (3/2) u2
with assumptions on 6, 4 and 2
c1:=144*sqrt(3)*mu4^4*u^(-4)/(sqrt(2*pi)*lambda2^2*mu6^2);

3 44 2
c1 := 72 4
u 22 62
with assumptions on 4, 2 and 6
c2:=3*sqrt(3)*mu4^2*u^(-2)/(sqrt(2*pi)*lambda2*mu6);

3
3 42 2

c2 :=
2 u2 2 6
with assumptions on 4, 2 and 6
c5:=12*sqrt(3)*mu4^3*u^(-3)/(lambda2^(3/2)*mu6^(3/2));

3 43
c5 := 12 3 (3/2) (3/2)
u 2
6
with assumptions on 4, 2 and 6
B:=opm1*c1+opm2*c2+opm5*c5;

3 4(9/2) 3 2
B :=
2 (3/2) u5 2(9/2) 6
with assumptions on 4, 2 and 6
simplify(B);

3 4(9/2) 3 2
2 (3/2) u5 2(9/2) 6
with assumptions on 4, 2 and 6

129

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,

Statistique spatiale et
Statistique simultanee,
maximum de processus
Rennes 24 mars 2009

Jean-Marc Azas
IMT,Toulouse
Laboratoire de Statistique et Probabilites,

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

1 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Exemples

Motivation
Exemples
Un petit exemple en dimension 1

Maximum dun processus sur la droite

Champs aleatoires

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

2 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Exemples

` signal + bruit
modele

`
En statistique spatiale on est souvent amene a` considerer
le modele
signal + bruit gaussien.
par
Des exemples de telles situations sont donnees

lagriculture de precision
les neurosciences
`

les problemes
de modelisation
des vagues

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

3 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Exemples

` signal + bruit
modele

`
En statistique spatiale on est souvent amene a` considerer
le modele
signal + bruit gaussien.
par
Des exemples de telles situations sont donnees

lagriculture de precision
les neurosciences
`

les problemes
de modelisation
des vagues

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

3 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Exemples

` signal + bruit
modele

`
En statistique spatiale on est souvent amene a` considerer
le modele
signal + bruit gaussien.
par
Des exemples de telles situations sont donnees

lagriculture de precision
les neurosciences
`

les problemes
de modelisation
des vagues

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

3 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Exemples

Agriculture de precision
Mesure du rendement par moissonneuse GPS

` que lon a mesure la variable dinter


et
sur une grille si
On considere

fine que lon peut considerer


quon lobserve sur R2 .
Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

4 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Exemples

Neuroscience
` 2 dimensionnel ou 3 dimensionnel pour le
On utilise un modele

`
cerveau et on desire
savoir si il existe une zone particulierement
par une activite donnee.

activee

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

source : Maureen CLERC

5 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Exemples

Spectre de houle
On mesure localement, en temps, le spectre de vagues et on veut

detecter
des instants de changement : les transition entre les etats
de mer.

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

6 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Un petit exemple en dimension 1

Motivation
Exemples
Un petit exemple en dimension 1

Maximum dun processus sur la droite

Champs aleatoires

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

7 / 23

Statistique spatiale et maximum de processus


Statistique simultanee,

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Motivation
Un petit exemple en dimension 1

Lynx
7000

6000

5000

4000

3000

2000

1000

20

40

60

80

100

120

` Mackenzie, Nord ouest du


Prises annuelles de lynx dans la riviere

Canada durant la periode


1821 - 1934, (Elton and Nicholson, 1942)

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

8 / 23

Statistique spatiale et maximum de processus


Statistique simultanee,

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Motivation
Un petit exemple en dimension 1

On passe en log et on centre.


3

4
0

20

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

40

60

80

100

120

9 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Un petit exemple en dimension 1

Test unidimensionnel
`
On fait les hypotheses
discutables suivantes
les observations sont gaussiennes

La serie
des erreurs est stationnaire et melangeante,
La

` proie
pseudo-periodicit
e est aleatoire,
due a` un modele

predateur.

La taille de la serie
114 est suffisante pour estimer la variance

avec une erreur negligeable


par
2 :=

1
n

Xi2

qui donne 1.6387 = 1.282 .

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

10 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Un petit exemple en dimension 1

Test unidimensionnel
`
On fait les hypotheses
discutables suivantes
les observations sont gaussiennes

La serie
des erreurs est stationnaire et melangeante,
La

` proie
pseudo-periodicit
e est aleatoire,
due a` un modele

predateur.

La taille de la serie
114 est suffisante pour estimer la variance

avec une erreur negligeable


par
2 :=

1
n

Xi2

qui donne 1.6387 = 1.282 .

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

10 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Un petit exemple en dimension 1

Test unidimensionnel
`
On fait les hypotheses
discutables suivantes
les observations sont gaussiennes

La serie
des erreurs est stationnaire et melangeante,
La

` proie
pseudo-periodicit
e est aleatoire,
due a` un modele

predateur.

La taille de la serie
114 est suffisante pour estimer la variance

avec une erreur negligeable


par
2 :=

1
n

Xi2

qui donne 1.6387 = 1.282 .

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

10 / 23

Statistique spatiale et maximum de processus


Statistique simultanee,

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Motivation
Un petit exemple en dimension 1

Yi
suit approximativement

reduite)

`
un loi normale standard (centree
Dou la regle
de test
Sous lhypothese nulle dabsence de signal

si |

Yi
| > 1.96

e.

on declare
quil y a un signal au point i consider
3

4
0

20

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

40

60

80

100

120

11 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Un petit exemple en dimension 1

Risque simultane

eralement

En statistique, pour etre


schematique
on teste gen
a` 5%,
probabilite de fausse alerte de 5 %.
Un seul test OK ! !
Ici 114 tests avec une proba de fausse alerte de 5 % pour chacun
dentre eux.
`
La proba de fausse alerte totale mais certainement tres
importante ! ! !

a pour but de controler


La statistique simultanee
la probabilite
globale de fausse alerte.

La methode
le plus rudimentaire (mais pas toujours la plus mauvaise)

est la methode
de Bonferroni qui consiste a` faire chaque test
ementaire

el
au niveau = /114 dans notre cas
qnorm(1 0.025/114) = 3.5157
` multiplication par lecart

et apres
type 1.28 donne 4.5. Ce qui ne

detecte
rien Peut on faire mieux ? ?
Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

12 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Motivation
Un petit exemple en dimension 1

Le maximum de la valeur une serie


gaussienne
` gen
erale

De maniere
la distribution du maximum est inconnue

meme
dans les cas les plus simples :

marche aleatoires,
processus auto-regressif
dordre 1. On peut faire

des simulations mais cest souvent long et peu precis


.

Un methode
est decrire
la densite du vecteur gaussien
(2)n/2

1
x 1 x
exp
.
det()
2

et de lintegrer
sur un hyper-rectangle [u, u]n . Cela se fait

numeriquement
par des methodes
fort complexes que je ne vais pas

decrire
pour des tailles jusqua` 1000
un niveau de
On trouve en utilisant la matrice de variance estimee,
signification de 0.4978 ce qui est clairement non significatif.
Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

13 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Maximum dun processus sur la droite

Motivation
Exemples
Un petit exemple en dimension 1

Maximum dun processus sur la droite

Champs aleatoires

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

14 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Maximum dun processus sur la droite

On suppose

Que lon a tant de point que lon peut considerer


que lon
`

observe entierement
la fonction de la variable reelle
X(t).

e est regulier

Que le phenom
ene
aleatoire
consider
(derivable)
` le maximum M (sans valeur absolue pour
on considere
simplifier) sur un intervalle borne par exemple [0, T].

sur les inegalit

on utilise la methode
de Rice qui est basee
es
basiques suivantes
P{M > u} P{X(0) > u} + P{Uu > 0} P{X(0) > u} + E(Uu )

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

15 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

Statistique spatiale et maximum de processus


Statistique simultanee,
Maximum dun processus sur la droite

16 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Maximum dun processus sur la droite

Sur les franchissements dun processus la seule chose que lon


sache compter : moments
T

E (X )+ (t) X(t) = u pX(t) (u)dt

E(Uu ) =
0

` simple dans le cas stationnaire, centre,

qui a une version tres


variance 1.
2
(u)
E(Uu ) = T
2
ee.

ou 2 est le second moment spectral : la variance de la deriv

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

17 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,
Maximum dun processus sur la droite

Une precision
super-exponentielle

`
Sous certaines hypotheses
P{MT > u} = 1 (u) + T

2
(u) + O (u(1 + ))
2

Montre la qualite de la borne, montre egalement


que la forme exacte
de la covariance importe peu.

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

18 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,

Champs aleatoires

Motivation
Exemples
Un petit exemple en dimension 1

Maximum dun processus sur la droite

Champs aleatoires

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

19 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,

Champs aleatoires

`
Retour aux problemes
de lintroduction
` une fonction aleatoire

On considere
sur R2 (pour simplifier) : un

champ aleatoire.
Le nombre de franchissement courbe (de
niveau) : ne permet pas de construire de bornes.

Il faut remplacer par un autre caracteristique


geom
etrique,
par
exemple le nombre de maxima locaux au dessus dun certain
niveau.

En negligeant
les effets bords
P{M > u}

P{ maximum local au dessus de u}

E( #(maxima locaux au dessus de u))

edemment

qui peut etre


calcule par un formule de Rice comme prec

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

20 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,

Champs aleatoires

`
Theoreme

Considerons
le carre [O, T]2 , alors si le champ est centre et de

variance 1, sous certaines conditions de regularit


e :
P{M > u} =

u(u) 2 1/2 (u)


T || + T
2
2

11 +

22

+1 (u) + O (u(1 + ))
ou` est la matrice de variance-covariance du gradient.

Resultats
du a divers auteurs sous diverses conditions. Piterbarg
(1981), Taylor Takemura Adler (2005) , Azas Wschebor (2008).

On peut meme
facturer le

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

21 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,

Champs aleatoires

Conclusion

debut

En supposant que dans les exemples present


es
le bruit veifie
`

nous sommes capables de calculer


nos hypotheses
de regularit
e,

une valeur quil ne devrait pas depasser


cest a` dire la valeur
critique du test simultane

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

22 / 23

Motivation
Maximum dun processus sur la droite

Champs aleatoires

Statistique spatiale et maximum de processus


Statistique simultanee,

Champs aleatoires

MERCI

Jean-Marc
IMT,Toulouse )
Azas (Laboratoire de Statistique et Probabilites,

23 / 23






 



 
















The Distribution of the Maximum of a Gaussian


Process: Rice Method Revisited.
Jean-Marc Azas , azais@cict.fr
Mario Wschebor , wscheb@fcien.edu.uy
December 21, 2000

Abstract
This paper deals with the problem of obtaining methods to compute the
distribution of the maximum of a one-parameter stochastic process on a fixed
interval, mainly in the Gaussian case. The main point is the relationship
between the values of the maximum and crossings of the paths, via the socalled Rices formulae for the factorial moments of crossings.
We prove that for some general classes of Gaussian process the so-called
Rice series is convergent and can be used for to compute the distribution
of the maximum. It turns out that the formulae are adapted to the numerical
computation of this distribution and becomes more efficient than other numerical methods, namely simulation of the paths or standard bounds on the
tails of the distribution.
We have included some relevant numerical examples to illustrate the power
of the method.

Mathematics Subject Classification (1991): 60Gxx, 60E05, 60G15, 65U05.


Key words: Extreme Values, Distribution of the Maximum, Crossings of a Level,
Rices Formulae.
Short Title: Distribution of the Maximum.

Laboratoire de Statistique et Probabilites. UMR-CNRS C55830 Universite Paul Sabatier. 118,


route de Narbonne. 31062 Toulouse Cedex 4. France.

Centro de Matematica. Facultad de Ciencias. Universidad de la Rep


ublica. Calle Igua 4225.
11400 Montevideo. Uruguay.

Introduction

Let X = {Xt : t IR} be a stochastic process with real values and continuous paths
defined on a probability space (, , P ) and MT := max{Xt : t [0, T ]}.
The computation of the distribution function of the random variable MT
F (T, u) := P (MT u), u IR
by means of a closed formula based upon natural parameters of the process X is
known only for a very restricted number of stochastic processes (and trivial functions
of them): the Brownian Motion {Wt : t 0}; the Brownian Bridge, Bt := Wt
1
tW1 (0 t 1); Bt 0 Bs ds (Darling, 1983); the Brownian Motion with a linear
t
drift (Shepp, 1979); 0 Ws ds + yt (McKean, 1963, Goldman, 1971 Lachal, 1991); the
stationary Gaussian processes with covariance equal to:
1. r(t) = e|t| (Ornstein-Uhlenbeck process, DeLong, 1981),
2. r(t) = (1 |t|)+ , T a positive integer (Slepian process, Slepian 1961, Shepp,
1971),
3. r(t) even, periodic with with period 2, r(t) = 1|t| for 0 |t| 1, 0 < 2,
(Shepp and Slepian 1976),
4. r(t) = 1 |t|/1 /1 , |t| < 1 /, 0 < 1/2, T = (1 )/
(Cressie 1980),
5. r(t) = cos t.
Given the interest in F (T, u) for a large diversity of theoretical and technical
purposes an extensive literature has been developed of which we give a sample of
references pointing to various directions:
1. Obtaining inequalities for F (T, u) : Slepian (1962); Landau & Shepp (1970);
Marcus & Shepp (1972); Fernique (1974); Borell (1975); Ledoux (1996); Talagrand (1996) and references therein. A general review of a certain number of
classical results is in Adler (1990, 2000).
2. Describing the behaviour of F (T, u) under various asymptotics : Qualls and
Watanabe (1973); Piterbarg (1981, 1996); Leadbetter, Lingren and Rootzen
(1983); Berman (1985a, b, 1992); Talagrand (1988); Berman & Kono (1992) ;
Sun (1993); Wschebor (2000); Azas, Bardet and Wschebor (2000).
2

3. Studying the regularity of the distribution of MT : Ylvisaker (1968); Tsirelson


(1975); Weber (1985); Lifshits (1995); Diebolt and Posse (1996); Azas and
Wschebor (2000) and references therein.
Generally speaking, even though important results are associated with problems
1) 2) and 3) they only give limited answers to the computation of F (T, u) for fixed
T and u. As a consequence, Monte-Carlo methods based on the simulation of the
paths of the continuous parameter process X are widely used, even though they
have well-known difficulties : a) they are poor for theoretical purposes ; b) they are
expensive from the point of view of the number of elementary computations needed
to assure that the error is below a given bound and c) they always depend on the
quality of the random number generator being employed.
The approach in this paper is based upon expressing F (T, u) by means of a
series (The Rice series) whose terms contain the factorial moments of the number
of upcrossings. The underlying ideas have been known since a long time (Rice
(1944-1945), Slepian (1962), Miroshin (1974)).
The main new result in this paper is that we have been able to prove the convergence of that series in a general framework instead of considering only some
particular processes. This provides a method that can be widely applied.
A typical application is Theorem2.2 that states that if a stationary Gaussian
process has a covariance which has a Taylor expansion at zero that is absolutely
convergent at t = 2T , then F (T, u) can be computed by means of the Rice series.
On the other hand, even though Theorems 2.1 and 2.3 below do not refer specifically
to Gaussian processes, in practice, at present we are only able to apply them to the
numerical computation of F (T, u) only in Gaussian cases.
In the section Numerical examples we include a comparison between the complexities of the computations of F (T, u) using the Rice series versus Monte-Carlo
method, in the relevant case of a general class of stationary Gaussian processes. It
shows that the use of Rice series is a priori better. More important is the fact that
the Rice series is self-controlling for the numerical errors. This implies that the a
posteriori number of computations may be much smaller than the one required by
simulation. In fact, in relevant cases for standard bounds for the error, the actual
computation is performed with a few terms in the Rice series.
As examples we give tables for F (T, u) for a number of Gaussian processes.
When the length of the interval T increases, one needs an increasing mumber
of terms in the Rices series not to surpass a given bound for the approximation
error. For small values of T an large values of the level u one can use the so3

called Davies bound(1977), or more accurately, the first term in the Rice series
to obtain approximations for F (T, u). But as T increases, for moderate values of u
the Davies bound is far from the true value and one requires the computation of
the succesive terms. The numerical results are shown in the case of four Gaussian
stationary processes for which no closed formula is known.
An asymptotic approximation of F (T, u) as u + recently obtained by Azas,
Bardet and Wschebor (2000). It extends to any T a previous result by Piterbarg
(1981) for sufficiently small T .
One of the key points in the computation is the numerical approximation of the
factorial moments of upcrossings by means of Rice integral formulae. For that purpose, the main difficulty is the precise description of the behaviour of the integrands
appearing in these formulae near the diagonal, which is again an old subject that is
interesting on its own - see Belayeiv (1966), Cuzick (1975) - and remains widely open.
We have included in the Section Computation of Moments some new results, that
give partial answers and are helpful to improve the numerical methods.
The extension to processes with non-smooth trajectories can be done by smoothing the paths by means of a deterministic device, applying the previous methods
to the regularized process and estimating the error as a function of the smoothing
width. We have not included these type of results here since for the time being they
do not appear to be of practical use.
The Note (Azas & Wschebor 1997) contains a part of the results in the present
paper, without proofs.

Notations
Let f : I IR be a function defined on the interval I of the real numbers,
Cu (f ; I) := {t I : f (t) = u}
Nu (f ; I) := (Cu (f ; I))
denote respectively the set of roots of the equation f (t) = u on the interval I and the
number of these roots, with the convention Nu (f ; I) = + if the set Cu is infinite.
Nu (f ; I) is called the number of crossings of f with the level u on the interval
I. In what follows, I will be the interval [0, T ] if it is not stated otherwise.
In the same way, if f is a differentiable function the number of upcrossings of
f is defined by means of
Uu (f ; I) := ({t I : f (t) = u, f (t) > 0}).
4

f p denotes the norm of f in Lp (I, ), 1 p +, the Lebesgue measure. The


joint density of the finite set of random variables X1 , ...Xn at the point (x1 , ...xn )
will be denoted pX1 ,...,Xn (x1 , ...xn ) whenever it exists. (t) := (2)1/2 exp(t2 /2) is
t
the density of the standard normal distribution, (t) := (u)du its distribution
function. |I| is the length of I. x+ = sup {x, 0}.
If A is a matrix, AT denotes its transposed, and if A is a square matrix, det(A)
its determinant. V ar() is the variance matrix of the (finite dimensional) random
vector and Cov(, ) the covariance of and .
For m and k, positive integers, k m, define the factorial kth power of m by
m[k] := m(m 1)...(m k + 1)
For other real values of m and k we put m[k] := 0.
If k is an integer k 1, the diagonal of I k is the set:
Dk (I) := {(t1 , ..., tk ) I k , tj = th for some pair (j, h), j = h}.
f (m) is the m-th derivative of the function f . jh = 0 or 1 according as j = h or
j = h.

The distribution of the maximum and the Rice


series

We introduce the notations


m := E((Uu )[m] 1I{X0 u} ); m := E((Uu )[m] ) (m = 1, 2, ...)
where Uu = Uu (X, [0, T ]). m is the factorial moment of the number of upcrossings
of the process X with the level u on the interval [0, T ], starting below u at t = 0.
The Rice formula to compute m , whenever it holds is:
u

m =

dt1 ...dtm

[0,T ]

dxE (Xt1 )+ ...(Xtm )+ /X0 = x, Xt1 = ... = Xtm = u

pX0 ,Xt1 ,...,Xtm (x, u, ..., u) =

(1)

[0,T ]m

dt1 ...dtm

dx

pX0 ,Xt1 ,...,Xtm ,Xt

[0,+)m

x1 ...xm

,...,Xtm (x, u, ..., u, x1 , ...xm )dx1 ...dxm .

(2)

(References for conditions for this formula to hold true that suffice for our presente purposes and also for proofs can be found, for example, in Marcus (1977) and
in Wschebor (1985).
This section contains two main results. The first is Theorem 2.1 that requires
the process to have C paths and contains a general condition enabling to compute
F (T, u) as the sum of a series. The second is Theorem 2.2 that illustrates the same
situation for Gaussian stationary processes from conditions on the the covariance.
As for Theorem 2.3, it contains upper and lower bounds on F (T, u) for processes
with C k paths verifying some additional conditions.
Theorem 2.1 Assume that a.s. the paths of the stochastic process X are of class
C and that the density pXT /2 is bounded by some constant D.
(i) If there exists a sequence of positive numbers {ck }k=1,2,... such that:
k := P

X (2k1)

ck .T 2k1 +

22k1

Dck
= o 2k (k )
(2k 1)!

(3)

then :

(1)m+1

1 F (T, u) = P (X0 > u) +


m=1

m
m!

(4)

(ii) In formula (4) the error when one replaces the infinite sum by its m0 -th

partial sum is bounded by m


where:
0 +1

m
:= sup 2k+1 k .
km

We will call the series in the right-hand term of (4) the Rice Series.
For the proof we will assume, with no loss of generality that T = 1.
We start with the following lemma on the Cauchy remainder for polynomial
interpolation (Davis 1975, Th. 3.1.1 ).
Lemma 2.1 a) Let I be an interval in the real line, f : I IR a function of
class C k , k a positive integer, t1 , ..., tk , k points in I and let P (t) be the - unique
- interpolation polynomial of degree k 1 such that for i = 1, ..., k: f (ti ) = P (ti ),
taking into account possible multiplicities.

Then, for t I :
1
(t t1 )....(t tk )f (k) ()
k!

f (t) P (t) =
where

min(t1 , ..., tk , t) max(t1 , ..., tk , t).


b) If f is of class C k and has k zeros in I (taking into account possible multiplicities), then:
|f (1/2)|

1
f (k)
k!2k

The next combinatorial lemma plays the central role in what follows. A proof is
given in Lindgren (1972).
Lemma 2.2 Let be a non-negative integer-valued random variable having finite
moments of all orders. Let k, m, M (k 0, m 1, M 1) be integers and denote
M

pk := P ( = k) ; m := E( [m] ) ; SM :=

(1)m+1
m=1

m
m!

Then
(i) For each M :

2M

pk S2M +1

pk

S2M

(5)

k=1

k=1

(ii) The sequence {SM ; M = 1, 2, ...} has a finite limit if and only if m /m! 0
as m , and in that case:

P ( 1) =

(1)m+1

pk =
m=1

k=1

m
.
m!

(6)

Remark. A by-product of Lemma 2.2 that will be used in the sequel is the following:
if in (6) one substitutes the infinite sum by the M partial sum, the absolute value
M +1 /((M + 1)!) of the first neglected term is an upper-bound for the error in the
computation of P ( 1).
7

Lemma 2.3 With the same notations as in Lemma 2.2 we have the equality:

E( [m] ) = m

(k 1)[m1] P ( k) (m = 1, 2, ...).
k=m

Proof: Check the identity


j1

[m]

(k)[m1]

=m
k=m1

for each pair of integers j, m. So,

E(

[m]

)=

[m]

P ( = j) =

j=m

(k 1)[m1] =

P ( = j)m
j=m

k=m

(k 1)[m1] P ( k).

=m

k=m

Lemma 2.4 Suppose that a.s. the paths of the process X belong to C and that
pX1/2 is bounded by the constant D. Then for any sequence {ck , k = 1, 2, ...} of
positive numbers, one has

[m]

E((Uu )

)m

(k 1)[m1] P

X (2k1)

ck +

k=m

22k1

Dck
,
(2k 1)!

(7)

Proof: Because of Lemma 2.3 it is enough to prove that P (Uu k) is bounded


by the expression in brackets in the right-hand member of (7). We have
P (Uu k) P ( X (2k1)

ck ) + P (Uu k, X (2k1)

< ck ).

Because of Rolles theorem:


{Uu k} {Nu (X; I) 2k 1},
Applying Lemma 2.1 to the function X(.) u and replacing in its statement k by
2k 1, we obtain:
{Uu k, X (2k1)

< ck } {|X1/2 u|
8

22k1

ck
}.
(2k 1)!

The remaining is plain.


Proof of Theorem 2.1 :
[m]
We use the notation m := E(Uu ) (m = 1, 2, ...).
Using Lemma 2.4 and the hypothesis we obtain:

1
(k+1)
k [m] m
2
= m 2(m+1)

m!
m! k=m
m!

1
1x

(m)

|x=1/2 = m

Since m m we can apply Lemma 2.2 to the random variable = Uu 1I{X0 u}

0.
and the result follows from m
Remarks
One can replace condition pXT /2 (x) D for all x by pXT /2 (x) D for x in some
neighbourhood of u. In this case, the statement of Theorem 2.1 holds if one adds in

(ii) that the error is bounded by m


for m0 large enough. The proof is similar.
0 +1
Also, one may substitute the one-dimensional density pXT /2 by pXt for some other
t (0, T ), introducing into the bounds the corresponding modifications.
The application of Theorem 2.1 requires an adequate choice of the sequence
{ck , k = 1, 2, ...} that depends on the available description of the process X. The
whole procedure will have some practical interest for the computation of P (M > u)

only if we get appropriate bounds for the quantities m


and the factorial moments m
can be actually computed by means of Rice formulae (or by some other procedure).
The next Theorem shows how this can be done in the case of a general class of
Gaussian stationary processes.
Theorem 2.2 Let X be Gaussian, centered and stationary, with covariance .
Assume that has a Taylor expansion at the origin that is absolutely convergent
at t = 2T. Then the conclusion of Theorem 2.1 holds true so that the Rice series
converges and F (T, u) can be computed by means of (4)
Proof. Again we assume with no loss of generality that T = 1 and that (0) = 1.
Note that the hypothesis implies that the spectral moments k exist and are
finite for every k = 0, 1, 2, ...
We will prove a stronger result, assuming the hypothesis:
H1 : 2k C1 (k!)2 .
9

It is easy to verify that if has a Taylor expansion at zero that is absolutely


convergent at t = 2, then H1 holds true. (In fact, both conditions are only slightly
different, since H1 implies that the Taylor expansion of at zero is absolutely
convergent in {|t| < 2}).
Let us check that the hypotheses of Theorem 2.1 hold true.
First, pX1/2 (x) D = (2)1/2 .
Second, let us show a sequence {ck } that satisfies (3). We have
P ( X (2k1)

(2k1)

ck ) P (|X0

| ck ) + 2P (Uck (X (2k1) , I) 1)
1/2

P (|Z| ck 4k2 ) + 2E(Uck (X (2k1) , I)), (8)


where Z is standard normal.
(2k1)
Note that {Xt
; t IR} is a Gaussian stationary centered process with co(4k2)
variance function
(.). So we can use Rice formula for the expectation of the
number of upcrossings of a stationary centered Gaussian process (see for example
Cramer & Leadbetter, 1967) to compute the second term in the right-hand member
of (8). Using the inequality 1 (x) (1/x)(x) valid for x > 0, one gets:
1/2

P( X

(2k1)

2 4k2
+ (1/)
ck

ck )

4k
4k2

1/2

exp

Choose
ck := (B1 k4k2 )1/2 if

ck := (4k )1/2 if

4k
B1 k
4k2

4k
> B1 k.
4k2

Using hypothesis H1 ), if B1 > 1 :


P ( X (2k1)

B1 k
2
1
+ (B1 k)1/2 e 2 .

ck )

Finally, choosing B1 := 4log(2):


k

2
1/2
(1 + 2(C1 + 1)k)22k

10

(k = 1, 2, ...),

c2k
24k2

(9)

so that (3) is satisfied. As a by product, note that

8
1/2
(1 + 2(C1 + 1)m)2m (m = 1, 2, ...).

(10)

Remarks
a) If one is willing to use Rice formulae to compute the factorial moments m , it
is enough to verify that the distribution of
Xt1 , ..., Xtk , Xt1 , ..., Xtk
is non-degenerate for any choice of k = 1, 2, ... (t1 , ..., tk ) I k \Dk (I). For Gaussian
stationary processes a sufficient condition for non-degeneracy is the spectral measure
not to be purely atomic (see Cramer and Leadbetter (1967) for a proof). The same
kind of argument permits to show that the conclusion remains if the spectral measure
is purely atomic and the set of its atoms has an acumulation point in IR. Sufficient
conditions for the finiteness of m are given also in Nualart & Wschebor (Lemma
1.2, 1991).
b) If instead of requiring the paths of the process X to be of class C , one relaxes
this condition up to a certain order of differentiability, one can still get upper and
lower bounds for P (M > u).
Theorem 2.3 Let X = {Xt : t I} be a real -valued stochastic process. Suppose
that pXt (x) is bounded for t I, x IR and that the paths of X are of class C p+1 .
Then
2K+1

if

(1)m+1

2K + 1 < p/2 : P (M > u) P (X0 > u) +


m=1

m
m!

and
2K

if

(1)m+1

2K < p/2 : P (M > u) P (X0 > u) +

m=1

m
.
m!

Note that all the moments in the above formulae are finite.
The proof is a straightforward application of Lemma 2.2 and Lemma 1.2 in
Nualart & Wschebor (1991).
When the level u is high, the results by Piterbag (1981, 1996), which were until
recently the sharpest known asymptotic bounds for the tail of the distribution of the
11

maximum on a fixed interval of general Gaussian stationary processes with regular


paths (for a refinement, see Azas, Bardet and Wschebor, 2000) can be deduced
from the foregoing arguments. Here, only the first term in the Rice series takes part
in the equivalent of P (M > u) as u +. More precisely, if 4 < , it is not
hard to prove that
u2 (1+)
2
(u) 1 (const)e 2 ,
2

2 (const)e

u2 (1+)
2

for a certain > 0. Lemma 3.2 implies that


u2 (1+)
2
(u) P (M > u) (const)e 2 ,
2

0 1 (u) +

(11)

which is Piterbargs result.

Computation of Moments

An efficient numerical computation of the factorial moments of crossings is associated to a fine description of the behaviour as the k-tuple (t1 , ..., tk ) approaches the
diagonal Dk (I), of the integrands
A+
t1 ,...,tk (u, ..., u) =

[0,+)m

x1 ...xm pXt1 ,...,Xtm ,Xt

,...,Xtm (u, ..., u, x1 , ...xm )dx1 ...dxm .

At1 ,...,tk (u) =

dx

[0,+)m

x1 ...xm pX0 ,Xt1 ,...,Xtm ,Xt

,...,Xtm (x, u, ..., u, x1 , ...xm )dx1 ...dxm .

that appear respectively in Rice formulae for the k th factorial moment of upcrossings or the k th factorial moment of upcrossings with the additional condition that
X0 u (see formula(2).
For example in Azas, Cierco and Croquette (1999) it is proved that if X is
Gaussian, stationary, centered and 8 < , then the integrand A+
s,t (u, u) in the
12

computation of 2 - the second factorial moment of the number of upcrossings satisfies:


3/2

A+
s,t (u, u)

1
(2 6 24 )
1 4
exp

u2
1296 (4 22 )1/2 2 22
2 4 22

(t s)4 ,

(12)

as t s 0.
(12) can be extended to non-stationary Gaussian processes obtaining an equivalence of the form:
A+
s,t (u, u)

J(t)(t s)4

as s, t t

(13)

where J(t) is a continuous non-zero function of t depending on u, that can be expressed in terms of the mean and covariance functions of the process and its derivatives. We give a proof of an equivalence of the form (13) in the next proposition.
One can profit of this equivalence to improve the numerical methods to compute
2 (the second factorial moment of the number of upcrossings restricted to X0
u). Equivalence formulae such as (12) or (13) can be used to avoid numerical
degeneracies near the diagonal D2 (I). Note that even in case X is stationary at the
departure, under conditioning on X0 , the process that must be taken into account
in the actual computation of the factorial moments of upcrossings that appear in
the Rice series(4) will be non-stationary, so that equivalence (13) is the appropriate
tool.
Proposition 3.1 Suppose that X is a Gaussian process with C 5 paths and that for
(2)
(3)
each t I the joint distribution of Xt , Xt , Xt , Xt does not degenerate.Then (13)
holds true.
1
a two-dimensional random vector having as proba2
s
bility distribution the conditional distribution of X
given Xs = Xt = u.
Xt
One has:
Proof. Denote by =

+ +
A+
pXs ,Xt (u, u)
s,t (u, u) = E 1 2

(14)

Put = t s and check the following Taylor expansions around the point s:
E (1 ) = m1 + m2 2 + L1 3

(15)

E (2 ) = m1 + m2 2 + L2 3

(16)

13

V ar () =

a 2 + b 3 + c 4 + 11 5
a 2

b+b
2

a 2

3 + d 4 + 12 5

b+b
2

3 + d 4 + 12 5

a 2 + b 3 + c 4 + 22 5

(17)

where m1 , m2 , m2 , a, b, c, d, a, b, c are continuous functions of s and L1 , L2 , 11 ,


12 , 22 are bounded functions of s and t. (15),(16) and (17) follow directly from
s
on the condition Xs = Xt = u.
the regression formulae of the pair X
Xt
Note that (as in Belyaiev, 1966 or Azas and Wschebor, 2000)
det V ar(Xs , Xt , Xs )T
det V ar(Xs , Xs , Xt Xs (t s)Xs )T
V ar(1 ) =
=
det V ar(Xs , Xt )T
det V ar(Xs , Xt Xs )T
A direct computation gives:
det V ar(Xs , Xt )T 2 det V ar(Xs , Xs )T

(18)

(2)

1 det V ar(Xs , Xs , Xs )T 2
V ar(1 )

4 det V ar(Xs , Xs )T
where denotes equivalence as 0. So,
(2)

1 det V ar(Xs , Xs , Xs )T
a=
4 det V ar(Xs , Xs )T
which is a continuous non-vanishing function for s I. Note that the coefficient of
. This follows either by
3 in the Taylor expansion of Cov(1 , 2 ) is equal to b+b
2
direct computation or noting that det V ar() is a symmetric function of the pair
s, t.
Put
(s, t) = det V ar()
The behaviour of (s, t) as s, t t can be obtained by noting that
(s, t) =

det V ar(Xs , Xt , Xs , Xt )T
det V ar(Xs , Xt )T

and applying Lemma 3.2 in Azas and Wschebor (2000) or Lemma 4.3, p.76 in
Piterbarg (1996) which provide an equivalent for the numerator, so that:
(s, t) (t) 6
14

(19)

with
(2)

(t) =

(3)

1 det V ar(Xet , Xet , Xet , Xet )T


144
det V ar(Xet , Xet )T

The non degeneracy hypothesis implies that (t) is continuous and non zero.
Then:
E

1+ 2+

1
1/2

2 [(s, t)]

xy exp
0

1
F (x, y) dxdy
2(s, t)

(20)

where
F (x, y) = V ar(2 )(x E(1 ))2 + V ar(1 )(y E(2 ))2 2Cov(1 , 2 )(x E(1 ))(y E(2 ))
Substituting the expansions (15), (16), (17) in the integrand of (20) and making
the change of variables x = 2 v, y = 2 w we get, as s, t t:
E 1+ 2+

5
2 (t)

1/2

vw exp
0

1
F (v, w) dvdw
2(t)

(21)

(t) can also be expressed in terms of the functions a, b, c, d, a , b , c :


(t) = ac + ca + 2ad

bb
2

and
2

F (v, w) = a (v m2 + w m2 ) + m21 (c + c + 2d) m1 (b b )(v + w m2 m2 )


The functions a, b, c, d, b, c, m1 , m2 that appear in these formulae are all evaluated
at the point t.
Replacing (21) and (18) into (14) one gets (13).
For k 3, the general behaviour of the functions At1 ,...,tk (u) and A+
t1 ,...,tk (u, ..., u)
when (t1 , ..., tk ) approaches the diagonal is not known. Proposition 3.3 , even though
it contains restrictive conditions (it requires E{Xt } = 0 and u = 0) can be applied
to improve the efficiency in the computation of the k th -factorial moments by means
of a Monte-Carlo method, via the use of important sampling. More precisely, when
15

k
computing the integral of A+
t1 ,...,tk (u) over I , instead of choosing at random the point
(t1 , t2 , ..., tk ) in the cube I k with a uniform distribution, we do it with a probability
law that has a density proportional to the function 1i<jk (tj ti )4 . For its proof
we will use the following auxiliary proposition, that has its own interest and extends
(19) to any k.

Proposition 3.2 Suppose that X = {Xt : t I} is a Gaussian process defined on


the interval I of the real line with C 2k1 paths, k an integer, k 2, and that the
(2k1)
joint distribution of Xt , Xt , ...., Xt
is non-degenerate for each t I. Then,

if t1 , t2 , ...., tk t :
(2k1) T

det V ar(Xt , Xt , ..., Xt


= det V ar(Xt1 , Xt1 , ..., Xtk , Xtk )
[2!.3!....(2k 1)!]2
T

(tj ti )8
1i<jk

(22)
Proof. With no loss of generality, we consider only ktuples (t1 , t2 , ...., tk ) such
that ti = tj if i = j.
Suppose f : I IR is a function of class C 2m , m 1, and t1 , t2 , ...., tm
are pairwise different points in I. We use the following notations for interpolating
polynomials:
Pm (t; f ) is the polynomial of degree 2m 1 such that
Pm (tj ; f ) = f (tj ) and Pm (t; f ) = f (tj ) for j = 1, ..., m.
Qm (t; f ) is the polynomial of degree 2m 2 such that
Qm (tj ; f ) = f (tj ) for j = 1, ..., m ; Qm (t; f ) = f (tj ) for j = 1, ..., m 1.
From Lemma 2.1 we know that
f (t) Pm (t; f ) =

f (t) Qm (t; f ) =

1
(t t1 )2 ....(t tm )2 f (2m) ()
(2m)!

1
(t t1 )2 ....(t tm1 )2 (t tm )f (2m1) ()
(2m 1)!

where
= (t1 , t2 , ...., tm , t), = (t1 , t2 , ...., tm , t)
16

(23)

(24)

and
min(t1 , t2 , ...., tm , t) , max(t1 , t2 , ...., tm , t).
Note that the function
g(t) = f (2m1) ((t1 , t2 , ...., tm , t)) =

(2m 1)! [f (t) Qm (t; f )]


(t t1 )2 ....(t tm1 )2 (t tm )

is differentiable at the point t = tm and differentiating in (24):


f (tm ) Qm (tm ; f ) =

1
(tm t1 )2 ....(tm tm1 )2 f (2m1) ((t1 , t2 , ...., tm , tm ))
(2m 1)!
(25)

Put
m = (t1 , t2 , ...., tm , tm ), m = (t1 , t2 , ...., tm , tm ).
Since Pm (t; f ) is a linear functional of
(f (t1 ), ..., f (tm ), f (t1 ), ..., f (tm ))
and Qm (t; f ) is a linear functional of
(f (t1 ), ..., f (tm ), f (t1 ), ..., f (tm1 ))
with coefficients depending (in both cases) only on t1 , t2 , ...., tm , t, it follows that:
= det V ar Xt1 , Xt1 , Xt2 P1 (t2 ; X), Xt2 Q2 (t2 , X), ...
T

..., Xtk Pk1 (tk ; X), Xtk Qk (tk ; X) =


1
(2) 1
, ....
= det V ar Xt1 , Xt1 , (t2 t1 )2 X1 , (t2 t1 )2 X(3)
2
2!
3!
1
1
(2k2)
..,
(tk t1 )2 ...(tk tk1 )2 Xk1 ,
(tk t1 )2 ...(tk tk1 )2 X(2k1)
k1
(2k 2)!
(2k 1)!
=

[2!...(2k 1)!]2

(tj ti )8
1i<jk

with
(2)

(2k2)

= det V ar Xt1 , Xt1 , X1 , X(3)


, ..., Xk1 , X(2k1)
2
k1

(2k1) T

det V ar(Xt , Xt , ..., Xt


as t1 , t2 , ...., tk t . This proves (22).
17

Proposition 3.3 Suppose that X is a centered Gaussian process with C 2k1 paths
and that for each pairwise distinct values of the parameter t1 , t2 , ..., tk I the joint
(2k1)
distribution of (Xth , Xth , ...., Xth
, h = 1, 2, ..., k) is non-degenerate. Then, as

t1 , t2 , ..., tk t :

A+
t1 ,...,tk (0, ..., 0) Jk (t )

(tj ti )4
1i<jk

where Jk (t) is a continuous non-zero function of t.


Proof. Introduce the notation
(k)

Dk (t) = det V ar(Xt , Xt , ...., Xt )T


In the same way as in the proof of Proposition 3.2 and with a simpler computation,
it follows that as t1 , t2 , ..., tk t
det V ar(Xt1 , Xt2 , ..., Xtk )T

1
[2!.....(k 1)!]2

(tj ti )2 . Dk1 (t ).

(26)

1i<jk

For pairwise different values t1 , t2 , ..., tk , let Z = (Z1 , ..., Zk )T be a random vector
having the conditional distribution of (Xt1 , ...., Xtk )T given Xt1 = Xt2 = ... = Xtk =
0. The (Gaussian) distribution of Z is centered and we denote its covariance matrix
by . Also put:
1 =

1
ij
det()

i,j=1,...,k

ij being the cofactor of the position (i, j) in the matrix . Then, one can write:
+
+
A+
. pXt1 ,...,Xtk (0, ..., 0)
t1 ,...,tk (0, ..., 0) = E Z1 ...Zk

(27)

and
A+
t1 ,...,tk (0, ..., 0) =

1
k
2

(2) (det())

1
2

x1 ...xk exp
(R+ )k

F (x1 , ..., xk )
2. det()

dx1 ...dxk
(28)

where
k

ij xi xj .

F (x1 , ..., xk ) =
i,j=1

18

Letting t1 , t2 , ..., tk t and using (22) and (26) we get:


det() =

det V ar(Xt1 , Xt1 , ..., Xtk , Xtk )T

det V ar(Xt1 , ..., Xtk )T

1
[k!.....(2k 1)!]2

(tj ti )6 .
1i<jk

D2k1 (t )
.
Dk1 (t )

We consider now the behaviour of the ij (i, j = 1, ..., k). Let us first look at 11 .
Using the same method as above, now applied to the cofactor of the position (1, 1)
in , one has:

11

det V ar(Xt1 , Xt2 , ..., Xtk , Xt2 , ..., Xtk )T


=

det V ar(Xt1 , ..., Xtk )T

1
[2!...(2k2)!]2

ti )8

2i<jk (tj

1
[2!.....(k1)!]2

2hk (t1

2
1i<jk (tj ti )

1
=
[k!...(2k 2)!]2

(tj ti )
2i<jk

th )4 D2k2 (t )

Dk1 (t )
6

(t1 th )
2hk

=
D2k2 (t )
Dk1 (t )

A similar computation holds for ii , i = 2, ..., k.


Consider now 12 . One has:
det E (Xt1 , Xt2 , ..., Xtk , Xt2 , ..., Xtk )T .(Xt1 , Xt2 , ..., Xtk , Xt1 , Xt3 ..., Xtk )
=
det V ar(Xt1 , ..., Xtk )T
det E (Xt2 , Xt2 , ..., Xtk , Xtk , Xt1 )T .(Xt1 , Xt1 , Xt3 , Xt3 , ..., Xtk , Xtk , Xt2 )
=

det V ar(Xt1 , ..., Xtk )T


12

1
[k!...(2k 2)!]2

(tj ti )6
3i<jk

(t1 th )4 (t2 th )4 (t2 t1 )2 .


3hk

A similar computation applies to all the cofactors ij , i = j.


Perform in the integral in (28) the change of variables
i=k

(ti tj )2 . yj

xj =
i=1,i=j

19

j = 1, ..., k

D2k2 (t )
Dk1 (t )

and the integral becomes:


(tj ti )8

y1 ...yk exp
(R+ )k

1i<jk

1
G(y1 , ..., yk )
2. det()

dy1 ...dyk

where
k

h=k

G(y1 , ..., yk ) =

i,j=1

h=k

ij

(th tj )2

(th ti )
h=1,h=i

yi yj .

h=1,h=j

so that, as t1 , t2 , ..., tk t
G(y1 , ..., yk )
D2k2 (t )
[(2k 1)!]2
det()
D2k1 (t )

i=k

yi
i=1

Now, passage to the limit under the integral sign in (28), which is easily justified by
application of the Lebesgue Theorem, leads to
E

Z1+ ...Zk+

1
(2)

k
2

|tj ti |

k!...(2k 1)!

1i<jk

Dk1 (t )
D2k1 (t )

1
2

Ik ( )

where Ik (), > 0 is

y1 ...yk exp

Ik () =
(R+ )k

i=k

yi
i=1

dy1 ...dyk = 1 Ik (1)


k

and
= [(2k 1)!]2

D2k2 (t )
D2k1 (t )

Replacing into (27) one gets the result with


Jk (t) =

2!...(2k 2)!

Ik (1)

[2(2k 1)!]2k1 [D2k1 (t)] 2

This finishes the proof.


20

D2k1 (t)
D2k2 (t)

Numerical examples

4.1

Comparison with Monte-Carlo method

First, let us compare the numerical computation based upon Theorem 2.1 with
the Monte-Carlo method based on the simulation of the paths. We do this for
stationary Gaussian processes that satisfy the hypotheses of Theorem 2.2 and also
the non-degeneracy condition that ensures that one is able to compute the factorial
moments of crossings by means of Rice formulae.
Suppose that we want to compute P (M > u) with an error bounded by , where
> 0 is a given positive number.
To proceed by simulation, we discretize the paths by means of a uniform partition
{tj := j/n, j = 0, 1, ..., n}. Denote
M (n) := sup Xtj .
0jn

Using Taylors formula at the time where the maximum M of X(.) occurs, one
gets :
0 M M (n) X

/(2n

It follows that
0 P (M > u) P (M (n) > u) = P (M > u, M (n) u)
P (u < M u + X

/(2n

)).

If we admit that the distribution of M has a locally bounded density (which is a


well-known fact under the mentioned hypotheses) the above suggests that a number
of n = (const) 1/2 points is required if one wants the mean error P (M > u)
P (M (n) > u) to be bounded by .
On the other hand, to estimate P (M (n) > u) by Monte-Carlo with a mean square
error smaller than , we require the simulation of N = (const) 2 Gaussian n-tuples
(Xt1 , ..., Xtn ) from the distribution determined by the given stationary process. Performing each simulation demands (const)nlog(n) elementary operations (Dietrich
and Newsam, 1997). Summing up, the total mean number of elementary operations
required to get a mean square error bounded by in the estimation of P (M > u)
has the form (const) 5/2 log(1/).
Suppose now that we apply Theorem 2.1 to a Gaussian stationary centered process verifying the hypotheses of Theorem 2.2 and the non-degeneracy condition.
21


The bound for m
in Equation (10) implies that computing a partial sum with
(const)log(1/) terms assures that the tail in the Rice series is bounded by . If
one computes each m by means of a Monte-Carlo method for the multiple integrals
appearing in the Rice formulae, then the number of elementary operations for the
whole procedure will have the form (const) 2 log(1/). Hence, this is better than
simulation as tends to zero.
As usual, for given > 0, the value of the generic constants decides the comparison between both methods.
More important is the fact that the enveloping property of the Rice series implies
that the actual number of terms required by the application of Theorem 2.1 can

. More
be much smaller than the one resulting from the a priori bound on m

precisely, suppose that we have obtained each numerical approximation m


of m
with a precision

|m
m | ,

and that we stop when

m
0 +1
.
(m0 + 1)!

(29)

Then, it follows that


m

m+1

(1)
m=1

m
m+1 m

(1)
(e + 1).
m! m=1
m!

Putting = /(e + 1), we get the desired bound. In other words one can profit of
the successive numerical approximations of m to determine a new m0 which turns
out to be - in certain interesting examples - much smaller than the one deduced

from the a priori bound on m


.

4.2

Comparison with usuals bounds

Next, we will give the results of the evaluation of P (MT > u) using up to three
terms in the Rice series in a certain number of typical cases. We compare these
results with the classical evaluation using what is often called the Davies (1977)
bound. In fact this bound seems to have been widely used since the work of Rice
(1944). It is an upper-bound with no control on the error, given by:
P (M > u) P (X0 > u) + E Uu ([0, T ])
22

(30)

The above mentioned result by Piterbarg (11) shows that in fact, for fixed T and
high level u this bound is sharp. In general, using more than one term of the Rice
series supplies a remarkable improvement in the computation.
We consider several stationary centered Gaussian processes listed in the following
table, where the covariances and the corresponding spectral densities are indicated.
process
X1
X2
X3
X4

covariance
1 (t) = exp(t2 /2)
2 (t) = (ch(t))1
1
3 (t) = 31/2 t sin(31/2 t)

4 (t) = e| 5t| ( 35 |t|3 + 2t2 + 5|t| + 1)

spectral density
f1 (x) = (2)1/2 exp(x2 /2)
1
f2 (x) = 2ch((x)/2)
f3 (x) = 121/2 1I{3<x<3}
4
f4 (x) = 105 (5 + x2 )4

In all cases, 0 = 2 = 1 to be able to compare the various results. Note that 1


and 3 have analytic extensions to the whole plane, so that Theorem 2.2 applies to
the processes X1 and X3 . On the other hand, even though all spectral moments of
the process X2 are finite, Theorem 2.2 applies only for a length less than /4 since
the meromorphic extension of 2 (.) has poles at the points i/2 + ki, k an integer.
With respect to 4 (.) notice that it is obtained as the convolution 5 5 5 5
where 5 (t) := e|t| is the covariance of the Ornstein-Uhlenbeck process, plus a
change of scale to get 0 = 2 = 1. The process X4 has 6 < and 8 = and its
paths are C 3 . So, for the processes X2 and X4 we apply Theorem 2.3 to compute
F (T, u).
Table 1 contains the results for T = 1, 4, 6, 8, 10 and the values u = 2, 1, 0, 1, 2, 3
using three terms of the Rices series. A single value is given when a precision of 102
is met; otherwise the lower-bound and the upper-bound given by two or three terms
of the Rices series respectively, are diplayed. The calculation uses a deterministic
evaluation of the first two moments 1 and 2 using program written by Cierco Croquette and Delmas (2000) and a Monte-Carlo evaluation of 3 . In fact, for simpler
and faster calculation, 3 has been evaluated instead of 3 providing a slightly weaker
bound.
In addition Figures 1 to 4 show the behavior of four bounds : namely, from the
highest to the lowest
The Davies bound (D) defined by formula (30)

23

u
-2

-1

Length of the time interval


1
4
6
8
0.99 1.00 1.00
1.00
0.99 1.00 1.00
1.00
1.00 1.00 1.00
1.00
0.99 1.00 1.00
1.00
0.93 1.00 1.00
0.99-1.00
0.93 0.99 1.00
0.99-1.00
0.93 1.00 1.00
1.00
0.93 1.00 1.00
0.99-1.00
0.65 0.90 0.95
0.95-0.99
0.65 0.89 0.94-0.95 0.93-0.99
0.656 0.919 0.97
0.98-0.99
0.65 0.89 0.94-0.95 0.94-0.99
0.25 0.49 0.61
0.69-0.70
0.25 0.48 0.58
0.66-0.68
0.26 0.51 0.62
0.71
0.25 0.48 0.59
0.67-0.69
0.04 0.11 0.15
0.18
0.04 0.11 0.14
0.18
0.04 0.11 0.15
0.19
0.04 0.11 0.14
0.18
0.00 0.01 0.01
0.02
0.00 0.01 0.01
0.02
0.00 0.01 0.01
0.02
0.00 0.01 0.01
0.02

T
10
1.00
1.00
1.00
1.00
0.98-1.00
0.98-1.00
0.99
0.98-1.00
0.90-1.00
0.87-1.00
0.92-1.00
0.88-1.00
0.74-0.77
0.70-0.76
0.76-0.78
0.72-0.77
0.22
0.21
0.22
0.22
0.02
0.02
0.02
0.02

Table 1: Values of P (M > u) for the different processes. Each cell contains, from
top to bottom, the values corresponding to stationary centered Gaussian processes
with covariances 1 , 2 , 3 and 4 respectively. The calculation uses three terms
of the Rice series for the upper-bound and two terms for the lower-bound. Both
bounds are rounded up to two decimals and when they differ, both displayed.

24

One, three, or two terms of the Rice series (R1, R3, R2 in the sequel) that is
K

P (X0 > u) +

(1)m+1
m=1

m
m!

with K = 1, 3 or 2.
Note that the bound D differs from R1 due to the difference between 1 and
1 . These bounds are evaluated for T = 4, 6, 8, 10, 15 and also for T = 20 and
T = 40 when they fall in the range [0, 1]. Between these values an ordinary spline
interpolation has been performed.
In addition we illustrate the complete detailed calculation in three chosen cases.
They correspond to zero and positive levels u. For u negative, it is easy to check
that the Davies bound is often greater than 1, thus non informative.
For u = 0, T = 6, = 1 , we have P (X0 > u) = 0.5, 1 = 0.955, 1 = 0.602,
2 /2 = .150, 3 /6 = 0.004, so that:
D = 1.455 , R1 = 1.103 , R3 = 0.957 , R2 = 0.953
R2 and R3 give a rather good evaluation of the probability, the Davies bound
gives no information.
For u = 1.5, T = 15, = 2 , we have P (X0 > u) = 0.067, 1 = 0.517,
1 = 0.488, 2 /2 = 0.08, 3 /6 = 0.013, so that:
D = 0.584 , R1 = 0.555 , R3 = 0.488 , R2 = 0.475
In this case the Davies bound is not sharp and a very clear improvement is
provided by the two bounds R2 and R3.
For u = 2, T = 10, = 3 , we have P (X0 > u) = 0.023, 1 = 0.215,
1 = 0.211, 2 /2 = 0.014, 3 /6 = 3104 , so that:
D = 0.238 , R1 = 0.234 , R3 = 0.220. , R2 = 0.220
In this case the Davies bound is rather sharp.
In conclusion, these numerical results show that it is worth using several terms of the Rice series. In particular, the first three terms are relatively easy to compute and provide a good evaluation of the distribution of M under a rather broad set of conditions.

Acknowledgements
We thank C. Delmas for computational assistance. This work has received support from the ECOS program U97E02.

References

Adler, R.J. (1990). An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes. IMS, Hayward, CA.
Adler, R.J. (2000). On excursion sets, tube formulae, and maxima of random fields. Annals of Applied Probability. To appear.
Azaïs, J-M., Cierco, C. and Croquette, A. (1999). Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process. ESAIM: Probability and Statistics, 3, 107-129.
Azaïs, J-M. and Wschebor, M. (1997). Une formule pour calculer la distribution du maximum d'un processus stochastique. C.R. Acad. Sci. Paris, t. 324, série I, 225-230.
Azaïs, J-M. and Wschebor, M. (1999). Régularité de la loi du maximum de processus gaussiens réguliers. C.R. Acad. Sci. Paris, t. 328, série I, 333-336.
Azaïs, J-M. and Wschebor, M. (2000). On the Regularity of the Distribution of the Maximum of One-parameter Gaussian Processes. Accepted for publication in Probability Theory and Related Fields.
Azaïs, J-M., Bardet, J-M. and Wschebor, M. (2000). On the Tails of the Distribution of the Maximum of a Smooth Stationary Gaussian Process. Submitted.
Belyaev, Yu. (1966). On the number of intersections of a level by a Gaussian stochastic process. Theory Prob. Appl., 11, 106-113.
Berman, S.M. (1985a). An asymptotic formula for the distribution of the maximum of a Gaussian process with stationary increments. J. Appl. Prob., 22, 454-460.
Berman, S.M. (1985b). The maximum of a Gaussian process with non-constant variance. Ann. Inst. H. Poincaré Probab. Statist., 21, 383-391.
Berman, S.M. (1992). Sojourns and Extremes of Stochastic Processes. Wadsworth and Brooks, Probability Series.
Berman, S.M. and Kôno, N. (1989). The maximum of a Gaussian process with non-constant variance: a sharp bound for the distribution of the tail. Ann. Probab., 17, 632-650.
Borell, C. (1975). The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30, 207-216.
Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
Cierco, C. (1996). Problèmes statistiques liés à la détection et à la localisation d'un gène à effet quantitatif. PhD dissertation, University of Toulouse, France.
Cierco-Ayrolles, C., Croquette, A. and Delmas, C. (2000). Computing the Distribution of the Maximum of Regular Gaussian Processes. Submitted.
Cressie, N. (1980). The asymptotic distribution of the scan statistic under uniformity. Ann. Probab., 8, 828-840.
Cuzick, J. (1975). Conditions for finite moments of the number of zero crossings for Gaussian processes. Ann. Probab., 3, 849-858.
Darling, D.A. (1983). On the supremum of certain Gaussian processes. Ann. Probab., 11, 803-806.
Davies, R.B. (1987). Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika, 74, 33-43.
Davis, P.J. (1975). Interpolation and Approximation. Dover, New York.
DeLong, D.M. (1981). Crossing probabilities for a square root boundary by a Bessel process. Communications in Statistics - Theory and Methods, A10, 2197-2213.
Diebolt, J. and Posse, C. (1996). On the Density of the Maximum of Smooth Gaussian Processes. Ann. Probab., 24, 1104-1129.
Dietrich, C.R. and Newsam, G.N. (1997). Fast and exact simulation of stationary Gaussian processes through circulant embedding of the covariance matrix. SIAM J. Sci. Comput., 18, 4, 1088-1107.
Fernique, X. (1974). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de Saint-Flour. Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
Goldman, M. (1971). On the first passage of the integrated Wiener process. The Annals of Math. Statist., 42, 6, 2150-2155.
Lachal, A. (1991). Sur le premier instant de passage de l'intégrale du mouvement brownien. Ann. Inst. H. Poincaré Probab. Statist., 27, 385-405.
Landau, H.J. and Shepp, L.A. (1970). On the supremum of a Gaussian process. Sankhyā Ser. A, 32, 369-378.
Leadbetter, M.R., Lindgren, G. and Rootzén, H. (1983). Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York.
Ledoux, M. (1996). Isoperimetry and Gaussian Analysis. École d'Été de Probabilités de Saint-Flour 1994. Lecture Notes in Math., 1648, 165-264. Springer-Verlag, New York.
Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. Springer-Verlag, New York.
Lifshits, M.A. (1995). Gaussian Random Functions. Kluwer, The Netherlands.
Lindgren, G. (1972). Wave-length and amplitude in Gaussian noise. Adv. Appl. Prob., 4, 81-108.
McKean, H.P. (1963). A winding problem for a resonator driven by a white noise. J. Math. Kyoto Univ., 2, 227-235.
Marcus, M.B. (1977). Level crossings of a stochastic process with absolutely continuous sample paths. Ann. Probab., 5, 52-71.
Marcus, M.B. and Shepp, L.A. (1972). Sample behaviour of Gaussian processes. Proc. Sixth Berkeley Symp. Math. Statist. Prob., 2, 423-442.
Miroshin, R.N. (1974). Rice series in the theory of random functions. Vestnik Leningrad Univ. Math., 1, 143-155.
Miroshin, R.N. (1977). Condition for finiteness of moments of the number of zeros of stationary Gaussian processes. Th. Prob. Appl., 22, 615-624.
Miroshin, R.N. (1983). The use of Rice series. Th. Prob. Appl., 28, 714-726.
Nualart, D. and Wschebor, M. (1991). Intégration par parties dans l'espace de Wiener et approximation du temps local. Prob. Th. Rel. Fields, 90, 83-109.
Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian processes. Th. Proba. Appl., 26, 687-705.
Piterbarg, V.I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and Fields. American Mathematical Society, Providence, Rhode Island.
Qualls, C. and Watanabe, H. (1973). Asymptotic properties of Gaussian processes. Ann. Math. Statist., 43, 580-596.
Rice, S.O. (1944-1945). Mathematical analysis of random noise. Bell System Tech. J., 23, 282-332; 24, 45-156.
Shepp, L.A. (1971). First passage time for a particular Gaussian process. The Ann. of Math. Stat., 42, 946-951.
Shepp, L.A. (1979). The joint density of the maximum and its location for a Wiener process with drift. J. Appl. Prob., 16, 423-427.
Shepp, L.A. and Slepian, D. (1976). First-passage time for a particular stationary periodic Gaussian process. J. Appl. Prob., 13, 27-38.
Slepian, D. (1961). First passage time for a particular Gaussian process. Ann. Math. Statist., 32, 610-612.
Slepian, D. (1962). The one-sided barrier problem for Gaussian noise. Bell System Tech. J., 42, 463-501.
Sun, J. (1993). Tail Probabilities of the Maxima of Gaussian Random Fields. Ann. Probab., 21, 34-71.
Talagrand, M. (1988). Small tails for the supremum of a Gaussian process. Ann. Inst. H. Poincaré, Ser. B, 24, 2, 307-315.
Talagrand, M. (1996). Majorizing measures: the generic chaining. Ann. Probab., 24, 1049-1103.
Tsirelson, V.S. (1975). The Density of the Maximum of a Gaussian Process. Th. Probab. Appl., 20, 817-856.
Weber, M. (1985). Sur la densité du maximum d'un processus gaussien. J. Math. Kyoto Univ., 25, 515-521.
Wschebor, M. (1985). Surfaces aléatoires. Mesure géométrique des ensembles de niveau. Lecture Notes in Mathematics, 1147, Springer-Verlag.
Wschebor, M. (2000). Sur la loi du sup de certains processus gaussiens non bornés. Accepted for publication in C.R. Acad. Sci. Paris.
Ylvisaker, D. (1968). A Note on the Absence of Tangencies in Gaussian Sample Paths. The Ann. of Math. Stat., 39, 261-262.


Figure 1: For the process with covariance $\Gamma_1$ and the level u = 1, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom), as a function of the length T of the interval.

Figure 2: For the process with covariance $\Gamma_2$ and the level u = 0, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom), as a function of the length T of the interval.

Figure 3: For the process with covariance $\Gamma_3$ and the level u = 2, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom), as a function of the length T of the interval.

Figure 4: For the process with covariance $\Gamma_4$ and the level u = 1.5, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom), as a function of the length T of the interval.

Asymptotic distribution and power of the likelihood ratio test for mixtures: bounded and unbounded cases.

Jean-Marc Azaïs¹, Élisabeth Gassiat², Cécile Mercadier¹

September 20, 2004

¹ Laboratoire de Statistique et Probabilités, UMR-CNRS C5583, Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse Cedex 4, France.
² Laboratoire de Mathématiques, Équipe de Probabilités, Statistique et Modélisation, Bât. 425, Université de Paris-Sud, 91405 Orsay Cedex, France. elisabeth.gassiat@math.u-psud.fr
Abstract
In this paper, we consider the log-likelihood ratio test (LRT) for testing the number of
components in a mixture of populations in a parametric family. We provide the asymptotic
distribution of the LRT statistic under the null hypothesis as well as under contiguous
alternatives when the parameter set is bounded. Moreover, for the simple contamination
model we prove that, under general assumptions, the asymptotic power under contiguous
hypotheses may be arbitrarily close to the asymptotic level when the set of parameters is
large enough. In the particular problem of normal distributions, we prove that, when the
unknown mean is not a priori bounded, the asymptotic power under contiguous hypotheses
is equal to the asymptotic level.
Keywords: Likelihood ratio test, mixture models, number of components, extreme values,
power, contiguity.
Short title: Asymptotic study of the LRT for mixtures

1 Introduction

Mixtures of populations are a modelling tool widely used in applications, and the literature on the subject is vast. For finite mixtures, the first task is the choice of the number of components in the mixture. Several estimation and testing procedures have been proposed for this purpose; see for instance the books of Titterington et al. (1985), Lindsay (1995) and McLachlan and Peel (2000) or the papers of James et al. (2001), Gassiat (2002) and references therein. The asymptotic optimality of the likelihood ratio test (LRT) in several parametric contexts is well known, so using the LRT for testing the number of components in a mixture appears quite natural. On the one hand, simulation studies show that the LRT performs well in various situations (see Goffinet et al., 1992). On the other hand, the asymptotic distribution and power of the test have to be evaluated in order to compare it with other known tests.
In this paper, we focus on the asymptotic properties of the LRT for testing that i.i.d.
observations X1 , . . . , Xn come from a mixture of p0 populations in a parametric set of densities F (null hypothesis H0 ) against a mixture of p populations (alternative H1 ), where the
integers p0 and p satisfy p0 < p.
In Section 2 we apply results of Gassiat (2002) to obtain the asymptotic distribution of the LRT statistic for testing (H0) against (H1), under the null hypothesis as well as under contiguous alternatives. Indeed, Gassiat (2002) gives a rather weak assumption under which the asymptotic distribution of the LRT statistic can be derived in the general situation where one has to test a small model inside a larger one, under the null hypothesis as well as under contiguous hypotheses. This applies to the number of components in a mixture of populations in a parametric set, possibly with an unknown nuisance parameter. For instance, we apply the method to multidimensional Gaussian distributions with unknown common variance. In this way, we recover known results for mixtures of one or two populations but under weaker assumptions, as well as known results concerning particular parametric families such as Gaussian or Binomial distributions; see Ghosh and Sen (1985), Dacunha-Castelle and Gassiat (1997, 1999), Garel (2001), Lemdani and Pons (1997, 1999), Chen and Chen (2001), Mosler and Seidel (2001), Chernoff and Lander (1995). We also obtain more general results than previous ones:
- they apply to general sets of parametric families with an unknown nuisance parameter;
- the asymptotic distribution under contiguous alternatives is considered.
However, apart from smoothness assumptions, the main point is that these asymptotic results require the parameter set to be bounded.
In Sections 3 and 4 we study what happens when the set of parameters becomes larger and larger. For simplicity we restrict our attention to the simplest model: the contamination model for a family of distributions indexed by a single real parameter. Roughly speaking, the LRT statistic converges in distribution to half the square of the supremum of some Gaussian process indexed by a compact set of scores. But when this set of scores is enlarged, the covariance of the Gaussian process is close to 0 for sufficiently distant scores, so that the supremum of the Gaussian process may become arbitrarily large. Thus one also knows that, for unbounded sets of parameters, the LRT statistic tends to infinity in probability, as Hartigan first noted for normal mixtures (see Hartigan, 1985). Here, we prove that under some extreme circumstances the LRT can have less power than moment tests or goodness-of-fit tests. At the end of the introduction we carefully draw practical conclusions from this result.
More precisely, let $\mathbf T$ be $[-T, T]$ and $\mathcal F = \{f_t,\ t\in\mathbf T\}$ be a parametric set of probability densities on $\mathbb R$ with respect to the Lebesgue measure. Using i.i.d. observations $X_1,\dots,X_n$, we consider the following testing problem for the density g of the observations:
$$(H_0):\ g = f_0 \qquad\text{against}\qquad (H_1):\ g = (1-\theta)\,f_0 + \theta\,f_t,\ 0\le\theta\le 1,\ t\in\mathbf T. \qquad (1)$$

We prove that:
- For general parametric sets $\mathcal F$, $\mathbf T = [-T, T]$ and T large enough, under some smoothness assumptions, the LRT for (1) has asymptotic power under contiguous alternatives close to the asymptotic level; see Theorem 7.
- A set of assumptions is given for which Theorem 7 applies in the case of translation mixtures, that is when $f_t(\cdot) = f_0(\cdot - t)$; see Corollary 1. This is done in Section 3.
- When $f_t$ is the standard Gaussian density with mean t we get the normal mixture problem. When the set of means is not a priori bounded, that is $\mathbf T = \mathbb R$, Liu and Shao (2004) obtained the asymptotic distribution of the LRT under the null hypothesis by using the strong approximation proved in Bickel and Chernoff (1993). We prove in Theorem 8 of Section 4 that the asymptotic power under contiguous alternatives is equal to the asymptotic level.
The way to obtain these results is to put together the expansion of the LRT obtained in Gassiat (2002), used to identify contiguity and to apply Le Cam's third Lemma (see van der Vaart, 1998), the behaviour of the supremum of a Gaussian process on an interval whose bounds tend to infinity, as obtained in Azaïs and Mercadier (2004), and the normal comparison inequality as refined in Li and Shao (2002). Proofs of most results of Sections 3 and 4 are detailed in Section 5.
Independently of our work, for the Gaussian model with unbounded means, Hall and Stewart (2004) obtained the speed of separation of alternatives that ensures an asymptotic power bigger than the asymptotic level. Their result indicates that it should be $\sqrt{\log\log n}/\sqrt{n}$, contrary to the classical parametric situation, where $1/\sqrt{n}$ is the speed of separation.

Practical application
Tests whose power is less than or equal to their level are sometimes called worthless (see for example van der Vaart, 1998), but this word could be misleading here, because the practical interpretation of our result must take into account the following points:
- It is well known that for mixtures of populations, in general, the convergence to the asymptotic distribution is very slow. For example, for a very simple test such as the skewness test, Boistard (2003) showed that $n = 10^3$ observations are needed to reach the asymptotic distribution.
- For maximum likelihood estimates (MLE) and tests, the problem of the speed of convergence to the asymptotic distribution is very difficult to address, since in practice MLEs are computed through iterative algorithms and are only approximate. The most famous one is the EM algorithm and its variants. All these algorithms depend on tuning constants, in particular concerning the stopping rule. It is shown for example in Table 6.3 of McLachlan and Peel (2000) (based on results by Seidel et al., 2000) that the distribution of the LRT depends heavily on these tuning constants. Simulation results by Liu and Shao (2004) suggest that their asymptotic distribution is not reached for $n = 5\cdot 10^3$ observations.
- Nowadays some results and software are available to compute the distribution of the maximum of Gaussian processes; see for example Garel (2001), Delmas (2003) and Mercadier (2004). In particular, these results show that, as soon as the means are contained in a set that is not huge, the asymptotic power of the LRT under contiguous alternatives is generally better than that of moment tests or of goodness-of-fit tests. Nevertheless, the LRT is not uniformly most powerful.
Our result showing that the LRT is asymptotically less powerful than moment tests is therefore relevant in practice only for very large data sets, and for all the reasons above it is very difficult to say precisely when. Simulations have shown that in practice the LRT, based on Monte-Carlo calculation of the threshold or on bootstrapping, behaves well (see Goffinet et al., 1992), even for unbounded parameter sets.
Our opinion is that the main consequence of our result for large or unbounded parameter sets is that the study of the LRT for mixtures in the compact case seems to be the most relevant one.

2 Asymptotic distribution of the LRT for the number of populations in a mixture under null and contiguous hypotheses.
A general theorem in Gassiat (2002) allows one to find the asymptotic distribution of the LRT for testing a particular model inside a larger one, under the null hypothesis as well as under contiguous alternatives. Roughly speaking, the asymptotic distribution is some function of the supremum of the isonormal process on a set of score functions. The theorem holds under a simple assumption on the bracket entropy of an enlarged set. In many applications, those sets are parameterized by a finite-dimensional parameter. In such cases:
- Lipschitz properties allow one to compute bracket entropies easily, as in van der Vaart (1998, p. 271). We give some examples in the text.
- The covariance structure of the isonormal process may be computed in an explicit way and identified with the covariance function of a Gaussian field with real parameters.
We shall describe in this section how it applies to mixture models. We first recall the general result of Gassiat (2002) and its application to a very simple contamination mixture model: one has to test between a particular known population with some density $f_0(\cdot)$ and a mixture of this known one and another one with density $f_t(\cdot)$, t a multidimensional parameter. Then we detail the case of two populations with a possibly unknown nuisance parameter. A typical example will be that of translation mixtures with a possibly unknown scale parameter. We end the section by giving general considerations on how to deal with parametric mixture models allowing an unknown nuisance parameter, and by setting a general result for such situations.
Assume one would like to use the LRT for testing $(H_0): g\in\mathcal M_0$ against $(H_1): g\in\mathcal M$, where g is the generic density of the i.i.d. observations $X_1,\dots,X_n$, and $\mathcal M_0\subset\mathcal M$ are sets of densities with respect to some measure $\mu$ on $\mathbb R^k$ (or more generally on some Polish space).
Let $\ell_n(g) = \sum_{i=1}^n \log g(X_i)$ be the log-likelihood, and let
$$\lambda_n = \sup_{g\in\mathcal M} \ell_n(g) - \sup_{g\in\mathcal M_0} \ell_n(g)$$
be the LRT statistic. Let also $g_0$ be a density in $\mathcal M_0$ that will denote the true (unknown) density of the observations. In the first examples considered, and without loss of generality, we will assume that $g_0$ coincides with $f_0$.
Throughout the paper we use $\|\cdot\|_2$ to denote the norm in $L^2(g_0)$.
When studying $\ell_n(g)-\ell_n(g_0)$, the functions $\frac{g-g_0}{g_0}$ appear naturally. Define the set $\mathcal S$ as the subset of the unit sphere of $L^2(g_0)$ formed by such functions when normalized:
$$\mathcal S = \left\{ \frac{g-g_0}{g_0} \Big/ \Big\|\frac{g-g_0}{g_0}\Big\|_2,\quad g\in\mathcal M\setminus\{g_0\} \right\}, \qquad (2)$$
and $\mathcal S_0$ its subset when $g\in\mathcal M_0$:
$$\mathcal S_0 = \left\{ \frac{g-g_0}{g_0} \Big/ \Big\|\frac{g-g_0}{g_0}\Big\|_2,\quad g\in\mathcal M_0\setminus\{g_0\} \right\}. \qquad (3)$$

A bracket [L, U] of length $\varepsilon$ is the set of functions b such that $L\le b\le U$, where L and U are functions in $L^2(g_0)$ such that $\|U-L\|_2\le\varepsilon$. Define $H_{[\,],2}(\mathcal S,\varepsilon)$, the entropy with bracketing of $\mathcal S$ with respect to the norm $\|\cdot\|_2$, as the logarithm of the number of brackets of length $\varepsilon$ needed to cover $\mathcal S$. To apply the theorem in Gassiat (2002), the only assumption needed is:
$$\int_0^1 \sqrt{H_{[\,],2}(\mathcal S,\varepsilon)}\;d\varepsilon < +\infty. \qquad (4)$$

This assumption implies in particular that $\mathcal S$ is Donsker and that its closure is compact. As said before, when $\mathcal M$ is parameterized, $\mathcal S$ is also parameterized and smoothness properties will allow one to verify (4). But in general the parameterization will not be continuous throughout $\mathcal S$. The delicate point may be that one has to find all possible limit points, in $L^2(g_0)$, of sequences $\frac{g_n-g_0}{g_0}\big/\big\|\frac{g_n-g_0}{g_0}\big\|_2$ when $\big\|\frac{g_n-g_0}{g_0}\big\|_2$ tends to 0. The set $\mathcal D$ (resp. $\mathcal D_0$) of limit points of sequences $\frac{g_n-g_0}{g_0}\big/\big\|\frac{g_n-g_0}{g_0}\big\|_2$, where $\big\|\frac{g_n-g_0}{g_0}\big\|_2$ tends to 0 and $g_n\in\mathcal M\setminus\{g_0\}$ (resp. $g_n\in\mathcal M_0\setminus\{g_0\}$), will be parameterized in such a way that Lipschitz properties can be used on subsets.
Let us for example see how it applies to the simple contamination mixture model (1). In this case,
$$\mathcal M_0 = \{f_0\},\qquad \mathcal M = \{g_{\theta,t} = (1-\theta)\,f_0 + \theta\,f_t,\ 0\le\theta\le 1,\ t\in[-T,T]\}$$
for a given positive real number T. Since $\mathcal M_0$ is a singleton, we do not need to define $\mathcal S_0$ and $\mathcal D_0$. One has $\frac{g_{\theta,t}-g_0}{g_0} = \theta\,\frac{f_t-f_0}{f_0}$, so that
$$\mathcal S = \left\{ d_t = \frac{f_t-f_0}{f_0}\Big/\Big\|\frac{f_t-f_0}{f_0}\Big\|_2,\quad t\in[-T,0)\cup(0,T] \right\}.$$
If $\big\|\frac{f_t-f_0}{f_0}\big\|_2 = 0$ occurs if and only if t = 0, then under smoothness assumptions $\big\|\frac{g_{\theta_n,t_n}-g_0}{g_0}\big\|_2$ tends to 0 if and only if $\theta_n$ or $t_n$ tends to 0. Then $d_{t_n}$ has two possible limit points (depending on the sign of $t_n$), and
$$\mathcal D = \left\{ d_t,\ t\in[-T,0)\cup(0,T],\quad d_{0^-} = -\frac{f_0'/f_0}{\|f_0'/f_0\|_2},\quad d_{0^+} = \frac{f_0'/f_0}{\|f_0'/f_0\|_2} \right\}.$$
Here derivatives are taken with respect to the parameter t. Again under smoothness assumptions, it will be possible to prove, considering $\{d_t,\ t\in[-T,0),\ d_{0^-}\}$ and $\{d_t,\ t\in(0,T],\ d_{0^+}\}$, that the number of brackets of length $\varepsilon$ needed to cover $\mathcal S$ is of order at most $O(1/\varepsilon)$, so that Assumption (4) holds. (A complete proof is given below for contamination models with multidimensional parameterization.)
In general, when $\mathcal M_0$ contains more than one density, $\mathcal D_0\subset\mathcal D$, and if the parameterization is smooth enough it will be possible to define a set $\mathcal U$ in $\mathbb R^{k_0}\times\mathbb R^{k_1}$ and a set $\mathcal U_0$ in $\mathbb R^{k_0}$ such that
$$\mathcal D = \{d_u,\ u\in\mathcal U\}\qquad\text{and}\qquad \mathcal D_0 = \{d_{(v,0)},\ v\in\mathcal U_0\}.$$
Define the covariance function $r(\cdot,\cdot)$ on $\mathcal U\times\mathcal U$ by
$$r(u_1,u_2) = \int d_{u_1}\,d_{u_2}\;g_0\,d\mu.$$
Then, under (4), applying Theorem 3.1 in Gassiat (2002),
$$2\lambda_n = \sup_{u\in\mathcal U}\Big(\max\Big\{\frac{1}{\sqrt n}\sum_{i=1}^n d_u(X_i),\,0\Big\}\Big)^2 - \sup_{v\in\mathcal U_0}\Big(\max\Big\{\frac{1}{\sqrt n}\sum_{i=1}^n d_{(v,0)}(X_i),\,0\Big\}\Big)^2 + o_{P_0}(1),$$

so that $2\lambda_n$ converges in distribution to
$$\sup_{u\in\mathcal U}\big(\max\{Z(u),0\}\big)^2 - \sup_{v\in\mathcal U_0}\big(\max\{Z(v,0),0\}\big)^2, \qquad (5)$$
where $Z(\cdot)$ is the Gaussian process on $\mathcal U$ with covariance $r(\cdot,\cdot)$ and $P_0$ is the joint distribution of the observations $X_1,\dots,X_n$ under the null hypothesis. In the particular case when $\mathcal M_0$ is reduced to a single element, a direct application of Corollary 3.1 of Gassiat (2002) gives that $2\lambda_n$ converges in distribution to
$$\sup_{u\in\mathcal U}\big(\max\{Z(u),0\}\big)^2. \qquad (6)$$
It will be seen in the examples below that $r(\cdot,\cdot)$ is in general not continuous everywhere on the closure of $\mathcal U\times\mathcal U$. $Z(\cdot)$ is thus not a continuous Gaussian field; however, the isonormal process on $\mathcal D$ is continuous, so that the suprema involved in (5) are a.s. finite. In general, $r(\cdot,\cdot)$ is continuous almost everywhere. In the simple contamination mixture model (1), for non-null s and t,
$$r(s,t) = \int \frac{f_t-f_0}{f_0}\Big/\Big\|\frac{f_t-f_0}{f_0}\Big\|_2\;\cdot\;\frac{f_s-f_0}{f_0}\Big/\Big\|\frac{f_s-f_0}{f_0}\Big\|_2\;\;f_0\,d\mu; \qquad (7)$$
r is continuous for non-zero s and t and admits the following limits:
$$r(0^+,0^+) = r(0^-,0^-) = 1,\qquad r(0^+,0^-) = -1.$$
It is also proved in Gassiat (2002) that, if the densities $g_n$ in $\mathcal M\setminus\mathcal M_0$ are such that $\frac{g_n-g_0}{g_0}\big/\big\|\frac{g_n-g_0}{g_0}\big\|_2$ converges to some $d_{u_0}$ with $\sqrt n\,\big\|\frac{g_n-g_0}{g_0}\big\|_2$ tending to a positive constant c, then the distributions $(g_0\cdot\mu)^{\otimes n}$ and $(g_n\cdot\mu)^{\otimes n}$ are mutually contiguous, and $2\lambda_n$ converges in distribution under this contiguous alternative to
$$\sup_{u\in\mathcal U}\big(\max\{Z(u)+c\,r(u,u_0),\,0\}\big)^2 - \sup_{v\in\mathcal U_0}\big(\max\{Z(v,0)+c\,r((v,0),u_0),\,0\}\big)^2. \qquad (8)$$
In general, (5) and (8) reduce to the square of only one supremum, due to the particular structure of the Gaussian process.
We will see, in the subsequent subsections, examples such as translation mixtures and exponential families, in particular Bernoulli or Gaussian mixtures.

2.1 Contamination mixture.
We consider here the contamination mixture model where the parameter t may be multidimensional: $t\in\mathbf T$, $\mathbf T$ being a compact subset of $\mathbb R^k$ such that 0 belongs to the interior of $\mathbf T$. Let $\|\cdot\|$ and $\langle\cdot,\cdot\rangle$ denote the Euclidean norm and scalar product in $\mathbb R^k$. Again,
$$\mathcal M_0 = \{f_0\},\qquad \mathcal M = \{g_{\theta,t} = (1-\theta)\,f_0 + \theta\,f_t,\ 0\le\theta\le 1,\ t\in\mathbf T\},$$
$$\mathcal S = \left\{ d_t = \frac{f_t-f_0}{f_0}\Big/\Big\|\frac{f_t-f_0}{f_0}\Big\|_2,\quad t\in\mathbf T \right\}.$$

We shall use the following Assumptions (CM), ensuring smoothness and some non-degeneracy:
(CM)
- $f_t = f_0$ a.e. if and only if t = 0.
- $t\mapsto f_t$ is twice continuously differentiable a.e. at any $t\in\mathbf T$.
- There exists $\varepsilon>0$ such that, for all $\beta\in\mathbb R^k$ and all $t\in\mathbf T$ with $\|t\|\le\varepsilon$, $\sum_{i=1}^k \beta_i\,\frac{\partial f_t}{\partial t_i} = 0$ a.e. if and only if $\beta = 0$.
- There exists a positive real $\varepsilon$ and a function $B\in L^2(f_0)$ that upper bounds all the following functions:
$$\frac{f_t}{f_0},\quad \frac{1}{f_0}\,\frac{\partial f_t}{\partial t_i},\ i=1,\dots,k,\ t\in\mathbf T,\qquad \frac{1}{f_0}\,\frac{\partial^2 f_t}{\partial t_i\,\partial t_j},\ i,j=1,\dots,k,\ t\in\mathbf T,\ \|t\|\le\varepsilon.$$
Notice that in this assumption the real number $\varepsilon$ is fixed. We shall prove that condition (4) holds true for $\mathcal S$ by splitting it into the two sets
$$\mathcal S_1 = \{d_t,\ t\in\mathbf T,\ \|t\|\ge\varepsilon\}\qquad\text{and}\qquad \mathcal S_2 = \{d_t,\ t\in\mathbf T,\ \|t\|<\varepsilon\}.$$
Since $\|(g_{\theta,t}-f_0)/f_0\|_2 = \theta\,\|(f_t-f_0)/f_0\|_2$ tends to 0 as soon as $\theta$ or $\|t\|$ tends to 0, it is easy to see that a limit point exists only if either t converges to a limit different from 0, or $t/\|t\|$ converges to some $\beta$. One obtains easily
$$\mathcal D = \left\{ d_t = \frac{f_t-f_0}{f_0}\Big/\Big\|\frac{f_t-f_0}{f_0}\Big\|_2,\ t\in\mathbf T\setminus\{0\} \right\}\;\cup\;\left\{ d_\beta = \frac{1}{f_0}\sum_{i=1}^k \beta_i\,\frac{\partial f_0}{\partial t_i}\Big/\Big\|\frac{1}{f_0}\sum_{i=1}^k \beta_i\,\frac{\partial f_0}{\partial t_i}\Big\|_2,\ \|\beta\|=1 \right\}.$$
Set $h_t = \dfrac{f_t-f_0}{f_0}$.

Then, for i = 1, . . . , k, if (CM) holds,


ht
ti

t
dt
= ti
ti
ht 2

ht

ht
ht

f0 d

ht
.
ht 2

This proves that, there exists a constant C such that, for all t and s such that t and
s , |dt ds | C B t s , so that the number of brackets of length needed to
cover S1 is of order at most O(1/ k ) and that Condition (4) holds true for the set S1 .
Now, for any T such that = 1, one has letting t = , R,

(d ) =

k
ht
i=1 i ti

ht

k
ht
i=1 i ti

ht

ht
ht

f0 d

and
in [0, ] , such that
But using Taylor expansions, there exists ,
k

ht =

i
i=1

h
h0 2

=
i
+
ti
ti
2
i=1

i j
i,j=1

2 h
,
ti tj

ht
.
ht 2

i
i=1

ht
=
ti

i
i=1

2 h
h0

i j
+
.
ti
t
t
i j
i,j=1

All this leads to


Ht
h
d

i=1 i ti

(d ) =

with
Ht =

Ht
h
d

i=1 i ti

i=1

f0 d

ht
.
ht 2

ht
ht

2 h
ht
2 h
1
1

i j
i j
2 ht =

.
ti

ti tj
2 i,j=1
ti tj
i,j=1

But using (CM), this implies that for some constant C, (0, ],

(d ) C B,

and that
lim

(d ) = d

1
2

k
2 h0
i,j=1 i j ti tj
k
h0
i=1 i ti 2

1
2

k
h0
i=1 i ti
k
h0
i=1 i ti 2

k
2 h0
i,j=1 i j ti tj
k
h0
i=1 i ti 2

k
h0
i=1 i ti
k
h0
i=1 i ti 2

f0 d,

so that d is continuously differentiable on [0, ]. Using the fact that


|d d |

d d + d d + d d

C B ( + ) + d d

and that, using (CM), there exists a positive constant C such that
k

inf

=1

i
i=1

h0
ti

C,

we obtain that for some constant C , and any , such that = 1,


d d

= 1,

C B .

It is straightforward to see that the number of brackets of length needed to cover S 2 is of


order at most O(1/ k ). Thus Assumptions (CM) imply Condition (4).
Define now, for all non-null s and t in $\mathbf T$,
$$r(s,t) = \int d_s\,d_t\;f_0\,d\mu, \qquad (9)$$
and let $Z(\cdot)$ be the Gaussian field on $\mathbf T\setminus\{0\}$ with covariance r. Notice that, in each direction $\beta$ such that $t\to 0$ with $t/\|t\|\to\beta$, one may extend $r(\cdot,\cdot)$ by continuity, setting
$$r(\beta,t) = r(t,\beta) = \int d_\beta\,d_t\;f_0\,d\mu; \qquad r(\beta,\beta') = \int d_\beta\,d_{\beta'}\;f_0\,d\mu. \qquad (10)$$

Let $\theta_n$ and $t_n$ be sequences such that
- $\lim_{n\to+\infty}\sqrt n\,\theta_n\,\|(f_{t_n}-f_0)/f_0\|_2 = c$ for some positive c,
- either $t_n$ tends to some $t_0\neq 0$ and $\sqrt n\,\theta_n$ tends to some positive constant, or $t_n$ tends to 0 and $t_n/\|t_n\|$ converges to some limit $\beta$.
Then:

Theorem 1 Assume (CM). Then $(f_0\cdot\mu)^{\otimes n}$ and $[((1-\theta_n)\,f_0 + \theta_n\,f_{t_n})\cdot\mu]^{\otimes n}$ are mutually contiguous, and $2\lambda_n$ converges under $(f_0\cdot\mu)^{\otimes n}$ in distribution to
$$\sup_{t\in\mathbf T}\big(\max\{Z(t),0\}\big)^2 = \Big(\sup_{t\in\mathbf T} Z(t)\Big)^2,$$
and under $[((1-\theta_n)\,f_0 + \theta_n\,f_{t_n})\cdot\mu]^{\otimes n}$ to
$$\sup_{t\in\mathbf T}\big(\max\{Z(t)+\omega(t),0\}\big)^2 = \Big(\sup_{t\in\mathbf T}\big(Z(t)+\omega(t)\big)\Big)^2,$$
with
$$\omega(t) = c\,r(t,t_0)\ \text{ if } t_n\to t_0\neq 0,\qquad \omega(t) = c\,r(t,\beta)\ \text{ if } t_n\to 0\ \text{and}\ t_n/\|t_n\|\to\beta. \qquad (11)$$
Remark: Set $m\equiv 0$ under $(f_0\cdot\mu)^{\otimes n}$ and $m\equiv\omega$ under $[((1-\theta_n)\,f_0 + \theta_n\,f_{t_n})\cdot\mu]^{\otimes n}$. Letting t go to 0 radially in two opposite directions and using covariance properties in the neighbourhood of 0, we see that almost surely $\sup_{t\in\mathbf T}(Z(t)+m(t)) > 0$, which justifies the equalities in the preceding theorem.
Let us give applications of this theorem to particular models:

2.1.1 Translation mixtures

We consider the translation mixture model, where $\mu$ is the Lebesgue measure and
$$f_t(\cdot) = f_0(\cdot - t).$$
Then it is easy to see that Theorem 1 applies as soon as the following Assumptions (CTM) hold:
(CTM)
- $f_0$ is positive on $\mathbb R^k$,
- $x\mapsto f_0(x)$ is twice continuously differentiable a.e.,
- there exists a function $B\in L^2(f_0)$ that upper bounds all the following functions:
$$\frac{f_0(x-t)}{f_0(x)},\quad \frac{1}{f_0(x)}\,\frac{\partial f_0}{\partial x_i}(x-t),\ i=1,\dots,k,\ t\in\mathbf T,\qquad \frac{1}{f_0(x)}\,\frac{\partial^2 f_0}{\partial x_i\,\partial x_j}(x-t),\ i,j=1,\dots,k,\ t\in\mathbf T,\ \|t\|\le\varepsilon.$$
Indeed, since $\frac{\partial f_t}{\partial t_i}(x) = -\frac{\partial f_0}{\partial x_i}(x-t)$, if $\beta$ is such that $\sum_{i=1}^k\beta_i\,\frac{\partial f_t}{\partial t_i} = 0$ a.e. for all $\|t\|\le\varepsilon$, then $\sum_{i=1}^k\beta_i\,\frac{\partial f_0}{\partial x_i} = 0$ a.e., so that $f_0(x+\lambda\beta) = f_0(x)$ for all $\lambda\in\mathbb R$, which is impossible unless $\beta = 0$.

Here are some examples of situations in which these assumptions are met: $f_0$ being the inverse of a polynomial of degree at least 2 (among which the Cauchy density), the Gaussian densities, and the normalization of $\cosh(x)^{-1}$.
The covariance function r is given, for non-null s and t, by
$$r(s,t) = \frac{\displaystyle\int \frac{f_0(x-s)\,f_0(x-t)}{f_0(x)}\,d\mu(x) - 1}{\sqrt{\displaystyle\int \frac{f_0(x-s)^2}{f_0(x)}\,d\mu(x) - 1}\;\sqrt{\displaystyle\int \frac{f_0(x-t)^2}{f_0(x)}\,d\mu(x) - 1}},$$
and if the dimension k = 1, one may define $r(0^+,0^-) = -1$ and, for non-null t,
$$r(0^+,t) = -r(0^-,t) = \frac{\displaystyle\int \frac{f_0'(x)\,f_0(x-t)}{f_0(x)}\,d\mu(x)}{\sqrt{\displaystyle\int \frac{f_0(x-t)^2}{f_0(x)}\,d\mu(x) - 1}\;\sqrt{\displaystyle\int \frac{f_0'(x)^2}{f_0(x)}\,d\mu(x)}},$$
where the derivative is with respect to x.
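As an illustration of this formula, here is a minimal numerical sketch (ours, not from the paper) that evaluates r(s, t) by quadrature for k = 1 when $f_0$ is the Cauchy density; the choice of $f_0$ and the use of scipy.integrate.quad are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import cauchy

f0 = cauchy.pdf  # example choice: the standard Cauchy density

def inner(s, t):
    """Compute  int f0(x - s) f0(x - t) / f0(x) dx  -  1."""
    val, _ = quad(lambda x: f0(x - s) * f0(x - t) / f0(x), -np.inf, np.inf)
    return val - 1.0

def r_translation(s, t):
    """Covariance r(s, t) of the limiting Gaussian process Z for the
    translation contamination mixture (non-null s and t)."""
    return inner(s, t) / np.sqrt(inner(s, s) * inner(t, t))

print(r_translation(1.0, 2.0), r_translation(1.0, -1.0))
```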

2.1.2 Gaussian mixtures

Without loss of generality we may assume that $f_0$ is standard normal. Let K be a bound for $\|t\|$, $t\in\mathbf T$. Then the following bounds show that the function B exists for any $\varepsilon$:
$$\frac{f_0(x-t)}{f_0(x)} = \exp\big(\langle x,t\rangle - \|t\|^2/2\big) \le \exp\big(K\,\|x\|\big),$$
$$\frac{1}{f_0(x)}\Big|\frac{\partial f_0}{\partial x_i}(x-t)\Big| = |x_i-t_i|\,\frac{f_0(x-t)}{f_0(x)} \le (\|x\|+K)\,\exp\big(K\,\|x\|\big),$$
$$\frac{1}{f_0(x)}\Big|\frac{\partial^2 f_0}{\partial x_i\,\partial x_j}(x-t)\Big| = |x_i-t_i|\,|x_j-t_j|\,\frac{f_0(x-t)}{f_0(x)} \le (\|x\|+K)^2\,\exp\big(K\,\|x\|\big),\quad i\neq j,$$
$$\frac{1}{f_0(x)}\Big|\frac{\partial^2 f_0}{\partial x_i^2}(x-t)\Big| = \big|(x_i-t_i)^2-1\big|\,\frac{f_0(x-t)}{f_0(x)} \le \big[1+(\|x\|+K)^2\big]\,\exp\big(K\,\|x\|\big).$$
So (CTM) holds, and Theorem 1 applies, as soon as $f_0$ is some Gaussian density on $\mathbb R^k$ and $\mathbf T$ is compact. The covariance of the process Z is:
$$r(s,t) = \frac{\exp(\langle t,s\rangle) - 1}{\sqrt{\exp(\|t\|^2)-1}\;\sqrt{\exp(\|s\|^2)-1}}.$$
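For k = 1 this covariance is easy to evaluate, and the limiting variable $\sup_t\big(\max\{Z(t),0\}\big)^2$ of (6) can be approximated by Monte-Carlo on a grid. The following sketch (ours, not from the paper) does this; the grid, the exclusion of a neighbourhood of the discontinuity at t = 0 and the diagonal jitter are numerical conveniences, not part of the theory.

```python
import numpy as np

def r_gauss(s, t):
    """Covariance of Z for the Gaussian contamination mixture (scalar means)."""
    return (np.exp(s * t) - 1.0) / np.sqrt(
        (np.exp(s**2) - 1.0) * (np.exp(t**2) - 1.0)
    )

def simulate_sup(T=5.0, step=0.05, n_rep=10_000, seed=0):
    """Monte-Carlo draws of sup_t (max{Z(t), 0})^2 over a grid of t in
    [-T, -step] union [step, T] (t = 0 is excluded because r is singular there)."""
    rng = np.random.default_rng(seed)
    grid = np.concatenate([np.arange(-T, 0, step), np.arange(step, T + step, step)])
    C = r_gauss(grid[:, None], grid[None, :])
    L = np.linalg.cholesky(C + 1e-10 * np.eye(len(grid)))  # jitter for stability
    Z = rng.standard_normal((n_rep, len(grid))) @ L.T
    return np.max(np.maximum(Z, 0.0), axis=1) ** 2

draws = simulate_sup()
print("approximate 95% quantile of the limit of 2*lambda_n:", np.quantile(draws, 0.95))
```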

2.1.3 Binomial mixtures

Here $\mu$ is the measure with density $\frac{k!}{x!\,(k-x)!}$ with respect to the counting measure on the set $\{0,1,\dots,k\}$. We consider the binomial family $Bi(k,p)$ with density $p^x(1-p)^{k-x}$, $x = 0,1,\dots,k$. Let $p_0\in(0,1)$ and let $f_t$ be the density of $Bi(k,p_0+t)$. The most relevant case for genetic applications is the case $p_0 = 1/2$; see Problem 1 in Chernoff and Lander (1995). We have
$$f_t(x) = (t+p_0)^x\,(1-t-p_0)^{k-x},$$
$$\frac{\partial f_t}{\partial t}(x) = \Big(\frac{x}{t+p_0} - \frac{k-x}{1-t-p_0}\Big)\,f_t(x),$$
$$\frac{\partial^2 f_t}{\partial t^2}(x) = \bigg[\Big(\frac{x}{t+p_0} - \frac{k-x}{1-t-p_0}\Big)^2 - \frac{x}{(t+p_0)^2} - \frac{k-x}{(1-t-p_0)^2}\bigg]\,f_t(x).$$
It is clear that $f_t(x)$ and $\frac{\partial f_t}{\partial t}(x)$ are uniformly upper bounded and that $\frac{\partial^2 f_t}{\partial t^2}(x)$ is upper bounded for $|t|$ small enough, proving Assumptions (CM). Direct calculations lead to
$$\frac{f_t-f_0}{f_0}(x) = \Big(1+\frac{t}{p_0}\Big)^x\Big(1-\frac{t}{1-p_0}\Big)^{k-x} - 1,$$
$$r(s,t) = \frac{\Lambda(s,t)}{\sqrt{\Lambda(s,s)}\,\sqrt{\Lambda(t,t)}},$$
with
$$\Lambda(s,t) = \sum_{x=0}^k \bigg[\Big(1+\frac{s}{p_0}\Big)^x\Big(1-\frac{s}{1-p_0}\Big)^{k-x}-1\bigg]\,\bigg[\Big(1+\frac{t}{p_0}\Big)^x\Big(1-\frac{t}{1-p_0}\Big)^{k-x}-1\bigg]\,\binom{k}{x}\,p_0^x\,(1-p_0)^{k-x},$$
which is equivalent to the result of Chernoff and Lander (1995).
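A short numerical sketch of this covariance (ours; the values k = 4 and $p_0 = 1/2$ are only illustrative, with $|s|, |t|$ small enough that $p_0 + t$ stays in (0, 1)) is the following.

```python
import numpy as np
from math import comb

def r_binomial(s, t, k=4, p0=0.5):
    """Covariance r(s, t) of Z for the binomial contamination mixture Bi(k, p0 + t),
    computed from the finite sum Lambda(s, t) over x = 0, ..., k."""
    def h(u, x):  # (f_u - f_0) / f_0 evaluated at x
        return (1 + u / p0) ** x * (1 - u / (1 - p0)) ** (k - x) - 1.0
    def lam(u, v):
        return sum(
            h(u, x) * h(v, x) * comb(k, x) * p0 ** x * (1 - p0) ** (k - x)
            for x in range(k + 1)
        )
    return lam(s, t) / np.sqrt(lam(s, s) * lam(t, t))

print(r_binomial(0.1, 0.3), r_binomial(0.2, -0.2))
```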

2.1.4 Mixtures in exponential families

This case generalizes the preceding one. Let $f_t$ be a regular exponential family with exhaustive statistic $T(x) = (T_1(x),\dots,T_k(x))$:
$$f_t(x) = f_0(x)\,\exp\Big(\sum_{i=1}^k t_i\,T_i(x) - \psi(t)\Big),$$
and assume $\mathbf T$ is a compact subset of the interior of the definition set of the exponential family. Then $t\mapsto f_t$ is infinitely differentiable on $\mathbf T$. Let $F(x) = \sup_{t\in\mathbf T}\exp\big(\sum_{i=1}^k t_i\,T_i(x)\big)$. Assumption (CEM) will be:
(CEM) There exists B in $L^2(f_0)$ that upper bounds all the following functions: $F$, $|T_i|\,F$, $|T_i\,T_j|\,F$, $i,j=1,\dots,k$.
One can see easily that (CEM) implies (CM), so that Theorem 1 applies to exponential families as soon as (CEM) holds. Direct calculations again lead to
$$\frac{f_t-f_0}{f_0}(x) = \exp\Big(\sum_{i=1}^k t_i\,T_i(x) - \psi(t)\Big) - 1,$$
$$r(s,t) = \frac{\exp\big(\psi(s+t)-\psi(s)-\psi(t)\big) - 1}{\sqrt{\exp\big(\psi(2s)-2\psi(s)\big)-1}\;\sqrt{\exp\big(\psi(2t)-2\psi(t)\big)-1}}.$$
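Since the Gaussian and binomial cases above are instances of this formula, a generic numerical sketch is immediate (ours; the cumulant function $\psi$ is supplied by the user, and the Gaussian choice below is only a sanity check).

```python
import numpy as np

def r_expfam(s, t, psi):
    """Covariance r(s, t) of Z for a contamination mixture within a one-parameter
    exponential family with cumulant function psi (psi(0) = 0)."""
    num = np.exp(psi(s + t) - psi(s) - psi(t)) - 1.0
    den = np.sqrt((np.exp(psi(2 * s) - 2 * psi(s)) - 1.0)
                  * (np.exp(psi(2 * t) - 2 * psi(t)) - 1.0))
    return num / den

# Sanity check: psi(t) = t**2 / 2 (the N(t, 1) family) recovers the Gaussian formula.
psi_gauss = lambda t: t ** 2 / 2.0
print(r_expfam(1.0, 2.0, psi_gauss))  # equals (e^{st}-1)/sqrt((e^{s^2}-1)(e^{t^2}-1))
```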

2.2 Two populations against a single one

We consider here the case where one wants to test a single population in the family of densities $f_t$, $t\in\mathbf T$, $\mathbf T$ a compact subset of $\mathbb R^k$, against a mixture of two such populations. That is,
$$\mathcal M_0 = \{f_t,\ t\in\mathbf T\},$$
and
$$\mathcal M = \{g_{\theta,t_1,t_2} = (1-\theta)\,f_{t_1} + \theta\,f_{t_2},\ 0\le\theta\le 1,\ t_1\in\mathbf T,\ t_2\in\mathbf T\}.$$
We suppose moreover that 0 is an interior point of $\mathbf T$ and that $f_0$ is the unknown distribution of the observations (with no loss of generality). We shall use Assumptions (TP), ensuring smoothness and some non-degeneracy:
(TP)
(1 )ft1 + ft2 = f0 a.e. if and only if ( = 0 and t1 = 0) or ( = 1 and t2 = 0)
or (t1 = 0 and t2 = 0),
t ft is three times continuously differentiable a.e. at any t T,
k

t
Rk , t T, s T, 0, (fs f0 ) + i=1 i f
ti = 0 a.e. if and only if
s = 0 and = 0,
2
k
ft
and > 0, such that Rk , t T with t i,j=1 i j t i t
= 0 a.e. if
j
and only if = 0,

there exists a function B L2 (f0 ) that upper bounds all following functions:
1 2 ft
ft 1 ft
,
, i, j = 1, . . . , k, t T,
,
f0 f0 ti
f0 ti tj
3 ft
1
, i, j, l = 1, . . . , k, t T, t .
f0 ti tj tl


Then S D, S0 D0 , and D can be parameterized as follows:


D=

dt,a, =

k
1 f0
i=1 i f0 ti
,
k
1 f0
i=1 i f0 ti 2

0
a ftff
+
0
0
a ftff
+
0

t T \ {0}, Rk , a 0, a + = 1 ,

D0 = {d0,0, , = 1} .
Let r(, ) be as in Section 2.1:
hs
hs

r(s, t) =

ht
ht

f0 d
2

with ht = (ft f0 )/f0 , and Z() the associated Gaussian field.


Let W be the k-dimensional centered Gaussian variable with variance with entries :
1 f0
f0 tj
1 f0
f0 tj 2

1 f0
f0 ti
1 f0
f0 ti 2

i,j =

f0 d, i, j = 1, . . . , k,

and for any t, let C(t) be the k-dimensional vector of covariances of Z(t) and W :
C(t)i =

1 f0
f0 ti
1 f0
f0 ti 2

ht
ht

f0 d, i = 1, . . . , k.
2

Then D can be reparametrized as follows


D = dt,a, ; t T \ {0}, Rk , a 0, a2 + T + 2a T C(t) = 1 .
Using the same tricks as for proving Theorem 1, 2 n converges under (f0 )n in distribution to

sup
(aZ(t) + , W )

a 0, t T, Rk
a2 + T + 2a T C(t) = 1

Remark that:

sup

, W

(12)

T =1

sup

= W T 1 W,

, W

(13)

T =1

and that the supremum is attained for colinear to 1 W . Then consider the matrix:
=

C(t)T

1
C(t)

with inverse

uT
u M

1 =
()
1

where M = M (t) = C(t)C(t)T


, u = u(t) = M (t)C(t), = (t) = 1 +
C(t)T M (t)C(t). Now consider the maximization problem in a and :
2

sup

(aZ(t) + , W )

a0,a2 + T +2a T C(t)=1

If the maximum is attained for a > 0 , then by (13) its value is


Z(t)
W

1
()


Z(t)
W

(14)

which is equal to
Z(t) +

uT W

+ WT M

uuT
uT W
W = Z(t) +

+ W T 1 W.

This is the case when the first coordinate of


1
()

Z(t)
W

is non-negative that is Z(t) + u, W 0. In the other case (a = 0) the maximum is


W T 1 W by (13). Finally we have proved that the supremum in (14) is equal to
max Z(t)

C(t)T M (t)W
,0
1 + C(t)T M (t)C(t)

1 + C(t)T M (t)C(t) + W T 1 W.

(15)

This implies that the limit of 2 n under (f0 )n is equal in distribution to


sup Z(t)
tT

C(t)T M (t)W
1 + C(t)T M (t)C(t)

1 + C(t)T M (t)C(t) .

Indeed one may see, letting t go to 0 radially in two opposite directions, that the supremum
of the Gaussian process involved in formula (15) is non negative. Let now n , tn1 and tn2
(1n )ftn +n ftn f0

(1n )ftn +n ftn f0

1
2
1
2
/
be sequences such that
2 tends to some dt0 ,a0 ,0 in
f0
f0
(1n )ftn1 +n ftn2 f0
D, with limn+ n
2 = c for some positive constant c. Then, using
f0
the same tricks again:

Theorem 2 Assume (TP). Then (f0 )n and [((1 n )ftn1 + n ftn2 ) ]n are mutually
contiguous, 2 n converges under (f0 )n in distribution to
sup Z(t)
tT

C(t)T M (t)W
1 + C(t)T M (t)C(t)

1 + C(t)T M (t)C(t) ,

and under [((1 n )ftn1 + n ftn2 ) ]n to


sup aZ(t) + a0 cr(t, t0 ) + cC(t)T 0
tT

C(t)T M (t)(W + c0 + ca0 C(to ))


1 + C(t)T M (t)C(t)

1 + C(t)T M (t)C(t) ,

where if t0 = 0 then a0 = 0.
Notice that, when t0 = 0, d0,a0 ,0 = d0,0,0 , and d0,0,0 , dt,a,, , = cC(t)T 0 + c0 . This is
why one has to take a0 = 0 when t0 = 0 in the last formula of Theorem 2.

2.2.1 Examples.
The results of Section 2.2 apply to the same examples as before.


Translation mixtures, under (CTM) with moreover x f0 (x) is three times continuously differentiable a.e., and the function B L2 (f0 ) is also an upper bound
for
1
3 f0
(x t) , i, j, l = 1, . . . , k, t T, t .
f0 (x) xi xj xl
Gaussian mixtures, in this case W is a standard normal vector and for all t T
C(t) = tt 2 .
e

Bernoulli mixtures,

Mixtures in exponential families, under (CEM) with moreover: the function


B L2 (f0 ) is also an upper bound for |Ti Tj Tl |F , i, j, l = 1, . . . , k. In this


case, W is the Gaussian vector with covariance the correlation matrix of the vector (T1 (X), . . . , Tk (X)), when X has density f0 . Recall that the variance matrix of
the vector (T1 (X), . . . , Tk (X)) when X has density f0 is the matrix D2 of second
derivatives of the function at point 0, and the vector C(t) is given by
C(t)i =


ti (t)

ti (0)

exp((2t) 2(t)) 1 (D2 (0))i,i

, i = 1, . . . , k.

2.3 Contamination with unknown nuisance parameter

We consider here the contamination mixture model with some unknown parameter, which is the same for all populations. A typical example may be that of mixtures of Gaussian distributions with the same unknown variance, or translation mixtures with the same unknown scale parameter. We shall assume that the nuisance parameter is identifiable, so that its maximum likelihood estimator is consistent. This will allow us to restrict the possible nuisance parameters in the definition of the set S to a neighbourhood of the true unknown one (recall that S is only a theoretical tool used to verify that some theorem applies and to compute the set of normalized scores, so that this does not restrict the model M, for which the nuisance parameter is not restricted to a neighbourhood of the true one).
Let $\mathcal F = \{f_{t,\gamma},\ t\in\mathbf T,\ \gamma\in A\}$ be a set of densities with respect to some dominating measure $\mu$, where $\mathbf T$ is a compact subset of $\mathbb R^k$ and A is a compact subset of $\mathbb R^h$. We consider here the case where
$$\mathcal M_0 = \{f_{0,\gamma},\ \gamma\in A\},$$
and
$$\mathcal M = \{g_{\theta,t,\gamma} = (1-\theta)\,f_{0,\gamma} + \theta\,f_{t,\gamma},\ 0\le\theta\le 1,\ t\in\mathbf T,\ \gamma\in A\}.$$
The unknown true distribution of the observations will be $f_{0,\gamma_0}$. We suppose that $(0,\gamma_0)$ is an interior point of $\mathbf T\times A$. We shall use Assumptions (CMN), ensuring smoothness and some non-degeneracy:
(CMN)
(1 )f0, + ft, = f0,0 a.e. if and only if = 0 and [ = 0 or t = 0],
(t, ) ft, is twice continuously differentiable a.e. at any (t, ) T A,
> 0, such that Rh , t T, A with 0 , 0:
f
h
= 0 a.e. if and only if t = 0 and = 0,
(ft,0 f0,0 ) + i=1 i 0,
i
k

t,0
0,
and Rk , t , 0 :
i=1 i ti +
i=1 i i = 0 a.e. if and
only if = 0 and = 0 .
There exists a function B L2 (f0,0 ) that upper bounds all following functions:

1
ft,
ft,
ft,
1
, i = 1, . . . , k,
, i = 1, . . . , h, (t, ) TA, 0 ,
,
f0,0 f0,0 ti
f0,0 i
1
f0,0
1
f0,0

2 ft,
2 ft,
1
, i, j = 1, . . . , k,
, i = 1, . . . , k, j = 1, . . . , h,
ti tj
f0,0 ti j
2 ft,
, i, j = 1, . . . , h, (t, ) T A, 0 , t .
i j

Then, since the maximum likelihood estimator of parameter is consistent, one only needs
to verify Assumption (4) for
S=

(1 )f0, + ft, f0,0 (1 )f0, + ft, f0,0


/
f0,0
f0,0

2,

0 1, t T, A, 0 ,

where we restrict our definition to , t and such that (1 )f0, + ft, differs from f0,0 .
One has also
S0 =

f0, f0,0 f0, f0,0


/
f0,0
f0,0

2,


0 1, A, 0 .

Define, for (t, , , ) T R+ Rh Rk ,


k

Ht,,, = (ft,0 f0,0 ) +


and

i
i=1

f0,0
f0,0
+
i
,
i
ti
i=1

Ht,,, /f0,0
.
Ht,,, /f0,0 2

dt,,, =

The sets D and D0 can be parameterized as follows:


D = dt,,, , t T, 0, Rh , Rk , 2 +

=1 ,

D0 = d0,0,,0, Rh , = 1 .
Note that due to the existence of the nuisance parameter which is fixed to 0 , now D does
not contain S.
It will be possible to obtain the asymptotic distributions in the same way as in Section
2.2. Let again
ht
hs
f0,0 d
r(s, t) =
hs 2
ht 2
with ht = (ft,0 f0,0 )/f0,0 , and Z() the associated Gaussian field.
Note that this process is the same as the one of Section 2.1 if we set f0 = f0,0 . Let also
f0,0
0
W , and C(t) be the same as in Section 2.1 replacing f
ti by
ti .
Let V be the h-dimensional centered Gaussian variable with variance :

f
f
i,j =

1
f0,0

f0,0

1
f0,0

0,0

i
f0,0
i

f0,0

0,0

j
f0,0
j

f0,0 d, i, j = 1, . . . , h,

and for any t, let G(t) be the h-dimensional vector of covariances of Z(t) and V :

1 f0,0
ht
f0,0 i

G(t)i =
f0,0 d, i = 1, . . . , h.
1 f0,0
h
t 2
2
f0,0

Let also S be the convariance matrix of W and V , with entries:

f
f
Si,j =

1
f0,0

1
f0,0

1
f0,0

0,0

i
f0,0
i

1
f0,0

0,0

tj
f0,0
tj

Define the matrices U (t) and N (t) by


U (t) =

N (t) =

f0,0 d, i = 1, . . . , h, j = 1, . . . , k.

C(t)T
G(t)

,
1

ST
S

U (t)U (t)T

(1n )f0,n +n ftn ,n f0,0


(1n )f0,n +n ftn ,n f0,0
/
f0,0
f0,0
(1n )f0,n +n ftn ,n f0,0
=
c for some
limn+ n
2
f0,0

Let n , tn and n be sequences such that

tends to some dt0 ,0 ,0 ,0 in D, with


positive constant c. Then, using the same tricks as for proving Theorem 2 :

Theorem 3 Assume (CMN). Then (f0,0 )n and [((1 n )f0,n + n ftn ,n ) ]n are


mutually contiguous, 2 n converges under (f0 )n in distribution to


sup Z(t)
tT

U (t)T N (t)
1 + U (t)T N (t)U (t)

W
V
+

1 + U (t)T N (t)U (t)


W
V

ST
S

W
V

V T 1 V,

and under [((1 n )f0,n + n ftn ,n ) ]n to


sup Z(t) + c0 r(t, t0 ) + cC(t)T 0 + cG(t)T 0
tT

. 1 + U (t)T N (t)U (t) +

W + c0 + c0 C(t0 )
V + c0 + c0 G(t0 )

U (t)T N (t)
1 + U (t)T N (t)U (t)
ST
S

W + c0 + c0 C(t0 )
V + c0 + c0 G(t0 )

W + c0 + c0 C(t0 )
V + c0 + c0 G(t0 )

(V + c0 + c0 G(t0 ))T 1 (V + c0 + c0 G(t0 )),


where 0 = 0 when t0 = 0.

2.3.1 Translation mixtures with unknown scale parameter

Here h = 1, $\mu$ is the Lebesgue measure and
$$f_{t,\gamma}(\cdot) = \gamma\,f_{0,1}\big(\gamma\,(\cdot - t)\big),$$
with A = [a, A] for some a > 0. Then it is easy to see that Theorem 3 applies as soon as the following Assumptions (CTMN) hold:
the following Assumptions (CTMN) hold:
(CTMN)
f0,1 is positive on Rk ,

x f0,1 (x) is twice continuously differentiable a.e.,

There exists a function B L2 (f0,1 ) that upper bounds all following functions:
f0,1 (x t)
f0,1
1
(1 + |xi |)
,
(x t) , i = 1, . . . , k, t T,
f0,1 (x)
f0,1 (x)
xi
1
2 f0,1
(x t) , i, j = 1, . . . , k, t T, t .
(1 + |xi ||xj |)
f0,1 (x)
xi xj
These assumptions are met when $f_{0,1}$ is the inverse of a polynomial of degree at least 2 (among which the Cauchy density), or the Gaussian densities and the normalization of $\cosh(x)^{-1}$.

2.3.2 Gaussian mixtures with unknown variance

Here h = k(k+1)/2, since $\gamma$ is the unknown variance. It is easy to see that Assumptions (CTMN) hold, and Theorem 3 applies, as soon as the $f_{t,\gamma}$ are the Gaussian distributions $N(t,\gamma)$ on $\mathbb R^k$, $\mathbf T$ is compact, and A is a compact subset of symmetric positive definite matrices.

2.4 General mixtures with unknown nuisance parameter

Let $\mathcal F = \{f_{t,\gamma},\ t\in\mathbf T,\ \gamma\in A\}$ be a set of densities with respect to some dominating measure $\mu$, where $\mathbf T$ is a compact subset of $\mathbb R^k$ and A is a compact subset of $\mathbb R^h$. The parameter t characterizes the population in the mixture, while the parameter $\gamma$ is the same for all populations. As a simple example one may think of Gaussian distributions (possibly multidimensional), with t the mean vector and $\gamma$ the variance matrix. One may define a


mixture with p populations as
$$g_{p,\pi,\mathbf T,\gamma} = \sum_{i=1}^{p} \pi_i\,f_{t_i,\gamma}. \qquad (16)$$

Here = (1 , . . . , p ) is a vector of non negative real numbers that sum to one, T =


(t1 , . . . , tp ) Tp and A. One would like to use the LRT for testing (H0 ) : g is a
mixture of p0 populations against (H1 ) : g is a mixture of p populations, where g is the
density of i.i.d. observations X1 , . . . , Xn , and p0 < p. This is the case when
p0

M0 =

gp0 ,0 ,T0 , , 0 [0, 1]p0 , T0 Tp0 , A,

i = 1, i = 1, . . . , p0

i=1

M=

gp,,T, , [0, 1]p , T Tp , A,

i = 1, i = 1, . . . , p .
i=1

To understand what happens and how to do the computations, the main point is to understand how two mixtures with possibly different numbers of populations may become close. The main weak identifiability Assumption (WID) will be that $g_{p,\pi,\mathbf T,\gamma} = g_{q,\pi',\mathbf T',\gamma'}$ if and only if $\gamma = \gamma'$ and $\sum_{i=1}^{p}\pi_i\,\delta_{t_i} = \sum_{i=1}^{q}\pi'_i\,\delta_{t'_i}$, where $\delta_z$ is the Dirac measure at z.
Then, if the parameterization $(t,\gamma)\mapsto f_{t,\gamma}$ is smooth enough, two mixtures become close if their parameters $\gamma$ become close and their mixing measures become close in the weak topology.
Let now g0 = gp0 ,0 ,T0 ,0 be a particular mixture in M0 which has exactly p0 populations
and not fewer, that will denote the true unknown density of the observations. We denote by
t0,i the elements of T0 . Since parameter is identifiable, its maximum likelihood estimator
is consistent under weak smoothness assumptions, so that to define the sets S and S0 by (2)
and (3), one may restrict by 0 for some small . Then, as seen in the previous
subsections, the main point is to find D and D0 , so as to be able to:
understand how parameterization and smoothness may be used to compute the order
of the bracketing entropy,
define the Gaussian process that is used in the limiting distribution.

For these points, smoothness assumptions and bounding with a square integrable function
have to be used together with some non degeneracy of functions that come in the norm
appearing in denominator, when this one goes to zero. In fact, if it may be degenerate, it
means that one has to go further in the order of the Taylor development until non degeneracy.
This, of course, depends on particular examples.
A rather general situation is the following. Let q = p p0 . Denote by Dt ft, the kdimensional vector of derivatives of ft, with respect to t, D ft, the h-dimensional vector
of derivatives of ft, with respect to , Dt2 ft, the k k-dimensional matrix of second
derivatives of ft, with respect to t. Introduce Assumptions (GM):
(GM)
(t, ) ft, is three times continuously differentiable a.e. at any (t, ) T A,
> 0 such that, for all i Rh , ti Rk , i Rk , i Rh , i R, i = 1, .., p0 , for
all 1 , . . . , q 0 such that i 0 , ti t0,i , i i + j j = 0 then :
q
p0
p0
p0
i
i
ti ,0 = 0 a.e.
i=1 i fti ,0 +
i=1 i fti,0 ,0 +
i=1 , D fti,0 ,i +
i=1 , Dt f
if and only if
q
p0
1
p0
= 0 and 1 = 0, . . . , p0 = 0,
i=1 i fti ,0 +
i=1 i ft0,i ,0 = 0, = 0, . . . ,
For any subset J of at most inf{p0 , q} points in T such that for each one there is one
of the t0,i s at distance at most , for any ( j )jJ of vectors of Rk , for any 1 , . . . , p0
in Rh :
p0
j T 2
j
1
p0
i
=
jJ ( ) Dt fj,0 ( ) = 0 a.e. if and only if = 0, . . . ,
i=1 , D ft0,i ,i +
j
0 and = 0, j J;


There exists a function B L2 (g0 ) that upper bounds all following functions:
ft, 1 ft,
1 ft,
, i = 1, . . . , k,
, i = 1, . . . , h, (t, ) T A, 0
,
g0 g0 ti
g0 i
1 2 ft,
, i, j = 1, . . . , k, (t, ) T A, 0
g0 ti tj
1 2 ft,
1 2 ft,
, i = 1, . . . , k, j = 1, . . . , h,
, i, j = 1, . . . , h,
g0 ti j
g0 i j
1
3 ft,
1
3 ft,
, i, j, l = 1, . . . , k,
, i, j = 1, . . . , k, l = 1, . . . , h
g0 ti tj tl
g0 ti tj l
(t, ) T A, 0 , t ti for some i.
Set = (( 1 )T , . . . , ( p0 )T ), with i Rh ; = (( 1 )T , . . . , ( p0 )T ), with i Rd ; T =
(t1 , . . . , tq ) Tq ; = (1 , . . . , q ) Rq ; = (1 , . . . , p0 ) [0, 1]p0 ,

i=1

i=1

i=1

and

i , Dt ft0,i ,0 ,

i , D ft0,i ,0 +

i ft0,i ,0 +

i fti ,0 +
i=1

p0

p0

p0

HT,,,, =

HT,,,, /g0
.
HT,,,,/g0 2

dT,,,, =
Define now:
K = (T, , , , ) : 1 , . . . , q 0;
1

+ . . . + p0

+ 1

i = 0;

i +
i

i
2

+ . . . + p0

i2 = 1; H(T,,,,) = 0 .

2i +

Then:
D = {dT,,,,, (T, , , , ) K} ,
and
D0 = {d0,0,0,,} .
It will be possible to obtain the asymptotic distributions in the same way as in Section 2.2
under Assumptions (WID) and (GM). Define the Gaussian field Z(T, , , , ) on K with
covariance
r ((T, , , , ), (T , , , , )) =

dT,,,, dT , , , , g0 d.

Notice that, as in previous sections, K is not closed, and r(, ) is not continuous on some
limiting points, but may be extended in some sense, as has been done for instance in Section
2.1.
g
g
Let also pn , n , Tn , n be such that n pn ,n ,Tg0n ,n 0 2 tends to some positive constant
g
g
g
g
c, with pn ,n ,Tn ,n 0 / pn ,n ,Tn ,n 0 2 tending to d in the closure of D.
g0

g0

Theorem 4 If (WID) and (GM) hold, then (g0 )n and (gpn ,n ,Tn ,n )n are mutually
contiguous, 2 n converges under (g0 )n in distribution to
2

sup
(T,,,,)K

Z(T, , , , )


sup
(0,0,0,,)K

Z(0, 0, 0, , )

and under (gpn ,n ,Tn ,n )n to


2

sup

0 d
dT,,,, dg

Z(T, , , , ) + c

(T,,,,)K
2

sup

Z(0, 0, 0, , ) + c

0 d
d0,0,0,,dg

. (17)

(0,0,0,,)K

It is possible to reduce the formula of the asymptotic distributions in Theorem 4 into only
one supremum, using linear algebra computations as in previous sections. We shall not give
the result for all situations since it involves too long and complicated formula. However, in
case q = 1, the result takes a simpler form that we will give below. For this one needs to
define notations. When q = 1, is reduced to and T reduces to t so that elements of D
may be written as dt,,, with

i=1

p0

p0

p0

Ht,,, =

i (ft,0 ft0,i ,0 ) +

i , Dt ft0,i ,0 .

i , D ft0,i ,0 +
i=1

i=1

0
where i=1
i 0.
Let W be the p0 (h + d)-dimensional centered Gaussian random variable with variance
such that for all and ,

H0,0,,
g0

2
2.

Let Z(t) be the (p0 )-dimensional centered Gaussian field with covariance the p0 p0 matrix
(, ) such that for all t1 , t2 ,
f f
f f

t0,i ,0

t,0

(t1 , t2 )i,j =

and let

g0
ft,0 ft0,i ,
0
g0

t0,j ,0

t,0

g0
ft,0 ft0,j ,
0
g0

g0 d,

= (t) = (t, t).


Define C = C(t) the p0 (h + d) p0 matrix such that for all (t, , , ),

p0

C(t) =
i=1

ft,0 ft0,i ,
0
g0
,
ft,0 ft0,i ,
0
2
g0

H0,0,,
g0

2,

and let A = A(t), U = U (t), M = M (t) be matrices such that


M
U

C1 C T

= M C

A =

(18)

C M C

(19)
1

(20)

Let 1 denote the p0 -dimensional vector with all coordinates equal to 1. Then:
Theorem 5 Assume (WID) and (GM), and p = p0 + 1. Then 2 n converges under (g0
)n in distribution to
sup AZ + U T W
t

A1

11T
1
T
T
1T A1 (AZ+U W ) 1<0

AZ + U T W

The distribution under contiguous alternatives is rather difficult to express in its full generality so it is omitted for simplicity. The proof of Theorem 5 is given in Section 5.


0
In the case of Gaussian mixtures with unknown variance, the assumption i=1
i , D ft0,i ,i +
j T 2
j
1
p0
j
= 0 and = 0, j J
jJ ( ) Dt fj,0 ( ) = 0 a.e. if and only if = 0, . . . ,
does not hold. Indeed, second derivatives with respect to t are proportional with derivatives
with respect to . In this case, it is necessary to go further in the Taylor development:
when taking third derivative with respect to t, the condition of non degeneracy holds. Also,
all derivatives till fourth order may be uniformly upper bounded with some function B as
needed. Since the limiting points of process Z need not to be known at boundary values of
K to define the asymptotic distribution of n , the following result holds:

Theorem 6 The asymptotic distributions under the null hypothesis and under contiguous
hypotheses given in Theorem 4 and Theorem 5 hold for Gaussian mixtures with the same
unknown variance matrix.

3 The LRT for contamination mixtures when the set of parameters is large.
As already said in the introduction, the asymptotic distribution of the LRT for compact $\mathbf T$ and A can be used in practice for large data sets. The LRT happens in this case to be more powerful than moment tests, as shown in Delmas (2003). Nevertheless it suffers from two drawbacks:
- the distribution is not free of the location of the null hypothesis inside $\mathbf T$,
- for testing one population against two (or $p_0$ against p), the LRT with bounded parameter is not invariant under translation or change of scale.
Several solutions to the first point exist. The threshold calculation can be conducted under the worst form of the null hypothesis (see Delmas, 2003), or one can use a plug-in, that is, an estimate of $f_0$. It remains that results would be nicer if one were able to get rid of the compactness assumption. This section and the next one answer in the negative, showing that in the simplest case, contamination for translation mixtures on $\mathbb R$, the LRT is theoretically less powerful than moment tests under contiguous alternatives. As already said in the introduction, the convergence to this result is very slow, so it is not so relevant in practice. It mainly shows that it is difficult to construct an unbounded asymptotic theory of the LRT.
We consider in this section the contamination mixture model (1) with $\mathbf T = [-T,T]$ for a given positive real number T, and $\mu$ the Lebesgue measure. We use the notations and results of Section 2.1. Let $\theta_n$ and $t_n$ be sequences such that:
- $\lim_{n\to+\infty}\sqrt n\,\theta_n\,\|(f_{t_n}-f_0)/f_0\|_2 = c$ for some positive c,
- either $t_n$ tends to some $t_0\neq 0$ and $\sqrt n\,\theta_n$ tends to some positive constant, or $t_n$ tends to 0 and $t_n/|t_n|$ converges to some limit $\beta$.
Let $P_{\theta_n,t_n} = \big(g_{\theta_n,t_n}\cdot\mu\big)^{\otimes n}$ and $P_0 = \big(f_0\cdot\mu\big)^{\otimes n}$. To evaluate the asymptotic power and the asymptotic level for large values of T, one has to investigate the behaviour of the suprema of the Gaussian processes Z(t) and Z(t)+m(t) defined in Theorem 1. Z is the centered Gaussian process defined in Section 2.1 with covariance given by (9). For simplicity we consider this process as defined on the whole real line $\mathbb R$. We will use assumptions under which the supremum of $Z(\cdot)$ over [-T, T] tends to infinity as T tends to infinity and is achieved for some t tending to infinity, so the study of this supremum on [0, T] for large T can be replaced by the study of the supremum on [1, T]. The discontinuity of the covariance function r at 0 will therefore have no consequence on the extreme behaviour of the process Z.
We shall use Azaïs and Mercadier (2004) to derive the asymptotic distribution of suprema of Gaussian processes. Hence let $M(a,b) = \sup_{t\in(a,b)}\big(Z(t)+m(t)\big)$. Since the asymptotic distribution of $2\lambda_n$, under the null hypothesis or under contiguous alternatives, in Theorem 1 can be written as $M(-T,T)^2$ (taking m(t) = 0 under the null hypothesis and $m(t) = \omega(t)$ as defined by (11) under contiguous alternatives), we want to characterize the asymptotic behaviour of $M(-T,T)$ as $T\to+\infty$. We thus introduce further notations and assumptions.


Write $r_{ij}(s,t)$ instead of $\frac{\partial^{i+j}}{\partial s^i\,\partial t^j}\,r(s,t)$, and define $R(t) = \int_0^t \sqrt{r_{11}(s,s)}\,ds$.
Let $a_t = \sqrt{2\log R(t)}$, $b_t = a_t - \dfrac{\log(2\pi)}{a_t}$ and $\bar b_t = a_t - \dfrac{\log(\pi)}{a_t}$.
Let $V = \{V(t) = Z(R^{-1}(t)) + m(R^{-1}(t)),\ t\in\mathbb R\}$ be the unit-speed transformation of Z + m, in the sense that the variance of $V'(t)$ equals 1 for all t in $\mathbb R$. We denote by $r^V$ its covariance function.
We shall use the following Assumptions (G) on r and $\omega$:
(G)
(G1) $\forall t\in\mathbb R$, $r_{11}(t,t) > 0$, and $\lim_{t\to+\infty} R(t) = +\infty$,
(G2) $r(s,t)\,\log|R(s)-R(t)| \to 0$ as $|R(s)-R(t)|\to+\infty$,
(G3) $\forall\delta>0$, $\sup_{|R(s)-R(t)|>\delta} |r(s,t)| < 1$,
(G4) r is four times continuously differentiable, $s\mapsto r_{11}(s,s)$ is three times continuously differentiable, and $\exists\delta>0$ such that $r^V_{01}$ and $r^V_{04}$ are bounded on $\{(s,t)\in\mathbb R^2,\ |s|>\delta\ \text{and}\ |t|>\delta\}$,
(G5) $\sqrt{\log R(t)}\;\omega(t)\to 0$ as $t\to+\infty$.

We have:

Theorem 7 Assume (CM) and (G). Then, as T tends to infinity, $a_T\big(M(-T,T)-\bar b_T\big)$ tends in distribution to the Gumbel distribution, when m(t) = 0 as well as when $m(t) = \omega(t)$. In other words, if $c_{T,\alpha}$ is the threshold of the test defined by
$$\lim_{n\to+\infty} P_0\big(\lambda_n > c_{T,\alpha}\big) = \alpha,$$
then for any contiguous alternative, the limiting power of the LRT equals its level:
$$\lim_{T\to+\infty}\;\lim_{n\to+\infty} P_{\theta_n,t_n}\big(\lambda_n > c_{T,\alpha}\big) = \alpha.$$
Theorem 7 is proved in Section 5.
The theorem says that, for T large enough, asymptotically the LRT cannot distinguish the null hypothesis from any contiguous alternative.
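As an illustration of the normalization in Theorem 7 (with the constants $a_T$ and $\bar b_T$ as reconstructed above, which should be read as an assumption), the Gumbel limit yields the following minimal approximation of the tail of $M(-T,T)$ for large T; R_T stands for R(T) and is assumed to have been computed elsewhere from $r_{11}$.

```python
import numpy as np

def gumbel_tail_approx(u, R_T):
    """Approximate P(M(-T, T) > u) for large T using the normalization of Theorem 7:
    a_T (M(-T, T) - bbar_T) is approximately Gumbel, with a_T = sqrt(2 log R(T))
    and bbar_T = a_T - log(pi) / a_T (constants as reconstructed in the text)."""
    a_T = np.sqrt(2.0 * np.log(R_T))
    bbar_T = a_T - np.log(np.pi) / a_T
    return 1.0 - np.exp(-np.exp(-a_T * (u - bbar_T)))

print(gumbel_tail_approx(u=3.0, R_T=50.0))
```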
We shall consider the translation mixture model defined in Section 2.1.1. Let $f_0$ be a density on $\mathbb R$ satisfying Assumptions (H), where we denote by $f_0^{(i)}$ the derivative of $f_0$ of order i.
(H)
(H1) $\forall x\in\mathbb R$, $f_0(x) > 0$, and $f_0$ is four times continuously differentiable,
(H2) $\forall i = 1,\dots,4$, $\exists K_i > 0$, $\forall x\in\mathbb R$, $\Big|\dfrac{f_0^{(i)}(x)}{f_0(x)}\Big| \le K_i$,
(H3) $\forall x\in\mathbb R$, $\lim_{t\to+\infty}\dfrac{f_0(x+t)}{f_0(t)} = \lim_{t\to-\infty}\dfrac{f_0(x+t)}{f_0(t)} = 1$,
(H4) $\exists M > 0$, $\forall x,t\in\mathbb R$, $\dfrac{f_0(x)\,f_0(t)}{f_0(x+t)} \le M$,
(H5) $\exists F\in L^2(\mu)$ such that $\sup_{|t|\ge 1} |\log|t||\;f_0(x+t) \le F(x)$, and $\lim_{t\to+\infty}\log(t)\,f_0(t) = 0$.

Our result is now:

Corollary 1 Assume (H). Then Theorem 7 applies to the translation mixture model.
The proof of Corollary 1 is given in Section 5.
Remarks:
- Assumptions (H1) to (H5) are essentially conditions on the tail of $f_0$. (H4) and (H5) are very weak and hold for all usual distributions. But (H1) to (H3), though rather weak, are more restrictive. They hold, for example, if $f_0(t) = O(t^{-\alpha})$ for some $\alpha > 0$ as $t\to+\infty$ and $f_0(t) = O(|t|^{-\beta})$ for some $\beta > 0$ as $t\to-\infty$. For instance, they hold for $f_0$ being the inverse of a polynomial and in particular for the Cauchy density.
- The proof relies on the verification of the assumptions of Theorem 7. In particular, the asymptotic behaviour of the covariance r and of its derivatives has to be checked. Assumptions (H) only express sufficient conditions under which the asymptotic analysis can be carried out with some generality. However, though (H2) does not hold for the Gaussian density, we also verified that Theorem 7 holds for other densities such as the Gaussian and the normalization of $\cosh(x)^{-1}$, in spite of different justifications.
LRT has to be compared with other testing procedures such as sample mean or KolmogorovSmirnov testing procedures.
- Denote by i = xi f0 (x)d(x). Without loss of generality one can assume that 1 = 0.
If 2 < + applying Le Cams third Lemma, that is Theorem 6.6 of van der Vaart
(1998),

n X n converges in distribution, as n tends to infinity, to the Gaussian N (0, 2 )


f
under P0 and to the Gaussian N (, 2 ) under Pn ,tn , where = c/ f00 2 if tn 0
and = ct0 /

ft0 f0
2
f0

if tn t0 = 0.

Consequently the asymptotic power is greater than the level.


- Remark that, when no condition of moment is available, other tests can be also proposed. Define Fn the random distribution function and F0 the distribution function
associated to f0 . Let I denote the identity function on [0, 1] and let U be a Brownian
bridge on [0, 1]. Let denotes the supremum norm. The natural normalization of
Fn leads to the definition of the Kolmogorov-Smirnov statistic Kn and the Cramer-von
Mises statistic W2n :
Kn =

n Fn F0

and W2n =

n [Fn (x) F0 (x)]2 dF0 (x).

Set on [0, 1]
(x) =

F0 F01 (x) tn x
,
n+
tn
lim

where tn is the translation parameter of the alternative. Hence depends on the


asymptotic behaviour of tn .
Kn converges in distribution, as n tends to infinity, to U
under Pn ,tn .
W2n converges in distribution, as n tends to infinity, to
1
(U + )2 dI under Pn ,tn .
0

1
0

under P0 and U +

U2 dI under P0 and

See Shorack and Wellner (1986) for a version of theses convergences. Simulations show that
in both cases, the distribution under Pn ,tn is stochastically greater than that under P0 .
Consequently the asymptotic power is greater than the level.

4 Asymptotic distribution of the LRT for Gaussian contamination mixtures with unbounded mean under contiguous alternatives.
Consider T = R (no prior upper bound) and the testing problem (1) with
(x t)2
1
ft (x) = exp
2
2

21

Set
g0 = f0 and g,t = (1 )f0 + ft , 0 1, t T.

Let n and tn be sequences such that limn+ nn tn = R and limn+ tn = t0 R.


Note that t0 can be equal to zero.
n is now given by:
n

n =

sup

log 1 + exp[tXi

[0,1], tR i=1

t2
]1
2

Then:
Theorem 8 As n tends to infinity, 2 n log log n + log(2 2 ) tends in distribution to
the Gumbel distribution under P0 as well as under Pn ,tn for any and t0 . In other words,
let us define as rejection values the region: (n > c,n ) with
c,n =

1
G1 + log log n log(2 2 ) ,
2

where G1 is the 1 fractile of the Gumbel distribution. We have by definition


lim P0 (n > c,n ) = .

n+

Then for any and t0 , the limit power of the LRT is


lim Pn ,tn (n > c,n ) = .

n+

The theorem says that asymptotically, the LRT cannot distinguish the null hypothesis from
any contiguous alternative. Indeed, this has to be compared with other testing procedures
such as moment testing
procedures. For example, if X n is the sample mean, applying Le
Cams third Lemma, n X n converges in distribution, under P
n ,tn as n tends to infinity,
to the Gaussian N (, 1). Thus the test based on the statistic n X n has an asymptotic
power that is strictly greater than the level. As mentioned in the introduction this makes
sense in practice only for very large data sets.

Proof of Theorem 8
The separation of the hypotheses is greater when = 0. Using Lemma 14.31 of van der
Vaart (1998) it is easy to see that this is the only case to consider. Moreover by symmetry
we can suppose also that > 0. Let us introduce Sn the empirical process defined by
1
Sn (t) =
n

i=1

exp[tXi t2 ] exp(

t2
) .
2

Liu and Shao (2004, Theorem 1) recall results obtained by Bickel and Chernoff (1993) on
the process Sn :
sup Sn (t) = sup Sn (t) + oP0 (1)
tR

|t|A2,n

where A2,n = [n , n ], n = 2 log log log n and n = log n/2 2 log log n.
Through the proof of their Theorem 2 Liu and Shao (2004) state that
2 n

= sup Sn (t)2 + oP0 (1).


tR

22

(21)

Combining with (21), the last equality becomes


2 n

sup Sn (t)2 + oP0 (1).

|t|A2,n

0 the extension of P0 constructed by Bickel and Chernoff (1993) by Hungarian


Let us denote P
construction. According to their formula (39), we get
2 n = sup S0 (t)2 + oP0 (1)

(22)

|t|A2,n

where S0 is the zero mean non-stationary Gaussian process with covariance function
(s t)2
s2
t2
] exp[ ].
2
2
2

(s, t) = exp[

In their paper, Bickel and Chernoff remark that this process is very close to a stationary
process namely S0 . Because we need it later, we will use here an other way. We define the
standardized version of S0
Y0 (t) =

S0 (t)

(t, t)

S0 (t)
1 et2

in order to be able to apply the Normal Comparison Lemma (Li and Shao, 2002, Theorem
2.1). Y0 is a zero mean non-stationary Gaussian process, with unit variance and covariance
function
exp st 1

r(s, t) =

exp s2 1 exp t2 1

(23)

We have
0

sup |Y0 (t) S0 (t)|

|t|A2,n

sup (1

|t|A2,n

(t, t)) sup |Y0 (t)|.


|t|A2,n

Now the function r satisfies conditions of Corollary 1 of Azas and Mercadier (2004). Consequently we know the exact order of the maximum
1

sup |Y0 (t)| = OP0 (log log n) 2 .

|t|A2,n

This last equation can also be deduced from standard result on the maximum of stationary
Gaussian processes using the process S0 introduced by Bickel and Chernoff (1993).
On the other side, the maximum of 1 (t, t) on A2,n is obtained at n . This permits us
to write
1

0 sup |Y0 (t) S0 (t)| OP0 (log log n) 2 4 .


|t|A2,n

Finally this approximation allows us to replace S0 by Y0 in (22) to get


2 n = sup Y0 (t)2 + oP0 (1).

(24)

|t|A2,n

With the same idea as before, we define


Yn (t) =

Sn (t)
1 et2

For all t0 and all , using argument close to those that lead to formula (7) in Gassiat (2002)
we have
C(, t0 )2
dPn ,tn
(X1 , . . . , Xn ) = C(, t0 )Yn (tn )
+ oP0 (1)
(25)
log
dP0
2

23

2
t
with C(, t0 ) = if t0 = 0 and C(, t0 ) = et00 1 if t0 > 0. Since can be supposed
positive, t0 is positive. A detailed proof of formula (25) is given in Section 5. Using the
formula (39) of Bickel and Chernoff (1993), we can replace Yn by Y0 to get
log

C(, t0 )2
dPn ,tn
(X1 , . . . , Xn ) = C(, t0 )Y0 (tn )
+ oP0 (1).
dP0
2

(26)

We next use the following lemma, its proof is given in Section 5.


Lemma 1 For all t0 , 2 n log log n + log(2 2 ) and log
totically independent under P0 .

dPn ,tn
dP0

(X1 , . . . , Xn ) are asymp-

Then, as soon as one proves Lemma 1 the theorem follows from a generalization of Le Cams
third Lemma. The proof of Lemma 1 relies on a suitably chosen discretization, following
ideas in Azas and Mercadier (2004), and an application of the normal comparison lemma
as refined in Li and Shao (2002).

Proofs

5.1

Proof of Theorem 5

To obtain the supremum in the first limit of Theorem


of:

under the constraints

4, one has to compute the supremum


2

= 1, T 1 0,

where

CT

(27)

(28)

Consider the sumpremum under the first constraint. Then, similarly to the proof of Theorem
2, the value of the supremum is
AZ + U T W

A1 AZ + U T W + W T 1 W

and it is attained on some such that T 1 has the same sign as (AZ + U T W )T 1.
If (AZ + U T W )T 1 < 0, then the supremum of (27) under (28) equals the supremum of (27)
under the constraints

= 1, T 1 = 0.

(29)

Computation of this supremum using Lagrange multipliers leads to the fact that it is equal
to
11T
T
AZ + U T W + W T 1 W
AZ + U T W
A1 T
1 A1
and the Theorem is proved.

5.2

Proof of Theorem 7

Set uT,x =
Z + m.

x
aT

+ bT and M V (a, b) = supt(a,b) Vt for V the unit-speed transformation of

24

We have
P M (T, T ) uT,x = P M V R(T ), R(T ) uT,x .
Now, applying with p = 2, D1 = R(T ), R(T ) and D2 =
4 of Azas and Mercadier (2004), we obtain

R(T ), R(T ) Proposition

P M V (D1 D2 ) uT,x = P M V (D1 ) uT,x P M V (D2 ) uT,x + o(1).


Remark that in Azas and Mercadier (2004) sizes of intervals are defined as functions of the
level, here it is the opposite which is made. Furthermore, repeated application of Corollary
1 of Azas and Mercadier (2004) enables us to state for = R(T ) and = R(T ) the
convergence of a M V (0, ) b and a M V (, 0) b to the Gumbel. It follows that
M V R(T ), R(T ) is stochastically negligible compared with M V R(T ), R(T ) and
also that M 0, R(T ) resp. M R(T ), 0 is stochastically negligible compared with
M V 0, R(T ) resp. M V R(T ), 0 . By combining what precedes, we get
P M V R(T ), R(T ) uT,x
= P M V 0, R(T ) uT,x P M V R(T ), 0 uT,x + o(1),
as T tends to infinity, and which becomes
P M (T, T ) uT,x = P M (0, T ) uT,x P M (T, 0) uT,x + o(1)
when we return to the initial process Z + m.
Let G(x) = exp exp(x) denote the distribution function of the Gumbel. Corollary 1
of Azas and Mercadier (2004) yields, as T tends to infinity,
P M (0, T ) uT,x

= P aT M (0, T ) bT x + o(1)
= P aT M (0, T ) bT x + log(2) + o(1)
= G(x + log(2)) + o(1).

Since the same equality holds on (T, 0), one can conclude that
P M (T, T ) uT,x = G(x + log(2))2 + o(1) = G(x) + o(1).

5.3

Proof of Corollary 1

The proof relies on the verification of assumptions of Theorem 7.


Proof of (CM): since f0 is continuous and positive, for any real T ,
inf

t[T,T ]

f0 (t) = T > 0.

Using (H3), for all t [T, T ] and x R,


1
f0 (x t) f0 (t)
M
ft f 0
(x) sup
+1
+ 1,
f0
f0 (x)
f0 (t)
T
xR
and using (H1) and (H3)
ft
M f
M
(x) K1 , t (x) K2
f0
T
f0
T

25

Let us now prove Assumptions (G). Set


f0 (x t)f0 (x s)
d(x).
f0 (x)

N (s, t) =

Differentiation of r, for s and t in R \ {0}, is a consequence of that of N (s, t). Now, for any
integers i 4 and j 4, using (H1) and (H3)
(i)

(j)

f0 (x t)f0 (x s)
f0 (x t)f0 (x s)
f0 (x)
K i Kj
K i Kj M 2
f0 (x)
f0 (x)
f0 (t)f0 (s)
and f0 (t)f0 (s) is positively lower bounded on the neighbourhood of any (s0 , t0 ), which proves
that N is differentiable at any (s, t) (R \ {0})2 with
(i)

i+j N
(s, t) = (1)i+j
it j s

(j)

f0 (x t)f0 (x s)
d(x).
f0 (x)

Proof of (G1): we thus have for t = 0

r11 (t, t)

f02 (xt)
f0 (x) d(x)

f02 (xt)
f0 (x) d(x)

f0 (xt)2
f0 (x) d(x)
f0 (t) 2 f0 (t)f0 () 2
2
2
f0 ()
f0 ()

f0 (xt)f0 (xt)
d(x)
f0 (x)
2

f0 (t) f0 (t)f0 ()
2
f0 () ,
f0 ()

f0 (t)f0 () 4
2
f0 ()

which is positive by Cauchy-Schwarz inequality. Now,

lim r11 (t, t) =

f02 d

t+

f02 d
f02 d

f0 f0 d
2

Indeed, define the functions


A(t)

f02 (x)
d(x)
f0 (x + t)

B(t)

f02 (x)
d(x)
f0 (x + t)

C(t)

f0 (x)f0 (x)
d(x).
f0 (x + t)

Then write the function r11 under the following form:


r11 (t, t) =

B(t)f0 (t) A(t)f0 (t) f0 (t) C(t)f0 (t)


A(t)f0 (t) f0 (t)

Thanks to (H1) and (H3), integrands of Af0 , Bf0 and Cf0 are respectively dominated by
M f0 (x), K12 M f0 (x), K1 M f0 (x).
By application of (H2) and Lebesgue Theorem, we conclude using the following conver-

26

gences:
lim A(t)f0 (t) =

f02 (x) d(x)

lim B(t)f0 (t) =

f02 (x) d(x)

lim C(t)f0 (t) =

f0 (x)f0 (x) d(x).

t+

t+

t+

Thus for a positive constant R


R(t) t+ R t.

(30)

Proof of (G2): considering (30), we have to prove that


lim

|st|+

r(s, t) log |s t| = 0.

(31)

Using (H3),
f0 (t)f02 (x)
M f0 (x),
f0 (x + t)
so that using (H2),
f0 (t)f02 (x)
d(x) =
f0 (x + t)

lim

t+

f02 (x)d(x),

and there exists a constant C such that for |s t| large enough,


f0 (t)

r(s, t) C

f0 (s)

f0 (x t)f0 (x s)
d(x).
f0 (x)

Then, using (H3),


r(s, t)

CM

f0 (x)

f0 (x + s t)d(x).

But according to (H5) for any x R,


lim

|st|+

log |s t|

f0 (x + s t) = 0,

and so, one may apply Lebesgue Theorem using (H4) to obtain (31).
Proof of (G5): (G5) is a consequence of (G2) and formula (11) giving (t).
Proof of (G3): Using (30) and the fact that r11 > 0, one just has to prove that for any
> 0,
sup |r(s, t)| < 1.
(32)
|st|>

First of all, r(s, t) is a continuous function of (s, t) and |r(s, t)| < 1 as soon as s = t by
Cauchy-Schwarz inequality. Thus for any > 0, for any compact set K,
sup
|st|>, tK, sK

|r(s, t)| < 1.

On the other hand because of (G2) for |s t| sufficiently large r(s, t) is bounded away from
1, so we may suppose that |s t| is bounded. Suppose that there exists sn and tn such that
|sn tn | is bounded, |sn tn | > and r(sn , tn ) 1. By compactness it would be possible
to choose subsequences s(n) and t(n) such that s(n) t(n) c. But using the same
tricks as before (using (H2), (H3) and Lebesgue Theorem),
lim r(s(n) , t(n) ) =

n+

f0 (x)f0 (x + c)d(x)

f02 (x)d(x)

Since |c| > 0 this value differs from 1. Hence we get a contradiction with assumptions

27

made on sequences sn and tn and (32) is true.


Proof of (G4): The first part of (G4) has been already proved. We use same arguments to
prove that s r11 (s, s) is three times continuously differentiable. Now, this last regularity
associated to (30) allow us to reduce our study to that of functions r01 and r04 .
The first derivative r01 (s, t) can be written as:
f0 (t) f0 (s)f0
,
2
f0
f0
f0 (t)f0
f0 (s)f0
2
2
f0
f0

f0 (t) f0 (t)f0
f0 (t)f0 f0 (s)f0
,
,
f0
f0
f0
f0
f0
f0
f0 (t)f0 3
f0 (s)f0
2
2
f0
f0

then Cauchy-Schwarz inequality leads to


|r01 (s, t)| 2

f0 (t)
2
f0

f0 (t)f0
2
f0

This upper bound is a continuous function on t. By making appear f0 (t), it is easily seen
that it converges, as t tends to infinity, to
2

f02 d

f02 d

Moreover for any > 0, the denominator is lower bounded on D = {(s, t), s R, |t| > }.
Consequently for any > 0, (s, t) r01 (s, t) is bounded on R2 \ D .
Using easy but tedious computations and Cauchy-Schwarz inequality once more, we
have:
(k)
|r04 (s, t)|

i1 j1

f0 (t)
4
k=1
f0
f0 (t)f0
f0

ijk
2

i
2
4

where the sums on i and j are finite and where for any i and j:
k=1 ijk = i. Previous
arguments run again and permit us to assert that for any > 0 the function (s, t) r 04 (s, t)
is bounded on R2 \ D .

5.4

Proof of formula (25)

by log(1 + u) = u u2 /2 + u2 L(u) and L(u)

Define the functions L and L


= sup|v|<u |L(v)|.

It is clear that L(u)


0 as u 0 and we have
log

dPn ,tn
(X1 , . . . , Xn ) =
dP0

log 1 + n (etn Xi
i=1
n

(etn Xi

= n

t2
n
2

i=1

with
|S| nn2 t2n

1
n

i=1

t2
n
2

tn Xi

t2
n
2

tn

1)

1) +

n2
2

1
L n tn max

(etn Xi

t2
n
2

i=1

i=1,...,n

tn Xi

t2
n
2

tn

1)2 + S, (33)

t2
tn X 2n
e

1
, n = 1, 2, . . . for X of
Now it suffices to remark that the random variables
tn

distribution N (0, 1) have bounded third moment. Applying the Markov inequality

t2
tn Xi 2n

e
1
max
= oP0 ( n)
i=1,...,n
tn

28

Moreover, the class of functions


2
t2
tn x 2n
e

x
tn

is Glivenko-Cantelli in probability (indeed, it is the square of a Donsker class, as a consequence of Section 2.1), we get S = oP0 (1) and
1
n

[(etn Xi

t2
n
2

1)2 (etn 1)] = oP0 (1),

i=1

so that
log

dPn ,tn
(X1 , . . . , Xn ) = nn
dP0
=

etn 1Yn (tn ) + n

nn

t2n

and
log

5.5

nn

(etn Xi
i=1

1Yn (tn ) + n

Now setting C(, t0 ) = if t0 = 0 and C(, t0 ) =

n2 1
2 n

t2
n
2

1)2 + oP0 (1)

n2 t2n
(e 1) + oP0 (1). (34)
2

et0 1
t0

we have

etn 1 = C(, t0 ) + o(1)

C(, t0 )2
dPn ,tn
(X1 , . . . , Xn ) = C(, t0 )Yn (tn )
+ oP0 (1).
dP0
2

Proof of Lemma 1
1

Beforehand we set cn = (log log n) 2 and we recall that A2,n = [n , n ] with n =

2 log log log n and n = log n/2 2 log log n.


According to (24) and (26) we need to prove that suptA2,n (Y0 (t)cn ) and Y0 (t0 ) are asymptotically independent. To this end, we consider the discretized process {Y0 (qn k), k Z}
with a step of discretization qn depending on n in a sense which has to be defined. Let us
n
= {d1 , . . . , dN (n) }.
gather the discretized points of A2,n in Aq2,n
By triangular inequalities and simplifications we have for any x and y
P

sup Y0 (t) cn x; Y0 (t0 ) y P( sup Y0 (t) cn x)P(Y0 (t0 ) y)

tA2,n

tA2,n

2P
P

sup Y0 (d) cn x; sup Y0 (t) cn > x +

n
dAq2,n

sup Y0 (d) cn x; Y0 (t0 ) y P

n
dAq2,n

(35)

tA2,n

sup Y0 (d) cn x P Y0 (t0 ) y

n
dAq2,n

The task is now to prove that for fixed x and y each component of the upper bound converges to 0.
We define the following modification of the function r
r(t0 , t) = 0
r(s, t) = r(s, t)

n
t Aq2,n
, t = t0 ,
n
s, t Aq2,n
.

Note that under the Gaussian distribution defined by r the value of the process at t0 is
independent of the values of the process at other locations whose distribution does not
changes. This proves that r is a covariance function. We define (t) = supu, |ut0 |>t |r(u, t0 )|.

29

From (23) we have


(t) = O exp

t2
2

We restrict our attention to ns such that


cn > 2|x| ; (n ) < 1/2 so that

(x + cn )2
c2
n
2(1 + (n ))
12

The Normal Comparison Lemma (Li and Shao, 2002, Theorem 2.1) gives bounds to terms
of the type
P Y1 u1 , ..., Yn un P Y1 u1 , ..., Yn un
where Y and Y are two centered Gaussian vectors with the same variance and possibly
different covariances ij and ij , i, j = 1, ..., n. It says that
P Y1 u1 , ..., Yn un P Y1 u1 , ..., Yn un

1
2

1i<jn

arcsin(ij ) arcsin(
ij )

exp

u2i + u2j
2(1 + ij )

(36)

where z + = max {z, 0}, ij = max {|ij |, |


ij |}. Let (Const) represents a generic positive
constant. Since arcsin(x) x/2 for 0 x 1, applying inequality (36) in both directions
to the vector Y0 with covariance r and to the vector Y0 with covariance r corresponding to
n
we get:
the points belonging to {t0 } Aq2,n
P

sup Y0 (d) cn x; Y0 (t0 ) y P

n
dAq2,n

(Const)
(Const)

n
dAq2,n

n
dAq2,n

sup Y0 (d) cn x P Y0 (t0 ) y

n
dAq2,n

|r(d, t0 )| exp

(x + cn )2 + y 2
2 1 + |r(d, t0 )|

|r(d, t0 )| exp

c2n
12

c2
(Const)
exp n
qn
12

(t)dt =
n qn

which tends to zero if , for example, qn = log log n

c2
(Const)
exp n
qn
12

as soon as > 0.

To deal with the first term of (35), we denote by Uz and Uzqn the point processes of
up-crossings of level z for Y0 and its qn -polygonal approximation (linear interpolation) respectively. For any subset B of R,
Uz (B)
Uzqn (B)

= # {t B, Y0 (t) = z, Y0 (t) > 0},

= # l Z, qn (l 1) B, qn l B, Y0 qn (l 1) < z < Y0 qn l

Set the distribution function of the standard Gaussian and = 1 .


P

sup Y0 (d) cn x; sup Y0 (t) cn > x

n
dAq2,n

tA2,n

qn
(A2,n ) = 0
P Y0 (n ) > x + cn + P Y0 (n ) x + cn , Ux+cn (A2,n ) 1, Ux+c
n
qn
(x + cn ) + E Ux+cn (A2,n ) Ux+c
(A2,n )
n

where the last upper bound is due to Markov inequality. The first term above tends trivially

to zero, as for the second if we set qn = log log n


with > 12 , Condition (U7) of Lemma

30

2 of Azas and Mercadier (2004) is met. It is easy to check that since E Ux+cn (A2,n ) is
bounded we are in the condition of application of that lemma and
qn
E Ux+cn (A2,n ) Ux+c
(A2,n ) = o(1).
n

References
Azas, J.-M. and Mercadier, C. (2004), Asymptotic Poisson character of extremes in non-stationary
Gaussian models, To appear in Extremes.
Bickel, P. and Chernoff, H. (1993), Asymptotic distribution of the likelihood ratio statistic in a
prototypical non regular problem, Statistics and Probability: A Raghu Raj Bahadur Festschrift
(J.K. Ghosh, S.K. Mitra, K.R. Parthasarathy and B.L.S. Prakasa Rao eds), 8396, Wiley Eastern
Ltd.
Boistard, H. (2003), Test of goodness-of-fits for mixture of population, Technical report, Universite
Toulouse III, France.
Chen, H. and Chen, J. (2001), Large sample distribution of the likelihood ratio test for normal
mixtures, Statist. Probab. Lett., 2, 125133.
Chernoff, H. and Lander, E. (1995), Asymptotic distribution of the likelihood ratio test that a
mixture of two binomials is a single binomial, J. Statist. Plan. Inf., 43, 1940.
(1997), Testing in locally conic models, and application to
Dacunha-Castelle, D. and Gassiat, E.
mixture models, ESAIM Probab. Statist., 1, 285317.
(1999), Testing the order of a model using locally conic
Dacunha-Castelle, D. and Gassiat, E.
parameterization: population mixtures and stationary ARMA processes, Ann. Statist., 27(4),
11781209.
Delmas, C. (2003), On likelihood ratio tests in Gaussian mixture models, Sankhya, 65(3), 119.
(2002), Likelihood ratio inequalities with applications to various mixtures, Ann. Inst.
Gassiat, E.
H. Poincare Probab. Statist., 6, 897906.
Garel, B. (2001), Likelihood Ratio Test for Univariate Gaussian Mixture, J. Statist. Plann. Inference, 96(2), 325350.
Ghosh, J. and Sen, P. (1985), On the asymptotic performance of the log likelihood ratio statistic
for the mixture model and related results, Proceedings of the Berkeley Conference in Honor of
Jerzy Neyman and Jack Kiefer, vol. II, 789806, Wadsworth, Belmont, CA.
Goffinet, B., Loisel, P., and B. Laurent. (1992), Testing in normal mixture models when the
proportions are known, Biometrika 79, 842846.
Hall, P. and Stewart, M. (2004), Theoretical analysis of power in a two-component normal mixture
model. Private communication.
Hartigan, J.A. (1985), A failure of likelihood asymptotics for normal mixtures, In Proceedings of
the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer (Berkeley, Calif., 1983), Vol.
II, 807810, Wadsworth, Belmont, CA.
James, L.F., Priebe, C.E., and Marchette, D.J. (2001), Consistent Estimation of Mixture Complexity, Ann. Statist., 29, 12811296.
Lemdani, M. and Pons, O. (1997), Likelihood ratio test for genetic linkage, Statis. Probab. Lett.,
33(1), 1522.
Lemdani, M. and Pons, O. (1999), Likelihood ratio tin contamination models, Bernoulli, 5(4),
705719.
Li, W.V. and Shao, Q. (2002), A normal comparison inequality and its applications, Probab. Theory
Related Fields, 122(4), 494508.

31

Lindsay, B. G. (1995), Mixture models: Theory, geometry, and applications, NSF-CBMS Regional
Conference Series in Probability and Statistics, Vol. 5, Hayward, Calif.: Institute for Mathematical Statistics.
Liu, X. and Shao, Y. (2004), Asymptotics for the likelihood ratio test in two-component normal
mixture models, J. Statist. Plann. Inference, 123(1), 6181.
McLachlan, G. and Peel, D. (2000), Finite mixture models, Wiley Series in Probability and
Statistics: Applied Probability and Statistics, Wiley-Interscience, New York.
Mercadier, C. (2004), Computing the distribution of the maximum of random processes and fields:
how far are the Rice and Euler Characteristics valid. Preprint, Universite Toulouse III, France,
http://www.lsp.ups-tlse.fr/Fp/Mercadier.
Mosler, K. and Seidel, W. (2001), Testing for homogeneity in an exponential mixture model, Aust.
N. Z. J. Stat., 43(3), 231247.
Seidel, W., Mosler, K., and Alker, M. (2000), A cautionary note on likelihood ratio tests in mixture
models, Ann. Instit. Statist. Math., 52, 481487.
Shorack, G.R. and Wellner, J.A. (1986), Empirical processes with applications to statistics, Wiley
Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics, John
Wiley & Sons Inc. New York.
Titterington, D.M., Smith, A.F.M., and Makov, U.E. (1985), Statistical analysis of finite mixture
distributions, Wiley Series in Probability and Mathematical Statistics: Applied Probability and
Statistics, John Wiley & Sons Ltd.
van der Vaart, A. W. (1998), Asymptotic statistics, Cambridge Series in Statistical and Probabilistic
Mathematics, Cambridge University Press, Cambridge.

32

ESAIM: Probability and Statistics

August 2002, Vol. 6, 177184


DOI: 10.1051/ps:2002010

URL: http://www.emath.fr/ps/

ON THE TAILS OF THE DISTRIBUTION OF THE MAXIMUM OF A SMOOTH


STATIONARY GAUSSIAN PROCESS

Jean-Marc Azas 1 , Jean-Marc Bardet 1 and Mario Wschebor 2


Abstract. We study the tails of the distribution of the maximum of a stationary Gaussian process on
a bounded interval of the real line. Under regularity conditions including the existence of the spectral
moment of order 8, we give an additional term for this asymptotics. This widens the application of an
expansion given originally by Piterbarg [11] for a sufficiently small interval.

Mathematics Subject Classification. 60Gxx, 60E05, 60G15, 65U05.


Received March 13, 2002. Revised May 15, 2002.

1. Introduction and main result


Let X = {X(t), t [0, T ]}, T > 0 be a real-valued centered Gaussian process and denote
M := max X(t).
t[0,T ]

The precise knowledge of the distribution of the random variable M is essential in many of statistical problems;
for example, in Methodological Statistics (see Davies [8]), in Biostatistics (see Azas and CiercoAyrolles [4]).
But a closed formula based upon natural parameters of the process is only known for a very restricted number
of stochastic processes X: for instance, the Brownian motion, the Brownian bridge or the OrnsteinUhlenbeck
process (a list is given in Azas and Wschebor [6]). An interesting review of the problem could be found in
Adler [2].
We are interested here in a precise expansion of the tail of the distribution of M for a smooth Gaussian
stationary process. First, let us specify some notations
r(t) := E X(s)X(s + t) denotes the covariance function of X. With no loss of generality we will also
assume 0 = r(0) = 1;
its spectral measure and k (k = 0, 1, 2, . . . ) its spectral moments whenever they exist;
x
1
x2
(x) = exp
and (x) =
(t)dt.
2
2

Keywords and phrases: Tail of distribution of the maximum, stationary Gaussian processes.

This work was supported from ECOS program U97E02.

Laboratoire de Statistique et de Probabilit


es, UMR 5583 du CNRS, Universite Paul Sabatier, 118 route de Narbonne,
31062 Toulouse Cedex 4, France.
2 Centro de Matem
atica, Facultad de Ciencias, Universidad de la Rep
ublica, Calle Igu
a 4225, 11400 Montevideo, Uruguay;
e-mail: azais@cict.fr, bardet@cict.fr, wscheb@fcien.edu.uy
c EDP Sciences, SMAI 2002

178

J.-M. AZA
IS, J.-M. BARDET AND M. WSCHEBOR

Throughout this paper we will assume that 8 < and for every pair of parameter values s and t , 0 s = t T ,
the six-dimensional random vector (X(s), X (s), X (s), X(t), X (t), X (t)) has a non-degenerate distribution.
Piterbarg [11] (Th. 2.2) proved (under the weaker condition 4 < instead of 8 < ) that for each T > 0
and any u R:
2
u2 (1 + )
T (u) P (M > u) B exp
2
2

1 (u) +

(1)

for some positive constants B and . It is easy to see (see for example Miroshin [10]) that the expression inside
the modulus is non-negative, so that in fact:
0 1 (u) +

2
u2 (1 + )
T (u) P (M > u) B exp
2
2

(2)

The problem of improving relation (2) does not seem to have been solved in a satisfactory manner until now.
A crucial step has been done by Piterbarg in the same paper (Th. 3.1) in which he proved that if T is small
enough, then as u +:
P (M > u) = 1 (u) +

2
T (u)
2

3 3(4 22 )9/2
9/2
22 (2 6

T
u
u5

24 )

4
4 22

[1 + o(1)] .

(3)

The same result has been obtained by other methods (Azas and Bardet [3]; see also Azas et al. [5]).
However Piterbarg equivalent (3) is of limited interest for applications since it contains no information on
the meaning of the expression T small enough.
The aim of this paper is to show that formula (3) is in fact valid for any length T under appropriate
conditions that will be described below.
Consider the function F (t) defined by
2

F (t) :=

2 1 r(t)

2 1 r2 (t) r 2 (t)

Lemma 1. The even function F is well defined, has a continuous extension at zero and
22
1. F (0) =
;
4 22
2. F (0) = 0;
2 (2 6 24 )
3. 0 < F (0) =
< .
9(4 22 )
Proof.
1. The denominator of F (t) is equal to 1 r2 (t) Var X (0)|X(0), X(t) thus non zero due to the non
degeneracy hypothesis. A direct Taylor expansion gives the value of F (0).
2. The expression of F (t) below shows that F (0) = 0:

F (t) =

22 1 r(t) r (t) r 2 (t) 2 r (t) 1 r(t)


2 1 r2 (t) r 2 (t)

(4)

3. A Taylor expansion of (4) provides the value of F (0). Note that 4 22 can vanish only if there exists
some real such that ({}) = ({}) = 1/2. Similarly, 2 6 24 can vanish only if there exists some

179

DISTRIBUTION OF THE MAXIMUM

real and p 0 such that ({}) = ({}) = p, ({0}) = 1 2p. These cases are excluded by the non
degeneracy hypothesis.
We will say that the function F satisfies hypothesis (H) if it has a unique minimum at t = 0. The next
proposition contains some sufficient conditions for this to take place.
Proposition 1. (a) If r (t) < 0 for 0 < t T then (H) is satisfied.
(b) Suppose that X is defined on the whole line and that
1. 4 > 222 ;
2. r(t), r (t) 0 as t ;
3. there exists no local maximum of r(t) (other than at t = 0) with value greater or equal to

4 222

Then (H) is satisfied for every T > 0.


An example of a process satisfying condition (b) but not condition (a) is given by the covariance
r(t) :=

1 + cos(t) t2 /2
e
2

if we choose sufficiently small. In fact, a direct computation gives 2 = 1 + 2 /2; 4 = 3 + 3 2 + 4 /2 so that


1 + 2
4 222
=

4
3 + 3 2 + 4 /2
On [0, ), the covariance attains its second largest local maximum in the interval

2
,

, so that its value is

smaller than exp


. Hence, choosing is sufficiently small the last condition in (b) is satisfied.
The main result of this article is the following:

2 2

Theorem 1. If the process X satisfies hypothesis (H), then (3) holds true.

2. Proofs
Notations.
p (x) is the density (when it exists) of the random variable at the point x Rn .
1lC denotes the indicator function of the event C.
Uu ([a, b]), u R is the number of upscrossings on the interval [a, b] of the level u by the process X defined
as follows:
Uu ([a, b]) = #{t [a, b], X(t) = u, X (t) > 0}
For k a positive integer, k (u, [a, b]) is the kth order factorial moment of Uu ([a, b])
k (u, [a, b]) = E

Uu ([a, b]) Uu ([a, b]) 1 . . . Uu ([a, b]) k + 1

We define also
k (u, [a, b]) = E

Uu ([a, b]) Uu ([a, b]) 1 . . . Uu ([a, b]) k + 1 1l{X(0)u} .

a+ = a 0 denotes the positive part of the real number a.


(const) denotes a positive constant whose value may vary from one occurrence to another.
We will repeatedly use the following lemma, that can be obtained using a direct generalization of Laplaces
Method (see Dieudonne [9], p. 122).

180

J.-M. AZA
IS, J.-M. BARDET AND M. WSCHEBOR

Lemma 2. Let f (respectively g) be a real-valued function of class C 2 (respectively C k for some integer k 1)
defined on the interval [0, T ] of the real line verifying the conditions:
1. f has a unique minimum on [0, T ] at the point t = t , and f (t ) = 0, f (t ) > 0.
2. Let k = inf j : g (j) (t ) = 0 .
Define
T

h(u) =
0

1
g(t) exp u2 f (t)
2

dt.

Then, as u :
g (k) (t )
k!

h(u)

1
xk exp f (t )x2
4
J

dx

1
1
exp u2 f (t ) ,
uk+1
2

(5)

where J = [0, +) , J = (, 0] or J = (, +) according as t = 0, t = T or 0 < t < T respectively.


We will use the following well-known expansion as (Abramovitz and Stegun [1], p. 932). For each a0 > 0 as
u +

for all a a0 where O

1
exp ay 2 dy =
2
1
u7

1
1
3

+
+O
au a2 u3 a3 u5

should be interpreted as bounded by

1
u7

1
exp au2 ,
2

(6)

K
, K a constant depending only on a0 .
u7

Proof of of Theorem 1.
Step 1: The proof is based on an extension of Piterbargs result to intervals of any length. Let > 0, the
following relation is clear
P (M[0, ] > u) = P (X(0) > u) + P (Uu ([0, ]).1l{X(0)u} 1)
= 1 (u) + P (Uu ([0, ]) 1) P (Uu ([0, ]).1l{X(0)>u} 1).
In the sequel a term will be called negligible if it is O u6 exp
following relations to be proved later:

1 4 u2
2 4 22

as u +. We use the

(i) P (Uu ([0, T ]).1l{X(0)>u} 1) is negligible;


(ii) Let 2 T . Then P ({Uu ([0, ])Uu ([, 2 ]) 1}) is negligible.
With these relations, for 2 T , we have
P (M[0,2 ] > u) 1 (u) = P (Uu ([0, 2 ]) 1) + N1
= P (Uu ([0, ]) 1) + P (Uu ([, 2 ]) 1) + N2 = 2P (Uu ([0, ]) 1) + N3
= 2P (M[0, ] > u) 2 1 (u) + N4 ,

(7)

N1 N4 being negligible. Applying (7) repeatedly and on account of Piterbargs theorem that states that (3)
is valid if T is small enough, one gets the result.
Step 2: Proof of (i). Using Markovs inequality:
P (Uu ([0, T ]).1l{X(0)>u} 1) 1 (u, [0, T ]),

181

DISTRIBUTION OF THE MAXIMUM

where 1 is evaluated using the Rice formula (Cramer and Leadbetter [7])
+

1 (u, [0, T ]) =

E X + (t)|X(0) = x, X(t) = u pX(0),X(t) (x, u)dt.

dx
u

(8)

Also if Z is a real-valued random variable with a Normal-(m, 2 ) distribution,


E(Z + ) =

m
m
+ m
,

and plugging into (8) one obtains (see details in Azas et al. [5]):
1 (u, [0, T ]) =

(u)
2

dt

2 F

e 2 F y dy
2

r
F
(1 r)u2
exp


2(1 + r)
2 (1 r2 )
2

exp

r 2F y2
22 (1 r2 )

B(t, u)dt,
0

where r, r and F stand for r(t), r (t) and F (t) respectively. Clearly, since r (0) = 2 < 0, there exists T0
such that r < 0 on (0, T0 ]. Divide the integral into two parts: [0, T0 ] and [T0 , T ]. Using formula (6) on [0, T0 ]
we get
B(t, u) =

(u)
2

2 F 5/2

2 (1 r)2 3
u + O u5 (u) ,
r2

and since, as t 0, (1 r)2 r 2 = O(t2 ), Lemma 2 shows that


T0

B(t, u)dt = O u6 exp

4 u2
2(4 22 )

On the other hand, since inf t[T0 ,T ] F (t) is strictly larger than F (0), it follows easily from

exp

ay 2
2

au2
1
dy (const) exp
2
a

a > 0, u 0,

that
T

B(t, u)dt
T0

is negligible.
Step 3: Proof of (ii). Use once more Markovs inequality:
P (Uu ([0, ])Uu ([, 2 ]) 1 ) E (Uu ([0, ])Uu ([, 2 ])) .
Because of Rice formula (Cramer and Leadbetter [7]):

E (Uu ([0, ])Uu ([, 2 ])) =

(t (2 t))At (u)dt,

At2 t1 (u)dt2 dt1 =


0

with
At (u) = E X + (0)X + (t)|X(0) = X(t) = u pX(0),X(t) (u, u).

(9)

182

J.-M. AZA
IS, J.-M. BARDET AND M. WSCHEBOR

It is proved in Azas et al. [5], that


1
At (u) =
2
1 r2

1+r

[T1 (t, u) + T2 (t, u) + T3 (t, u)] ,

(10)

with

T1 (t, u) = 1 + (b)(kb);
+

T2 (t, u) = 2(2 2 )

(kx)(x)dx;
b

T3 (t, u) = 2(kb)(b);

r
u;
1+r
2
r
2 = 2 (t) = Var X (0)|X(0), X(t)) = 2
;
1 r2
r (1 r2 ) rr 2
= (t) = Cor X (0), X (t)|X(0), X(t)) =
;
2 (1 r2 ) r 2
1+
k = k(t) =
; b = b(t, u) = /;
1
= (t, u) = E X (0)|X(0) = X(t) = u) =

(z) =

(v)dv;
0

r, r , r again stand for r(t), r (t), r (t).


As in Step 2, we divide the integral (9) into two parts: [0, T0 ] and [T0 , 2 ]. For t < T0 , b(t, u) and k(t) are
positive, thus using expansion (6), we get the formula p. 119 of Azas et al. [5]:
T1 (u) + T2 (u) + T3 (u) =

22
42
(kb)(1 + )(b) 3 (kb)(b) (1 + )2 k(kb)(b)
b
b
2
4
k
k3
1
+ 22 k 3 (kb)(b) + 2 k(kb)(b) + O 2 (kb)(b) 7 + 6 + 4
b
b
b
b

Since T1 (u) + T2 (u) + T3 (u) is non negative, majorizing (kb) and (kb) by 1 we get
At (u) (const)

2
1 r2 (t)

(1 + )
k
1
k
k3
+ k3 + 2 + 7 + 6 + 4
b
b
b
b
b

exp

1
1 + F (t) u2
2

Now it is easy to see that, as t 0


2

(const)t2 , (1 + )

(const)t2 ,

1 r2 (t)

(const)t, b

so that
2 (1 + )
b 1 r2 (t)

b2

(const)

t3
;
u

2 k3
1 r2 (t)

(const)t4 ;

2 k
1 r2 (t)

(const)t2 u2 ;

(const)u,

183

DISTRIBUTION OF THE MAXIMUM

and also that the other terms are negligible. Then, applying Lemma 2:
T0

(t (2 t))At (u)dt

(const)u6 exp

4 u2
2(4 22 )

thus negligible.
For t T0 remark that T1 (u) + T2 (u) + T3 (u) does not change when (and consequently b) changes of sign.
Thus and b can supposed to be non-negative. Forgetting negative terms in formula (10) and majorizing
by 1; 1 (b) by (const)(b) and by (const)u, we get:
At (u) (const)2

1+r

(b)(1 + u) = (const)(1 + u) exp

1
1 + F (t) u2
2

We conclude as in Step 2.
Proof of of Proposition 1. Let us prove statement (a). The expression (4) of F shows that it is positive for
0 < t T , since r (t) < 0 and
r 2 (t)(2 r (t))(1r(t)) =

1
Var X(t)X(0) Var X (t)+X (0) Cov2 X(t)X(0), X (t)+X (0) < 0.
4
(11)

Thus the minimum is attained at zero.


22
(b) Note that F (0) =
< 1 = F (+). If F has a local minimum at t = t , equation (4) shows that r
4 22
has a local maximum at t = t so that
F (t ) =

22
1 r(t )
>
1 + r(t )
4 22

due to the last condition in (b). This proves (b).


Remark. The proofs above show that even if hypothesis (H) is not satisfied, it is still possible to improve
inequality (2). In fact it remains true for every such that
< min F (t).
t[0,T ]

The authors thank Professors P. Carmona and C. Delmas for useful talks on the subject of this paper.

References
[1] M. Abramowitz and I.A. Stegun, Handbook of Mathematical functions with Formulas, graphs and mathematical Tables. Dover,
New-York (1972).
[2] R.J. Adler, An introduction to Continuity, Extrema and Related Topics for General Gaussian Processes. IMS, Hayward, CA
(1990).
[3] J.-M. Azas and J.-M. Bardet, Unpublished manuscript (2000).
[4] J.-M. Azas and C. CiercoAyrolles, An asymptotic test for quantitative gene detection. Ann. Inst. H. Poincar
e Probab. Statist.
(to appear).
[5] J.-M. Azas, C. CiercoAyrolles and A. Croquette, Bounds and asymptotic expansions for the distribution of the maximum of
a smooth stationary Gaussian process. ESAIM: P&S 3 (1999) 107-129.
[6] J.-M. Azas and M. Wschebor, The Distribution of the Maximum of a Gaussian Process: Rice Method Revisited, in In and
out of equilibrium: Probability with a physical flavour. Birkhauser, Coll. Progress in Probability (2002) 321-348.
[7] H. Cram
er and M.R. Leadbetter, Stationary and Related Stochastic Processes. J. Wiley & Sons, New-York (1967).

184

J.-M. AZA
IS, J.-M. BARDET AND M. WSCHEBOR

[8] R.B. Davies, Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64 (1977)
247-254.
[9] J. Dieudonn
e, Calcul Infinit
esimal. Hermann, Paris (1980).
[10] R.N. Miroshin, Rice series in the theory of random functions. Vestn. Leningrad Univ. Math. 1 (1974) 143-155.
[11] V.I. Piterbarg, Comparison of distribution functions of maxima of Gaussian processes. Theoret. Probab. Appl. 26 (1981)
687-705.

The Annals of Probability


1996, Vol. 24, No. 3, 11041129

ON THE DENSITY OF THE MAXIMUM OF SMOOTH


GAUSSIAN PROCESSES
By Jean Diebolt1 and Christian Posse2
CNRS and Swiss Federal Institute of Technology
We obtain an integral formula for the density of the maximum of
smooth Gaussian processes. This expression induces explicit nonasymptotic lower and upper bounds which are in general asymptotic to the density. Moreover, these bounds allow us to derive simple asymptotic formulas
for the density with rate of approximation as well as accurate asymptotic
bounds. In particular, in the case of stationary processes, the latter upper
bound improves the well-known bound based on Rices formula. In the case
of processes with variance admitting a finite number of maxima, we refine
recent results obtained by Konstant and Piterbarg in a broader context,
producing the rate of approximation for suitable variants of their asymptotic formulas. Our constructive approach relies on a geometric representation of Gaussian processes involving a unit speed parameterized curve
embedded in the unit sphere.

1. Introduction. Let X t , t I = 0 T , be a real Gaussian process


with mean 0 and continuous sample functions. Numerous papers have been
devoted to the study of
Z = sup X t
tI

[see the monographs by Adler (1981, 1990), Berman (1992), Leadbetter, Lindgren and Rootzen (1983), Ledoux and Talagrand (1991), and Piterbarg (1996)].
It turns out that the exact distribution of Z is known for the Wiener process,
1
the Brownian bridge B t , B t 0 B u du [Darling (1983)], the integrated
Wiener process [see, e.g., Lachal (1991)], a class of sawtooth processes [see,
e.g., Cressie (1980)] and the random cosine wave X t = 1 cos t + 2 sin t ,
where 1 and 2 are i.i.d. 0 1 .
Otherwise, two directions have been mainly explored. The first one consists
of deriving, under minimal restrictions, upper and lower bounds for P Z > a
for a large enough, or first-order asymptotics for P Z > a as a . At this
level of generality, these bounds often involve unknown constants and are
not sharp enough to be used as p-values in statistical tests and stochastic
modelization where precise estimations of P Z > a , lying between 0.1 and
0.01, are required. Works of the second category try to obtain precise asymptotics for P Z > a under more rigid restrictions on the process (stationarity,
Received July 1994; revised October 1995.
supported by the CNRS and the NSF.
2 Research supported by the Swiss National Science Foundation.
AMS 1991 subject classifications. Primary 60G15, 60G70; secondary 60G17.
Key words and phrases. Differential geometry, Gaussian processes, extreme value, nonasymptotic formulas, density.
1 Research

1104

SMOOTH GAUSSIAN PROCESSES

1105

smooth covariance function


). In our approach, we study the nonasymptotic
behavior of the density fZ b of Z. We derive an integral representation for
fZ b which induces sharp explicit upper and lower bounds for all b > 0. It
appears that these bounds are asymptotic to the density as b . Moreover, they allow us to derive simple asymptotic formulas for fZ b with rate
of approximation as well as specially accurate asymptotic bounds (see, e.g.,
the examples in subsection 2.3). Note that Weber (1985) and Lifshits (1986)
have studied the density of Z in a broader context and they have obtained
bounds involving unknown constants.
In this paper, we consider the class of centered Gaussian processes of the
form
(1)

1
X t = X
t

j gj t
j=1

2n

where j , j 1, are i.i.d. 0 1 , nj=1 gj2 t 1 and the functions X t > 0


and gj t , j 1, are sufficiently smooth.
The existence of the KarhunenLo`eve expansion for Gaussian processes
with continuous sample functions ensures that this class is very large [Adler
(1990)]. First, all Gaussian processes with smooth covariance function have
such a representation. Second, more general processes can be approximated
by processes of this class with respect to the uniform norm. Indeed, in the
context of nonsmooth Gaussian processes, a classical approach consists of

working with the regularized version X t = X t s s ds, where


t = t/ / is a smooth kernel. By letting 0, X t converges
to X t under weak assumptions [Azas and Florens-Zmirou (1987)]. Since
P suptI X t X t > is basically controlled by suptI Var X t X t ,
choosing such that suptI Var X t X t is sufficiently small should allow us to transform sharp bounds for P suptI X t > a into usable bounds
for P Z > a with a in a suitable fixed compact subset of 0 . Indeed,
our motivation in undertaking this study was to provide good approximations
for P Z > a around practical values of a, that is, around the 0.95 quantile, for smooth processes and processes whose covariance function admits a
Taylor expansion sufficiently close to that of smooth processes in the neighborhood of the diagonal. These approximations are under current research. Of
course, the possibility of such approximations has no direct relation with the
possible divergent first-order asymptotic expansions of the tail distributions
as a . For stationary Gaussian processes, the truncation of the spectral
measure also leads to the form (1). See also Berman (1988) for stochastic modelizations leading to processes (1), Davies (1977) for a class of statistical tests
and Konakov and Piterbarg (1983) for confidence regions in nonparametric
density estimation involving quantiles of the distribution of the maximum of
smooth Gaussian processes.
Berman (1988) studies the asymptotic behavior of the tail probabilities of
the supremum Z of processes of the form (1) with X t 1, finite n and
orthogonally invariant joint distribution of 1
n . In the normal case,

1106

J. DIEBOLT AND C. POSSE

his results render the well-known expressions (Theorem 18.1, page 37):
(2)
where L =

P Z > a L 2
T
0

n
j=1

gj 2 t

(3) P Z > a L 2

1/2

exp a2 /2

dt, and (Corollary 17.1, page 36)

exp a2 /2 +

1/2

exp x2 /2 dx

a>0

The main term appearing in both expressions results from Rices formula,
which measures the expected number of upcrossings of a fixed level a [Marcus (1977)]. In subsection 2.3, we improve the upper bound (3) for smooth
stationary processes providing a higher-order expansion for P Z > a .
Johnstone and Siegmund (1989) consider processes of the form (1) with
X t 1, finite n and 1
n uniformly distributed on the unit sphere.
By making use of the connection between the standard Gaussian distribution
in Rn and the uniform distribution on the unit sphere of Rn , we can adapt their
result (Theorem 3.3, page 190) to our context. It turns out that the resulting
upper bound is (3).
Sun (1993) investigates an asymptotic expansion for the tail probabilities
of the maximum of smooth Gaussian random fields with unit variance. In
the special case of processes, her results concern periodic processes of the
form (1) with X t 1. For finite n, Sun obtains the asymptotic formula (2)
(Theorem 3.1, page 40) as a consequence of Weyls formula for the volume of
tubes around a manifold embedded in the unit sphere. For infinite n, (2) still
holds under additional assumptions, otherwise it becomes an upper bound
(Theorems 3.2 and 3.3, page 41).
The sharpest results concerning smooth Gaussian processes are due to
Piterbarg (1981, 1988) and Konstant and Piterbarg (1993) who produce very
precise asymptotic formulas for P Z > a . In subsection 2.3, our results are
compared to theirs. In particular, we provide rates of approximation for suitable variants of the asymptotic formulas given in Konstant and Piterbarg
(1993).
Our approach is based on the interpretation of the functions gj t , j 1,
as a parameterization of a curve embedded in the unit sphere of Rn or of
the space of square summable sequences. With the canonical moving frame
induced by this parameterization, we describe each level manifold z = b ,
b R, of the functional
1
z = sup X
t
tI

xj gj t
j=1

where xj : j 1 is a realization of j : j 1 , as an envelope of the family


of hyperplanes X t 1 nj=1 xj gj t = b: t I . This technique enables us
to express the density fZ b of Z as the canonical volume of z = b , leading
to an integral formula for fZ b (Theorem 1). This representation provides
(under weaker assumptions due to a perturbation argument) nonasymptotic
lower and upper bounds for fZ b and P Z > a with remarkable asymptotic

1107

SMOOTH GAUSSIAN PROCESSES

features. Moreover, it allows us to handle processes with quite general varying


variance, to deal with the case n = and to manage the boundary effects with
great care.
The remainder of the paper is organized as follows. Our main results are
presented in Section 2 for n < and are extended to the infinite case in
Section 3. The theorems in Sections 2 and 3 are proved in Section 4. Some
related known results of differential geometry are briefly introduced in the
Appendix.
Notation. Throughout the paper, a.e. means almost every, 2 refers to the
Hilbert space of square summable sequences, x = x1 x2
is an element of
xd
either Rn or 2 , x y = nj=1 xj yj , 2 n , x = x x 1/2 , Vect x1

and Vect x1
xd denote the linear subspace spanned by x1
xd and
its orthogonal, respectively, Gram x1
xd is the determinant of the matrix G with entries Gij = ai aj [note that det G = det2 x1
xd ]. n
is the Gaussian measure on Rn with density n x = 2 n/2 exp x 2 /2 ,
x
x = 1 x and x = y dy. By convention, = 0 and = 1.
m
A denotes the set of functions A R having kth-order continuous
derivatives for k = 1
m. The partial derivatives k+l r x y /xk yl where
k+l
r
A are written Dkl r x y . The Jacobian matrix of a differentiable
mapping p: Rn Rn is denoted Dp.
2. Main results.
2.1. An integral formula. Let X t , t I = 0 T , be a Gaussian process
2
with mean 0 and variance X
t > 0, of the form
(4)

1
X t = X
t U t

U t = g t

1
where X t = X
t , g t = g1 t
gn t , n 2, and = 1
n
is a Gaussian r.v. with zero mean and identity covariance matrix. With this
representation, the covariance function rX t1 t2 of X t is given by
1
1
1
1
rX t1 t2 = X
t1 X
t2 rU t1 t2 = X
t1 X
t2

g t1 g t2

2
2
and Dkl rU t1 t2 = g k t1 g l t2 . Since X
t = rX t t = g t 2 /X
t ,
2
g t
= rU t t 1 and g t parameterizes a curve embedded in the
unit sphere Sn1 in Rn . Let us denote kl u t = Dkl rU t1 t2 t1 = t2 = t . In this
subsection, we assume that

Condition 1. X t is in 2 I and rU t1 t2
derivatives Dkl rU t1 t2 for 0 k l 2.
Condition 2. 11

has continuous partial

t = 0 for all t I.
3/2

Condition 3. t: cg t = 0 t: X t X t 12 u t /11 u t +
X t /11 u t > 0 , where the function cg t 0 defines the geodesic

1108

J. DIEBOLT AND C. POSSE

curvature of at the point g t (see the Appendix) and is given by


cg2 t = 11 u t 22 u t 212 u t /311 u t 1.
Condition 4. Whenever g t , g t and g t are linearly dependent for
t = t, g t = g t where g t = X t g t + X t g t /11 u t .
Remark. We can relax Condition 2, allowing a finite set I0 of points
such that 11 u t = 0 for t I0 . Though technical, Conditions 3 and 4
are very weak and easily checked. A sufficient standard condition is that
U t U t U t U t admits a joint density for all t = t. However, Conditions 3 and 4 are required only for the derivation of an explicit formula in
Theorem 1 and will not be used in the following subsections.
Remark. If X t 1/ is constant, Condition 3 is automatically satisfied,
g t = g t and Condition 4 means that the curve has no self-intersection,
that is, t = t g t = g t . In other words, rU t t < 1 for all t t 0 T
if g t [i.e., U t ] is T-periodic and for all t t 0 T otherwise.
We are interested in finding nonasymptotic estimates for the distribution of
Z = sup X t
tI

The key idea of our approach is to transform this problem into a geometric
problem concerning the standard Gaussian measure of certain convex subsets
of Rn . We obtain an integral formula for the density fZ of Z which is stated in
Theorem 1. The derivation of this formula which is sketched below is greatly
simplified if we parameterize with unit speed. This can be done without
loss of generality when Condition 2 holds. Let us define the Gaussian process
Y s , s J, as
1
Y s = Y
s V s

V s = f s

1/2

where Y s = X 1 s , f s = g 1 s and s = t = 0 11 u t dt
defines a unit speed parameterization of , J = 0 L with L = = T .
Then we have
Z = sup X t = sup Y s
tI

sJ

The covariance function of Y s is given by rY s1 s2 = rX 1 s1 1 s2 .


Moreover, f s = f s 1 for all s J. Note that, in terms of s = t ,
Condition 3 becomes: s: cg2 s = f s 2 1 = 0 s: Y s + Y s > 0 .
For simplicity, all our results will be expressed in terms of this unit speed
parameterization. However, simple transformations give the corresponding
formulas expressed in the original parameterization. In practice, only the latter are used.
Our method relies on the existence of an orthonormal moving frame f s ,
T s , K1 s
Kn2 s of Rn such that the space tangent to Sn1 at f s is
spanned by T s , K1 s
Kn2 s and T s = f s (see the Appendix).

1109

SMOOTH GAUSSIAN PROCESSES

Let us consider the realization of


P as Rn Rn n , where
n
n
R denotes the Borel -field of R . It follows that
1
Y s x = Y
s x f s

x Rn

is a realization of the process Y s and


(5)

P Z a = n Ca

aR

where
Ca = x Rn : sup Y s x a
sJ

The boundaries Cb of Cb , b a, partition Ca . Indeed, by Lemma 7 in Section 4,


Cb = x Rn : sup Y s x = b
sJ

This suggests that a suitable change of variable will express n Ca as an


integral over b a of appropriate superficial measures of Cb :
a

(6)

n Ca =

b Cb db =

fZ b db

Such a decomposition can be worked out basicallyfor simplicity, we assume Y s X t periodic here, the aperiodic case can be treated essentially
in the same waybecause it is possible (see Lemma 9) to parameterize Cb
by
n2

pb s u = c1 b s f s + c2 b s T s +

uj Kj s
j=1

where s J, u = u1
un2 Db s , c1 b s and c2 b s are defined in
terms of b, Y s and Y s , and Db s is a closed convex subset of Rn2 . We
show in Lemmas 11 and 12 that the transformation p: b s u pb s u
is a C1 -diffeomorphism from an open subset of Rn into Rn . By the change-ofvariable formula and Fubinis theorem, we have, for all A Rn ,
n A =
=
=

b s u p1 A
bR
bR

n p b s u Gram1/2 Dp b s u db ds du

s u pb1 ACb

n pb s u Gram1/2 Dp b s u ds du db

b A Cb db

Since Ca Cb = Cb if b a and Ca Cb = otherwise, we obtain (6).


Remark. The canonical superficial measure on Cb induced by n x is
defined by
b A Cb =

s u p1
b ACb

n pb s u Gram1/2 Dpb s u ds du

1110

J. DIEBOLT AND C. POSSE

[Berger and Gostiaux (1988), page 203]. If Y s 1/ is constant, it results


from Lemma 11 that b ACb = b ACb for all A Rn . Otherwise,
b appears as a weighted version of b , with weight Y s :
(7) b A Cb =

s u pb1 ACb

n pb s u Y s Gram1/2 Dpb s u ds du

The above approach leads to the following expression for fZ b , b R.


Theorem 1.

Under Conditions 14, the density of Z is given by


L

fZ b =

(8)

Db s

Y s b Y s + Y s

u1 cg s

n p b s u du1 dun2 ds + Z b
where
Z
with

0
b = Y 0 bY 0 n1 Gb 0

+ Y L bY L n1 Gb L

Db s = u = u1

1
un2 Rn2 : sup Y
s

Y s L-periodic
otherwise

p b s u f s

s J

n2

p b s u = b Y s f s + Y s T s
Gb l = v = v1

uj Kj s
j=1

1
vn1 Rn1 : sup Y
s
s J

pb l v f s

and
n2

pb l v = bY l f l + vn1 T l +

vj Kj l
j=1

l=0 L

Remark. Taylor expansions of order 1 of pb 0 v f s bY s [respectively pb L v f s bY s ] around l, l = 0 L, show that Gb 0 [respectively


Gb L ] has Lebesgue measure 0 in Rn1 if Y s is L-periodic.
2.2. Nonasymptotic bounds. Several nonasymptotic upper and lower
bounds for P Z > a have been proposed [see, e.g., Samorodnitsky (1991),
and the references therein, Berman and Kono (1989), and Weber (1989)].
However, these bounds, obtained in general in a broader context, either involve unknown constants or are too crude to be used as p-values in statistical
tests. In this section, we provide explicit sharp bounds for fZ b that turn
out to be asymptotic to the density as b , as shown in the next section.
From the integral representation (8), we can, under weaker assumptions
with the help of a perturbation argument, deduce an efficient and easily computable upper bound for fZ b , b R.

1111

SMOOTH GAUSSIAN PROCESSES

Theorem 2.

Under Conditions 1 and 2 and for b R

fZ b M b
=

b
2

Y s Y s + Y s exp

b Y s + Y s
cg s

(9)
+

1
2

Y s cg s exp

where
M

b2 2
s + Y 2 s
2 Y

ds
b2 2
s + Y 2 s
2 Y

b Y s + Y s
cg s

0
bY 0
b = Y 0 bY 0

+ Y L bY L 1

ds + M b

Y s L-periodic
bY L

otherwise

Remark. Expressions in terms of t = 1 s for M b and M b are ob1/2


tained by replacing L by T, ds by 11 u t dt, cg s by cg t (see the Appendix),
1/2

Y s by X t , Y s by X t /11
X t /11 u t .

t and Y s by X t 12

3/2

t /11

t +

The upper bound M b can be used to derive a lower bound for fZ b , b > 0.
Indeed, the integral formula (8) can be rewritten as
fZ b =

b
2

L
0

Y s Y s + Y s
exp

(10)

1
2

b2 2
s + Y 2 s
2 Y

L
0

Y s cg s exp

Db s

n2 Db s ds

b2 2
s + Y 2 s
2 Y

u1 n2 u du1 dun2 ds + Z b

By the definition of Db s and relation (5), n2 Db s and D s u1 n2 u du


b
can be interpreted in terms of P sups J Ws s b , where Ws s is a suitable Gaussian process of the form (4) for a.e. s J. Therefore, Theorem 2
provides upper bounds for P sups J Ws s > b and the absolute value of
the second term on the right-hand side of (10). This approach requires the
following assumptions:
Condition 5. X t is in 3 I and rU t1 t2
derivatives Dkl rU t1 t2 for 0 k + l 6.

has continuous partial

1112

J. DIEBOLT AND C. POSSE

Condition 6. For all t = t I, the joint distribution of U t U t


U t U t admits a density.
More precisely, for each s int J ,
n2 Db s

= P sup Y s b Y s = b Y s = 0
s J

1
= P sup Y
s
s J

b Y s f s f s

+ Y s T s f s

n2

j Kj s f s
j=1

where = 1
n2 is a Gaussian r.v. with mean 0 and identity covariance matrix. The Gaussian process Y s
Y s = b Y s = 0 has mean
1
given by bY
s Y s f s f s + Y s T s f s
and variance given by
2
Y
s 2s s where 2s s = 1 f s f s 2 T s f s 2 > 0 by Condition 6.
Therefore,
n2

(11)

n2 Db s

= P for all s J

j Kj s f s
j=1

bs s

where
s s = Y s Y s f s f s
1

= b Y s

bE Y s

Y s T s f s
Y s =b Y s =0

If s s > 0 for all s = s, we show (Theorem 4) that n2 Db s 1


as b . Note that, for such an s, Y s + Y s 0 since s s + h =
Y s + Y s h2 /2 + o h2 as h 0. If s s < 0 for some s = s = s, then
n2 Db s
b1
s s s 0 as b and is negligible compared
s
to the previous case. Finally, if s s 0 for all s = s and s s = 0 for
some s = s, n2 Db s is also small for large b (see Theorem 4). Moreover,
for most of the processes of interest [e.g., Y s constant on J or admitting at
least one minimum in int J or a unique minimum at the boundary, say 0,
and Y 0 = 0], the set J+ = s: s s > 0 for all s = s and Y s + Y s > 0
has positive Lebesgue measure and contains all points giving the largest contribution to fZ b . For the case where Y s admits a unique minimum at the
boundary, say 0, and Y 0 = 0, it can be shown that the main contribution to
the density is given by Z b (see Theorem 4) and, for the sake of brevity in
the present paper, we have chosen to take 0 as the lower bound for Z b .
From the above considerations, it follows that we can restrict ourselves to
the subset J+ J in the elaboration of a lower bound for fZ b , taking 0 on
J\J+ . On J+ , it is possible to rewrite (11) in terms of a process of the form (4)
and to use Theorem 2 to provide a lower bound for fZ b :
n2 Db s

= P sup Ws s b
s J

s J+

1113

SMOOTH GAUSSIAN PROCESSES

where Ws s = s1 s ks s , s1 s = Var1/2 Ws s = s s 1
s >0
s
and ks s = ks 1 s
ks n2 s
parameterizes a curve s on the unit
sphere of Rn2 . This curve is the normalized orthogonal projection of on
Vect f s T s and consequently s ks s is not unit speed. As in subsection 2.1, we need the autocovariance function of ks s which is given
by
rs s1 s2 = ks s1 ks s2

= s1 s1 1
s2
s

f s1 f s2

f s f s1

T s f s1

f s f s2
T s f s2

if s1 = s and s2 = s, rs s s2 = f s + f s f s2 / cg s s s2 and rs s s =
1. By Condition 5, rs s1 s2 has continuous partial derivatives Dkl rs s1 s2
for 0 k l 2. Let us denote kl s s = Dkl rs s1 s2 s1 = s2 = s .
In order to apply Theorem 2 to Zs = sups J Ws s , it remains to determine
1/2
ks s and the geodesic curvature cg s s of s . We have ks s = 11 s s >
0 by Condition 6 and cg2 s s = 11 s s 22 s s 212 s s /311 s s 1.
Theorem 3.

Under Conditions 2, 5 and 6 and for b > 0

fZ b m b
b
s Y s + Y s
=
2 J+ Y

(12)
b2 2
exp
Y s + Y 2 s
1
Ms b db ds
2
b
b2 2
1
s + Y 2 s 1 s Ms b ds
Y s cg s exp

2 J+
2 Y
where 0 < s = inf s J s s cg1 s Y s + Y s
Ms b =

b
2

L
0

s s s s exp
bs s
cg s s

Ms b =

1
2

2 s
b2 2
s s + s
2
11 s s
s ds

L
0

s 0 bs 0

1/2

11

s s cg

s exp

bs s
cg s s

for all s J+ and

1/2

11

b2 2
2 s
s s + s
2
11 s s

s ds + Ms b
Y s L-periodic

bs 0
1/2

11

+ s L bs L

bs L
1/2

11

otherwise


and
s s = s s s s 12

3/2

s /11

s + s s /11

Remark. Expressions in terms of t = 1 s are obtained by applying the transformations of Theorem 2 and by replacing s s by t t =
t t /t t where 2t t = 1 rU t t 2 D210 rU t t /11 u t , t t =
1/2
X t X t rU t t X t D10 rU t t /11 u t , s s by t t /11 u t ,
3/2

s s by t t 12 u t /11 u t + t t /11 u t and rs s1 s2 by rt t1 t2


with similar transformations for kl s s .
The function M_s(b) involved in the expression of m(b) may be too complex for practical purposes if the process Y(s) is not stationary. However, by means of Laplace's formula for integral representation [De Bruijn (1962), page 65], we can derive a good approximation of M_s(b) already for moderate b, in terms of a quantity defined in Theorem 5 that can be numerically computed using finite differences for the derivatives involved.
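As an illustration of the type of approximation meant here (our sketch, not the authors' code), the following compares a Laplace-type approximation of an integral of the form ∫_0^L w(s) exp(−(b²/2) g(s)) ds, with g minimized at an interior point, against numerical quadrature; the functions w and g below are arbitrary stand-ins, not quantities from the paper.

import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

L = 1.0
g = lambda s: 1.0 + (s - 0.4)**2 + 0.3 * (s - 0.4)**3    # toy function with an interior minimum
w = lambda s: 1.0 + 0.5 * np.cos(s)                       # toy weight

def laplace_approx(b):
    # Laplace's formula: I(b) ~ w(s*) exp(-(b^2/2) g(s*)) sqrt(2*pi / ((b^2/2) g''(s*)))
    res = minimize_scalar(g, bounds=(0.0, L), method="bounded")
    s_star, g_star = res.x, res.fun
    h = 1e-4
    g2 = (g(s_star + h) - 2.0 * g(s_star) + g(s_star - h)) / h**2   # finite-difference g''
    lam = b**2 / 2.0
    return w(s_star) * np.exp(-lam * g_star) * np.sqrt(2.0 * np.pi / (lam * g2))

def exact(b):
    return quad(lambda s: w(s) * np.exp(-(b**2 / 2.0) * g(s)), 0.0, L)[0]

for b in (3.0, 5.0, 8.0):
    print(b, exact(b), laplace_approx(b))     # the two values agree better as b grows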
2.3. Asymptotic behavior of f_Z(b), M(b) and m(b). A remarkable feature of our nonasymptotic approach is that it naturally produces very fine asymptotics for f_Z(b), and hence for P(Z > a). Theorem 4 states first-order asymptotic results. It shows that in general our bounds are asymptotic to f_Z(b) as b → ∞ and it makes explicit simple asymptotic formulas for f_Z(b). In particular, expression (13) extends the well-known asymptotic formula (14) to a class of smooth Gaussian processes with varying variance σ_X²(t). Moreover, it shows that the remainder term f_Z(b) − e(b), where e(b) denotes the asymptotic equivalent of f_Z(b) under consideration, has a superexponential decay as b → ∞. In addition, the fine first-order asymptotic results obtained by Konstant and Piterbarg (1993), Corollaries 2.2 and 2.3, in a broader context, are recovered in statements (iii) and (iv).
In the case of smooth stationary nonperiodic Gaussian processes [hence σ_Y(s) ≡ 1], Piterbarg (1981) provides a higher-order asymptotic expansion for P(Z > a). Under conditions comparable to Conditions D2–D4 in Section 3, his Theorem 2.2 states that there exist constants α, 0 < α ≤ 1, and B such that

| P(Z > a) − [(L/2π) exp(−a²/2) + 1 − Φ(a)] | ≤ B exp(−a²(1 + α)/2).

We do not give such a precise result in the nonperiodic case [Theorem 4(ii)] but we do [Theorem 4(i)] in the periodic case (not treated by Piterbarg). However, under slightly stronger conditions than those of Theorem 2.2, Piterbarg (1981) obtains the highest-order expansion to our knowledge. His Theorem 2.3 states that there is an L₀ small enough such that, for all L ≤ L₀,

P(Z > a) = (L/2π) exp(−a²/2) + 1 − Φ(a) + C a^{−5} exp(−a²(1 + c_g²)/(2c_g²)) (1 + o(1))


as a → ∞, where the constant C is given explicitly in terms of L, c_g and the sixth spectral moment λ₆ of the process.
Together with our nonasymptotic results, Theorem 4 enables us to derive
simple second-order asymptotic bounds for f_Z(b) (Theorem 5). In the case
of smooth stationary Gaussian processes (Examples 1 and 2), these bounds
improve the well-known results (2) and (3). In the case of smooth Gaussian
processes with a variance admitting a finite number of maxima (Example 3),
we refine Corollaries 2.1 and 2.2 of Konstant and Piterbarg (1993), producing
rates of decay of the remainders.
Theorem 4. Under the conditions of Theorem 2 for M(b) and of Theorem 3 for m(b), and consequently for f_Z(b),

f_Z(b) = e(b)(1 + R(b)),
M(b) = e(b)(1 + R_M(b)),   R_M(b) > 0,
m(b) = e(b)(1 − R_m(b)),   R_m(b) > 0,

as b → ∞, where:
(i) if J \ J₊ has Lebesgue measure 0 and σ_Y(s) is periodic, or not periodic with min(σ_Y(0), σ_Y(L)) > inf_{s∈J} σ_Y(s),

e(b) = (b/2π) ∫_0^L σ_Y(s)(σ_Y(s) + σ_Y''(s)) exp(−(b²/2)(σ_Y²(s) + σ_Y'²(s))) ds      (13)

and R(b) = R_M(b) = R_m(b) = O(b^{−δ}) for some δ > 0;

(ii) if σ_Y(s) is not periodic and σ_Y(s) ≡ 1 is constant,

e(b) = (b/2π) L exp(−b²/2),      (14)

R(b) = R_M(b) = O(b^{−1}) and R_m(b) = O(b^{−δ}) for some δ > 0;


(iii) if Y s is not periodic, Y s = < Y s for all s = s s int J
k
and q = inf k 1: Y s = k > 0 is assumed finite, then
1. e b = 1 + / 1/2 b and R b = RM b = Rm b = o 1 if
q=2
2. e b = b12/q 2 1/2 21/q q 1/q q1 q1 q + 1 1/q b and
R b = RM b = Rm b = o 1 if q 3
(iv) if Y s is not periodic, Y 0 = < Y s for all s = 0 and q = inf k
k
1: Y 0 = k > 0 is assumed finite, then
1. e b = M b R b = o 1 RM b = O b and Rm b = 1
O b for some > 0 if q = 1
2. e b = 21 1 + / 1/2 + 1 b R b = RM b = o 1 and
Rm b = 1 21 1 + / 1/2 + 1 1 o 1 if q = 2
3. e b = b12/q 2 1/2 21/q q 1/q q1 q1 q + 1 1/q b and
R b = RM b = Rm b = o 1 if q 3.


Remark. The results stated in Theorem 4(iii) and (iv) can be easily adapted to the case where σ_Y(s) reaches its absolute minimum on a finite set of points, by adding the asymptotics over these points.
Note that when the process Y(s) reaches its maximum at the boundaries with high probability, the main contribution to the density is given by the additional term Ψ_Z(b). This phenomenon affects the good behavior of m(b), since we have chosen to take 0 as the lower bound for Ψ_Z(b) for the sake of brevity. However, it would be possible to improve m(b) by introducing a term 0 < m_∂(b) ≤ Ψ_Z(b) which corrects this imperfection. This subject is under current research.
Theorem 5. Assume that Conditions 2, 5 and 6 hold, J \ J+ has Lebesgue
measure 0 and, for a.e. s J the function s s s reaches its infimum
k in J with s si > 0 for
s > 0 at a finite number of points si i = 1
si int J and s si = 0 or s si > 0 if si = 0 L. Then there exists for a.e.
s J a positive number s such that, for b ,
b
2

fZ b

L
0

Y s Y s + Y s
exp

b2 2
s + Y 2 s
2 Y

u s b ds + Z b

where
u s b = 1 s 1

b s

1+o 1

+ s cg s b Y s + Y s
fZ b

b
2

L
0

Y s Y s + Y s exp

b s

1+o 1

b2 2
s + Y 2 s
2 Y

l s b ds

where
l s b = 1 s 1

1+o 1

b s

s cg s b Y s + Y s
The function s is given by s =

k
i=1

i s

b s

1+o 1

where:

(i) i s = 1 + s 11 s si /s si 1/2 if si int J


(ii) i s = 1 + 1 + s 11 s si /s si 1/2 /2 if si = 0 L s si = 0
s si > 0 and Y s is not periodic,
(iii) i s = 1/2 if si = 0 L s si = 0 and Y s is not periodic.
Remark. Since 1
b s b s /b s as b and s
Y s + Y s /cg s , it follows that u s b 1 + o 1 as b .
Example 1. If Y s , satisfying the conditions of Theorem 5, is stationary
and L-periodic, then cg s cg , s s = s . Therefore, s cg1 ,


s and

fZ b bL 2

exp b2 /2 1 1

1+o 1

+ cg b b 1 + o 1

Moreover, the function s is even and reaches a local minimum at s = 0 and


0 = cg1 . If this minimum is global, it turns out that = 31/2 . Otherwise,
1 cg > 0 and
1

fZ b bL 2

exp b2 /2 1 b b1 1 cg 1 + o 1

for b large enough. Therefore,

exp a2 /2

P Z > a L 2

1 cg L 2

1 + 2

1/2

1 + 2

1/2

a 1 + 2

1/2

a 1 + 2

1/2

1+o 1

which improves the well-known upper bound (3) for a large enough. Similarly,
1

P Z > a L 2

exp a2 /2

1 + cg L 2

1+o 1

Example 2. If Y s , satisfying the conditions of Theorem 5, is stationary


but not periodic, then cg s cg , s s = s s , 1 s cg for all s and
P Z > a L 2

exp a2 /2 + 1

1 + 2 s

1/2

a 2

a 1 + 2 s

1/2

for a large enough, which also improves (3). Similarly,


P Z > a L 2

exp a2 /2 2

1 + 2 s

1/2

L
0

L
0

s 1 s cg

1 + o 1 ds

s 1 s + cg

a 1 + 2 s

1/2

1 + o 1 ds

Example 3. If = Y s < Y s for all s = s , s int J , there exists


> 0 such that J = s s + J+ . We can then apply Theorem 5 to
the restriction of J to J . Let us denote
e b =

b
2

s +
s

Y s Y s + Y s exp

b2 2
s + Y 2 s
2 Y

ds

2
and + 2 = inf sJ\J Y
s + Y 2 s . By assumption, > 0. From (9) and
(12), we have M b = e b + M b + O b +
and m b = e b 1
O b
as b , with = inf sJ s > 0. As min Y 0 Y L > , it
results that

fZ b = e b 1 + O b

for some > 0 since e b = e b 1+O b +


with e b given by (13). In
other words, the remaining term fZ b e b has a superexponential decay.
Note that in Konstant and Piterbarg (1993) it is not proved that the remaining


term P Z > a E a , where E a denotes the asymptotic formula they


obtain by making use of Laplaces formula, has such a superexponential decay.
3. Extension to series representation. The results of Section 2
can be extended to the maximum Z of Gaussian processes of the form
1
X t = X
t U t , t I = 0 T , where X t > 0, U t is a centered
Gaussian process with unit variance and covariance function rU t1 t2 =

g t1 g t2
=
The functions g t = g1 t g2 t
j=1 gj t1 gj t2
parameterize a curve embedded in the unit sphere of 2 . Let us denote
kl u t = Dkl rU t1 t2 t1 =t2 =t . We assume that
Condition D1. The function X t is in 3 I and rU t1 t2 has continuous partial derivatives Dkl rU t1 t2 for 0 k l 4.
Condition D2. 11

t = 0 for all t I.

Under these conditions, we give an upper bound for the density fZ b of


Z = suptI X t . With the following assumptions we also derive a lower bound.
Condition D3. The function X t is in 4 I and rU t1 t2 has continuous partial derivatives Dkl rU t1 t2 for 0 k + l 8.
Condition D4. For all t = t I, the joint distribution of U t U t
U t U t admits a density.
T

1/2

The length and geodesic curvature of are given by L = 0 11 u t dt


and cg2 t = 11 u t 22 u t 212 u t /311 u t 1. As in Section 2, has
t

1/2

a unit speed parameterization s = t = 0 11 u t dt under Condition D2.


1
Therefore, Z = suptI X t = supsJ Y s , where Y s = Y
s V s with
1
1
Y s = X s , V s = U s a Gaussian process of variance 1 and
covariance function rV s1 s2 = f s1 f s2 with f s = g 1 s . The functions s s , s s , s s , s , rs s1 s2 , kl s s , cg s s and s s defined
in Section 2 are also well defined in the present context and we can show the
following result.
Theorem 6. Under Conditions D1 and D2, Z has a density f_Z(b) and f_Z(b) ≤ M(b) for b ∈ R, with M(b) given by (9). Under Conditions D2–D4, f_Z(b) ≥ m(b) for b > 0, with m(b) given by (12). In addition, Theorems 4 and 5 still hold.
4. Proofs of the results of Sections 2 and 3.

4.1. Proof of Theorem 1. We give a detailed proof for the case Y s periodic
and sketch the straightforward adaptation for the other case. A complete proof
of all results can be found in Diebolt and Posse (1995) and is available upon
request.


We need a workable description of the boundary Ca of Ca for a R. Note


first that Ca is the intersection of the closed half spaces x Rn : Y s x a ,
s J. These half spaces have hyperplane boundaries given by
H s a = x Rn : Y s x = a

(15)

sJ

Lemma 7 shows that the surface Ca is closely related to the hypersurface


enveloped by the hyperplanes H s a , s J.
Lemma 7.

If Ca = Ca = x Rn : supsJ Y s x = a .

Lemma 8. Ca = a Ca where a = sJ a s with a s denoting the


affine subspace of dimension n 2 of Rn defined by the equations

Lemma 9.

x f s

= aY s

x T s

= aY s

(i) The hypersurface

can be parameterized by
n2

(16)

pa s u = a Y s f s + Y s T s

uj Kj s
j=1

with s J u = u1
un2 Rn2 .
(ii) The hypersurface Ca can be parameterized by pa s u with s J and
u Da s , where Da s is the closed convex subset of Rn2 (possibly empty)
defined by the set of inequalities sups J Y s pa s u a .
Lemma 10. Let s0 J be given.
(i) If u Da s0 = then d a s0 u1 cg s0 0.
(ii) If u int Da s0 = and cg s0 > 0, then d a s0 u1 cg s0 > 0,
where d a s = a Y s + Y s .
Proof. (i) For each fixed u Rn2 and s0 , the function hu s0 s =
Y s pa s0 u s J, is twice differentiable and hu s0 s0 = 0. Furthermore, since f s = cg s K s f s for all s J, hu s0 s0 = d a s0
u1 cg s0 /Y s0 . If u Da s0 = , hu s0 s reaches its maximum value at
s = s0 , implying that hu s0 s0 0.
(ii) Suppose that u int Da s0 = . Let us show by contradiction that
hu s0 s0 < 0. Otherwise, we would have hu s0 s0 = 0 by (i). If hu s0 s0 = 0,
since cg s0 > 0 and u int Da s0 , we can pick v Da s0 (close enough to
u) such that hv s0 s0 > 0 (by taking v1 > u1 ), which contradicts (i).
Let us define the C1 -function
p b s u = pb s u

bR

sJ

u = u1

un2 Rn2


This function maps Va = b s u : b a s J and u Db s Rn onto


Ca Rn since b a Cb = Ca . Moreover, the restriction of p b s u to the open
subset V0a = b s u : b < a b = 0 s int J and u int Db s
Va
maps V0a onto a subset C0a of Ca . In the following, we will assume that a is
such that V0a = .
Lemma 11. (i) det Dp b s u = Y s d b s u1 cg s .
(ii) The function p b s u is a local C1 -diffeomorphism from V0a onto C0a
and C0a is an open subset of Rn .
2
Proof. (i) follows from the fact that Gram Dp = Y
s d b s u1 cg s 2
and (ii) follows from (i), Lemma 10, Condition 3 and that det Dp b s u = 0
for b s u V0a .

Lemma 12. The function p is a one-to-one mapping from V0a onto C0a .
Proof. Using Condition 4, the proof is similar to the proof of Lemma
10(ii).
Lemma 13.
(17) n C0a =

V0a

Y s d b s u1 cg s n p b s u du1 dun2 ds db

Proof. According to Lemmas 11 and 12, the function p is a C1 diffeomorphism from V0a onto C0a . Moreover, according to Lemmas 10 and 11,
det Dp b s u > 0 for all b s u V0a . Then (17) results from the change of
variable x = p b s u applied to the integral n C0a = C0 n x dx.
a

Since Db s is a convex of Rn2 , the Lebesgue measure in Rn2 of Db s


is 0. Therefore, the Lebesgue measure of Va \ V0a is 0 and we can replace V0a
by Va in (17). Moreover, Ca \ C0a has Lebesgue measure 0 since Ca \ C0a =
p Va \p V0a p Va \ V0a and p b s u is a C1 -function from Rn to Rn .
Consequently, n Ca = n C0a which, with (5) and (6), concludes the proof of
Theorem 1 for Y s periodic.
Suppose now that Y s is not periodic. Lemma 7 still holds. In Lemma 8, we
have to replace a by a = sint J a s H 0 a H L a , where H l a
is defined in (15). Then Ca can be partitioned as Ca = Ca int Ca 0 Ca L ,
where Ca int = sint J a s Ca , Ca l = H l a Ca , l = 0 L. Lemmas 9
13 can be applied without modification to Ca int = b a Cb int . The additional
term Z b is obtained from the boundaries Cb l , b a, l = 0 L. Indeed,
for l = 0 L, Cb l can be parameterized by pb l v = bY l f l + vn1 T l +
n2
n1
: sups J Y s 1 pb l v f s b .
j=1 vj Kj l with v Gb l = v R
2
Since Gram Dpb l = Y l , we have b Cb l = vG l n pb l v Y l dv.
b


4.2. Proof of Theorem 2. (i) Under Conditions 14, the inequality (9) is a
direct consequence of Lemma 10.
(ii) To enlarge the scope of this inequality, we use the following perturbation
argument. Let us consider the auxiliary Gaussian process Y s , s J,
1
Y s = Y
s V s

V s =

n+k

j fj s

j=n+1

where j , j = n + 1
n + k, are independent standard Gaussian r.v.s, inden+k
2
2
pendent of j , j = 1
n, n+k
j=n+1 fj s 1 for s J, the
j=n+1 fj s =

functions fj s are in J , k is sufficiently large to ensure that the vectors


f s1 , f s1 , f s2 , f s2 and f s3 , where f s = fn+1 s
fn+k s ,
are linearly independent for all s1 = s2 , s1 = s3 and s2 = s3 in J. For instance,
p
V s = i=1 bi n+2i1 cos is/A + n+2i sin is/A with bi = 0, i = 1
p,
p
p
2
2
2 2
i=1 i bi = A and L/A = 0 mod 2 . This process is stationary,
i=1 bi = 1,
nonperiodic on J and its geodesic curvature cg s satisfies cg s cg > 0. We
form the process
1
Y s = Y
s 1 + 2

1/2

V s + V s

1
= Y
s

n+k

j f

j=1

where f j s = 1 + 2 1/2 fj s for j n and f j s = 1 + 2 1/2 fj s


for j > n.
It is easily shown that Y s is a centered Gaussian process with variance
2
Y
s and that f s = f 1 s
f n+k s , s J, parameterizes with unit
speed a curve whose geodesic curvature c g s = 1+2 1/2 cg2 s +2 cg2 1/2
is positive for > 0. Hence, Y s satisfies Condition 3 for > 0. By linear
independence, it also satisfies Condition 4 for > 0. Therefore, we can apply
Theorem 1 and (i) to Y s , for all > 0, to obtain that fZ b M b , where
Z = supsJ Y s and M b is given by (9) with cg s replaced by c g s .
Then
M b

b
2
+

L
0

1
2

Y s Y s + Y s exp
L
0

Y s c g s exp

b2 2
s + Y 2 s
2 Y

b2 2
s + Y 2 s
2 Y

ds
1/2

ds + M b

where M b is independent of . For 0 , the expression on the righthand side is bounded above by M b = O b exp b2 2Y /2 , where Y =
inf sJ Y s > 0 .
Finally, P Z a P Z a for all a as 0 since
1
Z Z Y

1 + 2

1/2

1 sup V s + sup V s
sJ

sJ

It follows that fZ b fZ b as 0 as in the last paragraph of the proof


of Theorem 6.


4.3. Proof of Theorem 3. For each s J+ , n2 Db


be interpreted as P Zs b , where Zs = sups J Ws s

n2

K s f s /s s
Ws s = j=1 j j

1 cg s / Y s + Y s

s
,

in formula (10) can

s =s
s =s

with = 1
n2 a Gaussian r.v. with zero mean and identity covariance matrix. With this definition, Ws s is defined and continuous on
J, Var Ws s
= 2s s /2s s for s = s and Var Ws s = cg2 s / Y s +
Y s 2 . Moreover, Ws s is of the form (4) and satisfies Conditions 1 and 2:
Ws s = s1 s ks s , where s s = Var Ws s 1/2 and ks s =
ks 1 s
ks n2 s with ks j s = Kj s f s /s s for s = s, ks 1 s =
1 and ks j s = 0, j 2. Therefore, we can apply Theorem 2 to Zs to obtain
an upper bound for 1 n2 Db s .
This interpretation can also be used to derive an upper bound for the second term in (10). Indeed, by Stokes theorem [Berger and Gostiaux (1988),
page 195]
Db s

u1 n2 u du1

dun2 =

Db s

Db s

n2 u du2

dun2

n2 dV

where dV denotes the canonical volume element of the manifold Db s


[Berger and Gostiaux (1988), page 203]. A straightforward adaptation of
Lemmas 710 yields
Db s

n2 dV =

v p1
b s Db s

n2 pb

s v Gram1/2 Dpb

s v ds dv

where s J, v Rn4 and s v pb s s v defines a parameterization of


Db s analogous to (16). From (6) and (7) applied to Ws s , it follows that
Db s

n2 dV

1
f b = 1 s fZs b
inf s J s s Zs

where fZs b is the density of Zs .

4.4. Proof of Theorem 4. (i) Assume Y s periodic and J\J+ has Lebesgue
measure 0. Let us denote
b2 2
b L
Y s Y s + Y s exp
s + Y 2 s
ds
e1 b =
2 0
2 Y
By (9),
M b = e1 b +

b
2

L
0

Y s Y s + Y s
exp

b2 2
s + Y 2 s
2 Y

b Y s + Y s
cg s

ds


where 0 < H x = x1 x 1
x x3 x for all x > 0 and
H = 0. Since cg s is continuous and Y s + Y s is continuous and
positive on J, it follows that 1 = inf sJ Y s + Y s /cg s > 0 and M b
e1 b e1 b b1 3 b1 . By (12), m b = e1 b A B, where A involves

b Ms b db and B involves Ms b . Therefore, it suffices to examine the


asymptotic behavior of Ms b . Since s > 0,
Ms b

1
b2 2 s
exp
2
2
+ cg

1/2

L
0

s s

1/2

11

b s s

s ds

Using Taylor expansions of sufficient order, it can be shown that the


functions s s J J s s s s s s , kl s s , 0 k l 2,
and cg s s are continuous. Therefore, their supremum over J J is
finite and, by the positivity of s s as a function of s s J J,
2 = inf sJ s = inf s s JJ s s > 0. It follows that Ms b C1 b b2

for all s J and all b > 0. Hence, b Ms b db C2 b2 . Consequently,


A e1 b C3 b2 and B e1 b C4 b2 . From the asymptotic behavior of
M b and m b , it follows that e1 b = e b .
If Y s is not periodic and min Y 0 Y L > inf sJ Y s ,
bY 0
e1 b
L

= C5

Y s Y s + Y s exp

b2 2
2
0 Y
s Y 2 s
2 Y

ds

2
2
For > 0 small enough, the subset J = s J: Y
0 Y
s Y 2 s 2 of
2
J contains a nonempty interval [around a global minimum of Y
s + Y 2 s ].
Therefore, it has positive Lebesgue measure. For such an > 0,
L
0

Y s Y s + Y s exp
exp

b2 2
2

b2 2
2
0 Y
s Y 2 s
2 Y

ds

Y s Y s + Y s ds

= C6 1 b
Hence, bY 0 = e1 b O b3 , that is, M b = e1 b O b3 . Finally,
from the continuity and positivity of s s 0 and s L , supsJ Ms b =
e1 b O b4 .
(ii) Straightforward.
(iii) Let us take > 0 such that s s + J+ . Such an exists
since s s > 0 for s close to s and s J. Then there exists > 0 such that


2
Y
s + Y 2 s +

M b =

b
2

for s s s + and

s +

Y s Y s + Y s

exp

b2 2
s + Y 2 s
2 Y

ds 1 + O

b1
b3

+ O b b +

[see (i)]. A similar result is obtained for m b , replacing O b1 /b3 by


O b2 for some 1 and 2 > 0. This shows that M b m b fZ b . The
conclusion follows by applying Laplaces formula to the main term of M b
2
in the above expression using a Taylor expansion of order q of Y
s + Y 2 s

around s = s .
(iv) Analogous to (iii).
4.5. Proof of Theorem 5. This follows directly from a straightforward adaptation of the proof of Theorem 3 and the application of Laplaces formula to
Ms b .
4.6. Proof of Theorem 6. Lemma 14 shows that the process X t admits
the representation (1). Therefore, by truncation and renormalization, we can
construct a sequence of Gaussian processes Xn t of the form (4) which converge to X t . Moreover, under Conditions D1 and D2, Xn t satisfies Conditions 1 and 2 for all n sufficiently large. Similarly, under Conditions D2D4,
Xn t satisfies Conditions 2, 5 and 6 for n large enough. Hence, the density
fZn b of Zn = suptI Xn t has an upper bound Mn b of the form (9) and a
lower bound mn b of the form (12).
We will show that Mn b M b for all b, mn b m b for all b > 0 and
the sequence fZn is weakly relatively compact in L1 R . With an equality
due to Dmitrovskii [Lifshits (1986)], this implies that Z has a density fZ b
which is the limit of fZn b and satisfies m b fZ b M b .
Lemma 14. Under Condition D1:

1
(i) there exists a representation X
t
j=1 j gj t of X t
j j 1 are i.i.d. 0 1
(ii) the functions gj t j 1 are in 3 I

(iii) sup t1
k l 3.

t2 II

Dkl rU t1 t2

n
j=1

gj

t1 gj t2

where the r.v.s

0 as n 0

Proof. (i) Straightforward.


(ii) From (i), gj t = E U t j for all j 1 and all t I. Therefore, by
k

Theorem 2.2.2 in Adler (1981) and the CauchySchwarz inequality, gj t =


k t j for all 0 k 4, where U
k t denotes the kth quadratic mean
E U
derivative of U t .
(iii) This is a consequence of Dinis theorem and Parsevals inequality.


Let Pn : 2 2 denote the orthogonal projection of


defined by Pn x = x1
xn 0
for x 2 .

onto Rn 0

Lemma 15. (i) Under Condition D1, Pn g t = 1 + n t with n t 0


as n uniformly for t I for 0 k 3
(ii) Under Conditions D1 and D2, Pn g t g t uniformly for t I
as n and there exists N1 such that inf tI Pn g t > 0 for all n N1 .
Proof. All statements are direct consequences of Lemma 14 and the continuity and positivity of g t in I.
Hence, from Lemma 15, gn t = Pn g t / Pn g t

k
gn

= gn

gn

0
, t I, is well defined for all n N1 and
t g t uniformly
for 0 k 3. The functions gn t , t I, parameterize a curve n on
the unit sphere of Rn 0
2 . Moreover, there exists N2 such that
inf tI gn t > 0 for all n N2 . The corresponding unit speed parameterization of n is defined by fn n t = fn 1 n t
fn n n t 0
,
t
where n t = 0 gn t dt . With this notation Zn = suptI Xn t =
n
1
suptI Yn n t , where Xn t = X
t Un t , Un t =
j=1 j gn j t ,
n
1
Yn n t = Yn n t Vn n t and Vn n t = j=1 j fn j n t .
k

Lemma 16. Under Conditions D1 and D2, n t k t


n1 k n t
k
1 k t
fn n t f k t all uniformly for t I and for 0
l
l
k 3 Yn n t Y t uniformly for t I and for 0 l 3 and
cg n n t cg t uniformly for t I.
Proof. All convergences follow directly from Lemma 15 [gn t g t
uniformly], Lemma 14 and the uniform continuity of X t and its derivatives
over I.
If Condition D1 is replaced by Condition D3 in Lemmas 15 and 16, all
the results hold for 0 k 7 and 0 l 4. Let us define n n t n t =
Yn n t Yn n t fn n t , fn n t
Yn n t Tn n t fn n t .
Lemma 17. Under Conditions D1 and D2, for each compact subset K+
J+ :
1

(i) there exists N3 such that, for all n N3 = inf tK+ Yn n t +


Yn n t > 0
> 0 for all
(ii) there exists N4 such that, for all n N4 n n t n t
t K+ and t I.
Proof. (i) The proof is similar to the proof of Lemma 15(ii).
(ii) By a Taylor expansion of order 3 and Lemma 16, n n t n t + h =
Yn n t + Yn n t h2 /2 + Rn t h , where Rn t h C h 3 for some con-


stant C > 0. Moreover, by (i), we have for n N4 that n n t n t + h


h2 /2 C h > 0 for all t K+ and h = 0 such that h < = / 4C .
Since K = t t K+ I: t t is compact and the function t t
t t
is continuous and positive on K , inf t t K n n t n t
> 0
for all n N4 in view of Lemma 16. The conclusion follows for n N4 =
max N4 N4 .
Lemma 18. Under Conditions D1D4, there exists N5 such that, for all
n N5 and all t = t I fn n t
fn n t
fn n t
fn n t
are
linearly independent.
Proof. First, using a Taylor expansion of sufficient order and Lemma 16,
we show that the Gram determinant of this system is positive for all t = t
sufficiently close whenever n is large enough. Second, we take advantage of
the continuity and positivity of the corresponding Gram determinant with fn
and fn replaced by f and f for t t > 0 and its uniform convergence
n .
Lemma 19. Under Condition D1, P Zn a P Z a for all a.
Proof. First,
Z Zn sup X t Xn t inf X t
tI

2n

= sup Var U t Un t
tI

tI

sup U t Un t
tI

and
d2n t1 t2 = E U t1 Un t1

U t2 Un t2

C t1 t2

for n large enough, by Lemma 14. On the other hand, we have the following
inequality due to Dmitrovskii [Lifshits (1986)]:
P sup U t Un t > u = 2 exp u2 / 22n
tI

qn u

where qn u = 4 1 exp 21/2 6


n /u 1/2 n /u 1/2
n /u 1/2 1/2 with
1/2
n n log
C1 /n when n 0 as n and dn t1 t2 C t1 t2 .
Therefore, P Z Zn > u 0 as n and Zn converges in probability,
hence weakly, to Z. Then P Zn a P Z a for all continuity points a.
As P Z a is continuous [Tsirelson (1975)], P Zn a P Z a for
all a.
For each compact subset K+ 1 J+ , let us define the restrictions
mK+ b and mn K+ b of m b and mn b , respectively, obtained by replacing
in (12) the integral in s over J+ by the integral over K+ and n K+ , respectively. By Lebesgues convergence theorem and Lemmas 1418, it follows
that Mn b M b for all b and mn K+ b mK+ b for all b > 0. Moreover,


Mn b M b for n large enough and M b M b , where M b = C b


for some C > 0 and > 0. Hence, the sequence fZn is weakly compact in
L1 R [Bourbaki (1967), pages 112113], which means that there exist a func
tion f L1 R and a subsequence fZn such that fZn b h b db

R . By taking h b = 1 a b , it follows that


f b h b db for all h L
a
P Zn a f b db for all a. Therefore, by Lemma 19, Z has a density
fZ b = f b . Moreover, the same result holds for any accumulation point
of fZn . Finally, fZn b fZ b for a.e. b. By Tsirelson (1975), fZ b has
bounded variation on every compact interval of R. Hence, fZ b has a right
and a left limit at each point. If we select a left-continuous (say) version of
fZ b , it follows that mK+ b fZ b for all b > 0 and fZ b M b for all
b. Since the Lebesgue measure of 1 J+ \K+ can be made arbitrarily small,
it follows that m b fZ b for all b > 0.
APPENDIX
Let f s = f1 s
fn s , s J = 0 L , be a unit speed parameterization of a smooth curve embedded in the unit sphere of Rn . Therefore,
f s = f s 1, f s f s 1, f s 1 for all s J and L = is
the length of .
At each point M = M s of , we can define the unit vector tangent to
T = T s = f s and the principal normal vector N = N s = f s / f s .
Since T = 1, N = T /c is orthogonal to T, where c = c s = f s defines
the curvature of at the point M = M s .
For each s J, if N s = f s , there exists only one unit vector K=K s
Vect f T such that K Vect f N and N K = cos > 0. Moreover,
K s is C1 on each interval on which f N > 1, that is, c = 1. If c =
1 we can define K s = K s . Let us denote by K1 = K1 s
Kn2 =
Kn2 s an orthonormal basis of Vect f T such that K1 K and the functions K2
Kn2 are C1 . At each point M = M s of , the moving frame
M f T K1
Kn2 is orthonormal and the matrix of f T K1
Kn2
with respect to f T K1
Kn2 is antisymmetric:
f
0
1

T K1 K2 Kn2

1
0
0 0
0 cg 0 0

cg

f
T
K1
K2
Kn2

where cg = cg s = K1 T = c K1 N = c cos 0 defines the geodesic


curvature of at the point M = M s . By the definition of , we have f N 2 =


sin2 = 1/c2 and then cg2 = f 2 1. Note that cg = 0 iff cos = 0, that is,
f N = 1. Finally, it can be shown that K = f + f /cg .
If g t , t I = 0 T , is a general parameterization of with g t > 0
for all t I, the corresponding unit speed parameterization of is given by
t
f s = g 1 s , s J, where s = t = 0 g u du, t I, is the arc length
of from 0 to t. We have
df
dg dt
g t
g t
=
=
=
ds
dt ds
t
g t

T s =f s =
c s N s =f s =
=
cg2 t =

1
g t
g t

2
2

d
d2 f
=
2
ds
dt
g t
g t

g
g

dt
ds

g t g t
g t 2

g t g t
g t 6

g t
2

Acknowledgments. The draft of this paper was prepared when the authors were visiting the Department of Statistics at Stanford University. We
are also indebted to one referee for bringing to our attention Piterbarg's work.
REFERENCES
Adler, R. J. (1981). The Geometry of Random Fields. Wiley, New York.
Adler, R. J. (1990). An Introduction to Continuity, Extrema, and Related Topics for General
Gaussian Processes. IMS, Hayward, CA.
Azaïs, J.-M. and Florens-Zmirou, D. (1987). Approximation du temps local des processus gaussiens stationnaires par régularisation des trajectoires. Probab. Theory Related Fields 76 121–132.
Berger, M. and Gostiaux, B. (1988). Differential Geometry: Manifolds, Curves and Surfaces.
Springer, New York.
Berman, S. (1988). Sojourns and extremes of a stochastic process defined as a random linear
combination of arbitrary functions. Comm. Statist. Stochastic Models 4 143.
Berman, S. (1992). Sojourns and Extremes of Stochastic Processes. Wadsworth, Belmont, CA.
Berman, S. and Kôno, N. (1989). The maximum of a Gaussian process with nonconstant variance: a sharp bound for the distribution tail. Ann. Probab. 17 632–650.
Bourbaki, N. (1967). Integration. Hermann, Paris.
Cressie, N. (1980). The asymptotic distribution of the scan statistic under uniformity. Ann.
Probab. 8 828840.
Darling, D. A. (1983). On the supremum of a certain Gaussian process. Ann. Probab. 11
803806.
Davies, R. B. (1977). Hypothesis testing when a nuisance parameter is present only under the
alternative. Biometrika 64 247254.
De Bruijn, N. G. (1962). Asymptotic Methods in Analysis 2nd ed. North-Holland, Amsterdam.
Diebolt, J. and Posse, C. (1995). On the density of the maximum of smooth Gaussian processes.
Technical Report 95.1, Appl. Statist., Dept. Mathematics, Swiss Federal Institute of
Technology, Lausanne, Switzerland.
Johnstone, I. and Siegmund, D. (1989). On Hotellings formula for the volume of tubes and
Naimans inequality. Ann. Statist. 17 184194.


Konakov, V. D. and Piterbarg, V. I. (1983). Rate of convergence of distributions of maximal


deviations of Gaussian processes and empirical density functions. II. Theory Probab.
Appl. 28 172178.
Konstant, D. G. and Piterbarg, V. I. (1993). Extreme values of the cyclostationary Gaussian
random process. J. Appl. Probab. 30 8297.
Lachal, A. (1991). Sur le premier instant de passage de lintegrale du mouvement brownien.
Ann. Inst. H. Poincare Probab. Statist. 27 385405.
Leadbetter, M. R., Lindgren, G. and Rootzen, H. (1983). Extremes and Related Properties of
Random Sequences and Processes. Springer, New York.
Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. Springer, New York.
Lifshits, M. A. (1986). On the distribution of the maximum of a Gaussian process. Theory Probab.
Appl. 31 125132.
Marcus, M. B. (1977). Level crossings of a stochastic process with absolutely continuous sample
paths. Ann. Probab. 5 5271.
Piterbarg, V. I. (1981). Comparison of distribution functions of maxima of Gaussian processes.
Theory Probab. Appl. 26 687705.
Piterbarg, V. I. (1996). Asymptotic methods in the theory of Gaussian processes and fields.
Translations of Mathematical Monographs 148. Amer. Math. Soc., Providence, RI.
Samorodnitsky, G. (1991). Probability tails of Gaussian extrema. Stochastic Process. Appl. 38
5584.
Sun, J. (1993). Tail probabilities of the maxima of Gaussian random fields. Ann. Probab. 21 3471.
Tsirelson, V. S. (1975). The density of the maximum of a Gaussian process. Theory Probab.
Appl. 20 817856.
Weber, M. (1985). Sur la densite de la distribution du maximum dun processus gaussien.
J. Math. Kyoto Univ. 25 515521.
Weber, M. (1989). The supremum of Gaussian processes with a constant variance. Probab. Theory
Related Fields 81 585591.
CNRS
URA 397
LMC-IMAG
BP 53
38041 Grenoble Cedex 9
France

School of Statistics
College of Liberal Arts
270A Vincent Hall
206 Church Street
University of Minnesota
Minneapolis, Minnesota 55455
E-mail: cposse@stat.washington.edu

Computing the maximum of random processes and series

Jean-Marc Azaïs (Université de Toulouse), joint work with Alan Genz (Washington State University), Cécile Mercadier (Lyon, France) and Mario Wschebor.


Outline
Introduction
MCQMC computations of Gaussian integrals (reduction of variance; MCQMC)
Maxima of Gaussian processes


Introduction

The lynx data


Annual record of the number of Canadian lynx trapped in the Mackenzie River district of north-west Canada for the period 1821–1934 (Elton and Nicholson, 1942).

After passage to the log and centering


[Figure: the log-transformed and centered lynx series, 1821–1934.]


Testing
The maximum of the absolute value of the series is 3.0224. An estimate of the covariance function computed with WAFO is shown below.

[Figure: estimated covariance function of the log-lynx series.]

Can we judge the significance of this quantity?



We assume the series is Gaussian.


Let φ_Σ be the Gaussian density in R^114 (Σ the estimated covariance matrix). We have to compute

∫_{−3.0224}^{3.0224} ⋯ ∫_{−3.0224}^{3.0224} φ_Σ(x₁, …, x₁₁₄) dx₁ ⋯ dx₁₁₄.


MCQMC computations of Gaussian integrals


Reduction of variance

Let us consider our problem in a general setting. Σ is an n × n covariance matrix and

I := ∫_{l₁}^{u₁} ⋯ ∫_{l_n}^{u_n} φ_Σ(x) dx.      (1)

By conditioning, or by the Cholesky decomposition Σ = T T', we can write

x₁ = T₁₁ z₁
x₂ = T₁₂ z₁ + T₂₂ z₂
.....................................

where the z_i are independent standard normal variables. The integral I becomes

I := ∫_{l₁/T₁₁}^{u₁/T₁₁} φ(z₁) dz₁ ∫_{(l₂ − T₁₂ z₁)/T₂₂}^{(u₂ − T₁₂ z₁)/T₂₂} φ(z₂) dz₂ ⋯      (2)



Now making the change of variables t_i = Φ(z_i),

I := ∫_{Φ(l₁/T₁₁)}^{Φ(u₁/T₁₁)} dt₁ ∫_{Φ((l₂ − T₁₂ Φ⁻¹(t₁))/T₂₂)}^{Φ((u₂ − T₁₂ Φ⁻¹(t₁))/T₂₂)} dt₂ ⋯      (3)

And by a final scaling this integral can be written as an integral over the hypercube [0, 1]^n:

I := ∫_{[0,1]^n} h(t) dt.      (4)

At this stage, if form (4) is evaluated by plain MC, it already corresponds to an important reduction of variance (a factor of 10² to 10³) with respect to form (1). The transformation up to this point is elementary but efficient.
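The whole chain (1) → (4) fits in a few lines. The sketch below is our illustration (neither the MGP code nor Genz's programs): it evaluates the hypercube integrand h(t) of (4) for a rectangle probability P(l ≤ X ≤ u), X ~ N(0, Σ), using the Cholesky factor and the Φ / Φ⁻¹ substitutions; averaging h over uniform (or lattice) points in [0,1]^n then estimates I. The matrix Sigma and the bounds l, u are arbitrary examples.

import numpy as np
from scipy.stats import norm

def h(t, Sigma, l, u):
    # Integrand on [0,1]^n obtained from the Cholesky + Phi^{-1} substitution (Genz-type transform).
    T = np.linalg.cholesky(Sigma)              # lower-triangular factor, Sigma = T T'
    n = len(l)
    z = np.zeros(n)
    value = 1.0
    for i in range(n):
        shift = T[i, :i] @ z[:i]               # conditioning on the previously generated coordinates
        a = norm.cdf((l[i] - shift) / T[i, i])
        b = norm.cdf((u[i] - shift) / T[i, i])
        value *= (b - a)                       # width of the admissible interval for t_i
        z[i] = norm.ppf(a + t[i] * (b - a))    # final scaling of t_i onto that interval
    return value

# Crude Monte Carlo estimate of I; QMC points can replace the uniforms below.
rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
l, u = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
I_hat = np.mean([h(rng.random(2), Sigma, l, u) for _ in range(10000)])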

MCQMC

QMC
In the form (4) the MC evaluation is based on

Î = (1/M) Σ_{i=1}^M h(t_i);

it is well known that its convergence is slow: O(M^{−1/2}).
The quasi-Monte Carlo method is based on searching for sequences that are "more random than random". A popular method is based on lattice rules. Let z be a well-chosen integer vector in N^n; the rule consists of choosing

t_i = { i·z / M },

where the notation {·} means that we take the fractional part componentwise. M is chosen prime.
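A rank-1 lattice takes two lines to generate; the generating vector z below is an arbitrary illustration, not a vector constructed by the component-by-component algorithm mentioned on the next slide.

import numpy as np

def lattice_points(M, z):
    # Rank-1 lattice rule: t_i = frac(i * z / M), i = 0, ..., M-1, with M prime.
    i = np.arange(M).reshape(-1, 1)
    return np.mod(i * np.asarray(z, dtype=float) / M, 1.0)

pts = lattice_points(M=1009, z=[1, 433, 787])    # M prime, z in N^3 (illustrative values)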

Theorem (Nuyens and Cools, 2006). Assume that h is the tensor product of periodic functions that belong to a Korobov space (RKHS). Then the minimax sequence and the worst-case error can be calculated by a polynomial algorithm. Numerical results show that the convergence is roughly O(M^{−1}).
This result concerns the worst case, so it is not so relevant here.


A meta theorem

If h does not satisfy the conditions of the preceding theorem, we can still hope for QMC to be faster than MC.


MCQMC
Let (t_i, i ≥ 1) be the lattice sequence; the estimate of the integral can be turned into a random but exactly unbiased one by setting

Î = (1/M) Σ_{i=1}^M h({t_i + U}),

where U is uniform on [0, 1]^n.
By the meta theorem, Î has small variance. So we can make N independent replications of this calculation and construct Student-type confidence intervals. This is correct whatever the properties of the function h are.
N must be chosen small: in practice N ≈ 12.

Conclusion: at the cost of a small loss in speed (a factor ≈ 12) we have a reliable estimate of the error.
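A minimal sketch (our illustration) of the randomized estimator: N shifted copies of the lattice rule, each giving one unbiased replication, and a Student confidence interval on the N values. It reuses the illustrative lattice_points and integrand h from the sketches above.

import numpy as np
from scipy.stats import t as student_t

def mcqmc_estimate(h_func, n_dim, M=1009, z=(1, 433, 787), N=12, rng=None):
    # N randomly shifted lattice replications; returns the mean and a 95% Student-type CI.
    rng = np.random.default_rng() if rng is None else rng
    base = lattice_points(M, z[:n_dim])                 # deterministic points in [0,1)^n
    reps = []
    for _ in range(N):
        U = rng.random(n_dim)                           # one uniform random shift per replication
        shifted = np.mod(base + U, 1.0)
        reps.append(np.mean([h_func(t) for t in shifted]))
    reps = np.asarray(reps)
    m = reps.mean()
    half = student_t.ppf(0.975, N - 1) * reps.std(ddof=1) / np.sqrt(N)
    return m, (m - half, m + half)

# Example with the 2-dimensional rectangle probability of the earlier sketch:
# I_hat, ci = mcqmc_estimate(lambda t: h(t, Sigma, l, u), n_dim=2)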

This method has been used to construct confidence bands for electrical load curve prediction; see Azaïs, Bercu, Fort, Lagnoux and Lé (2009).

Maxima of Gaussian processes

Do processes exist ?
In this part X(t) is a Gaussian process defined on a compact interval [0, T].
Since such a process is always observed at a finite set of times, and since the previous method works with, say, n = 1000, is it relevant to consider the continuous case?
Answer: yes. Random processes occur as limit statistics. Consider for example the simple mixture model

H0 : Y ~ N(0, 1)
H1 : Y ~ p N(0, 1) + (1 − p) N(μ, 1),   p ∈ [0, 1], |μ| ≤ M ∈ R.      (5)


Theorem (Asymptotic distribution of the LRT)
Under some conditions the LRT of H0 against H1 has, under H0, the limiting distribution of the random variable

(1/2) sup_{|t| ≤ M} Z²(t),      (6)

where Z(·) is a centered Gaussian process with covariance function

r(s, t) = (e^{st} − 1) / sqrt((e^{s²} − 1)(e^{t²} − 1)).

In this case there is no discretization.
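The limiting variable (6) is easy to approximate once the covariance r(s, t) is coded: simulate the Gaussian vector on a fine grid and take half the squared maximum. The grid, the bound M and the number of replicates below are illustrative choices, not values from the talk.

import numpy as np

def r(s, t):
    # Covariance of the limit process Z: (e^{st} - 1) / sqrt((e^{s^2} - 1)(e^{t^2} - 1)).
    return (np.exp(s * t) - 1.0) / np.sqrt((np.exp(s**2) - 1.0) * (np.exp(t**2) - 1.0))

M_bound = 3.0
grid = np.linspace(-M_bound, M_bound, 201)
grid = grid[np.abs(grid) > 1e-3]                        # r is not defined at 0
C = r(grid[:, None], grid[None, :])
w, V = np.linalg.eigh(C)
A = V * np.sqrt(np.clip(w, 0.0, None))                  # robust square root of the covariance

rng = np.random.default_rng(1)
Z = A @ rng.standard_normal((len(grid), 20000))
samples = 0.5 * np.max(Z**2, axis=0)                    # replicates of (1/2) sup Z^2(t) on the grid
q95 = np.quantile(samples, 0.95)                        # e.g. an approximate 5% critical value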


The record method

P{M > u} = P{X(0) > u} + ∫_0^T E[ X'(t)⁺ 1_{X(s) ≤ u, s < t} | X(t) = u ] p_{X(t)}(u) dt.      (7)

After discretization of [0, T], D_n = {0, T/n, 2T/n, …, T}, we then have

P{ sup_{t ∈ D_n} X(t) > u } ≤ P{M > u} ≤ P{X(0) > u} + ∫_0^T E[ X'(t)⁺ 1_{X(s) ≤ u, s < t, s ∈ D_n} | X(t) = u ] p_{X(t)}(u) dt.      (8)
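The left-hand side of (8), the exceedance probability of the discretized process, is straightforward to estimate by simulation, which gives a quick check of the finer bounds; the covariance exp(−(t−s)²/2) and the grid size below are illustrative.

import numpy as np

def discrete_exceedance(u, T=1.0, n=50, n_sim=20000, seed=0):
    # Monte Carlo estimate of P{ max over D_n of X(t) > u } for a centered stationary
    # Gaussian process with covariance exp(-(t-s)^2/2) on [0, T].
    t = np.linspace(0.0, T, n + 1)
    C = np.exp(-0.5 * (t[:, None] - t[None, :])**2)
    w, V = np.linalg.eigh(C)
    A = V * np.sqrt(np.clip(w, 0.0, None))
    rng = np.random.default_rng(seed)
    X = A @ rng.standard_normal((len(t), n_sim))
    return np.mean(X.max(axis=0) > u)

for u in (1.0, 2.0, 3.0):
    print(u, discrete_exceedance(u))                    # lower bounds for P{M > u}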


Now the integral is replaced by a trapezoidal rule using the same discretization. The error of the trapezoidal rule is easy to evaluate. Moreover, the different terms involved can be computed in a recursive way.


An example

Using the MGP program written by Genz, let us consider the centered stationary Gaussian process with covariance exp(−t²/2):

[ pl, pu, el, eu, en, eq ] = MGP( 100000, 0.5, 50, @(t)exp(-t.^2/2), 0, 4);

pu: upper bound, with
  eu = estimate of the total error,
  en = estimate of the discretization error, and
  eq = estimate of the MCQMC error;
pl: lower bound, with
  el = error estimate (MCQMC).


Extensions

Treat all the cases: maximum of the absolute value, non-centered, non-stationary. In each case some tricks have to be used.
A great challenge is to use such formulas for fields.


References

Jean-Marc Azaïs and Mario Wschebor, Level Sets and Extrema of Random Processes and Fields (Wiley, 2009).


Azaïs, J.-M. and Genz, A. (2009). Computation of the distribution of the maximum of stationary Gaussian sequences and processes. In preparation.
Alan Genz's web site: http://www.math.wsu.edu/faculty/genz/homepage
Mercadier, C. (2005). MAGP toolbox, http://math.univ-lyon1.fr/mercadier/
Mercadier, C. (2006). Numerical bounds for the distribution of the maximum of some one- and two-parameter Gaussian processes. Adv. in Appl. Probab. 38, pp. 149–170.
Nuyens, D. and Cools, R. (2006). Fast algorithms for component-by-component construction of rank-1 lattice rules in shift-invariant reproducing kernel Hilbert spaces. Math. Comp. 75, pp. 903–920.
THANK-YOU
MERCI
GRACIAS


Probab. Theory Relat. Fields 119, 70–98 (2001)


Digital Object Identifier (DOI) 10.1007/s004400000102

Jean-Marc Azaïs · Mario Wschebor

On the regularity of the distribution of the maximum of one-parameter Gaussian processes
Received: 14 May 1999 / Revised version: 18 October 1999 /
Published online: 14 December 2000 c Springer-Verlag 2001
Abstract. The main result in this paper states that if a one-parameter Gaussian process has
C 2k paths and satisfies a non-degeneracy condition, then the distribution of its maximum on
a compact interval is of class C k . The methods leading to this theorem permit also to give
bounds on the successive derivatives of the distribution of the maximum and to study their
asymptotic behaviour as the level tends to infinity.

1. Introduction and main results


Let X = {X_t : t ∈ [0, 1]} be a stochastic process with real values and continuous paths defined on a probability space (Ω, F, P). The aim of this paper is to study the regularity of the distribution function of the random variable M := max{X_t : t ∈ [0, 1]}.
X is said to satisfy the hypothesis H_k, k a positive integer, if:
(1) X is Gaussian;
(2) a.s. X has C^k sample paths;
(3) For every integer n ≥ 1 and any set t_1, ..., t_n of pairwise different parameter values, the distribution of the random vector
(X_{t_1}, ..., X_{t_n}, X'_{t_1}, ..., X'_{t_n}, ..., X^{(k)}_{t_1}, ..., X^{(k)}_{t_n})
is non-degenerate.
We denote by m(t) and r(s, t) the mean and covariance functions of X, that is m(t) := E(X_t), r(s, t) := E[(X_s − m(s))(X_t − m(t))], and by r_{ij} := ∂^{i+j} r / ∂s^i ∂t^j (i, j = 0, 1, ...) the partial derivatives of r, whenever they exist.
Our main results are the following:

J.-M. Azas, M. Wschebor: Laboratoire de Statistique et Probabilites,


UMR-CNRS C55830, Universite Paul Sabatier, 118, route de Narbonne,
31062 Toulouse Cedex 4. France. e-mail: azais@cict.fr
M. Wschebor: Centro de Matematica, Facultad de Ciencias, Universidad de la Republica,
Calle Igua 4225, 11400 Montevideo, Uruguay. e-mail: wscheb@fcien.edu.uy
Mathematics Subject Classification (2000): 60G15, 60Gxx, 60E05
Key words or phrases: Extreme values Distribution of the maximum


Theorem 1.1. Let X = {X_t : t ∈ [0, 1]} be a stochastic process satisfying H_{2k}. Denote by F(u) = P(M ≤ u) the distribution function of M.
Then F is of class C^k and its successive derivatives can be computed by repeated application of Lemma 3.3.
Corollary 1.1. Let X be a stochastic process verifying H_{2k} and assume also that E(X_t) = 0 and Var(X_t) = 1.
Then, as u → +∞, F^{(k)}(u) is equivalent to

(−1)^{k−1} (u^k / 2π) e^{−u²/2} ∫_0^1 sqrt(r₁₁(t, t)) dt.      (1)

The regularity of the distribution of M has been the object of a number of


papers. For general results when X is Gaussian, one can mention: Ylvisaker (1968); Tsirelson (1975); Weber (1985); Lifshits (1995); Diebolt and Posse (1996) and references therein.
Theorem 1.1 appears to be a considerable extension, in the context of one-parameter Gaussian processes, of existing results on the regularity of the distribution of the maximum which, as far as the authors know, do not go beyond a Lipschitz condition for the first derivative. For example, it implies that if the process is Gaussian with C^∞ paths and satisfies the non-degeneracy condition for every k = 1, 2, ..., then the distribution of the maximum is C^∞. The same methods provide bounds for the successive derivatives as well as their asymptotic behaviour as their argument tends to +∞ (Corollary 1.1).
Except in Theorem 3.1, which contains a first upper bound for the density of
M, we will assume X to be Gaussian.
The proof of Theorem 1.1 is based upon the main Lemma 3.3. Before giving
the proofs we have stated Theorem 3.2 which presents the result of this Lemma in
the special case leading to the first derivative of the distribution function of M. As
applications one gets upper and lower bounds for the density of M under conditions that seem to be clearer and more general than in previous work (Diebolt and Posse, 1996). Some extra work is needed to extend the implicit formula (9) to non-Gaussian processes, but this seems to be feasible.
As for Theorem 1.1 for derivatives of order greater than 1, its statement and its proof rely heavily on the Gaussian character of the process.
The main result of this paper has been announced in the note by Azaïs and Wschebor (1999).

2. Crossings
Our methods are based on well-known formulae for the moments of crossings of
the paths of stochastic processes with fixed levels, that have been obtained by a
variety of authors, starting from the fundamental work of S. O. Rice (1944–1945).
In this section we review without proofs some of these and related results.
Let f : I → IR be a function defined on the interval I of the real numbers, and let
C_u(f; I) := {t ∈ I : f(t) = u}


N_u(f; I) := #C_u(f; I)
denote respectively the set of roots of the equation f(t) = u on the interval I and the number of these roots, with the convention N_u(f; I) = +∞ if the set C_u is infinite. N_u(f; I) is called the number of "crossings" of f with the level u on the interval I.
In the same way, if f is a differentiable function, the numbers of upcrossings and downcrossings of f are defined by means of
U_u(f; I) := #({t ∈ I : f(t) = u, f'(t) > 0}),
D_u(f; I) := #({t ∈ I : f(t) = u, f'(t) < 0}).
For a more general definition of these quantities see Cramér and Leadbetter (1967).
In what follows, ||f||_p is the norm of f in L^p(I, λ), 1 ≤ p ≤ +∞, λ denoting the Lebesgue measure. The joint density of the finite set of real-valued random variables X_1, ..., X_n at the point (x_1, ..., x_n) will be denoted p_{X_1,...,X_n}(x_1, ..., x_n) whenever it exists. φ(t) := (2π)^{−1/2} exp(−t²/2) is the density of the standard normal distribution, Φ(t) := ∫_{−∞}^t φ(u) du its distribution function.
The following proposition (sometimes called Kac's formula) is a common tool to count crossings.
Proposition 2.1. Let f : I = [a, b] → IR be of class C¹, f(a), f(b) ≠ u. If f does not have local extrema with value u on the interval I, then
N_u(f; I) = lim_{δ→0} (1/2δ) ∫_I 1_{|f(t) − u| < δ} |f'(t)| dt.
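A quick numerical check of the counting formula (our illustration, with an arbitrary smooth f): for small δ the normalized occupation integral reproduces the number of crossings.

import numpy as np

f  = lambda t: np.sin(6 * np.pi * t)                 # example function on I = [0, 1]
fp = lambda t: 6 * np.pi * np.cos(6 * np.pi * t)
u, delta = 0.3, 1e-3

tt = np.linspace(0.0, 1.0, 2_000_001)                # fine grid for the occupation integral
kac = np.trapz((np.abs(f(tt) - u) < delta) * np.abs(fp(tt)), tt) / (2 * delta)

direct = np.sum(np.diff(np.sign(f(tt) - u)) != 0)    # direct count of the crossings
print(round(kac, 3), direct)                         # both are close to 6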

For m and k, positive integers, k ≤ m, define the factorial kth power of m by
m^{[k]} := m(m − 1) ⋯ (m − k + 1).
For other real values of m and k we put m^{[k]} := 0. If k is an integer, k ≥ 1, and I an interval in the real line, the diagonal of I^k is the set
D_k(I) := {(t_1, ..., t_k) ∈ I^k : t_j = t_h for some pair (j, h), j ≠ h}.
Finally, assume that X = {X_t : t ∈ IR} is a real-valued stochastic process with C¹ paths. We set, for (t_1, ..., t_k) ∈ I^k \ D_k(I) and x_j ∈ IR (j = 1, ..., k):
A_{t_1,...,t_k}(x_1, ..., x_k) := ∫_{IR^k} ∏_{j=1}^k |x'_j| p_{X_{t_1},...,X_{t_k}, X'_{t_1},...,X'_{t_k}}(x_1, ..., x_k, x'_1, ..., x'_k) dx'_1 ... dx'_k

and

I_k(x_1, ..., x_k) := ∫_{I^k} A_{t_1,...,t_k}(x_1, ..., x_k) dt_1 ... dt_k,

where it is understood that the density in the integrand of the definition of A_{t_1,...,t_k}(x_1, ..., x_k) exists almost everywhere and that the integrals above can take the value +∞.


Proposition 2.2. Let k be a positive integer, u a real number and I a bounded interval in the line. With the above notations and conditions, let us assume that the process X also satisfies the following conditions:
1. the density
p_{X_{t_1},...,X_{t_k}, X'_{s_1},...,X'_{s_k}}(x_1, ..., x_k, x'_1, ..., x'_k)
exists for (t_1, ..., t_k), (s_1, ..., s_k) ∈ I^k \ D_k(I) and is a continuous function of (t_1, ..., t_k) and of x_1, ..., x_k at the point (u, ..., u).
2. the function
(t_1, ..., t_k, x_1, ..., x_k) ↦ A_{t_1,...,t_k}(x_1, ..., x_k)
is continuous for (t_1, ..., t_k) ∈ I^k \ D_k(I) and x_1, ..., x_k belonging to a neighbourhood of u.
3. (additional technical condition)
∫_{IR³} |x'_1|^{k−1} |x'_2 − x'_3| p_{X_{t_1},...,X_{t_k}, X'_{s_1}, X'_{s_2}, X'_{t_1}}(x_1, ..., x_k, x'_1, x'_2, x'_3) dx'_1 dx'_2 dx'_3 → 0
as |s_2 − t_1| → 0, uniformly as (t_1, ..., t_k) varies in a compact subset of I^k \ D_k(I) and x_1, ..., x_k in a fixed neighbourhood of u.
Then,
E((N_u(X, I))^{[k]}) = I_k(u, ..., u).      (2)
Both members in (2) may be +∞.


Remarks. (a) For k = 1 formula (2) becomes

E[N_u(X; I)] = ∫_I dt ∫_{−∞}^{+∞} |x'| p_{X_t, X'_t}(u, x') dx'.      (3)

(b) Simple variations of (3), valid under the same hypotheses, are:

E[U_u(X; I)] = ∫_I dt ∫_0^{+∞} x' p_{X_t, X'_t}(u, x') dx',      (4)

E[D_u(X; I)] = ∫_I dt ∫_{−∞}^0 |x'| p_{X_t, X'_t}(u, x') dx'.      (5)
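For a centered stationary Gaussian process with unit variance and second spectral moment λ₂, (4) reduces to the classical Rice formula E[U_u(X; [0, T])] = (T/2π) √λ₂ e^{−u²/2}. The sketch below (our illustration; covariance exp(−t²/2), so λ₂ = 1) checks it against simulated upcrossing counts on a grid.

import numpy as np

T, n, u = 10.0, 1000, 1.5
t = np.linspace(0.0, T, n + 1)
C = np.exp(-0.5 * (t[:, None] - t[None, :])**2)        # stationary covariance, lambda_2 = 1
w, V = np.linalg.eigh(C)
A = V * np.sqrt(np.clip(w, 0.0, None))

rng = np.random.default_rng(2)
X = A @ rng.standard_normal((n + 1, 4000))             # 4000 simulated paths on the grid
up = np.mean(np.sum((X[:-1] <= u) & (X[1:] > u), axis=0))

rice = T / (2 * np.pi) * np.exp(-u**2 / 2)             # Rice formula with lambda_2 = 1
print(up, rice)                                        # empirical mean vs. exact expectation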

In the same way one can obtain formulae for the factorial moments of "marked" crossings, that is, crossings such that some additional condition holds true. For example, if Y = {Y_t : t ∈ IR} is some other stochastic process with real values such that for every t, (Y_t, X_t, X'_t) admits a joint density, −∞ ≤ a < b ≤ +∞ and
N_u^{a,b}(X, I) := #{t : t ∈ I, X_t = u, a < Y_t < b},
then

E[N_u^{a,b}(X; I)] = ∫_a^b dy ∫_I dt ∫_{−∞}^{+∞} |x'| p_{Y_t, X_t, X'_t}(y, u, x') dx'.      (6)


In particular, if M⁺_{a,b} is the number of strict local maxima of X(.) on the interval I such that the value of X(.) lies in the interval (a, b), then M⁺_{a,b} = D_0^{a,b}(X', I) and:

E[M⁺_{a,b}] = ∫_a^b dy ∫_I dt ∫_{−∞}^0 |x''| p_{X_t, X'_t, X''_t}(y, 0, x'') dx''.      (7)

Sufficient conditions for the validity of (6) and (7) are similar to those for 3.
(c) Proofs of (2) for Gaussian processes satisfying certain conditions can be
found in Belayev (1966) and Cramer-Leadbetter (1967). Marcus (1977) contains
various extensions. The present statement of Proposition 2.2 is from Wschebor
(1985).
(d) It may be non-trivial to verify the hypotheses of Proposition 2.2. However, some general criteria are available. For example, if X is a Gaussian process with C¹ paths and the densities
p_{X_{t_1},...,X_{t_k}, X'_{s_1},...,X'_{s_k}}
are non-degenerate for (t_1, ..., t_k), (s_1, ..., s_k) ∈ I^k \ D_k, then conditions 1, 2, 3 of Proposition 2.2 hold true (cf. Wschebor, 1985, p. 37 for a proof and also for some manageable sufficient conditions in non-Gaussian cases).
(e) Another point related to Rice formulae is the non existence of local extrema
at a given level. We mention here two well-known results:
Proposition 2.3 (Bulinskaya, 1961). Suppose that X has C¹ paths and that for every t ∈ I, X_t has a density p_{X_t}(x) bounded for x in a neighbourhood of u. Then, almost surely, X has no tangencies at the level u, in the sense that if
T_u^X := {t ∈ I : X_t = u, X'_t = 0},
then P(T_u^X = ∅) = 1.
Proposition 2.4 (Ylvisaker's Theorem, 1968). Suppose that {X_t : t ∈ T} is a real-valued Gaussian process with continuous paths, defined on a compact separable topological space T, and that Var(X_t) > 0 for every t ∈ T. Then, for each u ∈ IR, with probability 1, the function t ↦ X_t does not have any local extrema with value u.
3. Proofs and related results
Let ξ be a random variable with values in IR^k with a distribution that admits a density with respect to the Lebesgue measure λ. The density will be denoted by p_ξ(.). Further, suppose E is an event. It is clear that the measure
μ(B; E) := P({ξ ∈ B} ∩ E),
defined on the Borel sets B of IR^k, is also absolutely continuous with respect to λ. We will call the density of ξ related to E the Radon derivative:
p_ξ(x; E) := dμ(.; E)/dλ (x).
It is obvious that p_ξ(x; E) ≤ p_ξ(x) for λ-almost every x ∈ IR^k.

Regularity of the distribution of the maximum

75

Theorem 3.1. Suppose that X has C2 paths, that X, X , X admit a joint density at
every time t, that for every t, Xt has a bounded density pXt (.) and that the function
1

I (x, z) :=

dt

|x |pXt ,Xt ,Xt (x, z, x )dx

is uniformly continuous in z for (x, z) in some neighbourhood of (u, 0). Then the
distribution of M admits a density pM (.) satisfying a.e.
pM (u) pX0 (u; X0 < 0) + pX1 (u; X1 > 0)
1

dt

|x |pXt ,Xt ,Xt (u, 0, x )dx .

(8)

Proof . Let u IR and h > 0. We have


P (M u) P (M u h) = P (u h < M u)
P (u h < X0 u, X0 < 0) + P (u h < X1 u, X1 > 0)
+
> 0),
+P (Muh,u
+
+
= Muh,u
where Muh,u
(0, 1), since if u h < M u, then either the maximum
occurs in the interior of the interval [0, 1] or at 0 or 1, with the derivative taking
the indicated sign. Note that
+
+
> 0) E(Muh,u
).
P (Muh,u

Using Proposition 2.3, with probability 1, X (.) has no tangencies at the level 0,
thus an upper bound for this expectation follows from the Kacs formula:
1
0 2

+
= lim
Muh,u

1
0

1I{X(t)[uh,u]} 1I{X (t)[,]} 1I{X

(t)<0} |X

(t)|dt

a.s.

which together with Fatous lemma imply:


+
E(Muh,u
) lim inf
0

1
2

dz

u
uh

I (x, z)dx =

u
uh

I (x, 0)dx.

Combining this bound with the preceeding one, we get


P (M u) P (M u h)

uh

pX0 (x; X0 < 0) + pX1 (x; X1 > 0) + I (x, 0) dx,

which gives the result.


In spite of the simplicity of the proof, this theorem provides the best known
upper-bound for Gaussian processes. In fact, in this case, formula (8) is a simpler
expression of the bound of Diebolt and Posse (1996). More precisely, if we use
their parametrization by putting
m(t) = 0 ; r(s, t) =

(s, t)
,
(s) (t)

76

J.-M. Azas, M. Wschebor

with
(t, t) = 1, 11 (t, t) = 1, 10 (t, t) = 0, 12 (t, t) = 0, 02 (t, t) = 1,
after some calculations, we get exactly their bound M(u) ( their formula (9)) for
the density of the maximum.
Let us illustrate formula (8) explicitly when the process is Gaussian, centered
with unit variance. By means of a deterministic time change, one can also assume
that the process has unit speed (V ar(Xt ) 1). Let L the length of the new
time interval. Clearly t, m(t) = 0, r(t, t) = 1, r11 (t, t) = 1, r10 (t, t) = 0,
r12 (t, t) = 0, r02 (t, t) = 1. Note that
Z N (, 2 ) E(Z ) = (/ ) (/ ).
The formulae for regression imply that conditionally on Xt = u, Xt = 0, Xt
has expectation u and variance r22 (t, t) 1. Formula (8) reduces to
pM (u) p+ (u) := (u) 1+(2)1/2
with Cg (t) :=

L
0

Cg (t)(u/Cg (t)) + u (u/Cg (t))dt ,

r22 (t, t) 1

As x +,

(x) = 1

(x)
x

(x)
x3

+O

(x)
x5

. This implies that


L

p+ (u) = (u) 1 + Lu(2)1/2 + (2)1/2 u2


0

Cg3 (t)(u/Cg (t))dt

+O u4 (u/C + ) ,
with C + := supt[0,L] Cg (t).
Furthermore the exact equivalent of pM (u) when u + is
(2)1 u L exp(u2 /2)
as we will see in Corollary 1.1.
The following theorem is a special case of Lemma 3.3. We state it separately
since we use it below to compare the results that follow from it with known results.
Theorem 3.2. Suppose that X is a Gaussian process satisfying H2 . Then M has a
continuous density pM given for every u by
pM (u) = pX0 (u ; M u) + pX1 (u ; M u)
1

+
0

dt

|x |pXt ,Xt ,Xt (u , 0, x ; M u)dx ,

(9)

where pX0 (u ; M u) = limxu pX0 (x; M u) exists and is a continuous


function of u , as well as pX1 (u ; M u) and pXt ,Xt ,Xt (u , 0, x ; M u).

Regularity of the distribution of the maximum

77

Again, we obtain a simpler version of the expression by Diebolt and Posse


(1996).
In fact, the result 3.2 remains true if X is Gaussian with C2 paths and one
requires only that Xs , Xt , Xt , Xt admit a joint density for all s, t, s = t [0, 1].
If we replace the event {M u} respectively by {X0 < 0}, {X1 > 0} and in
each of the three terms in the right hand member in formula (9) we get the general
upper-bound given by (8).
To obtain lower bounds for pM (u), we use the following immediate inequalities:
P (M u/X0 = u) = P (M u, X0 < 0/X0 = u)
P X0 < 0/X0 = u
E(Uu [0, 1]1I{X <0} /X0 = u).
0

In the same way


P (M u/X1 = u) = P (M u, X1 > 0/X1 = u)
P X1 > 0/X1 = u
E(Du [0, 1]1I{X >0} /X1 = u)
1

and if x < 0 :
P (M u/Xt = u, Xt = 0, Xt = x )
1 E([Du ([0, t]) + Uu ([t, 1])] /Xt = u, Xt = 0, Xt = x ).
If we plug these lower bounds into Formula (9) and replace the expectations of
upcrossings and downcrossings by means of integral formulae of (4), (5) type, we
obtain the lower bound:
pM (u) pX0 (u; X0 < 0) + pX1 (u; X1 < 0)
1

+
0

dt
ds

0
1

dt

|x |pXt ,Xt ,Xt (u, 0, x )dx


+

dx
0

|x |

t
0

xs pXs ,Xs ,X0, X0 (u, xs , u, x )dxs


0

ds |x |pXs ,Xs ,Xt ,Xt ,Xt (u, x , u, 0, x )dx


1
+
+ t ds 0 x pXs ,Xs ,Xt ,Xt ,Xt (u, x , u, 0, x )dx

dx .
(10)

Simpler expressions for (10) also adapted to numerical computations, can be found
in Cierco (1996).
Finally, some sharper upperbounds for pM (u) are obtained when replacing the
event {M > u} by {X0 + X1 > 2u}, the probability of which can be expressed
using the conditionnal expectation and variance of X0 + X1 ; we are able only to
express these bounds in integral form.
We now turn to the proofs of our main results.

78

J.-M. Azas, M. Wschebor

Lemma 3.1. (a) Let Z be a stochastic process satisfying Hk (k 2) and t a point


in [0, 1]. Define the Gaussian processes Z , Z , Z t by means of the orthogonal
decompositions:
Zs = a (s) Z0 + sZs s (0, 1] .
(11)
Zs = a (s) Z1 + (1 s) Zs
Zs = bt (s)Zt + ct (s) Zt +

(s t) t
Zs
2
2

s [0, 1) .
s [0, 1] s = t.

(12)
(13)

Then, the processes Z , Z , Z t can be extended defined at s = 0, s = 1, s = t


respectively so that they become pathwise continuous and satisfy Hk1 , Hk1 , Hk2
respectively.
(b) Let f be any function of class C k . When there is no ambiguity on the process Z, we will define f , f , f t in the same manner, putting f instead of Z in
(11), (12), (13), but still keeping the regression coefficients corresponding to Z.
Then f , f , f t can be extended by continuity in the same way to functions in
C k1 , C k1 , C k2 respectively.
(c) Let m be a positive integer, suppose Z satisfies H2m+1 and t1 , ..., tm belong
to [0, 1] { , }. Denote by Z t1 ,...,tm the process obtained by repeated application
of the operation of part (a) of this Lemma, that is
Zst1 ,...,tm = Z t1 ,...,tm1

tm
.
s

Denote by s1 , ..., sp (p m) the ordered p-tuple of the elements of t1 , ..., tm that


belong to [0, 1] (i.e. they are not or ). Then, a.s. for fixed values of the
symbols , the application:
s1 , ..., sp , s Zst1 ,...,tm , Z t1 ,...,tm

is continuous.
Proof . (a) and (b) follow in a direct way, computing the regression coefficients
a (s), a (s) , bt (s), ct (s) and substituting into formulae (11), (12), (13). Note
that (b) also follows from (a) by applying it to Z + f and to Z. We prove now (c)
which is a consequence of the following:
Suppose Z(t1 , ..., tk ) is a Gaussian field with C p sample paths (p 2) defined on
[0, 1]k with no degeneracy in the same sense that in the definition of hypothesis Hk
(3) for one-parameter processes. Then the Gaussian fields defined by means of:
Z (t1 , ..., tk ) = (tk )1 Z(t1 , ..., tk1 , tk ) a (t1 , ..., tk )Z(t1 , ..., tk1 , 0)
for tk = 0,
Z (t1 , ..., tk ) = (1 tk )1 Z(t1 , ..., tk1 , tk ) a (t1 , ..., tk )Z(t1 , ..., tk1 , 1)
for tk = 1,
Z(t1 , ..., tk , tk+1 ) = 2 (tk+1 tk )2 (Z(t1 , ..., tk1 , tk+1 )
b(t1 , ..., tk , tk+1 )Z(t1 , ..., tk )
Z
(t1 , ..., tk ))
for tk+1 = tk
c(t1 , ..., tk , tk+1 )
tk

Regularity of the distribution of the maximum

79

can be extended to [0, 1]k (respectively [0, 1]k , [0, 1]k+1 ) into fields with paths in
C p1 (respectively C p1 , C p2 ). In the above formulae,
- a (t1 , ..., tk ) is the regression coefficient of Z(t1 , ..., tk ) on Z(t1 , ..., tk1 , 0),
- a (t1 , ..., tk ) is the regression coefficient of Z(t1 , ..., tk ) on Z(t1 , ..., tk1 , 1),
- b(t1 , ..., tk , tk+1 ), c(t1 , ..., tk , tk+1 ) are the regression coefficients of
Z
Z(t1 , ..., tk1 , tk+1 ) on the pair Z(t1 , ..., tk ), t
(t1 , ..., tk ) .
k
Let us prove the statement on Z. The other two are simpler. Denote by V the subZ
(t1 , ..., tk ) . Denote
space of L2 ( , , P ) generated by the pair Z(t1 , ..., tk ), t
k

by V the version of the orthogonal projection of L2 ( , , P ) on the orthogonal


complement of V , defined by means of.
V (Y )

:= Y bZ(t1 , ..., tk ) + c

Z
(t1 , ..., tk ) ,
tk

where b and c are the regression coefficients of Y on the pair


Z(t1 , ..., tk ),

Z
(t1 , ..., tk ).
tk

Note that if {Y : } is a random field with continuous paths and such that
Y is continuous in L2 ( , , P ) , then a.s.
, t1 , ..., tk )

V (Y )

is continuous.
From the definition:
Z(t1 , ..., tk , tk+1 ) = 2 (tk+1 tk )2

(Z(t1 , ..., tk1 , tk+1 )) .

On the other hand, by Taylors formula:


Z(t1 , ..., tk1 , tk+1 ) = Z(t1 , ..., tk )+(tk+1 tk )
with
R2 (t1 , ..., tk , tk+1 ) =

tk+1
tk

Z
(t1 , ..., tk )+R2 (t1 , ..., tk , tk+1 )
tk

2Z
(t1 , ..., tk1 , ) (tk+1 ) d
tk2

so that
Z(t1 , ..., tk , tk+1 ) =

2 (tk+1 tk )2 R2 (t1 , ..., tk , tk+1 ) .

(14)

It is clear that the paths of the random field Z are p 1 times continuously differentiable for tk+1 = tk . Relation (14) shows that they have a continuous extension
to [0, 1]k+1 with Z(t1 , ..., tk , tk ) =
V

2Z
(t , ..., tk )
tk2 1

. In fact,

2 (sk+1 sk )2 R2 (s1 , ..., sk , sk+1 )

80

J.-M. Azas, M. Wschebor

= 2 (sk+1 sk )2

sk+1
sk

2Z
(s , ..., sk1 , )
tk2 1

(sk+1 ) d.

According to our choice of the version of the orthogonal projection V , a.s. the
integrand is a continuous function of the parameters therein so that, a.s.:
Z (s1 , ..., sk , sk+1 )

2Z
(t1 , ..., tk ) when (s1 , ..., sk , sk+1 )
tk2
(t1 , ..., tk , tk ).
V

This proves (c). In the same way, when p 3, we obtain the continuity of the
partial derivatives of Z up to the order p2.
The following lemma has its own interest besides being required in our proof
of Lemma 3.3. It is a slight improvement of Lemma 4.3, p. 76 in Piterbarg (1996)
in the case of one-parameter processes.
Lemma 3.2. Suppose that X is a Gaussian process with C3 paths and that for all
(2)
(3)
s = t, the distributions of Xs , Xs , Xt , Xt and of Xt , Xt , Xt , Xt do not degenerate. Then, there exists a constant K (depending on the process) such that
pXs ,Xt ,Xs ,Xt (x1 , x2 , x1 , x2 ) K(t s)4
for all x1 , x2 , x1 , x2 IR and all s, t, s = t [0, 1].
Proof .
pXs ,Xt ,Xs ,Xt (x1 , x2 , x1 , x2 ) (2)2 DetV ar(Xs , Xt , Xs , Xt )

1/2

where DetV ar stands for the determinant of the variance matrix. Since by hypothesis the distribution does not degenerate outside the diagonal s = t, the conclusion
of the lemma is trivially true on a set of the form {|s t| > }, > 0. By a compactness argument it is sufficient to prove it for s, t in a neighbourhood of (t0 , t0 )
for each t0 [0, 1]. For this last purpose we use a generalization of a technique
employed by Belyaev (1966). Since the determinant is invariant by adding linear
combination of rows (resp. columns) to another row (resp. column),
DetV ar(Xs , Xt , Xs , Xt ) = DetV ar(Xs , Xs , X s(2) , X s(3) ),
with
X s(2) = Xt Xs (t s)Xs
X s(3) = Xt Xs

2
X (2)
(t s) s

(t s)2 (2)
Xt 0
2
(t s)2 (3)
Xt 0 ,
6

The equivalence refers to (s, t) (t0 , t0 ). Since the paths of X are of class
(2)
(3)
C3 , Xs , Xs , (2(t s)2 )X s , (6(t s)2 )X s
tends almost surely to

Regularity of the distribution of the maximum


(2)

81

(3)

Xt0 , Xt0 , Xt0 , Xt0 as (s, t) (t0 , t0 ). This implies the convergence of the
variance matrices. Hence
DetV ar(Xs , Xt , Xs , Xt )

(t s)8
(2)
(3)
DetV ar(Xt0 , Xt0 , Xt0 , Xt0 ),
144

which ends the proof.


Remark. the proof of Lemma 3.2 shows that the density of Xs , Xs , Xt , Xt exists
for |s t| sufficiently small as soon as the process has C3 paths and for every t
(3)
the distribution of Xt , Xt , Xt , Xt does not degenerate. Hence, under this only
hypothesis, the conclusion of the lemma holds true for 0 < |s t| < and some
> 0.
Lemma 3.3. Suppose Z = {Zt : t [0, 1]} is a stochastic process that verifies
H2 . Define:
Fv (u) = E v .1IAu
where
Au = Au (Z, ) = {Zt (t) u f or all t [0, 1]},
(.) is a real valued C 2 function defined on [0, 1],
v = G(Zt1 (t1 )v, ..., Ztm (tm )v) for some positive integer m, t1 , ..., tm
[0, 1] , v IR and some C function G : IRm IR having at most polynomial
growth at , that is, |G(x)| C(1 + x p ) for some positive constants C, p
and all x IRm ( . stands for Euclidean norm).
Then,
For each v IR, Fv is of class C 1 and its derivative is a continuous function
of the pair (u, v) that can be written in the form:
Fv (u) = (0)E v,u .1IAu (Z

+ (1)E v,u .1IAu (Z


1

) .pZ0 ( (0) .u)


,

) pZ1 ( (1) .u)

t
(t)E v,u
Ztt t (t).u 1IAu (Z t , t )

pZt ,Zt (t) .u, (t) .u dt,

(15)

where the processes Z , Z , Z t and the functions , , t are as in Lemma


t are given by:
3.1 and the random variables v,u , v,u , v,u
v,u = G t1 Zt1 (t1 ) u + (t1 ) (u v), ...
..., tm Ztm (tm ) u + (tm ) (u v)
v,u = G (1 t1 ) Zt1 (t1 ) u + (t1 ) (u v), ...
..., (1 tm ) Ztm (tm ) u + (tm ) (u v)

82

J.-M. Azas, M. Wschebor

(t1 t)2 t
Zt1 t (t1 ) u + (t1 ) (u v), ...
2
(tm t)2 t
Ztm t (tm ) u + (tm ) (u v) .
...,
2

t
v,u
=G

Proof . We start by showing that the arguments of Theorem 3.1 can be extended to
our present case to establish that Fv is absolutely continuous. This proof already
contains a first approximation to the main ideas leading to the proof of the lemma.
Step 1 Assume - with no loss of generality - that u 0 and write for h > 0:
Fv (u) Fv (u h) = E v .1IAu \Auh E v .1IAuh \Au

(16)

Au \ Auh {(0)(u h) < Z0 (0)u, (0) > 0}


(1)
{(1)(u h) < Z1 (1)u, (1) > 0} Muh,u 1

(17)

Note that:

where:
(1)

Muh,u = {t : t (0, 1), (t) 0, the function Z(.) (.)(u h)


has a local maximum at t with value falling in the interval [0, (t)h]}.
Using the Markov inequality
(1)

(1)

P (Muh,u 1) E Muh,u ,
and the formula for the expectation of the number of local maxima applied to the
process t Zt (t)(u h) imply
|E v .1IAu \Auh |
1I{(0)>0}
+1I{(1)>0}
1

+
0

(0)u
(0)(uh)
(1)u

E (|v |/Z0 = x) pZo (x)dx


E (| v | /Z1 = x) pZ1 (x)dx

(1)(uh)
(t)h

1I{(t)>0} dt

E |v |(Zt (t)(u h)) /V2 = (x, 0)

.pV2 (x, 0)dx,

(18)

where V2 is the random vector


Zt (t)(u h), Zt (t)(u h) .
Now, the usual regression formulae and the form of v imply that
|E v .1IAu \Auh | (const).h
where the constant may depend on u but is locally bounded as a function of u.

Regularity of the distribution of the maximum

83
(1)

An analogous computation replacing Muh,u by


(2)

Muh,u = {t : t (0, 1), (t) 0, the function Z(.) (.)u


has a local maximum at t, Zt (t)u [0, (t)h]}
leads to a similar bound for the second term in (16). It follows that
|Fv (u) Fv (u h)| (const).h
where the constant is locally bounded as a function of u. This shows that Fv is
absolutely continuous.
The proof of the Lemma is in fact a refinement of this type of argument. We
will replace the rough inclusion (17) and its consequence (18) by an equality.
In the two following steps we will assume the additional hypothesis that Z
verifies Hk for every k and (.) is a C function.
Step 2.
Notice that:
Au \ Auh = Au {(0)(u h) < Z0 (0)u, (0) > 0}
(1)
{(1)(u h) < Z1 (1)u, (1) > 0} {Muh,u 1} .

(19)

We use the obvious inequality, valid for any three events F1 , F2 , F3 :


3

1IFj 1I3 Fj 1IF1 F2 + 1IF2 F3 + 1IF3 F1


1

to write the first term in (16) as:


E v .1IAu \Auh = E v .1IAu 1I{(0)(uh)<Z0 (0)u} 1I{(0)>0}
+ E v .1IAu 1I{(1)(uh)<Z1 (1)u} 1I{(1)>0}
(1)

+ E v .1IAu Muh,u + R1 (h)

(20)

where
|R1 (h)| E |v |1I{(0)(uh)<Z0 (0)u,(1)(uh)<Z1 (1)u} 1I{(0)>0,(1)>0}
+ E |v |1I
+ E |v |1I

(1)

1I{(0)>0}

(1)

1I{(1)>0}

(0)(uh)<Z0 (0)u,Muh,u 1
(1)(uh)<Z1 (1)u,Muh,u 1
(1)

+ E |v | Muh,u 1IM (1)

uh,u 1

= T1 (h) + T2 (h) + T3 (h) + T4 (h)

Our first aim is to prove that R1 (h) = o(h) as h 0.


It is clear that T1 (h) = O(h2 ).

84

J.-M. Azas, M. Wschebor

Let us consider T2 (h). Using the integral formula for the expectation of the
number of local maxima:
T2 (h) 1I{(0)>0}

1
0

1I{(t)0} dt

(0)h

(t)h

dz0

dz.

.E |v |(Zt (t)(u h)) /V3 = v3 pV3 (v3 ),


where V3 is the random vector
Z0 (0)(u h), Zt (t)(u h), Zt (t)(u h) ,
and v3 = (z0 , z, 0).
We divide the integral in the right-hand member into two terms, respectively the
integrals on [0, ] and [, 1] in the t-variable, where 0 < < 1. The first integral
can be bounded by

1I{(t)0} dt

(t)h
0

dz E |v |(Zt (t)(u h)) /V2 = (z, 0) pV2 (z, 0).

where the random vector V2 is the same as in (18). Since the conditional expectation as well as the density are bounded for u in a bounded set and 0 < h < 1, this
expression is bounded by (const)h.
As for the second integral, when t is between and 1 the Gaussian vector
Z0 (0)(u h), Zt (t)(u h), Zt (t)(u h)
has a bounded density so that the integral is bounded by C h2 , where C is a constant
depending on .
Since > 0 is arbitrarily small, this proves that T2 (h) = o(h). T3 (h) is similar
to T2 (h).
We now consider T4 (h). Put:
(4)

Z(.) (4) (.)(u h)

Eh =
where

stands

h1/4 |v | h1/4

for the sup-norm in [0, 1]. So,


(1)

(1)

(1)

T4 (h) E |v |1IEh Muh,u (Muh,u 1) + E |v |1IE C Muh,u

(21)

(E C denotes the complement of the event E).


The second term in (21) is bounded as follows:
(1)

E |v |1IE C Muh,u E |v |4 E
h

(1)

Muh,u

1/4

P (EhC )

1/2

The polynomial bound on G, plus the fact that Z has finite moments of all
orders, imply that E |v |4 is uniformly bounded.
(1)
Also, Muh,u D0 (Z(.) (.)(u h), [0, 1]) = D (recall that D0 (g; I ) denotes the number of downcrossings of level 0 by function g). A bound for E D 4

Regularity of the distribution of the maximum

85

can be obtained on applying Lemma 1.2 in Nualart-Wschebor (1991). In fact, the


Gaussian process Z(.) (.)(u h) has uniformly bounded one-dimensional marginal densities and for every positive integer p the maximum over [0, 1] of its p-th
derivative has finite moments of all orders. From that Lemma it follows that E D 4
is bounded independently of h, 0 < h < 1.
Hence,
(1)

E |v |1IE C Muh,u
h

(4)

(const) P ( Z(.) (4) (.)(u h)


1/2

(const) C1 eC2 h

+ hq/4 E |v |q

1/4
) + P (|v |
> h
1/2

> h1/4 )

1/2

where C1 , C2 are positive constants and q any positive number. The bound on the
first term follows from the Landau-Shepp (1971) inequality (see also Fernique,
1974) since even though the process depends on h it is easy to see that the bound
is uniform on h, 0 < h < 1. The bound on the second term is simply the Markov
inequality. Choosing q > 8 we see that the second term in (21) is o(h).
For the first term in (21) one can use the formula for the second factorial moment
(1)
of Muh,u to write it in the form:
1
0

1I{(s)0,(t)0} dsdt

0
E(|v |1IEh (Zs

(s)h
0

(t)h

dz1

dz2

(s)(u h)) (Zt (t)(u h)) /V4 = v4 ).pV4 (v4 ),


(22)

where V4 is the random vector


Zs (s)(u h), Zt (t)(u h), Zs (s)(u h), Zt (t)(u h)
and v4 = (z1 , z2 , 0, 0).
Let s = t and Q be the - unique - polynomial of degree 3 such that
Q(s) = z1 , Q(t) = z2 , Q (s) = 0, Q (t) = 0. Check that
Q(y) = z1 + (z2 z1 )(y s)2 (3t 2y s)(t s)3
Q (t) = 6(z1 z2 )(t s)2
Q (s) = 6(z1 z2 )(t s)2 .
Denote, for each positive h,
(y) := Zy (y)(u h) Q(y).
Under the conditioning V4 = v4 in the integrand of (22), the C function (.)
verifies (s) = (t) = (s) = (t) = 0. So, there exist t1 , t2 (s, t) such that
(t1 ) = (t2 ) = 0 and for y [s, t]:
| (y)| =|

y
t1

( )d |=|

y
t1

t2

(4) ( )d |

(t s)2
2

(4)

86

J.-M. Azas, M. Wschebor


2

Noting that a b a+b


for any pair of real numbers a, b, it follows that the
2
conditional expectation in the integrand of (22) is bounded by:
(4)

E(|v |.1IEh .(t s)4 ( Z(.) (4) (.)(u h)


(t s)4 .h1/2 .h1/4 = (t s)4 .h3/4 .

/V4 = v4 )
(23)

On the other hand, applying Lemma 3.2 we have the inequality


pV4 (z1 , z2 , 0, 0) pZs ,Zt ,Zs ,Zt (0, 0, 0, 0) (const)(t s)4
the constant depending on the process but not on s, t.
Summing up, the expression in (22) is bounded by
(const).h2 .h3/4 = o(h).
(1)

Replacing now in (20) the expectation E v .1IAu Muh,u by the corresponding


integral formula:
E v .1IAu \Auh
= 1I{(0)>0} (0)

u
uh
u

+1I{(1)>0} (1)
+

E v .1IAu /Z0 = (0)x .pZ0 ((0)x)dx


E v .1IAu /Z1 = (1)x .pZ1 ((1)x)dx

uh
(t)h

1I{(t)0} dt
0
0
pV2 (z, 0) + o(h)
u

uh

dzE v .1IAu (Zt (t)(u h)) /V2 = (z, 0)

H1 (x, h)dx + o(h)

(24)

where:
H1 (x, h) = 1I{(0)>0} (0)E v .1IAu /Z0 = (0)x .pZ0 ((0)x)
+ 1I{(1)>0} (1)E v .1IAu /Z1 = (1)x .pZ1 ((1)x)
+

1I{(t)0}
0
E(v .1IAu (Zt

(t)(u h)) /Zt = (t)x, Zt = (t)(u h))


.pZt ,Zt ((t)x, (t)(u h))(t)dt.
(25)

Step 3. Our next aim is to prove that for each u the limit
lim
h0

Fv (u) Fv (u h)
h

exists and admits the representation (15) in the statement of the Lemma. For that
purpose, we will prove the existence of the limit
1
lim E v .1IAu \Auh .
h0 h

(26)

Regularity of the distribution of the maximum

87

This will follow from the existence of the limit


lim

h0,uh<x<u

H1 (x, h).

Consider the first term in expression (25). We apply Lemma 3.1(a) and with the
same notations therein:
Zt = a (t) Z0 + tZt ,

t = a (t) (0) + tt

t [0, 1] .

For u h < x < u replacing in (25) we have:


E v .1IAu /Z0 = (0)x
= E G t1 (Zt1 (t1 )x) + (t1 )(x v), ..., tm (Ztm (tm )x)
+(tm )(x v) 1IB(u,x)
= E v,x .1IB(u,x)

(27)

where v,x is defined in the statement and


B(u, x) = tZt (t)u a (t) (0)x for all t [0, 1] .
For each such that 0 < 1 and a (s) > 0 if 0 s , we define:
B (u, x) = tZt (t)u a (t) (0)x for all t [, 1]
= Zt (t)u +

a (t) (0)(u x)
for all t [, 1] .
t

It is clear that since we consider the case (0) > 0, then


B(u, x) = B0+ (u, x) := lim B (u, x).
0

Introduce also the notations:


M[s,t] = sup Z ( )u : [s, t] ,
(x) = |u x| sup

|a (t) (0)|
: t [, 1] .
t

We prove that as x u,
E v,x .1IB(u,x) E v,u .1IB(u,u)

(28)

We have,
|E v,x .1IB(u,x) E v,u .1IB(u,u) |
E |v,x v,u | + |E v,u (1IB(u,x) 1IB(u,u) ) |.

(29)

88

J.-M. Azas, M. Wschebor

From the definition of v,x it is immediate that the first term tends to 0 as x u.
For the second term it suffices to prove that
P (B(u, x) B(u, u)) 0 as x u.

(30)

Check the inclusion:


B(u, x) B (u, u) (x) M[,1] (x) M[,1] 0, M[0,] > 0
which implies that
P (B(u, x) B(u, u)) P (B(u, x) B (u, u)) + P (B (u, u) B(u, u))
P (|M[,1] | (x)) + 2.P (M[,1] 0, M[0,] > 0).
Let x u for fixed . Since (x) 0, we get:
lim sup P (B(u, x) B(u, u)) P (M[,1] = 0) + 2.P (M[,1] 0, M[0,] > 0).
xu

The first term is equal to zero because of Proposition 2.4. The second term
decreases to zero as 0 since M[,1] 0, M[0,] > 0 decreases to the empty
set.
It is easy to prove that the function
(u, v) E v,u .1IAu (Z

, )

is continuous. The only difficulty comes from the indicator function 1IAu (Z , ) although again the fact that the distribution function of the maximum of the process
Z(.) (.)u has no atoms implies the continuity in u in much the same way as
above.
So, the first term in the right-hand member of (25) has the continuous limit:
1I{(0)>0} (0)E v,u .1IAu (Z

) .pZ0 ((0).u).

With minor changes, we obtain for the second term the limit:
1I{(1)>0} (1)E v,u .1IAu (Z

) .pZ1 ((1).u),

where Z , are as in Lemma 3.1 and v,u as in the statement of Lemma 3.3.
The third term can be treated in a similar way. The only difference is that the regression must be performed on the pair (Zt , Zt ) for each t [0, 1], applying again
Lemma 3.1 (a),(b),(c). The passage to the limit presents no further difficulties, even
if the integrand depends on h.
Finally, note that conditionally on Zt = (t)u, Zt = (t)u one has
Zt (t)u = Ztt t (t)u
and

(Zt (t)u) 1IAu (Z,) = (Zt (t)u)1IAu (Z,) .

Regularity of the distribution of the maximum

89

Adding up the various parts, we get:


1
lim E v .1IAu \Auh = 1I{(0)>0} (0)E v,u .1IAu (Z
h0 h

+1I{(1)>0} (1)E v,u .1IAu (Z


1

) .pZ0 ((0).u)
,

) .pZ1 ((1).u)

t
(t)1I{(t)>0} dtE v,u
(Ztt t (t).u)1IAu (Z t , t )

pZt ,Zt ((t)u, (t)u).


Similar computations that we will not perform here show an analogous result
for
1
lim E v .1IAuh \Au
h0 h
and replacing into (16) we have the result for processes Z with C paths.
Step 4. Suppose now that Z and (.) satisfy the hypotheses of the Lemma and
define:
Z (t) = ( Z)(t) + Y (t)

(t) = ( )(t)

and

where > 0, (t) = 1 ( 1 t), a non-negative C function with compact


+
support, (t)dt = 1 and Y is a Gaussian centered stationary process with C
paths and non-purely atomic spectrum, independent of Z. Proceeding as in Sec.
10.6 of Cramer-Leadbetter (1967), one can see that Y verifies Hk for every k. The
definition of Z implies that Z inherites this property. Thus for each positive ,
Z meets the conditions for the validity of Steps 2 and 3, so that the function
Fv (u) = E v 1IAu (Z

, )

where v = G(Zt1 (t1 )v, ..., Ztm (tm )v) is continuoustly differentiable
and its derivative verifies (15) with the obvious changes, that is:
Fv

(u) = (0)E

v,u

+ (1)E
1

v,u

(t)E

pZt ,(Z

)t

.1IAu

(Z ) ,( )

.1IAu
v,u

.pZ0 (0) .u

(Z ) ,( )

(t) .u,

t
t

.pZ1 (1) .u
t

(t) .u dt.

(t).u 1IAu ((Z

)t ,( )t )

(31)

Let 0. We prove next that (Fv ) (u) converges for fixed (u, v) to a limit
function Fv (u) that is continuous in (u, v). On the other hand, it is easy to see that
for fixed (u, v) Fv (u) Fv (u). Also, from (31) it is clear that for each v, there
exists 0 > 0 such that if (0, 0 ), (Fv ) (u) is bounded by a fixed constant when
u varies in a bounded set because of the hypothesis on the functions G and and
the non-degeneracy of the one and two-dimensional distribution of the process Z.

90

J.-M. Azas, M. Wschebor

So, it follows that Fv (u) = Fv (u) and the same computation implies that Fv (u)
satisfies (15).
Let us show how to proceed with the first term in the right-hand member of
(31). The remaining terms are similar.
Clearly, almost surely, as 0 one has Zt Zt , (Z )t Zt , (Z )t Zt
uniformly for t [0, 1], so that the definition of Z in (11) implies that (Z )t
Zt uniformly for t [0, 1], since the regression coefficient (a ) (t) converges to
a (t) uniformly for t [0, 1] (with the obvious notation).
Similarly, for fixed (u, v):
( )t t , (v,u ) v,u
uniformly for t [0, 1].
Let us prove that
E (v,u ) 1IAu

E v,u 1IAu (Z

(Z ) ,( )

) .

This is implied by

as

P Au
0. Denote, for

> 0, 0:

Cu, = Au

Au Z ,

0.

(32)

(t).u for every t [0, 1]

Eu, = Zt (t)u + for all t [0, 1] .


One has:
P (Cu,

Eu,0 ) P (Cu, \ Eu, ) + P (Eu, \ Cu, ) + P (Eu, \ Eu,0 ).

Let K be a compact subset of the real line and suppose u K. We denote:


D

sup

uK,t[0,1]

(t).u Zt (t).u |>

and
Fu, = sup

t[0,1]

Zt (t)u .

Fix > 0 and choose small enough so that:


P D

< .

Check the following inclusions:


Cu, \ Eu, D

, ,

Eu, \ Cu,

D c, Fu, ,

Eu, \ Eu,0 Fu,

which imply that if is small enough:


P (Cu,

Eu,0 ) 2. + 2.P Fu, .

Regularity of the distribution of the maximum

91

For each u, as 0 one has


P Fu, P

sup

t[0,1]

Zt (t)u = 0 = 0.

where the second equality follows again on applying Proposition 2.4.


This proves that as 0 the first term in the right-hand member of (31) tends
to the limit
(0)E v,u .1IAu (Z , ) .pZ0 ( (0) .u) .
It remains to prove that this is a continuous function of (u, v). It suffices to prove
the continuity of the function
E 1IAu (Z

) = P Au Z ,

as a function of u. For that purpose we use inequality:


| P Au+h Z ,
P

| sup

t[0,1]

P Au Z ,

Zt (t).u || h |

and as h 0 the right-hand member tends to P | supt[0,1] Zt (t).u |= 0


which is equal to zero by Propostion 2.4.
Proof of Theorem 1.1 We proceed by induction on k.
We will give some details for the first two derivatives including some implicit
formulae that will illustrate the procedure for general k.
We introduce the following additional notations. Put Yt := Xt (t)u and define, on the interval [0, 1], the processes X , X , Xt , Y , Y , Y t , and the functions
, , t , as in Lemma 3.1. Note that the regression coefficients corresponding
to the processes X and Y are the same, so that anyone of them may be used to define
the functions , , t . One can easily check that
Ys = Xs (s)u
Ys = Xs (s)u
Yst = Xst t (s)u.

For t1 , ..., tm [0, 1] { , } , m 2, we define by induction the stochastic


t
t
processes X t1 ,...,tm = X t1 ,...,tm1 m , Y t1 ,...,tm = Y t1 ,...,tm1 m and the function
t
t1 ,...,tm = t1 ,...,tm1 m , applying Lemma 3.1 for the computations at each stage.
With the aim of somewhat reducing the size of the formulae we will express
the successive derivatives in terms of the processes Y t1 ,...,tm instead of Xt1 ,...,tm .
The reader must keep in mind that for each m-tuple t1 , ..., tm the results depend on
u through the expectation of the stochastic process Y t1 ,...,tm . Also, for a stochastic
process Z we will use the notation
A(Z) = A0 (Z, ) = {Zt 0 : for all t [0, 1]} .

92

J.-M. Azas, M. Wschebor

First derivative. Suppose that X satisfies H2 . We apply formula (15) in Lemma


3.3 for 1, Z = X and (.) 1 obtaining for the first derivative:
F (u) = E 1IA(Y ) pY0 (0) + E 1IA(Y ) pY1 (0)
1

E Ytt11 1IA

Y t1

pYt

,Yt1 (0, 0)dt1 .

(33)

This expression is exactly the expression in (9) with the indicated notational changes and after taking profit of the fact that the process is Gaussian, via the regression
on the conditionning in each term. Note that according to the definition of the
Y -process:
E 1IA(Y ) = E 1IAu (X , )
E 1IA(Y ) = E 1IAu (X
E Ytt11 1IA

= E Ytt11 1IAu (Xt1 , t1 ) .

Y t1

Second derivative. Suppose that X satisfies H4 . Then, X , X , Xt1 satisfy H3 , H3 ,


H2 respectively. Therefore Lemma 3.3 applied to these processes can be used to
show the existence of F (u) and to compute a similar formula, excepting for the
necessity of justifying differentiation under the integral sign in the third term. We
get the expression:
(1)

(1)

F (u) = E 1IA(Y ) pY0 (0) E 1IA(Y ) pY1 (0)


1

+
0

E Ytt11 1IA

Y t1

(1,0)
(0, 0)dt1
t1 ,Yt1

pY

+pY0 (0) (0)E 1IA(Y


1

(t2 )E Yt2 ,t2 1I

A Y

+pY1 (0) (0)E 1IA(Y


1

) pY0 (0) + (1)E 1IA(Y

(t2 )E Yt2 ,t2 1I


A

,t2

pY

t2 ,(Y

,t2

pY

t2 ,(Y

) pY1 (0)

)t2 (0, 0)dt2

t1 (t1 )E 1IA(Y t1 ) + t1 (0)E Ytt11 , 1IA

(0, 0) + t1 (1)E Ytt1 , 1I


1 ,Yt1
1
A

pYt

) pY1 (0)

)t2 (0, 0)dt2

) pY0 (0) + (1)E 1IA(Y

1
0

(1)
(1)
In this formula pYt , pYt
0
1

t1

(t2 )E

Y t1 ,

Y t1 ,

pY t1 (0)
0

pY t1 (1)
1

Ytt11 ,t2 Ytt21 ,t2 1IA(Y t1 ,t2 )

dt1 ,

pY t1 ,(Y t1 ) (0, 0)dt2


t2

t2

(34)

and pYt ,Yt (0, 0)(1,0) stand respectively for the deriv1
1
ative of pYt0 (.), the derivative of pYt1 (.) and the derivative with respect to the first
variable of (pYt ,Yt (., .)).
1
1
To validate the above formula, note that:

Regularity of the distribution of the maximum

93

The first two lines are obtained by differentiating with respect to u, the densities
pY0 (0) = pX0 (u), pY1 (0) = pX1 (u), pYt ,Yt (0, 0) = pXt ,Xt (u, 0).
1
1
1
1
Lines 3 and 4 come from the application of Lemma 3.3 to differentiate E(1IA(Y ) ).
The lemma is applied with Z = X , = , = 1.
Similarly, lines 5 and 6 contain the derivative of E(1IA(Y ) ).
The remaining corresponds to differentiate the function
E Ytt11 1IA(Y t1 ) = E Xtt11 t1 (t1 )u 1IAu (Xt1 , t1 )
in the integrand of the third term in (33). The first term in line 7 comes from the
simple derivative

E (Xtt11 t1 (t1 )v)1IAu (Xt1 , t1 ) = t1 (t1 )E(1IA(Y t1 ).


v
The other terms are obtained by applying Lemma 3.3 to compute

E (Xtt11 t1 (t1 )v)1IAu (Xt1 , t1 ) ,


u
putting Z = Xt1 , = t1 , = Xtt11 t1 (t1 )v.
Finally, differentiation under the integral sign is valid since because of Lemma
3.1, the derivative of the integrand is a continuous function of (t1 , t2 , u) due
the regularity and non-degeneracy of the Gaussian distributions involved and
Proposition 2.4.
General case. With the above notation, given the mtuple t1 , ..., tm of elements
of [0, 1] { , } we will call the processes Y, Y t1 , Y t1 ,t2 , ..., Y t1 ,...,tm1 the ancestors of Y t1 ,...,tm . In the same way we define the ancestors of the function t1 ,...,tm .
Assume the following induction hypothesis: If X satisfies H2k then F is k
times continuously differentiable and F (k) is the sum of a finite number of terms
belonging to the class Dk which consists of all expressions of the form:
1

..

ds1 ..dsp Q(s1 , .., sp )E 1IA

Y t1 ,..,tm

K1 (s1 , .., sp )K2 (s1 , .., sp ) (35)

where:
1 m k.
t1 , ..., tm [0, 1] { , } , m 1.
s1 , .., sp , 0 p m, are the elements in {t1 , ..., tm } that belong to [0, 1] (that
is, which are neither nor ). When p = 0 no integral sign is present.
Q(s1 , .., sp ) is a polynomial in the variables s1 , .., sp .
is a product of values of Y t1 ,...,tm at some locations belonging to s1 , .., sp .
K1 (s1 , .., sp ) is a product of values of some ancestors of t1 ,...,tm at some
locations belonging to the set s1 , .., sp {0, 1} .
K2 (s1 , .., sp ) is a sum of products of densities and derivatives of densities of
the random variables Z at the point 0, or the pairs ( Z , Z ) at the point (0, 0)
where s1 , .., sp {0, 1} and the process Z is some ancestor of Y t1 ,...,tm .

94

J.-M. Azas, M. Wschebor

Note that K1 does not depend on u but K2 is a function of u.


It is clear that the induction hypothesis is verified for k = 1. Assume that it
is true up to the integer k and that X satisfies H2k+2 . Then F (k) can be written as
a sum of terms of the form (35). Consider a term of this form and note that the
variable u may appear in three locations:
1. In , where differentiation is simple given its product form, the fact that
t1 ,...,tq
= t1 ,...,tq (s), q m, s s1 , ..., sp and the boundedness of
u Ys
moments allowing to differentiate under the integral and expectation signs.
2. In K2 (s1 , .., sp ) which is clearly C as a function of u. Its derivative with
respect to u takes the form of a product of functions of the types K1 (s1 , .., sp )
and K2 (s1 , .., sp ) defined above.
3. In 1IA Y t1 ,..,tm . Lemma 3.3 shows that differentiation produces 3 terms depending upon the processes Y t1 ,...,tm ,tm+1 with tm+1 belonging to [0, 1] { , }.
Each term obtained in this way belongs to Dk+1 .
The proof is achieved by noting that, as in the computation of the second derivative, Lemma 3.1 implies that the derivatives of the integrands are continuous
functions of u that are bounded as functions of (s1 , .., sp , tm+1 , u) if u varies in a
bounded set.
The statement and proof of Theorem 1.1 can not, of course, be used to obtain
explicit expressions for the derivatives of the distribution function F . However, the
implicit formula for F (k) (u) as sum of elements of Dk can be transformed into explicit upper-bounds if one replaces everywhere the indicator functions 1IA(Y t1 ,..,tm ) )
by 1 and the functions t1 ,..,tm (.) by their absolute value.
On the other hand, Theorem 1.1 permits to have the exact asymptotic behaviour
of F (k) (u) as u + in case V ar(Xt ) is constant. Even though the number of
terms in the formula increases rapidly with k, there is exactly one term that is dominant. It turns out that as u +, F (k) (u) is equivalent to the k-th derivative of
the equivalent of F (u). This is Corollary 1.1.
Proof of Corollary 1.1. To prove the result for k = 1 note that under the hypothesis
of the Corollary, one has r(t, t) = 1, r01 (t, t) = 0, r02 (t, t) = r11 (t, t) and an
elementary computation of the regression (13) replacing Z by X, shows that:
bt (s) = r(s, t),
and
t (s) = 2

ct (s) =

r01 (s, t)
r11 (t, t)

1 r(s, t)
(t s)2

since we start with (t) 1.


This shows that for every t [0, 1] one has inf s[0,1] ( t (s)) > 0 because of the
non-degeneracy condition and t (t) = r02 (t, t) = r11 (t, t) > 0. The expression
for F becomes:
(36)
F (u) = (u)L(u),

Regularity of the distribution of the maximum

95

where
L(u) = L1 (u) + L2 (u) + L3 (u),
L1 (u) = P (Au (X , ),
L2 (u) = P (Au (X , ),
1

L3 (u) =
0

E (Xtt t (t)u)1IAu (Xt , t )

dt
.
(2 r11 (t, t))1/2

Since for each t [0, 1] the process Xt is bounded it follows that


a.s. 1IAu (Xt , t ) 1 as u +.
A dominated convergence argument shows now that L3 (u) is equivalent to

u
(2)1/2

1
0

r02 (t, t)
u
dt =
1/2
(r11 (t, t))
(2 )1/2

r11 (t, t)dt.

Since L1 (u), L2 (u) are bounded by 1, (1) follows for k = 1.


For k 2, write
F (k) (u) = (k1) (u)L(u) +

h=k
h=2

k 1 (kh)

(u)L(h1) (u).
k1

(37)

As u +, for each j = 0, 1, ..., k 1, (j ) (u) (1)j uj (u) so that the


first term in (37) is equivalent to the expression in (1). Hence, to prove the Corollary
it suffices to show that the succesive derivatives of the function L are bounded. In
fact, we prove the stronger inequality
|L(j ) (u)| lj (

u
), j = 1, ..., k 1
aj

(38)

for some positive constants lj , aj , j = 1, ..., k 1.


We first consider the function L1 . One has:
(s) =
( ) (s) =

1 r(s, 0)
f or 0 < s 1, (0) = 0,
s

1 + r(s, 0) s.r10 (s, 0)


1
f or 0 < s 1, ( ) (0) = r11 (0, 0).
2
2
s

The derivative L1 (u) becomes


L1 (u) = (1)E[1IAu (X
1

(t)E (Xt

,t

,
,t

)]

pX ( (1)u)
1

(t)u)1IAu (X

,t , ,t )

pX

,(X )t (

(t)u, ( ) (t)u) dt.

Notice that (1) is non-zero so that the first term is bounded by a constant
times a non-degenerate Gaussian density. Even though (0) = 0, the second

96

J.-M. Azas, M. Wschebor

term is also bounded by a constant times a non-degenerate Gaussian density because the joint distribution of the pair (Xt , (X )t ) is non-degenerate and the pair
( (t), ( ) (t)) = (0, 0) for every t [0, 1].
Applying a similar argument to the succesive derivatives we obtain (38) with
L1 instead of L.
The same follows with no changes for
L2 (u) = P (Au (X , ).
For the third term
1

L3 (u) =
0

E (Xtt t (t)u)1IAu (Xt , t )

dt
(2 r11 (t, t))1/2

we proceed similarly, taking into account t (s) = 0 for every s [0, 1]. So (38)
follows and we are done.
Remark. Suppose that X satisfies the hypotheses of the Corollary with k 2.
Then, it is possible to refine the result as follows.
For j = 1, ..., k :
F (j ) (u) = (1)j 1 (j 1)!hj 1 (u)
1

1 + (2)1/2 .u.
0

(r11 (t, t))1/2 dt (u) + j (u)(u) (39)

1 (j )
where hj (u) = (1)
j ! ((u)) (u), is the standard j-th Hermite polynomial
(j = 0, 1, 2, ...) and
| j (u) | Cj exp(u2 )

where C1 , C2 , ... are positive constants and > 0 does not depend on j .
The proof of (39) consists of a slight modification of the proof of the Corollary.
Note first that from the above computation of (s) it follows that 1) if X0 < 0,
then if u is large enough Xs (s).u 0 for all s [0, 1] and 2) if X0 > 0,
then X0 (0).u > 0 so that:
L1 (u) = P (Xs (s).u 0) for all s [0, 1])

1
2

as u +.

On account of (38) this implies that if u 0:


0

1
L1 (u) =
2

+
u

L1 (v)dv D1 exp(1 u2 )

with D1 , 1 positive constants.


L2 (u) is similar. Finally:
1

L3 (u) =
0

E (Xtt t (t)u)

dt
(2 r11 (t, t))1/2

Regularity of the distribution of the maximum


1

97

E (Xtt t (t)u)1I(A

u (X

t , t ) C

dt
.
(2 r11 (t, t))1/2

(40)

The first term in (40) is equal to:


1

(2)1/2 .u.

(r11 (t, t))1/2 dt.

As for the second term in (40) denote # =

inf

s,t[0,1]

t (s) > 0 and let u > 0.

Then:
P

Au (X t , t )

P ( s [0, 1] such that Xst > # .u) D3 exp(3 u2 )

with D3 , 3 are positive constants, the last inequality being a consequence of the
Landau-Shepp-Fernique inequality.
The remainder follows in the same way as the proof of the Corollary.
Acknowledgements. This work has received a support from CONICYT-BID-Uruguay, grant
91/94 and from ECOS program U97E02.

References
1. Adler, R.J.: An Introduction to Continuity, Extrema and Related Topics for General
Gaussian Processes, IMS, Hayward, Ca (1990)
2. Azas, J-M., Wschebor, M.: Regularite de la loi du maximum de processus gaussiens
reguliers, C.R. Acad. Sci. Paris, t. 328, serieI, 333336 (1999)
3. Belyaev, Yu.: On the number of intersections of a level by a Gaussian Stochastic process,
Theory Prob. Appl., 11, 106113 (1966)
4. Berman, S.M.: Sojourns and extremes of stochastic processes, The Wadworth and
Brooks, Probability Series (1992)
5. Bulinskaya, E.V.: On the mean number of crossings of a level by a stationary Gaussian
stochastic process, Theory Prob. Appl., 6, 435438 (1961)
6. Cierco, C.: Probl`emes statistiques lies a` la detection et a` la localisation dun g`ene a` effet
quantitatif. PHD dissertation. University of Toulouse.France (1996)
7. Cramer, H., Leadbetter, M.R.: Stationary and Related Stochastic Processes, J. Wiley &
Sons, New-York (1967)
8. Diebolt, J., Posse, C.: On the Density of the Maximum of Smooth Gaussian Processes,
Ann. Probab., 24, 11041129 (1996)
9. Fernique, X.: Regularite des trajectoires des fonctions aleatoires gaussiennes, Ecole
dEte de Probabilites de Saint Flour, Lecture Notes in Mathematics, 480, Springer-Verlag,New-York (1974)
10. Landau, H.J., Shepp, L.A.: On the supremum of a Gaussian process, Sankya Ser. A, 32,
369378 (1971)
11. Leadbetter, M.R., Lindgren, G., Rootzen, H.: Extremes and related properties of random
sequences and processes. Springer-Verlag, New-York (1983)
12. Lifshits, M.A.: Gaussian random functions. Kluwer, The Netherlands (1995)
13. Marcus, M.B.: Level Crossings of a Stochastic Process with Absolutely Continuous
Sample Paths, Ann. Probab., 5, 5271 (1977)

98

J.-M. Azas, M. Wschebor

14. Nualart, D., Vives, J.: Continuite absolue de la loi du maximum dun processus continu,
C. R. Acad. Sci. Paris, 307, 349354 (1988)
15. Nualart, D., Wschebor, M.: Integration par parties dans lespace de Wiener et approximation du temps local, Prob. Th. Rel. Fields, 90, 83109 (1991)
16. Piterbarg, V.I.: Asymptotic Methods in the Theory of Gaussian Processes and Fields,
American Mathematical Society. Providence, Rhode Island (1996)
17. Rice, S.O.: Mathematical Analysis of Random Noise, Bell System Technical J., 23,
282332, 24, 45156 (19441945)
18. Tsirelson, V.S.: The Density of the Maximum of a Gaussian Process, Th. Probab. Appl.,
20, 817856 (1975)
19. Weber, M.: Sur la densite du maximum dun processus gaussien, J. Math. Kyoto Univ.,
25, 515521 (1985)
20. Wschebor, M.: Surfaces aleatoires. Mesure geometrique des ensembles de niveau, Lecture Notes in Mathematics, 1147, Springer-Verlag (1985)
21. Ylvisaker, D.: A Note on the Absence of Tangencies in Gaussian Sample Paths, The
Ann. of Math. Stat., 39, 261262 (1968)

The Distribution of the Maximum of a Gaussian


Process: Rice Method Revisited.
Jean-Marc Azas , azais@cict.fr
Mario Wschebor , wscheb@fcien.edu.uy
December 21, 2000

Abstract
This paper deals with the problem of obtaining methods to compute the
distribution of the maximum of a one-parameter stochastic process on a fixed
interval, mainly in the Gaussian case. The main point is the relationship
between the values of the maximum and crossings of the paths, via the socalled Rices formulae for the factorial moments of crossings.
We prove that for some general classes of Gaussian process the so-called
Rice series is convergent and can be used for to compute the distribution
of the maximum. It turns out that the formulae are adapted to the numerical
computation of this distribution and becomes more efficient than other numerical methods, namely simulation of the paths or standard bounds on the
tails of the distribution.
We have included some relevant numerical examples to illustrate the power
of the method.

Mathematics Subject Classification (1991): 60Gxx, 60E05, 60G15, 65U05.


Key words: Extreme Values, Distribution of the Maximum, Crossings of a Level,
Rices Formulae.
Short Title: Distribution of the Maximum.

Laboratoire de Statistique et Probabilites. UMR-CNRS C55830 Universite Paul Sabatier. 118,


route de Narbonne. 31062 Toulouse Cedex 4. France.

Centro de Matematica. Facultad de Ciencias. Universidad de la Rep


ublica. Calle Igua 4225.
11400 Montevideo. Uruguay.

Introduction

Let X = {Xt : t IR} be a stochastic process with real values and continuous paths
defined on a probability space (, , P ) and MT := max{Xt : t [0, T ]}.
The computation of the distribution function of the random variable MT
F (T, u) := P (MT u), u IR
by means of a closed formula based upon natural parameters of the process X is
known only for a very restricted number of stochastic processes (and trivial functions
of them): the Brownian Motion {Wt : t 0}; the Brownian Bridge, Bt := Wt
1
tW1 (0 t 1); Bt 0 Bs ds (Darling, 1983); the Brownian Motion with a linear
t
drift (Shepp, 1979); 0 Ws ds + yt (McKean, 1963, Goldman, 1971 Lachal, 1991); the
stationary Gaussian processes with covariance equal to:
1. r(t) = e|t| (Ornstein-Uhlenbeck process, DeLong, 1981),
2. r(t) = (1 |t|)+ , T a positive integer (Slepian process, Slepian 1961, Shepp,
1971),
3. r(t) even, periodic with with period 2, r(t) = 1|t| for 0 |t| 1, 0 < 2,
(Shepp and Slepian 1976),
4. r(t) = 1 |t|/1 /1 , |t| < 1 /, 0 < 1/2, T = (1 )/
(Cressie 1980),
5. r(t) = cos t.
Given the interest in F (T, u) for a large diversity of theoretical and technical
purposes an extensive literature has been developed of which we give a sample of
references pointing to various directions:
1. Obtaining inequalities for F (T, u) : Slepian (1962); Landau & Shepp (1970);
Marcus & Shepp (1972); Fernique (1974); Borell (1975); Ledoux (1996); Talagrand (1996) and references therein. A general review of a certain number of
classical results is in Adler (1990, 2000).
2. Describing the behaviour of F (T, u) under various asymptotics : Qualls and
Watanabe (1973); Piterbarg (1981, 1996); Leadbetter, Lingren and Rootzen
(1983); Berman (1985a, b, 1992); Talagrand (1988); Berman & Kono (1992) ;
Sun (1993); Wschebor (2000); Azas, Bardet and Wschebor (2000).
2

3. Studying the regularity of the distribution of MT : Ylvisaker (1968); Tsirelson


(1975); Weber (1985); Lifshits (1995); Diebolt and Posse (1996); Azas and
Wschebor (2000) and references therein.
Generally speaking, even though important results are associated with problems
1) 2) and 3) they only give limited answers to the computation of F (T, u) for fixed
T and u. As a consequence, Monte-Carlo methods based on the simulation of the
paths of the continuous parameter process X are widely used, even though they
have well-known difficulties : a) they are poor for theoretical purposes ; b) they are
expensive from the point of view of the number of elementary computations needed
to assure that the error is below a given bound and c) they always depend on the
quality of the random number generator being employed.
The approach in this paper is based upon expressing F (T, u) by means of a
series (The Rice series) whose terms contain the factorial moments of the number
of upcrossings. The underlying ideas have been known since a long time (Rice
(1944-1945), Slepian (1962), Miroshin (1974)).
The main new result in this paper is that we have been able to prove the convergence of that series in a general framework instead of considering only some
particular processes. This provides a method that can be widely applied.
A typical application is Theorem2.2 that states that if a stationary Gaussian
process has a covariance which has a Taylor expansion at zero that is absolutely
convergent at t = 2T , then F (T, u) can be computed by means of the Rice series.
On the other hand, even though Theorems 2.1 and 2.3 below do not refer specifically
to Gaussian processes, in practice, at present we are only able to apply them to the
numerical computation of F (T, u) only in Gaussian cases.
In the section Numerical examples we include a comparison between the complexities of the computations of F (T, u) using the Rice series versus Monte-Carlo
method, in the relevant case of a general class of stationary Gaussian processes. It
shows that the use of Rice series is a priori better. More important is the fact that
the Rice series is self-controlling for the numerical errors. This implies that the a
posteriori number of computations may be much smaller than the one required by
simulation. In fact, in relevant cases for standard bounds for the error, the actual
computation is performed with a few terms in the Rice series.
As examples we give tables for F (T, u) for a number of Gaussian processes.
When the length of the interval T increases, one needs an increasing mumber
of terms in the Rices series not to surpass a given bound for the approximation
error. For small values of T an large values of the level u one can use the so3

called Davies bound(1977), or more accurately, the first term in the Rice series
to obtain approximations for F (T, u). But as T increases, for moderate values of u
the Davies bound is far from the true value and one requires the computation of
the succesive terms. The numerical results are shown in the case of four Gaussian
stationary processes for which no closed formula is known.
An asymptotic approximation of F (T, u) as u + recently obtained by Azas,
Bardet and Wschebor (2000). It extends to any T a previous result by Piterbarg
(1981) for sufficiently small T .
One of the key points in the computation is the numerical approximation of the
factorial moments of upcrossings by means of Rice integral formulae. For that purpose, the main difficulty is the precise description of the behaviour of the integrands
appearing in these formulae near the diagonal, which is again an old subject that is
interesting on its own - see Belayeiv (1966), Cuzick (1975) - and remains widely open.
We have included in the Section Computation of Moments some new results, that
give partial answers and are helpful to improve the numerical methods.
The extension to processes with non-smooth trajectories can be done by smoothing the paths by means of a deterministic device, applying the previous methods
to the regularized process and estimating the error as a function of the smoothing
width. We have not included these type of results here since for the time being they
do not appear to be of practical use.
The Note (Azas & Wschebor 1997) contains a part of the results in the present
paper, without proofs.

Notations
Let f : I IR be a function defined on the interval I of the real numbers,
Cu (f ; I) := {t I : f (t) = u}
Nu (f ; I) := (Cu (f ; I))
denote respectively the set of roots of the equation f (t) = u on the interval I and the
number of these roots, with the convention Nu (f ; I) = + if the set Cu is infinite.
Nu (f ; I) is called the number of crossings of f with the level u on the interval
I. In what follows, I will be the interval [0, T ] if it is not stated otherwise.
In the same way, if f is a differentiable function the number of upcrossings of
f is defined by means of
Uu (f ; I) := ({t I : f (t) = u, f (t) > 0}).
4

f p denotes the norm of f in Lp (I, ), 1 p +, the Lebesgue measure. The


joint density of the finite set of random variables X1 , ...Xn at the point (x1 , ...xn )
will be denoted pX1 ,...,Xn (x1 , ...xn ) whenever it exists. (t) := (2)1/2 exp(t2 /2) is
t
the density of the standard normal distribution, (t) := (u)du its distribution
function. |I| is the length of I. x+ = sup {x, 0}.
If A is a matrix, AT denotes its transposed, and if A is a square matrix, det(A)
its determinant. V ar() is the variance matrix of the (finite dimensional) random
vector and Cov(, ) the covariance of and .
For m and k, positive integers, k m, define the factorial kth power of m by
m[k] := m(m 1)...(m k + 1)
For other real values of m and k we put m[k] := 0.
If k is an integer k 1, the diagonal of I k is the set:
Dk (I) := {(t1 , ..., tk ) I k , tj = th for some pair (j, h), j = h}.
f (m) is the m-th derivative of the function f . jh = 0 or 1 according as j = h or
j = h.

The distribution of the maximum and the Rice


series

We introduce the notations


m := E((Uu )[m] 1I{X0 u} ); m := E((Uu )[m] ) (m = 1, 2, ...)
where Uu = Uu (X, [0, T ]). m is the factorial moment of the number of upcrossings
of the process X with the level u on the interval [0, T ], starting below u at t = 0.
The Rice formula to compute m , whenever it holds is:
u

m =

dt1 ...dtm

[0,T ]

dxE (Xt1 )+ ...(Xtm )+ /X0 = x, Xt1 = ... = Xtm = u

pX0 ,Xt1 ,...,Xtm (x, u, ..., u) =

(1)

[0,T ]m

dt1 ...dtm

dx

pX0 ,Xt1 ,...,Xtm ,Xt

[0,+)m

x1 ...xm

,...,Xtm (x, u, ..., u, x1 , ...xm )dx1 ...dxm .

(2)

(References for conditions for this formula to hold true that suffice for our presente purposes and also for proofs can be found, for example, in Marcus (1977) and
in Wschebor (1985).
This section contains two main results. The first is Theorem 2.1 that requires
the process to have C paths and contains a general condition enabling to compute
F (T, u) as the sum of a series. The second is Theorem 2.2 that illustrates the same
situation for Gaussian stationary processes from conditions on the the covariance.
As for Theorem 2.3, it contains upper and lower bounds on F (T, u) for processes
with C k paths verifying some additional conditions.
Theorem 2.1 Assume that a.s. the paths of the stochastic process X are of class
C and that the density pXT /2 is bounded by some constant D.
(i) If there exists a sequence of positive numbers {ck }k=1,2,... such that:
k := P

X (2k1)

ck .T 2k1 +

22k1

Dck
= o 2k (k )
(2k 1)!

(3)

then :

(1)m+1

1 F (T, u) = P (X0 > u) +


m=1

m
m!

(4)

(ii) In formula (4) the error when one replaces the infinite sum by its m0 -th

partial sum is bounded by m


where:
0 +1

m
:= sup 2k+1 k .
km

We will call the series in the right-hand term of (4) the Rice Series.
For the proof we will assume, with no loss of generality that T = 1.
We start with the following lemma on the Cauchy remainder for polynomial
interpolation (Davis 1975, Th. 3.1.1 ).
Lemma 2.1 a) Let I be an interval in the real line, f : I IR a function of
class C k , k a positive integer, t1 , ..., tk , k points in I and let P (t) be the - unique
- interpolation polynomial of degree k 1 such that for i = 1, ..., k: f (ti ) = P (ti ),
taking into account possible multiplicities.

Then, for t I :
1
(t t1 )....(t tk )f (k) ()
k!

f (t) P (t) =
where

min(t1 , ..., tk , t) max(t1 , ..., tk , t).


b) If f is of class C k and has k zeros in I (taking into account possible multiplicities), then:
|f (1/2)|

1
f (k)
k!2k

The next combinatorial lemma plays the central role in what follows. A proof is
given in Lindgren (1972).
Lemma 2.2 Let be a non-negative integer-valued random variable having finite
moments of all orders. Let k, m, M (k 0, m 1, M 1) be integers and denote
M

pk := P ( = k) ; m := E( [m] ) ; SM :=

(1)m+1
m=1

m
m!

Then
(i) For each M :

2M

pk S2M +1

pk

S2M

(5)

k=1

k=1

(ii) The sequence {SM ; M = 1, 2, ...} has a finite limit if and only if m /m! 0
as m , and in that case:

P ( 1) =

(1)m+1

pk =
m=1

k=1

m
.
m!

(6)

Remark. A by-product of Lemma 2.2 that will be used in the sequel is the following:
if in (6) one substitutes the infinite sum by the M partial sum, the absolute value
M +1 /((M + 1)!) of the first neglected term is an upper-bound for the error in the
computation of P ( 1).
7

Lemma 2.3 With the same notations as in Lemma 2.2 we have the equality:

E( [m] ) = m

(k 1)[m1] P ( k) (m = 1, 2, ...).
k=m

Proof: Check the identity


j1

[m]

(k)[m1]

=m
k=m1

for each pair of integers j, m. So,

E(

[m]

)=

[m]

P ( = j) =

j=m

(k 1)[m1] =

P ( = j)m
j=m

k=m

(k 1)[m1] P ( k).

=m

k=m

Lemma 2.4 Suppose that a.s. the paths of the process X belong to C and that
pX1/2 is bounded by the constant D. Then for any sequence {ck , k = 1, 2, ...} of
positive numbers, one has

[m]

E((Uu )

)m

(k 1)[m1] P

X (2k1)

ck +

k=m

22k1

Dck
,
(2k 1)!

(7)

Proof: Because of Lemma 2.3 it is enough to prove that P (Uu k) is bounded


by the expression in brackets in the right-hand member of (7). We have
P (Uu k) P ( X (2k1)

ck ) + P (Uu k, X (2k1)

< ck ).

Because of Rolles theorem:


{Uu k} {Nu (X; I) 2k 1},
Applying Lemma 2.1 to the function X(.) u and replacing in its statement k by
2k 1, we obtain:
{Uu k, X (2k1)

< ck } {|X1/2 u|
8

22k1

ck
}.
(2k 1)!

The remaining is plain.


Proof of Theorem 2.1 :
[m]
We use the notation m := E(Uu ) (m = 1, 2, ...).
Using Lemma 2.4 and the hypothesis we obtain:

1
(k+1)
k [m] m
2
= m 2(m+1)

m!
m! k=m
m!

1
1x

(m)

|x=1/2 = m

Since m m we can apply Lemma 2.2 to the random variable = Uu 1I{X0 u}

0.
and the result follows from m
Remarks
One can replace condition pXT /2 (x) D for all x by pXT /2 (x) D for x in some
neighbourhood of u. In this case, the statement of Theorem 2.1 holds if one adds in

(ii) that the error is bounded by m


for m0 large enough. The proof is similar.
0 +1
Also, one may substitute the one-dimensional density pXT /2 by pXt for some other
t (0, T ), introducing into the bounds the corresponding modifications.
The application of Theorem 2.1 requires an adequate choice of the sequence
{ck , k = 1, 2, ...} that depends on the available description of the process X. The
whole procedure will have some practical interest for the computation of P (M > u)

only if we get appropriate bounds for the quantities m


and the factorial moments m
can be actually computed by means of Rice formulae (or by some other procedure).
The next Theorem shows how this can be done in the case of a general class of
Gaussian stationary processes.
Theorem 2.2 Let X be Gaussian, centered and stationary, with covariance .
Assume that has a Taylor expansion at the origin that is absolutely convergent
at t = 2T. Then the conclusion of Theorem 2.1 holds true so that the Rice series
converges and F (T, u) can be computed by means of (4)
Proof. Again we assume with no loss of generality that T = 1 and that (0) = 1.
Note that the hypothesis implies that the spectral moments k exist and are
finite for every k = 0, 1, 2, ...
We will prove a stronger result, assuming the hypothesis:
H1 : 2k C1 (k!)2 .
9

It is easy to verify that if has a Taylor expansion at zero that is absolutely


convergent at t = 2, then H1 holds true. (In fact, both conditions are only slightly
different, since H1 implies that the Taylor expansion of at zero is absolutely
convergent in {|t| < 2}).
Let us check that the hypotheses of Theorem 2.1 hold true.
First, pX1/2 (x) D = (2)1/2 .
Second, let us show a sequence {ck } that satisfies (3). We have
P ( X (2k1)

(2k1)

ck ) P (|X0

| ck ) + 2P (Uck (X (2k1) , I) 1)
1/2

P (|Z| ck 4k2 ) + 2E(Uck (X (2k1) , I)), (8)


where Z is standard normal.
(2k1)
Note that {Xt
; t IR} is a Gaussian stationary centered process with co(4k2)
variance function
(.). So we can use Rice formula for the expectation of the
number of upcrossings of a stationary centered Gaussian process (see for example
Cramer & Leadbetter, 1967) to compute the second term in the right-hand member
of (8). Using the inequality 1 (x) (1/x)(x) valid for x > 0, one gets:
1/2

P( X

(2k1)

2 4k2
+ (1/)
ck

ck )

4k
4k2

1/2

exp

Choose
ck := (B1 k4k2 )1/2 if

ck := (4k )1/2 if

4k
B1 k
4k2

4k
> B1 k.
4k2

Using hypothesis H1 ), if B1 > 1 :


P ( X (2k1)

B1 k
2
1
+ (B1 k)1/2 e 2 .

ck )

Finally, choosing B1 := 4log(2):


k

2
1/2
(1 + 2(C1 + 1)k)22k

10

(k = 1, 2, ...),

c2k
24k2

(9)

so that (3) is satisfied. As a by product, note that

8
1/2
(1 + 2(C1 + 1)m)2m (m = 1, 2, ...).

(10)

Remarks
a) If one is willing to use Rice formulae to compute the factorial moments m , it
is enough to verify that the distribution of
Xt1 , ..., Xtk , Xt1 , ..., Xtk
is non-degenerate for any choice of k = 1, 2, ... (t1 , ..., tk ) I k \Dk (I). For Gaussian
stationary processes a sufficient condition for non-degeneracy is the spectral measure
not to be purely atomic (see Cramer and Leadbetter (1967) for a proof). The same
kind of argument permits to show that the conclusion remains if the spectral measure
is purely atomic and the set of its atoms has an acumulation point in IR. Sufficient
conditions for the finiteness of m are given also in Nualart & Wschebor (Lemma
1.2, 1991).
b) If instead of requiring the paths of the process X to be of class C , one relaxes
this condition up to a certain order of differentiability, one can still get upper and
lower bounds for P (M > u).
Theorem 2.3 Let X = {Xt : t I} be a real -valued stochastic process. Suppose
that pXt (x) is bounded for t I, x IR and that the paths of X are of class C p+1 .
Then
2K+1

if

(1)m+1

2K + 1 < p/2 : P (M > u) P (X0 > u) +


m=1

m
m!

and
2K

if

(1)m+1

2K < p/2 : P (M > u) P (X0 > u) +

m=1

m
.
m!

Note that all the moments in the above formulae are finite.
The proof is a straightforward application of Lemma 2.2 and Lemma 1.2 in
Nualart & Wschebor (1991).
When the level u is high, the results by Piterbag (1981, 1996), which were until
recently the sharpest known asymptotic bounds for the tail of the distribution of the
11

maximum on a fixed interval of general Gaussian stationary processes with regular


paths (for a refinement, see Azas, Bardet and Wschebor, 2000) can be deduced
from the foregoing arguments. Here, only the first term in the Rice series takes part
in the equivalent of P (M > u) as u +. More precisely, if 4 < , it is not
hard to prove that
u2 (1+)
2
(u) 1 (const)e 2 ,
2

2 (const)e

u2 (1+)
2

for a certain > 0. Lemma 3.2 implies that


u2 (1+)
2
(u) P (M > u) (const)e 2 ,
2

0 1 (u) +

(11)

which is Piterbargs result.

Computation of Moments

An efficient numerical computation of the factorial moments of crossings is associated to a fine description of the behaviour as the k-tuple (t1 , ..., tk ) approaches the
diagonal Dk (I), of the integrands
A+
t1 ,...,tk (u, ..., u) =

[0,+)m

x1 ...xm pXt1 ,...,Xtm ,Xt

,...,Xtm (u, ..., u, x1 , ...xm )dx1 ...dxm .

At1 ,...,tk (u) =

dx

[0,+)m

x1 ...xm pX0 ,Xt1 ,...,Xtm ,Xt

,...,Xtm (x, u, ..., u, x1 , ...xm )dx1 ...dxm .

that appear respectively in Rice formulae for the k th factorial moment of upcrossings or the k th factorial moment of upcrossings with the additional condition that
X0 u (see formula(2).
For example in Azas, Cierco and Croquette (1999) it is proved that if X is
Gaussian, stationary, centered and 8 < , then the integrand A+
s,t (u, u) in the
12

computation of 2 - the second factorial moment of the number of upcrossings satisfies:


3/2

A+
s,t (u, u)

1
(2 6 24 )
1 4
exp

u2
1296 (4 22 )1/2 2 22
2 4 22

(t s)4 ,

(12)

as t s 0.
(12) can be extended to non-stationary Gaussian processes obtaining an equivalence of the form:
A+
s,t (u, u)

J(t)(t s)4

as s, t t

(13)

where J(t) is a continuous non-zero function of t depending on u, that can be expressed in terms of the mean and covariance functions of the process and its derivatives. We give a proof of an equivalence of the form (13) in the next proposition.
One can profit of this equivalence to improve the numerical methods to compute
2 (the second factorial moment of the number of upcrossings restricted to X0
u). Equivalence formulae such as (12) or (13) can be used to avoid numerical
degeneracies near the diagonal D2 (I). Note that even in case X is stationary at the
departure, under conditioning on X0 , the process that must be taken into account
in the actual computation of the factorial moments of upcrossings that appear in
the Rice series(4) will be non-stationary, so that equivalence (13) is the appropriate
tool.
Proposition 3.1 Suppose that X is a Gaussian process with C 5 paths and that for
(2)
(3)
each t I the joint distribution of Xt , Xt , Xt , Xt does not degenerate.Then (13)
holds true.
1
a two-dimensional random vector having as proba2
s
bility distribution the conditional distribution of X
given Xs = Xt = u.
Xt
One has:
Proof. Denote by =

+ +
A+
pXs ,Xt (u, u)
s,t (u, u) = E 1 2

(14)

Put = t s and check the following Taylor expansions around the point s:
E (1 ) = m1 + m2 2 + L1 3

(15)

E (2 ) = m1 + m2 2 + L2 3

(16)

13

V ar () =

a 2 + b 3 + c 4 + 11 5
a 2

b+b
2

a 2

3 + d 4 + 12 5

b+b
2

3 + d 4 + 12 5

a 2 + b 3 + c 4 + 22 5

(17)

where m1 , m2 , m2 , a, b, c, d, a, b, c are continuous functions of s and L1 , L2 , 11 ,


12 , 22 are bounded functions of s and t. (15),(16) and (17) follow directly from
s
on the condition Xs = Xt = u.
the regression formulae of the pair X
Xt
Note that (as in Belyaiev, 1966 or Azas and Wschebor, 2000)
det V ar(Xs , Xt , Xs )T
det V ar(Xs , Xs , Xt Xs (t s)Xs )T
V ar(1 ) =
=
det V ar(Xs , Xt )T
det V ar(Xs , Xt Xs )T
A direct computation gives:
det V ar(Xs , Xt )T 2 det V ar(Xs , Xs )T

(18)

(2)

1 det V ar(Xs , Xs , Xs )T 2
V ar(1 )

4 det V ar(Xs , Xs )T
where denotes equivalence as 0. So,
(2)

1 det V ar(Xs , Xs , Xs )T
a=
4 det V ar(Xs , Xs )T
which is a continuous non-vanishing function for s I. Note that the coefficient of
. This follows either by
3 in the Taylor expansion of Cov(1 , 2 ) is equal to b+b
2
direct computation or noting that det V ar() is a symmetric function of the pair
s, t.
Put
(s, t) = det V ar()
The behaviour of (s, t) as s, t t can be obtained by noting that
(s, t) =

det V ar(Xs , Xt , Xs , Xt )T
det V ar(Xs , Xt )T

and applying Lemma 3.2 in Azas and Wschebor (2000) or Lemma 4.3, p.76 in
Piterbarg (1996) which provide an equivalent for the numerator, so that:
(s, t) (t) 6
14

(19)

with
(2)

(t) =

(3)

1 det V ar(Xet , Xet , Xet , Xet )T


144
det V ar(Xet , Xet )T

The non degeneracy hypothesis implies that (t) is continuous and non zero.
Then:
E

1+ 2+

1
1/2

2 [(s, t)]

xy exp
0

1
F (x, y) dxdy
2(s, t)

(20)

where
F (x, y) = V ar(2 )(x E(1 ))2 + V ar(1 )(y E(2 ))2 2Cov(1 , 2 )(x E(1 ))(y E(2 ))
Substituting the expansions (15), (16), (17) in the integrand of (20) and making
the change of variables x = 2 v, y = 2 w we get, as s, t t:
E 1+ 2+

5
2 (t)

1/2

vw exp
0

1
F (v, w) dvdw
2(t)

(21)

(t) can also be expressed in terms of the functions a, b, c, d, a , b , c :


(t) = ac + ca + 2ad

bb
2

and
2

F (v, w) = a (v m2 + w m2 ) + m21 (c + c + 2d) m1 (b b )(v + w m2 m2 )


The functions a, b, c, d, b, c, m1 , m2 that appear in these formulae are all evaluated
at the point t.
Replacing (21) and (18) into (14) one gets (13).
For k 3, the general behaviour of the functions At1 ,...,tk (u) and A+
t1 ,...,tk (u, ..., u)
when (t1 , ..., tk ) approaches the diagonal is not known. Proposition 3.3 , even though
it contains restrictive conditions (it requires E{Xt } = 0 and u = 0) can be applied
to improve the efficiency in the computation of the k th -factorial moments by means
of a Monte-Carlo method, via the use of important sampling. More precisely, when
15

k
computing the integral of A+
t1 ,...,tk (u) over I , instead of choosing at random the point
(t1 , t2 , ..., tk ) in the cube I k with a uniform distribution, we do it with a probability
law that has a density proportional to the function 1i<jk (tj ti )4 . For its proof
we will use the following auxiliary proposition, that has its own interest and extends
(19) to any k.

Proposition 3.2 Suppose that X = {Xt : t I} is a Gaussian process defined on


the interval I of the real line with C 2k1 paths, k an integer, k 2, and that the
(2k1)
joint distribution of Xt , Xt , ...., Xt
is non-degenerate for each t I. Then,

if t1 , t2 , ...., tk t :
(2k1) T

det V ar(Xt , Xt , ..., Xt


= det V ar(Xt1 , Xt1 , ..., Xtk , Xtk )
[2!.3!....(2k 1)!]2
T

(tj ti )8
1i<jk

(22)
Proof. With no loss of generality, we consider only ktuples (t1 , t2 , ...., tk ) such
that ti = tj if i = j.
Suppose f : I IR is a function of class C 2m , m 1, and t1 , t2 , ...., tm
are pairwise different points in I. We use the following notations for interpolating
polynomials:
Pm (t; f ) is the polynomial of degree 2m 1 such that
Pm (tj ; f ) = f (tj ) and Pm (t; f ) = f (tj ) for j = 1, ..., m.
Qm (t; f ) is the polynomial of degree 2m 2 such that
Qm (tj ; f ) = f (tj ) for j = 1, ..., m ; Qm (t; f ) = f (tj ) for j = 1, ..., m 1.
From Lemma 2.1 we know that
f (t) Pm (t; f ) =

f (t) Qm (t; f ) =

1
(t t1 )2 ....(t tm )2 f (2m) ()
(2m)!

1
(t t1 )2 ....(t tm1 )2 (t tm )f (2m1) ()
(2m 1)!

where
= (t1 , t2 , ...., tm , t), = (t1 , t2 , ...., tm , t)
16

(23)

(24)

and
min(t1 , t2 , ...., tm , t) , max(t1 , t2 , ...., tm , t).
Note that the function
g(t) = f (2m1) ((t1 , t2 , ...., tm , t)) =

(2m 1)! [f (t) Qm (t; f )]


(t t1 )2 ....(t tm1 )2 (t tm )

is differentiable at the point t = tm and differentiating in (24):


f (tm ) Qm (tm ; f ) =

1
(tm t1 )2 ....(tm tm1 )2 f (2m1) ((t1 , t2 , ...., tm , tm ))
(2m 1)!
(25)

Put
m = (t1 , t2 , ...., tm , tm ), m = (t1 , t2 , ...., tm , tm ).
Since Pm (t; f ) is a linear functional of
(f (t1 ), ..., f (tm ), f (t1 ), ..., f (tm ))
and Qm (t; f ) is a linear functional of
(f (t1 ), ..., f (tm ), f (t1 ), ..., f (tm1 ))
with coefficients depending (in both cases) only on t1 , t2 , ...., tm , t, it follows that:
= det V ar Xt1 , Xt1 , Xt2 P1 (t2 ; X), Xt2 Q2 (t2 , X), ...
T

..., Xtk Pk1 (tk ; X), Xtk Qk (tk ; X) =


1
(2) 1
, ....
= det V ar Xt1 , Xt1 , (t2 t1 )2 X1 , (t2 t1 )2 X(3)
2
2!
3!
1
1
(2k2)
..,
(tk t1 )2 ...(tk tk1 )2 Xk1 ,
(tk t1 )2 ...(tk tk1 )2 X(2k1)
k1
(2k 2)!
(2k 1)!
=

[2!...(2k 1)!]2

(tj ti )8
1i<jk

with
(2)

(2k2)

= det V ar Xt1 , Xt1 , X1 , X(3)


, ..., Xk1 , X(2k1)
2
k1

(2k1) T

det V ar(Xt , Xt , ..., Xt


as t1 , t2 , ...., tk t . This proves (22).
17

Proposition 3.3 Suppose that X is a centered Gaussian process with C 2k1 paths
and that for each pairwise distinct values of the parameter t1 , t2 , ..., tk I the joint
(2k1)
distribution of (Xth , Xth , ...., Xth
, h = 1, 2, ..., k) is non-degenerate. Then, as

t1 , t2 , ..., tk t :

A+
t1 ,...,tk (0, ..., 0) Jk (t )

(tj ti )4
1i<jk

where Jk (t) is a continuous non-zero function of t.


Proof. Introduce the notation
(k)

Dk (t) = det V ar(Xt , Xt , ...., Xt )T


In the same way as in the proof of Proposition 3.2 and with a simpler computation,
it follows that as t1 , t2 , ..., tk t
det V ar(Xt1 , Xt2 , ..., Xtk )T

1
[2!.....(k 1)!]2

(tj ti )2 . Dk1 (t ).

(26)

1i<jk

For pairwise different values t1 , t2 , ..., tk , let Z = (Z1 , ..., Zk )T be a random vector
having the conditional distribution of (Xt1 , ...., Xtk )T given Xt1 = Xt2 = ... = Xtk =
0. The (Gaussian) distribution of Z is centered and we denote its covariance matrix
by . Also put:
1 =

1
ij
det()

i,j=1,...,k

ij being the cofactor of the position (i, j) in the matrix . Then, one can write:
+
+
A+
. pXt1 ,...,Xtk (0, ..., 0)
t1 ,...,tk (0, ..., 0) = E Z1 ...Zk

(27)

and
A+
t1 ,...,tk (0, ..., 0) =

1
k
2

(2) (det())

1
2

x1 ...xk exp
(R+ )k

F (x1 , ..., xk )
2. det()

dx1 ...dxk
(28)

where
k

ij xi xj .

F (x1 , ..., xk ) =
i,j=1

18

Letting t1 , t2 , ..., tk t and using (22) and (26) we get:


det() =

det V ar(Xt1 , Xt1 , ..., Xtk , Xtk )T

det V ar(Xt1 , ..., Xtk )T

1
[k!.....(2k 1)!]2

(tj ti )6 .
1i<jk

D2k1 (t )
.
Dk1 (t )

We consider now the behaviour of the ij (i, j = 1, ..., k). Let us first look at 11 .
Using the same method as above, now applied to the cofactor of the position (1, 1)
in , one has:

11

det V ar(Xt1 , Xt2 , ..., Xtk , Xt2 , ..., Xtk )T


=

det V ar(Xt1 , ..., Xtk )T

1
[2!...(2k2)!]2

ti )8

2i<jk (tj

1
[2!.....(k1)!]2

2hk (t1

2
1i<jk (tj ti )

1
=
[k!...(2k 2)!]2

(tj ti )
2i<jk

th )4 D2k2 (t )

Dk1 (t )
6

(t1 th )
2hk

=
D2k2 (t )
Dk1 (t )

A similar computation holds for ii , i = 2, ..., k.


Consider now 12 . One has:
det E (Xt1 , Xt2 , ..., Xtk , Xt2 , ..., Xtk )T .(Xt1 , Xt2 , ..., Xtk , Xt1 , Xt3 ..., Xtk )
=
det V ar(Xt1 , ..., Xtk )T
det E (Xt2 , Xt2 , ..., Xtk , Xtk , Xt1 )T .(Xt1 , Xt1 , Xt3 , Xt3 , ..., Xtk , Xtk , Xt2 )
=

det V ar(Xt1 , ..., Xtk )T


12

1
[k!...(2k 2)!]2

(tj ti )6
3i<jk

(t1 th )4 (t2 th )4 (t2 t1 )2 .


3hk

A similar computation applies to all the cofactors ij , i = j.


Perform in the integral in (28) the change of variables
i=k

(ti tj )2 . yj

xj =
i=1,i=j

19

j = 1, ..., k

D2k2 (t )
Dk1 (t )

and the integral becomes:


(tj ti )8

y1 ...yk exp
(R+ )k

1i<jk

1
G(y1 , ..., yk )
2. det()

dy1 ...dyk

where
k

h=k

G(y1 , ..., yk ) =

i,j=1

h=k

ij

(th tj )2

(th ti )
h=1,h=i

yi yj .

h=1,h=j

so that, as t1 , t2 , ..., tk t
G(y1 , ..., yk )
D2k2 (t )
[(2k 1)!]2
det()
D2k1 (t )

i=k

yi
i=1

Now, passage to the limit under the integral sign in (28), which is easily justified by
application of the Lebesgue Theorem, leads to
E

Z1+ ...Zk+

1
(2)

k
2

|tj ti |

k!...(2k 1)!

1i<jk

Dk1 (t )
D2k1 (t )

1
2

Ik ( )

where Ik (), > 0 is

y1 ...yk exp

Ik () =
(R+ )k

i=k

yi
i=1

dy1 ...dyk = 1 Ik (1)


k

and
= [(2k 1)!]2

D2k2 (t )
D2k1 (t )

Replacing into (27) one gets the result with


Jk (t) =

2!...(2k 2)!

Ik (1)

[2(2k 1)!]2k1 [D2k1 (t)] 2

This finishes the proof.


20

D2k1 (t)
D2k2 (t)

Numerical examples

4.1

Comparison with Monte-Carlo method

First, let us compare the numerical computation based upon Theorem 2.1 with
the Monte-Carlo method based on the simulation of the paths. We do this for
stationary Gaussian processes that satisfy the hypotheses of Theorem 2.2 and also
the non-degeneracy condition that ensures that one is able to compute the factorial
moments of crossings by means of Rice formulae.
Suppose that we want to compute P (M > u) with an error bounded by , where
> 0 is a given positive number.
To proceed by simulation, we discretize the paths by means of a uniform partition
{tj := j/n, j = 0, 1, ..., n}. Denote
M (n) := sup Xtj .
0jn

Using Taylors formula at the time where the maximum M of X(.) occurs, one
gets :
0 M M (n) X

/(2n

It follows that
0 P (M > u) P (M (n) > u) = P (M > u, M (n) u)
P (u < M u + X

/(2n

)).

If we admit that the distribution of M has a locally bounded density (which is a


well-known fact under the mentioned hypotheses) the above suggests that a number
of n = (const) 1/2 points is required if one wants the mean error P (M > u)
P (M (n) > u) to be bounded by .
On the other hand, to estimate P (M (n) > u) by Monte-Carlo with a mean square
error smaller than , we require the simulation of N = (const) 2 Gaussian n-tuples
(Xt1 , ..., Xtn ) from the distribution determined by the given stationary process. Performing each simulation demands (const)nlog(n) elementary operations (Dietrich
and Newsam, 1997). Summing up, the total mean number of elementary operations
required to get a mean square error bounded by in the estimation of P (M > u)
has the form (const) 5/2 log(1/).
Suppose now that we apply Theorem 2.1 to a Gaussian stationary centered process verifying the hypotheses of Theorem 2.2 and the non-degeneracy condition.
21


The bound for m
in Equation (10) implies that computing a partial sum with
(const)log(1/) terms assures that the tail in the Rice series is bounded by . If
one computes each m by means of a Monte-Carlo method for the multiple integrals
appearing in the Rice formulae, then the number of elementary operations for the
whole procedure will have the form (const) 2 log(1/). Hence, this is better than
simulation as tends to zero.
As usual, for given > 0, the value of the generic constants decides the comparison between both methods.
More important is the fact that the enveloping property of the Rice series implies
that the actual number of terms required by the application of Theorem 2.1 can

. More
be much smaller than the one resulting from the a priori bound on m

precisely, suppose that we have obtained each numerical approximation m


of m
with a precision

|m
m | ,

and that we stop when

m
0 +1
.
(m0 + 1)!

(29)

Then, it follows that


m

m+1

(1)
m=1

m
m+1 m

(1)
(e + 1).
m! m=1
m!

Putting = /(e + 1), we get the desired bound. In other words one can profit of
the successive numerical approximations of m to determine a new m0 which turns
out to be - in certain interesting examples - much smaller than the one deduced

from the a priori bound on m


.

4.2

Comparison with usuals bounds

Next, we will give the results of the evaluation of P (MT > u) using up to three
terms in the Rice series in a certain number of typical cases. We compare these
results with the classical evaluation using what is often called the Davies (1977)
bound. In fact this bound seems to have been widely used since the work of Rice
(1944). It is an upper-bound with no control on the error, given by:
P (M > u) P (X0 > u) + E Uu ([0, T ])
22

(30)

The above mentioned result by Piterbarg (11) shows that in fact, for fixed T and
high level u this bound is sharp. In general, using more than one term of the Rice
series supplies a remarkable improvement in the computation.
We consider several stationary centered Gaussian processes listed in the following
table, where the covariances and the corresponding spectral densities are indicated.
process
X1
X2
X3
X4

covariance
1 (t) = exp(t2 /2)
2 (t) = (ch(t))1
1
3 (t) = 31/2 t sin(31/2 t)

4 (t) = e| 5t| ( 35 |t|3 + 2t2 + 5|t| + 1)

spectral density
f1 (x) = (2)1/2 exp(x2 /2)
1
f2 (x) = 2ch((x)/2)
f3 (x) = 121/2 1I{3<x<3}
4
f4 (x) = 105 (5 + x2 )4

In all cases, 0 = 2 = 1 to be able to compare the various results. Note that 1


and 3 have analytic extensions to the whole plane, so that Theorem 2.2 applies to
the processes X1 and X3 . On the other hand, even though all spectral moments of
the process X2 are finite, Theorem 2.2 applies only for a length less than /4 since
the meromorphic extension of 2 (.) has poles at the points i/2 + ki, k an integer.
With respect to 4 (.) notice that it is obtained as the convolution 5 5 5 5
where 5 (t) := e|t| is the covariance of the Ornstein-Uhlenbeck process, plus a
change of scale to get 0 = 2 = 1. The process X4 has 6 < and 8 = and its
paths are C 3 . So, for the processes X2 and X4 we apply Theorem 2.3 to compute
F (T, u).
Table 1 contains the results for T = 1, 4, 6, 8, 10 and the values u = 2, 1, 0, 1, 2, 3
using three terms of the Rices series. A single value is given when a precision of 102
is met; otherwise the lower-bound and the upper-bound given by two or three terms
of the Rices series respectively, are diplayed. The calculation uses a deterministic
evaluation of the first two moments 1 and 2 using program written by Cierco Croquette and Delmas (2000) and a Monte-Carlo evaluation of 3 . In fact, for simpler
and faster calculation, 3 has been evaluated instead of 3 providing a slightly weaker
bound.
In addition Figures 1 to 4 show the behavior of four bounds : namely, from the
highest to the lowest
The Davies bound (D) defined by formula (30)

23

u
-2

-1

Length of the time interval


1
4
6
8
0.99 1.00 1.00
1.00
0.99 1.00 1.00
1.00
1.00 1.00 1.00
1.00
0.99 1.00 1.00
1.00
0.93 1.00 1.00
0.99-1.00
0.93 0.99 1.00
0.99-1.00
0.93 1.00 1.00
1.00
0.93 1.00 1.00
0.99-1.00
0.65 0.90 0.95
0.95-0.99
0.65 0.89 0.94-0.95 0.93-0.99
0.656 0.919 0.97
0.98-0.99
0.65 0.89 0.94-0.95 0.94-0.99
0.25 0.49 0.61
0.69-0.70
0.25 0.48 0.58
0.66-0.68
0.26 0.51 0.62
0.71
0.25 0.48 0.59
0.67-0.69
0.04 0.11 0.15
0.18
0.04 0.11 0.14
0.18
0.04 0.11 0.15
0.19
0.04 0.11 0.14
0.18
0.00 0.01 0.01
0.02
0.00 0.01 0.01
0.02
0.00 0.01 0.01
0.02
0.00 0.01 0.01
0.02

T
10
1.00
1.00
1.00
1.00
0.98-1.00
0.98-1.00
0.99
0.98-1.00
0.90-1.00
0.87-1.00
0.92-1.00
0.88-1.00
0.74-0.77
0.70-0.76
0.76-0.78
0.72-0.77
0.22
0.21
0.22
0.22
0.02
0.02
0.02
0.02

Table 1: Values of P (M > u) for the different processes. Each cell contains, from
top to bottom, the values corresponding to stationary centered Gaussian processes
with covariances 1 , 2 , 3 and 4 respectively. The calculation uses three terms
of the Rice series for the upper-bound and two terms for the lower-bound. Both
bounds are rounded up to two decimals and when they differ, both displayed.

24

One, three, or two terms of the Rice series (R1, R3, R2 in the sequel) that is
K

P (X0 > u) +

(1)m+1
m=1

m
m!

with K = 1, 3 or 2.
Note that the bound D differs from R1 due to the difference between 1 and
1 . These bounds are evaluated for T = 4, 6, 8, 10, 15 and also for T = 20 and
T = 40 when they fall in the range [0, 1]. Between these values an ordinary spline
interpolation has been performed.
In addition we illustrate the complete detailed calculation in three chosen cases.
They correspond to zero and positive levels u. For u negative, it is easy to check
that the Davies bound is often greater than 1, thus non informative.
For u = 0, T = 6, = 1 , we have P (X0 > u) = 0.5, 1 = 0.955, 1 = 0.602,
2 /2 = .150, 3 /6 = 0.004, so that:
D = 1.455 , R1 = 1.103 , R3 = 0.957 , R2 = 0.953
R2 and R3 give a rather good evaluation of the probability, the Davies bound
gives no information.
For u = 1.5, T = 15, = 2 , we have P (X0 > u) = 0.067, 1 = 0.517,
1 = 0.488, 2 /2 = 0.08, 3 /6 = 0.013, so that:
D = 0.584 , R1 = 0.555 , R3 = 0.488 , R2 = 0.475
In this case the Davies bound is not sharp and a very clear improvement is
provided by the two bounds R2 and R3.
For u = 2, T = 10, = 3 , we have P (X0 > u) = 0.023, 1 = 0.215,
1 = 0.211, 2 /2 = 0.014, 3 /6 = 3104 , so that:
D = 0.238 , R1 = 0.234 , R3 = 0.220. , R2 = 0.220
In this case the Davies bound is rather sharp.
As a conclusion, these numerical results show that it is worth using several terms
of the Rice series. In particular the first three terms are relatively easy to compute
and provide a good evaluation of the distribution of M under a rather broad set of
conditions.
25

Acknowledgements
We thank C. Delmas for computational assistance. This work has received a support
ECOS program U97E02.

References

Adler, R.J. (1990). An Introduction to Continuity, Extrema and Related Topics


for General Gaussian Processes, IMS, Hayward, Ca.
Adler, R.J. (2000) On excursion sets, tube formulae, and maxima of random
fields. Annals of Applied Probability. To appear
Azas, J-M., Cierco, C. and Croquette, A. (1999). Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process.
ESAIM: Probability and Statistics, 3, 107-129.
Azas, J-M. and Wschebor M. (1997). Une formule pour calculer la distribution
du maximum dun processus stochastique. C.R. Acad. Sci. Paris, t. 324, serieI,
225-230.
Azas, J-M. and Wschebor M. (1999). Regularite de la loi du maximum de
processus gaussiens reguliers. C.R. Acad. Sci. Paris, t. 328, Ser. I, 333-336.
Azas, J-M and Wschebor, M. (2000). On the Regularity of the Distribution
of the Maximum of one-parameter Gaussian Processes, accepted for publication at
Probability Theory and Related Fields.
Azas, J-M, Bardet, J-M and Wschebor, M. (2000). On the Tails of the Distribution of the Maximum of a Smooth Stationary Gaussian Process, submitted.
Belyaev, Yu. (1966). On the number of intersections of a level by a Gaussian
Stochastic process. Theory Prob. Appl., 11, 106-113.
Berman, S.M. (1985a). An asymptotic formula for the distribution of the maximum of a Gaussian process with stationary increments. J. Appl. Prob., 22,454-460.
Berman, S.M. (1985b). The maximum of a Gaussian process with nonconstant
variance. Ann. Inst. H. Poincare Probab. Statist., 21, 383-391.
Berman, S.M. (1992). Sojourns and extremes of stochastic processes, The Wadworth and Brooks, Probability Series.
Berman, S.M. and Kono, N. (1989). The maximum of a gaussian process with
non-constant variance: a sharp bound for the distribution of the tail. Ann. Probab.,
17, 632-650.
Borell, C. (1975). The Brunn-Minkowski inequality in Gauss space. Invent.
Math., 30, 207-216.

26

Cramer, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic


Processes, J. Wiley & Sons, New-York.
Cierco, C. (1996). Probl`emes statistiques lies `a la detection et `a la localisation
dun g`ene `a effet quantitatif. PHD dissertation. University of Toulouse, France.
Cierco-Ayrolles, C., Croquette, A. and Delmas, C. (2000). Computing the Distribution of the Maximum of Regular Gaussian Processes. Submitted.
Cressie, N (1980). The asymptotic distribution of the scan statistic under uniformity. Ann. Probab, 8, 828-840.
Cuzick, J. (1975). Conditions for finite moments of the number of zero crossings
for Gaussian processes.Ann. Probab, 3, 849-858.
Darling, D. A.(1983). On the supremum of certain Gaussian processes. Ann.
Probab. 11, 803-806.
Davies, R.B. (1987). Hypothesis testing when a nuisance parameter is present
only under the alternative. Biometrika 74, 33-43.
Davis, J. D. (1975). Interpolation and approximation. Dover, New York.
DeLong, D. M. (1981). Crossing probabilities for a square root boundary by a
Bessel Process. Communication in Statistics-Theory and methods. A10, 2197-2213.
Diebolt, J. and Posse, C. (1996). On the Density of the Maximum of Smooth
Gaussian Processes.Ann. Probab., 24, 1104-1129.
Dietrich, C. R. and Newsam G. N. (1997.) Fast and exact simulation of stationary Gaussian processes throught circulant embedding of the covartiance matrix. .
SIAM J. Sci. Comput. 18, 4, 1088-1107.
Fernique, X.(1974). Regularite des trajectoires des fonctions aleatoires gaussiennes.Ecole dEte de Probabilites de Saint Flour. Lecture Notes in Mathematics,
480, Springer-Verlag,New-York.
Goldman, M. (1971). On the first passage of the integrated Wiener process.The
Annals Math. Statist., 42, 6, 2150-2155.
Lachal, A. (1991). Sur le premier instant de passage de lintegrale du mouvement
Brownien. Ann. Inst. H. Poincare Probab. Statist. 27, 385-405.
Landau, H.J. and Shepp, L.A (1970). On the supremum of a Gaussian process.
Sankya Ser. A 32, 369-378.
Leadbetter, M.R., Lindgren, G. and Rootzen, H. (1983). Extremes and related
properties of random sequences and processes. Springer-Verlag, New-York.
Ledoux, M. (1996). Isoperimetry and Gaussian Analysis. Ecole dEte de Probabilites de Saint-Flour 1994. Lecture Notes in Math. 1648, 165-264. Springer-Verlag.
New York.
Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces, SpringerVerlag, New-York.
27

Lifshits, M.A.(1995). Gaussian random functions . Kluwer, The Netherlands.


Lindgren, G. (1972). Wave-length and Amplitude in Gaussian Noise . Adv.
Appl. Prob., 4, 81-108.
McKean, H.P. (1963). A winding problem for a resonant driven by a white noise.
J. Math. Kyoto Univ., 2, 227-235.
Marcus, M.B. (1977). Level Crossings of a Stochastic Process with Absolutely
Continuous Sample Paths, Ann. Probab., 5, 52-71.
Marcus, M.B. and Shepp, L.A. (1972). Sample behaviour of Gaussian processes.
Proc. Sith Berkeley Symp. Math. Statist. Prob., 2, 423-442.
Miroshin, R. N. (1974). Rice series in the theory of random functions. Vestnik
Leningrad Univ. Math., 1, 143-155.
Miroshin, R. N. (1977). Condition for finiteness of moments of the number of
zeros of stationary Gaussian processes. Th. Prob. Appl., 22 , 615-624.
Miroshin, R. N. (1983). The use of Rice series. Th. Prob. Appl. 28, 714-726.
Nualart, D. and Wschebor, M. (1991). Integration par parties dans lespace de
Wiener et approximation du temps local. Prob. Th. Rel. Fields, 90, 83-109.
Piterbarg, V; I. (1981). Comparison of distribution functions of maxima of
Gaussian processes. Th, Proba. Appl., 26, 687-705.
Piterbarg, V; I. (1996). Asymptotic Methods in the Theory of Gaussian Processes
and Fields. American Mathematical Society. Providence, Rhode Island.
Qualls, C and Watanabe, H. (1973). Asymptotic properties of Gaussian process.
Ann. Math. Statistics, 43, 580-596
Rice, S.O. (1944-1945). Mathematical Analysis of Random Noise.Bell System
Tech. J., 23, 282-332; 24, 45-156.
Shepp, L. A. (1971). First passage time for a particular Gaussian process. The
Ann. of Math. Stat., 42, 946-951.
Shepp, L. A. (1979). The joint density of the maximum and its location for a
Wiener process with drift. J. Appl. Prob. 16, 423-427.
Shepp, L. A. and Slepian, D.(1976). First-passage time for a particular stationary
periodic Gaussian process. J. Appl. Prob., 13, 27-38.
Slepian, D. (1961). First passage time for a particular Gaussian process.Ann.
Math. Statist., 32, 610-612.
Slepian, D. (1962). The one-sided barrier problem for Gaussian noise.Bell System
Tech. J. 42, 463-501.
Sun, J. (1993). Tail Probabilities of the Maxima of Gaussian Random Fields,
Ann. Probab., 21, 34-71.
Talagrand, M. (1988). Small tails for the supremum of a Gaussian process. Ann.
Inst. H. Poincare, Ser. B, 24, 2, 307-315.
28

Talagrand, M. (1996). Majorising measures: the general chaining. Ann. Probab.


24, 1049-1103.
Tsirelson, V.S. (1975). The Density of the Maximum of a Gaussian Process.
Th. Probab. Appl., 20, 817-856.
Weber, M. (1985). Sur la densite du maximum dun processus gaussien. J. Math.
Kyoto Univ., 25, 515-521.
Wschebor, M. (1985). Surfaces aleatoires. Mesure geometrique des ensembles de
niveau.Lecture Notes in Mathematics, 1147, Springer-Verlag.
Wschebor, M. (2000). Sur la loi du sup de certains processus Gaussiens non
bornes. Accepted for publication by C.R. Acad. Sci. Paris.
Ylvisaker, D. (1968). A Note on the Absence of Tangencies in Gaussian Sample
Paths.The Ann. of Math. Stat., 39, 261-262.

29

, u =1

0.9

0.8

Values of the bounds

0.7

0.6

0.5

0.4

0.3

0.2

0.1

8
10
12
Length of the interval

14

16

18

20

Figure 1: For the process with covariance 1 and the level u = 1, representation of
the three upper-bounds D, R1, R3 and the lower-bound R2 (from top to bottom)
as a function of the length T of the interval

2 u =0
1

0.95

0.9

Values of the bounds

0.85

0.8

0.75

0.7

0.65

0.6

0.55

0.5

10

15

Length of the interval

Figure 2: For the process with covariance 2 and the level u = 0, representation of
the three upper-bounds: D, R1, R3 and the lower-bound R2 (from top to bottom)
as a function of the length T of the interval
30

u =2

0.9

0.8

0.7

Values of the bounds

0.6

0.5

0.4

0.3

0.2

0.1

10

15

20
25
Length of the interval

30

35

40

Figure 3: For the process with covariance 3 and the level u = 2, representation of
the three upper-bounds: D, R1, R3 and the lower-bound R2 (from top to bottom)
as a function of the length T of the interval

4 u =1.5
1

0.9

0.8

Values of the bounds

0.7

0.6

0.5

0.4

0.3

0.2

0.1

8
10
12
Length of the interval

14

16

18

20

Figure 4: For the process with covariance 4 and the level u = 1.5, representation of
the three upper-bounds: D, R1, R3 and the lower-bound R2 (from top to bottom)
as a function of the length T of the interval
31

Second order approximation of the tail of


the distribution of the maximum of a
Gaussian process.
Mario Wschebor
Centro de Matem
atica. Facultad de Ciencias.
Universidad de la Rep
ublica.
E mail: wschebor@cmat.edu.uy
Fort Collins, June 22-26, 2009

X = {X(t) : t T } real-valued random field.


Most of what we will say refers to Gaussian fields.

MT = sup{X(t) : t T }

F (x) = P (MT x).

The computation of F (x) by means of a closed formula is known


only for a short list of processes,in which an actual formula exists
for the distribution of M = M[0,T ].
1

The methods to find formulas for the distribution of the supremum of these processes, are ad hoc, hence non transposable to
more general random functions, even in the Gaussian context.
Given the interest in the distribution of the random variable MT ,
arising in a diversity of theoretical and technical questions, a large
body of mathematics has been developed beyond these particular formulas.

INEQUALITIES.
We give some fundamental examples for Gaussian processes.
Assume that X is centered Gaussian and there exists a countable
subset D T such that almost surely MT = suptD X(t). [In
particular, this condition holds true if X is separable].
2(t) = E[X 2(t)].
2

Theorem 1 (C. Borell, Sudakov, Tsirelson, 1974-1975) Assume


that P(MT < ) = 1.
Then:
T2 = sup 2(t) < +
tT

and for every u > 0


2

1u
2
2

P(|MT (MT )| > u) 2[1 (u/T )] e

(1)

(Z) denotes a median of the probability distribution of the random variable Z.


[COMMENTS ON THE HYPOTHESIS: 1) RELATIONSHIP WITH
THE GAUSSIAN ZERO OR ONE LAW; 2) IF T IS A COMPACT
SEPARABLE TOPOLOGICAL SPACE AND THE PATHS OF
THE PROCESS ARE CONTINUOUS, ALL THE HYPOTHESES ARE SATISFIED]
3

Theorem 2 (Ibragimov-Sudakov-Tsirelson, 1976) Assume that


the process X satisfies the same hypotheses as in Theorem 1.
Then:
1) E (|MT |) < .
2) For every u > 0,
P(|MT E (MT )| > u) 2 exp

1 u2

.
2
2 T

(2)

A Corollary is the following: under the same hypotheses of Th.


1 or 2, for each > 0 there exists a positive constant C such
that for all u > 0 :
1 u2
P(|MT | > u) C exp
.
2
2 T +

(3)

Grosso modo, this says that the tail of the distribution of the
random variable MT is bounded (except for a multiplicative constant) by the value of the centered normal density having variance larger than, and arbitrarily close to, T2 .
The problem is that C can grow (and tend to infinity) as
decreases to zero. Even for fixed , in general, one can only
have rough bounds for C. This implies serious limitations for
the use of these inequalities in Statistics and in other fields.
5

SOME BRIEF COMMENTS ON PROOFS.

Isoperimetric properties of the Gaussian law

Itos formula and the proof of Theorem 2

Ferniques direct proof of the Corollary.

These inequalities are essential for the development of the mathematical theory. However, in a wide number of applications, the
general situation is that these inequalities are not good enough,
one reason being that they depend on certain constants (the
expectation or the median of MT ) that one is unable to estimate
or for which estimations differ substantially from the true values.
As a consequence, the bounds become exponentially larger than
the true values, as u +.
Since the 1990s several methods have been introduced with the
aim of obtaining more precise results:
Examples: the double sum method (Piterbarg, 1996); the EulerPoincar
e Characteristic approximation (EPC, Taylor, Takemura
and Adler, 2005, Adler and Taylors book, 2007); the tube
method (Sun, 1993), the use of Rice series (Miroshin, 1984,
Azas-MW, 2002), the record method (Rychlik,Mercadier, 2005).
See the book by Azas-MW, Wiley, 2009 for a more detailed account.
7

In general, one would like to write:


1 u2
P{M > u} = A(u) exp
+ B(u)
(4)
2
2
where A(u) is a known function having polynomially bounded
growth as u +, 2 = suptT Var(X(t)) and B(u) is an error bounded by a centered Gaussian density with variance 12,
12 < 2.

We call the first (respectively the second) term in the righthand side of (4) the first (resp second) order approximation of
P{M > u}.

First order approximation has been considered by Taylor, Takemura and Adler (2005) and also Adler and Taylor (2007) by
means of the expectation of the EPC of the excursion set Eu :=
{t S : X(t) > u}. This works for large values of u. The same
authors have considered the second order approximation, that is,
how fast does the difference between P{M > u} and the expected
EPC tend to zero when u +.

As far as I know, the only known result giving a precise description of the second order term for the asymptotics of P(MT > u)
as u + is the following (Piterbarg, 1981 for sufficiently small
T and Azas-Bardet-MW, 2001 for general T ):
Let X be a one-parameter centered Gaussian stationary process,
satisfying certain regularity conditions. Then, as u +:
2
T (u)
P(MT > u)= 1 (u) +
2

3 3(4 2
)9/2 T
4
2
[1 + o(1)] .

u
2
5
9/2
u
4 2
22 (26 2
4)
(resp. ) denotes the standard normal distribution (resp. density). k is the k-th spectral moment.
10

THE DIRECT METHOD. Assumptions and notations

X = {X(t) : t S} denotes a real-valued Gaussian field defined


on the parameter set S. The domain S will have some geometric regularity and the paths of the random function will satisfy
some differentiability conditions.

[A1] (Geometry of S): - S is a compact subset of Rd


- S is the disjoint union of Sd, Sd1..., S0, Sj is an orientable C 3
manifold of dimension j without boundary. Sd0 , is the non empty
face having largest dimension. j denotes the jdimensional geometric measure on Sj .
- Each Sj has an atlas such that the second derivatives of the
11

inverse functions of all charts are bounded by a fixed constant


(so that the maximum curvature is bounded).

Conditions on the random field:


- [A2] : X is in fact defined on an open set containing S and has
C 2 paths
- [A3] : for every t S the distribution of X(t), X (t) does
not degenerate; for every s, t S, s = t, the distribution of
X(s), X(t) does not degenerate.
- [A4] : Almost surely the maximum of X(.) on S is attained at
a single point.
- For t Sj , Xj (t), Xj,N (t) denote respectively the derivative
along Sj and the normal derivative. The tangent space is Tt,j
and its orthogonal complement Nt,j .
[A5] : Almost surely, for every j = 1, . . . , d there is no point t in
Sj such that Xj (t) = 0, det(Xj (t)) = 0.

Theorem 3 Let M = maxtS X(t). Under assumptions A1 to


A5, the distribution of M has the density
E 1IAx X(t) = x pX(t)(x)

pM (x) =
tS0
d

E | det(Xj (t))| 1IAx X(t) = x, Xj (t) = 0 pX(t),X (t)(x, 0)j (dt),


j
j=1 Sj
(5)
where Ax = {M x}.
+

12

REMARKS ON THEOREM 3.

One should be precise about the meaning of the right-hand


side of (5), since the ingredients in the integrand depend on
the parametrization of the manifold Sj . One can show that
locally the integral is independent of the parametrization and
then extend each integral to the whole Sj .

Theorem 3 implies, in particular, the existence of a continuous density of the distribution of M . For one-parameter
processes, this kind of formula can be iterated and used to
prove higher order differentiability of F (x). This is a hard
and very interesting subject.
13

Formula (5) is only implicit (M appears on the right) but can


be used to get bounds for pM (x), hence for the tail 1 F (x),
on integrating once.

The proof of this theorem is based on a variant of Rice Formula, which permits to write as an integral the expectation
of the total mass of weighted roots of a random field.[See
Azas-MW, 2009, chapters 6 and 7].

The theorem can be extended to certain classes of nonGaussian random fields.

A general bound for pM


For t Sj , j d0, Ct,j is the closed convex cone generated by
the set of directions:
t sn
{ Rd : = 1 ; sn S, (n = 1, 2, . . .), sn t,
},
t sn
whenever this set is non-empty and Ct,j = {0} if it is empty.
Ct,j the dual cone of Ct,j , that is:
Ct,j := {z Rd : z, 0 for all Ct,j }.
These definitions easily imply Tt,j Ct,j and Ct,j Nt,j . Also
j = d0, Ct,j = Nt,j .
X(.) has an extended outward derivative at the point t in Sj ,
j d0 if Xj,N (t) Ct,j .
14

Theorem 4 Under assumptions A1 to A5 above:

(a) pM (x) p(x) where


p(x) :=
tS0

E 1IX (t)C X(t) = x pX(t)(x)


t,0

d0

+
j=1 Sj

E | det(Xj (t))| 1IX (t)C X(t) = x, Xj (t) = 0


t,j
j,N
pX(t),X (t)(x, 0)j (dt) (6)
j
+

(b) P{M > u}

p(x)dx.

(a) follows from Theorem 3 and the observation that if t Sj ,


one has {M X(t)} {Xj,N (t) Ct,j }. (b) is an obvious consequence of (a).
The actual interest of this Theorem depends on the feasibility
of computing p(x). It turns out that this can be done in some
relevant cases.The results can be compared with the approximation of P{M > u} by means of u+ pE (x)dx given by Adler and
Taylor (2007) and Taylor, Takemura and Adler (2005), where
pE (x) :=
tS0
d0

+
j=1

E 1IX (t)C X(t) = x pX(t)(x)


t,0

(1)j
Sj

E det(Xj (t)) 1IX

j,N (t)Ct,j

X(t) = x, Xj (t) = 0

pX(t),X (t)(x, 0)j (dt). (7)


j

Under certain conditions, u+ pE (x)dx is the expected value of


the Euler-Poincar
e Characteristic of the excursion set Eu. The
advantage of pE (x) over p(x) is that one can have nice expressions for it for a large set of random fields.
Conversely p(x) has the obvious advantage that it is an upperbound of the true density pM (x) for every x-value, so that it
provides an upper-bound for 1 F (u) = u+ pM (x)dx for every
u, whereas the EPC-approximation only works for large u. We
will also see that in relevant cases, presently known second-order
results are more precise for the direct method.

Computing p(x) for stationary isotropic Gaussian fields


We now assume that the process X is centered Gaussian, with
covariance function
E X(s)X(t) =

st 2 ,

(8)

where : R+ R is of class C 4 . Without loss of generality, we


assume that (0) = 1 and put = (0), = (0). Assumption (8) is equivalent to saying that the law of X is invariant
under orthogonal linear transformations and translations of the
underlying parameter space Rd.
We will also assume that the set S has a polyhedral shape. i.e.
each Sj (j = 1, . . . , d) is a union of subsets of affine manifolds of
dimension j in Rd.
15

Theorem 5 Assume that the random field X is centered Gaussian, satisfies conditions A1-A5 of Chapter and has a covariance
having the form (8). Let S have polyhedral shape. Then,
p(x) = (x)

d0

0(t) +
tS0

j=1

| | j/2
H j (x) + Rj (x) gj

(9)

- gj = Sj j (t)j (dt), j (t) is the normalized solid angle of the


cone Ct,j in Nt,j :
j (t) =

dj1(Ct,j S dj1)

d(t) = 1.

dj1

(S dj1)

for j = 0, . . . , d 1,

(10)
(11)

For convex or other usual polyhedra j (t) is constant for t Sj ,


so that gj is equal to this constant multiplied by j (Sj ).

- Hj (resp. H j ) are the usual (resp. probabilistic) Hermite polynomials.


j

- Rj (x) =

2 2 ((j+1)/2 +
Tj (v) exp

| |

v := (2)1/2 (1 2)1/2y x

y2
2

dy

with := | |( )1/2, (12)

j1

Hj (v)
Hk2(v) v2/2
e
j
Ij1(v),
Tj (v) :=
k
2 k!
2 (j 1)!
k=0

2
In(v) = 2ev /2

[ n1
2 ]

(13)

(n 1)!!
Hn12k (v)
(14)
(n 1 2k)!!
k=0

n
2
+ 1I{n even} 2 (n 1)!! 2(1 (x))
2k

REMARKS ON THE THEOREM


1.- The expressions one obtains are complicated for higher order
dimension of the parameter. However, they are explicit and easy
to compute recursively.
2.- The proof requires some ingredients of analytic random matrix theory, namely, the distribution of the eigenvalues of a GOE
matrix.
3.- The principal term is
(x)

tS0

d0

0(t) +
j=1

| | j/2
H j (x) gj ,

(15)

which is the product of a standard normal density times a polynomial with degree d0. Integrating once, we get -in this casethe formula for the expectation of the EPC of the excursion set
given in Adler and Taylor (2007).
16

TWO EXAMPLES OF THEOREMS ABOUT SECOND


ORDER APPROXIMATION.

Theorem 6 Assume that the process X satisfies conditions A1


-A5. With no loss of generality, let maxtS Var(X(t)) = 1. In
addition, assume that set Sv of points t S where the variance
of X(t) attains its maximal value is contained in Sd0 (d0 > 0) the
non-empty face having largest dimension and that no point in Sv
is a boundary point of S\Sd0 . Then, there exist some positive
constants C, such that for every x > 0.
|pE (x) pM (x)| p(x) pM (x) C(x(1 + )).
17

Theorem 7 Assume that X is centered, satisfies hypotheses A1A5, the covariance has the form (8) with (0) = 1/2, (x)
0 f or x 0. Let S be a convex set, and d0 = d 1. Then
2
1
.
lim log p(x) pM (x) = 1 +
x+ x2
12 1

(16)

Remarks
1.- Since S is convex, the added hypothesis that the maximum
dimension d0 such that Sj is not empty is equal to d is not an
actual restriction.
2.- (0) = 1/2 is not an actual restriction, one can always
reduce the problem to this case by means of a scale change. As
for (x) 0 f or x 0 it is always verified by the so-called
Schoenberg covariances, which is exactly the class of functions
such that ( t s 2) is a covariance for any dimension d.
18

SOME REFERENCES
Adler, R.J. (1981). The Geometry of Random Fields. J. Wiley
and Sons, New York.
Adler, R.J. and Taylor, J. (2007). Random fields and geometry
Springer-Verlag.
Azas, J-M, Bardet, J-M and Wschebor, M. (2002). On the
Tails of the Distribution of the Maximum of a Smooth Stationary Gaussian Process. ESAIM: Probability and Statistics, Vol.
6, 177-184.
Azas, J-M. and Wschebor, M. (2008). A general formula for the
distribution of the maximum of a Gaussian field and the approximation of the tail. Stoch. Proc. Appl., 118, (7), 1190-1218.
Azas, J-M. and Wschebor, M. (2009). Level sets and extrema
of random processes and fields. J. Wiley and Sons, New York.
Borell, C. (1975). The Brunn-Minkowski inequality in Gauss
19

space. Invent. Math., 30, 207-216.


Borell, C. (2003). The Ehrhard inequality. C.R. Acad. Sci.
Paris, S
er. I, 337, 663-666.
Cram
er, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes, J. Wiley and Sons, New-York.
Fernique, X.(1974). R
egularit
e des trajectoires des fonctions
al
eatoires gaussiennes. Lecture Notes Math., 480, SpringerVerlag,New-York.
Ibragimov A. , Sudakov V. N and Tsirelson B. S.(1976). Norms
of Gaussian sample functions. Lecture Notes Math. 550, 2041.
Springer-Verlag.
Piterbarg, V; I. (1996). Asymptotic Methods in the Theory of
Gaussian Processes and Fields. AMS., Prov., Rhode Island.
Sudakov, V.N. and Tsirelson, B.S. (1974). Extremal properties
of half spaces for spherically invariant measures (in Russian).
Zap. Nauchn. Sem. LOMI, 45, 75-82.

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields

Non asymptotic bounds for the distribution of


the maximum of Random fields
12 Janvier 2009

Jean-Marc A ZAI S

Institut de Mathematiques,
Universite de Toulouse

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

1 / 21

Examples
The record method
The maxima method

Examples

The record method

The maxima method


Second order

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

Non asymptotic bounds for the distribution of the maximum of Random fields
Examples

2 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
Examples

Signal + noise model

Spatial Statistics often uses signal + noise model, for example :


precision agriculture
neuro-sciences
sea-waves modelling

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

3 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
Examples

Signal + noise model

Spatial Statistics often uses signal + noise model, for example :


precision agriculture
neuro-sciences
sea-waves modelling

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

3 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
Examples

Signal + noise model

Spatial Statistics often uses signal + noise model, for example :


precision agriculture
neuro-sciences
sea-waves modelling

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

3 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
Examples

Precision agriculture
Representation of the yield per unit by GPS harvester .

Is there only noise or some region with higher fertility ? ?

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

4 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
Examples

Neuroscience
The activity of the brain is recorded under some particular action and
the same question is asked

source : Maureen CLERC

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

5 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
Examples

Sea-waves spectrum
Locally in time and frequency the spectrum of waves is registered.
We want to localize transition periods.

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

6 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
Examples

In all these situations a good statistics consists in observing the


maximum of the (absolute value) of the random field for deciding if it
is typically large (Noise) of not (signal).

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

7 / 21

Examples
The record method
The maxima method

Examples

The record method

The maxima method


Second order

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

Non asymptotic bounds for the distribution of the maximum of Random fields
The record method

8 / 21

Non asymptotic bounds for the distribution of the maximum of Random fields

Examples
The record method
The maxima method

The record method

Hypothesis
S is a regular set of R2 (compact, simply connected + piecewise C 1
parametrization of the boundary by arc length) X is such that
X
-the bivariate process Z = (X, t
) has C 1 sample paths and non
2
degenerated Gaussian distribution.
X 2 X
- the distribution of (X(t), X (t)), X(t), t
, t2 do not degenerate
1
2

Then Roughly speaking the event


{M > u}
is almost equal to the events
The level curve at levelu is not empty
The point at the southern extremity of the level curve exists
There exists a point on the level curve :
X
X
= 0; X01 (t) = t
>0
X(t) = u; X10 (t) = t
1
2
X20 (t) =

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

2X
t12

< 0
9 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
The record method

The computation of a probability is bounded by the expectation of the


number of roots of the process Z(t) (u, 0) that can be computed by
means of a Rice formula. Reintroducing the boundary to get an exact
statement, we get
L

P{M > u} P{Y(0) > u} +

E(|Y ( )|) Y( ) = u)pY( ) (u)d


0

E(| det(Z (t) 1IX20 (t)<0 1IX01 (t)>0 | X(t) = u, X01 (t) = 0)pX(t),X01 (t) (u, 0)dt,

+
S

Where Y( ) is the process X on the boundary and = 0 corresponds


to the southern extremity. The difficulty lies in the computation of the
expectation of the determinant

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

10 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
The record method

The key point is that under the condition {X(t) = u, X01 (t)} = 0, the
quantity
X10 X01
| det(Z (t)| =
X11 X02
is simply equal to |X10 X02 | . Taking into account conditions, we get the
following expression for the second integral
E(|X20 (t) X01 (t)+ X(t) = u, X01 (t) = 0)pX(t),X01 (t) (u, 0)dt.

+
S

Moreover under stationarity or some more general hypotheses, these


two random variables are independent.

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

11 / 21

Examples
The record method
The maxima method

Examples

The record method

The maxima method


Second order

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

Non asymptotic bounds for the distribution of the maximum of Random fields
The maxima method

12 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
The maxima method

Consider a realization with M > u, then necessarily there exist a local


maxima or a border maxima above U
Border maxima : local maxima in relative topology
If the consider sets with polyedral shape union of manifold of
dimension 1 to d

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

13 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
The maxima method

In fact result are simpler (and stronger) in term of the density pM (x) of
the maximum. Bound for the distribution are obtained by integration.

Theorem
pM (x) pM (x) :=
pM (x) :=
S

1
[p (x) + pEC
M (x)]with
2 M

E | det(X(t))|/X(t) = x, X (t) = 0 pX(t),Xj (t) (x, 0)dt+boundary term

and
d
pEC
M (x) := (1)
S

E det(X(t))/X(t) = x, X (t) = 0 pX(t),Xj (t) (x, 0)dt+boundary

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

14 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
The maxima method

Quantity pEC
M (x) is easy to compute using the work by Adler and
properties of symmetry of the order 4 tensor of variance of X ( under
the conditional distribution) )

Lemma
E det(X(t))/X(t) = x, X (t) = 0 = det()Hd (x)
where Hd (x) is the dth Hermite polynomial and := Var(X (t))
main advantage of Euler characteristic method lies in this result.

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

15 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
The maxima method

computation of pm

The key point is the following


If X is stationary and isotropic with covariance ( t 2 ) normalized by
Var(X(t)) = 1 et Var(X (t)) = Id
Then under the condition X(t) = x, X (t) = 0
X(t) =

8G +

2 Id + xId

Where G is a GOE matrix (Gaussian Orthogonal Ensemble), and a


standard normal independent variable. We use recent result on the
the characteristic polynomials of the GOE. Fyodorov(2004)

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

16 / 21

Non asymptotic bounds for the distribution of the maximum of Random fields

Examples
The record method
The maxima method

The maxima method

Theorem
Assume that the random field X is centered, Gaussian, stationary
and isotrpic and is regular Let S have polyhedral shape. Then,

d0

| | j/2
Hj (x) + Rj (x) gj
0 (t) +
p(x) = (x)
(1)

tS0

j=1

gj = Sj j (t)j (dt), j (t) is the normalized solid angle of the cone


of the extended outward directions at t in the normal space with
the convention d (t) = 1.
For convex or other usual polyhedra j (t) is constant for t Sj ,
Hj is the j th(probabilistic) Hermite polynomial.

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

17 / 21

Non asymptotic bounds for the distribution of the maximum of Random fields

Examples
The record method
The maxima method

The maxima method

Theorem (continued)
Rj (x) =

2
| |

j
2

((j+1)/2

Tj (v) exp

v := (2)1/2 (1 2 )1/2 y x
j1

Tj (v) :=
k=0

In (v) = 2e

v2 /2

y2
2

dy

with := | |( )1/2 ,

Hj (v)
Hk2 (v) v2 /2
e
j
Ij1 (v),
k
2 k!
2 (j 1)!

(2)
(3)

[ n1
2 ]

(n 1)!!
Hn12k (v)
(n 1 2k)!!
k=0

n
+ 1I{n even} 2 2 (n 1)!! 2(1 (x))
2k

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

(4)

18 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
The maxima method
Second order

Second order study


Using an exact implicit formula

Theorem
Under conditions above + Var(X(t)) 1 Then
limx+

2
log pM (x) pM (x)
x2

1 + inf
tS

t2

1
+ (t)2t

Var X(s)/X(t), X (t)


(1 r(s, t))2
sS\{t}

t2 := sup

and t is some geometrical characteristic et t = GEV((t))


The right hand side is finite and > 1

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

19 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
The maxima method

Level Sets and Extrema


of Random Processes
and Fields
References

Jean-Marc Azas and Mario Wschebor

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

20 / 21

Examples
The record method
The maxima method

Non asymptotic bounds for the distribution of the maximum of Random fields
The maxima method
References

Adler R.J. and Taylor J. E. Random fields and


geometry. Springer.
Mercadier, C. (2006), Numerical Bounds for the
Distribution of the Maximum of Some One- and
Two-Parameter Gaussian Processes, Adv. in Appl.
Probab. 38, pp. 149170.

Jean-Marc A ZAI S ( Institut de Mathematiques,


Universite de Toulouse )

21 / 21

arXiv:0910.0763v1 [math.PR] 5 Oct 2009

Some applications of Rice formulas to waves


Jean-Marc Azas

Jose R. Leon

Mario Wschebor

October 5, 2009

Abstract
We use Rices formulas in order to compute the moments of some
level functionals which are linked to problems in oceanography and optics. For instance, we consider the number of specular points in one or
two dimensions, the number of twinkles, the distribution of normal angle
of level curves and the number or the length of dislocations in random
wavefronts. We compute expectations and in some cases, also second moments of such functionals. Moments of order greater than one are more
involved, but one needs them whenever one wants to perform statistical
inference on some parameters in the model or to test the model itself.
In some cases we are able to use these computations to obtain a Central
Limit Theorem.

AMS Subject Classification: Primary 60G15; Secondary 60G60 78A10 78A97


86A05
Keywords: Rice formula, specular points, dislocations of wavefronts, random
seas.

Introduction

Many problems in applied mathematics require to estimate the number of points,


the length, the volume and so on, of the level sets of a random function W (x),
where x Rd , so that one needs to compute the value of certain functionals of
the probability distribution of the size of the random set
W
CA
(u, ) := {x A : W (x, ) = u},

for some given u.


Let us mention some examples which illustrate this general situation:
Universit
e de

Toulouse, IMT, LSP, F31062 Toulouse Cedex 9, France. Email: azais@cict.fr


de Matem
atica. Facultad de Ciencias. Universidad Central de Venezuela. A.P.
47197, Los Chaguaramos, Caracas 1041-A, Venezuela. Email: jose.leon@ciens.ucv.ve
Centro de Matem
atica. Facultad de Ciencias. Universidad de la Rep
ublica. Calle Igu
a
4225. 11400. Montevideo. Uruguay. wschebor@cmat.edu.uy
Escuela

The number of times that a random process {X(t) : t R} crosses the


level u:
NAX (u) = #{s A : X(s) = u}.
Generally speaking, the probability distribution of the random variable
NAX (u) is unknown, even for the simplest models of the underlying process.
However, there exist some formulas to compute E(NAX ) and also higher
order moments.
A particular case is the number of specular points of a random curve or a
random surface.
Consider first the case of a random curve. We take cartesian coordinates
Oxz in the plane. A light source placed at (0, h1 ) emits a ray that is
reflected at the point (x, W (x)) of the curve and the reflected ray is registered by an observer placed at (0, h2 ).
Using the equality between the angles of incidence and reflexion with respect to the normal vector to the curve - i.e. N (x) = (W (x), 1) - an
elementary computation gives:
W (x) =

2 r1 1 r2
x(r2 r1 )

(1)

x2 + 2i , i=1,2.

where i := hi W (x) and ri :=

The points (x, W (x)) of the curve such that x is a solution of (1) are called
specular points. We denote by SP1 (A) the number of specular points
such that x A, for each Borel subset A of the real line. One of our aims
in this paper is to study the probability distribution of SP1 (A).
The following approximation, which turns out to be very accurate in practice for ocean waves, was introduced long ago by Longuet-Higgins (see [13]
and [14]):
Suppose that h1 and h2 are big with respect to W (x) and x, then ri =
i + x2 /(2i ) + O(h3
i ). Then, (1) can be approximated by
W (x)

x h1 + h2
x 1 + 2

= kx,
2 1 2
2 h1 h2

where
k :=

(2)

1 1
1
+
.
2 h1
h2

Denote Y (x) := W (x) kx and SP2 (A) the number of roots of Y (x)
belonging to the set A, an approximation of SP1 (A) under this asymptotic.
The first part of Section 3 below will be devoted to obtain some results
on the distribution of the random variable SP2 (R).

Consider now the same problem as above, but adding a time variable t,
that is, W becomes a random function parameterized by the pair (x, t).
We denote Wx , Wt , Wxt , ... the partial derivatives of W .
We use the Longuet-Higgins approximation (2), so that the approximate
specular points at time t are (x, W (x, t)) where
Wx (x, t) = kx.
Generally speaking, this equation defines a finite number of points which
move with time. The implicit function theorem, when it can be applied,
shows that the x-coordinate of a specular point moves at speed
dx
Wxt
=
.
dt
Wxx k
The right-hand side diverges whenever Wxx k = 0, in which case a flash
appears and the point is called a twinkle. We are interested in the
(random) number of flashes lying in a set A of space and in an interval
[0, T ] of time. If we put:
Wx (x, t) kx
Wxx (x, t) k

Y(x, t) :=

(3)

then, the number of twinkles is:


T W(A, T ) := {(x, t) A [0, T ] : Y(x, t) = 0}

Let W : Q Rd Rd with d > d be a random field and let us define


the level set
W
CQ
(u) = {x Q : W (x) = u}.

Under certain general conditions this set is a (dd )-dimensional manifold


but in any case, its (d d )-dimensional Hausdorff measure is well defined.
We denote this measure by dd . Our interest will be to compute the
W
(u))] as well
mean of the dd -measure of this level set i.e. E[dd (CQ
as its higher moments. It will be also of interest to compute:
E[

W (u)
CQ

Y (s)ddd (s)].

where Y (s) is some random field defined on the level set. Caba
na [7],
Wschebor [19] (d = 1) Azas and Wschebor [4] and, in a weak form,
Z
ahle [20] have studied these types of formulas. See Theorems 5 and 6.
Another interesting problem is the study of phase singularities, dislocations of random wavefronts. They correspond to lines of darkness, in light

propagation, or threads of silence in sound [6]. In a mathematical framework they can be define as the loci of points where the amplitude of waves
vanishes. If we represent the wave as
W (x, t) = (x, t) + i(x, t), where x Rd
where , are independent homogenous Gaussian random fields the dislocations are the intersection of the two random surfaces (x, t) = 0, (x, t) =
0. We consider a fixed time, for instance t = 0. In the case d = 2 we will
study the expectation of the following random variable
#{x S : (x, 0) = (x, 0) = 0}.
In the case d = 3 one important quantity is the length of the level curve
L{x S : (x, 0) = (x, 0) = 0}.
All these situations are related to integral geometry. For a general treatment
of the basic theory, the classical reference is Federers Geometric Measure Theory [9].
The aims of this paper are: 1) to re-formulate some known results in a
modern language or in the standard form of probability theory; 2) to prove
new results, such as computations in the exact models, variance computations
in cases in which only first moments have been known, thus improving the
statistical methods and 3) in some case, obtain Central Limit Theorems.
The structure of the paper is the following: In Section 2 we review without
proofs some formulas for the moments of the relevant random variables. In
Section 3 we study expectation, variance and asymptotic behavior of specular
points. Section 4 is devoted to the study of the distribution of the normal to the
level curve. Section 5 presents three numerical applications. Finally, in Section
6 we study dislocations of wavefronts following a paper by Berry & Dennis [6].

Some additional notation and hypotheses


d is Lebesgue measure in Rd , d (B) the d -dimensional Hausdorff measure of a
Borel set B and M T the transpose of a matrix M . (const) is a positive constant
whose value may change from one occurrence to another.
If not otherwise stated, all random fields are assumed to be Gaussian and centered.

Rice formulas

We give here a quick account of Rice formulas, which allow to express the
expectation and the higher moments of the size of level sets of random fields
by means of some integral formulas. The simplest case occurs when both the
4

dimension of the domain and the range are equal to 1, for which the first results
date back to Rice [17] (see also Cramer and Leadbetters book [8]). When
the dimension of the domain and the range are equal but bigger than 1, the
formula for the expectation is due to Adler [1] for stationary random fields. For
a general treatment of this subject, the interested reader is referred to the book
[4], Chapters 3 and 6, where one can find proofs and details.
Theorem 1 (Expectation of the number of crossings, d = d = 1) Let W =
{W (t) : t I} , I an interval in the real line, be a Gaussian process having C 1 paths. Assume that Var(W (t)) = 0 for every t I.
Then:
(4)
E NIW (u) = E |W (t)| W (t) = u pW (t) (u)dt.
I

Theorem 2 (Higher moments of the number of crossings, d = d = 1) Let


m 2 be an integer. Assume that W satisfies the hypotheses of Theorem 1 and
moreover, for any choice of pairwise different parameter values t1 , ..., tm I
the joint distribution of the k-random vector (W (t1 ), ..., W (tm )) has a density
(which amounts to saying that its variance matrix is non-singular). Then:
E NIW (u)(NIW (u) 1)...(NIW (u) m + 1)
m

E
Im

j=1

|W (tj )| W (t1 ) = ... = W (tm ) = u pW (t1 ),...,W (tm ) (u, ..., u)dt1 ...dtm .
(5)

Under certain conditions, the formulas in Theorems 1 and 2 can be extended


to non-Gaussian processes.
Theorem 3 (Expectation, d = d > 1) Let W : A Rd Rd be a Gaussian

random field, A an open set of Rd , u a fixed point in Rd . Assume that


the sample paths of W are continuously differentiable
for each t A the distribution of W (t) does not degenerate
P({t A : W (t) = u , det(W (t)) = 0}) = 0
Then for every Borel set B included in A
E NBW (u) =

E[| det W (t)| W (t) = u]pW (t) (u)dt.


B

If B is compact, both sides are finite.


The next proposition provides sufficient conditions (which are mild) for the third
hypothesis in the above theorem to be verified (see again [4], Proposition 6.5).

Proposition 1 Under the same conditions of the above theorem one has
P({t A : W (t) = u , det(W (t)) = 0}) = 0
if
pX(t) (x) C for all x in some neighborhood of u,
at least one of the two following conditions is satisfied
a) the trajectories of W are twice continuously differentiable
b)
() = sup P{| det W (t)| < W (t) = x} 0
xV (u)

as 0 where V (u) is some neighborhood of u.


Theorem 4 (m-th factorial moment d = d > 1) Let m 2 be an integer.
Assume the same hypotheses as in Theorem 3 except for (iii) that is replaced by
(iii) for t1 , ..., tm A distinct values of the parameter, the distribution of
W (t1 ), ..., W (tm )
does not degenerate in (Rd )m .
Then for every Borel set B contained in A, one has
E

NBW (u) NBW (u) 1 ... NBW (u) m + 1


m

E
Bm

j=1

| det W (tj ) | W (t1 ) = ... = W (tm ) = u


pW (t1 ),...,W (tm ) (u, ..., u)dt1 ...dtm , (6)

where both sides may be infinite.


When d > d we have the following formula :
Theorem 5 (Expectation of the geometric measure of the level set. d > d )

Let W : A Rd be a Gaussian random field, A an open subset of Rd , d > d

and u Rd a fixed point. Assume that:


Almost surely the function t

W (t) is of class C 1 .

For each t A, W (t) has a non-degenerate distribution.


P{t A, W (t) = u, W (t) does not have full rank} = 0

Then, for every Borel set B contained in A, one has


E (dd (W, B)) =

det W (t)(W (t))T

1/2

W (t) = u

pW (t) (u)dt.

(7)

If B is compact, both sides in (7) are finite.


The same kind of result holds true for integrals over the level set, as stated
in the next theorem.
Theorem 6 (Expected integral on the level set) Let W be a random field
that verifies the hypotheses of Theorem 5. Assume that for each t A one has
another random field Y t : V Rn , where V is some topological space, verifying
the following conditions:
Y t (v) is a measurable function of (, t, v) and almost surely, (t, v)
Y t (v) is continuous.
For each t A the random process (s, v) W (s), Y t (v) defined on
W V is Gaussian.

Moreover, assume that g : A C(V, Rn ) R is a bounded function, which is


continuous when one puts on C(V, Rn ) the topology of uniform convergence on
compact sets. Then, for each compact subset B of A, one has
g(t, Y t )dd (W, dt)

E
BW 1 (u)

E [det(W (t)(W (t))T )]1/2 g(t, Y t ) Z(t) = u .pZ(t) (u)dt.

(8)

Specular points and twinkles

3.1

Number of roots

Let W (t) : Rd Rd be a zero mean stationary Gaussian field. If W satisfies


the conditions of Theorem 3 one has:
E NAW (u) = |A|E[| det(W (0))|]pW (0) (u).
where |A| denotes the Lebesgue measure of A.
For d = 1, NAW (u) is the number of crossings of the level u and the formula
becomes
W
E N[0,T
] (u) =

where

i =

i d()

u2
2 2
0 ,
e
0

i = 0, 2, 4, . . . ,

being the spectral mesure of W .


Formula (9) is in fact the one S.O. Rice wrote in the 40s see [17].

(9)

3.2

Number of specular points

We consider first the one-dimensional static case with the longuet-Higgins approximation (2) for the number of specular points, that is:
SP2 (I) = #{x I : Y (x) = W (x) kx = 0}

We assume that the Gaussian process {W (x) : x R} has C 2 paths and


Var(W (x)) is constant equal to, say, v 2 > 0. (This condition can always be
obtained by means of an appropriate non-random time change, the unit speed
transformation) . Then Theorem 1 applies and
1 kx
E(|Y (x)|) ( )dx
v
v
I
1 kx
= G(k, (x)) ( )dv, (10)
v
v
I

E(|Y (x)| Y (x) = 0)pY (x) (0)dx =

E(SP2 (I)) =
I

where 2 (x) is the variance of W (x) and G(, ) := E(|Z|), Z with distribution
N (, 2 ).
For the second equality in (10), in which we have erased the condition in the
conditional expectation, take into account that since Var(W (x)) is constant,
for each x the random variables W (x) and W (x) are independent (differentiate under the expectation sign and use the basic properties of the Gaussian
distribution).
An elementary computation gives:
G(, ) = [2(/) 1] + 2(/),
where (.) and (.) are respectively the density and the cumulative distribution
functions of the standard Gaussian distribution.
When the process W (x) is also stationary, v 2 = 2 and 2 (x) is constant
equal to 4 . If we look at the total number of specular points over the whole
line, we get

G(k, 4 )
(11)
E(SP2 (R)) =
k
which is the result given by [14] (part II, formula (2.14) page 846). Note that
this quantity is an increasing function of k4 ) .
Since in the longuet-Higgins approximation k 0, one can write a Taylor
expansion having the form:
E(SP2 (R))

24 1
1 k2
1 k4
1+
+
+ ...
k
2 4
24 24

(12)

Let us turn to the variance of the number of specular points, under some
additional restrictions. First of all, we assume for this computation that the
8

given process {W (x) : x R} is stationary with covariance function


E(W (x)W (y)) = (x y). is assumed to have enough regularity as to perform the computations below, the precise requirements on it being given in the
statement of Theorem 7.
Putting for short S = SP2 (R), we have:
Var(S) = E(S(S 1)) + E(S) [E(S)]2

(13)

The first term can be computed using Theorem 2:


E(S(S 1)) =

R2

E |W (x) k||W (y) k| W (x) = kx, W (y) = ky

.pW (x),W (y) (kx, ky) dxdy

(14)

where
1 k 2 (2 x2 + 22 (x y)xy + 2 y 2 )
,
2
22 2 (x y)
2 22 2 (x y)
(15)
under the additional condition that the density (15) does not degenerate for
x = y.
For the conditional expectation in (14) we perform a Gaussian regression of
W (x) (resp. W (y)) on the pair (W (x), W (y)). Putting z = x y, we obtain:
1

pW (x),W (y) (kx, ky) =

exp

W (x) = y (x) + ay (x)W (x) + by (x)W (y)


(z) (z)
22 2 (z)
2 (z)
,
by (x) = 2
2 2 (z)

ay (x) =

(16)

where y (x) is Gaussian centered, independent of (W (x), W (y)). The regression of W (y) is obtained by permuting x and y.
The conditional expectation in (14) can now be rewritten as an unconditional
expectation:
(z)x + 2 y
22 2 (z)

(z)y + 2 x
22 2 (z)
(17)
Notice that the singularity on the diagonal x = y is removable, since a Taylor
expansion shows that for z 0:
E y (x) k (z) 1 +

(z) 1 +

x (y) k (z) 1 +

(z)x + 2 y
1 4
=
x z + O(z 3 ) .
22 2 (z)
2 2

(18)

One can check that


2 (z) = E (y (x))2 = E (x (y))2 = 4
9

2 2 (z)
22 2 (z)

(19)

and
E y (x)x (y) = (4) (z) +

2 (z) (z)
.
22 2 (z)

(20)

Moreover, if 6 < +, performing a Taylor expansion one can show that as


z 0 one has
1 2 6 24 2
z
(21)
2 (z)
4
2
and it follows that the singularity at the diagonal of the integrand in the righthand side of (14) is also removable.
We will make use of the following auxiliary statement that we state as a
lemma for further reference.The proof requires some calculations, but is elementary and we skip it. The value of H(; 0, 0) below can be found for example
in [8], p. 211-212.
Lemma 1 Let
H(; , ) = E(| + || + |)
where the pair (, ) is centered Gaussian, E( 2 ) = E( 2 ) = 1, E() = .
Then,
H(; , ) = H(; 0, 0) + R2 (; , )
where

H(; 0, 0) =

1 2 +

2
arctan

and

1 2

|R2 (; , )| 3(2 + 2 )

if 2 + 2 1 and 0 1.

In the next theorem we compute the equivalent of the variance of the number
of specular points, under certain hypotheses on the random process and with
the longuet-Higgins asymptotic. This result is new and useful for estimation
purposes since it implies that, as k 0, the coefficient of variation of the
random variable S tends to zero at a known speed. Moreover, it will also appear
in a natural way when normalizing S to obtain a Central Limit Theorem.
Theorem 7 Assume that the centered Gaussian stationary process W = {W (x) :
x R} is dependent, that is, (z) = 0 if |z| > , and that it has C 4 -paths.
Then, as k 0 we have:
Var(S) =

1
+ O(1).
k

where
J
= +
2

24
24
,
3

2
10

(22)

J=

2 (z)H (z); 0, 0)
2(2 + (z))

dz,

(23)

the functions H and 2 (z) have already been defined above, and
(z) =

(z)2 (z)
1
(4)

(z)
+
.
2 (z)
22 2 (z)

Remarks on the statement.


The assumption that the paths of the process are of class C 4 imply that
8 < . This is well-known for Gaussian stationary processes (see for
example [8]).
Notice that since the process is -dependent, it is also -dependent for any
> . It is easy to verify that when computing with such a instead of
one gets the same value for .
One can replace the -dependence by some weaker mixing condition, such
as
(i) (z) (const)(1 + |z|) (0 i 4)
for some > 1, in which case the value of should be replaced by:
=

1
24
+

2 (z)H((z); 0, 0)
2

2 +

(z)

1 4

dz.
2

The proof of this extension can be performed following the same lines as
the one we give below, with some additional computations.
Proof of the Theorem: We use the notations and computations preceding
the statement of the theorem.
Divide the integral on the right-hand side of (14) into two parts, according as
|x y| > or |x y| , i.e.
E(S(S 1)) =

... = I1 + I2 .

... +

(24)

|xy|

|xy|>

In the first term, the dependence of the process implies that one can
factorize the conditional expectation and the density in the integrand. Taking
into account that for each x R, the random variables W (x) and W (x) are
independent, we obtain for I1 :
I1 =
|xy|>

E |W (x) k| E |W (y) k| pW (x) (kx)pW (y) (ky)dxdy.

On the other hand, we know that W (x) (resp. W (x)) is centered normal with
variance 2 (resp. 4 ). Hence:
I1 = G(k,

4 )

2
|xy|>

1 k 2 (x2 + y 2 )
1
exp
dxdy,
22
2
2
11

To compute the integral on the right-hand side, notice that the integral over
the whole x, y plane is equal to 1/k 2 so that it suffices to compute the integral
over the set |x y| . Changing variables, this last one is equal to
+

x+

dx

x
+

1
=
2k 2

1
1 k 2 (x2 + y 2 )
dy
exp
22
2
2
1

e 2 u du

u+ k

e 2 v dv

u k

+ O(1),
=
k 2
where the last term is bounded if k is bounded (in fact, remember that we are
considering an approximation in which k 0). So, we can conclude that:
|xy|>

1
1
1 k 2 (x2 + y 2 )

dxdy = 2
exp
+ O(1)
22
2
2
k
k 2

Replacing in the formula for I1 and performing a Taylor expansion, we get:


I1 =

24 1

+ O(1) .
2
k
k 2

(25)

Let us now turn to I2 .


Using Lemma 1 and the equivalences (18) and (21), whenever |z| = |x y|
, the integrand on the right-hand side of (14) is bounded by
(const) H((z); 0, 0) + k 2 (x2 + y 2 ) .
We divide the integral I2 into two parts:
First, on the set {(x, y) : |x| 2, |x y| } the integral is clearly bounded
by some constant.
Second, we consider the integral on the set {(x, y) : x > 2, |x y| }.
(The symmetric case, replacing x > 2 by x < 2 is similar,that is the reason
for the factor 2 in what follows).
We have (recall that z = x y):
2 (z) H (z); 0, 0 + R2 (z); ,

I2 = O(1) + 2
|xy|,x>2

22 2 (z)

exp

12

1 k 2 (2 x2 + 2 (x y)xy + 2 y 2 )
dxdy
2
22 2 (x y)

which can be rewritten as:

I2 = O(1) + 2

2 (z) H (z); 0, 0 + R2 (z); ,

1
2(2 + (z))
+

exp

1
2(2

(z))

k2 z 2
2
1
1

2 2 (z) 2 + (z) 2

exp k 2

dz

(x z/2)2
dx
2 (z))

In the inner integral we perform the change of variables

2k(x z/2)
=
2 (z)
so that it becomes:
+
1
1 1
1
1
exp 2 d =

+ O(1)
2
k 2 0
2
2 2k

where 0 = 2 2k(2 z/2)/ 2 (z).

(26)

Notice that O(1) in (26) is uniformly bounded, independently of k and z, since


the hypotheses on the process imply that 2 (z) is bounded below by a
positive number, for all z.
We can now replace in the expression for I2 and we obtain
J
I2 = O(1) + .
k 2

(27)

To finish, put together (27) with (25), (24), (13) and (12).
Corollary 1 Under the conditions of Theorem 7, as k 0:
Var(S)
k.
E(S)
The proof follows immediately from the Theorem and the value of the expectation.
The computations made in this section are in close relation with the two
results of Theorem 4 in Kratz and Leon [12]. In this paper the random variable
SP2 (I) is expanded in the Wiener-Hermite Chaos. The aforementioned expansion yields the same formula for the expectation and allows obtaining also a
formula for the variance. However, this expansion is difficult to manipulate in
order to get the result of Theorem 7.
Let us now turn to the Central Limit Theorem.
13

Theorem 8 Assume that the process W satisfies the hypotheses of Theorem 7.


In addition, we will assume that the fourth moment of the number of approximate specular points on an interval having length equal to 1 is bounded uniformly
in k, that is
4
E SP2 ([0, 1])
(const)
(28)
Then, as k 0,

24 1
k

/k

N (0, 1),

where denotes convergence in distribution.


Remark.
One can give conditions for the added hypothesis (28) to hold true, which
require some additional regularity for the process. Even though they are not
nice, they are not costly from the point of view of physical models. For example,
either one of the following conditions imply (28):
The paths x
W (x) are of class C 11 . (Use Theorem 3.6 of [4] with m = 4,
applied to the random process {W (x) : x R}. See also [16]).
The paths x
W (x) are of class C 9 and the support of the spectral
measure has an accumulation point: apply Exercice 3.4 of [4] to get the
non-degeneracy condition, Proposition 5.10 of [4] and Rice formula (Theorem 2) to get that the fourth moment of the number of zeros of W (x)
is bounded.
Proof of the Theorem. Let and be real numbers satisfying the conditions
1/2 < < 1, + > 1, 2 + < 2. It suffices to prove the convergence as k
takes values on a sequence of positive numbers tending to 0. To keep in mind
that the parameter is k, we use the notation
S(k) := S = SP2 (R)
Choose k small enough, so that k > 2 and define the sets of disjoint
intervals, for j = 0, 1, . . . , [k ]:
Ujk = (j 1)[k ] + /2, j[k ] /2 ,
Ijk = j[k ] /2, j[k ] + /2 .

[.] denotes integer part.


Notice that each interval Ujk has length [k ] and that two neighboring
intervals Ujk are separated by an interval of length . So, the -dependence of
the process implies that the random variables SP2 (Ujk ), j = 0, 1, . . . , [k ]
14

are independent. A similar argument applies to SP2 (Ijk ), j = 0, 1, . . . , [k ].


We denote:
SP2 (Ujk ),

T (k) =
|j|[k ]

Denote
Vk = Var(S(k))

1/2

where the equivalence is due to Theorem 7.

k/

We give the proof in two steps, which easily imply the statement. In the
first one, we prove that
Vk [S(k) T (k)]

tends to 0 in the L2 of the underlying probability space.


In the second step we prove that
Vk T (k)
is asymptotically standard normal.

Step 1. We prove first that Vk [S(k) T (k)] tends to 0 in L1 . Since it is


non-negative, it suffices to show that its expectation tends to zero. We have:
S(k) T (k) =

SP2 (Ijk ) + Z1 + Z2
|j|<[k ]

where
Z1 = SP2 , [k ].[k ] + /2 ,
Z2 = SP2 [k ].[k ] /2, +) .
Using the fact that E SP2k (I) (const)
+

Vk E(S(k)T (k)) (const)k 1/2

=0

(kx/ 2 )dx, one can show that

[k ]k

+
2

(kx/

2 )dx .

[k ][k ]

which tends to zero as a consequence of the choice of and .


It suffices to prove that Vk2 Var S(k) T (k) 0 as k 0. Using independence:
Var S(k) T (k) =

Var SP2 (Ijk ) + Var(Z1 ) + Var(Z2 )


|j|<[k ]

|j|<[k ]

E SP2 (Ijk )(SP2 (Ijk ) 1)

+ E(Z1 (Z1 1)) + E(Z2 (Z2 1)) + E S(k) T (k) .


15

(29)

We already know that Vk2 E S(k) T (k) 0. Using the hypotheses of the
theorem, since each Ijk can be covered by a fixed number of intervals of size one,
we know that E SP2 (Ijk )(SP2 (Ijk ) 1) is bounded by a constant which does
not depend on k and j. We can write
Vk2
|j|<[k ]

E SP2 (Ijk )(SP2 (Ijk ) 1) (const)k 1

which tends to zero because of the choice of . The remaining two terms can
be bounded by calculations similar to those of the proof of Theorem 7.
Step 2.
T (k) is a sum of independent but not equi-distributed random
variables. To prove it satisfies a Central Limit Theorem, we use a Lyapunov
condition based of fourth moments. Set:
Mjm := E

SP2 (Ujk ) E SP2 (Ujk )

For the Lyapunov condition it suffices to verify that


4
|j|[k ]

Mj4 0 as k 0,

(30)

where
2 :=

Mj2 .
|j|[k ]

To prove (30), let us partition each interval Ujk into p = [k ] 1 intervals


I1 , ...Ip of equal size . We have
E SP1 + + SPp )4 =

E SPi1 SPi2 SPi3 SPi4 ,

(31)

1i1 ,i2 ,i3 ,i4 p

where SPi stands for SP2 (Ii ) E SP2 (Ii ) Since the size of all the intervals
is equal to and given the finiteness of fourth moments in the hypothesis, it
follows that E SPi1 SPi2 SPi3 SPi4 is bounded.
On the other hand, notice that the number of terms which do not vanish in
the sum of the right-hand side of (31) is O(p2 ). In fact, if one of the indices in
(i1 , i2 , i3 , i4 ) differs more than 1 from all the other, then E SPi1 SPi2 SPi3 SPi4
vanishes. Hence,
E SP2 (Ujk ) E SP2 (Ujk )

(const)k 2

so that |j|[k ] Mj4 = O(k 2 k ). The inequality 2 + < 2 implies Lyapunov condition.

16

3.3

Number of specular points without approximation

We turn now to the computation of the expectation of the number of specular


points SP1 (I) defined by (1). This number of specular points is equal to the
number of zeros of the process
Z(x) := W (x) m1 (x, W (x)) = 0,
where
m1 (x, w) =

x2 (h1 w)(h2 w) + [x2 + (h1 w)2 ][x2 + (h2 w)2 ]


.
x(h1 + h2 2w)

Assume that the process {W (x) : x R} is Gaussian, centered, stationary, with


0 = 1. The process Z(t) is not Gaussian and we must use a generalization of
Theorem 1, namely Theorem 3.2 of [4] to get
b

E SP1 ([a, b]) =

dx
a

E |Z (x)| Z(x) = 0, W (x) = w


m2 (x,w)
w2
1
1
12
2
e
dw.
. e 2
22
2

(32)

For the conditional expectation in (32), notice that


Z (x) = W (x)

m1
m1
(x, W (x))
(x, W (x))W (x),
x
w

so that under the condition,


Z (x) = W (x)K(x, w), where K(x, w) =

m1
m1
(x, w))+
(x, w))m1 (x, w).
x
w

Using that for each x, W (x) and W (x) are independent random variables
and performing a Gaussian regression of W (x) on W (x), we can write (32) in
the form:

E SP1 ([a, b])


b

dx
a

E | 2 w K(x, w)|

1
1
m2 (x, w)
exp (w2 + 1
) dw.
2
2
2 2
(33)

where is centered Gaussian with variance 4 22 . Formula (33) can still be


rewritten as:
E SP1 ([a, b])
=

1
2

4 22
2

dx
a

m2 (x, w)
1
) dw,
G(m, 1) exp (w2 + 1
2
2
17

(34)

where
m = m(x, w) =

2 w + K(x, w)
4 22

Notice that in (34), the integral is convergent as a , b + and


that this formula is well-adapted to numerical approximation.

3.4

Number of twinkles

We give a proof of a result stated in [14] (part III pages 852-853).


We consider Y(x, t) defined by (3) and we limit ourselves to the case in which
W (x, t) is centered and stationary. If Y satisfies the conditions of Theorem 3,
by stationarity we get
E T W(I, T ) = T

E | det Y (x, t)| Y(x, t) = 0 pY(x,t) (0)dx.

(35)

Since Wxx and Wx are independent with respective variances


+

40 =

4 (d, d )

20 =

2 (d, d ),

where is the spectral measure of the stationary random field W (x, t). The
density in (35) satisfies
pY(x,t) (0) = (20 )1/2 kx(20 )1/2 (40 )1/2 k(40 )1/2 .
On the other hand
Y (x, t) =

Wxx (x, t) k
Wxxx (x, t)

Wxt (x, t)
Wxxt (x, t)

Under the condition Y(x, t) = 0, one has


| det(Y (x, t))| = |Wxt (x, t)Wxxx (x, t)|.
Computing the regression it turns out that the conditional distribution of the
pair (Wxt (x, t), Wxxx (x, t)) under the same condition, is the one of two independent centered gaussian random variables, with the following parameters:
31
k and variance 22
40
40
kx and variance 60
expectation
20
expectation

231
, for the first coordinate
40
240
, for the second coordinate
20

(36)
(37)

It follows that:
E | det(Y (x, t))| Y(x, t) = 0 = G

31
k,
40
18

22

40
231
.G
kx,
40
20

60

240
20

Summing up:
1
E T W(R, T ) =
T
1

40
=

40

k
40

setting 6 := 60
III page 853).

3.5

31
k,
40

240
20

22
31
k,
40

231
40

G
R

22

40
kx,
20

231 1
40 k

60

1
240

20
20

6 20 + 40
6

20 40
6 + 240

kx

20
(38)

This result is equivalent to formula (4.7) of [14] (part

Specular points in two dimensions

We consider at fixed time a random surface depending on two space variables x


and y. The source of light is placed at (0, 0, h1 ) and the observer is at (0, 0, h2 ).
The point (x, y) is a specular point if the normal vector n(x, y) = (Wx , Wy , 1)
to the surface at (x, y) satisfies the following two conditions:
the angles with the incident ray I = (x, y, h1 W ) and the reflected
ray R = (x, y, h2 W ) are equal (for short the argument (x, y) has
been removed),
it belongs to the plane generated by I and R.
Setting i = hi W and ri =
case we have:

x2 + y 2 + i , i = 1, 2, as in the one-parameter

x
2 r1 1 r2
,
2
+y
r2 r1
y
2 r1 1 r2
Wy = 2
.
x + y 2 r2 r1

Wx =

x2

(39)

When h1 and h2 are large, the system above can be approximated by


Wx = kx
Wy = ky,

(40)

under the same conditions as in dimension 1.


Next, we compute the expectation of SP2 (Q), the number of approximate
specular points in the sense of (40) that are in a domain Q. In the remaining of
this paragraph we limit our attention to this approximation and to the case in
which {W (x, y) : (x, y) R2 } is a centered Gaussian stationary random field.

19

Let us define:

Wx (x, y) kx
Wy (x, y) ky

Y(x, y) :=

(41)

Under very general conditions, for example on the spectral measure of {W (x, y) :
x, y R} the random field {Y (x, y) : x, y R} satisfies the conditions of Theorem 3, and we can write:
E SP2 (Q) =
Q

E | det Y (x, y)| pY(x,y) (0) dxdy,

(42)

since for fixed (x, y) the random matrix Y (x, y) and the random vector Y (x, y)
are independent, so that the condition in the conditional expectation can be
erased.
The density in the right hand side of (42) has the expression
pY(x,y) (0) = p(Wx ,Wy ) (kx, ky)
k2
02 x2 211 xy + 20 y 2 .
2(20 02 211 )
20 02 211
(43)
To compute the expectation of the absolute value of the determinant in the right
hand side of (42), which does not depend on x, y, we use the method of [6]. Set
2
:= det Y (x, y) = (Wxx k)(Wyy k) Wxy
.
=

1
2

exp

We have
E(||) = E

1 cos(t)
dt .
t2

(44)

Define
2
h(t) := E exp it[(Wxx k)(Wyy k) Wxy
]

Then
E(||) =

+
0

1 Re[h(t)]
dt .
t2

To compute h(t) we define

0 1/2 0
0
A = 1/2 0
0
0 1

and the variance matrix of Wxx , Wyy , Wx,y

40 22 31
:= 22 04 13 .
31 13 22

20

.
(45)

Let1/2 A1/2 = P diag(1 , 2 , 3 )P T where P is orthogonal. Then by a


diagonalization argument
h(t) = eitk

E exp it (1 Z12 k(s11 +s21 )Z1 )+(2 Z22 k(s12 +s22 )Z2 )+(3 Z32 k(s13 +s23 )Z3 )
where (Z1 , Z2 , Z3 ) is standard normal and sij are the entries of 1/2 P T .
One can check that if is a standard normal variable and , are real
constants, > 0:

E ei (+)

i 2

= (12i )1/2 e (12i ) =

2
2
1
+i
+
exp
1 + 4 2
1 + 4 2
(1 + 4 2 )1/4

where

1
arctan(2 ), 0 < < /4.
2
Replacing in (46), we obtain for Re[h(t)] the formula:
=

Re[h(t)] =
j=1

dj (t, k)
1+

42j t2

j (t) + k 2 tj (t)

cos

(47)

j=1

where, for j = 1, 2, 3:
dj (t, k) = exp

k 2 t2 (s1j + s2j )2
,
2
1 + 42j t2

j (t) =

1
arctan(2j t), 0 < j < /4,
2

j (t) =

(s1j + s2j )2 j
1
.
t2
3
1 + 42j t2

Introducing these expressions in (45) and using (43) we obtain a new formula
which has the form of a rather complicated integral. However, it is well adapted
to numerical evaluation.
On the other hand, this formula allows us to compute the equivalent as
k 0 of the expectation of the total number of specular points under the
longuet-Higgins approximation. In fact, a first order expansion of the terms in
the integrand gives a somewhat more accurate result, that we state as a theorem:
Theorem 9
E SP2 (R2 ) =

21

m2
+ O(1)
k2

(48)

(46)

where
+

m2 =

3
j=1 (1

+ 42j t2 )

cos

3
j=1

j (t)

t2

1/2

1 23/2

3
j=1

Aj

1 + Aj
t2

dt

1 B1 B2 B2 B3 B3 B1

dt,
(49)

where
Aj = Aj (t) = 1 + 42j t2

1/2

, Bj = Bj (t) =

(1 Aj )/(1 + Aj ).

Notice that m2 only depends on the eigenvalues 1 , 2 , 3 and is easily


computed numerically.
In Flores and Leon [10] a different approach was followed in search of a formula for the expectation of the number of specular points in the two-dimensional
case, but their result is only suitable for Montecarlo approximation.
We now consider the variance of the total number of specular points in
two dimensions, looking for analogous results to the one-dimensional case (i.e.
Theorem 7 and its Corollary 1), in view of their interest for statistical applications. It turns out that the computations become much more involved. The
statements on variance and speed of convergence to zero of the coefficient of
variation that we give below include only the order of the asymptotic behavior
in the longuet-Higgins approximation, but not the constant. However, we still
consider them to be useful. If one refines the computations one can give rough
bounds on the generic constants in Theorem 10 and Corollary 2 on the basis of
additional hypotheses on the random field.
We assume that the real-valued, centered, Gaussian stationary random field
{W (x) : x R2 } has paths of class C 3 , the distribution of W (0) does not
degenerate (that is Var(W (0)) is invertible). Moreover, let us consider W (0),
expressed in the reference system xOy of R2 as the 2 2 symmetric centered
Gaussian random matrix:
W (0) =

Wxx (0) Wxy (0)


Wxy (0) Wyy (0)

The function
z

(z) = det Var W (0)z ,

defined on z = (z1 , z2 )T R2 , is a non-negative homogeneous polynomial of


degree 4 in the pair z1 , z2 . We will assume the non-degeneracy condition:
min{(z) : z = 1} = > 0.

(50)

Theorem 10 Let us assume that {W (x) : x R2 } satisfies the above condtions


and that it is also -dependent, > 0, that is, E W (x)W (y) = 0 whenever
22

x y > .
Then, for k small enough:
Var SP2 (R2 )

L
,
k2

where L is a positive constant depending upon the law of the random field.
A direct consequence of Theorems 9 and 10 is the following:
Corollary 2 Under the same hypotheses of Theorem 10, for k small enough,
one has:
Var SP2 (R2 )
L1 k
E SP2 (R2 )
where L1 is a new positive constant.
Proof of Theorem 10. For short, let us denote T = SP2 (R2 ). We have:
Var(T ) = E(T (T 1)) + E(T ) [E(T )]2

(51)

We have already computed the equivalents as k 0 of the second and third


term in the right-hand side of (51). Our task in what follows is to consider the
first term.
The proof is performed along the same lines as the one of Theorem 7, but
instead of applying Rice formula for the second factorial moment of the number of crossings of a one-parameter random process, we need Theorem 4 for
dimension d = 2. We write the factorial moment of order m = 2 in the form:
E(T (T 1))
=
R2 R2

E | det Y (x)|| det Y (y)| Y(x) = 0, Y(y) = 0 pY(x),Y(y) (0, 0) dxdy

... dxdy +
xy >

... dxdy = J1 + J2 .
xy

For J1 we proceed as in the proof of Theorem 7, using the -dependence and


the evaluations leading to the statement of Theorem 9. We obtain:
J1 =

O(1)
m22
+ 2 .
4
k
k

(52)

Let us show that for small k,


O(1)
.
k2
In view of (51), (48) and (52) this suffices to prove the theorem.
J2 =

23

(53)

We do not perform all detailed computations. The key point consists in


evaluating the behavior of the integrand that appears in J2 near the diagonal
x = y, where the density pY(x),Y(y) (0, 0) degenerates and the conditional expectation tends to zero.
For the density, using the invariance under translations of the law of W (x) :
x R2 , we have:
pY(x),Y(y) (0, 0) = pW (x),W (y) (kx, ky)
= pW (0),W (yx) (kx, ky)
= pW (0),[W (yx)W (0)] (kx, k(y x)).
Perform the Taylor expansion, for small z = y x R2 :
W (z) = W (0) + W (0)z + O( z 2 ).
Using the non-degeneracy assumption (50) and the fact that W (0) and W (0)
are independent, we can show that for x, z R2 , z :
C1
exp C2 k 2 ( x C3 )2
z 2

pY(x),Y(y) (0, 0)

where C1 , C2 , C3 are positive constants.


Let us consider the conditional expectation. For each pair x, y of different
points in R2 , denote by the unit vector (y x)/ y x and n a unit vector orthogonal to . We denote respectively by Y, Y, n Y the first and
second partial derivatives of the random field in the directions given by and n.
Under the condition
Y(x) = 0, Y(y) = 0
we have the following simple bound on the determinant, based upon its definition
and Rolles Theorem applied to the segment [x, y] = {x + (1 )y}:
det Y (x) Y(x)

n Y(x) y x

sup

Y(s)

n Y(x)

(54)

s[x,y]

So,
E | det Y (x)|| det Y (y)| Y(x) = 0, Y(y) = 0
y x 2E
= z 2E

sup

Y(s)

n Y(x)

n Y(y) W (x) = kx, W (y) = ky

s[x,y]

sup

Y(s)

n Y(0)

s[0,z]

24

n Y(z) W (0) = kx,

W (z) W (0)
= k ,
z

where the last equality is again a consequence of the stationarity of the random
field {W (x) : x R2 }.
At this point, we perform a Gaussian regression on the condition. For the
condition, use again Taylor expansion, the non-degeneracy hypothesis and the
independence of W (0) and W (0). Then, use the finiteness of the moments of
the supremum of bounded Gaussian processes (see for example [4], Ch. 2), take
into account that z to get the inequality:
E | det Y (x)|| det Y (y)| Y(x) = 0, Y(y) = 0 C4 z

1+k x

(55)

where C4 is a positive constant. Summing up, we have the following bound for
J2 :
J2 C1 C4 2

1+k x

R2
+

= C1 C4 2 2 2

1 + k
0

exp C2 k 2 ( x C3 )2 dx
4

(56)

exp C2 k 2 ( C3 )2 d

Performing the change of variables w = k, (53) follows.

The distribution of the normal to the level


curve

Let us consider a modeling of the sea W (x, y, t) as a function of two space variables and one time variable. Usual models are centered Gaussian stationary
with a particular form of the spectral measure that we discuss briefly below.
We denote the covariance by (x, y, t) = E(W (0, 0, 0)W (x, y, t)).
In practice, one is frequently confronted with the following situation: several
pictures of the sea on time over an interval [0, T ] are stocked and some properties
or magnitudes are observed. If the time T and the number of pictures are large,
and if the process is ergodic in time, the frequency of pictures that satisfy a
certain property will converge to the probability of this property to happen at
a fixed time.
Let us illustrate this with the angle of the normal to the level curve at a
point chosen at random. We consider first the number of crossings of a level
u by the process W (, y, t) for fixed t and y, defined as
W (,y,t)

N[0,M1 ] (u) = #{x : 0 x M1 ; W (x, y, t) = u}.


We are interested in computing the total number of crossings per unit time when
integrating over y [0, M2 ] i.e.
1
T

M2

dt
0

W (,y,t)

N[0,M1 ] (u) dy.


25

(57)

If the ergodicity assumption in time holds true, we can conclude that a.s.:
1
T

M2

W (,y,t))

N[0,M1 ]

dt
0

W (,0,0))

(u) dy M1 E N[0,M1 ]

(u) =

M1 M2

200 21 u2
000 ,
e
000

where
abc =
R3

ax by ct d(x , y , t )

are the spectral moments.


Hence, on the basis of the quantity (57) for large T , one can make inference
about the value of certain parameters of the law of the random field. In this
example these are the spectral moments 200 and 000 .
If two-dimensional level information is available, one can work differently because there exists an interesting relationship with Rice formula for level curves
that we explain in what follows.
We can write (x = (x, y)):
W (x, t) = ||W (x, t)||(cos (x, t), sin (x, t))T .
Instead of using Theorem 1, we can use Theorem 6, to write
M2

E
0

W (,y,0)

N[0,M1 ] (u) dy = E
=

CQ (0,u)

2 (Q)

| cos (x, 0)| d1

200 2u2
000 ,
e
000

(58)

where Q = [0, M1 ] [0, M2 ]. We have a similar formula when we consider sections of the set [0, M1 ] [0, M2 ] in the other direction. In fact (58) can be
generalized to obtain the Palm distribution of the angle .
Set h1 ,2 = 1I[1 , 2 ] , and for 1 < 2 define
F (2 ) F (1 ) : = E 1 ({x Q : W (x, 0) = u ; 1 (x, s) 2 })
=E

h1 ,2 ((x, s))d1 (x)ds


CQ (u,s)

(59)
u2

exp(
)
y W
)((x W )2 + (y W )2 )1/2 ] 200 .
= 2 (Q)E[h1 ,2 (
x W
2000
Denoting = 200 020 110 and assuming 2 (Q) = 1 for ease of notation, we

26

readily obtain
F (2 ) F (1 )

u2

e 2000

=
(2)3/2 ()1/2 000

R2

h1 ,2 () x2 + y 2 e 2 (02 x

211 xy+20 y 2 )

dxdy

u2

e 200

=
(2)3/2 (+ )1/2 000
2 exp(

2
(+ cos2 ( ) + sin2 ( )))dd
2+

where + are the eigenvalues of the covariance matrix of the random vector
(x W (0, 0, 0), y W (0, 0, 0)) and is the angle of the eigenvector associated to
+ . Remarking that the exponent in the integrand can be written as
1/ (1 2 sin2 ( )) with 2 := 1 + / and that
+
0

2 exp

H2
2

2H

it is easy to get that


2

F (2 ) F (1 ) = (const)

1 2 sin2 ( )

1/2

d.

From this relation we get the density g() of the Palm distribution, simply by
dividing by the total mass:
g() =

1 2 sin2 ( )

1 2 sin2 ( )

1/2
1/2

=
d.

1 2 sin2 ( )
4K( 2 )

1/2

(60)

Here K is the complete elliptic integral of the first kind. This density characterizes the distribution of the angle of the normal at a point chosen at random
on the level curve.
In the case of a random field which is isotropic in (x, y), we have 200 = 020
and moreover 110 = 0, so that g turns out to be the uniform density over
the circle (Longuet-Higgins says that over the contour the distribution of the
angle is uniform (cf. [15], pp. 348)).
Let now W = {W (x, t) : t R+ , x = (x, y) R2 } be a stationary zero
mean Gaussian random field modeling the height of the sea waves. It has the
following spectral representation:
ei(1 x+2 y+t)

W (x, y, t) =

27

f (1 , 2 , )dM (1 , 2 , ),

where is the manifold {21 + 22 = 4 } (assuming that the acceleration of


gravity g is equal to 1) and M is a random Gaussian orthogonal measure defined
on (see [13]). This leads to the following representation for the covariance
function
(x, y, t) =
=

ei(1 x+2 y+t) f (1 , 2 , )2 (dV )

ei(

x cos + 2 y sin +t)

G(, )dd,

where, in the second equation, we made the change of variable 1 = 2 cos ,


2 = 2 sin and G(, ) = f ( 2 cos , 2 sin , )2 3 . The function G is
called the directional spectral function. If G does not depend of the random field W is isotropic in x, y.
Let us turn to ergodicity. For a given subset Q of R2 and each t, let us define
At = {W (x, y, t) : > t ; (x, y) Q}
and consider the -algebra of t-invariant events A = At . We assume that
for each pair (x, y), (x, y, t) 0 as t +. It is well-known that under
this condition, the -algebra A is trivial, that is, it only contains events having
probability zero or one (see for example [8], Ch. 7).
This has the following important consequence in our context. Assume further
that the set Q has a smooth boundary and for simplicity, unit Lebesgue measure.
Let us consider
Z(t) =
H x, t d1 (x),
CQ (u,t)

where H x, t = H W (x, t), W (x, t) , where W = (Wx , Wy ) denotes gradient in the space variables and H is some measurable function such that the integral is well-defined. This is exactly our case in (59). The process {Z(t) : t R} is
strictly stationary, and in our case has a finite mean and is Riemann-integrable.
By the Birkhoff-Khintchine ergodic theorem ([8] page 151), a.s. as T +,
1
T

T
0

Z(s)ds EB [Z(0)],

where B is the -algebra of t-invariant events associated to the process Z(t).


Since for each t, Z(t) is At -measurable, it follows that B A, so that EB [Z(0)] =
E[Z(0)]. On the other hand, Rices formula yields (take into account that stationarity of W implies that W (0, 0) and W (0, 0) are independent):
E[Z(0)] = E[H u, W (0, 0) ||W (0, 0)||]pW (0,0) (u).

28

We consider now the CLT. Let us define


Z(t) =

1
t

Z(s) E(Z(0)) ds,

In order to compute second moments, we use Rice formula for integrals over
level sets (cf. Theorem 6), applied to the vector-valued random field
X(x1 , x2 , s1 , s2 ) = (W (x1 , s1 ), W (x2 , s2 ))T .
The level set can be written as:
CQ2 (u, u) = {(x1 , x2 ) QQ : X(x1 , x2 , s1 , s2 ) = (u, u)}
So, we get
Var Z(t) =

2
t

t
0

for 0 s1 t, 0 s2 t.

s
(1 )I(u, s)ds,
t

where
I(u, s) =
Q2

E H(x1 , 0)H(x2 , s) W (x1 , 0)

W (x2 , s)

W (x1 , 0) = u ; W (x2 , s) = u

pW (x1 ,0),W (x2 ,s) (u, u)dx1 dx2 E[H u, W (0, 0) ||W (0, 0)||]pW (0,0) (u)
Assuming that the given random field is time--dependent, that is,
(x, y, t) = 0 (x, y), whenever t > , we readily obtain

t Var Z(t) 2

I(u, s)ds := 2 (u) as t .

Using now a variant of the Hoeffding-Robbins Theorem [11] for sums of dependent random variables, we get the CLT:

tZ(t) N (0, 2 (u)).

Numerical computations

Validity of the approximation for the number of specular


points
In the particular case of stationary processes we have compared the exact expectation given by (32) with the approximation (10).
In full generality the result depends on h1 , h2 , 4 and 2 . After scaling, we
can assume for example that 2 = 1.
The main result is that, when h1 h2 , the approximation (10) is very sharp.
For example with the value (100, 100, 3) for (h1 , h2 , 4 ), the expectation of the
total number of specular points over R is 138.2; using the approximation (11)
29

the result with the exact formula is around 2.102 larger but it is almost hidden
by the precision of the computation of the integral.
If we consider the case (90, 110, 3), the results are respectively 136.81 and
137.7.
In the case (100, 300, 3), the results differ significantly and Figure 1 displays
the densities (32) and (10)

0.7

0.6

0.5

0.4

0.3

0.2

0.1

100

200

300

400

500

Figure 1: Intensity of specular points in the case h1 = 100, h2 = 300, 4 = 3.


In solid line exact formula, in dashed line approximation (10)

Effect of anisotropy on the distribution of the angle of the


normal to the curve
We show the values of the density given by (60) in the case of anisotropic
processes = 0.5 and = /4. Figure 2 displays the densities of the Palm distribution of the angle showing a large departure from the uniform distribution.

Specular points in dimension 2


We use a standard sea model with a Jonswap spectrum and spread function
cos(2). It corresponds to the default parameters of the Jonswap function of
the toolbox WAFO [18]. The variance matrix of the gradient is equal to
104

114 0
0 81

30

0.55
0.5
0.45
0.4
0.35
0.3
0.25
0.2
0.15
0.1
0.05
2

1.5

0.5

0.5

1.5

Figure 2: Density of the Palm distribution of the angle of the normal to the
level curve in the case = 0.5 and = /4
and the matrix of Section 3.5 is

3 0
11 0
0 3

9
= 104 3
0

The spectrum is presented in Figure 3


Directional Spectrum
Level curves at:
2
4
6
8
10
12

90

0.8

120

60
0.6
0.4

150

30

0.2

180

330

210

300

240
270

Figure 3: Directional Jonswap spectrum as obtained using the default options


of Wafo

31

The integrand in (42) is displayed in Figure 4 as a function of the two space


variables x, y. The value of the asymptotic parameter m2 defining the expansion
on the expectation of the numbers of specular points, see(48), is 2.527103.

Figure 4: Intensity function of the specular points for the Jonswap spectrum

The Matlab programs used for these computations are available at

\protect\vrule width0pt\protect\href{http://www.math.univ-toulouse.fr/\string~azais/prog/pro

Application to dislocations of wavefronts

In this section we follow the article by Berry and Dennis [6]. As these authors, we
are interested in dislocations of wavefronts. These are lines in space or points in
the plane where the phase , of the complex scalar wave (x, t) = (x, t)ei(x,t) ,
is undefined, (x = (x1 , x2 )) is a two dimensional space variable). With respect
to light they are lines of darkness; with respect to sound, threads of silence.

32

It will be convenient to express by means of its real and imaginary parts:


(x, t) = (x, t) + i(x, t).
Thus the dislocations are the intersection of the two surfaces
(x, t) = 0

(x, t) = 0.

We assume an isotropic Gaussian model. This means that we will consider the
wavefront as an isotropic Gaussian field
(x, t) =
R2

exp (i[ k x c|k|t])(

(|k|) 1/2
) dW (k),
|k|

where, k = (k1 , k2 ), |k| =


k12 + k22 , (k) is the isotropic spectral density
and W = (W1 + iW2 ) is a standard complex orthogonal Gaussian measure
on R2 , with unit variance. Here we are interested only in t = 0 and we put
(x) := (x, 0) and (x) := (x, 0).
We have, setting k = |k|
cos( kx )(

(x) =
R2

(k) 1/2
) dW1 (k)
k

R2

(k) 1/2
) dW2 (k)+
k

R2

sin( kx )(

(k) 1/2
) dW2 (k) (61)
k

sin( kx )(

(k) 1/2
) dW1 (k) (62)
k

and
cos( kx )(

(x) =
R2

The covariances are


E [(x)(x )] = E [(x)(x )] = (|x x |) :=

J0 (k|x x |)(k)dk

where J (x) is the Bessel function of the first kind of order .


E [(r1 )(r2 )] = 0.

(63)

Moreover

Three dimensional model


In the case of a three dimensional Gaussian field, we have x = (x1 , x2 , x3 ),
k = (k1 , k2 , k3 ),k = |k| = k12 + k22 + k32 and
(x) =
R3

exp i[ k x ] (

(k) 1/2
) dW (k).
k2

In this case, we write the covariances in the form:

E [(r1 )(r2 )] = 4
0

sin(k|r1 r2 |)
(k)dk.
k|r1 r2 |

(64)

The same formula holds true for the process and also E [(r1 )(r2 )] = 0 for
any r1 , r2 , showing that the two coordinates are independent Gaussian fields .
33

6.1

Mean length of dislocation curves, mean number of


dislocation points

Dimension 2: Let us denote {Z(x) : x R2 } a random field with values in R2 ,


with coordinates (x), (x), which are two independent Gaussian stationary
isotropic random fields with the same distribution. We are interested in the
expectation of the number of dislocation points
d2 := E[#{x S : (x) = (x) = 0}],
where S is a subset of the parameter space having area equal to 1.
Without loss of generality we may assume that Var((x)) = Var((x)) = 1
and for the derivatives we set 2 = Var(i (x)) = Var(i (x)), i = 1, 2. Then,
using stationarity and the Rice formula (Theorem 3) we get
d2 = E[| det(Z (x))|/Z(x) = 0]pZ(x) (0),
The stationarity implies independence between Z(x) and Z (x) so that the conditional expectation above is in fact an ordinary expectation. The entries of
Z (x) are four independent centered Gaussian variables with variance 2 , so
that, up to a factor, | det(Z (x))| is the area of the parallellogram generated by
two independent standard Gaussian variables in R2 . One can easily show that
the distribution of this volume is the product of independent square roots of
a 2 (2) and a 2 (1) distributed random variables. An elementary calculation
gives then: E[| det(Z (x))|] = 2 . Finally, we get
d2 =
This quantity is equal to
formula (4.6).

K2
4

1
2
2

in Berry and Dennis [6] notations, giving their

Dimension 3: In the case, our aim is to compute


d3 = E[L{x S : (x) = (x) = 0}]

where S is a subset of R3 having volume equal to 1 and L is the length of the


curve. Note that d3 is denoted by d [6]. We use the same notations and remarks
except that the form of the Rices formula is (cf. Theorem 5)
d3 =

1
E[(det Z (x)Z (x)T )1/2 ].
2

Again
E[(det(Z (x)Z (x)T )1/2 ] = 2 E(V ),
where V is the surface area of the parallelogram generated by two standard
Gaussian variables in R3 . A similar method to compute the expectation of this
random area gives:
E(V ) = E(

4
2 (3)) E( 2 (2)) =
2
34

=2
2

Leading eventually to
d3 =

2
.

In Berry and Dennis notations [6] this last quantity is denoted by


their formula (4.5).

6.2

k2
3

giving

Variance

In this section we limit ourselves to dimension 2. Let S be again a measurable


subset of R2 having Lebesgue measure equal to 1. The computation of the
variance of the number of dislocations points is performed using Theorem 4 to
express
E NSZ (0) NSZ (0) 1

=
S2

A(s1 , s2 )ds1 ds2 .

We assume that {Z(x) : x R2 } satisfies the hypotheses of Theorem 4 for


m = 2. Then use
Var NSZ (0) = E NSZ (0) NSZ (0) 1

+ d2 d22 .

Taking into account that the law of the random field is invariant under translations and orthogonal transformations of R2 , we have
A(s1 , s2 ) = A (0, 0), (r, 0) = A(r)

whith r = s1 s2 ,

The Rices function A(r)) , has two intuitive interpretations. First it can be
viewed as
A(r) = lim

1
E N B((0, 0), ) N B((r, 0), ) .
2 4

Second it is the density of the Palm distribution (a generalization Horizontal


window conditioning of [8]) of the number of zeroes of Z per unit of surface,
locally around the point (r, 0) given that there is a zero at (0, 0).
A(r)/d22 is called correlation function in [6].
To compute A(r), we put 1 , 2 , 1 2 for the partial derivatives of , with
respect to first and second coordinate.
and
A(r) = E | det Z (0, 0) det Z (r, 0)| Z(0, 0) = Z(r, 0) = 02 pZ(0,0),Z(r,0) (04 )
=E

1 2 2 1 (0, 0) 1 2 2 1 (r, 0)
pZ(0,0),Z(r,0) (04 )

Z(0, 0) = Z(r, 0) = 02
(65)

where 0p denotes the null vector in dimension p.

35

The density is easy to compute


pZ(0,0),Z(r,0) (04 ) =

1
, where (r) =
2
(2) (1 2 (r))

J0 (kr)(k)dk.

We use now the same device as above to compute the conditional expectation
of the modulus of the product of determinants, that is we write:
|w| =

(1 cos(wt)t2 dt.

(66)

and also the same notations as in [6]

C := (r)

E = (r)
H = E/r

F
= (r)

F0 = (0)

The regression formulas imply that the conditional variance matrix of the vector
W = 1 (0), 1 (r, 0), 2 (0), 2 (r, 0), 1 (0), 1 (r, 0), 2 (0), 2 (r, 0) ,
is given by
= Diag A, B, A, B
with
A=

B=

E C
F 1C
2
E2
F0 1C
2

F0
H

H
F0

E
F0 1C
2
E2 C
F 1C
2

Using formula (66) the expectation we have to compute is equal to


1
2

1
1
1
1
2
1 T (t1 , 0) T (t1 , 0) T (0, t2 ) T (0, t2 )
dt2 t2
1 t2
2
2
2
2

1
1
1
1
+ T (t1 , t2 ) + T (t1 , t2 ) + T (t1 , t2 ) + T (t1 , t2 ) (67)
4
4
4
4

dt1

where
T (t1 , t2 ) = E exp i(w1 t1 + w2 t2 )
with
w1 = 1 (0)2 (0) 1 (0)2 (0) = W1 W7 W3 W5
w2 = 1 (r, 0)2 (r, 0) 1 (r, 0)2 (r, 0) = W2 W8 W4 W6 .

36

T (t1 , t2 ) = E exp(iWT HW) where W has the distribution N (0, ) and

0
0
0
D
0
0
D 0
,
H=
0 D
0
0
D
0
0
0

1 t1 0
.
2 0 t2
A standard diagonalization argument shows that
D=

T (t1 , t2 ) = E exp(iWT HW) = E exp(i

j j2 ) ,
j=1

where the j s are independent with standard normal distribution and the j
are the eigenvalues of 1/2 H1/2 . Using the characteristic function of the 2 (1)
distribution:
8

E exp(iWT HW) =

j=1

(1 2ij )1/2 .

(68)

Clearly
1/2 = Diag A1/2 , B 1/2 , A1/2 , B 1/2
and
1/2 H1/2

0
0
=
0
MT

0
0
0
MT
M
0
0
0

M
0

0
0

with M = A1/2 DB 1/2 .


Let be an eigenvalue of 1/2 H1/2 It is easy to check that 2 is an eigenvalue of MMT . Respectively if 21 and 22 are the eigenvalues of MMT , those
of 1/2 H1/2 are 1 (twice) and 2 (twice).
Note that 21 and 22 are the eigenvalues of MMT = A1/2 DBDA1/2 or
equivalently, of DBDA. Using (68)
E exp(iWT HW) = 1+4(21 +22 )+1621 22

= 1+4tr(DBDA)+16 det(DBDA)

where
DBDA =

1
4

E
t21 F0 (F0 1C
2 ) + t1 t2 H(F
E2
t1 t2 H(F0 1C 2 ) + t22 F0 (F

E2C
1C 2 )
E2C
1C 2 )

E C
t21 F0 (F 1C
2 ) + t1 t2 H(F0
E2 C
t1 t2 H(F 1C 2 ) + t22 F0 (F0

So,
E2
E2C
) + 2t1 t2 H(F
)
2
1C
1 C2
E2C 2
E2 2
)

(F

)
(F0
1 C2
1 C2

4tr(DBDA) = (t21 + t22 )F0 (F0


16 det(DBDA) = t21 t22 F02 H 2

37

(69)
(70)

E2
1C 2 )
E2
1C 2 )

giving
T (t1 , t2 ) = E exp(iWT HW)

E2
E2C
)
+
2t
t
H(F

)
1
2
1 C2
1 C2
E2 2
E2C 2
+ t21 t22 F02 H 2 (F0
)

(F

)
1 C2
1 C2

Performing the change of variable t = A1 t with A1 = F0 (F0


integral (67) becomes
= 1 + (t21 + t22 )F0 (F0

A1
2
1

dt1

(71)

E2
1C 2 )

the

2
dt2 t2
1 t2

1
1
1
1
1
+
+
2
2
2
2
2
2
2
2
1 + t1 1 + t2
2 1 + (t1 + t2 ) 2A2 t1 t2 + t1 t2 Z 1 + (t1 + t2 ) + 2A2 t1 t2 + t21 t22 Z
=
1

where

A1
2

dt1

1
1

+
1 + t21
1 + t22

1 + (t21 + t22 ) + t21 t22 Z

1 + (t21 + t22 ) + t21 t22 Z

A2 =
Z=

2
dt2 t2
1 t2

2
2
H F (1C )E C
F0 F0 (1C 2 )E 2
F02 H 2
E2 C 2
1 (F 1C
2 ) .(F0
F02

(72)

4A22 t21 t22

E2
2
1C 2 )

In this form, and up to a sign change, this result is equivalent to Formula (4.43)
of [6] (note that A22 = Y in [6]).
In order to compute the integral (72), first we obtain

1
1
1
dt2 = .
2
t2
1 + t22

We split the other term into two integrals, thus we have for the first one
1
2

1
1
1

dt2
2
2
2
2
2
t2 1 + (t1 + t2 ) 2A2 t1 t2 + t1 t2 Z
1 + t21
=

1
2(1 + t21 )

1
=
2(1 + t21 )

(1 + t21 Z)t22 2A2 t1 t2


1
dt2
t22 1 + t21 2A2 t1 t2 + (1 + t21 Z)t22
t22 2Z1 t1 t2
1
dt2 = I1 ,
t22 t22 2Z1 t1 t2 + Z2

1+t2

A2
where Z2 = 1+Zt12 and Z1 = 1+Zt
2.
1
1
Similarly for the second integral we get

38

1
2

1
1
1

dt2
2
2
2
2
2
t2 1 + (t1 + t2 ) + 2A2 t1 t2 + t1 t2 Z
1 + t21
=

I1 + I2 =

1
2(1 + t21 )

1
2(1 + t21 )

1
=
(1 + t21 )

t22 + 2Z1 t1 t2
1
dt2 = I2
2
2
t2 t2 + 2Z1 t1 t2 + Z2

t22 + 2Z1 t1 t2
t22 2Z1 t1 t2
1
+
dt2
t22 t22 2Z1 t1 t2 + Z2
t22 + 2Z1 t1 t2 + Z2

t22 + (Z2 4Z12 t21 )


dt2
t42 + 2(Z2 2Z12 t21 )t22 + Z22

(Z2 2Z12 t21 )


1
=
.
2
(1 + t1 ) Z2 (Z2 Z12 t21 )

In the third line we have used the formula provided by the method of residues.
In fact, if the polynomial X 2 SX + P with P > 0 has not root in [0, ), then

t4

t2
dt =
St2 + P

( P ).
P (S + 2 P )

In our case = (Z2 4Z12 t21 ), S = 2(Z2 2Z12 t21 ) and P = Z22 .
Therefore we get

A(r) =

A1
3
4 (1 C 2 )

(Z2 2Z12 t21 )


1
1
dt1 .
1

t21
(1 + t21 ) Z2 (Z2 Z12 t21 )

Acknowledgement
This work has received financial support from European Marie Curie Network
SEAMOCS.

References
[1] R. J. Adler, The Geometry of Random Fields, Wiley,(1981).
[2] R. J. Adler and J. Taylor, Random Fields and Geometry. Springer, (2007).
[3] J-M. Azas, J. Leon and J. Ortega, Geometrical Characteristic of Gaussian
sea Waves. Journal of Applied Probability , 42,1-19. (2005).
[4] J-M. Azas, and M. Wschebor, Level set and extrema of random processes
and fields, Wiley (2009).
[5] J-M. Azas, and M. Wschebor, On the Distribution of the Maximum of a
Gaussian Field with d Parameters, Annals of Applied Probability, 15 (1A),
254-278, (2005).
39

[6] M.V. Berry, and M.R. Dennis, Phase singularities in isotropic random waves,
Proc. R. Soc. Lond, A, 456, 2059-2079 (2000).
[7] E. Caba
na, Esperanzas de Integrales sobre Conjuntos de Nivel aleatorios. Actas del 2 Congreso Latinoamericano de Probabilidad y Estadistica
Matem
atica, Editor: Sociedad Bernoulli secci
on de Latinoamerica, Spanish
, Caracas, 65-82 (1985).
[8] H. Cramer and M.R. Leadbetter, Stationary and Related Stochastic Processes, Wiley (1967).
[9] H. Federer, Geometric Measure, Springer (1969).
[10] E. Flores and J.R. Leon, Random seas, Levels sets and applications,
Preprint (2009).
[11] W. Hoeffding and H. Robbins, The Central Limit Theorem for dependent
random variables, Duke Math. J. 15 , 773-780,(1948).
[12] M. Kratz and J. R. Leon, Level curves crossings and applications for Gaussian models, Extremes, DOI 10.1007/s10687-009-0090-x (2009).
[13] P. Kree and C. Soize, Mecanique Aletoire, Dunod (1983).
[14] M. S. Longuet-Higgins, Reflection and refraction at a random surface. I, II,
III, Journal of the Optical Society of America, vol. 50, No.9, 838-856 (1960).
[15] M. S. Longuet-Higgins, The statistical geometry of random surfaces. Proc.
Symp. Appl. Math., Vol. XIII, AMS Providence R.I., 105-143 (1962).
[16] Nualart, D. and Wschebor, M., Integration par parties dans lespace de
Wiener et approximation du temps local, Prob. Th. Rel. Fields, 90, 83-109
(1991).
[17] S.O. Rice,(1944-1945). Mathematical Analysis of Random Noise, Bell System Tech. J., 23, 282-332; 24, 45-156 (1944-1945).
[18] WAFO-group . WAFO - A Matlab Toolbox for Analysis of Random Waves
and Loads - A Tutorial. Math. Stat., Center for Math. Sci., Lund Univ., Lund,
Sweden. ISBN XXXX, URL http://www.maths.lth.se/matstat/wafo.(2000)
[19] M. Wschebor, Surfaces Aleatoires. Lecture Notes Math. 1147, Springer,
(1985).
[20] U. Z
ahle, A general Rice formula, Palm measures, and horizontal-window
conditioning for random fields, Stoc. Process and their applications, 17, 265283 (1984).

40






 



 
















A general expression for the distribution of the maximum of a


Gaussian field and the approximation of the tail

arXiv:math/0607041v2 [math.PR] 8 Jan 2007

Jean-Marc Azas , azais@cict.fr

Mario Wschebor , wschebor@cmat.edu.uy

February 2, 2008

AMS subject classification: Primary 60G70 Secondary 60G15


Short Title: Distribution of the Maximum.
Key words and phrases: Gaussian fields, Rice Formula, Euler-Poincare Characteristic, Distribution of the Maximum, Density of the Maximum, Random Matrices.
Abstract
We study the probability distribution F (u) of the maximum of smooth Gaussian fields
defined on compact subsets of Rd having some geometric regularity.
Our main result is a general expression for the density of F . Even though this is an
implicit formula, one can deduce from it explicit bounds for the density, hence for the
distribution, as well as improved expansions for 1 F (u) for large values of u.
The main tool is the Rice formula for the moments of the number of roots of a random
system of equations over the reals.
This method enables also to study second order properties of the expected Euler Characteristic approximation using only elementary arguments and to extend these kind of results
to some interesting classes of Gaussian fields. We obtain more precise results for the direct method to compute the distribution of the maximum, using spectral theory of GOE
random matrices.

Introduction and notations

Let X = {X(t) : t S} be a real-valued random field defined on some parameter set S and
M := suptS X(t) its supremum.
The study of the probability distribution of the random variable M , i.e. the function
FM (u) := P{M u} is a classical problem in probability theory. When the process is Gaussian,
general inequalities allow to give bounds on 1 FM (u) = P{M > u} as well as asymptotic
results for u +. A partial account of this well established theory, since the founding paper
by Landau and Shepp [20] should contain - among a long list of contributors - the works of
Marcus and Shepp [24], Sudakov and Tsirelson [30], Borell [13] [14], Fernique [17], Ledoux and
Talagrand [22], Berman [11] [12], Adler[2], Talagrand [32] and Ledoux[21].
During the last fifteen years, several methods have been introduced with the aim of obtaining more precise results than those arising from the classical theory, at least under certain
restrictions on the process X , which are interesting from the point of view of the mathematical
theory as well as in many significant applications. These restrictions include the requirement

This work was supported by ECOS program U03E01.


Laboratoire de Statistique et Probabilites. UMR-CNRS C5583 Universite Paul Sabatier. 118, route de
Narbonne. 31062 Toulouse Cedex 4. France.

Centro de Matem
atica. Facultad de Ciencias. Universidad de la Rep
ublica. Calle Igua 4225. 11400 Montevideo. Uruguay.

the domain S to have certain finite-dimensional geometrical structure and the paths of the
random field to have a certain regularity.
Some examples of these contributions are the double sum method by Piterbarg [28]; the
Euler-Poincare Characteristic (EPC) approximation, Taylor, Takemura and Adler [34], Adler
and Taylor [3]; the tube method, Sun [31] and the well- known Rice method, revisited by Azas
and Delmas [5], Azas and Wschebor [6]. See also Rychlik [29] for numerical computations.
The results in the present paper are based upon Theorem 3 which is an extension of Theorem
3.1 in Azas and Wschebor [8] allowing to express the density pM of FM by means of a general
formula. Even though this is an exact formula, it is only implicit as an expression for the
density, since the relevant random variable M appears in the right-hand side. However, it can
be usefully employed for various purposes.
First, one can use Theorem 3 to obtain bounds for pM (u) and thus for P{M > u} for
every u by means of replacing some indicator function in (4) by the condition that the normal
derivative is extended outward (see below for the precise meaning). This will be called the
direct method. Of course, this may be interesting whenever the expression one obtains can
be handled, which is the actual situation when the random field has a law which is stationary
and isotropic. Our method relies on the application of some known results on the spectrum of
random matrices.
Second, one can use Theorem 3 to study the asymptotics of P{M > u} as u +. More
precisely, one wants to write, whenever it is possible
P{M > u} = A(u) exp

1 u2
2 2

+ B(u)

(1)

where A(u) is a known function having polynomially bounded growth as u +, 2 =


suptS Var(X(t)) and B(u) is an error bounded by a centered Gaussian density with variance
12 , 12 < 2 . We will call the first (respectively the second) term in the right-hand side of (1)
the first (resp second) order approximation of P{M > u}.
First order approximation has been considered in [3] [34] by means of the expectation of the
EPC of the excursion set Eu := {t S : X(t) > u}. This works for large values of u. The same
authors have considered the second order approximation, that is, how fast does the difference
between P{M > u} and the expected EPC tend to zero when u +.
We will address the same question both for the direct method and the EPC approximation method. Our results on the second order approximation only speak about the size of the
variance of the Gaussian bound. More precise results are only known to the authors in the
special case where S is a compact interval of the real line, the Gaussian process X is stationary
and satisfies a certain number of additional requirements (see Piterbarg [28] and Azas et al. [4]).
Theorem 5 is our first result in this direction. It gives a rough bound for the error B(u) as
u +, in the case the maximum variance is attained at some strict subset of the face in S
having the largest dimension. We are not aware of the existence of other known results under
similar conditions.
In Theorem 6 we consider processes with constant variance. This is close to Theorem 4.3
in [34]. Notice that Theorem 6 has some interest only in case suptS t < , that is, when
one can assure that 12 < 2 in (1). This is the reason for the introduction of the additional
hypothesis (S) < on the geometry of S, (see below (64) for the definition of (S)), which
is verified in some relevant situations (see the discussion before the statement of Theorem 6).
In Theorem 7, S is convex and the process stationary and isotropic. We compute the exact
asymptotic rate for the second order approximation as u + corresponding to the direct
2

method.
In all cases, the second order approximation for the direct method provides an upper bound
for the one arising from the EPC method.
Our proofs use almost no differential geometry, except for some elementary notions in Euclidean space. Let us remark also that we have separated the conditions on the law of the
process from the conditions on the geometry of the parameter set.
Third, Theorem 3 and related results in this paper, in fact refer to the density pM of
the maximum. On integration, they imply immediately a certain number of properties of the
probability distribution FM , such as the behaviour of the tail as u +.
Theorem 3 implies that FM has a density and we have an implicit expression for it. The
proof of this fact here appears to be simpler than previous ones (see Azas and Wschebor [8])
even in the case the process has 1-dimensional parameter (Azas and Wschebor [7]). Let us
remark that Theorem 3 holds true for non-Gaussian processes under appropriate conditions
allowing to apply Rice formula.
Our method can be exploited to study higher order differentiability of FM (as it has been
done in [7] for one-parameter processes) but we will not pursue this subject here.
This paper is organized as follows:
Section 2 includes an extension of Rice Formula which gives an integral expression for the
expectation of the weighted number of roots of a random system of d equations with d real
unknowns. A complete proof of this formula in a form which is adapted to our needs in this
paper, can be found in [9]. There is an extensive literature on Rice formula in various contexts
(see for example Belayiev [10] , Cramer-Leadbetter [15], Marcus [23], Adler [1], Wschebor [35].
In Section 3, we obtain the exact expression for the distribution of the maximum as a consequence of the Rice-like formula of the previous section. This immediately implies the existence
of the density and gives the implicit formula for it. The proof avoids unnecessary technicalities
that we have used in previous work, even in cases that are much simpler than the ones considered here.
In Section 4, we compute (Theorem 4) the first order approximation in the direct method
for stationary isotropic processes defined on a polyhedron, from which a new upper bound for
P{M > u} for all real u follows.
In Section 5, we consider second order approximation, both for the direct method and the
EPC approximation method. This is the content of Theorems 5, 6 and 7.
Section 6 contains some examples.

Assumptions and notations


X = {X(t) : t S} denotes a real-valued Gaussian field defined on the parameter set S. We
assume that S satisfies the hypothesis A1
A1 :
S is a compact subset of Rd

S is the disjoint union of Sd , Sd1 ..., S0 , where Sj is an orientable C 3 manifold of dimension


j without boundary. The Sj s will be called faces. Let Sd0 , d0 d be the non empty face
having largest dimension.
We will assume that each Sj has an atlas such that the second derivatives of the inverse
functions of all charts (viewed as diffeomorphisms from an open set in Rj to Sj ) are
bounded by a fixed constant. For t Sj , we denote Lt the maximum curvature of Sj at
the point t. It follows that Lt is bounded for t S.
Notice that the decomposition S = Sd ... S0 is not unique.
Concerning the random field we make the following assumptions A2-A5
A2 : X is in fact defined on an open set containing S and has C 2 paths
A3 : for every t S the distribution of X(t), X (t) does not degenerate; for every s, t S,
s = t, the distribution of X(s), X(t) does not degenerate.
A4 : Almost surely the maximum of X(t) on S is attained at a single point.
(t) denote respectively the derivative along S and the normal derivaFor t Sj , Xj (t) Xj,N
j
d
tive. Both quantities are viewed as vectors in R , and the density of their distribution will be
expressed respectively with respect to an orthonormal basis of the tangent space Tt,j of Sj at
the point t, or its orthogonal complement Nt,j . Xj (t) will denote the second derivative of X
along Sj , at the point t Sj and will be viewed as a matrix expressed in an orthogonal basis
of Tt,j . Similar notations will be used for any function defined on Sj .

A5 : Almost surely, for every j = 0, 1, . . . , d there is no point t in Sj such that Xj (t) = 0,


det(Xj (t)) = 0
Other notations and conventions will be as follows :
j is the geometric measure on Sj .
m(t) := E(X(t)), r(s, t) = Cov(X(s), X(t)) denote respectively the expectation and covariance of the process X ; r0,1 (s, t), r0,2 (s, t) are the first and the second derivatives of r
with respect to t. Analogous notations will be used for other derivatives without further
reference.
If is a random variable taking values in some Euclidean space, p (x) will denote the
density of its probability distribution with respect to the Lebesgue measure, whenever it
exists.
(x) = (2)1/2 exp(x2 /2) is the standard Gaussian density ; (x) :=

x
(y)dy.

Assume that the random vectors , have a joint Gaussian distribution, where has
values in some finite dimensional Euclidean space. When it is well defined,
E(f ()/ = x)
is the version of the conditional expectation obtained using Gaussian regression.
Eu := {t S : X(t) > u} is the excursion set above u of the function X(.) and Au :=
{M u} is the event that the maximum is not larger than u.
, , , denote respectively inner product and norm in a finite-dimensional real Euclidean
space; d is the Lebesgue measure on Rd ; S d1 is the unit sphere ; Ac is the complement
of the set A. If M is a real square matrix, M 0 denotes that it is positive definite.
4

If g : D C is a function and u C, we denote

Nug (D) := {t D : g(t) = u}

which may be finite or infinite.

Some remarks on the hypotheses


One can give simple sufficient additional conditions on the process X so that A4 and A5 hold
true.
If we assume that for each pair j, k = 0, . . . , d and each pair of distinct points s, t, s Sj , t
Sk , the distribution of the triplet
X(t) X(s), Xj (s), Xk (t))
does not degenerate in R Rj Rk , then A4 holds true.

This is well-known and follows easily from the next lemma (called Bulinskaya s lemma)
that we state without proof, for completeness.
Lemma 1 Let Z(t) be a stochastic process defined on some neighborhood of a set T embedded
in some Euclidean space. Assume that the Hausdorff dimension of T is smaller or equal than
the integer m and that the values of Z lie in Rm+k for some positive integer k . Suppose, in
addition, that Z has C 1 paths and that the density pZ(t) (v) is bounded for t T and v in some
neighborhood of u Rm+k . Then, a. s. there is no point t T such that Z(t) = u.
With respect to A5, one has the following sufficient conditions: Assume A1, A2, A3 and as
additional hypotheses one of the following two:
t

X(t) is of class C 3

sup
tS,x V (0)

P | det X (t) | < /X (t) = x 0,

as 0,

where V (0) is some neighborhood of zero.


Then A5 holds true. This follows from Proposition 2.1 of [8] and [16].

Rice formula for the number of weighted roots of random


fields

In this section we review Rice formula for the expectation of the number of roots of a random
system of equations. For proofs, see for example [8], or [9], where a simpler one is given.
Theorem 1 (Rice formula) Let Z : U Rd be a random field, U an open subset of Rd and
u Rd a fixed point in the codomain. Assume that:
(i) Z is Gaussian,
(ii) almost surely the function t
Z(t) is of class C 1 ,
(iii) for each t U , Z(t) has a non degenerate distribution (i.e. Var Z(t) 0),
(iv) P{t U, Z(t) = u, det Z (t) = 0} = 0
Then, for every Borel set B contained in U , one has
E NuZ (B) =

E | det(Z (t))|/Z(t) = u pZ(t) (u)dt.

If B is compact, then both sides in (2) are finite.


5

(2)

Theorem 2 Let Z be a random field that verifies the hypotheses of Theorem 1. Assume that

for each t U one has another random field Y t : W Rd , where W is some topological space,
verifying the following conditions:
a) Y t (w) is a measurable function of (, t, w) and almost surely, (t, w)
ous.

Y t (w) is continu-

Z(s), Y t (w) defined on U W is Gaussian.

b) For each t U the random process (s, w)

Moreover, assume that g : U C(W, Rd ) R is a bounded function, which is continuous when

one puts on C(W, Rd ) the topology of uniform convergence on compact sets. Then, for each
compact subset I of U , one has
g(t, Y t ) =

E
tI,Z(t)=u

E | det(Z (t)|g(t, Y t )/Z(t) = u).pZ(t) (u)dt.

(3)

Remarks:
1. We have already mentioned in the previous section sufficient conditions implying hypothesis (iv) in Theorem 1.
2. With the hypotheses of Theorem 1 it follows easily that if J is a subset of U , d (J) = 0,
then P{NuZ (J) = 0} = 1 for each u Rd .

The implicit formula for the density of the maximum

Theorem 3 Under assumptions A1 to A5, the distribution of M has the density


E 1IAx /X(t) = x pX(t) (x)

pM (x) =
tS0
d

+
j=1

Sj

E | det(Xj (t))| 1IAx /X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt),

(4)

Remark: One can replace | det(Xj (t))| in the conditional expectation by (1)j det(Xj (t)),
since under the conditioning and whenever M x holds true, Xj (t) is negative semi-definite.
Proof of Theorem 3
Let Nj (u), j = 0, . . . , d be the number of global maxima of X(.) on S that belong to Sj and are
larger than u. From the hypotheses it follows that a.s.
j=0,...,d Nj (u) is equal to 0 or 1, so
that
P{M > u} =
P{Nj (u) = 1} =
E(Nj (u)).
(5)
j=0,...,d

j=0,...,d

The proof will be finished as soon as we show that each term in (5) is the integral over (u, +)
of the corresponding term in (4).
This is self-evident for j = 0. Let us consider the term j = d. We apply the weighted Rice
formula of Section 2 as follows :
Z is the random field X defined on Sd .
For each t Sd , put W = S and Y t : S R2 defined as:
Y t (w) := X(w) X(t), X(t) .
Notice that the second coordinate in the definition of Y t does not depend on w.
6

In the place of the function g, we take for each n = 1, 2, . . . the function gn defined as
follows:
gn (t, f1 , f2 ) = gn (f1 , f2 ) = 1 Fn (sup f1 (w)) . 1 Fn (u f2 (w)) ,
wS

where w is any point in W and for n a positive integer and x 0, we define :


Fn (x) := F(nx) ;

with F(x) = 0 if 0 x 1/2 , F(x) = 1 if x 1 ,

(6)

and F monotone non-decreasing and continuous.


It is easy to check that all the requirements in Theorem 2 are satisfied, so that, for the value 0
instead of u in formula (3) we get:
gn (Y t ) =

E
tSd ,X (t)=0

Sd

E | det(X (t)|gn (Y t )/X (t) = 0).pX (t) (0)d (dt).

(7)

Notice that the formula holds true for each compact subset of Sd in the place of Sd , hence for
Sd itself by monotone convergence.
Let now n in (7). Clearly gn (Y t ) 1IX(s)X(t)0,sS . 1IX(t)u . The passage to the limit
does not present any difficulty since 0 gn (Y t ) 1 and the sum in the left-hand side is bounded

by the random variable N0X (Sd ), which is in L1 because of Rice Formula. We get
E(Nd (u)) =
Sd

E | det(X (t)| 1IX(s)X(t)0,sS 1IX(t)u /X (t) = 0).pX (t) (0)d (dt)

Conditioning on the value of X(t), we obtain the desired formula for j = d.


The proof for 1 j d 1 is essentially the same, but one must take care of the parameterization of the manifold Sj . One can first establish locally the formula on a chart of Sj , using
local coordinates.
It can be proved as in [8], Proposition 2.2 (the only modification is due to the term 1IAx )
that the quantity written in some chart as
E det(Y (s)) 1IAx /Y (s) = x, Y (s) = 0 pY (s),Yj (s) (x, 0)ds,
where the process Y (s) is the process X written in some chart of Sj ,
(Y (s) = X(1 (s))), defines a j-form. By a j-form we mean a mesure on Sj that does not
depend on the parameterization and which has a density with respect to the Lebesgue measure
ds in every chart. It can be proved also that the integral of this j-form on Sj gives the
expectation of Nj (u).
To get formula (2) it suffices to consider locally around a precise point t Sj the chart
given by the projection on the tangent space at t. In this case we obtain that at t
ds is in fact j (dt)
Y (s) is isometric to Xj (t)
where s = (t).
The first consequence of Theorem 3 is the next corollary. For the statement, we need to
introduce some further notations.
For t in Sj , j d0 we define Ct,j as the closed convex cone generated by the set of directions:
{ Rd :

= 1 ; sn S, (n = 1, 2, . . .) such that sn t,
7

t sn
as n +},
t sn

whenever this set is non-empty and Ct,j = {0} if it is empty. We will denote by Ct,j the dual
cone of Ct,j , that is:
Ct,j := {z Rd : z, 0 for all Ct,j }.
Notice that these definitions easily imply that Tt,j Ct,j and Ct,j Nt,j . Remark also that for
j = d0 , Ct,j = Nt,j .
We will say that the function X(.) has an extended outward derivative at the point t in
(t) C .
Sj , j d0 if Xj,N
t,j
Corollary 1 Under assumptions A1 to A5, one has :
(a) pM (x) p(x) where
E 1IX (t)Cbt,0 /X(t) = x pX(t) (x)+

p(x) :=
tS0
d0
Sj

j=1

E | det(Xj (t))| 1IX

j,N (t)Ct,j

/X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt). (8)

(b) P{M > u}

p(x)dx.
u

Proof
(a) follows from Theorem 3 and the observation that if t Sj , one has
(t) C }. (b) is an obvious consequence of (a).
{M X(t)} {Xj,N
t,j
The actual interest of this Corollary depends on the feasibility of computing p(x). It turns
out that it can be done in some relevant cases, as we will see in the remaining of this section.
+
Our result can be compared with the approximation of P{M > u} by means of u pE (x)dx
given by [3], [34] where
pE (x) :=

E 1IX (t)Cbt,0 /X(t) = x pX(t) (x)


tS0

d0

(1)j

+
j=1

Sj

E det(Xj (t)) 1IX

j,N (t)Ct,j

/X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt). (9)

Under certain conditions , u pE (x)dx is the expected value of the EPC of the excursion set
Eu (see [3]). The advantage of pE (x) over p(x) is that one can have nice expressions for it in
quite general situations. Conversely p(x) has the obvious advantage that it is an upper-bound
of the true density pM (x) and hence provides upon integrating once, an upper-bound for the
tail probability, for every u value. It is not known whether a similar inequality holds true for
pE (x).
On the other hand, under additional conditions, both provide good first order approximations
for pM (x) as x as we will see in the next section. In the special case in which the process
X is centered and has a law that is invariant under isometries and translations, we describe
below a procedure to compute p(x).

Computing p(x) for stationary isotropic Gaussian fields

For one-parameter centered Gaussian process having constant variance and satisfying certain
regularity conditions, a general bound for pM (x) has been computed in [8], pp.75-77. In the
two parameter case, Mercadier [26] has shown a bound for P{M > u}, obtained by means of a
method especially suited to dimension 2. When the parameter is one or two-dimensional, these
bounds are sharper than the ones below which, on the other hand, apply to any dimension but
to a more restricted context. We will assume now that the process X is centered Gaussian,
with a covariance function that can be written as
E X(s).X(t) = s t

(10)

where : R+ R is of class C 4 . Without loss of generality, we assume that (0) = 1.


Assumption (10) is equivalent to saying that the law of X is invariant under isometries (i.e.
linear transformations that preserve the scalar product) and translations of the underlying
parameter space Rd .
We will also assume that the set S is a polyhedron. More precisely we assume that each
Sj (j = 1, . . . , d) is a union of subsets of affine manifolds of dimension j in Rd .
The next lemma contains some auxiliary computations which are elementary and left to the
reader. We use the abridged notation : := (0), := (0)
Lemma 2 Under the conditions above, for each t U , i, i , k, k , j = 1, . . . , d:
1. E

X
ti (t).X(t)

2. E

X
X
ti (t). tk (t)

3. E

2X
ti tk (t).X(t)

4. E

2X
2X
ti tk (t). ti tk (t)

= 0,
= 2 ik and < 0,
= 2 ik , E

2X
X
ti tk (t). tj (t)

=0

= 24 ii .kk + i k .ik + ik i k ,

5. 2 0
6. If t Sj , the conditional distribution of Xj (t) given X(t) = x, Xj (t) = 0 is the same as
the unconditional distribution of the random matrix
Z + 2 xIj ,
where Z = (Zik : i, k = 1, . . . , j) is a symmetric j j matrix with centered Gaussian
entries, independent of the pair X(t), X (t) such that, for i k, i k one has :
E(Zik Zi k ) = 4 2 ii + ( 2 ) ik i k + 4 ii .kk (1 ik ) .
Let us introduce some additional notations:
Hn (x), n = 0, 1, . . . are the standard Hermite polynomials, i.e.
Hn (x) := ex

n x2

For the properties of the Hermite polynomials we refer to Mehta [25].


H n (x), n = 0, 1, . . . are the modified Hermite polynomials, defined as:
H n (x) := ex

2 /2

n x2 /2

We will use the following result:


Lemma 3 Let

Jn (x) :=

ey

2 /2

Hn ()dy, n = 0, 1, 2, . . .

(11)

where stands for the linear form = ay + bx where a, b are some real parameters that satisfy
a2 + b2 = 1/2. Then

Jn (x) := (2b)n 2 H n (x).


Proof :
It is clear that Jn is a polynomial having degree n. Differentiating in (11) under the integral
sign, we get:
Jn (x) = b

ey

2 /2

Hn ()dy = 2nb

Also:

Jn (0) =

ey

2 /2

Hn1 ()dy = 2n b Jn1 (x)

(12)

ey

2 /2

Hn (ay)dy,

so that Jn (0) = 0 if n is odd.


If n is even, n 2, using the standard recurrence relations for Hermite polynomials, we have:
+

Jn (0) =

ey

2 /2

= 2a2

ey

2ayHn1 (ay) 2(n 1)Hn2 (ay) dy


2 /2

(ay)dy 2(n 1)Jn2 (0)


Hn1

= 4b2 (n 1)Jn2 (0).

Equality (13) plus J0 (x) = 2 for all x R, imply that:

(2p)!
2.
J2p (0) = (1)p (2b)2p (2p 1)!! 2 = (2b2 )p
p!

(13)

(14)

Now we can go back to (12) and integrate successively for n = 1, 2, . . . on the interval [0, x]
using the initial value given by (14) when n = 2p and Jn (0) = 0 when n is odd, obtaining :

Jn (x) = (2b)n 2Qn (x),


where the sequence of polynomials Qn , n = 0, 1, 2, . . . verifies the conditions:
Q0 (x) = 1

(15)

Qn (x)

= nQn (x)

(16)

Qn (0) = 0 if n is odd

(17)

Qn (0) = (1)n/2 (n 1)!! if n is even.

(18)

It is now easy to show that in fact Qn (x) = H n (x) , n = 0, 1, 2, . . . using for example that:
x
H n (x) = 2n/2 Hn .
2

The integrals

In (v) =

2 /2

et

Hn (t)dt,

will appear in our computations. They are computed in the next Lemma, which can be proved
easily, using the standard properties of Hermite polynomials.
10

Lemma 4 (a)
[ n1
]
2
v2 /2

(n 1)!!
Hn12k (v)
(n 1 2k)!!

(n 1)!! 2 (x)

2k

In (v) = 2e

k=0
n

+ 1I{n even} 2 2
(b)

In () = 1I{n even} 2 2 (n 1)!!

(19)
(20)
(21)

Theorem 4 Assume that the process X is centered Gaussian, satisfies conditions A1-A5 with
a covariance having the form (10) and verifying the regularity conditions of the beginning of this
section. Moreover, let S be a polyhedron. Then, p(x) can be expressed by means of the following
formula:

d0

| | j/2
p(x) = (x)
H j (x) + Rj (x) gj
0 (t) +
,
(22)

j=1

tS0

where

gj is a geometric parameter of the face Sj defined by


j (t)j (dt),

gj =

(23)

Sj

where j (t) is the normalized solid angle of the cone Ct,j in Nt,j , that is:
j (t) =

dj1 (Ct,j S dj1 )


for j = 0, . . . , d 1,
dj1 (S dj1 )

d (t) = 1.

(24)
(25)

Notice that for convex or other usual polyhedra j (t) is constant for t Sj , so that gj is
equal to this constant multiplied by the j-dimensional geometric measure of Sj .
For j = 1, . . . d,
Rj (x) =

2
| |

j
2

((j + 1)/2

y2
dy
2

(26)

with := | |( )1/2

(27)

Tj (v) exp

where
v := (2)1/2 (1 2 )1/2 y x
and

j1

Tj (v) :=
k=0

Hk2 (v) v2 /2
Hj (v)
Ij1 (v).
e
j
2k k!
2 (j 1)!

(28)

where In is given in the previous Lemma.


For the proof of the theorem, we need some ingredients from random matrices theory.
Following Mehta [25], denote by qn () the density of eigenvalues of n n GOE matrices at the
point , that is, qn ()d is the probability of Gn having an eigenvalue in the interval (, + d).
The random nn real random matrix Gn is said to have the GOE distribution, if it is symmetric,
2 ) = 1/2 if i < k
with centered Gaussian entries gik , i, k = 1, . . . , n satisfying E(gii2 ) = 1, E(gik
11

and the random variables: {gik , 1 i k n} are independent.


It is well known that:
n1

2 /2

qn () = e

2 /2

c2k Hk2 ()
k=0
+

+ 1/2 (n/2)1/2 cn1 cn Hn1 ()

ey

+ 1I{n odd

2 /2

Hn (y)dy 2

ey

2 /2

Hn (y)dy

Hn1 ()
,
+ y 2 /2
e
H
(y)dy
n1

(29)

where ck := (2k k! )1/2 , k = 0, 1, . . ., (see Mehta [25], ch. 7.)


In the proof of the theorem we will use the following remark due to Fyodorov [18] that we
state as a Lemma
Lemma 5 Let Gn be a GOE n n matrix. Then, for R one has:
E | det(Gn In )| = 23/2 (n + 3)/2 exp( 2 /2)

qn+1 ()
,
n+1

(30)

Proof:
Denote by 1 , . . . , n the eigenvalues of Gn . It is well-known (Mehta [25], Kendall et al. [19])
that the joint density fn of the n-tuple of random variables (1 , . . . , n ) is given by the formula
n

n
2
i=1 i

fn (1 , . . . , n ) = cn exp

1i<kn

|k i | , with cn := (2)n/2 ((3/2))n

(1+i/2)
i=1

Then,
n

E | det(Gn In )| = E

i=1

|i |

=
Rn i=1

|i |cn exp(

= e

2 /2

cn
cn+1

Rn

n
2
i=1 i

)
1i<kn

|k i | d1 , . . . , dn

fn+1 (1 , . . . , n , )d1 , . . . , dn = e

2 /2

cn qn+1 ()
.
cn+1 n + 1

The remainder is plain.


Proof of Theorem 4:
We use the definition (8) given in Corollary 1 and the moment computations of Lemma 2 which
imply that:
pX(t) (x) = (x)

(31)
j/2

pX(t),Xj (t) (x, 0) = (x)(2)

j/2

(2 )

X (t) is independent of X(t)

Xj,N
(t)

is independent of

(33)

(Xj (t), X(t), Xj (t)).

Since the distribution of X (t) is centered Gaussian with variance 2 Id , it follows that :
E( 1IX (t)Cbt,0 /X(t) = x) = 0 (t)
12

(32)

if t S0 ,

(34)

and if t Sj , j 1:
E(| det(Xj (t))| 1IX

j,N (t)Ct,j

/X(t) = x, Xj (t) = 0)

= j (t) E(| det(Xj (t))|/X(t) = x, Xj (t) = 0)


= j (t) E(| det(Z + 2 xIj )|). (35)
In the formula above, j (t) is the normalized solid angle defined in the statement of the theorem
and the random j j real matrix Z has the distribution of Lemma 2 .
A standard moment computations shows that Z has the same distribution as the random matrix:
2 Ij ,

8 Gj + 2

where Gj is a j j GOE random matrix, is standard normal in R and independent of Gj .


So, for j 1 one has
+

E | det(Z + 2 xIj )| = (8 )j/2

E | det(Gj Ij )| (y)dy,

where is given by (27).


For the conditional expectation in (8) use this last expression in (35) and (5). For the density
in (8) use (32). Then Lemma 3 gives (22).

Remarks on the theorem


The principal term is
(x)

d0

0 (t) +
j=1

tS0

| |

j/2

H j (x) gj

(36)

which is the product of a standard Gaussian density times a polynomial with degree d0 .
Integrating once, we get -in our special case- the formula for the expectation of the EPC
of the excursion set as given by [3]
The complementary term given by
d0

Rj (x)gj ,

(x)

(37)

j=1

can be computed by means of a formula, as it follows from the statement of the theorem
above. These formulae will be in general quite unpleasant due to the complicated form of
Tj (v). However, for low dimensions they are simple. For example:

2 (v) v(1 (v)) ,

T2 (v) = 2 2(v),

T1 (v) =

T3 (v) =

3(2v 2 + 1)(v) (2v 2 3)v(1 (v)) .


2

13

(38)
(39)
(40)

Second order asymptotics for pM (x) as x + will be mainly considered in the next
section. However, we state already that the complementary term (37) is equivalent, as
x +, to
12

(x) gd0 Kd0 x2d0 4 e

2
3 2

x2

(41)

where the constant Kj , j = 1, 2, ... is given by:


Kj = 23j2

j+1

2
j/4
j/2
3 2
(2) (j 1)!

2j4

(42)

We are not going to go through this calculation, which is elementary but requires some
work. An outline of it is the following. Replace the Hermite polynomials in the expression
for Tj (v) given by (28) by the well-known expansion:
[j/2]

(1)i

Hj (v) = j!
i=0

(2v)j2i
i!(j 2i)!

(43)

and Ij1 (v) by means of the formula in Lemma 4.


Evaluating the term of highest degree in the polynomial part, this allows to prove that,
as v +, Tj (v) is equivalent to

v2
2j1
v 2j4 e 2 .
(j 1)!

(44)

Using now the definition of Rj (x) and changing variables in the integral in (26), one gets
for Rj (x) the equivalent:
12

Kj x2j4 e

2
3 2

x2

(45)

In particular, the equivalent of (37) is given by the highest order non-vanishing term in
the sum.
Consider now the case in which S is the sphere S d1 and the process satisfies the same
conditions as in the theorem. Even though the theorem can not be applied directly,
it is possible to deal with this example to compute p(x), only performing some minor
changes. In this case, only the term that corresponds to j = d 1 in (8) does not vanish,
= 1 for each t S d1 and one can use invariance
Ct,d1 = Nt,d1 , so that 1IX
b
(t)C
d1,N

t,d1

under rotations to obtain:


p(x) = (x)

d1 S d1
E | det(Z + 2 xId1 ) + (2| |)1/2 Id1 |
(2)(d1)/2

(46)

where Z is a (d 1) (d 1) centered Gaussian matrix with the covariance structure


of Lemma 2 and is a standard Gaussian real random variable, independent of Z. (46)
follows from the fact that the normal derivative at each point is centered Gaussian with

14

variance 2| | and independent of the tangential derivative. So, we apply the previous
computation, replacing x by x + (2| |)1/2 and obtain the expression:
p(x) = (x)
+

2 d/2
(d/2)
| | (d1)/2
H d1 (x + (2| |)1/2 y) + Rd1 (x + (2| |)1/2 y) (y)dy.

(47)

Asymptotics as x +

In this section we will consider the errors in the direct and the EPC methods for large values
of the argument x. Theses errors are:
p(x) pM (x) =

E 1IX (t)Cbt,0 . 1IM >x /X(t) = x pX(t) (x)


tS0

d0

+
j=1

Sj

E | det(Xj (t)| 1IX

j,N (t)Ct,j

pE (x) pM (x) =

. 1IM >x /X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt). (48)

E 1IX (t)Cbt,0 . 1IM >x /X(t) = x pX(t) (x)


tS0

d0

(1)j

+
j=1

Sj

E det(Xj (t) 1IX

j,N (t)Ct,j

. 1IM >x /X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt).


(49)

It is clear that for every real x,


|pE (x) pM (x)| p(x) pM (x)
so that the upper bounds for p(x) pM (x) will automatically be upper bounds for
|pE (x) pM (x)|. Moreover, as far as the authors know, no better bounds for |pE (x) pM (x)|
than for p(x) pM (x) are known. It is an open question to determine if there exist situations
in which pE (x) is better asymptotically than p(x).
Our next theorem gives sufficient conditions allowing to ensure that the error
p(x) pM (x)
is bounded by a Gaussian density having strictly smaller variance than the maximum variance
of the given process X , which means that the error is super- exponentially smaller than pM (x)
itself, as x +. In this theorem, we assume that the maximum of the variance is not attained
in S\Sd0 . This excludes constant variance or some other stationary-like condition that will be
addressed in Theorem 6. As far as the authors know, the result of Theorem 5 is new even for
one-parameter processes defined on a compact interval.
For parameter dimension d0 > 1, the only result of this type for non-constant variance
processes of which the authors are aware is Theorem 3.3 of [34].
Theorem 5 Assume that the process X satisfies conditions A1 -A5. With no loss of generality,
we assume that maxtS Var(X(t)) = 1. In addition, we will assume that the set Sv of points
t S where the variance of X(t) attains its maximal value is contained in Sd0 (d0 > 0) the
non-empty face having largest dimension and that no point in Sv is a boundary point of S\Sd0 .
Then, there exist some positive constants C, such that for every x > 0.
|pE (x) pM (x)| p(x) pM (x) C(x(1 + )),
where (.) is the standard normal density.
15

(50)

Proof :
Let W be an open neighborhood of the compact subset Sv of S such that dist(W, (S\Sd0 )) > 0
where dist denote the Euclidean distance in Rd . For t Sj W c , the density
pX(t),Xj (t) (x, 0)
can be written as the product of the density of Xj (t) at the point 0, times the conditional density
of X(t) at the point x given that Xj (t) = 0, which is Gaussian with some bounded expectation
and a conditional variance which is smaller than the unconditional variance, hence, bounded by
some constant smaller than 1. Since the conditional expectations in (48) are uniformly bounded
by some constant, due to standard bounds on the moments of the Gaussian law, one can deduce
that:
p(x) pM (x) =

W Sd0

E | det(Xd0 (t))| 1IX

d0 ,N (t)Ct,d0

.pX(t),Xd

. 1IM >x /X(t) = x, Xd 0 (t) = 0

(t) (x, 0)d0 (dt)

+ O(((1 + 1 )x)), (51)

as x +, for some 1 > 0. Our following task is to choose W such that one can assure
that the first term in the right hand-member of (51) has the same form as the second, with a
possibly different constant 1 .
To do this , for s S and t Sd0 , let us write the Gaussian regression formula of X(s) on the
pair (X(t), Xd 0 (t)):
X(s) = at (s)X(t) + bt (s), Xd 0 (t) +

ts
2

X t (s).

(52)

where the regression coefficients at (s), bt (s) are respectively real-valued and Rd0 -valued.
From now onwards, we will only be interested in those t W . In this case, since W does not
contain boundary points of S\Sd0 , it follows that
Ct,d0 = Nt,d0 and 1IX

d0 ,N (t)Ct,d0

= 1.

Moreover, whenever s S is close enough to t, necessarily, s Sd0 and one can show that
the Gaussian process {X t (s) : t W Sd0 , s S} is bounded, in spite of the fact that its
trajectories are not continuous at s = t. For each t, {X t (s) : s S} is a helix process, see [8]
for a proof of boundedness.
On the other hand, conditionally on X(t) = x, Xd 0 (t) = 0 the event {M > x} can be written as
{X t (s) > t (s) x, for some s S}
where
t (s) =

2(1 at (s))
.
ts 2

(53)

Our next goal is to prove that if one can choose W in such a way that
inf{ t (s) : t W Sd0 , s S, s = t} > 0,

(54)

then we are done. In fact, apply the Cauchy-Schwarz inequality to the conditional expectation
in (51). Under the conditioning, the elements of Xd0 (t) are the sum of affine functions of x
with bounded coefficients plus centered Gaussian variables with bounded variances, hence, the
absolute value of the conditional expectation is bounded by an expression of the form
Q(t, x)

1/2

X t (s)
>x
t
sS\{t} (s)
sup
16

1/2

(55)

where Q(t, x) is a polynomial in x of degree 2d0 with bounded coefficients. For each t W Sd0 ,
the second factor in (55) is bounded by
P sup

X t (s)
: t W Sd0 , s S, s = t > x
t (s)

1/2

Now, we apply to the bounded separable Gaussian process


X t (s)
: t W Sd0 , s S, s = t
t (s)
the classical Landau-Shepp-Fernique inequality [20], [17] which gives the bound
P sup

X t (s)
: t W Sd0 , s S, s = t > x C2 exp(2 x2 ),
t (s)

for some positive constants C2 , 2 and any x > 0. Also, the same argument above for the density
pX(t),Xd (t) (x, 0) shows that it is bounded by a constant times the standard Gaussian density.
0
To finish, it suffices to replace these bounds in the first term at the right-hand side of (51).
It remains to choose W for (54) to hold true. Consider the auxiliary process
Y (s) :=

X(s)
r(s, s)

, s S.

(56)

Clearly, Var(Y (s)) = 1 for all s S. We set


r Y (s, s ) := Cov(Y (s), Y (s )) , s, s S.
Let us assume that t Sv . Since the function s
Var(X(s)) attains its maximum value at

s = t, it follows that X(t), Xd0 (t) are independent, on differentiation under the expectation
sign. This implies that in the regression formula (52) the coefficients are easily computed and
at (s) = r(s, t) which is strictly smaller than 1 if s = t, because of the non-degeneracy condition.
Then
2(1 r(s, t))
2(1 r Y (s, t))
t (s) =

.
(57)
ts 2
ts 2

Since r Y (s, s) = 1 for every s S, the Taylor expansion of r Y (s, t) as a function of s, around
s = t takes the form:
Y
r Y (s, t) = 1 + s t, r20,d
(t, t)(s t) + o( s t 2 ),
0

(58)

where the notation is self-explanatory.


Also, using that Var(Y (s)) = 1 for s S, we easily obtain:
Y
(t, t) = Var(Yd0 (t)) = Var(Xd 0 (t))
r20,d
0,

(59)

where the last equality follows by differentiation in (56) and putting s = t. (59) implies that
Y
(t, t) is uniformly positive definite on t Sv , meaning that its minimum eigenvalue has
r20,d
0,
a strictly positive lower bound. This, on account of (57) and (58), already shows that
inf{ t (s) : t Sv , s S, s = t} > 0,

(60)

The foregoing argument also shows that


inf{ (at )d0 (t) : t Sv , S d0 1 , s = t} > 0,
17

(61)

since whenever t Sv , one has at (s) = r(s, t) so that


(at )d0 (t) = r20,d0 , (t, t).
To end up, assume there is no neighborhood W of Sv satisfying (54). In that case using a
compactness argument, one can find two convergent sequences {sn } S , {tn } Sd0 , sn s0 ,
tn t0 Sv such that
tn (sn ) 0.
may be .
t0 = s0 is not possible, since it would imply
=2

(1 at0 (s0 ))
= t0 (s0 ),
t0 s 0 2

which is strictly positive.


If t0 = s0 , on differentiating in (52) with respect to s along Sd0 we get:
Xd 0 (s) = (at )d0 (s)X(t) + (bt )d0 (s), Xd 0 (t) +

d0 t s
s
2

X t (s),

where (at )d0 (s) is a column vector of size d0 and (bt )d0 (s) is a d0 d0 matrix. Then, one must
have at (t) = 1, (at )d0 (t) = 0 . Thus
tn (sn ) = uTn (at0 )d0 (t0 )un + o(1),
where un := (sn tn )/ sn tn . Since t0 Sv we may apply (61) and the limit of tn (sn )
cannot be non-positive.
A straightforward application of Theorem 5 is the following
Corollary 2 Under the hypotheses of Theorem 5, there exists positive constants C, such that,
for every u > 0 :
+

pE (x)dx P(M > u)

+
u

p(x)dx P(M > u) CP( > u),

where is a centered Gaussian variable with variance 1


The precise order of approximation of p(x) pM (x) or pE (x) pM (x) as x + remains
2 respectively which
in general an open problem, even if one only asks for the constants d2 , E
govern the second order asymptotic approximation and which are defined by means of

and

1
:= lim 2x2 log p(x) pM (x)
x+
d2

(62)

1
lim 2x2 log pE (x) pM (x)
2 := x+
E

(63)

whenever these limits exist. In general, we are unable to compute the limits (62) or (63) or
even to prove that they actually exist or differ. Our more general results (as well as in [3], [34])
only contain lower-bounds for the liminf as x +. This is already interesting since it gives
some upper-bounds for the speed of approximation for pM (x) either by p(x) or pE (x). On the
other hand, in Theorem 7 below, we are able to prove the existence of the limit and compute
d2 for a relevant class of Gaussian processes.
18

For the next theorem we need an additional condition on the parameter set S. For S
verifying A1 we define
(S) = sup

sup

0jd0 tSj

sup
sS,s=t

dist (t s), Ct,j


st 2

(64)

where dist is the Euclidean distance in Rd .


One can show that (S) < in each one of the following classes of parameter sets S:
- S is convex, in which case (S) = 0.
- S is a C 3 manifold, with or without boundary.
- S verifies the following condition: For every t S there exists an open neighborhood V of
t in Rd and a C 3 diffeomorphism : V B(0, r) (where B(0, r) denotes the open ball in Rd
centered at 0 and having radius r, r > 0) such that
(V S) = C B(0, r), where C is a convex cone.
However, (S) < can fail in general. A simple example showing what is going on is the
following: take an orthonormal basis of R2 and put
S = {(, 0) : 0 1} {( cos , sin ) : 0 1}
where 0 < < , that is, S is the boundary of an angle of size . One easily checks that
(S) = +. Moreover it is known [3] that in this case the EPC approximation does not verify
a super- exponential inequality. More generally, sets S having whiskers have (S) = +.
Theorem 6 Let X be a stochastic process on S satisfying A1 -A5. Suppose in addition that
Var(X(t)) = 1 for all t S and that (S) < +.
Then
1
(65)
lim inf 2x2 log p(x) pM (x) 1 + inf 2
x+
tS + (t)2
t
t
with

Var X(s)/X(t), X (t)


(1 r(s, t))2
sS\{t}

t2 := sup
and
t := sup

dist 1
t r01 (s, t), Ct,j
1 r(s, t)

sS\{t}

(66)

where
t := Var(X (t))
(t) is the maximum eigenvalue of t
in (66), j is such that t Sj ,(j = 0, 1, . . . , d0 ).
The quantity in the right hand side of (65) is strictly bigger than 1.
Remark. In formula (65) it may happen that the denominator in the right-hand side is
identically zero, in which case we put + for the infimum. This is the case of the one-parameter
process X(t) = cos t + sin t where , are Gaussian standard independent random variables,
and S is an interval having length strictly smaller than .

19

Proof of Theorem 6
Let us first prove that suptS t < .
For each t S, let us write the Taylor expansions
r01 (s, t) = r01 (t, t) + r11 (t, t)(s t) + O( s t 2 )
= t (s t) + O( s t 2 )

where O is uniform on s, t S, and


1 r(s, t) = (s t)T t (s t) + O( s t 2 ) L2 s t 2 ,
where L2 is some positive constant. It follows that for s S, t Sj , s = t, one has:
dist 1
t r01 (s, t), Ct,j

L3

1 r(s, t)

dist (t s), Ct,j


st 2

+ L4 ,

(67)

where L3 and L4 are positive constants. So,


dist 1
t r01 (s, t), Ct,j
which implies suptS t < .

L3 (S) + L4 .

1 r(s, t)

With the same notations as in the proof of Theorem 5, using (4) and (8), one has:

p(x) pM (x) = (x)

E 1IX (t)Cbt,0 . 1IM >x /X(t) = x


tS0

d0

+
j=1

Sj

E | det(Xj (t))| 1IX

j,N (t)Ct,j .

1IM >x /X(t) = x, Xj (t) = 0

(2)j/2 (det(Var(Xj (t))))1/2 j (dt) . (68)


Proceeding in a similar way to that of the proof of Theorem 5, an application of the Holder
inequality to the conditional expectation in each term in the right-hand side of (68) shows that
the desired result will follow as soon as we prove that:

lim inf 2x2 log P {Xj,N


Ct,j } {M > x}/X(t) = x, Xj (t) = 0
x+

t2

1
,
+ (t)2t

for each j = 0, 1, . . . , d0 , where the liminf has some uniformity in t.


Let us write the Gaussian regression of X(s) on the pair (X(t), X (t))
X(s) = at (s)X(t) + bt (s), X (t) + Rt (s).
Since X(t) and X (t) are independent, one easily computes :
at (s) = r(s, t)
bt (s) = 1
t r01 (s, t).
Hence, conditionally on X(t) = x, Xj (t) = 0, the events

T
{M > x} and {Rt (s) > (1 r(s, t))x r01
(s, t)1
t Xj,N (t) for some s S}

20

(69)

coincide.
(t)|X (t) = 0) the regression of X (t) on X (t) = 0. So, the probability in
Denote by (Xj,N
j
j
j,N
(69) can written as

Cbt,j

P{ t (s) > x

T (s, t)1 x
r01

for some s S}pXj,N


(t)|Xj (t)=0 (x )dx
1 r(s, t)

(70)

where
t (s) :=

Rt (s)
1 r(s, t)

dx is the Lebesgue measure on Nt,j . Remember that Ct,j Nt,j .

If 1
t r01 (s, t) Ct,j one has

T
r01
(s, t)1
t x 0

for every x Ct,j , because of the definition of Ct,j .


/ Ct,j , since Ct,j is a closed convex cone, we can write
If 1
t r01 (s, t)

1
t r01 (s, t) = z + z

with z Ct,j , z z and z = dist(1


t r01 (s, t), Ct,j ).

So, if x Ct,j :
T (s, t)1 x
r01
z T x + z T x
t
=
t x
1 r(s, t)
1 r(s, t)

using that z T x 0 and the Cauchy-Schwarz inequality. It follows that in any case, if x Ct,j
the expression in (70) is bounded by
Cbt,j

P t (s) > x t x for some s S pXj,N


(t)|Xj (t)=0 (x )dx .

(71)

To obtain a bound for the probability in the integrand of (71) we will use the classical
inequality for the tail of the distribution of the supremum of a Gaussian process with bounded
paths.
The Gaussian process (s, t))
t (s), defined on (S S)\{s = t} has continuous paths. As
the pair (s, t) approches the diagonal of S S, t (s) may not have a limit but, almost surely,
it is bounded (see [8] for a proof). (For fixed t, t (.) is a helix process with a singularity at
s = t, a class of processes that we have already met above).
We set
mt (s) := E( t (s)) (s = t)
m := sups,tS,s=t |mt (s)|
:= E | sups,tS,s=t t (s) mt (s) | .
The almost sure boundedness of the paths of t (s) implies that m < and < . Applying the
Borell-Sudakov-Tsirelson type inequality (see for example Adler [2] and references therein) to
the centered process s
t (s)mt (s) defined on S\{t} , we get whenever xt x m > 0:
P{ t (s) > x t x for some s S}

P{ t (s) mt (s) > x t x m for some s S}


2 exp
21

(x t x m )2
.
2t2

The Gaussian density in the integrand of (71) is bounded by


(2j (t))

jd
2

x mj,N (t)

exp

2j (t)

(t)|X (t))
where j (t) and j (t) are respectively the minimum and maximum eigenvalue of Var(Xj,N
j

and mj,N (t) is the conditional expectation E(Xj,N (t)|Xj (t) = 0). Notice that j (t), j (t), mj,N (t)
are bounded, j (t) is bounded below by a positive constant and j (t) (t).

Replacing into (71) we have the bound :

P {Xj,N
Ct,j } {M > x}/X(t) = x, Xj (t) = 0

(2j (t))

jd
2

Cbt,j {xt x m>0}

exp

x mj,N (t) 2
(x t x m )2
dx
+
2t2
2(t)
xm

+ P Xj,N
(t)|Xj (t) = 0
, (72)
t

where it is understood that the second term in the right-hand side vanishes if t = 0.
Let us consider the first term in the right-hand side of (72). We have:
x mj,N (t)
(x t x m )2
+
2t2
2(t)

2
(x t x m )2 ( x mj,N (t) )
+
2t2
2(t)
(x m t mj,N (t) )2
2
,
= A(t) x + B(t)(x m ) + C(t) +
2t2 + 2(t)2t

where the last inequality is obtained after some algebra, A(t), B(t), C(t) are bounded functions
and A(t) is bounded below by some positive constant.
So the first term in the right-hand side of (72) is bounded by :
2.(2j )

jd
2

exp

(x m t mj,N (t))2
2t2 + 2(t)2t

Rdj

exp A(t) x + B(t)(x m ) + C(t)


L|x|dj1 exp

dx

(x m t mj,N (t) )2
2t2 + 2(t)2t

(73)

where L is some constant. The last inequality follows easily using polar coordinates.
Consider now the second term in the right-hand side of (72). Using the form of the conditional

density pXj,N
(t)/Xj (t)=0 (x ), it follows that it is bounded by
P

(Xj,N
(t)/Xj (t)

= 0)

mj,N (t)

x m t mj,N (t)
t

L1 |x|dj2 exp

(x m t mj,N (t) )2
2(t)2t

where L1 is some constant. Putting together (73) and (74) with (72), we obtain (69).
The following two corollaries are straightforward consequences of Theorem 6:
22

(74)

Corollary 3 Under the hypotheses of Theorem 6 one has


lim inf 2x2 log |pE (x) pM (x)| 1 + inf

tS 2
t

x+

1
.
+ (t)2t

Corollary 4 Let X a stochastic process on S satisfying A1 -A5. Suppose in addition that


E(X(t)) = 0, E(X 2 (t)) = 1, Var(X (t) = Id for all t S.
Then
+
1
pE (x)dx 1 + inf 2
.
lim inf 2u2 log P(M > u)
u+
tS t + 2
u
t
and

d0

(1)j (2)j/2 gj H j (x) (x).

p (x) =
j=0

where gj is given by (23) and H j (x) has been defined in Section 4.


The proof follows directly from Theorem 6 the definition of pE (x) and the results in [1].

Examples

1) A simple application of Theorem 5 is the following. Let X be a one parameter real-valued


centered Gaussian process with regular paths, defined on the interval [0, T ] and satisfying an
adequate non-degeneracy condition. Assume that the variance v(t) has a unique maximum, say
1 at the interior point t0 , and k = min{j : v (2j) (t0 ) = 0} < . Notice that v (2k) (t0 ) < 0. Then,
one can obtain the equivalent of pM (x) as x which is given by:
pM (x)

1 v (t0 )/2
1/k
kCk

E || 2k 1 x11/k (x),

(75)

1
v (2k) (t0 ) + 14 [v (t0 )]2 1Ik=2 . The
where is a standard normal random variable and Ck = (2k)!
proof is a direct application of the Laplace method. The result is new for the density of the
maximum, but if we integrate the density from u to +, the corresponding bound for P{M > u}
is known under weaker hypotheses (Piterbarg [28]).

2) Let the process X be centered and satisfy A1-A5. Assume that the the law of the process
is isotropic and stationary, so that the covariance has the form (10) and verifies the regularity
condition of Section 4. We add the simple normalization = (0) = 1/2. One can easily
check that
1 2 ( s t 2 ) 42 ( s t 2 ) s t 2
(76)
t2 = sup
[1 ( s t 2 )]2
sS\{t}
Furthermore if
(x) 0 for x 0

(77)

one can show that the sup in (76) is attained as s t 0 and is independent of t. Its value
is
t2 = 12 1.
The proof is elementary (see [4] or [34]).
Let S be a convex set. For t Sj , s S:
dist r01 (s, t), Ct,j = dist 2 ( s t 2 )(t s), Ct,j .
23

(78)

The convexity of S implies that (t s) Ct,j . Since Ct,j is a convex cone and 2 ( s t 2 ) 0,
one can conclude that r01 (s, t) Ct,j so that the distance in (78) is equal to zero. Hence,
t = 0 for every t S
and an application of Theorem 6 gives the inequality
lim inf
x+

1
2
.
log p(x) pM (x) 1 +
2

x
12 1

(79)

A direct consequence is that the same inequality holds true when replacing p(x) pM (x) by
|pE (x) pM (x)| in (79), thus obtainig the main explicit example in Adler and Taylor [3], or in
Taylor et al. [34].
Next, we improve (79). In fact, under the same hypotheses, we prove that the liminf is an
ordinary limit and the sign is an equality sign. We state this as
Theorem 7 Assume that X is centered, satisfies hypotheses A1-A5, the covariance has the
form (10) with (0) = 1/2, (x) 0 f or x 0. Let S be a convex set, and d0 = d 1.
Then
1
2
.
(80)
lim log p(x) pM (x) = 1 +
x+ x2
12 1
Remark Notice that since S is convex, the added hypothesis that the maximum dimension d0
such that Sj is not empty is equal to d is not an actual restriction.
Proof of Theorem 7
In view of (79), it suffices to prove that
lim sup
x+

1
2
log p(x) pM (x) 1 +
.
2

x
12 1

(81)

Using (4) and the definition of p(x) given by (8), one has the inequality
p(x) pM (x) (2)d/2 (x)

Sd

E | det(X (t))| 1IM >x /X(t) = x, X (t) = 0)d (dt),

(82)

where our lower bound only contains the term corresponding to the largest dimension and we
have already replaced the density pX(t),X (t) (x, 0) by its explicit expression using the law of the
process. Under the condition {X(t) = x, X (t) = 0} if v0T X (t)v0 > 0 for some v0 S d1 , a
Taylor expansion implies that M > x. It follows that
E | det(X (t))| 1IM >x /X(t) = x, X (t) = 0

E | det(X (t))| 1I

sup v T X (t)v > 0

/X(t) = x, X (t) = 0 . (83)

vS d1

We now apply Lemma 2 which describes the conditional distribution of X (t) given X(t) =
x, X (t) = 0 . Using the notations of this lemma, we may write the right-hand side of (83) as :
E | det(Z xId)| 1I

sup v T Zv > x

vS d1

which is obviously bounded below by


E | det(Z xId)| 1IZ11 >x

=
x

E | det(Z xId)|/Z11 = y (2)1/2 1 exp


24

y2
dy, (84)
2 2

where σ² := Var(Z_{11}) = 12ρ''(0) − 1. The conditional distribution of Z given Z_{11} = y is easily
deduced from Lemma 2. It can be represented by the random d × d real symmetric matrix Z̃ with
entries

Z̃_{11} = y,   Z̃_{ii} = δ_i + τ y (i = 2, ..., d),   Z̃_{ik} = Z_{ik} (1 ≤ i < k ≤ d),

where the random variables {δ_2, ..., δ_d, Z_{ik}, 1 ≤ i < k ≤ d} are independent centered Gaussian
with

Var(Z_{ik}) = 4ρ''(0) (1 ≤ i < k ≤ d) ;  Var(δ_i) = 16ρ''(0)(8ρ''(0) − 1)/(12ρ''(0) − 1) (i = 2, ..., d) ;  τ = (4ρ''(0) − 1)/(12ρ''(0) − 1).

Observe that 0 < τ < 1.


Choose now ε_0 such that τ(1 + ε_0) < 1. The expansion of det(Z̃ − xId) shows that if x(1 + ε_0) ≤ y ≤ x(1 + ε_0) + 1 and x is large enough, then

E(|det(Z̃ − xId)| / Z_{11} = y) ≥ L ε_0 (1 − τ(1 + ε_0))^{d−1} x^d,

where L is some positive constant. This implies that

∫_x^{+∞} exp(−y²/(2σ²)) E(|det(Z̃ − xId)| / Z_{11} = y) dy ≥ L ∫_{x(1+ε_0)}^{x(1+ε_0)+1} exp(−y²/(2σ²)) ε_0 (1 − τ(1 + ε_0))^{d−1} x^d dy

for x large enough. On account of (82), (83), (84), we conclude that for x large enough,

p(x) − p_M(x) ≥ L_1 x^d exp(−x²/2 − (x(1 + ε_0) + 1)²/(2σ²))

for some new positive constant L_1. Since ε_0 can be chosen arbitrarily small, this implies (81).
3) Consider the same processes of Example 2, but now defined on the non-convex set {a ≤
‖t‖ ≤ b}, 0 < a < b. The same calculations as above show that μ_t = 0 if a < ‖t‖ ≤ b, and

μ_t = max{ sup_{θ∈[0,π]} −2aρ'(2a²(1 − cos θ))(1 − cos θ) / [1 − ρ(2a²(1 − cos θ))] , sup_{z∈[2a,a+b]} −2ρ'(z²)z / [1 − ρ(z²)] }

for ‖t‖ = a.
4) Let us keep the same hypotheses as in Example 2 but without assuming that the covariance
is decreasing as in (77). The variance σ_t² is still given by (76) but μ_t is not necessarily equal
to zero. More precisely, relation (78) shows that

μ_t ≤ sup_{s∈S\{t}} 2ρ'(‖s−t‖²)_+ ‖s−t‖ / [1 − ρ(‖s−t‖²)].

The normalization λ = −ρ'(0) = 1/2 implies that the process X is "identity speed", that is,
Var(X'(t)) = Id, so that the corresponding normalizing factor in Theorem 6 equals 1. An application of Theorem 6 gives
lim inf_{x→+∞} −(2/x²) log(p(x) − p_M(x)) ≥ 1 + 1/Z,        (85)

where

Z := sup_{z∈(0,κ]} [1 − ρ²(z²) − 4ρ'²(z²)z²] / [1 − ρ(z²)]² + ( max_{z∈(0,κ]} 2ρ'(z²)_+ z / [1 − ρ(z²)] )²

and κ is the diameter of S.
5) Suppose that

• the process X is stationary with covariance Γ(t) := Cov(X(s), X(s + t)) that satisfies
Γ(s_1, ..., s_d) = Π_{i=1,...,d} Γ_i(s_i), where Γ_1, ..., Γ_d are d covariance functions on R which are
monotone, positive on [0, +∞) and of class C^4,

• S is a rectangle

S = Π_{i=1,...,d} [a_i, b_i],  a_i < b_i.

Then, adding an appropriate non-degeneracy condition, conditions A2-A5 are fulfilled and Theorem 6 applies.
It is easy to see that

r_{0,1}(s, t) = ( Γ'_1(s_1 − t_1)Γ_2(s_2 − t_2)...Γ_d(s_d − t_d), ..., Γ_1(s_1 − t_1)...Γ_{d−1}(s_{d−1} − t_{d−1})Γ'_d(s_d − t_d) )ᵀ

belongs to C_{t,j} for every s ∈ S. As a consequence, μ_t = 0 for all t ∈ S. On the other hand,
standard regression formulae show that

Var(X(s)/X(t), X'(t)) / (1 − r(s, t))² = [1 − Γ_1²...Γ_d² − Γ'_1²Γ_2²...Γ_d² − ... − Γ_1²...Γ_{d−1}²Γ'_d²] / (1 − Γ_1...Γ_d)²,

where Γ_i stands for Γ_i(s_i − t_i). Computation and maximisation of σ_t² should be performed
numerically in each particular case.
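A minimal sketch of such a numerical maximisation (our own illustration; the covariances Γ_i(x) = exp(−x²/2), the rectangle [0, 1]² and the grid are assumed choices, not taken from the paper) evaluates the ratio displayed above on a grid and reports its maximum, which then serves as the numerical value of σ_t² for the chosen pole t:

import numpy as np
from itertools import product

gamma = lambda x: np.exp(-x**2 / 2.0)        # Gamma_i, illustrative choice
dgamma = lambda x: -x * np.exp(-x**2 / 2.0)  # Gamma_i'

def variance_ratio(s, t):
    g = np.array([gamma(si - ti) for si, ti in zip(s, t)])
    dg = np.array([dgamma(si - ti) for si, ti in zip(s, t)])
    r = np.prod(g)
    num = 1.0 - r**2
    for i in range(len(s)):                  # subtract the squared derivative terms
        num -= dg[i]**2 * np.prod(np.delete(g, i))**2
    return num / (1.0 - r)**2

t = (0.314, 0.707)                           # pole inside the rectangle [0, 1]^2
grid = np.linspace(0.0, 1.0, 201)
values = [variance_ratio((s1, s2), t) for s1, s2 in product(grid, grid)
          if (s1 - t[0])**2 + (s2 - t[1])**2 > 1e-10]
print("numerical sigma_t^2 at t =", t, ":", max(values))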

References
[1] Adler, R.J. (1981). The Geometry of Random Fields. Wiley, New York.
[2] Adler, R.J. (1990). An Introduction to Continuity, Extrema and Related Topics for General
Gaussian Processes. IMS, Hayward, Ca.
[3] Adler, R.J. and Taylor J. E.(2005). Random fields and geometry. Book to appear.
[4] Azaïs, J-M., Bardet, J-M. and Wschebor, M. (2002). On the tails of the distribution of the
maximum of a smooth stationary Gaussian process. ESAIM: P. and S., 6, 177-184.
[5] Azaïs, J-M. and Delmas, C. (2002). Asymptotic expansions for the distribution of the
maximum of Gaussian random fields. Extremes, 5 (2), 181-212.
[6] Azaïs, J-M. and Wschebor, M. (2002). The Distribution of the Maximum of a Gaussian
Process: Rice Method Revisited. In and out of equilibrium: probability with a physical
flavour, Progress in Probability, 321-348, Birkhäuser.
[7] Azaïs, J-M. and Wschebor, M. (2001). On the regularity of the distribution of the maximum
of one-parameter Gaussian processes. Probab. Theory Relat. Fields, 119, 70-98.
[8] Azaïs, J-M. and Wschebor, M. (2005). On the Distribution of the Maximum of a Gaussian
Field with d Parameters. Annals of Applied Probability, 15 (1A), 254-278.
[9] Azaïs, J-M. and Wschebor, M. (2006). A self contained proof of the Rice formula for random
fields. Preprint available at http://www.lsp.ups-tlse.fr/Azais/publi/completeproof.pdf.
[10] Belyaev, Y. (1966). On the number of intersections of a level by a Gaussian Stochastic
process. Theory Prob. Appl., 11, 106-113.

[11] Berman, S.M. (1985a). An asymptotic formula for the distribution of the maximum of a
Gaussian process with stationary increments. J. Appl. Prob., 22,454-460.
[12] Berman, S.M. (1992). Sojourns and extremes of stochastic processes, The Wadworth and
Brooks, Probability Series.
[13] Borell, C. (1975). The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30,
207-216.
[14] Borell, C. (2003). The Ehrhard inequality. C.R. Acad. Sci. Paris, Ser. I, 337, 663-666.
[15] Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes, J.
Wiley & Sons, New York.
[16] Cucker, F. and Wschebor M. (2003). On the Expected Condition Number of Linear Programming Problems, Numer. Math., 94, 419-478.
[17] Fernique, X. (1975). Régularité des trajectoires des fonctions aléatoires gaussiennes. École
d'Été de Probabilités de Saint-Flour (1974). Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
[18] Fyodorov, Y. (2006). Complexity of Random Energy Landscapes, Glass Transition and
Absolute Value of Spectral Determinant of Random Matrices Physical Review Letters v. 92
(2004), 240601 (4pages); Erratum: ibid. v.93 (2004),149901(1page)
[19] Kendall, M.G., Stuart,A. and Ord, J.K. (1987). The Advanced Theory of Statistics, Vol. 3.
[20] Landau, H.J. and Shepp, L.A. (1970). On the supremum of a Gaussian process. Sankhyā
Ser. A, 32, 369-378.
[21] Ledoux, M. (2001). The Concentration of Measure Phenomenon. American Math. Soc.,
Providence, RI.
[22] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces, Springer-Verlag,
New-York.
[23] Marcus, M.B. (1977). Level Crossings of a Stochastic Process with Absolutely Continuous
Sample Paths, Ann. Probab., 5, 52-71.
[24] Marcus, M.B. and Shepp, L.A. (1972). Sample behaviour of Gaussian processes. Proc.
Sixth Berkeley Symp. Math. Statist. Prob., 2, 423-442.
[25] Mehta, M.L. (2004). Random Matrices, 3rd ed. Academic Press.
[26] Mercadier, C. (2006). Numerical bounds for the distribution of the maximum of one- and
two-dimensional processes, to appear in Advances in Applied Probability, 38, (1).
[27] Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian
processes. Theory Probab. Appl., 26, 687-705.
[28] Piterbarg, V. I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and
Fields. American Mathematical Society. Providence. Rhode Island.
[29] Rychlik, I. (1990). New bounds for the first passage, wave-length and amplitude densities.
Stochastic Processes and their Applications, 34, 313-339.
[30] Sudakov, V.N. and Tsirelson, B.S. (1974). Extremal properties of half spaces for spherically
invariant measures (in Russian). Zap. Nauchn. Sem. LOMI, 45, 75-82.

[31] Sun, J. (1993). Tail Probabilities of the Maxima of Gaussian Random Fields, Ann. Probab.,
21, 34-71.
[32] Talagrand, M. (1996). Majorizing measures: the generic chaining. Ann. Probab., 24,
1049-1103.
[33] Taylor, J.E. and Adler, R. J. (2003). Euler characteristics for Gaussian fields on manifolds.
Ann. Probab., 31, 533-563.
[34] Taylor J.E., Takemura A. and Adler R.J. (2005). Validity of the expected Euler Characteristic heuristic. Ann. Probab., 33, 4, 1362-1396.
[35] Wschebor, M. (1985). Surfaces aléatoires. Mesure géométrique des ensembles de niveau.
Lecture Notes in Mathematics, 1147, Springer-Verlag.


On the Distribution of the Maximum of a Gaussian
Field with d Parameters.

Jean-Marc Azaïs, azais@cict.fr
Mario Wschebor, wscheb@fcien.edu.uy
November 10, 2003

AMS subject classification: 60G15, 60G70.
Short Title: Distribution of the Maximum.
Key words and phrases: Gaussian fields, Rice Formula, Regularity of the Distribution of the Maximum.
Abstract
Let I be a compact d-dimensional manifold, X : I → R a Gaussian
process with regular paths and F_I(u), u ∈ R, the probability distribution
function of sup_{t∈I} X(t).
We prove that under certain regularity and non-degeneracy conditions, F_I
is a C^1-function and F'_I is absolutely continuous, and that F'_I, F''_I satisfy
certain implicit equations that permit to give bounds for their values and to
compute their asymptotic behaviour as u → +∞. This is a partial extension
of previous results by the authors in the case d = 1.
Our methods use strongly the so-called Rice formulae for the moments of
the number of roots of an equation of the form Z(t) = x, where Z : I → R^d
is a random field and x a fixed point in R^d. We also give proofs for this kind
of formulae, which have their own interest beyond the present application.

This work was supported by ECOS program U97E02.

Laboratoire de Statistique et Probabilités. UMR-CNRS C5583, Université Paul Sabatier. 118,
route de Narbonne. 31062 Toulouse Cedex 4. France.

Centro de Matemática. Facultad de Ciencias. Universidad de la República. Calle Iguá 4225.
11400 Montevideo. Uruguay.

1 Introduction and notations.

Let I be a d-dimensional compact manifold and X : I R a Gaussian process with


regular paths defined on some probability space (, A, P). Define MI = sup X(t)
tI

and FI (u) = P{MI u}, u R the probability distribution function of the random
variable MI . Our aim is to study the regularity of the function FI when d > 1.
There exist a certain number of general results on this subject, starting from
the papers by Ylvisaker (1968) and Tsirelson (1975) (see also Weber (1985), Lifshits
(1995), Diebolt and Posse (1996) and references therein). The main purpose of this
paper is to extend to d > 1 some of the results about the regularity of the function
u ↦ F_I(u) in Azaïs & Wschebor (2001), which concern the case d = 1.
Our main tool here is Rice Formula for the moments of the number of roots
NuZ (I) of the equation Z(t) = u on the set I, where {Z(t) : t I} is an Rd -valued
Gaussian field, I is a subset of Rd and u a given point in Rd . For d > 1, even
though it has been used in various contexts, as far as the authors know, a full proof
of Rice Formula for the moments of NuZ (I) seems to have only been published by R.
Adler (1981) for the first moment of the number of critical points of a real-valued
stationary Gaussian process with a d-dimensional parameter, and extended by Azaïs
and Delmas (2002) to the case of processes with constant variance. Cabaña (1985)
contains related formulae for random fields; see also the PhD thesis of Konakov
cited by Piterbarg (1996b). In the next section we give a more general result which
has an interest that goes beyond the application of the present paper. At the same
time the proof appears to be simpler than previous ones. We have also included
the proof of the formula for higher moments, which in fact follows easily from the
first moment. Both extend with no difficulties to certain classes of non-Gaussian
processes.
It should be pointed out that the validity of Rice Formula for Lebesgue-almost
every u Rd is easy to prove (Brillinger, 1972) but this is insufficient for a certain
number of standard applications. For example, assume X : I → R is a real-valued
random process and one is willing to compute the moments of the number of critical
points of X. Then, we must take for Z the random field Z(t) = X'(t), and the
formula one needs is for the precise value u = 0, so that a formula for almost every
u does not solve the problem.
We have added Rice Formula for processes defined on smooth manifolds. Even
though Rice Formula is local, this is convenient for various applications. We will
need a formula of this sort to state and prove the implicit formulae for the derivatives
of the distribution of the maximum (see Section 3).

The results on the differentiation of FI are partial extensions of Azas & Wschebor (2001). They concern only the first two derivatives and remain quite far away
from what is known for d = 1. The main result in that paper states that if X is
a real-valued Gaussian process defined on a certain compact interval I of the real
line, has C 2k paths (k integer, k 1) and satisfies a non-degeneracy condition, then
the distribution of MI is of class C k .
For Gaussian fields defined on a d-dimensional regular manifold (d > 1) and
possessing regular paths we obtain some improvements with respect to classical
and general results due to Tsirelson (1975) for Gaussian sequences. An example is
Corollary 6.1, that provides an asymptotic formula for FI (u) as u + which is
explicit in terms of the covariance of the process and can be compared with Theorem
4 in Tsirelson (1975) where an implicit expression depending on the function F itself
is given.
We use the following notations:
If Z is a smooth function U → R^d, U a subset of R^d, its successive derivatives
are denoted Z', Z'', ..., Z^(k) and considered respectively as linear, bilinear, ..., k-linear
forms on R^d. For example, X^(3)(t){v_1, v_2, v_3} is the value of the third derivative at
point t applied to the triplet (v_1, v_2, v_3). The same notation is used for a derivative
on a C^∞ manifold.
I̊, ∂I and Ī are respectively the interior, the boundary and the closure of the set
I. If ξ is a random vector with values in R^d, whenever they exist, we denote by
p_ξ(x) the value of the density of ξ at the point x, by E(ξ) its expectation and by
Var(ξ) its variance-covariance matrix. λ is Lebesgue measure.
If u, v are points in R^d, ⟨u, v⟩ denotes their usual scalar product and ‖u‖ the
Euclidean norm of u.
For M a d × d real matrix, we denote
‖M‖ = sup_{‖x‖=1} ‖Mx‖.
We put λ_1² = λ_min, ..., λ_d² for the eigenvalues of M Mᵀ, 0 ≤ λ_1 ≤ ... ≤ λ_d. Then
‖M‖ = λ_d and, if M is non-singular, ‖M^{-1}‖ = λ_1^{-1} = 1/λ_min(M).
Also, for symmetric M, M ≻ 0 (respectively M ≺ 0) denotes that M is positive
definite (resp. negative definite).
(m choose n) is the usual combinatorial number, i.e. (m choose n) = m!/(n!(m−n)!) if m, n are non-
negative integers with m ≥ n, and (m choose n) = 0 otherwise.
A^c denotes the complement of the set A. For real x, x^+ = sup(x, 0), x^- =
sup(−x, 0).

2 Rice formulae

Our main results in this section are the following:

Theorem 2.1 Let Z : I → R^d, I a compact subset of R^d, be a random field and
u ∈ R^d. Assume that:
A0: Z is Gaussian,
A1: t ↦ Z(t) is a.s. of class C^1,
A2: for each t ∈ I, Z(t) has a non-degenerate distribution (i.e. Var(Z(t)) ≻ 0),
A3: P{∃ t ∈ I̊, Z(t) = u, det(Z'(t)) = 0} = 0,
A4: λ(∂I) = 0.
Then

E(N_u^Z(I)) = ∫_I E(|det(Z'(t))| / Z(t) = u) p_{Z(t)}(u) dt,        (1)

and both members are finite.

Theorem 2.2 Let k, k ≥ 2, be an integer. Assume the same hypotheses as in
Theorem 2.1 except for A2, which is replaced by
A'2: for t_1, ..., t_k ∈ I pairwise different values of the parameter, the distribution
of (Z(t_1), ..., Z(t_k)) does not degenerate in (R^d)^k. Then

E[N_u^Z(I)(N_u^Z(I) − 1)...(N_u^Z(I) − k + 1)]
    = ∫_{I^k} E(Π_{j=1}^k |det(Z'(t_j))| / Z(t_1) = ... = Z(t_k) = u) p_{Z(t_1),...,Z(t_k)}(u, ..., u) dt_1...dt_k,        (2)

where both members may be infinite.
Remark.
Note that Theorem 2.1 (resp. 2.2) remains valid, except for the finiteness of the
expectation in Theorem 2.1, if I is open and hypotheses A0, A1, A2 (resp. A'2) and
A3 are verified. This follows immediately from the above statements. A standard extension
argument shows that (1) holds true if one replaces I by any Borel subset of I.
Sufficient conditions for hypothesis A3 to hold are given by the next proposition.
Proposition 2.1 Let Z : I → R^d, I a compact subset of R^d, be a random field with
paths of class C^1 and u ∈ R^d. Assume that
• p_{Z(t)}(x) ≤ C for all t ∈ I and x in some neighbourhood of u,
• at least one of the two following hypotheses is satisfied:
a) a.s. t ↦ Z(t) is of class C^2,
b) α(δ) = sup_{t∈I, x∈V(u)} P(|det(Z'(t))| < δ / Z(t) = x) → 0
as δ → 0, where V(u) is some neighbourhood of u.
Then A3 holds true.
Proof. If condition a) holds true, the result is Lemma 5 in Cucker and Wschebor
(2003).
To prove it under condition b), assume with no loss of generality that I = [0, 1]^d
and that u = 0. Put G_I = {∃ t ∈ I, Z(t) = 0, det(Z'(t)) = 0}. Choose ε > 0, δ > 0;
there exists a positive number M such that

P(E_M) = P{sup_{t∈I} ‖Z'(t)‖ > M} ≤ ε.

Denote by ω_det the modulus of continuity of |det(Z'(·))| and choose m large enough
so that

P(F_{m,δ}) = P{ω_det(√d/m) > δ} ≤ ε.

Consider the partition of I into m^d small cubes with sides of length 1/m. Let C_{i_1...i_d} be
such a cube and t_{i_1...i_d} its centre (1 ≤ i_1, ..., i_d ≤ m). Then

P(G_I) ≤ P(E_M) + P(F_{m,δ}) + Σ_{1≤i_1,...,i_d≤m} P(G_{C_{i_1...i_d}} ∩ E_M^c ∩ F_{m,δ}^c).        (3)

When the event in the term corresponding to i_1...i_d of the last sum occurs, we have:

|Z_j(t_{i_1...i_d})| ≤ (M/m)√d,  j = 1, ..., d,

where Z_j denotes the j-th coordinate of Z, and:

|det(Z'(t_{i_1...i_d}))| < δ.

So, if m is chosen sufficiently large so that V(0) contains the ball centred at 0 with
radius (M√d)/m, one has:

P(G_I) ≤ 2ε + m^d (2M√d/m)^d C α(δ).

Since ε and δ are arbitrarily small, the result follows.


Lemma 2.1 With the notations of Theorem 2.1, suppose that A1 and A4 hold
true and that
p_{Z(t)}(x) ≤ C for all t ∈ I and x in some neighbourhood of u.
Then P{N_u^Z(∂I) ≠ 0} = 0.
Proof: We use the notation of Proposition 2.1, with the same definition of E_M,
except that we do not suppose that I = [0, 1]^d.
Since ∂I has zero measure, for each positive integer m it can be covered by h(m)
cubes C_1, ..., C_{h(m)} with centres t_1, ..., t_{h(m)} and side lengths s_1, ..., s_{h(m)} smaller than
1/m, such that

Σ_{i=1}^{h(m)} (s_i)^d → 0 as m → +∞.

So,

P{N_u^Z(∂I) ≠ 0} ≤ P(E_M) + Σ_{i=1}^{h(m)} P({N_u^Z(C_i) ≠ 0} ∩ E_M^c)
    ≤ P(E_M) + Σ_{i=1}^{h(m)} P{|Z_j(t_i) − u_j| ≤ M s_i √d, j = 1, ..., d} ≤ P(E_M) + C Σ_{i=1}^{h(m)} (√d M s_i)^d.

This gives the result.

Lemma 2.2 Let Z : I → R^d, I a compact subset of R^d, be a C^1 function and u a
point in R^d. Assume that
a) inf_{t∈Z^{-1}({u})} λ_min(Z'(t)) ≥ κ > 0,
b) ω_{Z'}(η) < κ/d,
where ω_{Z'} is the continuity modulus of Z', defined as the maximum of the continuity
moduli of its entries, and η is a positive number.
Then, if t_1, t_2 are two distinct roots of the equation Z(t) = u such that the
segment [t_1, t_2] is contained in I, the Euclidean distance between t_1 and t_2 is greater
than η.
Recall that λ_min(Z'(t)) is the inverse of ‖(Z'(t))^{-1}‖.
Proof: Set η̄ = ‖t_1 − t_2‖, v = (t_1 − t_2)/‖t_1 − t_2‖. Using the mean value theorem, for
i = 1, ..., d, there exists ξ_i ∈ [t_1, t_2] such that

(Z'(ξ_i)v)_i = 0.

Thus

|(Z'(t_1)v)_i| = |(Z'(t_1)v)_i − (Z'(ξ_i)v)_i| ≤ Σ_{k=1}^d |Z'(t_1)_{ik} − Z'(ξ_i)_{ik}| |v_k| ≤ ω_{Z'}(η̄) Σ_{k=1}^d |v_k| ≤ ω_{Z'}(η̄) √d.

In conclusion,

λ_min(Z'(t_1)) ≤ ‖Z'(t_1)v‖ ≤ ω_{Z'}(η̄) d,

which implies η̄ > η.
Proof of Theorem 2.1: Consider a continuous non-decreasing function F such
that
F(x) = 0 for x ≤ 1/2,  F(x) = 1 for x ≥ 1.
Let η and β be positive real numbers. Define the random function

α_{η,β}(u) = F( (1/(2β)) inf_{s∈I} (λ_min(Z'(s)) + ‖Z(s) − u‖) ) [1 − F( (d/β) ω_{Z'}(η) )]        (4)

and the set I_η = {t ∈ I : ‖t − s‖ ≥ η, ∀ s ∉ I}. If α_{η,β}(u) > 0 and N_u^Z(I_η)
does not vanish, conditions a) and b) in Lemma 2.2 are satisfied. Hence, in each
ball with diameter η centred at a point in I_η there is at most one root of the
equation Z(t) = u, and a compactness argument shows that N_u^Z(I_η) is bounded by
a constant C(η, I), depending only on η and on the set I.
Take now any real-valued non-random continuous function f : R^d → R with
compact support. Because of the coarea formula (Federer, 1969, Th. 3.2.3), since
a.s. Z is Lipschitz and α_{η,β}(u) f(u) is integrable:

∫_{R^d} f(u) N_u^Z(I_η) α_{η,β}(u) du = ∫_{I_η} |det(Z'(t))| f(Z(t)) α_{η,β}(Z(t)) dt.
Taking expectations in both sides,

∫_{R^d} f(u) E(N_u^Z(I_η) α_{η,β}(u)) du = ∫_{R^d} f(u) du ∫_{I_η} E(|det(Z'(t))| α_{η,β}(u) / Z(t) = u) p_{Z(t)}(u) dt.

It follows that the two functions

(i): E(N_u^Z(I_η) α_{η,β}(u)),
(ii): ∫_{I_η} E(|det(Z'(t))| α_{η,β}(u) / Z(t) = u) p_{Z(t)}(u) dt,

coincide Lebesgue-almost everywhere as functions of u.


Let us prove that both functions are continuous, hence they are equal for every
u Rd .
Fix u = u0 and let us show that the function in (i) is continuous at u = u0 .
Consider the random variable inside the expectation sign in (i). Almost surely, there
is no point t in Z 1 ({u0 }) such that det(Z (t)) =0. By the local inversion theorem,
Z(.) is invertible in some neighbourhood of each point belonging to Z 1 ({u0 }) and
the distance from Z(t) to u0 is bounded below by a positive number for t I
outside of the union of these neighbourhoods. This implies that, a.s., as a function of
u, N_u^Z(I_η) is constant in some (random) neighbourhood of u_0. On the other hand, it
is clear from its definition that the function u ↦ α_{η,β}(u) is continuous and bounded.
We may now apply dominated convergence as u → u_0, since N_u^Z(I_η) α_{η,β}(u) is
bounded by a constant that does not depend on u.
For the continuity of (ii), it is enough to prove that, for each t I the conditional
expectation in the integrand is a continuous function of u. Note that the random

variable | det(Z (t))|, (u) is a functional defined on {(Z(s), Z (s)) : s I}. Perform a Gaussian regression of (Z(s), Z (s)) : s I with respect to the random
variable Z(t), that is, write
Z(s) = Y t (s) + t (s)Z(t)
Zj (s) = Yjt (s) + jt (s)Z(t), j = 1, ..., d
where Zj (s) (j = 1, ..., d) denote the columns of Z (s), Y t (s) and Yjt (s) are Gaussian
vectors, independent of Z(t) for each s I, and the regression matrices t (s), jt (s)
(j = 1, ..., d) are continuous functions of s, t (take into account A2). Replacing in
the conditional expectation we are now able to get rid of the conditioning, and using
the fact that the moments of the supremum of an a.s. bounded Gaussian process
are finite, the continuity in u follows by dominated convergence.
So, now we fix u ∈ R^d and make β → 0, η → 0 in that order, both in (i) and (ii).
For (i) one can use Beppo Levi's Theorem. Note that almost surely
N_u^Z(I_η) ↑ N_u^Z(I̊) = N_u^Z(I),
where the last equality follows from Lemma 2.1. On the other hand, the same
Lemma 2.1 plus A3 imply together that, almost surely,
inf_{s∈I} (λ_min(Z'(s)) + ‖Z(s) − u‖) > 0,
so that the first factor in the right-hand member of (4) increases to 1 as β decreases
to zero. Hence by Beppo Levi's Theorem:
lim_{η→0} lim_{β→0} E(N_u^Z(I_η) α_{η,β}(u)) = E(N_u^Z(I)).

For (ii), one can proceed in a similar way after de-conditioning obtaining (1). To
finish the proof, remark that standard Gaussian calculations show the finiteness of
the right-hand member of (1).
Proof of Theorem 2.2: For each δ > 0, define the domain
D_{k,δ}(I) = {(t_1, ..., t_k) ∈ I^k : ‖t_i − t_j‖ ≥ δ if i ≠ j, i, j = 1, ..., k}
and the process Z̄,
Z̄(t_1, ..., t_k) = (Z(t_1), ..., Z(t_k)),   (t_1, ..., t_k) ∈ D_{k,δ}(I).


It is clear that Z satisfies the hypotheses of Theorem 2.1 for every value (u, ..., u)
(Rd )k . So,
Z
E N(u,...,u)
Dk, (I)

=
Dk, (I)

| det Z (tj ) |/Z(t1 ) = ... = Z(tk ) = u pZ(t1 ),...,Z(tk ) (u, ..., u)dt1 ...dtk (5)
j=1

To finish, let 0, note that NuZ (I) NuZ (I) 1 ... NuZ (I) k + 1 is the monotone
limit of
Z
N(u,...,u)
Dk, (I) ,
and that the diagonal Dk (I) = (t1 , ..., tk ) I k , ti = tj for some pair i, j, i = j has
zero Lebesgue measure in (Rd )k .
Remark. Even though we will not use this in the present paper, we point out
that it is easy to adapt the proofs of Theorems 2.1 and 2.2 to certain classes of
non-Gaussian processes.
For example, the statement of Theorem 2.1 remains valid if one replaces hypotheses A0 and A2 respectively by the following B0 and B2:
B0: Z(t) = H(Y(t)) for t ∈ I, where
Y : I → R^n is a Gaussian process with C^1 paths such that for each t ∈ I, Y(t) has
a non-degenerate distribution, and H : R^n → R^d is a C^1 function.
B2 : for each t I, Z(t) has a density pZ(t) which is continuous as a function of
(t, u).
Note that B0 and B2 together imply that n d. The only change to be introduced in the proof of the theorem is in the continuity of (ii) where the regression is
performed on Y (t) instead of Z(t)
Similarly, the statement of Theorem 2.2 remains valid if we replace A0 by B0 and
add the requirement that the joint density of (Z(t_1), ..., Z(t_k)) be a continuous function
of t_1, ..., t_k, u for pairwise different t_1, ..., t_k.
Now consider a process X from I to R and define
M_{u,1}^X(I) = #{t ∈ I : X(·) has a local maximum at the point t, X(t) > u},
M_{u,2}^X(I) = #{t ∈ I : X'(t) = 0, X(t) > u}.

The problem of writing Rice Formulae for the factorial moments of these random
variables can be considered as a particular case of the previous one and the proofs are
10

the same, mutatis mutandis. For further use, we state as a theorem, Rice Formula
for the expectation. For short we do not state the equivalent of Theorem (2.2) that
holds true similarly.
Theorem 2.3 Let X : I → R, I a compact subset of R^d, be a random field. Let
u ∈ R, and define M_{u,i}^X(I), i = 1, 2 as above. For each d × d real symmetric matrix M,
we put ζ_1(M) := |det(M)| 1I_{M≺0}, ζ_2(M) := |det(M)|.
Assume:
A0: X is Gaussian,
A1: a.s. t ↦ X(t) is of class C^2,
A2: for each t ∈ I, (X(t), X'(t)) has a non-degenerate distribution in R^1 × R^d,
A3: either
a.s. t ↦ X(t) is of class C^3,
or
α(δ) = sup_{t∈I, x'∈V(0)} P(|det(X''(t))| < δ / X'(t) = x') → 0
as δ → 0, where V(0) denotes some neighbourhood of 0,
A4: ∂I has zero Lebesgue measure.
Then, for i = 1, 2:

E(M_{u,i}^X(I)) = ∫_u^{+∞} dx ∫_I E(ζ_i(X''(t)) / X(t) = x, X'(t) = 0) p_{X(t),X'(t)}(x, 0) dt

and both members are finite.
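For instance (a standard specialization, stated here as an illustration and not spelled out in the paper): if d = 1, I = [0, T], X is stationary and centered, and we let u → −∞ so that the restriction X(t) > u disappears, then, writing λ_2 = Var(X'(t)) and λ_4 = Var(X''(t)) and using the independence of X'(t) and X''(t), the case i = 2 of the formula above gives

E(M_{−∞,2}^X([0, T])) = T E(|X''(0)|) p_{X'(0)}(0) = (T/π) √(λ_4/λ_2),

the classical expected number of critical points; by symmetry the expected number of local maxima (i = 1) is half of this.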

2.1 Processes defined on a smooth manifold.

Let U be a differentiable manifold (by differentiable we mean infinitely differentiable)


of dimension d. We suppose that U is orientable in the sense that there exists a
non-vanishing differentiable d-form on U . This is equivalent to assuming that
there exists an atlas (Ui , i ); i I such that for any pair of intersecting charts
(Ui , i ), (Uj , j ), the Jacobian of the map i 1
j is positive.
We consider a Gaussian stochastic process with real values and C 2 paths X =
{X(t) : t U } defined on the manifold U . In this subsection, our aim is to write
Rice Formulae for this kind of processes under various geometric settings for U .
More precisely we will consider three cases: first, when U is a manifold without any
additional structure on it; second, when U has a Riemannian metric; third, when it

is embedded in an Euclidean space. We will make use of these formulae in Section


3 but they have an interest in themselves. (See Taylor and Adler (2002) for other
details or similar results).
We will assume that in every chart X(t) and DX(t) have a non-degenerate joint
distribution and that hypothesis A3 is verified. For S a Borel subset of U , the
X
(S), the number of local
following quantities are well defined and measurable : Mu,1
X
maxima and Mu,2 (S), the number of critical points.
2.1.1 Abstract manifold

Proposition 2.2 For k = 1, 2 the quantity which is expressed in every chart with
coordinates s1 , ..., sd as
+

dxE k (Y (s))/Y (s) = x, Y (s) = 0 pY (s),Y

(s) (x, o)

di=1 dsi ,

(6)

where Y (s) is the process X written in the chart : Y = X 1 , defines a d-form


k on U and for every Borel set S U

X
k = E Mu,k
(S) .

Proof: Note that a d-form is a measure on U whose image in each chart is


absolutely continuous with respect to Lebesgue measure di=1 dsi ,. To prove that (6)
defines an d-form, it is sufficient to prove that its density with respect to di=1 dsi ,
satisfies locally the change-of-variable formula. Let (U1 , 1 ), (U2 , 2 ) two intersecting
charts and set
1
1
U3 := U1 U2 ; Y1 := X 1
1 ; Y2 := X 2 ; H := 2 1 .

Denote by s1i and s2i , i = i, ..., d the coordinates in each chart. We have
Y1
=
s1i
2 Y1
=
s1i s1j

i ,j

Y2 Hi
s2i s1i

2 Y2 Hi Hj
+
s2i s2j s1i s1j

Thus at every point


Y1 (s1 ) = H (s1 )

Y2 (s2 ),

Y2 2 Hi
.
s2i s1i s1j

pY1 (s1 ),Y1 (s1 ) (x, 0) = pY2 (s2 ),Y2 (s2 ) (x, 0)| det(H (s1 )|1
and at a singular point
Y1 (s1 ) = H (s1 )

Y2 (s2 )H (s1 ).

On the other hand, by the change of variable formula


di=1 ds1i = | det(H (s1 )|1 di=1 ds2i .
Replacing in the integrand in (6), one checks the desired result.
For the second part again it suffices to prove it locally for an open subset S
included in a unique chart. Let (S, ) a chart and let again Y (s) be the process
written in this chart, it suffices to check that
X
(S) =
E Mu,k
+

d(s)
(S)

dx E k (Y (s))/Y (s) = x, Y (s) = 0 pY (s),Y

(s) (x, 0).

(7)

X
Y
Since Mu,k
(S) is equal to Mu,k
{(S)} we see that the result is a direct consequence of Theorem (2.3)

2.1.2 Riemannian manifold

The form in (6) is intrinsic (in the sense that it does not depend on the parametrization) but the terms inside the integrand are not. It is possible to give a complete
intrinsic expression in the case when U is equipped with a Riemannian metric.
When such a Riemannian metric is not given, it is always possible to use the metric
g induced by the process itself (see Taylor and Adler, 2002) by setting
gs (Y, Z) = E

Y (X) Z(X)

for Y, Z belonging the tangent space T (s) at s U . Y (X), (resp. Z(X)) denotes
the action of the tangent vector Y (resp. Z) on the function X. This metric leads
to very simple expressions for centred variance-1 Gaussian processes.
The main point is that at a singular point of X the second order derivative D2 X
is intrinsic since it defines locally the Taylor expansion. Given the Riemannian
metric gs the second differential can be represented by an endomorphism that will
be denoted 2 X(s).
D2 X(s){Y, Z} = Y (Z(X) = Z(Y (X) = gs (2 X(s)Y, Z).

(8)

In fact, at a singular point the definition given by formula (8) coincides with the
definition of the Hessian read in an orthonormal basis. This endomorphism is
intrinsic, and so is its determinant. So in a chart
det 2 X(s) = det(D2 X(s)) det(gs )1 ,

(9)

and 2 X(s) is negative definite if and only if D2 X(s) is. Hence


k 2 X(s) = k D2 X(s) det(gs )1 ; (k = 1, 2)
We turn now to the density in (6). The gradient at some location s is defined
as the unique vector X(s) T (s) such that gs (X(s), Y ) = DX(s){Y }. In a
chart the vector of coordinates of the gradient in the basis xi , i = 1, d is given
1
by gs DX(s) where DX(s) is now the vector of coordinates of the derivative in
the basis dxi , i = 1, d. The joint density at (x, 0) of X(s), X(s) is intrinsic only
if read in an orthonormal basis of the tangent space. In that case the vector of
coordinates is given by
X(s) = gs

1/2

X(s) = gs

1/2

DX

By the change of variable formula :


pX(s),X(s) (x, 0) = pX(s),DX(s) (x, 0) det(gs )
Remembering that the Riemannian volume V ol satisfies
V ol =

det(gs ) di=1 ds2i

we can rewrite expression (6) as


+

dx E k (2 X(s)/X(s) = x, X(s) = 0 pX(s),X(s) (x, 0) V ol

(10)

where we have omitted the tilde above X(s) for simplicity. This is the Riemannian
intrinsic expression.
2.1.3 Embedded manifold

In most practical applications, U is naturally embedded in an Euclidean space Rm .


Examples of such situations are given by U being a sphere or the boundary of a
domain in Rm . In such a case we look for an expression for (10) as a function of

the natural derivative on Rm . The manifold is equipped with the metric induced
by the Euclidean metric in Rm . Considering the form (10), clearly the Riemannian
volume is just the geometric measure on U .
Following Milnor (1965), we assume that the process Xt is defined on an open
neighbourhood of U so that the ordinary derivatives X (s) and X (s) are well defined
for s U . Denoting the projector onto the tangent and normal spaces by PT (s) and
P_{N(s)}, we have
∇X(s) = P_{T(s)}(X'(s)).
We now define the second fundamental form II of U embedded in R^m, which can be
defined in our simple case as the bilinear application (see Kobayashi & Nomizu, 199?,
T. 2, chap. 7 for details)
Y, Z T (s)

PN (s) (X Y ).

where X Y is the Levi-Civita connection on Rn . The next formula is well known,


or easy to check at a singular point, and gives the expression of the Hessian on U .
Y, Z T (s)

X (s){Y, Z}+ < II{Y, Z}, X (s) >,

(11)

The determinant of the bilinear form given by (11), expressed in an orthonormal


basis, gives the value of det 2 X(s) . As a conclusion we get the expression of
every terms involved in (10).
Examples:
Codimension 1: with a given orientation we get
2 X = XT + II.XN
where XT is the tangent projection of the second derivative and XN the normal
component of the gradient.
Sphere: When U is a sphere of radius r > 0 in Rd+1 oriented towards the inside
2 X = XT + r(Id)d XN

(12)

Curve: When the manifold is a curve parametrized by arc length


E Muk (U ) =

dx
u

dt
0

E k XT (t) + C(t)XN (t)/X(t) = x, XT (t) = 0 pX(t),XT (t) (x, 0), (13)


Where C(t) is the curvature at location t and XN (t) is the derivative taken is
the direction of the normal to the curve at point t.

Remark: One can consider a number of variants of Rice formulae, in which we


may be interested in computing the moments of the number of roots of the equation
Z(t) = u under some additional conditions. This has been the case in the statement
of Theorem 2.3 in which we have given formulae for the first moment of the number
of zeroes of X in which X is bigger than u (i=2) and also the real-valued process
X has a local maximum (i=1).
We just consider below two additional examples of variants that we state here
for further reference. We limit the statements to random fields defined on subsets
of Rd . Similar statements hold true when the parameter set is a general smooth
manifold. Proofs are essentially the same as the previous ones.
Variant 1: Assume that Z1 , Z2 are Rd -valued random fields defined on compact
subsets I1 , I2 of Rd and suppose that (Zi , Ii ) (i = 1, 2) satisfy the hypotheses of
Theorem 2.1 and that for every s I1 and t I2 , the distribution of Z1 (s), Z2 (t)
does not degenerate. Then, for each pair u1 , u2 Rd :
E NuZ11 (I1 )NuZ22 (I2 )
=
I1 I2

dt1 dt2 E (| det(Z1 (t1 ))|| det(Z2 (t2 ))|/Z1 (t1 ) = u1 , Z2 (t2 ) = u2 ) pZ1 (t1 ),Z2 (t2 ) (u1 , u2 ),
(14)

Variant 2: Let Z, I be as in Theorem 2.1 and a real-valued bounded random


variable which is measurable with respect to the -algebra generated by the process
Z. Assume that for each t I, there exists a continuous Gaussian process {Y t (s) :
s I}, for each s, t I a non-random function t (s) : Rd Rd and a Borelmeasurable function g : C R where C is space of real-valued continuous functions
on I equipped with the supremum norm, such that:
1. = g Y t (.) + t (.)Z(t)
2. Y t (.) and Z(t) are independent
3. for each u0 R, almost surely the function
u

g Y t (.) + t (.)u

is continuous at u = u0
Then the formula :
E NuZ (I) =

E (| det(Z (t))|/Z(t) = u) pZ(t) (u)dt,


I


holds true.
We will be particularly interested in the function = 1IMI <v for some v R. We
will see that later on that it satisfies the above conditions under certain hypotheses
on the process Z.

3 First Derivative, First Form.

Our main goals in this and the next section are to prove existence and regularity of
the derivatives of the function u
FI (u) and, at the same time, that they satisfy
some implicit formulae that can be used to provide bounds on them. In the following
we assume that I is a d-dimensional C manifold embedded in RN , N d. and

are respectively the geometric measures on I and I. Unless explicit statement of


the contrary, the topology on I will be he relative topology.
In this section we prove formula (17) for FI (u). -that we call first form- which
is valid for -almost every u, under strong regularity conditions on the paths of the
process X. In fact, the hypothesis that X is Gaussian is only used in Rice formula
itself and in Lemma 3.1 which gives a bound for the joint density
pX(s),X(t),X (s),X (t) .
In both places, one can substitute Gaussianity by appropriate conditions that permit
to obtain similar results.
More generally, it is easy to see that inequality (15) below is valid under quite
general non-Gaussian conditions and implies that we have the following upper bound
for the density of the distribution of the random variable M_I:

F'_I(u) ≤ ∫_I E(ζ_1(X''(t)) / X(t) = u, X'(t) = 0) p_{X(t),X'(t)}(u, 0) σ(dt)
    + ∫_{∂I} E(ζ_1(X̄''(t)) / X(t) = u, X̄'(t) = 0) p_{X(t),X̄'(t)}(u, 0) σ̄(dt),        (15)

where the function ζ_1 has been defined in the statement of Theorem 2.3 and X̄
denotes the restriction of X to the boundary ∂I.
Even for d = 1 (one-parameter processes) and X Gaussian and stationary, inequality (15) provides reasonably good upper bounds for F'_I(u) (see Diebolt and
Posse (1996), Azaïs and Wschebor (2001)). We will see an example for d = 2 at the
end of this section.

In the next section, we are able to prove that FI (u) is a C 1 function and that
formula (17) can be essentially simplified by getting rid of the conditional expectation, thus obtaining the second form for the derivative. This is done under weaker
regularity conditions but the assumption that X is Gaussian becomes essential.
In case the dimension d of the parameter is equal to 1, this is the starting point
to continue the differentiation procedure and under hypotheses H2k one is able to
(k)
prove that FI is a C k function and to obtain implicit formulae for FI (see Azas &
Wschebor, 2001)
When d > 1, a certain number of difficulties arise and it is not clear that the
process can continue beyond k = 2. With the purpose of establishing such formula
for FI , we introduce in Section 4 the helix-processes which appear in a natural
way in these formulae and have paths possessing singularities of a certain form that
will be described precisely in that section.
Definition 3.1 Let X : I R be real-valued stochastic process defined on a
subset of Rd . We will say that X satisfies condition (Hk ), k a positive integer, if
the following three conditions hold true:
X is Gaussian;
a.s. the paths of X are of class C k ;
for any choice of pairwise different values of the parameter t1 , ...., tn the joint
distribution of the random variables
X(t1 ), ..., X(tn ), X (t1 ), ..., X (tn ), ....., X (k) (t1 ), ..., X (k) (tn )

(16)

has maximum rank. Note that the number of distinct real-valued Gaussian
variables belonging to this set (16), on account of exchangeability of the order
of differentiation, is equal to
n 1+

d
d+1
k+d1
+
+ .... +
d1
d1
d1

The next proposition shows that there exist processes that satisfy (H_k).
Proposition 3.1 Let X = {X(t) : t ∈ R^d} be a centred stationary Gaussian process having continuous spectral density f^X. Assume that f^X(x) > 0 for every x ∈ R^d
and that for any α > 0, f^X(x) ≤ C_α ‖x‖^{−α} holds true for some constant C_α and all
x ∈ R^d.
Then, X satisfies (H_k) for every k = 1, 2, ...

Proof: The proof is an adaptation of the proof of a related result for d = 1


(Cramer & Leadbetter (1967), p. 203).
It is well-known that the hypothesis implies that the paths are C k for every
k = 1, 2, ... As for the non-degeneracy condition, let t1 , ..., tn be pairwise different
values of the parameter. Denote by k1 ,k2 ...,kd X the partial derivative of X k1 times
with respect to the first coordinate, k2 times with respect to the second, ...., kd times
with respect to the d-th coordinate. We want to prove that, for any k = 1, 2, ... the
centred Gaussian joint distribution of the random variables
k1 ,k2 ...,kd X(th )
where the d-tuple (k1 , ..., kd ) varies on the set of non-negative integers such that
k1 + ...kd d and th varies in the set {t1 , ...tn }, is non-degenerate. For this purpose,
it suffices to show that if we put
n

Z=
h=1

k1 ,k2 ...,kd ,h k1 ,k2 ...,kd X(th )

where k denotes summation over all the d-tuples of non-negative integers k1 , k2 ..., kd
such that k1 +k2 +..+kd k and k1 ,k2 ...,kd ,h are complex numbers, then E |Z|2 = 0
implies k1 ,k2 ...,kd ,h = 0 for any choice of the indices k1 , k2 ..., kd , h in the sum. Using
the spectral representation, and denoting x = (x1 , ..., xd ),
n

E |Z|

k1 ,k2 ...,kd ,h k1 ,k2 ...,kd ,h


h,h =1

(ix1 )k1 ...(ixd )kd (ix1 )k1 ...(ixd )kd .


Rd

. exp [i x, th th ] f X (x)dx
where the inner sum is over all 2dtuples of non-negative integers k1 , k2 ..., kd , k1 , k2 ..., kd
such that k1 + k2 + .. + kd k, k1 + k2 + .. + kd k. Hence,
2

E |Z|

k1

=
Rd h=1

kd

k1 ,k2 ...,kd ,h (ix1 ) ...(ixd ) exp [i x, th ]

f X (x)dx

The hypothesis on f X implies that if E |Z|2 = 0, then


n

h=1

k1 ,k2 ...,kd ,h (ix1 )k1 ...(ixd )kd exp [i x, th ] = 0 for all x Rd .

The result follows from the fact that the set of functions xk11 ...xkdd exp [i x, th ] where
k1 , k2 ..., kd , h vary as above, is linearly independent.
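For instance (a standard example, not taken from the paper): the spectral density f^X(x) = (2π)^{−d/2} exp(−‖x‖²/2), whose covariance is r(t) = exp(−‖t‖²/2), is everywhere positive and, for every α > 0, satisfies f^X(x) ≤ C_α ‖x‖^{−α} for a suitable constant C_α; so, under the hypotheses as stated above, the corresponding stationary process satisfies (H_k) for every k.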


Theorem 3.1 (First derivative, first form) Let X : I → R be a Gaussian
process, I a C^∞ compact d-dimensional manifold.
Assume that X verifies (H_k) for every k = 1, 2, ...
Then, the function u ↦ F_I(u) is absolutely continuous and its Radon-Nikodym
derivative is given for almost every u by:

F'_I(u) = (−1)^d ∫_I E(det(X''(t)) 1I_{M_I≤u} / X(t) = u, X'(t) = 0) p_{X(t),X'(t)}(u, 0) σ(dt)
    + (−1)^{d−1} ∫_{∂I} E(det(X̄''(t)) 1I_{M_I≤u} / X(t) = u, X̄'(t) = 0) p_{X(t),X̄'(t)}(u, 0) σ̄(dt).        (17)
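To fix ideas, here is a sketch of the special case d = 1, I = [a, b] (our own reading, with the usual convention that the determinant of an empty matrix equals 1 for the boundary term): on the event {M_I ≤ u, X(t) = u, X'(t) = 0} one has X''(t) ≤ 0, so (−1)^1 det(X''(t)) = |X''(t)|, and (17) reads, for almost every u,

F'_I(u) = ∫_a^b E(|X''(t)| 1I_{M_I≤u} / X(t) = u, X'(t) = 0) p_{X(t),X'(t)}(u, 0) dt + Σ_{t∈{a,b}} E(1I_{M_I≤u} / X(t) = u) p_{X(t)}(u),

which is consistent with the d = 1 formulae of Azaïs & Wschebor (2001).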

Proof : For u < v and S (respectively S) a subset of I (resp. I), let us denote
Mu,v (S) =

{t S : u < X(t) v, X (t) = 0, X (t) 0}

Mu,v (S) =

t S : u < X(t) v, X (t) = 0, X (t) 0

Step 1. Let h > 0 and consider the increment


FI (u) FI (u h) = P {MI u}

1 Muh,u (I) 1
Muh,u (I)

Let us prove that


1, Muh,u (I) 1 = o(h) as h 0.
P Muh,u (I)

(18)

In fact, for > 0 :


1, Muh,u (I) 1
P Muh,u (I)
E Muh,u (I )Muh,u (I) + E (Muh,u (I \ I )) (19)
The first term in the right-hand member of (19) can be computed by means of
a Rice-type Formula, and it can be expressed as:
u

(dt)(dt)
I I

dxdx
uh

E 1 (X (t)) 1 (X (t))/X(t) = x, X(t) = x, X (t) = 0, X (t) = 0


pX(t),X(t),X (t),X (t) (x, x, 0, 0),

where the function 1 has been defined in Theorem 2.3.


Since in this integral t t , the integrand is bounded and the integral is
O(h2 ).
For the second term in (19) we apply Rice formula again. Taking into account
that the boundary of I is smooth and compact, we get:
E (Muh,u (I \ I )}
u

E 1 (X (t))/X(t) = x, X (t) = 0 pX(t),X (t) (x, 0) dx

(dt)
I\I

uh

(const) h (I \ I ) (const) h.,


where the constant does not depend on h and . Since > 0 can be chosen arbitrarily
small, (18) follows and we may write:
FI (u) FI (u h)
1 + P MI u, Muh,u (I) 1 + o(h)
= P MI u, Muh,u (I)
as h 0.
Note that the foregoing argument also implies that FI is absolutely continuous
with respect to Lebesgue measure and that the density is bounded above by the
right-hand member of (17). In fact:
1 + P Muh,u (I) 1
FI (u) FI (u h) P Muh,u (I)
+ E Muh,u (I)
E Muh,u (I)
and it is enough to apply Rice Formula to each one of the expectations on the
right-hand side.
The delicate part of the proof consists in showing that we have equality in (17).
Step 2. For g : I R we put
g

= sup |g(t)|
tI

and if k is a non-negative integer,


g

,k

sup
k1 +k2 +..+kd k


k1 ,k2 ...,kd g

For fixed > 0 (to be chosen later on) and h > 0,we denote by Eh the event:
Eh =

,4

Because of the Landau-Shepp-Fernique inequality (see Landau-Shepp, 1970 or Fernique, 1975) there exist positive constants C1 , C2 such that
P(EhC ) C1 exp C2 h2 = o(h) as h 0
so that to have (17) it suffices to show that, as h 0 :
E
E

1I
Muh,u (I)
1IMI u 1IEh = o(h)

Muh,u (I)1

(20)

Muh,u (I) 1IMuh,u (I)1 1IMI u 1IEh = o(h)

(21)

We prove (20). (21) can be proved in a similar way.


We have:
Put Muh,u = Muh,u (I).
E

Muh,u 1IMuh,u 1 1IMI u 1IEh E (Muh,u (Muh,u 1) 1IEh )


u

(s) (t)
II

dx1 dx2
uh

E 1 (X (s)) 1 (X (t)) 1IEh /X(s) = x1 , X(t) = x2 , X (s) = 0, X (t) = 0


.pX(s),X(t),X (s),X (t) (x1, x2 , 0, 0), (22)
on applying Rice formula for the second factorial moment.
Our goal is to prove that the integrand in the right member of (22), that is:
u

dx1 dx2

As,t =
uh

E |det(X (s) det(X (t)| 1IX

(s)0,X (t)0

1IEh /X(s) = x1 , X(t) = x2 , X (s) = 0, X (t) = 0


.pX(s),X(t),X (s),X (t) (x1, x2 , 0, 0), (23)

is o(h) as h 0 uniformly on s, t. Note that when s, t vary in a domain of the form


D := {t, s I : t s > } for some > 0, then the Gaussian distribution in (23)
is non-degenerate and As,t is bounded by (const)h2 , the constant depending on the
minimum of the determinant:
det Var (X(s), X(t), X (s), X (t) ,

for s, t D .
So it is enough to prove that As,t = o(h) for t s small, and we may assume
that s and t are in the same chart (U, ). Writing the process in this chart we may
assume that I is a ball or a half ball in Rd . Let s, t two such points, define the
process Y = Y s,t by
Y ( ) = X s + (t s)

; [0, 1].

Under the conditioning one has:


Y (0) = x1 ,

Y (1) = x2 ,

Y (0) = Y (1) = 0

Y (0) = X (s)[(t s), (t s)] ; Y (1) = X (t)[(t s), (t s)].


Consider the interpolation polynomial Q of degree 3 such that
Q(0) = x_1, Q(1) = x_2, Q'(0) = Q'(1) = 0.
Check that
Q(y) = x_1 + (x_2 − x_1) y²(3 − 2y),  Q''(0) = −Q''(1) = 6(x_2 − x_1).
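Indeed, a direct computation (included here for completeness) gives Q'(y) = 6(x_2 − x_1) y(1 − y) and Q''(y) = 6(x_2 − x_1)(1 − 2y), so that Q(0) = x_1, Q(1) = x_2, Q'(0) = Q'(1) = 0 and Q''(0) = −Q''(1) = 6(x_2 − x_1), as claimed.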
Denote
Z( ) = Y ( ) Q( ) 0 1.
Under the conditioning, one has:
Z(0) = Z(1) = Z (0) = Z (1) = 0
and if also the event Eh occurs, an elementary calculation shows that for 0 1 :
|Z (4) ( )|
|Y (4) ( )|
= sup
(const) t s 4 h .
2!
2!
[0,1]
[0,1]

|Z ( )| sup

(24)

On the other hand, check that if A is a positive semi-definite symmetric d × d
real matrix and v_1 is a vector of Euclidean norm equal to 1, then the inequality

det(A) ≤ ⟨Av_1, v_1⟩ det(B)        (25)

holds true, where B is the (d − 1) × (d − 1) matrix

B = ((⟨Av_j, v_k⟩))_{j,k=2,...,d}

and {v_1, v_2, ..., v_d} is an orthonormal basis of R^d containing v_1.
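One way to see (25) (a remark added here for completeness): in the orthonormal basis {v_1, ..., v_d}, A has the block form with 1 × 1 upper-left block a = ⟨Av_1, v_1⟩ and lower-right block B, and Fischer's inequality for positive semi-definite matrices gives det(A) ≤ a det(B) = ⟨Av_1, v_1⟩ det(B).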


Assume X(s) is negative definite, and that the event Eh occurs. We can apply
(25) to the matrix A = X(s) and the unit vector
ts
.
ts

v1 =

Note that in that case, the elements of matrix B are of the form X(s)vj , vk
hence bounded by (const)h . So,

det [X (s)] X (s)v1 , v1 Cd h(d1) = Cd [Y (0)]

ts

2 (d1)

the constant Cd depending only on the dimension d.


Similarly, if X(t) is negative definite, and the event Eh occurs, then:
det [X (t)] Cd [Y (1)]

ts

2 (d1)

Hence, if C is the condition {X(s) = x1 , X(t) = x2 , X (s) = 0, X (t) = 0}:


E |det(X (s)) det(X (t))| 1IX
Cd2 h2(d1) t s

Cd2

Cd2

2(d1)

2(d1)

(s)0,X (t)0

1IEh /C

E [Y (0)] [Y (1)]

ts

Y (0) + Y (1)
2

ts

Z (0) + Z (1)
2

(const) Cd2 h2d t s

1IEh /C
1IEh /C
1IEh /C

We now turn to the density in (22) using the following Lemma which is similar
to Lemma 4.3., p. 76, in Piterbarg (1996).
Lemma 3.1 For all s, t I:
ts

d+3

pX(s),X(t),X (s),X (t) (0, 0, 0, 0) D

(26)

where D is a constant.
Proof. Assume that (26) does not hold, i.e., that there exist two convergent
sequences {sn }, {tn } in I , sn s , tn t such that
tn sn

d+3

pX(sn ),X(tn ),X (sn ),X (tn ) (0, 0, 0, 0) +



(27)

If s = t , (27) can not hold, since the non degeneracy condition assures that this
sequence has the finite limit t s d+3 pX(s ),X(t ),X (s ),X (t ) (0, 0, 0, 0). So, s = t .
Since one can assume with no loss of generality that I is a ball or a half ball, the
n
segment [sn , tn ] is contained in I. Denote the unit vector e1,n = ttnn s
,complete
sn
d
it to an orthonormal basis {e1,n , e2,n , ..., ed,n } of R and take a subsequence of the
integers {nk } so that ej,nk ej as k + for j = 1, ..., d. In what follows, without
loss of generality, we assume that {nk } is the sequence of all positive integers. For
each Rd we denote 1,n , ..., d,n the coordinates of in the basis {e1,n , ..., ed,n }.
Note that tn sn has coordinates (t1,n s1,n , 0, ..., 0) = ( tn sn , 0, ..., 0).
Also, we denote 1 , ..., d the coordinates of in the basis {e1 , ..., ed }
The following computation is similar to the proof of Lemma 3.2. in Azas &
Wschebor (2001). We have:
n = det Var (X(sn ), X(tn ), X (sn ), X (tn ))
X
X
X
X
= det Var X(sn ), X(tn ),
(sn ),
(tn ), ...,
(sn ),
(tn )
1,n
1,n
d,n
d,n
X
X
X
= det Var X(sn ),
(sn ), Y1,n , Z1,n ,
(sn ), Z2,n , ...,
(sn ), Zd,n
1,n
2,n
d,n
where

X
(sn )(t1,n s1,n )
1,n
X
X
2
Z1,n =
(tn )
(sn )
Y1,n
1,n
1,n
t1,n s1,n
X
X
X
X
(tn )
(sn ), ....., Zd,n =
(tn )
(sn )
Z2,n =
2,n
2,n
d,n
d,n
Using now Taylor expansions and taking into account the integrability of the supremum of bounded Gaussian process, we have:
Y1,n = X(tn ) X(sn )

Y1,n =
Z1,n

(t1,n s1,n )2 2 X
(sn ) + 1,n (t1,n s1,n )3
2
2
1,n

(t1,n s1,n )2 3 X
=
(sn ) + n (t1,n s1,n )3
3
6
1,n

2X
(sn ) + 2,n (t1,n s1,n )2 , ......,
2,n 1,n
2X
= (t1,n s1,n )
(sn ) + d,n (t1,n s1,n )2
d,n 1,n

Z2,n = (t1,n s1,n )


Zd,n


where the random variables 1,n , 2,n , ..., d,n , n are uniformly bounded in L2 of the
underlying probability space.
Substituting into n it follows that:
144 (t1,n s1,n )[8+2(d1)] n
X
2X
3X
2X
X
X

det Var X(s ), (s ), ...,


(s
),
(s )
(s
),
(s
),
(s
),
...,
1
2
2 1
d
d 1
(1 )3
and this limit is bounded below by a positive constant, independent of s , because
of the non-degeneracy assumption. Since t1,n s1,n = tn sn , this contradicts
(27) and finishes the proof of the Lemma.
Returning to the proof of Theorem 3.1.
To bound the expression in (22) we use Lemma 3.1 and the bound on the conditional expectation, thus obtaining
E (Muh,u (Muh,u 1)1IEh )
(const)Cd2 h2d D

ts

d+1

ds dt

II

dx1 dx2
uh

(const) h22d
since the function (s, t)
t s d+1 is Lebesgue-integrable in I I. The last
constant depends only on the dimension d and the set I, Taking small enough
(20) follows.
An example: Let {X(s, t)} be a real-valued two-parameter Gaussian, centred
stationary isotropic process with covariance . Assume that its spectral measure
is absolutely continuous with density
(ds, dt) = f ()dsdt,
So that

= (s2 + t2 ) 2 .

f ()d = 1.
0

Assume further that Jk = 0 k f ()d < , for 1 k 5. Our aim is to give an


explicit upper bound for the density of the probability distribution of MI where I
is the unit disc i. e.
I = {(s, t) : s2 + t2 1}

Using (15) which is a consequence of Theorem 3.1 and the invariance of the law of
the process, we have
FI (u) E 1 (X (0, 0))/X(0, 0) = u, X (0, 0) = (0, 0) pX(0,0),X (0,0) (u, (0, 0))
(1, 0))/X(1, 0) = u, X
(1, 0) = 0 p
+ 2E 1 (X
(1,0) (u, 0) = I1 + I2 . (28)
X(1,0),X
We denote by X, X', X'' the value of the different processes at some point (s, t);
by X_ss, X_st, X_tt the entries of the matrix X'', and by φ and Φ the standard normal
density and distribution.
One can easily check that:
X is independent of X and X , and has variance J3 Id
Xst is independent of X, X Xss and Xtt , and has variance 4 J5
Conditionally on X = u, the random variables Xss and Xtt have
expectation: J3
J (J3 )2
variance: 3
4 5
covariance: 4 J5 (J3 )2 .
Using an elementary computation we get that the expectation of the negative part
of a Gaussian variable with expectation μ and variance σ² is equal to

σ φ(μ/σ) − μ Φ(−μ/σ).
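A quick numerical check of this elementary identity (our own sketch, with arbitrary illustrative values of μ and σ):

import numpy as np
from math import erf, exp, pi, sqrt

phi = lambda x: exp(-x * x / 2.0) / sqrt(2.0 * pi)       # standard normal density
Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))         # standard normal distribution

mu, sigma = 0.7, 1.3
closed_form = sigma * phi(mu / sigma) - mu * Phi(-mu / sigma)

rng = np.random.default_rng(1)
sample = rng.normal(mu, sigma, 2_000_000)
print("closed form :", closed_form)
print("Monte Carlo :", np.maximum(-sample, 0.0).mean())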

We obtain
I2 =

2
(u)
J3

3
J5 (J3 )2
4

with

1
2

(bu) + J3 u(bu) ,

J3

b=

3
J
4 5

(J3 )2

1
2

As for I1 we remark that, conditionally on X = u, Xss + Xtt and Xss Xtt are
independent, so that a direct computation gives:
I1 =

1
(u)E 1 2J3 u
8J3

J5 2
(2 + 32 )
4

1I{ < 2J u} 1I
1
3
{ 1 2J3 u

, (29)
J5 2
(2 + 32 ) > 0}
4

Where 1 , 2 , 3 are standard independent normal random variables and 2 = 2J5


4 2 J32 . Finally we get

2
I1 =
(u)
(2 +a2 c2 x2 )(acx)+[2a2 (acx)](acx) x(x)dx,
8J3
0
with a = 2J3 u, c =

J5
.
4

4 First derivative, second form

We choose, once for all along this section a finite atlas A for I. Then, to every t I
it is possible to associate a fixed chart that will be denoted (Ut , t ). When t I,
t (Ut ) can be chosen to be a half ball with t (t) belonging to the hyperplane limiting
this half ball. For t I, let Vt an open neighbourhood of t whose closure is included
in Ut and t a C function such that
t 1
t 0

on
on

Vt
Utc

(30)
(31)

For every t I and s I we define the normalization n(t, s) in the following


way:
for s Vt , we set in the chart (Ut , t )
n1 (t, s) =

1
s t 2.
2

(32)

By in the chart we mean that s t , is in fact t (t) t (s) .


for general s we set
n(t, s) = t (s)n1 (t, s) + 1 t (s)
Note that in the flat case (d=N) the simpler definition n(t, s) =
works.

1
2

st

For every t I and s I, we set instead of formula (32)


n1 (t, s) = |(s t)N | +

1
s t 2.
2

where (st)N is the normal component of (st) with respect to the hyperplane
delimiting the half ball t (Ut ) . The rest of the definition is the same.

Definition 4.1 We will say that f is an helix-function - or an h-function - on I


with pole t I satisfying hypothesis Ht,k , k integer k > 1 if
f is a bounded C k function on I\{t} .
f (s) := n(t, s)f (s) can be prolonged as function of class C k on I.
Definition 4.2 In the same way X is called an h-process with pole t I satisfying
hypothesis Ht,k , k integer k > 1 if
Z is a Gaussian process with C k paths on I\{t} .
Z(s) := n(t, s)Z(s) can be prolonged as a process of class C k on I,
for t I;
with Z(t) = 0 Z (t) = 0. If s1 , ..., sm are pairwise different points of I\{t}
then the distribution of
Z (2) (t), ..., Z (k) (t), Z(s1 ), ..., Z (k) (s1 ), ..., Z (k) (sm )
does not degenerate.
for t I; Z(s) := n(t, s)Z(s) can be prolonged as a process of class C k on I
(t) = 0 and if s1 , ..., sm are pairwise different points of I\{t}
with Z(t) = 0 Z
then the distribution of
Z N (t), Z (2) (t), ..., Z (k) (t), Z(s1 ), ..., Z (k) (s1 ), ..., Z (k) (sm )
does not degenerate. Z N (t) is the derivative normal to the boundary of I at t.
We use the terms h-function and h-process since the function and the paths
of the process need not to extend to a continuous function at the point t. However,
the definition implies the existence of radial limits at t. So the process may take the
form of a helix around t.
Lemma 4.1 Let X be a process satisfying Hk , k 2, and f be a C k function I R
set for s I, s = t
(A) For t I,
X(s) = ats X(t)+ < bts , X (t) > +n(t, s)X t (s),
where ats and bts are the regression coefficients.
In the same way, set
f (s) = ats f (t)+ < bts , f (t) > +n(t, s)f t (s),

using the regression coefficients associated to X.


(B) For t I, s T, s = t set
(t) > +n(t, s)X t (s)
X(s) = a
ts X(t)+ < bts , X
and

f (s) = a
ts f (t)+ < bts , f (t) > +n(t, s)f t (s),

Then s
X t (s) and s
pole t satisfying Ht,k .

f t (s) are respectively a h-process and a h-function with

the other one being similar. In fact,


Proof: We give the proof in the case t I,
t
the quantity denoted by X (s) is just X(s) ats X(t) < bts , X (t) >. On L2 (, P ),
let be the projector on the orthogonal complement to the subspace generated by
X(t), X (t). Using a Taylor expansion
X(s) = X(t)+ < (s t), X (t) > + t s

X (1 )t + s v, v (1 )d,
0

With v =

st
st

. This implies that


t

X (s) = t s

X (1 )t + s v, v (1 )d ,

(33)

which gives the result due to the non degeneracy condition.


We state now an extension of Ylvisakers Theorem (1968) on the regularity of
the distribution of the maximum of a Gaussian process which we will use in the
proof of Theorem 4.2 and might have some interests in itself.
Theorem 4.1 Let Z : T R a Gaussian separable process on some parameter
set T and denote by M Z = suptT Z(t) which is a random variable taking values in
R {+}. Assume that there exists 0 > 0, m > such that
m(t) = E(Zt ) m
2 (t) = Var(Zt ) 02
for every t T . Then the distribution of the random variable M Z is the sum of an
atom at + and a-possibly defective-probability measure on R which has a locally
bounded density.

Proof: Suppose first that X : T R a Gaussian separable process satisfying


Var(Xt ) = 1 ; E(Xt ) 0,
for every t T . A close look at Ylvisakers proof (1968) shows that the distribution
of the supremum M X has a density pM X that satisfies
pM X (u) (u) =

exp(u2 /2)
for every u R

2 /2)dv
exp(v
u

(34)

Let now Z satisfy the hypotheses of the theorem. For given a, b R, a < b,
choose A R+ so that |a| < A and consider the process:
X(t) =

Z(t) a |m | + A
.
+
(t)
0

Clearly for every t T :


E X(t) =

m(t) a |m | + A
|m | + |a| |m | + A
+

+
0,
(t)
0
0
0

and
Var X(t) = 1.
So that (34) holds for the process X.
On the other hand:
|m | + A
|m | + A b a
{a < M Z b} {
< MX
+
}.
0
0
0
And it follows that
P a < MZ b

|m |+A ba
+
0
0
|m |+A
0

(u)du =
a

v a + |m | + A
1

dv.
0
0

which shows the statement.


Set now f ≡ 1. The key point is that, due to regression formulae, under the
condition {X(t) = u, X'(t) = 0} the event
A_u(X, f) := {X(s) ≤ u, ∀ s ∈ I}
coincides with the event
A_u(X^t, f^t) := {X^t(s) ≤ f^t(s) u, ∀ s ∈ I\{t}},
where X^t and f^t are the h-process and the h-function defined in Lemma 4.1.
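To make this step explicit (a short verification added here): applying the decomposition of Lemma 4.1 to the constant function f ≡ 1 gives 1 = a_{ts} + n(t, s) f^t(s), since f(t) = 1 and f'(t) = 0; under the condition X(t) = u, X'(t) = 0 the regression gives X(s) = a_{ts} u + n(t, s) X^t(s), so that X(s) ≤ u is equivalent to n(t, s) X^t(s) ≤ (1 − a_{ts}) u = n(t, s) f^t(s) u, that is to X^t(s) ≤ f^t(s) u, because n(t, s) > 0 for s ≠ t.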

Theorem 4.2 (First derivative, second form) Let X : I R be a Gaussian


process, I a C compact manifold contained in Rd .
Assume that X has paths of class C 2 and for s = t the triplet (X(s), X(t), X (t))
in R R Rd has a non-degenerate distribution.
Then, the result of Theorem 3.1 is valid, the derivative FI (u) given by relation
(17) can be written as
d

FI (u) = 1
+ 1

d1
I

E det X t (t) t (t)u 1IAu (X t , t ) pX(t),X (t) (u, 0)(dt)


t (t) t (t)u 1IAu (X t , t ) p
E det X
(dt), (35)
X(t),X (t) (u, 0)

and this expression is continuous as a function of u.


t

(t) should be understood in the sense that we first define X t and


The notation X
then calculate its second derivative along I.
Proof: As a first step, assume that the process X satisfies the hypotheses of
theorem 3.1, which are stronger that those in the present theorem.
We prove that the first term in (17) can be rewritten as the first term in (35).
One can proceed in a similar way with the second term, mutatis mutandis. For that
purpose, use the remark just before the statement of Theorem 4.2 and the fact that
under the condition
X(t) = u, X (t) = 0
, X (t) is equal to
X t (t) t (t)u.
Replacing in the conditional expectation in (17) and on account of the Gaussianity of the process, we get rid of the conditioning and obtain the first term in
(35).
We now study the continuity of u
FI (u). The variable u appears at three
locations
in the density pX(t),X (t) (u, 0) which is clearly continuous
in
E det X t (t) t (t)u 1IAu (X t , t )
where it occurs twice: in the first factor and in the indicator function.


Due to the integrability of the supremum of bounded Gaussian processes, it is


easy to prove that this expression is continuous as a function of the first u.
As for the u in the indicator function, set

Δ_v := det( X''^t(t) - α''^t(t) v )    (36)

and, for h > 0, consider the quantity

E( Δ_v 1I_{A_u(X^t, α^t)} ) - E( Δ_v 1I_{A_{u-h}(X^t, α^t)} ),

which is equal to

E( Δ_v 1I_{A_u(X^t, α^t) \ A_{u-h}(X^t, α^t)} ) - E( Δ_v 1I_{A_{u-h}(X^t, α^t) \ A_u(X^t, α^t)} ).    (37)

Apply Schwarz's inequality to the first term in (37):

| E( Δ_v 1I_{A_u(X^t, α^t) \ A_{u-h}(X^t, α^t)} ) | ≤ [ E(Δ_v²) P( A_u(X^t, α^t) \ A_{u-h}(X^t, α^t) ) ]^{1/2}.

The event A_u(X^t, α^t) \ A_{u-h}(X^t, α^t) can be described as

{ X^t(s) - α^t(s) u ≤ 0, ∀ s ∈ I \ {t} ;  ∃ s_0 ∈ I \ {t} : X^t(s_0) - α^t(s_0)(u - h) > 0 }.

This implies that α^t(s_0) > 0 and that

-h α^t(s_0) < sup_{s ∈ I \ {t}} ( X^t(s) - α^t(s) u ) ≤ 0.

Now, observe that our improved version of Ylvisaker's theorem (Theorem 4.1) applies to the process s ↦ X^t(s) - α^t(s) u defined on I \ {t}. This implies that the first term in (37) tends to zero as h ↓ 0. An analogous argument applies to the second term. Finally, the continuity of F'_I(u) follows from the fact that one can pass to the limit under the integral sign in (35).
To finish the proof we still have to show that the added hypotheses are in fact unnecessary for the validity of the conclusion. Suppose now that the process X satisfies only the hypotheses of the theorem and define

X_ε(t) = Z_ε(t) + ε Y(t),    (38)

where, for each ε > 0, Z_ε is a real-valued Gaussian process defined on I, measurable with respect to the σ-algebra generated by {X(t) : t ∈ I}, possessing C∞ paths and such that, almost surely, Z_ε(t), Z'_ε(t), Z''_ε(t) converge uniformly on I to X(t), X'(t), X''(t) respectively as ε ↓ 0. One standard way to construct such an approximating process Z_ε is to use a C∞ partition of unity on I and to approximate locally the composition of a chart with the function X by means of a convolution with a C∞ kernel.
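As an illustration of this smoothing step only (this is not the construction used in the proof; the path, kernel width and grid are arbitrary choices), the following sketch mollifies a sampled path by discrete convolution with a compactly supported C∞ bump kernel and checks that the smoothed path and its second (numerical) derivative approach those of the original path as the bandwidth decreases.

import numpy as np

def bump_kernel(eps, h):
    # Samples of the C-infinity bump x -> exp(-1/(1-(x/eps)^2)) on (-eps, eps),
    # normalised to sum to 1, on a grid of step h.
    x = np.arange(-eps, eps + h, h)
    k = np.zeros_like(x)
    inside = np.abs(x) < eps
    k[inside] = np.exp(-1.0 / (1.0 - (x[inside] / eps) ** 2))
    return k / k.sum()

h = 1e-3
t = np.arange(0.0, 1.0, h)
path = np.cos(7 * t) + 0.1 * np.sin(40 * t)        # stand-in for a C^2 path of X

for eps in (0.05, 0.02, 0.005):
    k = bump_kernel(eps, h)
    smooth = np.convolve(path, k, mode="same")     # Z_eps = (chart of X) * kernel, discretised
    err0 = np.abs(smooth - path)[100:-100].max()
    d2_smooth = np.gradient(np.gradient(smooth, h), h)
    d2_path = np.gradient(np.gradient(path, h), h)
    err2 = np.abs(d2_smooth - d2_path)[100:-100].max()
    print(f"eps={eps}: sup|Z_eps - X| = {err0:.2e}, sup|Z_eps'' - X''| = {err2:.2e}")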
In (38), Y denotes the restriction to I of a Gaussian centred stationary process satisfying the hypotheses of Proposition 3.1, defined on R^d and independent of X. Clearly X_ε satisfies condition (H_k) for every k, since it has C∞ paths, and the independence of the two terms in (38) ensures that X_ε inherits from Y the non-degeneracy condition in Definition 3.1. So, if
M_I^ε = max_{t∈I} X_ε(t)   and   F_I^ε(u) = P{ M_I^ε ≤ u },

one has

(F_I^ε)'(u) = (-1)^d ∫_I E( det( X''^{ε,t}(t) - α''^{ε,t}(t) u ) 1I_{A_u(X^{ε,t}, α^{ε,t})} ) p_{X_ε(t),X'_ε(t)}(u, 0) σ(dt)
            + (-1)^{d-1} ∫_{∂I} E( det( X̃''^{ε,t}(t) - α̃''^{ε,t}(t) u ) 1I_{A_u(X̃^{ε,t}, α̃^{ε,t})} ) p_{X̃_ε(t),X̃'_ε(t)}(u, 0) σ(dt).    (39)

We want to pass to the limit as ε ↓ 0 in (39). We prove that the right-hand member is bounded if ε is small enough and converges to a continuous function of u as ε ↓ 0. Since M_I^ε → M_I, this implies that the limit is continuous and coincides with F'_I(u), by a standard argument on convergence of densities. We consider only the first term in (39); the second is similar.
The convergence of X_ε and of its first and second derivatives, together with the non-degeneracy hypothesis, implies that, uniformly on t ∈ I, as ε ↓ 0:

p_{X_ε(t),X'_ε(t)}(u, 0) → p_{X(t),X'(t)}(u, 0).

The same kind of argument can be used for

E( det( X''^{ε,t}(t) - α''^{ε,t}(t) u ) ),

on account of the form of the regression coefficients and of the definitions of X^t and α^t. The only difficulty is to prove that, for fixed u,

P{ C Δ C_ε } → 0   as ε ↓ 0,    (40)

where

C = A_u(X^t, α^t),   C_ε = A_u(X^{ε,t}, α^{ε,t}).
We prove that

a.s.  1I_{C_ε} → 1I_C  as ε ↓ 0,    (41)

which implies (40).


First of all, note that the event

L = { sup_{s ∈ I \ {t}} ( X^t(s) - α^t(s) u ) = 0 }

has zero probability, as already mentioned.


Second, from the definition of X^{ε,t}(s) and the hypotheses, it follows that, as ε ↓ 0, X^{ε,t}(s), α^{ε,t}(s) converge to X^t(s), α^t(s) uniformly on I \ {t}. Now, if ω ∉ C, there exists s̄ = s̄(ω) ∈ I \ {t} such that

X^t(s̄) - α^t(s̄) u > 0,

and, for ε > 0 small enough, one has

X^{ε,t}(s̄) - α^{ε,t}(s̄) u > 0,

which implies that ω ∉ C_ε.
On the other hand, let ω ∈ C \ L. This implies that

sup_{s ∈ I \ {t}} ( X^t(s) - α^t(s) u ) < 0.

From the above-mentioned uniform convergence, it follows that, if ε > 0 is small enough, then

sup_{s ∈ I \ {t}} ( X^{ε,t}(s) - α^{ε,t}(s) u ) < 0,

hence ω ∈ C_ε. (41) follows.


So, we have proved that the limit as ε ↓ 0 of the first term in (39) is equal to the first term in (35), since, if ε > 0 is small enough, the integrand is bounded for t ∈ I and u in a compact interval of the real line, so that dominated convergence applies.
It remains only to prove that the first term in (35) is a continuous function of u. For this purpose, it suffices to show that the function

u ↦ P{ A_u(X^t, α^t) }

is continuous. This is a consequence of the inequality

| P{ A_{u+h}(X^t, α^t) } - P{ A_u(X^t, α^t) } | ≤ P{ | sup_{s ∈ I \ {t}} ( X^t(s) - α^t(s) u ) | ≤ |h| sup_{s ∈ I \ {t}} |α^t(s)| }

and of Theorem 4.1, applied once again to the process s ↦ X^t(s) - α^t(s) u defined on I \ {t}. □

5  Second derivative

Theorem 5.1 Suppose now that I is a d-dimensional C∞ manifold without boundary and that the process X satisfies hypothesis (H_4). Then the distribution of M_I admits a density F'_I which is absolutely continuous, and its derivative satisfies

F''_I(u) = (-1)^d ∫_I E( det( X''^t(t) - α''^t(t) u ) 1I_{A_u} ) p^{(1,0)}_{X(t),X'(t)}(u, 0) σ(dt)
  - (-1)^d ∫_I E( Σ_{i,j=1}^{d} α''^t_{ij}(t) C_{i,j}(u) 1I_{A_u} ) p_{X(t),X'(t)}(u, 0) σ(dt)
  + (-1)^d ∫_I σ(dt) p_{X(t),X'(t)}(u, 0) [ ∫_I σ(ds) α^t(s) E( det( X''^t(s) - α''^t(s) u ) det( X''^t(t) - α''^t(t) u ) 1I_{A_u} / X^t(s) = α^t(s) u, X'^t(s) = α'^t(s) u ) p_{X^t(s),X'^t(s)}( α^t(s) u, α'^t(s) u )
  + ∫_{S^{d-1}} σ(dw) E( det( X''_{tT}(w) - α''_{tT}(w) u ) det( X''^t(t) - α''^t(t) u ) 1I_{A_u} / X''_{tN}(w) = α''_{tN}(w) u, X''_{tNT}(w) = α''_{tNT}(w) u ) p_{X''_{tN}(w),X''_{tNT}(w)}( α''_{tN}(w) u, α''_{tNT}(w) u ) ],    (42)

where A_u stands for A_u(X^t, α^t), and p^{(1,0)}_{X(t),X'(t)}, C_{i,j}(u), X''_{tT}, X''_{tN}, X''_{tNT}, α''_{tT}, α''_{tN} and α''_{tNT} are defined in the proof.
Proof: We have to check that the expression given in Theorem 4.2, which now takes the form

F'_I(u) = (-1)^d ∫_I E( det( X''^t(t) - α''^t(t) u ) 1I_{A_u(X^t, α^t)} ) p_{X(t),X'(t)}(u, 0) σ(dt),    (43)

is differentiable with respect to u. A sufficient condition is that the integrand itself be differentiable with a derivative integrable in (t, u), t ∈ I, u in a compact interval.

The derivative of the integrand in (43) is the sum of the three derivatives corresponding to the three locations where the variable u appears, namely:
• the derivative in the density p_{X(t),X'(t)}(u, 0), which is clearly differentiable with bounded derivative p^{(1,0)}_{X(t),X'(t)}(u, 0). This gives the first term in (42);
• the derivative with respect to the first occurrence of u in E( det( X''^t(t) - α''^t(t) u ) 1I_{A_u(X^t, α^t)} ), which is equal to

E( - Σ_{i,j=1}^{d} α''^t_{ij}(t) C_{i,j}(u) 1I_{A_u(X^t, α^t)} ),

where C_{i,j}(u) is the cofactor of entry (i, j) in the matrix X''^t(t) - α''^t(t) u. This quantity is uniformly bounded when u varies in a compact interval, which follows easily from an expression of the type (33). This gives the second term in (42);
• the derivative with respect to the second occurrence of u in E( det( X''^t(t) - α''^t(t) u ) 1I_{A_u(X^t, α^t)} ). To evaluate this derivative, define Δ_v as in (36) and set, for δ > 0 sufficiently small,

I_δ := I \ B(t, δ) ;   A^δ_u = A^δ_u(X^t, α^t) := { X^t(s) ≤ α^t(s) u, ∀ s ∈ I_δ },

B(t, δ) being the ball with centre t and radius δ in the chart (φ_t, U_t). By dominated convergence,

E( Δ_v ( 1I_{A_{u+h}(X^t, α^t)} - 1I_{A_u(X^t, α^t)} ) ) = lim_{δ↓0} E( Δ_v ( 1I_{A^δ_{u+h}(X^t, α^t)} - 1I_{A^δ_u(X^t, α^t)} ) ).

On I_δ, X^t is a process satisfying (H_4). In the same manner as in Lemma 3.3 of Azaïs and Wschebor (2001), we can generalize the proof of Theorem 3.1 to the case of a non-constant function α^t and a non-constant random variable Δ_v, to obtain

E( Δ_v ( 1I_{A^δ_{u+h}} - 1I_{A^δ_u} ) )
= ∫_u^{u+h} dx ∫_{I_δ} σ(ds) (-1)^d α^t(s) E( det( Y''^t(s) ) Δ_v 1I_{A^δ_x} / X^t(s) = α^t(s) x, X'^t(s) = α'^t(s) x ) p_{X^t(s),X'^t(s)}( α^t(s) x, α'^t(s) x )
+ ∫_u^{u+h} dx ∫_{S(t,δ)} σ(ds) (-1)^{d-1} α^t(s) E( det( Ỹ''^t(s) ) Δ_v 1I_{A^δ_x} / X^t(s) = α^t(s) x, X̃'^t(s) = α̃'^t(s) x ) p_{X^t(s),X̃'^t(s)}( α^t(s) x, α̃'^t(s) x )
= I_1 + I_2,    (44)

where S(t, δ) is the sphere with centre t and radius δ, Y''^t(s) = X''^t(s) - α''^t(s) x and Ỹ''^t(s) = X̃''^t(s) - α̃''^t(s) x.
Let us prove that the first integral converges as δ ↓ 0. The only problem is the behaviour around t, so it is sufficient to prove the convergence locally around t, in the chart (φ_t, U_t), with s in V_t, which implies that n(s, t) = ½ ‖t - s‖². Without loss of generality we may assume that the representation of t in this chart is the point 0 in R^d. To study the behaviour of the integrand as s → 0, we choose an orthonormal basis having s/‖s‖ as first vector and set s = (ρ, 0, ..., 0)^T. Near s = 0, the process X̄^t(s) := n(s, 0) X^t(s) and its derivatives have the following expansions (for short, derivatives are indicated by sub-indices):
X̄^t(s) = ½ ρ² X̄_{11} + (ρ³/6) X̄_{111} + (ρ⁴/24) X̄_{1111} + o(ρ⁴)    (45)
∂X̄^t/∂s_1 (s) = X̄^t_1(s) = ρ X̄_{11} + (ρ²/2) X̄_{111} + (ρ³/6) X̄_{1111} + o(ρ³)    (46)
∂²X̄^t/∂s_1² (s) = X̄^t_{11}(s) = X̄_{11} + ρ X̄_{111} + (ρ²/2) X̄_{1111} + o(ρ²)    (47)
∂X̄^t/∂s_j (s) = X̄^t_j(s) = ρ X̄_{1j} + O(ρ²)   (j ≠ 1),    (48)

where X̄_{ij} = X̄^t_{ij}(0), X̄_{111} = ∂³X̄^t/∂s_1³ (0) and X̄_{1111} = ∂⁴X̄^t/∂s_1⁴ (0). Moreover,

X^t(s) = m(s) X̄^t(s),   with   m(s) := 1/n(s, 0) = 2/( s_1² + ... + s_d² ),

and the derivatives of m at the point (ρ, 0, ..., 0) are

∂m/∂s_1 = -4/ρ³ ;   ∂m/∂s_i = 0  (i ≠ 1) ;   ∂²m/∂s_1² = 12/ρ⁴ ;   ∂²m/∂s_i² = -4/ρ⁴  (i ≠ 1) ;   ∂²m/∂s_i ∂s_j = 0  (i ≠ j).
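These derivative values of m are elementary but easy to get wrong; the following short symbolic check (dimension and variable names are arbitrary choices, not from the paper) reproduces them.

import sympy as sp

rho = sp.symbols('rho', positive=True)
d = 4                                           # any dimension >= 3 illustrates all the cases
s = sp.symbols(f's1:{d + 1}', real=True)        # s1, ..., sd
m = 2 / sum(si ** 2 for si in s)                # m(s) = 1/n(s,0) = 2/(s1^2 + ... + sd^2)
at = {s[0]: rho, **{si: 0 for si in s[1:]}}     # evaluation point (rho, 0, ..., 0)

print(sp.simplify(sp.diff(m, s[0]).subs(at)))          # -4/rho**3
print(sp.simplify(sp.diff(m, s[1]).subs(at)))          # 0
print(sp.simplify(sp.diff(m, s[0], 2).subs(at)))       # 12/rho**4
print(sp.simplify(sp.diff(m, s[1], 2).subs(at)))       # -4/rho**4
print(sp.simplify(sp.diff(m, s[1], s[2]).subs(at)))    # 0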
Using the differentiation rules, we get

X^t(s) = X̄_{11} + (ρ/3) X̄_{111} + O(ρ²)    (49)
∂X^t/∂s_1 (s) = X^t_1(s) = (1/3) X̄_{111} + O(ρ)    (50)
∂²X^t/∂s_1² (s) = X^t_{11}(s) = (1/6) X̄_{1111} + O(ρ)    (51)
∂X^t/∂s_j (s) = X^t_j(s) = (2/ρ) X̄_{1j} + O(1)   (j ≠ 1)    (52)
∂²X^t/∂s_i ∂s_j (s) = (2/ρ²) X̄_{ij} + O(ρ^{-1})   (j ≠ 1, i ≠ 1, i ≠ j).    (53)

From this we deduce that

p_{X^t(s),X'^t(s)}( α^t(s) x, α'^t(s) x ) ≤ (const) ρ^{d-1}.
Since, if A^δ_x occurs and X'^t(s) = α'^t(s) x, then a.s. the matrix X''^t(s) - α''^t(s) x is negative definite, using relation (25) we have

| det( X''^t(s) - α''^t(s) x ) | ≤ ∏_{i=1}^{d} | X^t_{ii}(s) - α^t_{ii}(s) x |,

where the notation α^t_{ii}(s) has an obvious meaning. The condition

C(s) = { X^t(s) = α^t(s) x ; X'^t(s) = α'^t(s) x }

converges, as s → 0, to the condition

{ X̄_{11} = ᾱ_{11} x ; X̄_{111} = ᾱ_{111} x ; X̄_{1i} = ᾱ_{1i} x  (i = 2, ..., d) },

which is non-singular (again, the notations ᾱ_{11}, ᾱ_{111}, ᾱ_{1i} are obvious). Consider a Gaussian variable which is measurable with respect to the process and which is bounded in probability; then its distribution conditional on C(s) remains bounded in probability. Since, for i ≠ 1,

X^t_{ii}(s) - α^t_{ii}(s) x = (2/ρ²) ( X̄_{ii} - ᾱ_{ii} x ) - (2/ρ²) ( X̄_{11} - ᾱ_{11} x ) + O_p(ρ^{-1}) = O_p(ρ^{-2}),

this variable has the same order of magnitude under C(s). On the other hand, under C(s),

X^t_{11}(s) = O_p(1) ;   α^t_{11}(s) = O(1).

Finally we get

E( | det( X''^t(s) - α''^t(s) x ) | Δ_v 1I_{A^δ_x} / C(s) ) = O( ρ^{-2(d-1)} ).

Since α^t(s) is bounded, we see that the integrand in I_1 is O(ρ^{1-d}), which ensures the convergence of I_1 as δ ↓ 0. One easily checks that the bound for the integrand is uniform in t.
We consider now the limit of I_2 as δ ↓ 0. It is enough to prove that, for each x ∈ R, the expression

∫_u^{u+h} dx (-1)^{d-1} ∫_{S(t,δ)} σ(ds) α^t(s) E( det( X̃''^t(s) - α̃''^t(s) x ) Δ_v 1I_{A^δ_x} / X^t(s) = α^t(s) x, X̃'^t(s) = α̃'^t(s) x ) p_{X^t(s),X̃'^t(s)}( α^t(s) x, α̃'^t(s) x )    (54)

converges boundedly as δ ↓ 0. Making in (54) the change of variable s = t + δw, w ∈ S^{d-1}, it becomes

(-1)^{d-1} ∫_u^{u+h} dx δ^{2(d-1)} ∫_{S^{d-1}} σ(dw) α^t(t + δw) E( det( X̃''^t(t + δw) - α̃''^t(t + δw) x ) Δ_v 1I_{A^δ_x} / X^t(t + δw) = α^t(t + δw) x, X̃'^t(t + δw) = α̃'^t(t + δw) x ) p_{X^t(t+δw),X̃'^t(t+δw)}( α^t(t + δw) x, α̃'^t(t + δw) x ),    (55)

where we have used that

p_{X, δY}( x, δu ) = δ^{-(d-1)} p_{X,Y}( x, u )

if (X, Y) is a random vector in R × R^{d-1} and δ > 0.
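The scaling identity just quoted is an elementary change of variables in the density. A quick numerical sanity check, with an arbitrary Gaussian vector (nothing here comes from the paper), is:

import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(2)
d = 4                                   # (X, Y) with Y in R^{d-1}
A = rng.standard_normal((d, d))
cov = A @ A.T + np.eye(d)               # arbitrary non-degenerate covariance
delta = 0.3
x = 0.7
u = rng.standard_normal(d - 1)

# Covariance of (X, delta*Y) is D cov D^T with D = diag(1, delta, ..., delta).
D = np.diag([1.0] + [delta] * (d - 1))
p_scaled = multivariate_normal(mean=np.zeros(d), cov=D @ cov @ D.T).pdf(np.concatenate(([x], delta * u)))
p_plain = multivariate_normal(mean=np.zeros(d), cov=cov).pdf(np.concatenate(([x], u)))
print(p_scaled, delta ** (-(d - 1)) * p_plain)   # the two values coincide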
Now consider the following decomposition into blocks of the matrix X''^t(t), written in an orthonormal basis whose first vector is w ∈ S^{d-1}:
• X''_{tN}(w) is the second derivative of X^t in the direction w, that is, w^T X''^t(t) w;
• X''_{tT}(w) is the (d - 1) × (d - 1) matrix consisting of the second derivatives of X^t in the directions orthogonal to w;
• X''_{tNT}(w) is the (d - 1)-vector consisting of the cross second derivatives of X^t, one in the direction w and one in the d - 1 directions orthogonal to w.
In the new basis, the matrix X''^t(t) takes the form

( X''_{tN}(w)    X''_{tNT}(w) )
( X''_{tNT}(w)   X''_{tT}(w)  ).

We make the same decomposition, with obvious notations, for α''^t(t).
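As a purely illustrative aside (the function name, the example matrix and the direction are ad hoc, not from the paper), such a normal/tangential block decomposition of a symmetric matrix with respect to a direction w can be computed by completing w to an orthonormal basis:

import numpy as np

def normal_tangential_blocks(H, w, seed=0):
    # Blocks of the symmetric matrix H in an orthonormal basis whose first
    # vector is w: (w^T H w, the cross terms, the tangential (d-1)x(d-1) block).
    d = len(w)
    w = w / np.linalg.norm(w)
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(np.column_stack([w, rng.standard_normal((d, d - 1))]))
    if np.dot(Q[:, 0], w) < 0:          # fix the sign so that the first basis vector is w
        Q[:, 0] = -Q[:, 0]
    B = Q.T @ H @ Q                     # H expressed in the new basis
    return B[0, 0], B[0, 1:], B[1:, 1:]  # H_N(w), H_NT(w), H_T(w)

H = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.3],
              [0.1, 0.3, 3.0]])
w = np.array([1.0, 1.0, 0.0])
H_N, H_NT, H_T = normal_tangential_blocks(H, w)
print(H_N, H_NT, H_T, sep="\n")
# H_N equals w^T H w / ||w||^2; the tangential basis vectors are chosen arbitrarily.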


Relations (49) to (52) imply that, as δ ↓ 0,

δ^{2(d-1)} p_{X^t(t+δw),X̃'^t(t+δw)}( α^t(t+δw) x, α̃'^t(t+δw) x ) → p_{X''_{tN}(w),X''_{tNT}(w)}( α''_{tN}(w) x, α''_{tNT}(w) x ),    (56)

and (53) implies that

(δ²/2) X̃''^t(t + δw) → X''_{tT}(w) ,   (δ²/2) α̃''^t(t + δw) → α''_{tT}(w).

Noting that 1I_{A^δ_x} → 1I_{A_x}, we get that, as δ ↓ 0,

δ^{2(d-1)} E( det( X̃''^t(t+δw) - α̃''^t(t+δw) x ) Δ_v 1I_{A^δ_x} / X^t(t+δw) = α^t(t+δw) x, X̃'^t(t+δw) = α̃'^t(t+δw) x )
→ 2^{d-1} E( det( X''_{tT}(w) - α''_{tT}(w) x ) Δ_v 1I_{A_x} / X''_{tN}(w) = α''_{tN}(w) x, X''_{tNT}(w) = α''_{tNT}(w) x ).

Remarking that the integrand is uniformly bounded, we are ready to pass to the limit and get the result. □

6  Asymptotic expansion of F'(u) for large u

Corollary 6.1 Suppose that the process X satisfies the conditions of Theorem 4.2 and that, in addition, E(X_t) = 0 and Var(X_t) = 1.
Then, as u → +∞, F'_I(u) is equivalent to

( u^d / (2π)^{(d+1)/2} ) e^{-u²/2} ∫_I ( det Λ(t) )^{1/2} dt,    (57)

where Λ(t) is the variance-covariance matrix of X'(t).
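For d = 1 and I a circle, integrating (57) in u gives the classical Rice approximation P(M_I > u) ≈ (2π)^{-1} e^{-u²/2} ∫_I Λ(t)^{1/2} dt. The following Monte Carlo sketch only illustrates that special case, with a stationary random Fourier series standing in for X; the number of harmonics, the level u and the sample sizes are arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)
K = 5                                         # number of harmonics
a2 = np.ones(K) / K                           # spectral weights, sum to 1 => Var X(t) = 1
lam2 = np.sum(a2 * np.arange(1, K + 1) ** 2)  # second spectral moment = Var X'(t) = Lambda(t)

t = np.linspace(0.0, 2 * np.pi, 1024, endpoint=False)
k = np.arange(1, K + 1)[:, None]
u = 2.5                                       # level at which the tail is compared
n_mc = 5000
exceed = 0
for _ in range(n_mc):
    xi = rng.standard_normal(K)[:, None]
    eta = rng.standard_normal(K)[:, None]
    path = (np.sqrt(a2)[:, None] * (xi * np.cos(k * t) + eta * np.sin(k * t))).sum(axis=0)
    exceed += path.max() > u

mc_tail = exceed / n_mc
# Corollary 6.1 with d = 1, I the circle of length 2*pi and Lambda(t) = lam2:
asym_tail = 2 * np.pi * np.sqrt(lam2) * np.exp(-u ** 2 / 2) / (2 * np.pi)
print(f"Monte Carlo P(M > {u}) = {mc_tail:.4f}, asymptotic value = {asym_tail:.4f}")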

Proof: Set r(s, t) := E( X(s) X(t) ) and, for i, j = 1, ..., d,

r_{i;}(s, t) := ∂r(s, t)/∂s_i ;   r_{ij;}(s, t) := ∂²r(s, t)/∂s_i ∂s_j ;   r_{i;j}(s, t) := ∂²r(s, t)/∂s_i ∂t_j .

For every t, i and j,

r_{i;}(t, t) = 0 ,   Λ_{ij}(t) = r_{i;j}(t, t) = -r_{ij;}(t, t).

Thus X(t) and X'(t) are independent. The regression formulae imply that

a^t_s = r(s, t) ,   α^t(s) = ( 1 - r(t, s) ) / n(s, t).

This implies that α''^t(t) = Λ(t) and that the possible limit values of α^t(s) as s → t are in the set { v^T Λ(t) v : v ∈ S^{d-1} }. Due to the non-degeneracy condition, these quantities are bounded below by a positive constant. On the other hand, for s ≠ t, α^t(s) > 0. This shows that for every t ∈ I one has inf_{s∈I} α^t(s) > 0. Since for every t ∈ I the process X^t is bounded, it follows that

a.s.  1I_{A_u(X^t, α^t)} → 1  as u → +∞.

Also,

det( X''^t(t) - α''^t(t) u ) ∼ (-1)^d det( Λ(t) ) u^d .

A dominated convergence argument shows that the first term in (35) is equivalent to

∫_I u^d det( Λ(t) ) (2π)^{-1/2} e^{-u²/2} (2π)^{-d/2} ( det Λ(t) )^{-1/2} dt = ( u^d / (2π)^{(d+1)/2} ) e^{-u²/2} ∫_I ( det Λ(t) )^{1/2} dt.

The same kind of argument shows that the second term is O( u^{d-1} e^{-u²/2} ), which completes the proof. □

References

Adler, R.J. (1990). An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes. IMS, Hayward, CA.
Azaïs, J-M. and Delmas, C. (2002). Asymptotic expansions for the distribution of the maximum of Gaussian random fields. To appear in Extremes.
Azaïs, J-M. and Wschebor, M. (1999). Régularité de la loi du maximum de processus gaussiens réguliers. C.R. Acad. Sci. Paris, t. 328, série I, 333-336.
Azaïs, J-M. and Wschebor, M. (2001). On the regularity of the distribution of the maximum of one-parameter Gaussian processes. Probab. Theory Relat. Fields, 119, 70-98.
Brillinger, D. R. (1972). On the number of solutions of systems of random equations. The Annals of Math. Statistics, 43, 534-540.
Cabaña, E. M. (1985). Esperanzas de integrales sobre conjuntos de nivel aleatorios (Spanish). Actas del segundo Congreso latinoamericano de probabilidad y estadística matemática, Caracas, 65-81.
Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
Cucker, F. and Wschebor, M. (2003). On the Expected Condition Number of Linear Programming Problems. To appear in Numerische Mathematik.
Diebolt, J. and Posse, C. (1996). On the Density of the Maximum of Smooth Gaussian Processes. Ann. Probab., 24, 1104-1129.
Federer, H. (1969). Geometric Measure Theory. Springer-Verlag, New York.
Fernique, X. (1975). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de Saint-Flour. Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
Kobayashi, S. and Nomizu, K. (199?). Foundations of Differential Geometry. J. Wiley & Sons, New York.
Landau, H.J. and Shepp, L.A. (1970). On the supremum of a Gaussian process. Sankhya Ser. A, 32, 369-378.
Lifshits, M.A. (1995). Gaussian Random Functions. Kluwer, The Netherlands.
Milnor, J. W. (1965). Topology from the Differentiable Viewpoint. The University Press of Virginia, Charlottesville.
Piterbarg, V. I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and Fields. American Mathematical Society, Providence, Rhode Island.
Piterbarg, V. I. (1996b). Rice's Method for Large Excursions of Gaussian Random Fields. Technical Report No. 478, University of North Carolina. Translation of "Rice's method for Gaussian random fields".
Taylor, J.E. and Adler, R. (2002). Euler characteristics for Gaussian fields on manifolds. Preprint.
Tsirelson, V.S. (1975). The Density of the Maximum of a Gaussian Process. Th. Probab. Appl., 20, 847-856.
Weber, M. (1985). Sur la densité du maximum d'un processus gaussien. J. Math. Kyoto Univ., 25, 515-521.
Ylvisaker, D. (1968). A Note on the Absence of Tangencies in Gaussian Sample Paths. The Ann. of Math. Stat., 39, 261-262.
