
Adaptive Cross Approximation of

Multivariate Functions


Mario Bebendorf



no. 453

This work was supported by the Deutsche Forschungsgemeinschaft through the Sonderforschungsbereich 611 at the Universität Bonn and has been reproduced as a manuscript.
Bonn, August 2009
Adaptive Cross Approximation of Multivariate
Functions
M. Bebendorf

August 11, 2009


In this article we present and analyze a new scheme for the approximation of multivariate functions (d = 3, 4) by sums of products of univariate functions. The method is based on the Adaptive Cross Approximation (ACA) initially designed for the approximation of bivariate functions. To demonstrate the linear complexity of the schemes, we apply them to large-scale multidimensional arrays generated by the evaluation of functions.

AMS Subject Classification: 41A80, 41A63, 15A69.
Keywords: data compression, dimensionality reduction, adaptive cross approximation.
1 Introduction
Representations of functions of several variables by sums of functions of fewer variables have been investigated in many publications; see [22] and the references therein. The best $L^2$-approximation of functions of two variables is shown in [23, 25] to be given by the truncated Hilbert-Schmidt decomposition. This result was extended by Pospelov [21] to the approximation of functions of d variables by sums of products of functions of one variable. For some classes of functions the approximation by specific function systems such as exponential functions might be advantageous; see [8, 9], [7]. One of the best known decompositions in statistics is the analysis of variance (ANOVA) decomposition [15]. Related to this field of research are sparse grid approximations; see [28, 10].
An important application of this kind of approximation is the approximation of multidimensional arrays generated by the evaluation of functions. In this case a d-dimensional tensor is approximated by the tensor product of a small number of vectors, which significantly improves the computational complexity. Multidimensional arrays of data appear in many different applications, e.g. statistics, chemometrics, and finance. While for d = 2 a result due to Eckart and Young [12] states that the optimal rank-k approximation can be computed via the singular value decomposition (SVD), the generalization of the SVD to tensors of order more than two is not clear; see the counterexample in [18]. In the Tucker model [27] a third order tensor is approximated by

$$\sum_{i=1}^{r_1} \sum_{j=1}^{r_2} \sum_{k=1}^{r_3} g_{ijk}\, x_i \otimes y_j \otimes z_k$$

with the so-called core array $g$ and the Tucker vectors $x_i$, $y_j$, and $z_k$. The PARAFAC model [11] uses diagonal core arrays. A method for third order tensors is proposed in [16]. Multi-level decompositions are presented in [17]. The most popular method is based on alternating least squares minimization [19, 26]. In [29] an incremental method is proposed, i.e. the approximation is successively constructed from the respective remainder. While our technique is also based on successive rank-1 approximations, in [29] the optimal rank-1 approximation is computed via a generalized Rayleigh quotient iteration. Since all previous methods require the whole matrix for constructing the respective approximation, they may still be computationally too expensive. We present a method of linear complexity which uses only a small portion of the original matrix entries.

* Institut für Numerische Simulation, Rheinische Friedrich-Wilhelms-Universität Bonn, Wegelerstrasse 6, 53115 Bonn, Germany, Tel. +49 228 733144, bebendorf@ins.uni-bonn.de.
In [3] the adaptive cross approximation (ACA) was introduced. ACA approximates bivariate functions by sums of products of univariate functions. A characteristic property of ACA is that the approximation is constructed from restrictions of $\phi$ to lower dimensional domains of definition, i.e.

$$\phi(x, y) \approx \sum_{i,j=1}^{k} c_{ij}\, \phi(x_i, y)\, \phi(x, y_j)$$

with points $x_i$, $y_j$ and coefficients $c_{ij}$ which constitute the inverse of the matrix $\phi(x_i, y_j)$, $i, j = 1, \dots, k$. The advantages of the fact that the restrictions $\phi(x_i, y)$ and $\phi(x, y_j)$, $i, j = 1, \dots, k$, are used are manifold. First of all, this kind of approximation guarantees quasi-optimal accuracy, i.e. the quality of the approximation is (up to constants) at least as good as the approximation in any other system of functions of the same cardinality. Furthermore, matrix versions of ACA are able to construct approximations without computing all the matrix entries in advance; only the entries corresponding to the restrictions have to be evaluated. Finally, the method is adaptive because it is able to find the required rank $k$ in the course of the approximation.
In the present article the adaptive cross approximation will be extended to functions of three and four variables. The latter two classes of functions will be treated by algorithms which, together with the bivariate ACA, can be investigated in the general setting of what we call incremental approximation. These will be introduced and investigated in Sect. 2. The convergence analysis of the bivariate ACA can be obtained as a special case. Furthermore, in Sect. 3 we will show convergence also for singular functions. A principal difference appears if ACA is extended to more than two variables, because the restricted domains of definition of the approximations are still more than one-dimensional. Hence, further approximation of these restrictions by an ACA of lower dimension is required. As a consequence, the influence of perturbations on the convergence has to be analyzed. In the trivariate case treated in Sect. 4 one additional approximation using bivariate ACA per step is sufficient. The approximation of functions of four variables is treated in Sect. 5 and requires two additional bivariate ACA approximations per step. To demonstrate the linear complexity of the presented techniques, our theoretical findings are accompanied by the application of the presented schemes to large-scale multidimensional arrays. A method that is similar to the kind of matrix approximation treated in this article (at least for the trivariate case) was presented in [20]. Although it is proved in [20] that low-rank approximations exist, the convergence of the actual scheme has not been analyzed.

As the need for techniques required to analyze the influence of perturbations on the convergence already appears in the cases of three and four variables, the results of this article are expected to be useful also for problems of more than four dimensions, because algorithms can be constructed by recursive bisection of the set of variables; see also [14]. In this sense this article lays the ground for the adaptive cross approximation of high-dimensional functions.
2 Incremental Approximation

The approximation schemes considered in this article will be of the following form. Given a set $X$ and a function $f : X \to \mathbb{C}$, define $r_0[f] := f$ and $r_k[f]$, $k = 1, 2, \dots$, as

$$r_k[f] = r_{k-1}[f] - \frac{r_{k-1}[f](x_k)}{\xi_k(x_k)}\, \xi_k. \quad (1)$$

Here, $x_k$ is chosen such that $\xi_k(x_k) \neq 0$. The choice of the functions $\xi_k : X \to \mathbb{C}$ defines the respective approximation scheme. Since the evaluation of functions at given points is central for our methods, we assume that $f$ and $\xi_k$ are continuous on $X$. In the following lemmas properties of $r_k[f]$ will be investigated.
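For illustration, the following minimal sketch (Python/NumPy) evaluates recursion (1) on a sample grid. The grid, the monomial choice of the $\xi_k$, and the pivoting rule (maximizing $|\xi_k\, r_{k-1}|$, one simple way to guarantee $\xi_k(x_k) \neq 0$) are assumptions made only for this example; the scheme itself leaves them open at this point.

```python
import numpy as np

X = np.linspace(0.0, 1.0, 1001)     # sample grid replacing the set X
f = np.exp(X)
r = f.copy()                        # r_0[f] = f
for j in range(6):
    xi = X**j                       # xi_k sampled on the grid (illustrative)
    k_idx = int(np.argmax(np.abs(xi * r)))   # ensures xi_k(x_k) != 0
    r = r - (r[k_idx] / xi[k_idx]) * xi      # recursion (1)
print(np.max(np.abs(r)))            # uniform size of the remainder r_6[f]
```

By Lemma 1 below, the subtracted part $s_6[f] = f - r_6[f]$ lies in the linear hull of the chosen functions $\xi_1, \dots, \xi_6$.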
Lemma 1. For $r_k[f]$ the non-recursive representation

$$r_k[f] = f - [f(x_1), \dots, f(x_k)]\, U_k^{-1}\, [\xi_1, \dots, \xi_k]^T \quad (2)$$

holds, where

$$U_k := \begin{bmatrix} \xi_1(x_1) & \dots & \dots & \xi_1(x_k) \\ 0 & \xi_2(x_2) & & \vdots \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \dots & 0 & \xi_k(x_k) \end{bmatrix}.$$

Proof. The assertion is obviously true for $k = 1$. Assume that it is valid for $k - 1$. Then we obtain

$$r_k = r_{k-1} - \frac{r_{k-1}(x_k)}{\xi_k(x_k)}\, \xi_k = f - [f(x_1), \dots, f(x_{k-1})]\, U_{k-1}^{-1} [\xi_1, \dots, \xi_{k-1}]^T - \frac{f(x_k) - [f(x_1), \dots, f(x_{k-1})]\, U_{k-1}^{-1} [\xi_1(x_k), \dots, \xi_{k-1}(x_k)]^T}{\xi_k(x_k)}\, \xi_k,$$

which ends the proof, because

$$\begin{bmatrix} A & b \\ 0 & \beta \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} & -A^{-1} b / \beta \\ 0 & 1/\beta \end{bmatrix}, \quad (3)$$

where $A$ is a non-singular matrix and $0 \neq \beta \in \mathbb{C}$.
The previous lemma shows that the constructed approximation

$$s_k[f] := f - r_k[f]$$

is in the linear hull of the functions $\xi_1, \dots, \xi_k$. The following lemma shows an equivalent expression for the coefficients. In addition to the vector in (4), the quantities

$$\sigma_{i,k} := \sup_{x \in X} \sum_{\ell=i}^{k} |\chi_\ell^{(k)}(x)|$$

defined on the components of $\chi^{(k)} : X \to \mathbb{C}^k$,

$$\chi^{(k)} := U_k^{-1} [\xi_1, \dots, \xi_k]^T,$$

will play an important role in the stability analysis.
Lemma 2. It holds that

$$U_k^{-T} [f(x_1), \dots, f(x_k)]^T = \Big[ \frac{r_0[f](x_1)}{\xi_1(x_1)}, \dots, \frac{r_{k-1}[f](x_k)}{\xi_k(x_k)} \Big]^T. \quad (4)$$

Hence,

$$s_k[f] = \sum_{i=1}^{k} \frac{r_{i-1}[f](x_i)}{\xi_i(x_i)}\, \xi_i.$$

Proof. Formula (4) is obviously true for $k = 1$. Assume that it is valid for $k - 1$; then using (3)

$$[f(x_1), \dots, f(x_k)]\, U_k^{-1} = [f(x_1), \dots, f(x_k)] \begin{bmatrix} U_{k-1}^{-1} & -v_{k-1} \\ 0 & 1/\xi_k(x_k) \end{bmatrix},$$

where

$$v_{k-1} = U_{k-1}^{-1} [\xi_1(x_k), \dots, \xi_{k-1}(x_k)]^T / \xi_k(x_k).$$

From (2) together with the assumption, we obtain

$$[f(x_1), \dots, f(x_k)]\, U_k^{-1} = \Big[ \frac{r_0(x_1)}{\xi_1(x_1)}, \dots, \frac{r_{k-2}(x_{k-1})}{\xi_{k-1}(x_{k-1})},\; \frac{f(x_k)}{\xi_k(x_k)} - [f(x_1), \dots, f(x_{k-1})]\, v_{k-1} \Big] = \Big[ \frac{r_0(x_1)}{\xi_1(x_1)}, \dots, \frac{r_{k-1}(x_k)}{\xi_k(x_k)} \Big].$$
In the following lemma we will investigate under which conditions on $\xi_k$ the function $s_k[f]$ interpolates $f$. Notice that $r_k[f](x_k) = 0$. However, $r_k[f](x_j)$ does not vanish for $j < k$ in general. The desired interpolation property will be characterized by the coincidence of the upper triangular matrix $U_k$ with the $k \times k$ matrix

$$M_k := \begin{bmatrix} \xi_1(x_1) & \dots & \xi_1(x_k) \\ \vdots & & \vdots \\ \xi_k(x_1) & \dots & \xi_k(x_k) \end{bmatrix}$$

or, equivalently, by $\xi_i(x_j) = 0$ for $i > j$.
Lemma 3. For $1 \leq j < k$ it holds that

$$r_k[f](x_j) = -[f(x_1), \dots, f(x_k)]\, U_k^{-1}\, [0, \dots, 0,\; \xi_{j+1}(x_j), \dots, \xi_k(x_j)]^T. \quad (5)$$

Hence, $s_k[f](x_j) = f(x_j)$, $1 \leq j \leq k$, if $M_k = U_k$. In the other case we have that $s_k[f](x_j) = (\tilde f_k)_j$, $1 \leq j \leq k$, where

$$\tilde f_k := (U_k^{-1} M_k)^T\, [f(x_1), \dots, f(x_k)]^T.$$

Proof. Since $r_j(x_j) = 0$, we have that

$$f(x_j) = [f(x_1), \dots, f(x_j)]\, U_j^{-1}\, [\xi_1(x_j), \dots, \xi_j(x_j)]^T.$$

Formula (5) follows from

$$r_k(x_j) = f(x_j) - [f(x_1), \dots, f(x_k)]\, U_k^{-1}\, [\xi_1(x_j), \dots, \xi_k(x_j)]^T \quad (6)$$

and the upper triangular structure of $U_k$, which contains $U_j$ as its leading $j \times j$ subblock. The second part

$$[s_k(x_1), \dots, s_k(x_k)] = [f(x_1), \dots, f(x_k)]\, U_k^{-1} M_k$$

follows from $s_k = f - r_k$ and (6).
Remark 1. Let $M_k$ be non-singular. We denote by $M_k^{(i)}(x) \in \mathbb{C}^{k \times k}$ the matrix which arises from replacing the $i$-th column of $M_k$ by the vector $v_k := [\xi_1(x), \dots, \xi_k(x)]^T$. The functions

$$L_i(x) := (M_k^{-1} v_k)_i = \frac{\det M_k^{(i)}(x)}{\det M_k} \in \operatorname{span}\{\xi_1, \dots, \xi_k\}$$

are Lagrange functions for the points $x_1, \dots, x_k$, i.e. $L_i(x_j) = \delta_{ij}$, $i, j = 1, \dots, k$. As a consequence of Lemma 3, the approximation

$$s_k[f] = \sum_{i=1}^{k} (\tilde f_k)_i\, L_i$$

is the uniquely defined Lagrangian interpolating function.
The following lemmas will help to estimate the remainder $r_k$ of the approximation. We first observe the following property.

Lemma 4. Let $M_k = U_k$ and let functions $\tilde\xi_1, \dots, \tilde\xi_k$ satisfy $\operatorname{span}\{\tilde\xi_1, \dots, \tilde\xi_k\} = \operatorname{span}\{\xi_1, \dots, \xi_k\}$. Then $\tilde M_k \in \mathbb{C}^{k \times k}$ defined by $(\tilde M_k)_{ij} = \tilde\xi_i(x_j)$, $i, j = 1, \dots, k$, is non-singular and

$$r_k[f] = f - [f(x_1), \dots, f(x_k)]\, \tilde M_k^{-1}\, [\tilde\xi_1, \dots, \tilde\xi_k]^T.$$

Proof. Let $C \in \mathbb{C}^{k \times k}$ be an invertible matrix such that

$$[\xi_1, \dots, \xi_k]^T = C\, [\tilde\xi_1, \dots, \tilde\xi_k]^T.$$

Then it follows from $M_k = C \tilde M_k$ that $\tilde M_k$ is invertible and

$$M_k^{-1}\, [\xi_1, \dots, \xi_k]^T = \tilde M_k^{-1}\, [\tilde\xi_1, \dots, \tilde\xi_k]^T.$$

Lemma 1 gives the assertion.
In the following lemma the remainder $r_k[f]$ is expressed in terms of the error

$$E_k[f](x) := f(x) - [f(x_1), \dots, f(x_k)]\, \Lambda_k(x)$$

of a linear approximation in an arbitrary system $\{\Lambda_1, \dots, \Lambda_k\} \subset C(X)$ of functions. Here, we set $\Lambda_k := [\Lambda_1, \dots, \Lambda_k]^T$.

Lemma 5. It holds that

$$r_k[f] = E_k[f] - [f(x_1), \dots, f(x_k)]\, U_k^{-1} (M_k - U_k)\, \Lambda_k - [f(x_1), \dots, f(x_k)]\, U_k^{-1} \big[ E_k[\xi_1], \dots, E_k[\xi_k] \big]^T.$$

In particular, if $M_k = U_k$, then

$$r_k[f] = E_k[f] - [f(x_1), \dots, f(x_k)]\, \tilde M_k^{-1} \big[ E_k[\tilde\xi_1], \dots, E_k[\tilde\xi_k] \big]^T,$$

where $\tilde M_k$ and $\tilde\xi_1, \dots, \tilde\xi_k$ are as in Lemma 4.

Proof. Since $[E_k[\xi_1], \dots, E_k[\xi_k]]^T = [\xi_1, \dots, \xi_k]^T - M_k \Lambda_k$ and $U_k^{-1} M_k = I + U_k^{-1}(M_k - U_k)$, the assertion follows from

$$r_k = f - [f(x_1), \dots, f(x_k)]\, U_k^{-1} [\xi_1, \dots, \xi_k]^T = f - [f(x_1), \dots, f(x_k)]\, U_k^{-1} M_k \Lambda_k - [f(x_1), \dots, f(x_k)]\, U_k^{-1} \big( [\xi_1, \dots, \xi_k]^T - M_k \Lambda_k \big)$$

$$= E_k[f] - [f(x_1), \dots, f(x_k)]\, U_k^{-1} (M_k - U_k)\, \Lambda_k - [f(x_1), \dots, f(x_k)]\, U_k^{-1} \big[ E_k[\xi_1], \dots, E_k[\xi_k] \big]^T.$$

The second part of the assertion follows from Lemma 4.
Remark 2. The previous lemma relates the remainder $r_k[f]$ to the error $E_k[f]$ in the system $\Lambda_k$. Assume that the $\Lambda_k$ are Lagrange functions, i.e. $\Lambda_i(x_j) = \delta_{ij}$, $i, j = 1, \dots, k$. If $r_k[f]$ is to be estimated by the best approximation error in $\Lambda_k$, then one can use the estimate

$$\|E_k[f]\|_\infty \leq (1 + \|I_k\|)\, \inf_{v \in \operatorname{span}\Lambda_k} \|f - v\|_\infty, \quad (7)$$

where $I_k : C(X) \to C(X)$, defined as $I_k f = \sum_{i=1}^{k} f(x_i)\, \Lambda_i$, denotes the interpolation operator in $\Lambda_k$ and $\|I_k\| := \sup\{ \|I_k f\|_\infty : f \in C(X),\ \|f\|_\infty = 1 \}$. Estimate (7) is a consequence of

$$\|f - I_k f\|_\infty \leq \|f - v\|_\infty + \|I_k (f - v)\|_\infty \leq (1 + \|I_k\|)\, \|f - v\|_\infty \quad \text{for all } v \in \operatorname{span}\Lambda_k.$$
2.1 Perturbation analysis

The next question we are going to investigate is how approximations $\hat\xi_k$ to the functions $\xi_k$ influence the approximation error $r_k[f]$, i.e., we will compare the remainder $r_k[f]$ with $\hat r_k[f]$ defined by $\hat r_0[f] = f$ and

$$\hat r_k[f] = \hat r_{k-1}[f] - \frac{\hat r_{k-1}[f](x_k)}{\hat\xi_k(x_k)}\, \hat\xi_k, \quad k = 1, 2, \dots. \quad (8)$$

In (8), $x_k$ is chosen such that $\hat\xi_k(x_k) \neq 0$. Define $\hat U_k$, $\hat\chi^{(k)}$, and $\hat\sigma_{i,k}$ by replacing $\xi_i$ in the respective definition with $\hat\xi_i$. Notice that we use the same points $x_k$ from the construction of $\hat r_k[f]$ also for the construction of $r_k[f]$. Therefore, we have to make sure that $U_k$ is invertible.
Lemma 6. Let $\varepsilon_i := \|\xi_i - \hat\xi_i\|_\infty$ satisfy $\varepsilon_i < |\hat\xi_i(x_i)|$, $i = 1, \dots, k$. Then $\xi_i(x_i) \neq 0$ and

$$\|r_k[f] - \hat r_k[f]\|_\infty \leq \sum_{i=1}^{k} \frac{|r_{i-1}[f](x_i)|}{|\xi_i(x_i)|}\, (\hat\sigma_{i,k} + 1)\, \varepsilon_i.$$

Proof. From $|\xi_i(x_i)| \geq |\hat\xi_i(x_i)| - \varepsilon_i > 0$ it follows that $U_k$ is invertible. Setting $E_k := U_k - \hat U_k$, $\xi^{(k)} := [\xi_1, \dots, \xi_k]^T$, and $\hat\xi^{(k)} := [\hat\xi_1, \dots, \hat\xi_k]^T$, due to Lemma 1 and Lemma 2 we have that

$$r_k - \hat r_k = [f(x_1), \dots, f(x_k)] \big( \hat U_k^{-1} \hat\xi^{(k)} - U_k^{-1} \xi^{(k)} \big) = [f(x_1), \dots, f(x_k)]\, U_k^{-1} \big( (\hat U_k + E_k)\, \hat U_k^{-1} \hat\xi^{(k)} - \xi^{(k)} \big)$$

$$= [f(x_1), \dots, f(x_k)]\, U_k^{-1} \big( \hat\xi^{(k)} - \xi^{(k)} + E_k \hat\chi^{(k)} \big) = \Big[ \frac{r_0(x_1)}{\xi_1(x_1)}, \dots, \frac{r_{k-1}(x_k)}{\xi_k(x_k)} \Big] \big( \hat\xi^{(k)} - \xi^{(k)} + E_k \hat\chi^{(k)} \big).$$

The assertion follows from $\|\hat\xi_i - \xi_i\|_\infty \leq \varepsilon_i$, $|(E_k)_{ij}| \leq \varepsilon_i$, and $(E_k)_{ij} = 0$ for $i > j$.
In addition to (8) one may also consider the following scheme, in which $\hat r_{k-1}[f](x_k)$ is replaced by some value $a_k \in \mathbb{C}$:

$$\breve r_k[f] = \breve r_{k-1}[f] - \frac{a_k}{\hat\xi_k(x_k)}\, \hat\xi_k. \quad (9)$$

Then $\breve r_k[f]$ will usually neither vanish at the points $x_j$, $1 \leq j < k$, nor will a representation of the form (2) hold. However, the following lemma can be proved. We will make use of the recursive relation

$$\chi_i^{(k)} = \chi_i^{(k-1)} - \frac{\xi_k}{\xi_k(x_k)}\, \chi_i^{(k-1)}(x_k), \quad i = 1, \dots, k-1, \quad (10a)$$

$$\chi_k^{(k)} = \frac{\xi_k}{\xi_k(x_k)}, \quad (10b)$$

for the components of $\chi^{(k)}$ (and likewise for the hatted quantities), which follows from (3).
Lemma 7. Let $\delta_j := \breve r_{j-1}[f](x_j) - a_j$ and assume $|\delta_j| \leq \delta$, $1 \leq j \leq k$. Then

$$\breve r_k[f] = \hat r_k[f] + [\delta_1, \dots, \delta_k]\, \hat U_k^{-1}\, [\hat\xi_1, \dots, \hat\xi_k]^T.$$

Hence,

$$\|\hat r_k[f] - \breve r_k[f]\|_\infty \leq \sup_{x \in X} \sum_{j=1}^{k} |\hat\chi_j^{(k)}(x)|\, |\delta_j| \leq \delta\, \hat\sigma_{1,k}.$$

Proof. The assertion is proved by induction. It is obviously true for $k = 1$. Assume that it is valid for $k - 1$. Then

$$\breve r_k = \breve r_{k-1} - \frac{\breve r_{k-1}(x_k)}{\hat\xi_k(x_k)}\, \hat\xi_k + \frac{\delta_k}{\hat\xi_k(x_k)}\, \hat\xi_k = \hat r_{k-1} + [\delta_1, \dots, \delta_{k-1}]\, \hat\chi^{(k-1)} - \frac{\hat\xi_k}{\hat\xi_k(x_k)} \Big( \hat r_{k-1}(x_k) + [\delta_1, \dots, \delta_{k-1}]\, \hat\chi^{(k-1)}(x_k) \Big) + \frac{\delta_k}{\hat\xi_k(x_k)}\, \hat\xi_k$$

$$= \hat r_k + [\delta_1, \dots, \delta_{k-1}] \Big( \hat\chi^{(k-1)} - \frac{\hat\xi_k}{\hat\xi_k(x_k)}\, \hat\chi^{(k-1)}(x_k) \Big) + \frac{\delta_k}{\hat\xi_k(x_k)}\, \hat\xi_k.$$

Equation (10) finishes the proof.
2.2 Estimating the amplification factors

As can be seen from Lemma 6 and Lemma 7, for the perturbation analysis it is crucial to estimate the size of the amplification factors $\chi^{(k)}$. The following lemma is an obvious consequence of (10). Let $v_j : X \to \mathbb{C}$, $j = 1, \dots, k$, be given functions and set

$$\tilde\chi^{(k)} := U_k^{-1}\, [v_1, \dots, v_k]^T.$$

Lemma 8. If there is $\gamma \geq 1$ such that $\|v_i\|_\infty \leq \gamma\, |\xi_i(x_i)|$ for $i = 1, \dots, k$, then

$$\|\tilde\chi_i^{(k)}\|_\infty \leq \gamma \Big( 1 + \sum_{j=i}^{k-1} \|\chi_i^{(j)}\|_\infty \Big).$$

Furthermore, $\|\chi_i^{(k)}\|_\infty \leq (1 + \gamma)^{k-i}$ provided that $\|\xi_i\|_\infty \leq \gamma\, |\xi_i(x_i)|$ for some $\gamma \in \mathbb{R}$.

Proof. It readily follows from (10) that $\|\chi_i^{(k)}\|_\infty \leq (1 + \gamma)^{k-i}$. Similar to (10) we obtain the following recursion formula for the components of $\tilde\chi^{(k)}$:

$$\tilde\chi_i^{(k)} = \tilde\chi_i^{(k-1)} - \frac{v_k}{\xi_k(x_k)}\, \chi_i^{(k-1)}(x_k), \quad i = 1, \dots, k-1, \qquad \tilde\chi_k^{(k)} = \frac{v_k}{\xi_k(x_k)},$$

which shows $\|\tilde\chi_i^{(i)}\|_\infty \leq \gamma$ and the recursive relation

$$\|\tilde\chi_i^{(j+1)}\|_\infty \leq \|\tilde\chi_i^{(j)}\|_\infty + \gamma\, \|\chi_i^{(j)}\|_\infty, \quad j = i, \dots, k-1.$$

Hence, we obtain that

$$\|\tilde\chi_i^{(k)}\|_\infty \leq \gamma + \gamma \sum_{j=i}^{k-1} \|\chi_i^{(j)}\|_\infty.$$
In particular, the previous lemma shows that $\|\chi_i^{(k)}\|_\infty \leq 2^{k-i}$ provided that $x_i$ maximizes $|\xi_i|$. In this case we have

$$\sigma_{i,k} \leq \sum_{\ell=i}^{k} \|\chi_\ell^{(k)}\|_\infty \leq \sum_{\ell=i}^{k} 2^{k-\ell} = 2^{k-i+1} - 1.$$

If the $\xi_i$, $i = 1, \dots, k$, are smooth, i.e. if we may assume that

$$|E_{k-i}[\xi_{i+\ell}](x)| \leq \varepsilon\, |\xi_{i+\ell}(x_{i+\ell})|, \quad \ell = 0, \dots, k-i, \quad (11)$$

with some $\varepsilon > 0$, and if $\xi_i(x_j) = 0$, $i > j$, then significantly better bounds for $\|\chi_i^{(k)}\|_\infty$ can be derived. Due to (10), each $\chi_i^{(k)}$, $1 \leq i \leq k$, can be regarded as a remainder function of an approximation of type (1) with $f := \chi_i^{(i)}$ and $\tilde\xi_j := \chi_{i+j}^{(i+j)}$, because it follows from (10) that

$$\chi_i^{(i)}(x) = \frac{\xi_i(x)}{\xi_i(x_i)}, \qquad \chi_i^{(i+j)}(x) = \chi_i^{(i+j-1)}(x) - \frac{\xi_{i+j}(x)}{\xi_{i+j}(x_{i+j})}\, \chi_i^{(i+j-1)}(x_{i+j}), \quad j = 1, \dots, k-i.$$

Since $\operatorname{span}\{\tilde\xi_1, \dots, \tilde\xi_{k-i}\} = \operatorname{span}\{\xi_{i+1}, \dots, \xi_k\}$, Lemma 5 shows that

$$\chi_i^{(k)}(x) = E_{k-i}[\chi_i^{(i)}](x) - \big[ \chi_i^{(i)}(x_{i+1}), \dots, \chi_i^{(i)}(x_k) \big]\, (M'_{k-i})^{-1} \big[ E_{k-i}[\xi_{i+1}](x), \dots, E_{k-i}[\xi_k](x) \big]^T,$$

where $(M'_{k-i})_{\nu\mu} = \xi_{i+\nu}(x_{i+\mu})$, $\nu, \mu = 1, \dots, k-i$. Hence, from (11) and Lemma 8 we obtain

$$\|\chi_i^{(k)}\|_\infty \leq \varepsilon + \varepsilon \sum_{\ell=1}^{k-i} |\chi_i^{(i)}(x_{i+\ell})| \Big( 1 + \sum_{j=\ell}^{k-i-1} 2^j \Big) \leq \varepsilon\, 2^{k-i}, \quad (12)$$

because $M'_{k-i} = U'_{k-i}$ and $|\chi_i^{(i)}(x_j)| \leq 1$, $j = i+1, \dots, k$, provided that $x_i$ maximizes $|\xi_i|$.

We will, however, see in the numerical examples that typically $\|\chi_i^{(k)}\|_\infty$ is significantly smaller than predicted by our worst-case estimates.
3 Adaptive Cross Approximation

The adaptive cross approximation (ACA) was introduced for Nyström matrices [3] and extended to collocation matrices [6]. A version with a refined pivoting strategy and a generalization of the method to Galerkin matrices was presented in [5]. The following recursion is at the core of this method. Given $\phi : X \times Y \to \mathbb{C}$, let $R_0(x, y) = \phi(x, y)$ and

$$R_k(x, y) = R_{k-1}(x, y) - \frac{R_{k-1}(x, y_k)\, R_{k-1}(x_k, y)}{R_{k-1}(x_k, y_k)}, \quad k = 1, 2, \dots. \quad (13)$$

The points $x_k$ and $y_k$ are chosen such that $R_{k-1}(x_k, y_k) \neq 0$. The previous recursion corresponds to the choice

$$f := \phi_x, \qquad \xi_i := R_{i-1}(x_i, \cdot)$$

in (1) if $x \in X$ is treated as a parameter. Then $R_k(x, y) = r_k[\phi_x](y)$ holds, where $\phi_x$ is defined by $\phi_x(y) = \phi(x, y)$ for all $x \in X$, and $\xi_i(y_j) = 0$ for $i > j$ can be seen by inductively applying Lemma 3. Since

$$\operatorname{span}\{\xi_1, \dots, \xi_k\} = \operatorname{span}\{\phi(x_1, \cdot), \dots, \phi(x_k, \cdot)\},$$

we see that $\tilde\xi_i := \phi(x_i, \cdot)$ is a possible choice in Lemma 4. Hence, the degenerate approximation $S_k := \phi - R_k$ of $\phi$ has the representation

$$S_k(x, y) = [\phi(x, y_1), \dots, \phi(x, y_k)]\, \tilde M_k^{-1}\, [\phi(x_1, y), \dots, \phi(x_k, y)]^T = \sum_{i,j=1}^{k} (\tilde M_k^{-1})_{ij}\, \phi(x, y_i)\, \phi(x_j, y)$$

with $(\tilde M_k)_{ij} = \phi(x_i, y_j)$, $i, j = 1, \dots, k$.
In Lemma 5 we showed how the remainder $R_k$ of the approximation can be estimated by relating the approximation to linear approximation in any system $\{\Lambda_1, \dots, \Lambda_k\}$. In particular, we obtain from the second part of Lemma 5 that

$$R_k(x, y) = E_k[\phi_x](y) - \big[ E_k[\phi_{x_1}](y), \dots, E_k[\phi_{x_k}](y) \big]\, \chi^{(k)}(x),$$

where

$$\chi^{(k)}(x) := \tilde M_k^{-T}\, [\phi(x, y_1), \dots, \phi(x, y_k)]^T \in \mathbb{C}^k$$

can be regarded as an amplification factor with respect to $x$. Therefore, we obtain

$$|R_k(x, y)| \leq (\sigma_{1,k} + 1) \max_{z \in \{x, x_1, \dots, x_k\}} |E_k[\phi_z](y)|. \quad (14)$$
Similar to Remark 1, we denote by $\tilde M_k^{(i)}(x)$ the matrix which arises from $\tilde M_k$ by replacing the $i$-th row with the vector $[\phi(x, y_1), \dots, \phi(x, y_k)]$. Then due to Cramer's rule we have

$$\chi_i^{(k)}(x) = \frac{\det \tilde M_k^{(i)}(x)}{\det \tilde M_k}.$$

Hence, if the pivoting points $x_i$, $i = 1, \dots, k$, are chosen such that

$$|\det \tilde M_k^{(i)}(x)| \leq |\det \tilde M_k| \quad \text{for all } x \in X \text{ and } i = 1, \dots, k,$$

then $\|\chi_i^{(k)}\|_\infty = 1$, and we obtain

$$|R_k(x, y)| \leq (k + 1) \max_{z \in \{x, x_1, \dots, x_k\}} |E_k[\phi_z](y)|.$$
In this case of so-called matrices of maximum volume we also refer to the error estimates in [24], which are based on the technique of exact annihilators; see [2, 1]. In practice it is, however, difficult to find matrices of maximum volume. In Lemma 8 we observed that $\|\chi_i^{(k)}\|_\infty \leq 2^{k-i}$, $i = 1, \dots, k$, under the realistic condition

$$|R_{k-1}(x, y_k)| \leq |R_{k-1}(x_k, y_k)| \quad \text{for all } x \in X. \quad (15)$$

In this case (14) becomes

$$|R_k(x, y)| \leq 2^k \max_{z \in \{x, x_1, \dots, x_k\}} |E_k[\phi_z](y)|.$$
If $\phi$ is sufficiently smooth with respect to $y$ and if the system $\Lambda_k$ is appropriately chosen, then it can be expected that for all $x \in X$

$$\|E_k[\phi_x]\|_\infty \lesssim \varepsilon^k \quad (16)$$

with some $0 < \varepsilon < 1$, which results in an approximation error of the order $(2\varepsilon)^k$. Hence, ACA converges if $\varepsilon < 1/2$. Notice that the choice of $y_k$ does not improve our estimates on the amplification factors $\chi^{(k)}$. However, it is important for obtaining a reasonable decay of the error $E_k[\phi_x]$; for details see [5].
Up to now we have exploited that $\phi$ is smooth only with respect to the second variable $y$. If $\phi$ is smooth also with respect to $x$, then the arguments from the end of Sect. 2.2 can be applied to improve the error estimate. Condition (11) is satisfied with $\varepsilon$ replaced by $\varepsilon^{k-i}$, because according to assumption (16) we have for $\ell = 0, \dots, k-i$ that

$$|E_{k-i}[R_{i+\ell-1}(\cdot, y_{i+\ell})](x)| \leq \varepsilon^{k-i}\, |R_{i+\ell-1}(x_{i+\ell}, y_{i+\ell})|.$$

Hence, if $\phi$ is smooth also with respect to $x$, then according to (12) we obtain $\|\chi_i^{(k)}\|_\infty \leq 2^{k-i} \varepsilon^{k-i}$ and

$$|R_k(x, y)| \lesssim \varepsilon^k \sum_{i=1}^{k} 2^{k-i}\, \varepsilon^{k-i} \leq k \max\{\varepsilon^k,\, (2\varepsilon^2)^k\},$$

which converges for $\varepsilon < 1/\sqrt{2}$.

Although the estimate has improved, the dependence of $\sigma_{1,k}$ on $k$ is still exponential. The actual growth with respect to $k$ appears to be significantly slower; see the following numerical examples.
3.1 Matrix approximation

The previous estimates can be applied when approximating function-generated matrices

$$a_{ij} = \phi(p_i, q_j), \quad i = 1, \dots, m, \; j = 1, \dots, n,$$

with $p_i \in X$ and $q_j \in Y$. In this case (13) becomes the following matrix iteration. Starting from $R_0 := A$, find a nonzero pivot $(i_k, j_k)$ in $R_{k-1}$ and subtract a scaled outer product of the $i_k$-th row and the $j_k$-th column:

$$R_k := R_{k-1} - [(R_{k-1})_{i_k j_k}]^{-1}\, (R_{k-1})_{1:m, j_k}\, (R_{k-1})_{i_k, 1:n},$$

where we use the notations $v_k := (R_{k-1})_{i_k, 1:n}$ and $u_k := (R_{k-1})_{1:m, j_k}$ for the $i_k$-th row and the $j_k$-th column of $R_{k-1}$, respectively. We use (15) to select $i_k$; the choice of $j_k$ is detailed in [5].
Since we are able to control the remainder $R_k$ of the approximation by our estimates, it is sufficient to construct $S_k = A - R_k$, which requires the computation of only

$$u_k = (R_{k-1})_{1:m, j_k} = a_{1:m, j_k} - \sum_{\ell=1}^{k-1} \frac{(R_{\ell-1})_{i_\ell j_k}}{(R_{\ell-1})_{i_\ell j_\ell}}\, (R_{\ell-1})_{1:m, j_\ell} = a_{1:m, j_k} - \sum_{\ell=1}^{k-1} \frac{(v_\ell)_{j_k}}{(u_\ell)_{i_\ell}}\, u_\ell \quad (17)$$

and

$$v_k = (R_{k-1})_{i_k, 1:n} = a_{i_k, 1:n} - \sum_{\ell=1}^{k-1} \frac{(R_{\ell-1})_{i_k j_\ell}}{(R_{\ell-1})_{i_\ell j_\ell}}\, (R_{\ell-1})_{i_\ell, 1:n} = a_{i_k, 1:n} - \sum_{\ell=1}^{k-1} \frac{(u_\ell)_{i_k}}{(u_\ell)_{i_\ell}}\, v_\ell. \quad (18)$$

In particular this means that only $k(m + n)$ of the original entries of $A$ have to be computed. The number of operations required for constructing $S_k = \sum_{\ell=1}^{k} [(u_\ell)_{i_\ell}]^{-1}\, u_\ell v_\ell^T$ is of the order $k^2(m + n)$, while the storage required for the approximation $S_k$ is of the order $k(m + n)$. For further details see [4].
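A minimal sketch of this iteration (Python/NumPy) is given below. It realizes (17) and (18) with the row pivoting rule in the spirit of (15) and the stopping criterion (20) introduced after Table 1. The column pivot $j_k$ is chosen here simply as the maximizer of the current remainder row, a common simplification of the strategy detailed in [5]; this sketch is an illustration under these assumptions, not the author's reference implementation.

```python
import numpy as np

def aca(entry, m, n, eps=1e-7, max_rank=100):
    """Partially pivoted ACA, cf. (17)-(18): A ~ S_k = sum_l u_l v_l^T.

    entry(i, j) must return a_ij; the full matrix is never formed, and only
    k(m + n) of its entries are evaluated. v_l is scaled by the pivot so
    that S_k is the plain sum of outer products.
    """
    us, vs = [], []
    i_k = 0                              # heuristic first row pivot
    norm_S2 = 0.0                        # ||S_k||_F^2, updated incrementally
    for _ in range(max_rank):
        # i_k-th row of the remainder R_{k-1}, cf. (18)
        v = np.array([entry(i_k, j) for j in range(n)], dtype=float)
        for ul, vl in zip(us, vs):
            v -= ul[i_k] * vl
        j_k = int(np.argmax(np.abs(v)))  # simple column pivot choice
        if v[j_k] == 0.0:
            break                        # remainder row vanishes
        # j_k-th column of the remainder, cf. (17)
        u = np.array([entry(i, j_k) for i in range(m)], dtype=float)
        for ul, vl in zip(us, vs):
            u -= vl[j_k] * ul
        v /= u[i_k]                      # fold the pivot into v
        us.append(u); vs.append(v)
        # stopping criterion (20): ||u|| ||v|| <= eps ||S_{k}||_F
        norm_S2 += np.dot(u, u) * np.dot(v, v) + 2.0 * sum(
            np.dot(u, ul) * np.dot(v, vl)
            for ul, vl in zip(us[:-1], vs[:-1]))
        if np.linalg.norm(u) * np.linalg.norm(v) <= eps * np.sqrt(norm_S2):
            break
        # next row pivot in the spirit of (15): largest entry of the new
        # remainder column, excluding the current pivot row
        u_abs = np.abs(u); u_abs[i_k] = 0.0
        i_k = int(np.argmax(u_abs))
    return us, vs

# usage for the example below, with p_i = q_i = (i - 1/2)/n:
n = 5000
p = (np.arange(n) + 0.5) / n
us, vs = aca(lambda i, j: 1.0 / np.sqrt(1.0 + p[i]**2 + p[j]**2), n, n)
print(len(us))   # observed rank k
```

The entrywise Python loops are kept for clarity; the operation count is nevertheless of the order $k^2(m+n)$, as stated above.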
In the following example we consider the smooth function

$$\phi(x, y) := (1 + x^2 + y^2)^{-1/2}, \quad x, y \in \mathbb{R},$$

and the points $p_i = q_i = \frac{1}{n}(i - \frac{1}{2})$, $i = 1, \dots, n = m$. Table 1 shows the rank $k$ required to satisfy

$$\|A - S_k\|_F \leq \varepsilon\, \|A\|_F \quad (19)$$

and the CPU time in seconds on a single core of an Intel Core2 X5482 processor at 3.2 GHz.
              ε = 10^-5           ε = 10^-6           ε = 10^-7
              ACA       SVD       ACA       SVD       ACA       SVD
  n           k  time   k  time   k  time   k  time   k  time   k  time
  1 250       4  0.00   3  4.2    4  0.00   4  4.2    5  0.00   4  4.2
  2 500       4  0.00   3  43.1   4  0.00   4  43.1   5  0.00   4  43.1
  5 000       4  0.00   3  381.7  4  0.00   4  381.7  5  0.00   4  381.7
  10 000      4  0.00             4  0.00             5  0.00
  20 000      4  0.00             4  0.01             5  0.01
  40 000      4  0.01             4  0.01              5  0.01
  80 000      4  0.02             4  0.03             5  0.03
  160 000     4  0.06             4  0.06             6  0.09
  320 000     4  0.13             4  0.13             6  0.21
  640 000     4  0.29             5  0.37             6  0.48
  1 280 000   4  0.61             5  0.81             6  1.01

Table 1: Comparison of ACA and SVD (times in seconds).
The approximation via SVD gives the best approximation but requires $O(n^3)$ complexity. Notice that the CPU time for both methods includes the computation of the required matrix entries. For problem sizes larger than 5000 the SVD could not be computed within 30 minutes. ACA shows linear complexity, and the approximation rank is insignificantly larger than the optimal one, which does not seem to depend on the problem size. Note that in order to guarantee linear complexity we replaced (19) with

$$\|u_{k+1}\|_2\, \|v_{k+1}\|_2 \leq \varepsilon\, \|S_{k+1}\|_F, \quad (20)$$

because $\|A\|_F \approx \|S_{k+1}\|_F$ and $\|S_{k+1} - S_k\|_F = \|u_{k+1}\|_2\, \|v_{k+1}\|_2 \approx \|A - S_k\|_F$.
Table 2 shows the expression

$$\sigma_{1,k} = \max_{j=1,\dots,n} \sum_{i=1}^{k} |\chi_i^{(k)}(q_j)|.$$

The amplification factors do not seem to grow exponentially with $k$.
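This quantity is cheap to evaluate once the pivots are known. A small sketch (Python/NumPy; the function and argument names, and the evaluation over the full grid, are illustrative assumptions):

```python
import numpy as np

def amplification_factor(entry, piv_rows, piv_cols, n):
    """sigma_{1,k} = max over the grid of sum_i |chi_i^{(k)}(.)|, where
    chi^{(k)}(x) = Mt^{-T} [phi(x, y_1), ..., phi(x, y_k)]^T and
    (Mt)_{ij} = phi(x_i, y_j); entry(i, j) returns a_ij."""
    Mt = np.array([[entry(i, j) for j in piv_cols] for i in piv_rows])
    Phi = np.array([[entry(i, j) for j in piv_cols] for i in range(n)])
    chi = np.linalg.solve(Mt.T, Phi.T).T     # row i holds chi^{(k)}(p_i)
    return float(np.max(np.sum(np.abs(chi), axis=1)))
```

Only $k(n + k)$ entries of $A$ are needed, so the cost is again linear in $n$.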
  n \ k       1     2     3     4     5     6
  320 000     1.00  1.04  1.25  2.63  4.70  3.61
  640 000     1.00  1.04  1.26  2.61  3.76  3.70
  1 280 000   1.00  1.05  1.27  2.54  3.01  3.08

Table 2: Amplification factors σ_{1,k} for the case ε = 10^-7.
3.2 Application to singular functions

In previous publications the adaptive cross approximation was applied to functions on domains $X \times Y$ which are well-separated from singular points of $\phi$. The following result shows the rate of convergence in the case that $X \times Y$ approaches singular points. As an important prototype we consider

$$\phi(x, y) := (|x|^q + |y|^q)^{-1/q} \quad (21)$$

with arbitrary $q \in \mathbb{N}$.
Theorem 1. Let $\delta_1, \delta_2 > 0$ and $X \subset \{x \in \mathbb{R}^d : |x| > \delta_1\}$, $Y \subset \{y \in \mathbb{R}^d : |y| > \delta_2\}$. Then for $R_k$ applied to $\phi$ from (21) there is a constant $c_k > 0$ such that

$$|R_k(x, y)| \leq c_k\, 8 \cdot 2^{1/q}\, (\delta_1^q + \delta_2^q)^{-1/q}\, e^{-\pi\sqrt{k/q}}, \quad x \in X, \; y \in Y.$$
Proof. In [9] it is proved that for the approximation of the function $f(t) := t^{-1/q}$ by exponential sums

$$s_k(t) := \sum_{i=1}^{k} \omega_i\, e^{-\alpha_i t}, \quad \omega_i, \alpha_i \in \mathbb{R},$$

it holds that

$$\|f - s_k\|_{\infty,[\delta,\infty)} = \min_{\omega_i, \alpha_i} \|f - s_k\|_{\infty,[\delta,\infty)} \leq 8 \cdot 2^{1/q}\, \delta^{-1/q}\, e^{-\pi\sqrt{k/q}}.$$

Without loss of generality we may assume that the coefficients $\alpha_i$ are pairwise distinct. Setting $\Lambda_i(y) := e^{-\alpha_i |y|^q}$, for $x \in X$ it holds that $s_k(|x|^q + |\cdot|^q) \in \operatorname{span}\{\Lambda_1, \dots, \Lambda_k\}$. Hence, for the approximation of $\phi$ on $X \times Y$ we obtain

$$\sup_{x \in X}\, \inf_{v \in \operatorname{span}\Lambda_k} \|\phi(x, \cdot) - v\|_{\infty,Y} \leq \sup_{x \in X} \|f(|x|^q + |\cdot|^q) - s_k(|x|^q + |\cdot|^q)\|_{\infty,Y} \leq \|f - s_k\|_{\infty,[\delta_1^q + \delta_2^q,\infty)} \leq 8 \cdot 2^{1/q}\, (\delta_1^q + \delta_2^q)^{-1/q}\, e^{-\pi\sqrt{k/q}}.$$

The assertion follows from (14) and Remark 2 with $c_k := (1 + \|I_k\|)(\sigma_{1,k} + 1)$.
As a consequence, the rank $k$ required to guarantee an error of the order $\varepsilon > 0$ depends logarithmically on both $\varepsilon$ and the maximum $\delta := \max\{\delta_1, \delta_2\}$ of the distances to the singularity, provided that $c_k \leq e^{\frac{\pi}{2}\sqrt{k/q}}$:

$$\varepsilon^{-1}\, 8 \cdot 2^{1/q}\, \delta^{-1}\, e^{-\frac{\pi}{2}\sqrt{k/q}} < 1 \iff k > \frac{4q}{\pi^2} \big[ |\log \varepsilon| + |\log \delta| + (3 + 1/q) \log 2 \big]^2. \quad (22)$$
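As a quick sanity check of (22), the following lines (Python; purely illustrative, and assuming $\delta \approx 1/(2n)$ for the grids $p_i = (i - 1/2)/n$ used below) evaluate the predicted bound:

```python
import math

def rank_bound(eps, delta, q):
    """Sufficient rank according to (22)."""
    return 4.0 * q / math.pi**2 * (abs(math.log(eps)) + abs(math.log(delta))
                                   + (3.0 + 1.0 / q) * math.log(2.0))**2

for n in (1250, 160000, 1280000):
    print(n, rank_bound(1e-5, 0.5 / n, q=2))
```

Being a worst-case guarantee (which also has to absorb $c_k$), the bound is much larger than the ranks observed in Table 3; only the logarithmic growth in $n$ is reflected there.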
We will now construct matrix approximations for matrices $A \in \mathbb{R}^{n \times n}$ generated by (21) for $q = 2, 5$ and $\varepsilon = 10^{-5}$. Table 3 shows that, in contrast to Table 1, the rank increases with the problem size due to the singularity of $\phi$. As predicted in (22), the dependence is logarithmic. The column labeled "factor" shows the compression ratio $2k/n$, i.e. the ratio of the number of units of memory required for the approximation and for the original matrix.
                    q = 2                           q = 5
              ACA                  SVD        ACA                  SVD
  n           k   factor   time    k  time    k   factor   time    k  time
  1 250       18  3·10^-2  0.00    16 4.2     31  5·10^-2  0.00    29 4.2
  2 500       19  2·10^-2  0.00    18 43.1    35  3·10^-2  0.01    32 43.1
  5 000       21  8·10^-3  0.00    19 381.7   38  2·10^-2  0.03    35 381.7
  10 000      22  4·10^-3  0.02               41  8·10^-3  0.06
  20 000      24  2·10^-3  0.06               44  4·10^-3  0.25
  40 000      25  1·10^-3  0.16               46  2·10^-3  0.59
  80 000      27  7·10^-4  0.41               49  1·10^-3  1.36
  160 000     28  4·10^-4  0.96               52  7·10^-4  3.21
  320 000     28  2·10^-4  2.41               55  3·10^-4  8.93
  640 000     30  9·10^-5  6.57               58  2·10^-4  23.85
  1 280 000   31  5·10^-5  14.12              61  1·10^-4  52.91

Table 3: Comparison of ACA and SVD for ε = 10^-5 (times in seconds).
Additionally, it is visible that $q$ increases the approximation rank. However, the difference between the optimal rank and the one computed by ACA is still small.

Table 4 shows the corresponding amplification factors $\sigma_{1,k}$.
  n \ k      1   2   3   4   5   6   7   8   9   10  15  20  25  30  31
  320 000    1.0 1.0 1.2 2.6 2.5 1.9 2.5 3.3 2.7 2.8 3.0 3.3 8.3
  640 000    1.0 1.0 1.2 2.6 2.5 1.9 2.5 3.3 2.7 2.8 3.0 3.3 4.8 14.4
  1 280 000  1.0 1.0 1.2 2.6 2.5 1.9 2.5 3.3 2.7 2.8 3.0 3.3 4.2 11.0 10.1

Table 4: Amplification factors σ_{1,k} for the case q = 2, ε = 10^-5.
4 Adaptive cross approximation of trivariate functions

In this section functions $\phi : X \times Y \times Z \to \mathbb{C}$ of three variables will be considered. An obvious generalization of the bivariate method to such functions is the recursion

$$R_k(x, y, z) = R_{k-1}(x, y, z) - \frac{R_{k-1}(x, y_k, z_k)}{R_{k-1}(x_k, y_k, z_k)}\, R_{k-1}(x_k, y, z)$$

for $k = 1, 2, \dots$ and $R_0(x, y, z) = \phi(x, y, z)$. The previous recursion still contains a function of two variables, which can be approximated using ACA. Instead of $R_k$ we will therefore use the recursion

$$\tilde R_k(x, y, z) = \tilde R_{k-1}(x, y, z) - \frac{\tilde R_{k-1}(x, y_k, z_k)}{\tilde R_{k-1}(x_k, y_k, z_k)}\, \mathcal{A}_{yz}[\tilde R_{k-1}|_{x_k}](y, z) \quad (23)$$

for $k = 1, 2, \dots$ and $\tilde R_0(x, y, z) = \phi(x, y, z)$. Here, $\mathcal{A}_{yz}[f]$ denotes the approximation of the bivariate function $f(y, z)$ as presented in Sect. 3. The number of ACA steps used for the construction of $\mathcal{A}_{yz}[f]$ will be denoted by $k'$. The points $x_k$, $y_k$, and $z_k$ are chosen such that $\tilde R_{k-1}(x_k, y_k, z_k) \neq 0$.
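A compact sketch of recursion (23) is given below (Python/NumPy), reusing the routine `aca` from the sketch in Sect. 3.1 for the y-z slice of the current remainder. The starting pivot, the simplified stopping rule, and the dense pivot search are illustrative assumptions; moreover, this naive version re-evaluates remainders entrywise and is therefore not of linear complexity, which the paper achieves by tracing the error back as in (17) and (18).

```python
import numpy as np

def aca_3d(entry, n, eps=1e-4, max_rank=50):
    """Sketch of recursion (23) for an n x n x n array entry(i1, i2, i3).

    Step l stores the fiber w_l over the first direction (scaled by the
    pivot) and a bivariate ACA approximation sum_j u_j v_j^T of the slice
    R~_{l-1}(x_l, ., .), so that, cf. (24a),
        S~_k(i1, i2, i3) = sum_l w_l[i1] * sum_j u_j[i2] * v_j[i3].
    """
    terms = []

    def remainder(i1, i2, i3):              # entrywise evaluation of R~_l
        r = entry(i1, i2, i3)
        for w, us, vs in terms:
            r -= w[i1] * sum(u[i2] * v[i3] for u, v in zip(us, vs))
        return r

    i2, i3 = 0, 0                           # heuristic starting (y, z) pivot
    first = None
    for _ in range(max_rank):
        w = np.array([remainder(i1, i2, i3) for i1 in range(n)])
        i1 = int(np.argmax(np.abs(w)))
        if w[i1] == 0.0:
            break
        # bivariate ACA of the slice R~_{l-1}(x_l, ., .), cf. Sect. 3
        us, vs = aca(lambda a, b: remainder(i1, a, b), n, n, eps=eps)
        terms.append((w / w[i1], us, vs))
        # Frobenius norm of the new term via the Gram trick of Sect. 4.1
        slice2 = sum(np.dot(u, u2) * np.dot(v, v2)
                     for u, v in zip(us, vs) for u2, v2 in zip(us, vs))
        tnorm = np.linalg.norm(w / w[i1]) * np.sqrt(slice2)
        first = tnorm if first is None else first
        if tnorm <= eps * first:            # simplified stopping rule
            break
        # next (y, z) pivot: maximum of the slice approximation; done densely
        # here for brevity (Sect. 4.1 gives a linear-complexity alternative)
        M = sum(np.outer(u, v) for u, v in zip(us, vs))
        i2, i3 = np.unravel_index(int(np.argmax(np.abs(M))), M.shape)
    return terms
```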
Before we analyze the decay of $|\tilde R_k|$ with $k$ in the setting of incremental approximations, we show how the approximation $\tilde S_k := \phi - \tilde R_k$ can be represented in terms of $\phi$ and $\tilde R_\ell$, $\ell = 1, \dots, k-1$.
Lemma 9. The function $\tilde S_k$ is of the form

$$\tilde S_k(x, y, z) = \sum_{\ell=1}^{k} \frac{\tilde R_{\ell-1}(x, y_\ell, z_\ell)}{\tilde R_{\ell-1}(x_\ell, y_\ell, z_\ell)} \sum_{\mu,\nu=1}^{k'} c^{(\ell)}_{\mu\nu}\, \tilde R_{\ell-1}(x_\ell, y, z^{(\ell)}_\mu)\, \tilde R_{\ell-1}(x_\ell, y^{(\ell)}_\nu, z) \quad (24a)$$

$$= \sum_{\ell=1}^{k} \phi(x, y_\ell, z_\ell) \sum_{i,j=1}^{k} \sum_{\mu,\nu=1}^{k'} \beta^{(ij)}_{\ell\mu\nu}\, \phi(x_i, y, z^{(i)}_\mu)\, \phi(x_j, y^{(j)}_\nu, z) \quad (24b)$$

with points $y^{(j)}_\nu$, $z^{(i)}_\mu$ and suitable coefficients $c^{(\ell)}_{\mu\nu}$, $\beta^{(ij)}_{\ell\mu\nu}$.
Proof. We have seen in Sect. 3 that

$$\mathcal{A}_{yz}[f](y, z) = \sum_{\mu,\nu=1}^{k'} c_{\mu\nu}\, f(y, z_\mu)\, f(y_\nu, z)$$

with coefficients $c_{\mu\nu}$ and points $y_\nu$, $z_\mu$ depending on $f$. Hence, it is easy to see that

$$\tilde S_1(x, y, z) = \phi(x, y_1, z_1) \sum_{\mu,\nu=1}^{k'} \frac{c^{(1)}_{\mu\nu}}{\phi(x_1, y_1, z_1)}\, \phi(x_1, y, z^{(1)}_\mu)\, \phi(x_1, y^{(1)}_\nu, z).$$

Assume that the assertion is valid for $k - 1$. Since by (24b)

$$\tilde S_{k-1}(x_k, y^{(k)}_\nu, z) = \sum_{\lambda=1}^{k'} \sum_{j=1}^{k-1} \eta^{(\nu)}_{\lambda j}\, \phi(x_j, y^{(j)}_\lambda, z) \quad\text{and}\quad \tilde S_{k-1}(x_k, y, z^{(k)}_\mu) = \sum_{\lambda=1}^{k'} \sum_{i=1}^{k-1} \theta^{(\mu)}_{\lambda i}\, \phi(x_i, y, z^{(i)}_\lambda),$$

where

$$\eta^{(\nu)}_{\lambda j} = \sum_{\ell,i=1}^{k-1} \phi(x_k, y_\ell, z_\ell) \sum_{\mu=1}^{k'} \beta^{(ij)}_{\ell\mu\lambda}\, \phi(x_i, y^{(k)}_\nu, z^{(i)}_\mu) \quad\text{and}\quad \theta^{(\mu)}_{\lambda i} = \sum_{\ell,j=1}^{k-1} \phi(x_k, y_\ell, z_\ell) \sum_{\nu=1}^{k'} \beta^{(ij)}_{\ell\lambda\nu}\, \phi(x_j, y^{(j)}_\nu, z^{(k)}_\mu),$$

it follows that

$$\tilde R_{k-1}(x_k, y^{(k)}_\nu, z) = \phi(x_k, y^{(k)}_\nu, z) - \tilde S_{k-1}(x_k, y^{(k)}_\nu, z) = \sum_{\lambda=1}^{k'} \sum_{j=1}^{k} \bar\eta^{(\nu)}_{\lambda j}\, \phi(x_j, y^{(j)}_\lambda, z)$$

(absorbing $\phi(x_k, y^{(k)}_\nu, z)$ into the $j = k$ term) and similarly

$$\tilde R_{k-1}(x_k, y, z^{(k)}_\mu) = \sum_{\lambda=1}^{k'} \sum_{i=1}^{k} \bar\theta^{(\mu)}_{\lambda i}\, \phi(x_i, y, z^{(i)}_\lambda).$$

Hence

$$\mathcal{A}_{yz}[\tilde R_{k-1}|_{x_k}](y, z) = \sum_{\mu,\nu=1}^{k'} c^{(k)}_{\mu\nu}\, \tilde R_{k-1}(x_k, y, z^{(k)}_\mu)\, \tilde R_{k-1}(x_k, y^{(k)}_\nu, z) = \sum_{i,j=1}^{k} \sum_{\mu,\nu=1}^{k'} \beta^{(ij)}_{k\mu\nu}\, \phi(x_i, y, z^{(i)}_\mu)\, \phi(x_j, y^{(j)}_\nu, z),$$

where $\beta^{(ij)}_{k\mu\nu} := \sum_{\lambda,\kappa=1}^{k'} c^{(k)}_{\lambda\kappa}\, \bar\theta^{(\lambda)}_{\mu i}\, \bar\eta^{(\kappa)}_{\nu j}$. Together with

$$\frac{\tilde R_{k-1}(x, y_k, z_k)}{\tilde R_{k-1}(x_k, y_k, z_k)} = \sum_{\ell=1}^{k} \gamma^{(k)}_\ell\, \phi(x, y_\ell, z_\ell)$$

we obtain the assertion.
Whereas in the bivariate case only the amplification factors $\sigma_{1,k}$ entered the error estimates, the perturbation introduced by $\mathcal{A}_{yz}[\tilde R_{k-1}|_{x_k}]$ is also amplified by the expression

$$c^{(k)}_{\mathrm{piv}} := \max_{y \in Y,\, z \in Z} \frac{|\mathcal{A}_{yz}[\tilde R_{k-1}|_{x_k}](y, z)|}{|\tilde R_{k-1}(x_k, y_k, z_k)|},$$

as we shall see in the following theorem. Notice that the factor $c^{(k)}_{\mathrm{piv}}$ can be evaluated easily in each step of the iteration to check its size.
Theorem 2. Let $\delta > 0$ be sufficiently small, and for $j = 1, \dots, k$ assume that

$$\sup_{y \in Y,\, z \in Z} |\tilde R_{j-1}(x_j, y, z) - \mathcal{A}_{yz}[\tilde R_{j-1}|_{x_j}](y, z)| \leq \delta.$$

Then for $x \in X$, $y \in Y$, and $z \in Z$

$$|\tilde R_k(x, y, z)| \leq (\sigma_{1,k} + 1) \max_{\zeta \in \{x, x_1, \dots, x_k\}} \|E_k[\phi_\zeta]\|_{\infty, Y \times Z} + c_k\, \delta,$$

where $c_k := \hat\sigma_{1,k} + 2 \sum_{j=1}^{k} \hat\sigma_{1,j-1} \prod_{i=j}^{k} (c^{(i)}_{\mathrm{piv}} + 1)(\hat\sigma_{i,k} + 1)$.
Proof. Notice that $|R_k(x, y, z)|$ was estimated in the previous section if $(y, z)$ is treated as a single variable; see (14). Hence,

$$|R_k(x, y, z)| \leq (\sigma_{1,k} + 1) \max_{\zeta \in \{x, x_1, \dots, x_k\}} \|E_k[\phi_\zeta]\|_{\infty, Y \times Z}.$$

Furthermore, for fixed $y, z$ we have that $R_k = r_k[\phi_{y,z}]$ is of type (1) if we choose $\xi_k := r_{k-1}[\phi_{y_k, z_k}]$, while the recursion for $\tilde R_k$ is of type (9), i.e. $\tilde R_k(x, y, z) = \breve r_k[\phi_{y,z}](x)$ for the choice

$$\hat\xi_k := \breve r_{k-1}[\phi_{y_k, z_k}] = \tilde R_{k-1}(\cdot, y_k, z_k), \qquad a_k := \mathcal{A}_{yz}[\tilde R_{k-1}|_{x_k}](y, z).$$

We obtain from Lemma 7 that

$$\|\breve r_k[\phi_{y,z}] - \hat r_k[\phi_{y,z}]\|_\infty \leq \sum_{j=1}^{k} \|\hat\chi_j^{(k)}\|_\infty\, |\breve r_{j-1}[\phi_{y,z}](x_j) - a_j| \leq \delta\, \hat\sigma_{1,k},$$

because

$$|\breve r_{j-1}[\phi_{y,z}](x_j) - a_j| = |\tilde R_{j-1}(x_j, y, z) - \mathcal{A}_{yz}[\tilde R_{j-1}|_{x_j}](y, z)| \leq \delta.$$

Let $F_k := \sup_{y,z} \|r_{k-1}[\phi_{y,z}] - \hat r_{k-1}[\phi_{y,z}]\|_\infty$. Then it follows that

$$\varepsilon_k = \|\xi_k - \hat\xi_k\|_\infty \leq \|r_{k-1}[\phi_{y_k,z_k}] - \hat r_{k-1}[\phi_{y_k,z_k}]\|_\infty + \|\hat r_{k-1}[\phi_{y_k,z_k}] - \breve r_{k-1}[\phi_{y_k,z_k}]\|_\infty \leq F_k + \delta\, \hat\sigma_{1,k-1}.$$

For sufficiently small $\delta$ we may assume that

$$F_k + \delta (\hat\sigma_{1,k-1} + 1) \leq \tfrac{1}{2}\, |\hat\xi_k(x_k)|. \quad (25)$$

Then Lemma 6 proves the estimate

$$F_{k+1} \leq \sum_{i=1}^{k} \lambda_i\, (F_i + \delta\, \hat\sigma_{1,i-1}) \quad (26)$$

with

$$\lambda_i := \frac{\sup_{y,z} |r_{i-1}[\phi_{y,z}](x_i)|}{|\xi_i(x_i)|}\, (\hat\sigma_{i,k} + 1).$$

From $|r_{i-1}[\phi_{y,z}](x_i) - \mathcal{A}_{yz}[\tilde R_{i-1}|_{x_i}](y, z)| \leq F_i + \delta(\hat\sigma_{1,i-1} + 1)$ we obtain that

$$\frac{\sup_{y,z} |r_{i-1}[\phi_{y,z}](x_i)|}{|\xi_i(x_i)|} \leq \frac{\sup_{y,z} |\mathcal{A}_{yz}[\tilde R_{i-1}|_{x_i}](y, z)| + F_i + \delta(\hat\sigma_{1,i-1} + 1)}{|\hat\xi_i(x_i)| - F_i - \delta\, \hat\sigma_{1,i-1}} \leq 2 c^{(i)}_{\mathrm{piv}} + 1$$

due to (25). Define $F'_1 = 0$ and $F'_{k+1} = \sum_{i=1}^{k} \lambda_i (F'_i + \delta\, \hat\sigma_{1,i-1})$. We see that

$$F'_{k+1} = F'_k + \lambda_k (F'_k + \delta\, \hat\sigma_{1,k-1}) = (\lambda_k + 1)\, F'_k + \lambda_k\, \hat\sigma_{1,k-1}\, \delta$$

and thus

$$F'_k = \delta \sum_{j=1}^{k-1} \hat\sigma_{1,j-1}\, \lambda_j \prod_{i=j+1}^{k-1} (\lambda_i + 1).$$

From $F_1 = 0$ and (26) we see that

$$F_k \leq F'_k = \delta \sum_{j=1}^{k-1} \hat\sigma_{1,j-1}\, \lambda_j \prod_{i=j+1}^{k-1} (\lambda_i + 1) \leq \delta \sum_{j=1}^{k-1} \hat\sigma_{1,j-1} \prod_{i=j}^{k-1} (\lambda_i + 1).$$

It follows that

$$\|\breve r_k\|_\infty \leq \|\breve r_k - \hat r_k\|_\infty + F_{k+1} + \|r_k\|_\infty \leq \|r_k\|_\infty + \delta \Big[ \hat\sigma_{1,k} + 2 \sum_{j=1}^{k} \hat\sigma_{1,j-1} \prod_{i=j}^{k} (c^{(i)}_{\mathrm{piv}} + 1)(\hat\sigma_{i,k} + 1) \Big].$$
4.1 Matrix approximation

We apply the approximation (23) to the array $A \in \mathbb{R}^{n \times n \times n}$ with entries $a_{i_1 i_2 i_3} = \phi(p_{i_1}, p_{i_2}, p_{i_3})$ and $p_i = \frac{1}{n}(i - \frac{1}{2})$, $i = 1, \dots, n$, generated by evaluating the smooth function

$$\phi(x, y, z) = (1 + x^2 + y^2 + z^2)^{-1/2}.$$

From (24a) we obtain the representation

$$(S_k)_{i_1 i_2 i_3} = \sum_{\ell=1}^{k} (w_\ell)_{i_1} \sum_{\lambda=1}^{k'_\ell} (u^{(\ell)}_\lambda)_{i_2}\, (v^{(\ell)}_\lambda)_{i_3}$$

with appropriate vectors $u^{(\ell)}_\lambda$, $v^{(\ell)}_\lambda$, and $w_\ell$, $\lambda = 1, \dots, k'_\ell$, $\ell = 1, \dots, k$. Here, $k'_\ell$ denotes the rank of the $\ell$-th two-dimensional approximation. Hence,

$$\|S_k\|_F^2 = \sum_{i_1, i_2, i_3 = 1}^{n} (S_k)_{i_1 i_2 i_3}^2 = \sum_{\ell, \ell' = 1}^{k} (w_\ell, w_{\ell'})\, \Sigma_{\ell\ell'},$$

where

$$\Sigma_{\ell\ell'} := \sum_{\lambda=1}^{k'_\ell} \sum_{\lambda'=1}^{k'_{\ell'}} (u^{(\ell)}_\lambda, u^{(\ell')}_{\lambda'})(v^{(\ell)}_\lambda, v^{(\ell')}_{\lambda'}),$$

can be exploited to evaluate $\|S_k\|_F$ with linear complexity. Furthermore, we have that $\|S_{k+1} - S_k\|_F^2 = \|w_{k+1}\|_2^2\, \Sigma_{k+1,k+1}$, and condition (20) becomes

$$\|w_{k+1}\|_2\, \sqrt{\Sigma_{k+1,k+1}} \leq \varepsilon\, \|S_{k+1}\|_F$$

in the case of third order tensors. Notice that the computation of $R_k$ can be avoided by tracing back the error as in (17) and (18).
The pivoting indices $(i_1^{(k)}, i_2^{(k)}, i_3^{(k)})$ can be obtained in many ways. The aim of this choice is to reduce the amplification factors $\sigma_{1,k}$ and $c^{(k)}_{\mathrm{piv}}$. In the following numerical examples we have chosen $i_1^{(\ell)}$ as the index of the maximum entry in modulus of the vector $A_{1:n,\, i_2^{(\ell)},\, i_3^{(\ell)}}$, where $(i_2^{(\ell+1)}, i_3^{(\ell+1)})$ is the index pair of the maximum entry in modulus of the rank-$k'_\ell$ matrix $U_\ell V_\ell^H$, $U_\ell \in \mathbb{C}^{n \times k'_\ell}$ and $V_\ell \in \mathbb{C}^{n \times k'_\ell}$. The maximum can be found with complexity $O(k'^2_\ell\, n)$ via the following procedure. Let $u_j$ and $v_j$ denote the columns of $U_\ell$ and $V_\ell$, respectively. In [13] it is pointed out that the $n^2 \times n^2$ matrix

$$C := \sum_{j=1}^{k'_\ell} \operatorname{diag}(u_j) \otimes \operatorname{diag}(v_j)$$

has the eigenvalues $(U_\ell V_\ell^H)_{ij}$ with eigenvectors $e_i \otimes e_j$, $i, j = 1, \dots, n$. Hence, the maximum entry can be computed, for instance, by vector iteration. Here, the problem arises that

$$C (x \otimes y) = \sum_{j=1}^{k'_\ell} (\operatorname{diag}(u_j)\, x) \otimes (\operatorname{diag}(v_j)\, y),$$

i.e. the rank increases from step to step. In order to avoid this, $C(x \otimes y) \in \mathbb{C}^{n \times n}$ is truncated to rank 1 via the singular value decomposition. The latter can be computed with $O(k'^2_\ell\, n)$ operations; see [4].
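A sketch of this search (Python/NumPy, real case; assuming a spectral gap so that the vector iteration converges, and with all names being illustrative):

```python
import numpy as np

def max_entry_lowrank(U, V, iters=30):
    """Approximates argmax_{i,j} |(U V^T)_{ij}| without forming U V^T.

    Vector iteration for C = sum_j diag(u_j) (x) diag(v_j) applied to
    rank-1 iterates x (x) y; after each step the image, which equals
    sum_j (u_j * x)(v_j * y)^T, is truncated back to rank 1 via an SVD
    of its thin factors, cf. [13].
    """
    n, kp = U.shape
    x = np.ones(n) / np.sqrt(n)
    y = np.ones(n) / np.sqrt(n)
    for _ in range(iters):
        X = U * x[:, None]                 # columns diag(u_j) x   (n x k')
        Y = V * y[:, None]                 # columns diag(v_j) y   (n x k')
        # rank-1 truncation of X Y^T with O(k'^2 n) work:
        Qx, Rx = np.linalg.qr(X)
        Qy, Ry = np.linalg.qr(Y)
        W, s, Zt = np.linalg.svd(Rx @ Ry.T)    # small k' x k' SVD
        x = Qx @ W[:, 0]
        y = Qy @ Zt[0, :]
    return int(np.argmax(np.abs(x))), int(np.argmax(np.abs(y)))
```

The dominant eigenvector of $C$ is (up to the gap assumption) $e_i \otimes e_j$ for the entry of maximum modulus, so the largest components of the converged $x$ and $y$ reveal the index pair.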
Hence, the complexity of the matrix approximation algorithm is $O(n \sum_{\ell=1}^{k} k'^2_\ell)$, while the storage requirement is $O(n k_S)$, where $k_S := \sum_{\ell=1}^{k} k'_\ell$. Table 5 shows the number of steps $k$, the Kronecker rank $k_S$, and the CPU time required for constructing the approximation. The column labeled "factor" contains the compression ratio $(2 k_S + 1)/n^2$.
              ε = 10^-3                ε = 10^-4                ε = 10^-5
  n           k  k_S  factor    time   k  k_S  factor    time   k  k_S  factor    time
  160 000     3  9    7·10^-10  0.6    3  11   9·10^-10  0.7    4  18   1·10^-9   1.5
  320 000     3  9    2·10^-10  1.5    3  11   2·10^-10  1.8    4  19   4·10^-10  4.0
  640 000     3  9    5·10^-11  3.8    3  10   5·10^-11  4.5    4  20   1·10^-10  10.6
  1 280 000   3  9    1·10^-11  6.9    3  10   1·10^-11  9.0    4  20   3·10^-11  21.3

Table 5: ACA for third order tensors (times in seconds).
Table 6 shows the results obtained for the functions

$$\phi(x, y, z) = (x^q + y^q + z^q)^{-1/q}$$

with $q = 1, 2$, which are singular at $x = y = z = 0$. The ranks increase compared with Table 5, but the complexity is still linear with respect to $n$.
                    q = 1                         q = 2
  n           k   k_S  factor    time     k   k_S  factor    time
  160 000     21  308  2·10^-8   127.1    32  706  6·10^-8   597.7
  320 000     17  214  4·10^-9   190.8    30  660  1·10^-8   1416.3
  640 000     26  405  2·10^-9   1426.2   31  667  3·10^-9   3653.2
  1 280 000   24  421  5·10^-10  3102.5   28  574  7·10^-10  5756.9

Table 6: ACA for third order tensors and ε = 10^-3 (times in seconds).
5 Adaptive cross approximation of functions of four variables

The construction of approximations to functions

$$\phi : W \times X \times Y \times Z \to \mathbb{C}$$

of four variables $w, x, y, z$ can be done by applying ACA to the pairs $(w, x)$ and $(y, z)$:

$$R_k(w, x, y, z) = R_{k-1}(w, x, y, z) - \frac{R_{k-1}(w, x, y_k, z_k)\, R_{k-1}(w_k, x_k, y, z)}{R_{k-1}(w_k, x_k, y_k, z_k)}.$$

Since this leads to bivariate functions, we approximate them using ACA again. Hence, in this section we will investigate the recursion

$$\tilde R_k(w, x, y, z) = \tilde R_{k-1}(w, x, y, z) - \frac{\mathcal{A}_{wx}[\tilde R_{k-1}|_{y_k, z_k}](w, x)\, \mathcal{A}_{yz}[\tilde R_{k-1}|_{w_k, x_k}](y, z)}{\mathcal{A}_{wx}[\tilde R_{k-1}|_{y_k, z_k}](w_k, x_k)} \quad (27)$$

for $k = 1, 2, \dots$ and $\tilde R_0 = \phi$. The choice of $(w_k, x_k, y_k, z_k)$ guarantees $\mathcal{A}_{wx}[\tilde R_{k-1}|_{y_k, z_k}](w_k, x_k) \neq 0$. Here, $\mathcal{A}_{wx}[f]$ and $\mathcal{A}_{yz}[g]$ denote ACA approximations of the bivariate functions $f$ and $g$ with rank $k'$.
Lemma 10. The approximating function $\tilde S_k := \phi - \tilde R_k$ is of the form

$$\tilde S_k(w, x, y, z) = \sum_{\ell=1}^{k} u_\ell(w, x)\, v_\ell(y, z) = \sum_{|\lambda|=1}^{k} \sum_{|i|=1}^{k'} \alpha^\lambda_i\, f^\lambda_i(w, x, y, z), \quad (28)$$

where $\lambda, i \in \mathbb{N}^4$, $|\lambda| := \max_{j=1,\dots,4} \lambda_j$,

$$u_\ell(w, x) := \sum_{i,j=1}^{k'} c^{(\ell)}_{ij}\, \tilde R_{\ell-1}(w, x^{(\ell)}_i, y_\ell, z_\ell)\, \tilde R_{\ell-1}(w^{(\ell)}_j, x, y_\ell, z_\ell),$$

$$v_\ell(y, z) := \sum_{i,j=1}^{k'} d^{(\ell)}_{ij}\, \tilde R_{\ell-1}(w_\ell, x_\ell, y, z^{(\ell)}_i)\, \tilde R_{\ell-1}(w_\ell, x_\ell, y^{(\ell)}_j, z),$$

and

$$f^\lambda_i(w, x, y, z) := \phi(w, x^{(\lambda_1)}_{i_1}, y_{\lambda_1}, z_{\lambda_1})\, \phi(w^{(\lambda_2)}_{i_2}, x, y_{\lambda_2}, z_{\lambda_2})\, \phi(w_{\lambda_3}, x_{\lambda_3}, y, z^{(\lambda_3)}_{i_3})\, \phi(w_{\lambda_4}, x_{\lambda_4}, y^{(\lambda_4)}_{i_4}, z)$$

with points $x^{(\lambda_1)}_{i_1}$, $w^{(\lambda_2)}_{i_2}$, $z^{(\lambda_3)}_{i_3}$, $y^{(\lambda_4)}_{i_4}$ and coefficients $\alpha^\lambda_i$, $c^{(\ell)}_{ij}$, and $d^{(\ell)}_{ij}$.
Proof. We already know that

$$\mathcal{A}_{wx}[f](w, x) = \sum_{i,j=1}^{k'} c_{ij}\, f(w, x'_i)\, f(w'_j, x)$$

with suitable coefficients $c_{ij}$ and points $x'_i$, $w'_j$ depending on $f$. Hence, it is easy to see that $\tilde S_1$ is of the desired form. Assuming that the assertion is true for $k - 1$, we obtain from

$$\tilde S_{k-1}(w_k, x_k, y^{(k)}_j, z) = \sum_{\lambda_4=1}^{k-1} \sum_{i_4=1}^{k'} \eta^{j}_{\lambda_4 i_4}\, \phi(w_{\lambda_4}, x_{\lambda_4}, y^{(\lambda_4)}_{i_4}, z),$$

where

$$\eta^{j}_{\lambda_4 i_4} := \sum_{\lambda_1, \lambda_2, \lambda_3 = 1}^{k-1} \sum_{i_1, i_2, i_3 = 1}^{k'} \alpha^\lambda_i\, \phi(w_k, x^{(\lambda_1)}_{i_1}, y_{\lambda_1}, z_{\lambda_1})\, \phi(w^{(\lambda_2)}_{i_2}, x_k, y_{\lambda_2}, z_{\lambda_2})\, \phi(w_{\lambda_3}, x_{\lambda_3}, y^{(k)}_j, z^{(\lambda_3)}_{i_3}),$$

that

$$\tilde R_{k-1}(w_k, x_k, y^{(k)}_j, z) = \phi(w_k, x_k, y^{(k)}_j, z) - \tilde S_{k-1}(w_k, x_k, y^{(k)}_j, z) = \sum_{\lambda_4=1}^{k} \sum_{i_4=1}^{k'} \bar\eta^{j}_{\lambda_4 i_4}\, \phi(w_{\lambda_4}, x_{\lambda_4}, y^{(\lambda_4)}_{i_4}, z)$$

and similarly

$$\tilde R_{k-1}(w_k, x_k, y, z^{(k)}_i) = \sum_{\lambda_3=1}^{k} \sum_{i_3=1}^{k'} \bar\theta^{i}_{\lambda_3 i_3}\, \phi(w_{\lambda_3}, x_{\lambda_3}, y, z^{(\lambda_3)}_{i_3}).$$

Hence

$$\mathcal{A}_{yz}[\tilde R_{k-1}|_{w_k, x_k}](y, z) = \sum_{i,j=1}^{k'} d^{(k)}_{ij}\, \tilde R_{k-1}(w_k, x_k, y, z^{(k)}_i)\, \tilde R_{k-1}(w_k, x_k, y^{(k)}_j, z) = \sum_{\lambda_3, \lambda_4 = 1}^{k} \sum_{i_3, i_4 = 1}^{k'} \mu^{\lambda_3 \lambda_4}_{i_3 i_4}\, \phi(w_{\lambda_3}, x_{\lambda_3}, y, z^{(\lambda_3)}_{i_3})\, \phi(w_{\lambda_4}, x_{\lambda_4}, y^{(\lambda_4)}_{i_4}, z),$$

where $\mu^{\lambda_3 \lambda_4}_{i_3 i_4} := \sum_{i,j=1}^{k'} d^{(k)}_{ij}\, \bar\theta^{i}_{\lambda_3 i_3}\, \bar\eta^{j}_{\lambda_4 i_4}$. Similarly,

$$\mathcal{A}_{wx}[\tilde R_{k-1}|_{y_k, z_k}](w, x) = \sum_{\lambda_1, \lambda_2 = 1}^{k} \sum_{i_1, i_2 = 1}^{k'} \nu^{\lambda_1 \lambda_2}_{i_1 i_2}\, \phi(w, x^{(\lambda_1)}_{i_1}, y_{\lambda_1}, z_{\lambda_1})\, \phi(w^{(\lambda_2)}_{i_2}, x, y_{\lambda_2}, z_{\lambda_2})$$

with analogously defined coefficients $\nu^{\lambda_1 \lambda_2}_{i_1 i_2}$, from which the assertion follows.
For four variables we obtain a result similar to Theorem 2 in the trivariate case. Here, in addition to the amplification factor $\sigma_{1,k}$, the expression

$$c^{(k)}_{\mathrm{piv}} := \max_{y \in Y,\, z \in Z} \frac{|\mathcal{A}_{yz}[\tilde R_{k-1}|_{w_k, x_k}](y, z)|}{|\mathcal{A}_{wx}[\tilde R_{k-1}|_{y_k, z_k}](w_k, x_k)|}$$

will enter the estimates.
Theorem 3. Let $\delta > 0$ be sufficiently small, and for $j = 1, \dots, k$ let

$$\sup_{y \in Y,\, z \in Z} |\tilde R_{j-1}(w_j, x_j, y, z) - \mathcal{A}_{yz}[\tilde R_{j-1}|_{w_j, x_j}](y, z)| \leq \delta, \quad (29a)$$

$$\sup_{w \in W,\, x \in X} |\tilde R_{j-1}(w, x, y_j, z_j) - \mathcal{A}_{wx}[\tilde R_{j-1}|_{y_j, z_j}](w, x)| \leq \delta. \quad (29b)$$

Then for $w \in W$, $x \in X$, $y \in Y$, and $z \in Z$

$$|\tilde R_k(w, x, y, z)| \leq (1 + \sigma_{1,k}) \max_{(\zeta, \eta) \in \{(w, x),\, (w_i, x_i),\, i = 1, \dots, k\}} \|E_k[\phi_{\zeta,\eta}]\|_{\infty, Y \times Z} + c_k\, \delta,$$

where

$$c_k := \hat\sigma_{1,k} + 2 \sum_{j=1}^{k} (\hat\sigma_{1,j-1} + 1) \prod_{i=j}^{k} (c^{(i)}_{\mathrm{piv}} + 1)(\hat\sigma_{i,k} + 1).$$
Proof. For fixed parameters $y, z$ the recursion for $\tilde R_k$ is of type (9), i.e. $\tilde R_k(w, x, y, z) = \breve r_k[\phi_{y,z}](w, x)$, if we choose

$$\hat\xi_k(w, x) := \mathcal{A}_{wx}[\breve r_{k-1}[\phi_{y_k, z_k}]](w, x) = \mathcal{A}_{wx}[\tilde R_{k-1}|_{y_k, z_k}](w, x), \qquad a_k := \mathcal{A}_{yz}[\tilde R_{k-1}|_{w_k, x_k}](y, z).$$

Let $r_k$ be defined as in (1) with $\xi_k(w, x) = r_{k-1}[\phi_{y_k, z_k}](w, x)$. From Lemma 7 we obtain

$$\|\breve r_k[\phi_{y,z}] - \hat r_k[\phi_{y,z}]\|_{\infty, W \times X} \leq \sum_{j=1}^{k} \|\hat\chi_j^{(k)}\|_\infty\, |\breve r_{j-1}[\phi_{y,z}](w_j, x_j) - a_j| \leq \delta\, \hat\sigma_{1,k},$$

because

$$|\breve r_{j-1}[\phi_{y,z}](w_j, x_j) - a_j| = |\tilde R_{j-1}(w_j, x_j, y, z) - \mathcal{A}_{yz}[\tilde R_{j-1}|_{w_j, x_j}](y, z)| \leq \delta.$$

Let $F_k := \sup_{y,z} \|r_{k-1}[\phi_{y,z}] - \hat r_{k-1}[\phi_{y,z}]\|_{\infty, W \times X}$. Then from assumption (29b) we have

$$\varepsilon_k = \|r_{k-1}[\phi_{y_k,z_k}] - \mathcal{A}_{wx}[\breve r_{k-1}[\phi_{y_k,z_k}]]\|_{\infty, W \times X} \leq \|r_{k-1}[\phi_{y_k,z_k}] - \breve r_{k-1}[\phi_{y_k,z_k}]\|_\infty + \|\breve r_{k-1}[\phi_{y_k,z_k}] - \mathcal{A}_{wx}[\breve r_{k-1}[\phi_{y_k,z_k}]]\|_\infty \leq F_k + \kappa_k,$$

where $\kappa_k := \delta(\hat\sigma_{1,k-1} + 1)$. For $\delta$ small enough we may assume that

$$F_k + \kappa_k \leq \tfrac{1}{2}\, |\hat\xi_k(w_k, x_k)|. \quad (30)$$

Then Lemma 6 proves the estimate

$$F_{k+1} \leq \sum_{i=1}^{k} \lambda_i\, (F_i + \kappa_i) \quad (31)$$

with

$$\lambda_i := \frac{\sup_{y,z} |r_{i-1}[\phi_{y,z}](w_i, x_i)|}{|\xi_i(w_i, x_i)|}\, (\hat\sigma_{i,k} + 1).$$

From $|r_{i-1}[\phi_{y,z}](w_i, x_i) - \mathcal{A}_{yz}[\tilde R_{i-1}|_{w_i, x_i}](y, z)| \leq F_i + \kappa_i$ we obtain that

$$\frac{\sup_{y,z} |r_{i-1}[\phi_{y,z}](w_i, x_i)|}{|\xi_i(w_i, x_i)|} \leq \frac{\sup_{y,z} |\mathcal{A}_{yz}[\tilde R_{i-1}|_{w_i, x_i}](y, z)| + F_i + \kappa_i}{|\hat\xi_i(w_i, x_i)| - F_i - \kappa_i} \leq 2 c^{(i)}_{\mathrm{piv}} + 1$$

due to (30). Similar to the proof of Theorem 2, from $F_1 = 0$ and (31) we see that

$$F_k \leq \sum_{j=1}^{k-1} \kappa_j\, \lambda_j \prod_{i=j+1}^{k-1} (\lambda_i + 1) \leq \sum_{j=1}^{k-1} \kappa_j \prod_{i=j}^{k-1} (\lambda_i + 1).$$

It follows that

$$\|\breve r_k\|_{\infty, W \times X} \leq \|\breve r_k - \hat r_k\|_{\infty, W \times X} + F_{k+1} + \|r_k\|_{\infty, W \times X} \leq \|r_k\|_{\infty, W \times X} + \delta \Big[ \hat\sigma_{1,k} + 2 \sum_{j=1}^{k} (\hat\sigma_{1,j-1} + 1) \prod_{i=j}^{k} (c^{(i)}_{\mathrm{piv}} + 1)(\hat\sigma_{i,k} + 1) \Big].$$

Notice that $\|r_k\|_\infty$ was estimated in Sect. 3.
5.1 Matrix approximation

We apply the algorithm (27) to an array $A \in \mathbb{R}^{n \times n \times n \times n}$ with entries $a_{i_1 i_2 i_3 i_4} = \phi(p_{i_1}, p_{i_2}, p_{i_3}, p_{i_4})$ and $p_i = \frac{1}{n}(i - \frac{1}{2})$, $i = 1, \dots, n$, generated by evaluating the smooth function

$$\phi(w, x, y, z) = (1 + w + x + y + z)^{-1}.$$
The stopping criterion (see (20))

$$\|S_{k+1} - S_k\|_F \leq \varepsilon\, \|S_{k+1}\|_F$$

can be evaluated with linear complexity, because from (28) we obtain the matrix representation

$$(S_k)_{i_1 i_2 i_3 i_4} = \sum_{\ell=1}^{k} \Big( \sum_{\lambda=1}^{k_\ell} (u^{(\ell)}_\lambda)_{i_1}\, (v^{(\ell)}_\lambda)_{i_2} \Big) \Big( \sum_{\lambda=1}^{k'_\ell} (\bar u^{(\ell)}_\lambda)_{i_3}\, (\bar v^{(\ell)}_\lambda)_{i_4} \Big)$$

with suitable vectors $u^{(\ell)}_\lambda$, $v^{(\ell)}_\lambda$, $\bar u^{(\ell)}_\lambda$, and $\bar v^{(\ell)}_\lambda$, which shows that

$$\|S_k\|_F^2 = \sum_{\ell, \ell' = 1}^{k} \Sigma_{\ell\ell'}\, \bar\Sigma_{\ell\ell'},$$

where

$$\Sigma_{\ell\ell'} := \sum_{\lambda=1}^{k_\ell} \sum_{\lambda'=1}^{k_{\ell'}} (u^{(\ell)}_\lambda, u^{(\ell')}_{\lambda'})(v^{(\ell)}_\lambda, v^{(\ell')}_{\lambda'}) \quad \text{and} \quad \bar\Sigma_{\ell\ell'} := \sum_{\lambda=1}^{k'_\ell} \sum_{\lambda'=1}^{k'_{\ell'}} (\bar u^{(\ell)}_\lambda, \bar u^{(\ell')}_{\lambda'})(\bar v^{(\ell)}_\lambda, \bar v^{(\ell')}_{\lambda'}).$$

Furthermore, $\|S_{k+1} - S_k\|_F^2 = \Sigma_{k+1,k+1}\, \bar\Sigma_{k+1,k+1}$. Also in the case of four-dimensional arrays the computation of $R_k$ can be avoided by tracing back the error as in (17) and (18).
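The Gram-matrix evaluation of $\|S_k\|_F$ used here (and, with one scalar factor, in Sect. 4.1) is cheap to implement. A small sketch (Python/NumPy; the data layout and names are illustrative assumptions):

```python
import numpy as np

def frob_norm_sq(terms):
    """||S_k||_F^2 for S_k = sum_l (sum u v^T) (x) (sum ub vb^T), cf. (28).

    terms is a list of ((U, V), (Ub, Vb)), the factor columns of the two
    bivariate approximations of each step; only inner products of size-n
    vectors are formed, so the cost is linear in n.
    """
    def sigma(Ua, Va, Ub, Vb):
        # sum_{l,l'} (u_l, u'_l')(v_l, v'_l') as an entrywise Gram product
        return float(np.sum((Ua.T @ Ub) * (Va.T @ Vb)))

    s = 0.0
    for (U1, V1), (U1b, V1b) in terms:
        for (U2, V2), (U2b, V2b) in terms:
            s += sigma(U1, V1, U2, V2) * sigma(U1b, V1b, U2b, V2b)
    return s
```

The last summand of the double loop with $\ell = \ell' = k+1$ is exactly $\Sigma_{k+1,k+1}\, \bar\Sigma_{k+1,k+1} = \|S_{k+1} - S_k\|_F^2$, so the stopping criterion comes at no extra cost.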
The pivots $(i_1^{(\ell)}, i_2^{(\ell)})$ are chosen as the indices of the entry of maximum modulus in the rank-$k_\ell$ matrix $U_\ell V_\ell^T$, while $(i_3^{(\ell+1)}, i_4^{(\ell+1)})$ corresponds to the maximum entry in the rank-$k'_\ell$ matrix $\bar U_\ell \bar V_\ell^T$, where $U_\ell$, $V_\ell$, $\bar U_\ell$, and $\bar V_\ell$ consist of the columns $u^{(\ell)}_\lambda$, $v^{(\ell)}_\lambda$, $\bar u^{(\ell)}_\lambda$, and $\bar v^{(\ell)}_\lambda$, respectively. Both maxima can be found with linear complexity using the technique from Sect. 4.1.
Hence, the complexity of the matrix approximation algorithm is $O(n \sum_{\ell=1}^{k} (k_\ell^2 + k'^2_\ell))$, and the storage required by the approximation is $O(n (k_S + k'_S))$, where $k_S := \sum_{\ell=1}^{k} k_\ell$ and $k'_S := \sum_{\ell=1}^{k} k'_\ell$. Table 7 shows the number of steps $k$, the ranks $k_S$ and $k'_S$, and the CPU time required for constructing the approximation. The column labeled "factor" contains the compression ratio $2(k_S + k'_S)/n^3$.
              ε = 10^-3                       ε = 10^-4            ε = 10^-5
  n           k  k_S  k'_S  factor    time    k  k_S  k'_S  time   k  k_S  k'_S  time
  160 000     4  19   13    2·10^-14  2.5     5  27   18    4.0    7  53   35    10.6
  320 000     4  19   13    2·10^-15  6.2     5  27   19    10.5   7  59   35    31.9
  640 000     4  19   13    2·10^-16  15.8    5  26   19    26.1   7  57   34    69.8
  1 280 000   4  19   13    3·10^-17  31.5    5  28   19    55.6   7  55   35    142.3

Table 7: ACA for fourth order tensors (times in seconds).
References

[1] M.-B. A. Babaev. Best approximation by bilinear forms. Mat. Zametki, 46(2):21–33, 158, 1989.

[2] M.-B. A. Babaev. Exact annihilators and their applications in approximation theory. Trans. Acad. Sci. Azerb. Ser. Phys.-Tech. Math. Sci., 20(1, Math. Mech.):17–24, 233, 2000.

[3] M. Bebendorf. Approximation of boundary element matrices. Numer. Math., 86(4):565–589, 2000.

[4] M. Bebendorf. Hierarchical Matrices: A Means to Efficiently Solve Elliptic Boundary Value Problems, volume 63 of Lecture Notes in Computational Science and Engineering (LNCSE). Springer, 2008. ISBN 978-3-540-77146-3.

[5] M. Bebendorf and R. Grzhibovskis. Accelerating Galerkin BEM for linear elasticity using adaptive cross approximation. Mathematical Methods in the Applied Sciences, 29:1721–1747, 2006.

[6] M. Bebendorf and S. Rjasanow. Adaptive low-rank approximation of collocation matrices. Computing, 70(1):1–24, 2003.

[7] Gregory Beylkin and Lucas Monzón. On approximation of functions by exponential sums. Appl. Comput. Harmon. Anal., 19(1):17–48, 2005.

[8] D. Braess and W. Hackbusch. Approximation of 1/x by exponential sums in [1, ∞). IMA J. Numer. Anal., 25(4):685–697, 2005.

[9] D. Braess and W. Hackbusch. On the efficient computation of high-dimensional integrals and the approximation by exponential sums. Technical Report 3, Max-Planck-Institute MiS, 2009.

[10] Hans-Joachim Bungartz and Michael Griebel. Sparse grids. Acta Numerica, 13:147–269, 2004.

[11] J. Douglas Carroll and Jih-Jie Chang. Analysis of individual differences in multidimensional scaling via an n-way generalization of "Eckart-Young" decomposition. Psychometrika, 35(3):283–319, 1970.

[12] C. Eckart and G. Young. The approximation of one matrix by another of lower rank. Psychometrika, 1:211–218, 1936.

[13] Mike Espig. Effiziente Bestapproximation mittels Summen von Elementartensoren in hohen Dimensionen. PhD thesis, University of Leipzig, 2007.

[14] W. Hackbusch and S. Kühn. A new scheme for the tensor representation. Technical Report 2, Max-Planck-Institute MiS, 2009.

[15] Wassily Hoeffding. A class of statistics with asymptotically normal distribution. Ann. Math. Statistics, 19:293–325, 1948.

[16] Ilghiz Ibraghimov. Application of the three-way decomposition for matrix compression. Numer. Linear Algebra Appl., 9(6-7):551–565, 2002. Preconditioned robust iterative solution methods, PRISM '01 (Nijmegen).

[17] B. N. Khoromskij. Structured rank-(r_1, ..., r_d) decomposition of function-related tensors in R^d. Comput. Methods Appl. Math., 6(2):194–220 (electronic), 2006.

[18] Tamara G. Kolda. A counterexample to the possibility of an extension of the Eckart-Young low-rank approximation theorem for the orthogonal rank tensor decomposition. SIAM J. Matrix Anal. Appl., 24(3):762–767 (electronic), 2003.

[19] Pieter M. Kroonenberg and Jan de Leeuw. Principal component analysis of three-mode data by means of alternating least squares algorithms. Psychometrika, 45(1):69–97, 1980.

[20] I. V. Oseledets, D. V. Savostianov, and E. E. Tyrtyshnikov. Tucker dimensionality reduction of three-dimensional arrays in linear time. SIAM J. Matrix Anal. Appl., 30(3):939–956, 2008.

[21] V. V. Pospelov. Approximation of functions of several variables by products of functions of a single variable. Akad. Nauk SSSR Inst. Prikl. Mat. Preprint, (32):75, 1978.

[22] Themistocles M. Rassias and Jaromír Šimša. Finite Sums Decompositions in Mathematical Analysis. Pure and Applied Mathematics (New York). John Wiley & Sons Ltd., Chichester, 1995. A Wiley-Interscience Publication.

[23] Erhard Schmidt. Zur Theorie der linearen und nichtlinearen Integralgleichungen. Math. Ann., 63(4):433–476, 1907.

[24] J. Schneider. Error estimates for two-dimensional cross approximation. Technical Report 5, Max-Planck-Institute MiS, 2009.

[25] Jaromír Šimša. The best L²-approximation by finite sums of functions with separable variables. Aequationes Math., 43(2-3):248–263, 1992.

[26] Jos M. F. ten Berge, Jan de Leeuw, and Pieter M. Kroonenberg. Some additional results on principal components analysis of three-mode data by means of alternating least squares algorithms. Psychometrika, 52(2):183–191, 1987.

[27] Ledyard R. Tucker. Some mathematical notes on three-mode factor analysis. Psychometrika, 31:279–311, 1966.

[28] Ch. Zenger. Sparse grids. In W. Hackbusch, editor, Parallel Algorithms for Partial Differential Equations, volume 31 of Notes on Numerical Fluid Mechanics, pages 241–251. Vieweg, 1991.

[29] Tong Zhang and Gene H. Golub. Rank-one approximation to high order tensors. SIAM J. Matrix Anal. Appl., 23(2):534–550 (electronic), 2001.