
Chapter 1

Fast Fourier transforms for nonequispaced data: A tutorial

Daniel Potts
Gabriele Steidl
Manfred Tasche

ABSTRACT In this chapter, we consider approximate methods for the fast computation of multivariate discrete Fourier transforms for nonequispaced data (NDFT) in the time domain and in the frequency domain. In particular, we are interested in the approximation error as a function of the arithmetic complexity of the algorithm. We discuss the robustness of NDFT algorithms with respect to roundoff errors and apply NDFT algorithms to the fast computation of Bessel transforms.

1.1 Introduction

Let Π^d := [-1/2, 1/2)^d and I_N := {k ∈ Z^d : -N/2 ≤ k < N/2}, where the inequalities hold componentwise. For x_k ∈ Π^d, v_j ∈ N Π^d, and f_k ∈ C, we are interested in the fast and robust computation of the discrete Fourier transform for nonequispaced data (NDFT)

    f(v_j) = Σ_{k∈I_N} f_k e^{-2πi x_k v_j}   (j ∈ I_M).   (1.1)

For arbitrary nodes, the direct evaluation of (1.1) takes O(N^d M^d) arithmetical operations, too much for practical purposes. For equispaced nodes x_k := k/N (k ∈ I_N) and v_j := j (j ∈ I_N), the values f(v_j) can be computed by the well-known fast Fourier transform (FFT) with only O(N^d log N) arithmetical operations. However, the FFT requires sampling on an equally spaced grid, which represents a significant limitation for many applications. For example, most geographical data are sampled at individual observation points or by fast moving measuring devices along lines. Using the ACT method (adaptive weights, conjugate gradient acceleration, Toeplitz matrices) [12] for the approximation of functions from scattered data [22], one has to solve a system of linear equations with a block Toeplitz matrix as system matrix. The entries of this Toeplitz matrix are of the form (1.1) and should be computed by efficient NDFT algorithms. Further applications of the NDFT range from frequency analysis of astronomical data [19] to modelling and imaging [4, 23].
Often either the nodes in the time or in the frequency domain lie on an equispaced grid, i.e., we have to compute

    f(v_j) = Σ_{k∈I_N} f_k e^{-2πi k v_j/N}   (j ∈ I_M)   (1.2)

or

    h(k) := Σ_{j∈I_N} f_j e^{-2πi k v_j/N}   (k ∈ I_M).   (1.3)

Several methods were introduced for the fast evaluation of (1.2) and (1.3). It may be useful to compare the different approaches by using the following matrix-vector notation: for the univariate setting d = 1 and M = N, the equations (1.2) can be written as

    f^ = A_N f   (1.4)

with

    f^ := (f(v_j))_{j=-N/2}^{N/2-1},  f := (f_k)_{k=-N/2}^{N/2-1},  A_N := (e^{-2πi k v_j/N})_{j,k=-N/2}^{N/2-1}.

Note that (1.3) is simply the "transposed" version of (1.2), i.e.,

    h^ = A_N^T f   (1.5)

with h^ := (h(k))_{k=-N/2}^{N/2-1}.
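The matrix-vector notation can be made concrete in a few lines of NumPy. The following is a sketch under the chapter's index conventions (j, k running from -N/2 to N/2 - 1); the function name is ours, not from the cited papers:

```python
import numpy as np

def ndft_matrix(nodes_v, N):
    """Nonequispaced Fourier matrix A_N = (exp(-2 pi i k v_j / N)),
    rows indexed by the nodes v_j, columns by k = -N/2, ..., N/2 - 1."""
    k = np.arange(-N // 2, N // 2)
    return np.exp(-2j * np.pi * np.outer(nodes_v, k) / N)

N = 8
rng = np.random.default_rng(0)
f = rng.standard_normal(N)                 # coefficients f_k
v = rng.uniform(-N / 2, N / 2, size=N)     # nonequispaced nodes v_j

A = ndft_matrix(v, N)
f_hat = A @ f                              # (1.4): f^ = A_N f
h_hat = A.T @ f                            # (1.5): h^ = A_N^T f

# For equispaced nodes v_j = j, A_N reduces to the classical Fourier
# matrix F_N, so the product can be checked against np.fft (which uses
# the index order 0, ..., N-1, hence the shifts).
F = ndft_matrix(np.arange(-N // 2, N // 2), N)
direct = F @ f
via_fft = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(f)))
```

For arbitrary nodes this matrix-vector product costs O(N^2) operations per transform, which is exactly the expense the algorithms discussed next avoid.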
The different NDFT algorithms can be divided into three groups:

I. Based on local series expansions of the complex exponentials, like Taylor series [2] or series of prolate spheroidal functions [18], algorithms were developed which are more efficient than the direct evaluation of the NDFT. In matrix-vector notation, the matrix A_N was approximated by the Hadamard (componentwise) product ∘ of the classical Fourier matrix

    F_N := (e^{-2πi k j/N})_{j,k=-N/2}^{N/2-1}

and a low rank matrix E, i.e.,

    A_N ≈ F_N ∘ E.

The rank of E determines the approximation error. Multiplication of the right-hand side with a vector means to perform rank(E) FFTs of length N.

II. A quite different idea was introduced in [9] for the univariate case. Note that the idea can be generalized to "line settings". Using Lagrange interpolation, the matrix A_N can be split exactly as

    A_N = D_1 C D_2 F_N,   (1.6)

where D_i (i = 1, 2) are diagonal matrices and where

    C := (1/φ((v_j/N) - (k/N)))_{j,k=-N/2}^{N/2-1}

with a monotonically decreasing function φ. A. Dutt and V. Rokhlin [9] originally used φ(x) := cot(πx) - i. For real input data, a more efficient modified version with φ(x) := 1/x was suggested in [6]. See Figure 1.1. The multiplication with C can be realized approximately by the fast multipole method [13].

In contrast to the other NDFT methods, the splitting (1.6) works also for the inverse matrix, i.e. A_N^{-1} = F_N D_3 C^T D_4, such that for suitably distributed nodes v_j the same algorithm can be used for the inverse transform. For clustered nodes, the inverse approach does not work, since the entries of the diagonal matrices differ extremely. See [6].

III. The most efficient NDFT algorithms for the computation of (1.2) and (1.3) appeared, to our knowledge, first in [8, 4]. See also [5, 20]. In [25], we proposed a unified approach to the efficient computation of (1.2) and (1.3), which includes [8, 4]. In matrix-vector notation, our approach to (1.2) can be written as

    A_N ≈ B F_n D   (n := σN, σ > 1),   (1.7)

with a modified diagonal matrix D and with a sparse matrix B with nonzero entries ψ((v_j/N) - (l/n)), where ψ approximates φ. The approaches in [8, 4] differ only by the choice of the window function φ. We have since learned that this algorithm with several window functions φ is a remake of a gridding algorithm, which was known in the image processing context years ago [23, 15, 19]. The gridding method is simply the transposed version of (1.7) for the efficient computation of (1.3).

However, the dependence of the speed and accuracy of the algorithm on the choice of the oversampling factor σ and the window width of φ was first investigated in [8, 4]. In [25], the error estimates from [8] were improved, which leads to criteria for the choice of the parameters of the algorithm.

In this paper, we repeat the unified approach (1.7) for the fast computation of (1.2) from [25] and show the relation to the gridding algorithm for the fast computation of (1.3). We will see that our approach is also closely connected with interpolation methods by "translates of one function" known from approximation theory. We include estimates of the approximation error.

FIGURE 1.1. Time (in seconds) for the straightforward computation of the NDFT, the NDFT algorithm in [8], and the FMM-based NDFT algorithms in [9] and [6], without consideration of initialization (precomputation) steps.
Section 1.3 contains an algorithm for the efficient computation of the general NDFT (1.1).
In Section 1.4, we demonstrate another advantage of Algorithm 1.1, namely its robustness with respect to roundoff errors, a feature which is well known from the classical FFT [21, 26].
Finally, we apply the NDFT in a fast Bessel transform algorithm.

1.2 NDFT for nonequispaced data either in time or frequency domain

Let us begin with the computation of (1.2). Only for notational reasons, we restrict our attention to the case M = N. We have to evaluate the 1-periodic trigonometric polynomial

    f(v) := Σ_{k∈I_N} f_k e^{-2πi k v}   (1.8)

at the nodes w_j := v_j/N ∈ Π^d (j ∈ I_N). We introduce the oversampling factor σ > 1 and set n := σN.

Let φ be a 1-periodic function with uniformly convergent Fourier series. We approximate f by

    s_1(v) := Σ_{l∈I_n} g_l φ(v - l/n).   (1.9)

Switching to the frequency domain, we obtain

    s_1(v) = Σ_{k∈Z^d} g^_k c_k(φ) e^{-2πi k v}
           = Σ_{k∈I_n} g^_k c_k(φ) e^{-2πi k v} + Σ_{r∈Z^d\{0}} Σ_{k∈I_n} g^_k c_{k+nr}(φ) e^{-2πi (k+nr) v}   (1.10)

with

    g^_k := Σ_{l∈I_n} g_l e^{2πi k l/n},   (1.11)
    c_k(φ) := ∫_{Π^d} φ(v) e^{2πi k v} dv   (k ∈ Z^d).

If the Fourier coefficients c_k(φ) become sufficiently small for k ∈ Z^d \ I_n and if c_k(φ) ≠ 0 for k ∈ I_N, then comparing (1.8) with (1.10) suggests to set

    g^_k := f_k/c_k(φ) for k ∈ I_N,   g^_k := 0 for k ∈ I_n \ I_N.   (1.12)

Now the values g_l can be obtained from (1.11) by a reduced inverse d-variate FFT of size n^d. If φ is also well localized in the time domain, such that it can be approximated by a 1-periodic function ψ with supp ψ ∩ Π^d ⊆ (2m/n) Π^d (2m ≪ n), then

    f(w_j) ≈ s_1(w_j) ≈ s(w_j) = Σ_{l∈I_{n,m}(w_j)} g_l ψ(w_j - l/n)   (1.13)

with I_{n,m}(w_j) := {l ∈ I_n : n w_j - m ≤ l ≤ n w_j + m (componentwise)}. For fixed w_j ∈ Π^d, the above sum contains at most (2m+2)^d nonzero summands.
In summary, we obtain the following algorithm for the fast computation of (1.2) with O((σN)^d log(σN)) arithmetical operations:

Algorithm 1.1. (Fast computation of NDFT (1.2))

Input: N ∈ N, σ > 1, n := σN, w_j ∈ Π^d, f_k ∈ C (j, k ∈ I_N).

0: Precompute c_k(φ) (k ∈ I_N) and ψ(w_j - l/n) (j ∈ I_N, l ∈ I_{n,m}(w_j)).
1: Form g^_k := f_k/c_k(φ) (k ∈ I_N).
2: Compute by d-variate FFT

    g_l := n^{-d} Σ_{k∈I_N} g^_k e^{-2πi k l/n}   (l ∈ I_n).

3: Set

    s(w_j) := Σ_{l∈I_{n,m}(w_j)} g_l ψ(w_j - l/n)   (j ∈ I_N).

Output: s(w_j) approximate value of f(w_j) (j ∈ I_N).
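For d = 1, Algorithm 1.1 can be sketched in NumPy as follows. The window is the Gaussian of Theorem 1.1 below with the parameter choice (1.17); the function names and the O(N^2) reference evaluation are our own illustration, not code from [25]:

```python
import numpy as np

def nfft_1d(f, w, sigma=2.0, m=6):
    """Sketch of Algorithm 1.1 (d = 1, Gaussian window): evaluates
    f(w_j) = sum_k f_k exp(-2 pi i k w_j), k = -N/2, ..., N/2 - 1,
    at arbitrary nodes w_j in [-1/2, 1/2)."""
    N = len(f)
    n = int(sigma * N)                              # oversampled grid size
    b = 2 * sigma * m / ((2 * sigma - 1) * np.pi)   # choice (1.17)

    # Step 1: g^_k = f_k / c_k(phi), c_k(phi) = exp(-(pi k/n)^2 b) / n.
    k = np.arange(-N // 2, N // 2)
    g_hat = f / (np.exp(-(np.pi * k / n) ** 2 * b) / n)

    # Step 2: g_l = n^{-1} sum_k g^_k exp(-2 pi i k l/n) via an FFT of
    # the zero-padded coefficients (indices taken modulo n).
    G = np.zeros(n, dtype=complex)
    G[k % n] = g_hat
    g = np.fft.fft(G) / n                           # g[l % n] = g_l

    # Step 3: short local sums against the truncated window psi.
    s = np.zeros(len(w), dtype=complex)
    for j, wj in enumerate(w):
        l = np.arange(int(np.ceil(n * wj - m)), int(np.floor(n * wj + m)) + 1)
        psi = np.exp(-(n * wj - l) ** 2 / b) / np.sqrt(np.pi * b)
        s[j] = np.sum(g[l % n] * psi)
    return s

def ndft_1d(f, w):
    """Direct O(N^2) evaluation of (1.8) for reference."""
    N = len(f)
    k = np.arange(-N // 2, N // 2)
    return np.exp(-2j * np.pi * np.outer(w, k)) @ f

rng = np.random.default_rng(1)
N = 64
f = rng.standard_normal(N)
w = rng.uniform(-0.5, 0.5, size=N)
err = np.max(np.abs(nfft_1d(f, w) - ndft_1d(f, w)))
```

Already with m = 6 and σ = 2 the result agrees with the direct evaluation to high accuracy; increasing m shrinks the error further at the cost of a wider window.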

In the univariate setting d = 1, Algorithm 1.1 reads in matrix-vector notation as

    A_N f ≈ B F_n D f,

where B denotes the sparse matrix

    B := (ψ(w_j - l/n))_{j=-N/2, l=-n/2}^{N/2-1, n/2-1}   (1.14)

and where

    D := (O | diag(1/(n c_k(φ)))_{k=-N/2}^{N/2-1} | O)^T   (1.15)

with (N, (n - N)/2)-zero matrices O.

If only a few nodes v_j differ from the equispaced nodes j, then the approximating function s_1 of f can alternatively be constructed by requiring exact interpolation at the nodes j/n, i.e., f(j/n) = s_1(j/n) (j ∈ I_n). As a consequence, we only have to replace c_k(φ) in step 1 of Algorithm 1.1 by the discrete Fourier sum of (φ(l/n))_{l∈I_n}. The approximation of a function f by an interpolating function s_1 which is a linear combination of translates of one given function φ on a grid was considered in various papers in approximation theory. See for example [16] and also [4].
Let us turn to the gridding algorithm for the fast computation of (1.3). It seems to be useful to introduce the gridding idea while avoiding the convenient notation with delta distributions. In short: for

    g(x) := Σ_{j∈I_N} f_j φ(x + w_j)

we obtain that

    c_k(g) = Σ_{j∈I_N} f_j e^{-2πi k w_j} c_k(φ) = h(k) c_k(φ)   (k ∈ Z^d),

such that h(k) in (1.3) can be computed if c_k(g) is known. Now we determine

    c_k(g) = ∫_{Π^d} Σ_{j∈I_N} f_j φ(x + w_j) e^{2πi k x} dx

approximately by the trapezoidal rule,

    c_k(g) ≈ n^{-d} Σ_{l∈I_n} Σ_{j∈I_N} f_j φ(w_j - l/n) e^{-2πi k l/n},

which introduces an aliasing error. Furthermore, we replace φ by its truncated version ψ, which introduces a truncation error. Obviously, this results in a "transposed" Algorithm 1.1, which reads for d = 1 in matrix-vector notation

    A_N^T ≈ D^T F_n B^T.

For l ∈ I_n we introduce the index set J_{n,m}(l) := {j ∈ I_N : l - m ≤ n w_j ≤ l + m}.

Algorithm 1.2. (Fast computation of NDFT (1.3))

Input: N ∈ N, σ > 1, n := σN, w_j ∈ Π^d, f_j ∈ C (j ∈ I_N).

0: Precompute c_k(φ) (k ∈ I_N) and ψ(w_j - l/n) (l ∈ I_n, j ∈ J_{n,m}(l)).
1: Set

    g~_l := Σ_{j∈J_{n,m}(l)} f_j ψ(w_j - l/n)   (l ∈ I_n).

2: Compute by d-variate FFT

    c~_k(g) := n^{-d} Σ_{l∈I_n} g~_l e^{-2πi k l/n}   (k ∈ I_N).

3: Form h~(k) := c~_k(g)/c_k(φ) (k ∈ I_N).

Output: h~(k) approximate value of h(k) (k ∈ I_N).
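A matching sketch of Algorithm 1.2 for d = 1 with the Gaussian window (again our own illustration); note that it is literally the transposed factorization D^T F_n B^T of the code above:

```python
import numpy as np

def nfft_adjoint_1d(f, w, sigma=2.0, m=6):
    """Sketch of the gridding Algorithm 1.2 (d = 1, Gaussian window):
    approximates h(k) = sum_j f_j exp(-2 pi i k w_j), k = -N/2,...,N/2-1,
    for nodes w_j in [-1/2, 1/2)."""
    N = len(f)
    n = int(sigma * N)
    b = 2 * sigma * m / ((2 * sigma - 1) * np.pi)   # choice (1.17)

    # Step 1: smear each sample onto nearby oversampled grid points
    # with the truncated window psi ("gridding").
    g_tilde = np.zeros(n, dtype=complex)
    for fj, wj in zip(f, w):
        l = np.arange(int(np.ceil(n * wj - m)), int(np.floor(n * wj + m)) + 1)
        psi = np.exp(-(n * wj - l) ** 2 / b) / np.sqrt(np.pi * b)
        np.add.at(g_tilde, l % n, fj * psi)

    # Step 2: c~_k(g) = n^{-1} sum_l g~_l exp(-2 pi i k l/n) by FFT.
    k = np.arange(-N // 2, N // 2)
    c = np.fft.fft(g_tilde)[k % n] / n

    # Step 3: divide out the window's Fourier coefficients c_k(phi).
    return c / (np.exp(-(np.pi * k / n) ** 2 * b) / n)

rng = np.random.default_rng(2)
N = 64
f = rng.standard_normal(N)
w = rng.uniform(-0.5, 0.5, size=N)
k = np.arange(-N // 2, N // 2)
exact = np.exp(-2j * np.pi * np.outer(k, w)) @ f     # direct (1.3)
err = np.max(np.abs(nfft_adjoint_1d(f, w) - exact))
```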

Both algorithms introduce the same approximation errors. By (1.13), the error splits as follows:

    E(w_j) := |f(w_j) - s(w_j)| ≤ E_a(w_j) + E_t(w_j)

with E_a(w_j) := |f(w_j) - s_1(w_j)| and E_t(w_j) := |s_1(w_j) - s(w_j)|. Note that E_a(w_j) and E_t(w_j) are the aliasing error and the truncation error introduced by Algorithm 1.2, respectively. Let

    E_∞ := max_{j∈I_N} E(w_j),   ‖f‖_1 := Σ_{k∈I_N} |f_k|.

Then we obtain immediately by (1.8), (1.10) - (1.12) that

    E_a(w_j) ≤ Σ_{k∈I_N} |f_k| Σ_{r∈Z^d\{0}} |c_{k+nr}(φ)/c_k(φ)|
             ≤ ‖f‖_1 max_{k∈I_N} Σ_{r∈Z^d\{0}} |c_{k+nr}(φ)/c_k(φ)|   (1.16)

and by (1.9), (1.11) - (1.13) that

    E_t(w_j) ≤ ‖f‖_1 n^{-d} (max_{k∈I_N} |c_k(φ)|^{-1}) Σ_{l∈I_n} |φ(w_j - l/n) - ψ(w_j - l/n)|.

To fill these error estimates with life, we consider special functions φ in the case d = 1. In [25], we have proved:

Theorem 1.1. Let (1.2) be computed by Algorithm 1.1 with the dilated, periodized Gaussian bell

    φ(v) := (πb)^{-1/2} Σ_{r∈Z} e^{-(n(v+r))²/b}

and with the truncated version of φ,

    ψ(v) := (πb)^{-1/2} Σ_{r∈Z} e^{-(n(v+r))²/b} χ_{[-m,m]}(n(v+r)).

Here χ_{[-m,m]} denotes the characteristic function of [-m, m]. Let σ > 1 and 1 ≤ b ≤ 2σm/((2σ-1)π). Then c_k(φ) = (1/n) e^{-(πk/n)² b} and

    E_a(w_j) ≤ ‖f‖_1 e^{-bπ²(1-1/σ)} ((1 + σ/((2σ-1)bπ²)) + e^{-2bπ²/σ} (1 + σ/((2σ+1)bπ²))),
    E_t(w_j) ≤ ‖f‖_1 (2/√(πb)) (1 + b/(2m)) e^{-b((m/b)² - (π/(2σ))²)},
    E_∞ ≤ 4 ‖f‖_1 e^{-bπ²(1-1/σ)}.

The approximation error decreases with increasing b. Therefore

    b = 2σm/((2σ-1)π)   (1.17)

provides a good choice for b as a function of σ and m. For the above parameter b, the aliasing error and the truncation error are balanced. Since the factor 2/√(πb) contributes also to the decay of E_t(w_j), we expect that the optimal parameter b lies slightly above the value in (1.17). A. Dutt and V. Rokhlin [8] do not give a criterion for the choice of the parameter b; they determine b by trial computations. Moreover, our error estimate is sharper than the estimate

    E_∞ ≤ (4b + 9) ‖f‖_1 e^{-bπ²(1-1/σ)/4}

in [8].

For the oversampling factor σ = 2, the new paper [19] of J. Pelt contains extensive numerical computations to determine the optimal parameter b as a function of m.

More recently, A. J. W. Duijndam and M. A. Schonewille [7] have evaluated the RMS error of Algorithm 1.2 with the Gaussian pulse φ. They found the same optimal parameter b as in (1.17), which confirms our result. Estimates for the multivariate setting were given in [11, 7].
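The coefficient formula and the balancing effect of (1.17) are easy to verify numerically. The following is our own sanity check, not a computation from [25]: the Fourier integral of the periodized Gaussian is approximated by the trapezoidal rule, and the decay exponents of the two error estimates are compared.

```python
import numpy as np

# Check c_k(phi) = n^{-1} exp(-(pi k/n)^2 b) and the balance of the
# aliasing/truncation exponents under the parameter choice (1.17).
N, sigma, m = 16, 2.0, 6
n = int(sigma * N)
b = 2 * sigma * m / ((2 * sigma - 1) * np.pi)       # choice (1.17)

L = 4096                                            # quadrature points
v = -0.5 + np.arange(L) / L
phi = np.zeros(L)
for r in range(-3, 4):                              # truncated periodization
    phi += np.exp(-(n * (v + r)) ** 2 / b) / np.sqrt(np.pi * b)

k = np.arange(-N // 2, N // 2)
quad = (np.exp(2j * np.pi * np.outer(k, v)) @ phi) / L   # trapezoidal rule
closed = np.exp(-(np.pi * k / n) ** 2 * b) / n
coeff_err = np.max(np.abs(quad - closed))

exp_aliasing = b * np.pi ** 2 * (1 - 1 / sigma)          # rate of E_a
exp_truncation = b * ((m / b) ** 2 - (np.pi / (2 * sigma)) ** 2)  # rate of E_t
```

With b from (1.17) the two exponents agree exactly, reflecting that both error contributions then decay at the same rate e^{-bπ²(1-1/σ)}.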

Figure 1.2 illustrates our theoretical results by numerical tests. The tests were implemented in MATLAB. The C programs for Algorithm 1.1 were taken from [6] and were included via cmex. The programs were tested on a Sun SPARCstation 20 in double precision.

Let N = 2^t. As an example, we consider the computation of

    f(w_j) = Σ_{k=0}^{2^t-1} e^{-2πi k w_j}   (j ∈ I_{2^t})   (1.18)

with uniformly distributed random nodes w_j ∈ [-1/2, 1/2). The exact vector f^ := (f(w_j))_{j=-2^{t-1}}^{2^{t-1}-1} is given by

    f(w_j) = e^{-2πi w_j (2^t - 1)} (1 - e^{2πi w_j 2^t}) / (1 - e^{2πi w_j}).

By f~ we denote the result of Algorithm 1.1 with φ, ψ as in Theorem 1.1, m := 15, σ := 2 and b as in (1.17). Let

    E_NDFT(t) := log_10(‖f~ - f^‖_2 / ‖f^‖_2).

Figure 1.2 (left) shows the error E_NDFT(10) for 10 numerical tests and for various m = 6, ..., 20. Figure 1.2 (right) presents the computation time (in seconds) for the cascade summation of (1.2) and for Algorithm 1.1. Note that for t = 13 the fast Algorithm 1.1 requires less than one second, while the direct method based on cascade summation takes more than 3 hours.
FIGURE 1.2. Left: (m, E_NDFT(10)) for m = 6, ..., 20. Right: computation time in seconds for t = 2, ..., 15.

Next, we consider centered cardinal B-splines as window functions; in contrast to the Gaussian kernel, these have compact support, so that we have no truncation error. Algorithm 1.2 with centered cardinal B-splines and with a modified third step was originally introduced in [4]. The difference in the third step was motivated by the tight frequency localization of the Lemarié-Battle scaling functions which arise in wavelet theory. See also [25]. By M_{2m} we denote the centered cardinal B-spline of order 2m.

Theorem 1.2. Let (1.2) be computed by Algorithm 1.1 with the dilated, periodized, centered cardinal B-spline

    φ(v) := Σ_{r∈Z} M_{2m}(n(v+r))   (m ≥ 1)

of order 2m, and let ψ = φ. Then

    c_k(φ) = 1/n for k = 0,   c_k(φ) = (1/n) (sin(kπ/n)/(kπ/n))^{2m} otherwise,

and for σ > 1 and 0 ≤ β ≤ 4m/3,

    E_∞ ≤ (4m/(2m-1)) (2/N)^β (2σ-1)^{-2m} |f|_{β,1}.

Here |f|_{β,1} denotes the Sobolev-like seminorm |f|_{β,1} := Σ_{k=-N/2}^{N/2-1} |f_k| |k|^β.

Proof: 1. For 0 < u ≤ 1/2 and m ∈ N, it holds that

    Σ_{r∈Z\{0}} (u/(u+r))^{2m} ≤ 2 (u/(u-1))^{2m} + 2 Σ_{r=2}^∞ (u/(u-r))^{2m}
      ≤ 2 (u/(u-1))^{2m} + 2 ∫_1^∞ (u/(u-x))^{2m} dx
      ≤ (4m/(2m-1)) (u/(u-1))^{2m}.

2. Since c_{rn}(φ) = 0 (r ≠ 0) and

    n c_{k+rn}(φ) = (sin(kπ/n)/(kπ/n + rπ))^{2m} = (sin(kπ/n)/(kπ/n))^{2m} ((k/n)/(k/n + r))^{2m} = n c_k(φ) ((k/n)/(k/n + r))^{2m},

we obtain by (1.16) that

    E_∞ ≤ (4m/(2m-1)) Σ_{k=-N/2, k≠0}^{N/2-1} |f_k| |k|^β n^{-β} |k/n|^{2m-β} / (k/n - sgn(k))^{2m}.

The functions h_+(u) := u^{2m-β}/(u-1)^{2m} and h_-(u) := u^{2m-β}/(u+1)^{2m} are monotonically increasing for u ∈ [0, 1/2] and β ≤ 4m/3. Thus, since |k| ≤ N/2 and hence |k|/n ≤ 1/(2σ), we obtain the assertion.

By Theorem 1.2, we have for β = 0 that

    E_∞ ≤ (4m/(2m-1)) (2σ-1)^{-2m} ‖f‖_1.

If the values f_k are Fourier coefficients of a smooth function, such that |f|_{β,1} does not increase as N → ∞, then the estimates with larger values of β may be useful. For example, we obtain for σ = β = m = 2 by Theorem 1.2 that

    E_∞ ≤ (2/3)^5 N^{-2} |f|_{2,1}.

Multivariate estimates were given in [11, 4].
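The closed form for c_k(φ) in Theorem 1.2 can be checked numerically from the explicit centered cubic B-spline, i.e. the case m = 2 (this check is ours, not from the cited papers):

```python
import numpy as np

def M4(x):
    """Centered cubic cardinal B-spline M_4, supp M_4 = [-2, 2]."""
    a = np.abs(x)
    return np.where(a <= 1, (2 - a) ** 3 - 4 * (1 - a) ** 3,
                    np.where(a <= 2, (2 - a) ** 3, 0.0)) / 6

m, N, sigma = 2, 16, 2.0
n = int(sigma * N)

h = 1 / 512
x = np.arange(-2, 2 + h / 2, h)               # quadrature grid on supp M_4
k = np.arange(1, N // 2)                      # k = 0 gives c_0(phi) = 1/n
# c_k(phi) = int phi(v) e^{2 pi i k v} dv = n^{-1} int M_4(t) e^{2 pi i k t/n} dt
quad = (np.exp(2j * np.pi * np.outer(k, x) / n) @ M4(x)) * h / n
closed = (np.sin(np.pi * k / n) / (np.pi * k / n)) ** (2 * m) / n
err = np.max(np.abs(quad - closed))
```

The trapezoidal rule converges fast here because M_4 and its first derivative vanish at the endpoints of its support.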
In various papers, window functions other than the Gaussian pulse or centered cardinal B-splines were considered:
- prolate spheroidal functions and Kaiser-Bessel functions [15],
- the Gaussian pulse tapered with a Hanning window [7],
- Gaussian kernels combined with sinc kernels [19],
- special optimized windows [15, 7].
Numerical results demonstrate that the approximation error (as a function of the speed of the computation) can be further improved by using these functions.

1.3 NDFT for nonequispaced data in time and frequency domain

The following algorithm for the fast computation of the general NDFT (1.1) (with M = N) turns out to be a combination of the gridding approach to (1.3) and of our approach to (1.2). Moreover, in order to apply again an aliasing formula, a periodization procedure becomes necessary. In order to form a-periodic functions, a parameter a comes into play.

By φ_1 ∈ L_2(R^d) we denote a sufficiently smooth function with Fourier transform

    φ^_1(v) := ∫_{R^d} φ_1(x) e^{-2πi v x} dx ≠ 0

for all v ∈ N Π^d. Then we obtain for

    G(x) := Σ_{k∈I_N} f_k φ_1(x - x_k)

that

    G^(v) = Σ_{k∈I_N} f_k e^{-2πi x_k v} φ^_1(v)

and consequently

    f(v_j) = G^(v_j)/φ^_1(v_j)   (j ∈ I_N).

Thus, given φ^_1(v_j), it remains to compute G^(v_j).
Thus, given '^1 (vj ), it remains to compute G^ (vj ).
Let n_1 := σ_1 N (σ_1 > 1) and m_1 ∈ N (2m_1 ≪ n_1). We introduce the parameter a := 1 + 2m_1/n_1 and rewrite G^(v) as

    G^(v) = Σ_{k∈I_N} f_k ∫_{R^d} φ_1(x - x_k) e^{-2πi x v} dx
          = Σ_{k∈I_N} f_k ∫_{aΠ^d} Σ_{r∈Z^d} φ_1(x + ra - x_k) e^{-2πi (x+ra) v} dx.   (1.19)

Discretization of the integral by the rectangular rule leads to

    G^(v) ≈ S_1(v) := Σ_{k∈I_N} f_k n_1^{-d} Σ_{j∈I_{an_1}} Σ_{r∈Z^d} φ_1(j/n_1 + ra - x_k) e^{-2πi (j/n_1 + ra) v}.   (1.20)

Now we approximate the function φ_1 by a function ψ_1 with supp ψ_1 ⊆ (2m_1/n_1) Π^d. Then the third sum in (1.20) contains only one nonzero summand, namely for r = 0. Changing the order of the summation, we obtain

    S_1(v) ≈ S_2(v) := n_1^{-d} Σ_{t∈I_{an_1}} (Σ_{k∈I_N} f_k ψ_1(t/n_1 - x_k)) e^{-2πi t v/n_1}.

After the computation of the inner sum for all t ∈ I_{an_1}, we arrive at computation problem (1.2), which can be solved in a fast way by Algorithm 1.1, where the corresponding window functions and parameters are indicated by the index 2. We summarize:

Algorithm 1.3. (Fast computation of NDFT (1.1))

Input: N ∈ N, σ_1 > 1, σ_2 > 1, n_1 := σ_1 N, a := 1 + 2m_1/n_1, n_2 := σ_1 σ_2 a N, x_k ∈ Π^d, v_j ∈ N Π^d, f_k ∈ C (j, k ∈ I_N).

0: Precompute φ^_1(v_j) (j ∈ I_N), ψ_1(t/n_1 - x_k) (k ∈ I_N, t ∈ I_{an_1,m_1}(x_k)), c_t(φ_2) (t ∈ I_{an_1}), and ψ_2(v_j/n_1 - l/n_2) (j ∈ I_N, l ∈ I_{n_2,m_2}(v_j/n_1)).
1: Compute

    F(t) := Σ_{k∈I_N} f_k ψ_1(t/n_1 - x_k)   (t ∈ I_{an_1}).

2: Form g^_t := F(t)/c_t(φ_2) (t ∈ I_{an_1}).
3: Compute by d-variate FFT

    g_l := n_2^{-d} Σ_{t∈I_{an_1}} g^_t e^{-2πi t l/n_2}   (l ∈ I_{n_2}).

4: Form

    s(v_j) := n_1^{-d} Σ_{l∈I_{n_2,m_2}(v_j/n_1)} g_l ψ_2(v_j/n_1 - l/n_2)   (j ∈ I_N).

5: Form S(v_j) := s(v_j)/φ^_1(v_j) (j ∈ I_N).

Output: S(v_j) approximate value of f(v_j) (j ∈ I_N).

Algorithm 1.3 requires O((σ_1 σ_2 a N)^d log(σ_1 σ_2 a N)) arithmetical operations. The approximation error is given by

    E(v_j) ≤ E_1(v_j) + E_2(v_j) + E_3(v_j)

with E_1(v_j) := |f(v_j) - S_1(v_j)/φ^_1(v_j)|, E_2(v_j) := |(S_1(v_j) - S_2(v_j))/φ^_1(v_j)|, and E_3(v_j) := |(S_2(v_j) - s(v_j))/φ^_1(v_j)|. The error E_3(v_j) is the product of the error of Algorithm 1.1 and |φ^_1(v_j)|^{-1}. The cut-off error E_2(v_j) behaves like the truncation error in Algorithm 1.1. The error E_1(v_j), arising from the discretization of the integral (1.19), can be estimated by an aliasing argument [11].

The following Table 1.1 (see also [10, 11]) contains the relative approximation error

    max_{j∈I_N} |f(v_j) - S(v_j)| / max_{j∈I_N} |f(v_j)|

introduced by Algorithm 1.3 for tensor products of Gaussian bells φ_1 and φ_2, for N := 128, m = m_1 = m_2, a := N/(N-m), σ_1 := 2/a, σ_2 := 2, and for randomly distributed nodes x_j and v_k/N in Π². By the choice of σ_1, σ_2 and a, the main effort of the algorithm consists in the bivariate FFT of size n_2 = 4N. The third column of Table 1.1 contains the error of Algorithm 1.3 if we simply set a = 1 and σ_1 = σ_2 = 2. This change of parameters influences only the first step of the algorithm. (A similar error occurs if we consider ψ_1 as a 1-periodic function.) Table 1.1 demonstrates the significance of the parameter a.

1.4 Roundoff errors

Beyond the approximation error, the numerical computation of Algorithm 1.1 involves roundoff errors. In this section, we will see that, similar to the

     m    a = N/(N-m)    a = 1
     5    5.96608e-06    0.0180850
     7    5.44728e-08    0.0318376
     9    1.07677e-09    0.0541445
    11    3.31061e-11    0.0906439
    13    1.26030e-12    0.1507300
    15    2.16694e-13    0.2500920

TABLE 1.1. Approximation error of Algorithm 1.3 for N := 128, a := N/(N-m), σ_1 := 2/a, σ_2 := 2, and for a := 1, σ_1 = σ_2 := 2, respectively.
classical FFT [21, 14], our algorithm is robust with respect to roundoff errors. In the following, we use the standard model of real floating point arithmetic (see [14], p. 44): for arbitrary α, β ∈ R and any operation ◦ ∈ {+, -, ·, /}, the exact value α ◦ β and the computed value fl(α ◦ β) are related by

    fl(α ◦ β) = (α ◦ β)(1 + ε)   (|ε| ≤ u),

where u denotes the unit roundoff (or machine precision). In the case of single precision (24 bits for the mantissa (with 1 sign bit), 8 bits for the exponent), we have

    u = 2^{-24} ≈ 5.96 · 10^{-8},

and for double precision (53 bits for the mantissa (with 1 sign bit), 11 bits for the exponent)

    u = 2^{-53} ≈ 1.11 · 10^{-16}.

Since complex arithmetic is implemented using real arithmetic, the complex floating point arithmetic is a consequence of the corresponding real arithmetic (see [14], pp. 78-80): for arbitrary α, β ∈ C, we have

    fl(α + β) = (α + β)(1 + ε)   (|ε| ≤ u),   (1.21)
    fl(α β) = α β (1 + ε)   (|ε| ≤ 2√2 u/(1 - 2u)).   (1.22)

In particular, if α ∈ R ∪ iR and β ∈ C, then

    fl(α β) = α β (1 + ε)   (|ε| ≤ u).   (1.23)
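In IEEE double precision (the standard today) these constants are easy to observe directly; a small illustration in Python, where np.finfo reports the machine epsilon 2u:

```python
import numpy as np

u = 2.0 ** -53                       # unit roundoff, double precision
eps = np.finfo(np.float64).eps       # machine epsilon = 2u
# 1 + u is the exact midpoint between 1 and the next float 1 + 2u,
# so round-to-even maps it back to 1, while 1 + 2u is representable.
below = (1.0 + u == 1.0)
above = (1.0 + 2 * u > 1.0)
```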
To simplify the following error analysis, we assume that all entries of A_N, B, F_n and D were precomputed exactly. Moreover, we restrict our attention to real input vectors f ∈ R^N. If we form f^ := A_N f by conventional multiplication and cascade summation (see [14], p. 70), then we obtain by (1.21) and (1.23) the componentwise estimate

    |fl(f^_j) - f^_j| ≤ (⌈log_2 N⌉ + 1) u / (1 - (⌈log_2 N⌉ + 1) u) ‖f‖_1
and, by taking the Euclidean norm,

    ‖fl(f^) - f^‖_2 ≤ (u N (⌈log_2 N⌉ + 1) + O(u²)) ‖f‖_2.

In particular, we have for f^ := F_N f that

    ‖fl(F_N f) - F_N f‖_2 ≤ (u N (⌈log_2 N⌉ + 1) + O(u²)) ‖f‖_2.

If we compute f^ = F_N f (f ∈ R^N, N a power of 2) by the radix-2 Cooley-Tukey FFT, then, following the lines of the proof in [26] and using (1.21) - (1.22), the roundoff error estimate can be improved by the factor √N; more precisely,

    ‖fl(F_N f) - F_N f‖_2 ≤ (u (4 + √2) √N log_2 N + O(u²)) ‖f‖_2.   (1.24)

The following theorem states that the roundoff error introduced by Algorithm 1.1 can be estimated as the FFT error in (1.24) up to a constant factor, which depends on m and σ.
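The cascade summation mentioned above adds the summands along a balanced binary tree, so each f_k passes through only about ⌈log_2 N⌉ + 1 roundings. A minimal sketch (our code, compared against Python's exactly rounded math.fsum):

```python
import math
import numpy as np

def cascade_sum(x):
    """Pairwise (cascade) summation: O(log N) roundoff growth per term."""
    x = list(x)
    while len(x) > 1:
        if len(x) % 2:               # odd length: fold the last term in
            x[-2] += x[-1]
            x.pop()
        x = [x[2 * i] + x[2 * i + 1] for i in range(len(x) // 2)]
    return x[0]

x = np.random.default_rng(3).standard_normal(10_000)
err = abs(cascade_sum(x) - math.fsum(x))
```

For 10^4 doubles the cascade result agrees with the exactly rounded sum to near machine accuracy, in line with the (⌈log_2 N⌉ + 1)u factor, whereas plain left-to-right recursive summation carries an O(Nu) worst-case factor.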

Theorem 1.3. Let m, N ∈ N, and let n := σN (σ > 1) be a power of 2 with 2m ≪ n. Let h be a nonnegative, real-valued, even function which decreases monotonically for x ≥ 0, and let

    φ(x) := Σ_{r∈Z} h(n(x+r)),   ψ(x) := Σ_{r∈Z} (χ_{[-m,m]} h)(n(x+r)).

Suppose that φ has a uniformly convergent Fourier expansion with monotonically decreasing absolute values of the Fourier coefficients

    c_k(φ) = (1/n) h^(2πk/n)   (k ∈ Z).

Let the nodes w_j := v_j/N ∈ [-1/2, 1/2) (j ∈ I_N) be distributed such that each "window" [-m/n + l/n, m/n + l/n) (l ∈ I_n) contains at most λ/σ nodes. If (1.2) is computed by Algorithm 1.1 with the above functions φ, ψ, i.e.,

    f~ := B F_n D f   (f ∈ R^N),

where D ∈ C^{n,N} and B ∈ R^{N,n} are determined by (1.14) - (1.15), then the roundoff error of Algorithm 1.1 can be estimated by

    ‖fl(f~) - f~‖_2 ≤ (√λ γ u (4 + √2) √N (log_2 N + log_2 σ + (2m + 1)/(4 + √2)) + O(u²)) ‖f‖_2

with

    γ := (h²(0) + ‖h‖²_{L_2})^{1/2} / |h^(π/σ)|.

Let us call an algorithm for the computation of the NDFT (1.2) robust if for all f ∈ R^N there exists a positive constant k_N with k_N u ≪ 1 such that

    ‖fl(f~) - f~‖_2 ≤ (k_N u + O(u²)) ‖f‖_2.

Then, by Theorem 1.3, Algorithm 1.1 is robust.

Proof: 1. First, we estimate the spectral norms of D and B ([14], p. 120). By assumption and by (1.15), we see immediately that

    ‖D‖_2 = max_{k∈I_N} |n c_k(φ)|^{-1} = (n |c_{N/2}(φ)|)^{-1} = |h^(π/σ)|^{-1}.   (1.25)

Since ψ is even and monotonically decreasing for x ≥ 0, it is easy to check the integral estimate

    (1/n) Σ_{l∈I_n} ψ²(w_j - l/n) ≤ (1/n) ψ²(0) + ∫_{-1/2}^{1/2} ψ²(x) dx.

Then it follows by the definition of ψ that

    Σ_{l∈I_n} ψ²(w_j - l/n) ≤ h²(0) + n ∫_{-m/n}^{m/n} h²(nx) dx ≤ h²(0) + ‖h‖²_{L_2}.   (1.26)

By the definition (1.14) of the sparse matrix

    B = (b_{j,k})_{j=-N/2, k=-n/2}^{N/2-1, n/2-1},   b_{j,k} := ψ(w_j - k/n),

we get for the j-th component (By)_j of By (y = (y_k)_{k=-n/2}^{n/2-1} ∈ C^n) that

    |(By)_j|² ≤ (Σ_{r=1}^{2m} |b_{j,k_r}| |y_{k_r}|)² ≤ (Σ_{r=1}^{2m} b²_{j,k_r}) (Σ_{r=1}^{2m} |y_{k_r}|²)   (b_{j,k_r} > 0, k_r ∈ {-n/2, ..., n/2 - 1}).

By (1.26), we have

    Σ_{r=1}^{2m} b²_{j,k_r} ≤ Σ_{k∈I_n} ψ²(w_j - k/n) ≤ h²(0) + ‖h‖²_{L_2},

such that

    |(By)_j|² ≤ (h²(0) + ‖h‖²_{L_2}) Σ_{r=1}^{2m} |y_{k_r}|².   (1.27)

By assumption, each "window" [-m/n + l/n, m/n + l/n) (l ∈ I_n) contains at most λ/σ nodes w_j (j ∈ I_N). Therefore each column of B contains at most λ/σ nonzero entries, such that by (1.27)

    ‖By‖²_2 = Σ_{j=-N/2}^{N/2-1} |(By)_j|² ≤ (λ/σ) (h²(0) + ‖h‖²_{L_2}) ‖y‖²_2

and consequently

    ‖B‖_2 ≤ √(λ/σ) (h²(0) + ‖h‖²_{L_2})^{1/2} =: γ~.   (1.28)

2. Next, it is easy to check that by (1.23) and (1.25)

    ‖fl(Df) - Df‖_2 ≤ u |h^(π/σ)|^{-1} ‖f‖_2.   (1.29)

From (1.25) it follows that

    ‖fl(Df)‖_2 ≤ ‖fl(Df) - Df‖_2 + ‖Df‖_2 ≤ |h^(π/σ)|^{-1} (u + 1) ‖f‖_2.   (1.30)

3. Set y^ := fl(F_n(fl(Df))) and y := F_n D f. Then we can estimate

    ‖y^ - y‖_2 ≤ ‖y^ - F_n(fl(Df))‖_2 + ‖F_n(fl(Df)) - F_n D f‖_2,

such that by (1.24), (1.29) and (1.30)

    ‖y^ - y‖_2 ≤ (u (4 + √2) √n log_2 n + O(u²)) ‖fl(Df)‖_2 + √n ‖fl(Df) - Df‖_2
              ≤ |h^(π/σ)|^{-1} (u (4 + √2) √n log_2 n + √n u + O(u²)) ‖f‖_2.   (1.31)

4. Finally, we consider the error between fl(f~) := fl(B y^) and f~ := B y. By (1.28) and (1.31), we obtain

    ‖fl(f~) - f~‖_2 ≤ ‖fl(B y^) - B y^‖_2 + ‖B (y^ - y)‖_2
                   ≤ ‖fl(B y^) - B y^‖_2 + γ~ |h^(π/σ)|^{-1} (u (4 + √2) √n log_2 n + √n u + O(u²)) ‖f‖_2.   (1.32)

By (1.21), (1.23) and (1.14), it follows from [14], p. 76 that

    |fl(B y^) - B y^| ≤ (2mu/(1 - 2mu)) B |y^|

with |y^| := (|y^_k|)_{k=-n/2}^{n/2-1}, and consequently by (1.28) that

    ‖fl(B y^) - B y^‖_2 ≤ (2mu/(1 - 2mu)) ‖B‖_2 ‖y^‖_2 ≤ (2m γ~ u + O(u²)) ‖y^‖_2.

By (1.30) and (1.25), we obtain

    ‖y^‖_2 ≤ ‖y^ - y‖_2 + ‖y‖_2 = ‖F_n D f‖_2 + O(u) ‖f‖_2 ≤ (√n |h^(π/σ)|^{-1} + O(u)) ‖f‖_2.

Thus

    ‖fl(B y^) - B y^‖_2 ≤ (2m γ~ √n |h^(π/σ)|^{-1} u + O(u²)) ‖f‖_2.   (1.33)

Together with (1.32), this yields the assertion.

Note that λ ≈ 2m for "uniformly distributed" nodes w_j.


Finally, we confirm our theoretical results by numerical experiments. We use the same setting as in Section 1.2. Further, let f~_C ∈ C^{2^t} denote the vector which was evaluated by cascade summation of the right-hand side of (1.18), and let

    E_C(t) := log_10(‖f~_C - f^‖_2 / ‖f^‖_2).

Figure 1.3 (left) shows the error E_C(t) for 10 numerical tests with various random nodes w_j as a function of the transform length N = 2^t. For comparison, Figure 1.3 (right) presents the corresponding error E_NDFT(t) introduced by NDFT Algorithm 1.1.
FIGURE 1.3. Left: (t, E_C(t)) for t = 1, ..., 13. Right: (t, E_NDFT(t)) for t = 1, ..., 15.
1.5 Fast Bessel transform

In this section, we apply the NDFT for a fast Bessel transform. We are interested in a fast algorithm for the computation of

    f^(n) := ∫_0^∞ f(x) J_n(x) dx   (n ∈ N_0),   (1.34)

where

    J_n(x) := Σ_{k=0}^∞ (-1)^k (x/2)^{n+2k} / (k! (n+k)!)   (x ∈ R)

is the Bessel function (of first kind) of order n. Let f be a real-valued, compactly supported function with supp f ⊆ [0, a] (0 < a < ∞). Using the formula (see [1], p. 361)

    e^{iq cos t} = 2 Σ'_{k=0}^∞ (-1)^k J_{2k}(q) cos(2kt) + 2i Σ_{k=0}^∞ (-1)^k J_{2k+1}(q) cos((2k+1)t)

with

    Σ'_{k=0}^∞ a_k := (1/2) a_0 + Σ_{k=1}^∞ a_k,

we obtain for x ∈ [-1, 1] and q ∈ R that

    e^{-iqx} = 2 Σ'_{k=0}^∞ (-1)^k J_{2k}(q) T_{2k}(x) - 2i Σ_{k=0}^∞ (-1)^k J_{2k+1}(q) T_{2k+1}(x).
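This Chebyshev expansion of e^{-iqx} can be checked numerically with J_n evaluated directly from the defining series in (1.34); the truncation limits below are our own choices, adequate for moderate q:

```python
import math
import numpy as np

def bessel_j(nu, x, terms=40):
    """Bessel function J_nu(x) via the series (1.34); fine for moderate x."""
    return sum((-1) ** k * (x / 2) ** (nu + 2 * k)
               / (math.factorial(k) * math.factorial(nu + k))
               for k in range(terms))

q = 3.0
x = np.linspace(-1, 1, 101)
t = np.arccos(x)                       # T_n(x) = cos(n arccos x)
s = np.zeros_like(x, dtype=complex)
for k in range(15):                    # truncated expansion
    s += 2 * (-1) ** k * bessel_j(2 * k, q) * np.cos(2 * k * t)
    s -= 2j * (-1) ** k * bessel_j(2 * k + 1, q) * np.cos((2 * k + 1) * t)
s -= bessel_j(0, q)                    # the primed sum halves the k = 0 term
err = np.max(np.abs(s - np.exp(-1j * q * x)))
```

The expansion converges rapidly because J_n(q) decays superexponentially once n exceeds q.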

Now multiplication by f(q) and integration with respect to q yields

    h(x) := ∫_0^a f(q) e^{-iqx} dq
          = 2 Σ'_{k=0}^∞ (-1)^k f^(2k) T_{2k}(x) - 2i Σ_{k=0}^∞ (-1)^k f^(2k+1) T_{2k+1}(x).   (1.35)

Note that

    (Re h)(x) = (Re h)(-x),   (Im h)(x) = -(Im h)(-x).

Finally, the orthogonality of the Chebyshev polynomials T_k(x),

    ∫_{-1}^1 w(x) T_k(x) T_l(x) dx = π if k = l = 0,  π/2 if k = l ≠ 0,  0 if k ≠ l,

with w(x) := (1 - x²)^{-1/2} implies

    f^(2k) = ((-1)^k/π) ∫_{-1}^1 w(x) (Re h)(x) T_{2k}(x) dx,   (1.36)
    f^(2k+1) = ((-1)^{k+1}/π) ∫_{-1}^1 w(x) (Im h)(x) T_{2k+1}(x) dx.   (1.37)

We approximate the integrals (1.35) - (1.37) by the following quadrature formulas. First, we compute (1.35) by the Simpson rule on the Chebyshev grid {x_l := cos((2l+1)π/(2N)) : l = 0, ..., N/2 - 1}, i.e.,

    h(x_l) ≈ h_l := (a/(3N)) Σ_{j=0}^N ω_j f(aj/N) e^{-i j a x_l/N}   (l = 0, ..., N/2 - 1)   (1.38)

with

    ω_j := 1 for j = 0, N;   ω_j := 4 for odd j;   ω_j := 2 for even j, 0 < j < N.

We realize the NDFT (1.38) by Algorithm 1.1 in O(N log N) arithmetical operations. Again, we choose φ, ψ as in Theorem 1.1 with m := 15, σ := 2 and b as in (1.17). Next, we compute (1.36) and (1.37) by the trapezoidal rule, which results for k = 0, ..., N/2 - 1 in

    f^(2k) ≈ f~_{2k} := ((-1)^k/N) Σ_{l=0}^{N-1} (Re h_l) cos(2k(2l+1)π/(2N)),   (1.39)
    f^(2k+1) ≈ f~_{2k+1} := ((-1)^{k+1}/N) Σ_{l=0}^{N-1} (Im h_l) cos((2k+1)(2l+1)π/(2N)).   (1.40)

For the fast computation of (1.39) and (1.40) in O(N log N) arithmetical operations, we use the fast cosine transform of type II (DCT-II) proposed in [24, 3]. In summary, we obtain the following algorithm for the fast computation of the Bessel transform in O(N log N) arithmetical operations:

Algorithm 1.4. (Fast computation of the Bessel transform (1.34))

Input: N := 2^t, f(ja/N) ∈ R (j = 0, ..., N).

1: Compute h_l (l = 0, ..., N/2 - 1) by (1.38) with Algorithm 1.1.
2: Set Re(h_{l+N/2-1}) := Re(h_{N/2-l}), Im(h_{l+N/2-1}) := -Im(h_{N/2-l}) (l = 1, ..., N/2).
3: Compute (1.39) - (1.40) by fast DCT-II.

Output: f~_n approximate value of f^(n) (n = 0, ..., N - 1).

FIGURE 1.4. Left: (t, E(t)) for t = 8, ..., 15 and f in (1.41). Right: (t, E(t)) for t = 7, ..., 12 and f in (1.43).
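Step 3 needs a DCT-II. A common way to obtain one in O(N log N) is Makhoul's even/odd reordering on top of a complex FFT, sketched below; the radix-p algorithm of [24] is a different derivation with the same complexity:

```python
import numpy as np

def dct2(x):
    """DCT-II: X_k = sum_{l=0}^{L-1} x_l cos(pi k (2l+1) / (2L)),
    computed via one length-L complex FFT (Makhoul's reordering)."""
    L = len(x)
    v = np.concatenate((x[0::2], x[1::2][::-1]))   # evens, then reversed odds
    V = np.fft.fft(v)
    return np.real(np.exp(-1j * np.pi * np.arange(L) / (2 * L)) * V)

rng = np.random.default_rng(4)
x = rng.standard_normal(16)
k, l = np.meshgrid(np.arange(16), np.arange(16), indexing="ij")
direct = np.cos(np.pi * k * (2 * l + 1) / 32) @ x  # O(N^2) reference
err = np.max(np.abs(dct2(x) - direct))
```

The sums (1.39) and (1.40) are exactly of this cosine-sum form, applied to (Re h_l) and (Im h_l).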

We test Algorithm 1.4 with the following examples:

1. Consider

    f(x) := e^{-εx} χ_{[0,a]}(x)   (1.41)

with a := 2000 and ε = 0.01. By [17], p. 772, we have

    ∫_0^∞ e^{-εx} J_n(x) dx = (1 + ε²)^{-1/2} (√(1 + ε²) + ε)^{-n},

such that we can approximate f^(k) by

    f^(k) ≈ f^_k := (1 + ε²)^{-1/2} (√(1 + ε²) + ε)^{-k}.

By f~ we denote the vector computed by Algorithm 1.4. Figure 1.4 (left) shows the relative error

    E(t) := log_10(‖f~ - f^‖_2 / ‖f^‖_2).   (1.42)

2. Let

    f(x) := χ_{[0,a]}(x)   (1.43)

with a := 100. Then we have by [1], p. 480 that

    ∫_0^a J_n(x) dx = 2 Σ_{l=0}^∞ J_{2l+n+1}(a).

We choose M such that J_{2M+n+1}(100) < 10^{-16} and set

    f^(k) ≈ f^_k := 2 Σ_{l=0}^M J_{2l+k+1}(100).

Figure 1.4 (right) shows the corresponding error (1.42).


References

[1] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Dover Publ., New York, 1972.
[2] C. Anderson and M. D. Dahleh, Rapid computation of the discrete Fourier transform, SIAM J. Sci. Comput. 17, 913 - 919, 1996.
[3] G. Baszenski and M. Tasche, Fast polynomial multiplication and convolutions related to the discrete cosine transform, Linear Algebra Appl. 252, 1 - 25, 1997.
[4] G. Beylkin, On the fast Fourier transform of functions with singularities, Appl. Comput. Harmon. Anal. 2, 363 - 381, 1995.
[5] A. Brandt, Multilevel computations of integral transforms and particle interactions with oscillatory kernels, Comput. Phys. Comm. 65, 24 - 38, 1991.
[6] J. O. Droese, Verfahren zur schnellen Fourier-Transformation mit nichtäquidistanten Knoten, Master Thesis, TH Darmstadt, 1996.
[7] A. J. W. Duijndam and M. A. Schonewille, Nonuniform fast Fourier transform, Preprint, Univ. Delft, 1998.
[8] A. Dutt and V. Rokhlin, Fast Fourier transforms for nonequispaced data, SIAM J. Sci. Statist. Comput. 14, 1368 - 1393, 1993.
[9] A. Dutt and V. Rokhlin, Fast Fourier transforms for nonequispaced data II, Appl. Comput. Harmon. Anal. 2, 85 - 100, 1995.
[10] B. Elbel, Mehrdimensionale Fouriertransformation für nichtäquidistante Daten, Master Thesis, TU Darmstadt, 1998.
[11] B. Elbel and G. Steidl, Fast Fourier transform for nonequispaced data, in: Approximation Theory IX, C. K. Chui and L. L. Schumaker (eds.), Vanderbilt University Press, Nashville, 1998.
[12] H. G. Feichtinger, K. H. Gröchenig and T. Strohmer, Efficient numerical methods for nonuniform sampling theory, Numer. Math. 69, 423 - 440, 1995.
[13] L. Greengard and V. Rokhlin, A fast algorithm for particle simulations, J. Comput. Phys. 73, 325 - 348, 1987.
[14] N. J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, 1996.
[15] J. I. Jackson, Selection of a convolution function for Fourier inversion using gridding, IEEE Trans. Medical Imaging 10, 473 - 478, 1991.


[16] K. Jetter and J. Stöckler, Algorithms for cardinal interpolation using box splines and radial basis functions, Numer. Math. 60, 97 - 114, 1991.
[17] M. A. Lawrentjew und B. W. Schabat, Methoden der komplexen Funktionentheorie, Deutscher Verlag der Wissenschaften, Berlin, 1967.
[18] G. Newsam, private communication.
[19] J. Pelt, Fast computation of trigonometric sums with application to frequency analysis of astronomical data, Preprint, 1997.
[20] W. H. Press and G. B. Rybicki, Fast algorithms for spectral analysis of unevenly sampled data, Astrophys. J. 338, 277 - 280, 1989.
[21] G. U. Ramos, Roundoff error analysis of the fast Fourier transform, Math. Comp. 25, 757 - 768, 1971.
[22] M. Rauth and T. Strohmer, Smooth approximation of potential fields from noisy scattered data, Preprint, Univ. Vienna, 1997.
[23] R. A. Sramek and F. R. Schwab, Imaging, in: Astronomical Society of the Pacific Conference Series, Vol. 6, R. Perley, F. R. Schwab and A. Bridle (eds.), 1988, 117 - 138.
[24] G. Steidl, Fast radix-p discrete cosine transform, Appl. Algebra Engrg. Comm. Comput. 3, 39 - 46, 1992.
[25] G. Steidl, A note on fast Fourier transforms for nonequispaced grids, Adv. Comput. Math. (in print).
[26] P. Y. Yalamov, Improvements of some bounds on the stability of fast Helmholtz solvers, Preprint, Univ. Rousse, 1998.
