Professional Documents
Culture Documents
University of Penn
November 9, 2007
Abstract
The following is an article from the University of Penn’s website, that introduces the Z-transform to
students. The original document can be viewed at http://www.ling.upenn.edu/courses/ling525/
z.html. The following, is an attempt to convert it to LATEX, interpreting the ASCII-art formulae into
a more human-readable form.
The content is the same – with some changes in notation – namely the usage of j rather than i
as the complex “imaginary” constant (since i is used for current in Electrical Engineering), and the
usage of the asterisk (*) rather than the hash (#) to indicate convolution – consistent with Wikipedia.
This document was typeset by Stuart Longland, and is provided as-is. The author of this document
is unknown.
The LATEX source used to generate this document is available at http://dev.gentoo.org/~redhatter/
misc/z-transform.7z1 .
1
7-zip archive: see http://www.7-zip.org
1
1 Introducing the Z-transform
After reading this section, you may want to look at Chapter 1, “Signal Processing Basics,” in the User’s
Guide for the Matlab Signal Processing Toolbox. There are a few things that we have not covered, and
will not cover – for instance, the state space and the partial fraction expansion ways of representing
linear systems. Some other topics, such as lattice form representations, will be covered later. However,
you should now be able to follow much of the exposition, and benefit from seeing the range of available
signal processing functions.
Note that the User’s Guide is a rather large (about 6 MB) .pdf file, so you will probably prefer to read
it via a fast network connection. Although the whole Guide is 720 pages long, the cited chapter is only
about 45 pages, so it is plausible to print it out.
1.1 Background
In this segment, we will be dealing with the properties of sequences made up of integer powers of some
complex number:
2
If the number z is a negative real number, we will get a sequence that alternates between positive
and negative values. Depending on whether z is less than, equal to, or greater than -1, these values will
increase exponentially, remain constant, or decrease exponentially in magnitude with increasing n.
z = -.97; x = z.^n;
plot(n,x,’go’); title(’z = -.97’);
3
If z is a complex number that happens to lie on the unit circle (in the complex plane, with the x-axis
representing the real part of z and the y-axis representing the imaginary part), then z n will be a sinusoid
sampled at intervals of 6 z radians.
This can easily be shown by considering that in this case z = ejω for some real number ω, which is
equivalent to cos(ω) + j sin(ω), so that z n will be ejnω , or cos(nω) + j sin(nω).
If z is a complex number (with a non-zero imaginary part) whose magnitude is slightly less than
or greater than 1, then z n will be a sinusoidal spiral like those we have just seen, whose magnitude is
exponentially decreasing or increasing depending on whether z has a magnitude less than or greater than
1.
4
Now suppose a linear time-invariant (LTI) system with impulse response h(n) has as its input one of
these exponential sequences
x[n] = z n
for some (arbitrarily chosen) complex number z.
The output will be the convolution sum h ∗ x, or
X
y[n] = h[k]x[n − k]
k
h[k]z n−k
X
=
k
n
h[k]z −k
X
= z
k
Thus the input was z n , and the output is z n multiplied by a constant that depends on the value of z
and on the impulse response h. If we write that constant as H(z) = k h[k]z −k then we can rewrite the
P
y[n] = H(z) ∗ z n
We can see another way to say this if we express the LTI system with impulse response h in the more
general form of the “convolution matrix” Mh – that is, a matrix with a set of shifted time-reversed copies
of the impulse response h in its rows, as discussed in an earlier lecture. Now since
y[n] = Mh z n
= H(z)z n
we can see that the complex exponential z n is an eigenvector of Mh , with H(z) as the eigenvalue, since
Mh times z n equals the constant H(z) times z n .
Because of the superposition property of linear systems, this “eigenrelationship” makes it convenient
to express a signal as a linear combination of complex exponentials. If
ak zkn
X
x[n] =
k
then the system output for an LTI system with impulse response h will be
ak H(zk )zkn
X
y[n] =
k
That is, the output is also a linear combination of the same set of complex exponential sequences, with
each coefficient being the product of the input coefficient ak and the system’s eigenvalue H(zk ) for the
eigenfunction zkn .
The expression for the system eigenvalues in terms of z
h[n]z −n
X
H(z) =
n
This equation is closely related to that for the DFT. Recall that the DFT is
2π
x[n]e−j N kn
X
X(k) =
n
5
If we rewrite the exponential on the right hand side slightly as
2π
−n
ejk N
each value of k can be seen as just picking a different complex number
jk2π
z=e N
∞
x[n]e−jωn
X
DTFT: X(ω) =
n=−∞
∞
x[n](ejω )−n
X
=
n=−∞
x[n]z −n
X
X(z) =
n
becomes
n
or
The right-hand side of this equation is just the DTFT of the sequence x[n] multiplied by a real
exponential r−n . Thus
6
1.2 History of the Z-transform
The z transform is unusual, in being named after a letter of the alphabet rather than a famous math-
ematician. The Fourier transform is named after Baron Jean Baptiste Joseph Fourier (1768-1830); the
Walsh-Hadamard transform is named after J.L. Walsh (?) and Jacques Salomon Hadamard (1865-1963);
we haven’t discussed the Laplace and Hilbert transforms yet, but we will (at least briefly), and they are
named after Pierre-Simon de Laplace (1749-1827) and David Hilbert (1862-1943) respectively.
Laplace transforms have long been used in solving (continuous-time) linear constant-coefficient differ-
ential equations.
According to p 420 of Contemporary Linear Systems (Strum and Kirk 1994),
A method for solving linear, constant-coefficient difference equations by Laplace transforms was
introduced to graduate engineering students by Gardner and Barnes in the early 1940s. They
applied their procedure, which was based on jump functions, to ladder networks, transmission
lines, and applications involving Bessel functions. This approach is quite complicated and in a
separate attempt to simplify matters, a transform of a sampled signal or sequence was defined
in 1947 by W. Hurewicz as
∞
f (kT )z −k
X
Z [f (kT )] =
k=0
This will converge (the sum will be finite) just in case |z| > |a|.
Since for any geometric series
∞
c
crn =
X
n=0 1−r
the Z-transform sum above will be
1 z
X(z) = = ROC:|z| > |a|
1 − az −1 z−a
This equation for the Z-transform of x[n] = an u[n]
z
X(z) =
z−a
is a “rational function”, that is, a ratio of polynomials. We can characterize it by its zeros (the roots
of the numerator) and its poles (the roots of the denominator).
In this case there is one zero (z = 0) and one pole (z = a).
We also need to know the “region of convergence” (ROC) for the Z-transform (here |z| > |a|). For the
next section of exposition, we will neglect the ROC. This is not a good idea in general.
7
1.4 Properties of the Z-transform
It follows directly from the definition
x[n]z −n
X
X(z) =
n
x[n] = cx1[n]
then
cx1[n]z −n
X
X(z) =
n
x1[n]z −n
X
= c
n
= cX1(z)
and if
(x1[n] + x2[n])z −n
X
X(z) =
n
x1[n]z −n + x2[n]z −n
X X
=
n n
= X1(z) + X2(z)
Thus if
x[n] = an u[n] + bn u[n]
then
z z
X(z) = +
(z − a) (z − b)
z(z − b) + z(z − a)
=
(z − a)(z − b)
z(2z − (b + a))
=
(z − a)(z − b)
In addition to linearity, Z-transforms have a number of other properties that make them a useful tool
in analyzing LTI systems:
X(z) = X1(z)X2(z)
8
1.4.2 The shift property of Z-transforms
If
x[n] = x1[n − k]
then
X(z) = z −k X1(z)
x[n] = x1[−n]
then
X(z) = X1(1/z)
Y (z) = H(z)X(z)
and therefore
H(z) = Y (z)/X(z)
We can also express the same LTI system as a linear constant-coefficient difference equation, if the
system is causal:
Z {a0 y[n]} + Z {a1 y[n − 1]} + · · · + Z {aM y[n − M ]} = Z {b0 x[n]} + Z {b1 x[n − 1]} + · · · + Z {bN x[n − N ]}
Likewise we can pull the b and a constants outside each transformed term:
9
Now the shift property of the Z-transform lets us replace (for all k) Z {x[n − k]} with z −k Z {x[n]} and
Z {y[n − k]} with z −k Z {y[n]}:
Now all the Z {· · ·} expressions are either Z {x[n]} Z {[y]} and so we can replace them all with X(z)
or Y (z):
b0 + b1 z −1 + · · · + bN z −N
Y (z)/X(z) =
a0 + a1 z −1 + · · · + aM z −M
By convention, a0 is 1. We can multiply through by 1/b0 , replacing the b coefficients with c coefficients
such that cn = bn /b0 :
1 1 + c1 z −1 + · · · + cN z −N
" #
Y (z)/X(z) =
b0 1 + a1 z −1 + · · · + aM z −M
If a0 is already 1, then the c coefficients are just the same as the b coefficients. Thus the Z-transform of
the impulse response of such a system — ANY system described by a linear constant-coefficient difference
equation — is a ratio of polynomials in z −1 , where the coefficients in the numerator come from the x
(input) coefficients in the difference equation, and the coefficients in the denominator come from the y
(output) coefficients in the difference equation.
1 1 + c1 z −1 + · · · + cN z −N
" #
Y (z)/X(z) =
b0 1 + a1 z −1 + · · · + aM z −M
The fundamental theorem of algebra tells us that we can express the numerator and denominator
polynomials uniquely as a product of first order terms in z −1 = 1/z, with (complex) coefficients q1 , ..., qN
and p1 , ..., pM :
" #
1 (1 − q1 /z)(1 − q2 /z) · · · (1 − qN /z)
H(z) =
b0 (1 − p1 /z)(1 − p2 /z) · · · (1 − pM /z)
In this equation, each of the qn and pn is a complex number. If z equals one of the qn , the numerator
becomes zero, and so does the whole function H(z). These values are called “zeros” of the system. As z
approaches one of the pn , the denominator approaches zero, and the value of the function H(z) approaches
infinity. These values are called “poles” of the system; graphically, you can imagine putting the complex
plane on the ground, so that a pole sticks up into the sky at the point where z = pn .
Starting with our original linear constant-coefficient difference equation
we have given a mechanical procedure for deriving an expression for the (factored form of the) Z-
transform of the system impulse resonse h[n]:
10
" #
1 (1 − q1 /z)(1 − q2 /z) · · · (1 − qN /z)
H(z) =
b0 (1 − p1 /z)(1 − p2 /z) · · · (1 − pM /z)
If all of the a coefficients are zero, then the output is just a moving weighted average of the input, and
the denominator of our expression for H(z) will be 1. This is a filter with only zeros and no poles, an
“all-zero” filter.
Conversely, if all of the b coefficients are 0 (except of course for b0 which must be 1, or the system
would ignore the input!), then the current output sample is basically predicted as a weighted combination
of the earlier output samples, and the numerator of our expression for H(z) will be 1. This is a filter with
only poles and no zeros, an “all-pole” filter.
If the difference equation has both a and b coefficients, then the filter has both poles and zeros.
Some important properties of poles and zeros follow from this algebraic background. For example,
if we are dealing with real-valued signals – as we normally are – than the coefficients of the LCCDE
must obviously also be real. It follows that the roots of the numerator and denominator polynomials in
the Z-transform will either be real, or will come in complex-conjugate pairs. Thus the zeros and poles
will likewise either be real, or will come in complex-conjugate pairs, since they are just the roots of the
numerator and denominator polynomials.
1.8 From the poles and zeros of the Z-transform to the spectrum
Since we know that the Z-transform reduces to the DTFT for z = ejω , and we know how to calculate the
Z-transform of any causal LTI (i.e. the Z-transform of its impulse response) from the coefficients of the
difference equation, we can write down an expression for its spectrum (i.e. the ratio of the spectrum of
the output to the spectrum of the input) by simply setting z = ejω in the Z-transform.
This expression will be a periodic, continuous function of the variable ω, which is frequency in radians.
The period is obviously 2π. We can get N evenly-spaced samples of one period of the spectrum (which is
n
also what the DFT gives us) by setting (in Matlab-ish) ω to Nn 2π for n = 0 to N − 1, i.e. z = ej N 2π for
n = 0 to N − 1.
11
This is quite easy to calculate. It is also especially easy to think about graphically, if we represent the
system in pole-zero form:
12
Now we just evaluate the factored form of the Z-transform of a system with these poles (and the single,
necessary, degenerate zero). This is the product of the differences between the z-values (here the points
on the unit circle) and the roots r and r0 . The sampled spectrum is just the inverse of this product of
distances:
Of course, an alternative method of calculation – generally simpler in practice – is simply to take the
DFT of the impulse response of the filter.
13
1.9 Relationship between pole amplitude and damping
1.10 Stability of recursive filters
[TK]
In matrix notation,
p = Sa
where p is an M × 1 matrix (column vector) of predicted values, S is an M × N matrix whose ith
row contains speech samples s(i) to s(i + N − 1), and a is an N × 1 matrix (column vector) of prediction
weights.
Thus for a third-order predictor, with input samples s1 · · · sM and predictor coefficients [a(1)a(2)a(3)],
p = Sa means
p(4) s(1) s(2) s(3)
p(5)
s(2) s(3) s(4)
p(6)
s(3) s(4) s(5)
a(1)
= × a(2)
p(7) s(4) s(5) s(6)
.. .. .. .. a(3)
. . . .
p(M ) s(M − 3) s(M − 2) s(M − 1)
Notice that this is closely related to the previously-discussed concept of a recursive filter, and exemplifies
why such filters are commonly called “autoregressive”.
In this case, the a(i) are just the coefficients of the left-hand (output) side of a linear constant-coefficient
difference equation — ignoring the a(0) coefficient in of an LDCE, which is 1 by convention.
We know S in p = Sa, but not a or p. What should we do? Suppose that we want to choose the
weights a so as to minimize the prediction error
norm(Sa − s)
This is what is called a “least squares” problem: find a vector x providing the “best” solution to an
overdetermined system of equations Ax = b, where “best” means minimizing norm(Ax − b).
There are a variety of methods of solving such problems, depending on the properties of A and b.
MATLAB offers us two solutions, x = A \ b and x = pinv(A)*b. We’ve seen the “backslash” A \ b
solution before; we’ll discuss the pseudoinverse solution pinv(A)*b at greater length later.
For now, we’ll just observe that if A has more rows than columns and is not of full rank, then “choose
x to minimize norm(Ax − b)” does not have a unique solution. The solution x = pinv(A)*b give us the
14
smallest x (x minimizing norm(x)) while the solution x = A \ b gives the x with the fewest possible
nonzero components.
We’ll examine the this question as we come to it in practice, but if there is a difference, the pseudo-
inverse solution is probably the one that we want.
Let’s get a chunk of speech to work with:
>> load(’audio1’);
>> S1 = S(2246:2347);
>> plot(S1);
>> length(S1)
ans =
102
This is just a convenient piece of the middle of a vowel from the audio file we have used before. It
represents as close to 3 pitch periods as we can get in this sampled signal:
function A = makeA(S,N)
%
% Set up matrix for linear prediction calculation
% S is a vector of length M whose ith sample will be predicted
% as a linear combination of the N previous samples.
% Thus we want to set up a matrix A,
% with M-N rows and N columns,
% whose jth row consists of the samples from S(j) to S(j+N-1).
% Obviously M must be greater than N.
AA = hankel(S,S(1:N));
A = AA(1:(length(S)-N),:);
15
Thus we get
>> makeA(1:10,3)
ans =
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
The role of b in our equation Ax = b will be played by the input samples from S(N + 1) to S(M ).
A is full rank
>> rank(A)
ans =
14
>> x = pinv(A)*b;
>> x’
ans =
Columns 1 through 7
-0.1660 0.3545 -0.4917 0.5188 -0.7772 1.0481 -1.3567
Columns 8 through 14
1.1888 -1.5503 1.4317 -1.0071 0.8159 -1.2254 1.4033
16
Let’s look at the (windowed version of the) original signal, and its DFT spectrum:
>> S2 = hamming(length(S1)).*S1;
>> plot(S2);
>> q = zeros(512,1);
>> q(1:length(S2)) = S2;
>> lspecplot(fft(q), 8000);
17
Now let’s look at the impulse response of the recursive filter whose coefficients are x — well, OK, whose
coefficients are 1 followed by -1 times x backwards... we have to obey the conventions derived from the
LCDE formulation:
>> lspecplot(fft(impresp),8000);
18
Now let’s figure out what the pole frequencies and amplitudes are:
>> angle(roots(fcoef))*4000/pi
ans =
1.0e+03 *
3.2923
-3.2923
2.9825
-2.9825
2.3412
-2.3412
1.4361
-1.4361
1.4294
-1.4294
0.4731
-0.4731
0.6518
-0.6518
>> abs(roots(fcoef))
ans =
0.9653
0.9653
0.8436
0.8436
0.8628
0.8628
0.9703
0.9703
0.8048
0.8048
0.9803
0.9803
0.7576
0.7576
In tabular form, the seven complex poles, corresponding to the seven complex-conjugate roots of the
predictor polynomial, have frequencies and amplitudes of
Frequency (Hz) Amplitude (0-1)
473.1 .9803
651.8 .7576
1,429.4 .8048
1,436.1 .9703
2,341.2 .8628
2,982.5 .8436
3,292.3 .9653
Note that by performing a 14-th order analysis, we’ve guaranteed that we will get seven complex poles
(well, we might have substituted a couple of real poles for one of them). If we want to use this method to
find the vowel formants, we have to decide which poles correspond to formats and which do not. Here, it
is clear (because of the amplitude) that F 1 is 473 Hz., and F 2 is 1,436 Hz. (though of course the apparent
estimation accuracy is spurious). It is less clear what F 3 should be.
19
1.12 Inverse filtering
Suppose we have a causal LTI system S1, with impulse response h, whose z transform, H(z), is a ratio of
polynomials Bpoly /Apoly .
Now suppose we have another causal LTI system S2, with impulse response g, whose Z-transform,
G(z), happens to be exactly the inverse, namely Apoly /Bpoly .
If we convolve some input x with h, and then convolve the result with g – x ∗ h ∗ g – this is equivalent
to convolving x with the convolution of h with g – x ∗ (h ∗ g).
By the convolution rule for the Z-transform, the Z-transform of (h ∗ g) will be the product H(z)G(z).
Given the way we constructed the two systems, this product will be 1 for all values. Therefore the combined
system (h ∗ g) will do nothing to its input.
More interestingly, if we already have the output S1(x) = x ∗ h, we can apply the second system to
the result to get the original input back: S2(S1(x)) = x. This process is called inverse filtering .
Let’s try this with a simple one-pole recursive filter with a center frequency half the sampling rate,
excited by an impulse:
r = .95*exp(i*.5*pi);
A= poly([r r’]);
impulse=zeros(256,1); impulse(15)=1;
out1 = filter([1 0 0],A,impulse);
out1(1:30)’
ans =
Columns 1 through 7
0 0 0 0 0 0 0
Columns 8 through 14
0 0 0 0 0 0 0
Columns 15 through 21
1.0000 0.0000 -0.9025 -0.0000 0.8145 0.0000 -0.7351
Columns 22 through 28
-0.0000 0.6634 0.0000 -0.5987 -0.0000 0.5404 0.0000
Columns 29 through 30
-0.4877 -0.0000
out2 = filter(A,[1 0 0],out1);
out2(1:30)’
ans =
Columns 1 through 12
0 0 0 0 0 0 0 0 0 0 0 0
Columns 13 through 24
0 0 1 0 0 0 0 0 0 0 0 0
Columns 25 through 30
0 0 0 0 0 0
20