
APPLICATION OF QUADRATURE MIRROR FILTERS TO SPLIT BAND VOICE CODING SCHEMES

D. Esteban

and

C. Galand

IBM Laboratory
06610 La Gaude, France

Abstract
This paper deals with applications of Quadrature Mirror Filters (QMF) to the coding of voice signals in sub-bands. Use of QMF's makes it possible to avoid the aliasing effects due to sample decimation when the signal is split into sub-bands. Each sub-band is then coded independently with Block Companded PCM (BCPCM) quantizers. A variable number of bits is allocated to each sub-band quantizer in order to take advantage of the relative perceptual effect of the quantizing error.
The paper is organized as follows:
- First, splitting in two sub-bands with QMF's is analysed.
- Then, a general description of a split-band voice coding scheme using QMF's is made.
- Finally, two coding schemes are considered, operating respectively at 16 KBps and 32 KBps. Averaged values of S/N performances are given when encoding both male and female voices. Comparisons are made with conventional BCPCM and CCITT A-Law.
Taped results will be played at the conference.

1) Introduction

Decomposition of the voice spectrum in sub-bands has been proposed by R. Crochiere et al. /1/ as a means to reduce the effect of quantizing noise due to coding. The main advantages of this approach are the following:
- first, to localize the quantizing noise in narrow frequency sub-bands, thus preventing noise interference between these sub-bands,
- second, to enable the attribution of bit resources to the various frequency bands according to perceptual criteria.
As a result, the quantizing noise is perceptually more acceptable, and the signal to noise ratio is improved.
The implementation proposed in /1/ is straightforward and takes advantage of a bank of non-overlapping band-pass filters. Unfortunately, if the aliasing effects due to decimation are to remain imperceptible, this approach needs sophisticated band-pass filters.
The split-band coding scheme we propose here avoids these inconveniences. Quasi perfect sub-band splitting can be achieved by use of Quadrature Mirror Filters (QMF) /2/ associated with decimation/interpolation techniques.

2) QMF band splitting

Principle
Let us consider for explanation purposes Fig. 1, in which we describe the decomposition of a sampled signal into two contiguous sub-bands, where:
- H1 is a sampled half band low pass filter with an impulse response h1(n),
- H2 is the corresponding half band mirror filter, i.e. one which satisfies the following magnitude relation

$$|H_1(e^{j\omega T})| = |H_2(e^{j(\omega_s/2 - \omega)T})| \qquad (1)$$

where $\omega_s = 2\pi/T$ denotes the sampling rate and $H_1(e^{j\omega T})$ denotes the Fourier Transform of h1(n),
- K1 is a half band low pass filter with an impulse response k1(n) and K2 is the corresponding mirror filter of K1.

After frequency limiting to fs/2, the signal x(t) is sampled at fs and filtered by H1 and H2. The obtained signals x1(n) and x2(n) represent respectively the low and high half-bands of x(n). As their spectra occupy half the Nyquist bandwidth of the original signal, the sampling rate in each band can be halved by ignoring every second sample. For reconstruction, the signals y1(n) and y2(n) are interpolated by inserting one zero valued sample between each sample and filtered by K1 and K2 before being added to give the signal s(n).

Fig. 1: Principle of 2 sub-bands splitting by use of half-band mirror filters.


Let us analyse the structure of Fig. 1. If X(z), H1(z) and X1(z) represent respectively the z transforms of x(n), h1(n) and x1(n), then

$$X_1(z) = H_1(z)\,X(z) \qquad (2)$$

The z transform Y1(z) of the decimated signal y1(n) and the z transform U1(z) of the interpolated signal u1(n) are given by /3/

$$Y_1(z) = \tfrac{1}{2}\{X_1(z^{1/2}) + X_1(-z^{1/2})\} \qquad (3)$$

$$U_1(z) = Y_1(z^2) \qquad (4)$$

After final filtering, the z transform of t1(n) is

$$T_1(z) = K_1(z)\,U_1(z) \qquad (5)$$

where K1(z) represents the z transform of k1(n).

Combining relations (2)-(5) gives

$$T_1(z) = \tfrac{1}{2}\{H_1(z)X(z) + H_1(-z)X(-z)\}\,K_1(z) \qquad (6)$$

The z transform T2(z) is derived in a similar manner

$$T_2(z) = \tfrac{1}{2}\{H_2(z)X(z) + H_2(-z)X(-z)\}\,K_2(z) \qquad (7)$$

The z transform S(z) of the signal s(n) is obtained by adding relations (6) and (7)

$$S(z) = \tfrac{1}{2}\{H_1(z)K_1(z) + H_2(z)K_2(z)\}X(z) + \tfrac{1}{2}\{H_1(-z)K_1(z) + H_2(-z)K_2(z)\}X(-z) \qquad (8)$$

The second term of this sum represents aliasing effects due to decimation and can be eliminated if we choose K1 and K2 appropriately. First, we must satisfy the symmetry relation (1). This is elegantly solved if H1 is a finite impulse response (FIR) filter

$$H_1(z) = \sum_{n=0}^{N-1} h_1(n)\,z^{-n} \qquad (9)$$

and

$$H_2(z) = H_1(-z) = \sum_{n=0}^{N-1} h_1(n)(-1)^n z^{-n} \qquad (10)$$

It can be seen that the impulse response h2(n) of the mirror filter H2 is obtained by inverting every second sample of h1(n).

We can now cancel the second term of (8) by choosing

$$K_1(z) = H_1(z) \qquad (11)$$

$$K_2(z) = -H_2(z) = -H_1(-z) \qquad (12)$$

Equation (8) now becomes

$$S(z) = \tfrac{1}{2}\{H_1^2(z) - H_1^2(-z)\}X(z) \qquad (13)$$

Let us evaluate this relation on the unit circle

$$S(e^{j\omega T}) = \tfrac{1}{2}\{H_1^2(e^{j\omega T}) - H_1^2(e^{j(\omega + \omega_s/2)T})\}\,X(e^{j\omega T}) \qquad (14)$$

If we choose for H1 a symmetrical FIR filter, its Fourier transform $H_1(e^{j\omega T})$ can be expressed in terms of its magnitude $H_1(\omega)$

$$H_1(e^{j\omega T}) = H_1(\omega)\,e^{-j(N-1)\pi\omega/\omega_s} \qquad (15)$$

Substituting in (14) gives

$$S(e^{j\omega T}) = \tfrac{1}{2}\{H_1^2(\omega) - H_1^2(\omega + \tfrac{\omega_s}{2})\,e^{-j(N-1)\pi}\}\,e^{-j(N-1)2\pi\omega/\omega_s}\,X(e^{j\omega T}) \qquad (16)$$

Two cases are to be considered, depending on the parity of N.

First case, N even

$$S(e^{j\omega T}) = \tfrac{1}{2}\{H_1^2(\omega) + H_1^2(\omega + \tfrac{\omega_s}{2})\}\,e^{-j(N-1)\omega T}\,X(e^{j\omega T}) \qquad (17)$$

Considering the case of perfect filters,

$$H_1^2(\omega) + H_1^2(\omega + \tfrac{\omega_s}{2}) = 1 \qquad (18)$$

we get

$$S(e^{j\omega T}) = \tfrac{1}{2}\,e^{-j(N-1)\omega T}\,X(e^{j\omega T})$$

or

$$s(n) = \tfrac{1}{2}\,x(n-N+1) \qquad (19)$$

The signal is perfectly reconstructed (neglecting the gain factor 1/2) with a delay of (N-1) samples.

Second case, N odd

$$S(e^{j\omega T}) = \tfrac{1}{2}\{H_1^2(\omega) - H_1^2(\omega + \tfrac{\omega_s}{2})\}\,e^{-j(N-1)\omega T}\,X(e^{j\omega T}) \qquad (20)$$

In this case, the original signal cannot be perfectly reconstructed: it can be seen from (16) that the amplitude at $\omega = \omega_s/4$ is always zero.

To summarize, we have defined a set of conditions for perfect reconstruction:
- H1 is a symmetrical FIR filter of even order,
- $H_2(z) = H_1(-z)$,
- $K_1(z) = H_1(z)$,
- $K_2(z) = -H_2(z)$,
- $H_1^2(\omega) + H_1^2(\omega + \omega_s/2) = 1$.

Implementation

Fig. 2a gives an efficient implementation of the QMF band splitting, using a symmetrical FIR half band filter with an even number of coefficients. The input signal x(t) is sampled at fs and filtered by H1 and H2, giving the low-band channel x1(n) and the high-band channel x2(n). Then the sampling rate is decreased to fs/2 by dropping every second sample, giving the signals y1(n) and y2(n).

Fig. 2b shows the reconstruction of the initial signal with the same filter. First, the sampling rate is increased to fs by inserting one zero valued sample between each sample of y1(n) and y2(n), giving two signals u1(n) and u2(n). Then these signals are filtered by H1 and H2, and the signal s(n) is obtained by subtracting the filtered signals t1(n) and t2(n).

The total number of multiplications to perform per initial sampling interval (splitting and reconstruction) is equal to the filter length N; the number of additions is of the order of N.

Fig. 2a: Quadrature channels splitting

Fig. 2b: Quadrature channels reconstruction

3) Split-band voice coding scheme based on QMF

Sub-bands tree decomposition
In the previously described implementation, a signal x(t) was sampled at fs to give a signal x(n), and split into two signals y1(n) and y2(n) with reduction of the sampling rate to fs/2. This decomposition can be extended to more than two sub-bands by applying to y1(n) and y2(n), which represent respectively the low sub-band and the high sub-band of x(n), the same decomposition process as to the initial signal x(n) (see Fig. 4). Four signals are thus obtained with reduction of the sampling rate to fs/4. The spectrum of each of these signals represents the spectrum of x(n) in the corresponding sub-band.

This decomposition can be generalized by repeating the process p times. The initial signal is thus split into 2^p signals sampled at fs/2^p by a p-stage tree arrangement of decimation filters of the type shown in Fig. 2a. As the i-th stage includes 2^(i-1) filters, the total number of filters is 2^p - 1. The resulting information rate after p stages is the same as that of the original signal.

Quantization of the sub-band signals
As mentioned in /1/, and due to the fact that the sub-band signals are narrow band and Nyquist sampled, the sample-to-sample correlation of these signals is low. Consequently, straight PCM encoding techniques are preferred to differential methods.

An efficient and simple approach to code the sub-band signals is obtained by means of the Block Companded PCM (BCPCM) coding scheme /4/. This type of companding was initially proposed for full band coding of speech waveforms, but can be straightforwardly applied to sub-band encoding. The principle of BCPCM coding can be summarized as follows:
- The samples are encoded on a block basis. For each block of M samples, a scale factor is chosen in such a way that the largest sample in the block will not fall out of the coded range.
- Then, the M samples of the block are quantized with respect to the obtained scale factor, and both the coded values and the scale factor are transmitted.
The overhead bit rate necessary for the transmission of the scale factor is inversely proportional to the length of the blocks, but this length must be chosen so as to take into account the formant evolution. For full band coding, a length of 8 to 16 ms has been found satisfactory.

The main advantages of BCPCM are a low overhead information rate, a very large dynamic range, and no transient clipping. Fig. 3 shows the adaptation of the scale factor to the signal, considering three consecutive blocks, and assuming 3 bits quantization.

Fig. 3: Block Companded PCM (BCPCM) principle

The BCPCM coding scheme has been used with success in conjunction with the QMF band splitting, assigning a different number of bits to code each frequency sub-band so as to weight the perceptual effect of the quantizing noise in the voice spectrum. Examples of bit allocation will be discussed in section 4. After quantization (see Fig. 4), the signals and scale factors from all channels are time multiplexed and transmitted.

Fig. 4: Four sub-bands SVCS with QMF and BCPCM
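
The block companding principle described in this section (one scale factor per block, then quantization of the M samples relative to it) can be sketched as follows. This is only an illustration under stated assumptions: the paper does not specify the quantizer law or how the scale factor itself is coded, so a uniform mid-rise quantizer and an unquantized scale factor are assumed here, and the function names are hypothetical.

```python
import numpy as np

def bcpcm_encode(block, bits):
    """Return (scale factor, integer codes) for one block of samples.

    Assumption: the scale factor is the largest magnitude in the block,
    so no sample falls outside the coded range.
    """
    block = np.asarray(block, dtype=float)
    scale = float(np.max(np.abs(block)))
    levels = 2 ** bits
    if scale == 0.0:
        return scale, np.zeros(len(block), dtype=int)
    step = 2.0 * scale / levels                  # uniform step over [-scale, scale]
    codes = np.floor(block / step).astype(int)
    codes = np.clip(codes, -levels // 2, levels // 2 - 1)
    return scale, codes

def bcpcm_decode(scale, codes, bits):
    """Rebuild the samples at the centre of each quantization interval."""
    step = 2.0 * scale / (2 ** bits)
    return (np.asarray(codes) + 0.5) * step

# Three consecutive blocks of very different level, 3 bits per sample.
rng = np.random.default_rng(0)
blocks = [a * rng.uniform(-1.0, 1.0, 20) for a in (0.01, 1.0, 100.0)]
for b in blocks:
    scale, codes = bcpcm_encode(b, bits=3)
    err = np.max(np.abs(bcpcm_decode(scale, codes, bits=3) - b))
    print(f"scale={scale:.4g}  max error={err:.4g}")
```

Because the step size follows the scale factor, the quantizing error stays proportional to the block level, which is the large dynamic range property mentioned above. In a sub-band coder the `bits` argument would differ from band to band according to the bit allocation.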

2^p sub-bands reconstruction
At the receiving end, the data is demultiplexed and decoded. The reconstruction of the speech signal is made by a p-stage tree arrangement of filters of the type shown in Fig. 2b.

If the same filter of N taps is used for each stage, the number of multiplies per input sample for the whole 2^p sub-bands decomposition/reconstruction is Np. In fact, the filter constraints can be relaxed from stage to stage with respect to the bandwidth so as to optimize the total processing. It has been shown in section 2 that there is a delay of (N-1) samples between the original and reconstructed signals in the case of two sub-bands splitting. Consequently, the number of delayed samples is (2^p - 1)(N-1) for the 2^p sub-bands splitting.

4) Simulation of Split-band Voice Coding Scheme

In this section, two Split-band Voice Coding Schemes (SVCS) are considered. The first one operates at a bit rate of 16 KBps and provides a quality sufficient for telephony applications; the second operates at a bit rate of 32 KBps and gives a quality comparable to that provided by standard companded laws. The characteristics of these two coders are given hereafter.

16 KBps SVCS
- input signal band limited to 0-4000 Hz
- sampling rate: 8 KHz
- number of sub-bands: 8
- bit allocation: 3 3 3 1 1 1 1 1
- block duration: 20 ms (160 samples)
- number of overhead bits: 40

32 KBps SVCS
The characteristics of this coder are the same as those of the previous one, except for the bit allocation, which has been increased to 5 5 5 4 4 3 3 1.

Performance
The performance of the two considered SVCS has been evaluated by comparison with conventional BCPCM coders operating at the same bit rate. For convenience, two types of BCPCM coders have been considered, the first one operating in PCM mode, the second one being able to take a PCM/DPCM decision /4/, so as to encode the highly correlated blocks of samples in differential mode.

The experiments were made on a set of utterances pronounced by 7 speakers (4 female voices and 3 male voices) representing a total duration of 3.5 minutes of continuous speech. The averaged signal to noise ratios are given in Table 1.

Table 1: Comparative performances (dB) of BCPCM and SVCS coders.

Coder                      16 KBps   32 KBps
BCPCM (PCM mode)              …         21
BCPCM (PCM/DPCM mode)        11         24
SVCS                         14         25
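
The two bit allocations of section 4 can be checked against the nominal bit rates with a few lines of arithmetic. The sketch below rests on assumptions that the text implies but does not state outright: 8 sub-bands obtained from an 8 KHz input (so each sub-band is sampled at 1 KHz), and the same 40 overhead bits per 20 ms block for both coders. The function name is hypothetical.

```python
def svcs_bit_rate(bit_allocation, input_rate_hz=8000, block_ms=20, overhead_bits=40):
    """Total bit rate (bits/s) of a split-band coder.

    bit_allocation: bits per sample in each sub-band, low band first.
    Each sub-band is assumed to run at input_rate_hz / number of sub-bands.
    """
    n_bands = len(bit_allocation)
    band_rate_hz = input_rate_hz / n_bands
    sample_bits_per_s = sum(bit_allocation) * band_rate_hz
    overhead_bits_per_s = overhead_bits * 1000 / block_ms
    return sample_bits_per_s + overhead_bits_per_s

rate_16k = svcs_bit_rate([3, 3, 3, 1, 1, 1, 1, 1])
rate_32k = svcs_bit_rate([5, 5, 5, 4, 4, 3, 3, 1])
print(rate_16k, rate_32k)
```

Under these assumptions the sample bits contribute 14 and 30 kbit/s respectively, and the scale factors add 2 kbit/s in both cases.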

It must be noted that, for BCPCM coders, the PCM/DPCM decision enables a signal-to-noise improvement (SNRI) of 3 dB. This improvement is not surprising and is in accordance with the well-known results of conventional PCM /5/. Moreover, it can be seen that split-band coding techniques provide an SNRI over full-band techniques. This improvement is 3 dB in the case of 16 KBps, and only 1 dB in the case of 32 KBps. However, as noticed in /1/, it has been observed that for SVCS, the subjective level of the quantizing noise is lower than for BCPCM, resulting in a more pleasant voice quality.
The previously described 16 KBps SVCS provides a speech quality which is sufficient for telephony applications. Furthermore, listening tests have shown that it is not possible to tell the difference between the 32 KBps SVCS and the CCITT 64 KBps A-Law, although the measured signal to noise ratios are respectively 25 dB and 37 dB.

5) Conclusions
The application of Quadrature Mirror Filters to Split-band Voice Coding Schemes has been discussed. As noticed in /1/, sub-band coding results in a signal to noise improvement over full-band coding. Moreover, the subjective effects of quantizing noise are reduced, resulting in a more pleasant coding quality.
Use of QMF makes it possible to avoid aliasing effects due to decimation. Consequently, band splitting can be performed up to a large number of sub-bands without using sophisticated filters.
Two SVCS have been described, using BCPCM techniques and operating at 16 KBps and 32 KBps. The first one gives a speech quality which is sufficient for telephony applications. The second allows a quality comparable to that provided by the standard 64 KBps PCM code, thus achieving a halving of the bit rate for speech encoding.

References
/1/ R.E. Crochiere, S.A. Webber, J.L. Flanagan,
"Digital coding of speech in sub-bands",
1976 Int'l IEEE Conf. on ASSP, Philadelphia.
/2/ A. Croisier, D. Esteban, C. Galand,
"Perfect channel splitting by use of interpolation/decimation/tree decomposition techniques",
1976 Int'l Conf. on Information Sciences and Systems, Patras.
/3/ R.W. Schafer, L.R. Rabiner,
"A digital signal processing approach to interpolation",
Proc. IEEE, Vol. 61, pp. 692-702, June 1973.
/4/ A. Croisier,
"Progress in PCM and delta modulation: block companded coding of speech signals",
1974 Int'l Zurich seminar.
/5/ K.W. Cattermole,
"Principles of pulse code modulation",
London, Iliffe Books Ltd.
