
ECNG 3001 Communication Systems II

Topic 3 Source Coding


Source coding introduction
A/D Conversion
Sampling Theorem

Encoding
Waveform Coding - Pulse Modulation (PM) Techniques: PAM, PCM, Companding

Huffman Coding Algorithm

Performance in source coding

Source Coding Introduction

A source encoder (source coder) converts the output of an information source into a code whose form may be processed at the transmitter end of a communications link
ECNG 3001 is concerned with digital source codes
How are analog or digital source outputs prepared for source coding?

Source Coding - prep for coding
Before source coding can occur:
The source's output must be represented in electronic form
Transduce non-electronic source outputs, e.g. using a microphone, sensor, etc.
Condition electronic sources (apply impedance matching, filtering, etc.)
The source's output must also be represented in digital form
Convert analog source information to digital via A/D conversion
Sample
Quantize
Directly encode digital source outputs
May employ sampling and/or quantization

Preparing source types - Analog
Analog source outputs - convert to electronic, digital form
Analog non-electronic: Source → Transducer → A/D Converter → Source Coder
Analog electronic: Source → Signal Conditioner → A/D Converter → Source Coder

Preparing source types - Digital
Input is already digital - convert to electronic form
Digital non-electronic: Source → Transducer → Source Coder
Digital electronic: Source → Signal Conditioner → Source Coder

Preparing source types for digital communication
Digital source outputs generally require less processing than analogue prior to source coding
Analog inputs must be digitized
What is the digitization process for analog sources?
Transduce the analog source output if not already electronic (make electronic)
Analog-to-digital (A/D) conversion (make digital)
Sample
Quantize

Analog to digital conversion
The process of converting a continuous-time, continuous-amplitude (analog) source output into a discrete-time, discrete-amplitude (digital) signal

The A/D Process
Obtain the instantaneous value of the analog input at regular intervals - sampling
Approximate instantaneous (still continuous) values to a set of finite values - quantization
[Figure: original signal before sampling, with the quantized signal overlaid]

Sampling
The electronic analog source is tapped at regular intervals to acquire its instantaneous value
But why sample?
As the first step in digitizing the source output - discretizing the source output in time
The original (source) signal can be approximated at the receiver by interpolating between successive restored sample values
How often should the source be tapped or sampled?

Sampling - how often?
Original source output: sine wave
a) Sample once per cycle - constant level recovered - completely distorted
b) 1.5 times per cycle - lower-frequency sine wave recovered (less distorted than in a))
c) 2 times per cycle - recovered signal begins to resemble the original sine wave
d) > 2 times per cycle - even better!

Sampling - how often
It is clear that sampling at a rate of at least 2x the source's output frequency allows a reasonably accurate reconstruction of the original signal
The illustration was for a single frequency. An analog source may generate multiple frequencies. Considerations?
Observations and considerations for complex signals are summarised in Nyquist's sampling theorem

The Nyquist Sampling Theorem
If a signal has a maximum frequency of B Hz, it is completely defined by samples which occur at intervals of 1/(2B) seconds
The interval 1/(2B) seconds is also known as the sampling interval, Ts
In other words, a signal band-limited to B Hz can be completely reconstructed from samples taken at a minimum rate of 2B samples per second
Or, the sampling frequency must be equal to or greater than twice the highest frequency component of the signal

Proof: Nyquist Sampling Theorem
Let the discrete signal (written x_δ(t) here to distinguish it from the source output x(t)) represent the result of sampling the source output using impulses at the nTs time instants
x_δ(t) is the time-domain product of the electronic source output x(t) and a pulse train Σ δ(t − nTs):

$$x_\delta(t) = \sum_{n} x(nT_s)\,\delta(t - nT_s) \quad\text{or,}\quad x_\delta(t) = x(t)\sum_{n}\delta(t - nT_s)$$

since each sampled pulse is defined only when t = nTs, yielding the value of x(t) only at each multiple n of Ts

Proof of Nyquist Sampling Theorem (2)
Taking the Fourier Transform of both sides yields:

$$X_\delta(f) = X(f) * \mathcal{F}\left\{\sum_n \delta(t - nT_s)\right\} = X(f) * \frac{1}{T_s}\sum_n \delta\!\left(f - \frac{n}{T_s}\right), \quad\text{thus}$$

$$X_\delta(f) = \frac{1}{T_s}\sum_n X\!\left(f - \frac{n}{T_s}\right)$$

In words, the Fourier Transform of the product of the impulse train and original signal is a repetition of the FT of the original signal, with each replica separated by (1/Ts) Hz

Nyquist sampling - Fourier Analysis
[Figure: original signal and its FT; multiplying pulse train; sampled signal and its FT]
Multiplication in the time domain results in replication of the original FT in the frequency domain, each replica shifted by 1/Ts

Nyquist Sampling: Theorem Result
For a source signal x(t) of bandwidth B, spectrum replicas are adjacent and do not overlap for Ts = 1/(2B), or equivalently for fs = 2B
fs = 2B - adjacent spectra
For Ts < 1/(2B) or fs > 2B, replicas are separated from each other
Spectrum overlap occurs for Ts > 1/(2B), or equivalently for fs < 2B (a time-domain change has the inverse effect in the frequency domain - Fourier)
Results in distortion of the source output's spectrum - the receiver cannot completely recover the original signal
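A quick numerical illustration of the fs < 2B case (a minimal NumPy sketch, not from the slides; the tone and sampling rate are arbitrary choices):

```python
import numpy as np

f, fs = 3000.0, 5000.0   # tone at 3000 Hz sampled below Nyquist (2f = 6000 Hz)
n = np.arange(8)         # sample indices

samples = np.sin(2 * np.pi * f * n / fs)
# Sampled, the tone is indistinguishable from one at f - fs = -2000 Hz,
# i.e. an inverted ("folded") 2000 Hz tone inside the 0..fs/2 band:
alias = np.sin(2 * np.pi * (f - fs) * n / fs)

print(np.allclose(samples, alias))  # True: the receiver cannot tell them apart
```

Here the folding frequency is fs/2 = 2500 Hz, and the 3000 Hz component reappears folded to 2000 Hz - exactly the spectral-folding effect described on the following slide.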

Sampling Problem: Aliasing
For overlapping replicas of the source spectrum (under-sampling), distortion is due to:
The loss of the tail of any spectrum copy beyond |f| > fs/2 Hz (the folding frequency)
The reappearance of this tail inverted or folded onto an adjacent spectrum replica (spectral folding or aliasing)

Effect of Aliasing
To reconstruct the signal at the receiver:
A single signal spectrum copy is selected by appropriate band filtering with an effectively ideal low-pass filter of bandwidth fs/2 Hz
However, an aliased signal cannot be recovered completely
Folding results in loss of some frequency components after filtering
Remaining frequency components sum with their folded versions from the adjacent spectrum copy
How can this be resolved?

Solving the Aliasing problem
In practical systems, sampling is done at a higher rate than Nyquist
Allows the construction of a realizable filter at the receiver to extract one spectral replica by creating a guard band between adjacent source spectrum replicas. The filter response need not approach ideal
Size of the guard band denoted by BG = fs - 2B for source output bandwidth B and sampling frequency fs

Example 1
A bandlimited signal has a bandwidth equal to 3400 Hz. What sampling rate should be used to guarantee a guard band of 1200 Hz?
Let fs = sampling frequency, B = signal bandwidth, and BG = guard band.
fs = 2B + BG
fs = 2(3400) + 1200 = 8000 Hz, or 8000 samples/sec.

Example 2
A source with bandwidth 4000 Hz is sampled at the Nyquist rate. Assuming that the resulting sampled sequence can be approximately modeled by a DMS with alphabet A = {-2, -1, 0, 1, 2} and corresponding probabilities {1/2, 1/4, 1/8, 1/16, 1/16}, determine the rate at which information is generated at this source in bits per second.
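A worked solution (the arithmetic is not shown on the slide):
H(X) = (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/16)(4) + (1/16)(4) = 1.875 bits/sample
Nyquist rate = 2(4000) = 8000 samples/sec
Information rate = 8000 × 1.875 = 15000 bits/sec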

The next step: Quantization
Quantization: the process of approximating a continuous range of values with a relatively small set of integer values or discrete symbols
Source output is digitized in amplitude in this process - the output now becomes fully digitized

Quantization - the process
The continuous signal has amplitudes within the range (-mp, mp)
This range is segmented into L equally-spaced levels - termed linear quantization
A representation point is selected either at each of the L levels or at the mid-point of the sub-range between adjacent levels, to which the source's output samples are rounded
Thus, the process produces only a finite set of amplitude values - digitized amplitudes

Determining Quantization Error
Consider a sampled source output quantized to 8 levels
Peak-to-peak input = 2mp volts
Number of quantization levels = L
Spacing between each level = ΔV = 2mp/L
Quantization error q for a sampled input value x, after quantization, is q = (x - x̂), where x̂ is the level to which the actual value of x is quantized
Max. error qmax = ΔV/2
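A minimal uniform quantizer sketched in Python (an illustration, not from the slides; the function name and mid-point representation levels are my assumptions):

```python
import numpy as np

def uniform_quantize(x, mp=1.0, L=8):
    """Map x in (-mp, mp) onto L levels, returning the mid-point of each interval."""
    dv = 2 * mp / L                      # interval width, delta-V = 2*mp/L
    idx = np.floor((x + mp) / dv)        # which of the L intervals x falls in
    idx = np.clip(idx, 0, L - 1)         # clamp samples at the range edges
    return -mp + (idx + 0.5) * dv        # representation point (interval mid-point)

x = np.random.uniform(-1.0, 1.0, 100_000)
xq = uniform_quantize(x)
print(np.abs(x - xq).max())              # max |q| approaches delta-V/2 = 0.125 here
```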

Determining Quantization Noise Power
Quantization error q is continuous over any given quantization interval
x is continuous, and so q can take on any value up to |ΔV/2| for a given quantization interval
If x takes on units of voltage, then q is also defined in volts
Taking the square of the instantaneous error voltage q, referenced to 1 Ω, gives the instantaneous noise power:

$$q^2 = (x - \hat{x})^2 \ \text{W}$$

The average value of q² within any quantization interval is the mean of all possible instantaneous noise power values over the quantization interval
Mean-squared noise power found by weighting noise power with the likelihood of occurrence for all values of noise power over the interval

Quantization Noise Power (2)
Thus the mean-squared noise power Di per interval is the weighted average noise power over an interval:

$$D_i = \overline{q^2} = \int q^2 f_X(x)\,dx$$

where fX(x) is a continuous function of the probability of any x (and hence difference q) occurring. The simplest case is the equiprobable uniform distribution
For x equally likely to occur anywhere in the interval ΔV (and not outside), the probability density of any given x is 1/ΔV, hence

$$D_i = \overline{q^2} = \frac{1}{\Delta V}\int_{-\Delta V/2}^{\Delta V/2} q^2\,dq$$

(interval defined over the range ΔV)

Quantization noise power (3)
Quantization noise power, Nq, for a signal equally likely to fall anywhere in a quantization interval ΔV = 2mp/L, is therefore:

$$N_q = D_i = \frac{1}{\Delta V}\int_{-\Delta V/2}^{\Delta V/2} q^2\,dq = \frac{(\Delta V)^2}{12} = \frac{m_p^2}{3L^2}$$
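The (ΔV)²/12 result is easy to check numerically; a minimal sketch under the same equiprobable assumption (parameter values are arbitrary):

```python
import numpy as np

mp, L = 1.0, 8
dv = 2 * mp / L                              # quantization interval, delta-V
x = np.random.uniform(-mp, mp, 1_000_000)    # x equally likely anywhere in range
idx = np.clip(np.floor((x + mp) / dv), 0, L - 1)
xq = -mp + (idx + 0.5) * dv                  # mid-point representation levels

print(np.mean((x - xq) ** 2))                # empirical Nq, ~0.0052
print(dv**2 / 12, mp**2 / (3 * L**2))        # analytical: both 0.00520833...
```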

Performance in Quantization
Quality of the quantized signal described quantitatively in terms of the Signal to Quantization Noise Ratio (SQNR): the ratio of signal power to quantization noise power
Let's consider signal = m(t) (often denoted x(t))
Avg signal power = S0 = average of m²(t) (referenced to 1 Ω)
Quantization noise power = Nq
Thus the SQNR for equal-sized quantization intervals 2mp/L (uniform quantization):

$$\frac{S_0}{N_q} = \frac{3L^2\,\overline{m^2(t)}}{m_p^2}$$
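A standard consequence worth noting (not stated on the slide): for a full-scale sinusoid m(t) = mp·sin(ωt), the average power is mp²/2, so S0/Nq = 3L²/2. With L = 2^v levels this is 1.5 × 2^(2v), or about (6.02v + 1.76) dB - the familiar "6 dB per bit" rule of thumb.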

Uniform Quantization: Observations
SQNR increases with
Increasing signal power
Decreasing quantization noise power
In uniform quantization, quantization noise power is constant (mp²/3L²) over the signal dynamic range
SQNR varies with signal power
Uniform quantization not optimum because lower signal amplitudes have lower SQNR (consider voice)

Increasing SQNR
In uniform quantization schemes, SQNR can be increased by decreasing quantization noise at lower amplitudes. How do we do this?
Nq decreased by increasing the no. of quantization levels
To equalize SQNR across the signal dynamic range, we may
Reduce the quantization interval for small signal amplitudes: non-uniform quantization, or
Nonlinearly compress the signal's dynamic range (boosting small amplitudes relative to large ones) so small amplitudes see the same Nq as large ones: Companding

Non-uniform Quantization
One scheme designed to compensate for the low SQNR of uniform quantization
Lower m(t) amplitudes divided into more quantization levels
Lower amplitudes subject to lower quantization noise
SQNR equalized over the dynamic range of m(t)

Companding
Another scheme designed to compensate for the low SQNR of uniform quantization
The signal is passed through a non-linear element that compresses large amplitudes
The compressed signal is quantized uniformly
At the receiver, the dynamic range of the received signal is restored (expanded) with the inverse characteristic of the compressor
Compressing and Expanding - referred to as Companding
For voice signals, SQNR typically improved by ~24 dB by companding
[Figure: typical compression characteristic]

Common compression characteristics
(a) μ-law characteristic (North America)
(b) A-law characteristic (Europe)
To read the graphs:
As the actual input signal m approaches its maximum value (x axis), apply less gain to it, according to an amount of compression set by μ or A (y axis).
For low signal levels m, apply more gain according to the values of μ or A

μ- and A-law companding characteristics
μ-law characteristic (used in North America). Output y is expressed as:

$$y = \frac{\ln\!\left(1 + \mu\,|m|/m_p\right)}{\ln(1 + \mu)}, \qquad \frac{|m|}{m_p} \le 1$$

A-law characteristic (used in Europe):

$$y = \frac{A\,|m|/m_p}{1 + \ln A}, \qquad 0 \le \frac{|m|}{m_p} \le \frac{1}{A}$$

$$y = \frac{1 + \ln\!\left(A\,|m|/m_p\right)}{1 + \ln A}, \qquad \frac{1}{A} \le \frac{|m|}{m_p} \le 1$$

NOTE: The greater μ or A, the more the larger amplitudes of m are compressed in the output y.
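A μ-law compressor/expander pair sketched in Python (an illustrative rendering of the formula above; μ = 255 is the common North American telephony value, and the function names are mine):

```python
import numpy as np

def mu_compress(m, mp=1.0, mu=255.0):
    """Compressor: y = sgn(m) * ln(1 + mu*|m|/mp) / ln(1 + mu)."""
    return np.sign(m) * np.log1p(mu * np.abs(m) / mp) / np.log1p(mu)

def mu_expand(y, mp=1.0, mu=255.0):
    """Receiver-side expander: the inverse characteristic of the compressor."""
    return np.sign(y) * (mp / mu) * np.expm1(np.abs(y) * np.log1p(mu))

m = np.linspace(-1.0, 1.0, 11)
print(np.allclose(mu_expand(mu_compress(m)), m))  # True: expansion undoes compression
```

Compressing before a uniform quantizer and expanding after reconstruction is what equalizes SQNR across the dynamic range.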

Improving SQNR with companding
For a μ-law companded source output, SQNR is given by:

$$\frac{S_0}{N_q} = \frac{3L^2}{\left[\ln(1+\mu)\right]^2} \quad\text{(with compression)}$$

compared with

$$\frac{S_0}{N_q} = \frac{3L^2\,\overline{m^2(t)}}{m_p^2} \quad\text{(without compression)}$$

S0/Nq often denoted as S0/N0

Summary of Source Prep Performance
Sampling
Sampling frequency must be at least twice the highest frequency component in the signal: fs ≥ 2B samples/second
Quantization
Quantization introduces error, q, and noise, Nq, which increase with the quantization interval ΔV:

$$N_q = \overline{q^2} = \frac{1}{\Delta V}\int_{-\Delta V/2}^{\Delta V/2} q^2\,dq$$

[Figure: original signal before sampling and its quantized version]

Summary of Source Prep Performance
Uniform Quantization
For L equally-sized quantization intervals, the Signal to Noise Ratio is:

$$\frac{S_0}{N_q} = \frac{3L^2\,\overline{m^2(t)}}{m_p^2}$$

Alternatives
Non-uniform quantization
Companding

Encoding
In the encoding process, a sequence of bits is assigned to different quantization values
May be independent of the quantization process or integral to quantization

Encoding Independent of Quant'n
Quantized levels mapped to fixed-length binary code words
[Figure: original signal with its samples, quantized values and codes]
Samples - continuous values at discrete times (PAM)
Quantization - continuous values rounded to a set of discrete values
Coding - each discrete value mapped to a code

Mapping quantized values - PCM
Pulse Code Modulation, PCM
[Figure: analog signal and its 4-bit PCM representation]

Uniform PCM - the process
The source output (bandlimited to B Hz) is usually passed through an anti-aliasing filter of the same BW before sampling to eliminate aliasing of spurious signals
Sampling is then done at the Nyquist rate or higher - provides a guard band
Samples are then passed to a uniform quantizer (even-sized quantization intervals), L levels
Each quantized source output is mapped to a binary sequence of fixed length v bits, where L = 2^v (encoding)
L generally chosen to be a power of 2

Source-coded bit rate example
An analog source bandlimited to 4000 Hz is sampled at the Nyquist rate, then quantized and encoded with 4-level uniform PCM. Determine the source coding bit rate for this source. Compare this rate with that obtained in Example 2.
Nyquist sampling rate = 2(4000) = 8000 samples/sec
Uniform PCM employs uniform quantization, hence the no. of bits per level is v = log2(4) = 2 bits
Thus the source-coded bit rate is 2(8000) = 16000 bits/s
(Example 2 gave 15000 bits/s, so this fixed-length PCM scheme spends slightly more bits than the source's information rate.)

PCM Observations
The output of the PCM process is a bit stream
Each quantized, encoded sample generates v bits per sample
Sampling carried out at a minimum of 2B samples/sec
v bits per sample generates 2vB bits per second output
2vB bits/sec referred to as the source-coded data or bit rate
Represents the minimum coding rate for uniform PCM
Waveform coding schemes integral to quantization, such as DPCM and DM, may be used to encode the source output with fewer bits than standard PCM

Increasing PCM Efficiency: DPCM
Differential PCM, DPCM, codes the difference between two adjacent samples rather than the absolute value of the sample itself
Codes fewer bits than PCM
Used with sources which change slowly, i.e. whose samples are correlated
Sources can be sampled faster to improve correlation
Encoding fewer bits improves efficiency, but a higher sampling rate may counteract this improvement (this will be explored further in Topic 4 - The Channel)

Increasing PCM Efficiency: DM
Delta Modulation - output encoded using a single bit
Output is a bit sequence representing the slope of the analog input (i.e. the derivative of the input analog signal)
If the staircase function goes up by one step, a 1 is generated; a 0 is generated otherwise
The staircase always changes in the same direction as the analog input
Thus, the staircase very closely follows or represents the analog input
The binary output can be used to regenerate the staircase function and thus an estimate of the original signal
Original signal approximated by recreating and smoothing the staircase function with an integrator or a filter
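A linear delta modulator sketched in Python (a minimal illustration of the staircase logic above; the fixed step size is my assumption):

```python
import numpy as np

def delta_modulate(x, step=0.1):
    """Return the 1-bit stream and the staircase approximation of x."""
    est, bits, stair = 0.0, [], []
    for sample in x:
        bit = 1 if sample > est else 0   # step up if the input is above the staircase
        est += step if bit else -step    # staircase moves one step per sample
        bits.append(bit)
        stair.append(est)
    return np.array(bits), np.array(stair)

t = np.linspace(0.0, 1.0, 200)
x = np.sin(2 * np.pi * 2 * t)            # slowly varying input
bits, stair = delta_modulate(x)
print(np.abs(x - stair).max())           # small tracking error; grows under slope overload
```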

DM performance
If the analog source output changes too slowly or too rapidly, distortion is introduced
If the input changes too slowly, quantization error becomes significant
If it changes too rapidly, slope overload occurs
Either condition introduces distortion noise
Step size influences slope overload and quantization error
As step size increases, quantization error increases
As step size decreases, slope overload increases
How can this issue be addressed?

Tracking the input with DM
[Figure: DM staircase function tracking the analog input]

Improving DM performance
Achieved by selecting the optimum step size
Use adaptive DM
For rapid changes in input, use large steps - eliminates slope overload
For slow changes, use smaller steps - significantly reduces quantization noise

Encoding efficiency
Source output is represented by H(X) bits per quantized sample
Thus the encoder must map each quantized sample to a binary code of length v ≥ H(X) bits to maintain representation accuracy
Binary codes can only be of integer lengths (e.g. 4 bits/sample, 8 bits/sample, etc.)
The value of H(X) is not always an integer (Why?)
Coding process is most efficient for v = H(X), i.e. H(X) being an integer value
Efficiency is lower if more bits are encoded than are required to accurately represent the quantized source output, i.e. for v > H(X)

Improving encoding efficiency
Efficiency can be improved by compressing the output of the source. How?
Consider that H(X) is a measure of the average information content per source output (sample)
For H(X) less than the maximum, not all samples reveal exactly the same amount of information (H(X) is the average of all values)
Recall that the information content of any given sample depends on its likelihood of occurrence (higher likelihood of occurrence: less information conveyed, and vice versa)
Thus the no. of bits needed to represent less likely samples > no. of bits for more likely samples
Hence we can encode with codes of variable lengths, based on the source's output probability distribution
With this, the average code word length approaches H(X), reducing the average number of bits generated by the source coder, and improving efficiency
For H(X) = max, is compression possible?

Implementing variable length coding
Implemented by applying one of several available algorithms to quantized outputs
A common algorithm is the Huffman Source Coding Algorithm (Huffman Coding)

Huffman Coding
The probability of occurrence of each quantized sample is used to construct a matching codeword
This is accomplished using the Huffman Coding Algorithm

The Huffman Coding Algorithm
1. Sort source outputs in decreasing order of their probabilities
2. Merge the two least probable outputs into a single output whose probability is the sum of the corresponding probabilities
3. No. of remaining outputs = 2? If no, go back to step 2; otherwise go on to step 4
4. Arbitrarily assign 0 and 1 as codewords for the two remaining outputs
5. If an output is the result of the merger of two outputs from a preceding step, append the current codeword with a 0 and a 1 to obtain the codewords for the preceding outputs; then repeat step 5. If no output is preceded by another output in another step, then stop
NOTE: The same convention as used in step 4 should be applied in step 5, i.e. if a 0 were assigned to the upper branch and a 1 to the lower in step 4, the same should be done for each iteration in step 5.
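The algorithm above can be rendered compactly in Python (a sketch; ties between equal probabilities may be broken differently from the worked example, changing individual codewords but not the average length):

```python
import heapq
from itertools import count

def huffman(probs):
    """probs: dict of symbol -> probability. Returns dict of symbol -> codeword."""
    order = count()  # unique tiebreaker so equal probabilities compare cleanly
    heap = [(p, next(order), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)  # two least probable outputs...
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}        # prepend 0 on one branch
        merged.update({s: "1" + w for s, w in c2.items()})  # and 1 on the other
        heapq.heappush(heap, (p1 + p2, next(order), merged))  # ...merged into one
    return heap[0][2]

code = huffman({"a1": 1/2, "a2": 1/4, "a3": 1/8, "a4": 1/16, "a5": 1/16})
print(code)  # {'a1': '0', 'a2': '10', 'a3': '110', 'a4': '1110', 'a5': '1111'}
```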

Huffman Coding Example
An information source generates outputs with the alphabet {a1, a2, a3, a4, a5} with probabilities {1/2, 1/4, 1/8, 1/16, 1/16}. Design a Huffman code for this source.

Symbol   Probability   Codeword
a1       1/2           0
a2       1/4           10
a3       1/8           110
a4       1/16          1110
a5       1/16          1111

(Merging order: a4 and a5 combine into a node of probability 1/8; that node combines with a3 into 1/4; then with a2 into 1/2; and finally with a1. A 0 is assigned to the upper branch and a 1 to the lower at each split.)

Huffman average codeword length
Average codeword length is the weighted average of all codeword lengths
Evaluated by weighting individual codeword lengths by their probabilities of occurrence and summing the result
Let p(i) = probability of the ith sample occurring (ith codeword)
Let l(i) = the length of the ith codeword
Average codeword length R̄ is given by:

$$\bar{R} = \sum_{i=1}^{L} p(i)\,l(i)$$

Coding efficiency gain with Huffman
The efficiency of the Huffman code is given by:

$$\eta = \frac{H(X)}{\bar{R}} \le 1$$

R̄ satisfies the inequality:

$$H(X) \le \bar{R} < H(X) + 1$$

Compared to standard PCM, R̄ < v: on average, fewer bits are required to represent the source without sacrificing accuracy

Huffman efficiency gain example
For the source in the previous example, calculate H(X), R̄ and the efficiency of the Huffman code. Determine how many bits are required to represent the source with standard PCM, and compare the efficiency of the Huffman code with that obtained by coding the source with PCM.
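A worked solution sketch (the arithmetic, not shown on the slide): with probabilities {1/2, 1/4, 1/8, 1/16, 1/16}, H(X) = 1/2(1) + 1/4(2) + 1/8(3) + 1/16(4) + 1/16(4) = 1.875 bits. The Huffman codeword lengths from the earlier example are 1, 2, 3, 4, 4, so R̄ = 1.875 bits and the efficiency is H(X)/R̄ = 100% (the probabilities are exact powers of 1/2, so R̄ meets the lower bound). Standard fixed-length PCM needs v = ⌈log2 5⌉ = 3 bits per sample, for an efficiency of 1.875/3 = 62.5%.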

Limitations of Huffman coding
Huffman coding requires advance knowledge of the source's probability distribution
Cannot accommodate sources with differing statistics
Thus implementation can become quite complicated in practice
Other techniques can be employed
E.g. Lempel-Ziv coding
Also employs variable-length encoding
However, does not rely on the statistics of the source
Encodes a string of consecutive binary codewords by parsing it into phrases that are then coded into variable-length codes (a small parsing example follows)
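A small parsing illustration (a constructed example, not from the slides): Lempel-Ziv splits the input into the shortest phrases not seen before. The string 1011010100010 parses as 1, 0, 11, 01, 010, 00, 10; each phrase is then sent as (index of its longest previously seen prefix, new bit) - e.g. 11 is sent as (phrase 1, bit 1), and 010 as (phrase 4, bit 0). The dictionary is built on the fly, so no source statistics are required.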

Source Coding - Brief Review
Source coding comprises:
Digitization (for analog sources)
Sampling
Quantization
Encoding
Waveform coding
Various techniques for improving coding efficiency
Basic performance metrics examined

Summary of source coding performance
For sampling:
A bandlimited analog source output of maximum frequency B must be sampled at the Nyquist rate (2B Hz) or higher
Sampling at lower rates - termed undersampling - prevents accurate source representation
Effect of aliasing reduced for sampling rates higher than Nyquist - provides a guard band

Source coding performance (2)
For Quantization:
The sampled source output is quantized using L levels
Process introduces quantization noise
Ratio of signal power to quantization noise power measures the quality of the sampled output - SQNR
For uniform quantization, SQNR varies with signal level - lower levels give lower SQNR, and vice versa
SQNR improved by employing non-uniform quantization
Companding - a type of non-uniform quantization in which large amplitudes are compressed before quantization, effectively increasing the SQNR of lower-amplitude inputs

Source coding performance (3)
For PCM:
Encoding produces unique fixed-length code words for each quantized sample
DPCM produces shorter codewords per quantized sample
DM produces 1-bit codewords
DPCM and DM assume high correlation between successive samples
Source coding data rate defined as a minimum of 2vB bits per second
Coding efficiency can be improved for quantized samples from sources with non-uniform probability distributions

Source coding performance (4)
For improving encoding efficiency:
A minimum of H(X) bits per quantized sample required to accurately encode the source output
PCM encodes fixed-length blocks of v bits per quantized sample
Necessary that v ≥ H(X)
Not the most efficient way to code sources with varying probabilities for each sample
Thus encode with variable-length codewords: encode each source output according to its probability of occurrence. The average number of bits per quantized sample moves closer to H(X). Average codeword length R̄ < v, improving efficiency

Reading
Mandatory reading:
Proakis, J., Masoud Salehi. Fundamentals of Communication Systems: Sections 7.1-7.2.1, 7.3, 7.4, 12.2, 12.3-12.3.1
Recommended reading:
Proakis, J., Masoud Salehi. Fundamentals of Communication Systems: Section 12.3.2
Proakis, J., Masoud Salehi. Communication Systems Engineering: Section 1.
