
Information Theory Exercise

Anil Mengi, M.Sc.


Problem 1: Source Coding
We consider 64 squares on a chess board.
(a) How many bits do you need to represent each square?
(b) In a game on a chessboard one player has to guess where his opponent has placed the
Queen. You are allowed to ask six questions which must be answered truthfully by a
yes/no reply. Design a strategy by which you can always find the Queen. Show that
you cannot guarantee finding the exact position when you are only allowed to ask five questions.
(c) How do you interpret your result in (b) together with your result in (a)?
Suggested solution:
a. Since ld 64 = 6, we need at least 6 bits to represent each square.
b. Consider the worst case, in which every guess is wrong. Each yes/no reply still eliminates
half of the remaining positions, so after 6 questions only 64/2^6 = 1 position is left. When we
are allowed to ask only 5 questions, 64/2^5 = 2 possible positions remain, so the exact position
cannot be guaranteed.
c. Each square is represented by 6 bits, and each of the 6 questions reveals exactly one of
these bits; this is also why we cannot be sure with only 5 questions.
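
The six questions can be read directly as the bits of the square index. A minimal Python sketch of this strategy (the function and variable names are only illustrative):

    def locate_queen(answer):
        """Recover the square index 0..63 from six yes/no answers.

        answer(k) must return True iff bit k of the Queen's square index is 1.
        """
        index = 0
        for k in range(6):                 # 6 questions suffice for 64 squares
            if answer(k):
                index |= 1 << k
        return index

    # Usage example: the opponent hides the Queen on square 42.
    queen = 42
    found = locate_queen(lambda k: bool((queen >> k) & 1))
    assert found == queen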

Problem 2: Source Coding


A language has an alphabet of five letters, x_i, i = 1, 2, ..., 5, each occurring with probability 1/5. Find the number of bits needed for a fixed-length binary code in which
(a) Each letter is encoded separately into a binary sequence.
(b) Two letters at a time are encoded into a binary sequence.
(c) Three letters at a time are encoded into a binary sequence.
Which method is the most efficient in terms of bits per letter?
Suggested solution:
a. We have 5 letters to encode, so we need ⌈ld 5⌉ = 3 bits, i.e. 3/1 = 3 bits per letter.
b. We have 5^2 = 25 possible letter pairs to encode, so we need ⌈ld 25⌉ = 5 bits, i.e. 5/2 = 2.5 bits per letter.
c. We have 5^3 = 125 possible letter triples to encode, so we need ⌈ld 125⌉ = 7 bits, i.e. 7/3 ≈ 2.33 bits per letter.
Hence (c) is the most efficient method.
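
A quick numerical check of the three cases (a minimal sketch):

    from math import ceil, log2

    for block_len in (1, 2, 3):
        symbols = 5 ** block_len                  # number of distinct blocks
        bits = ceil(log2(symbols))                # fixed-length codeword length
        print(block_len, bits, bits / block_len)  # block length, bits, bits per letter
    # prints 3, 2.5 and ~2.33 bits per letter for block lengths 1, 2, 3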

Problem 3: Entropy
Let p(x, y) be given by the following figure.

[Figure: five points in the x-y plane carrying the probability masses 1/8, 1/4, 1/4, 1/4, 1/8; as used in the solution below, these are the points (0,0), (0,2), (1,1), (2,0) and (2,2), with X, Y ∈ {0, 1, 2}.]
Find
H(X), H(Y), H(X|Y), H(Y|X), H(X, Y), and H(Y) − H(Y|X).
Draw a Venn diagram for the quantities you found.
Suggested solution:
The marginal distribution of X is
P(X = 0) = P(X = 0, Y = 0) + P(X = 0, Y = 2) = 3/8,
P(X = 1) = P(X = 1, Y = 1) = 1/4,
P(X = 2) = P(X = 2, Y = 0) + P(X = 2, Y = 2) = 3/8,
so that
H(X) = −(3/8) ld(3/8) − (1/4) ld(1/4) − (3/8) ld(3/8) = 11/4 − (3/4) ld 3 ≈ 1.56 bit.
Since the given diagram is symmetric in X and Y, exchanging X and Y in the formula above yields the same result:
H(Y) = H(X) = 11/4 − (3/4) ld 3.
The point P = (X, Y) takes five values, three with probability 1/4 and two with probability 1/8, so
H(X, Y) = 3 · (1/4) ld 4 + 2 · (1/8) ld 8 = 9/4.
Since H(X) = H(Y),
H(X|Y) = H(Y|X) = H(X, Y) − H(X) = (3/4) ld 3 − 1/2 ≈ 0.69 bit.
Finally,
H(Y) − H(Y|X) = 11/4 − (3/4) ld 3 − ((3/4) ld 3 − 1/2) = 13/4 − (3/2) ld 3 ≈ 0.87 bit,
which is the mutual information I(X; Y).
In the Venn diagram, two equally sized circles represent H(X) and H(Y); their intersection is I(X; Y), the non-overlapping parts are H(X|Y) and H(Y|X), and the union is H(X, Y).
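
The result can be verified numerically from the joint distribution. In the following sketch, the assignment of the masses 1/8 and 1/4 to the individual points is an assumption consistent with the figure; it does not affect any of the computed quantities:

    from math import log2

    p_xy = {(0, 0): 1/8, (0, 2): 1/4, (1, 1): 1/4, (2, 0): 1/4, (2, 2): 1/8}

    def entropy(dist):
        return -sum(p * log2(p) for p in dist.values() if p > 0)

    p_x, p_y = {}, {}
    for (x, y), p in p_xy.items():
        p_x[x] = p_x.get(x, 0) + p
        p_y[y] = p_y.get(y, 0) + p

    H_xy, H_x, H_y = entropy(p_xy), entropy(p_x), entropy(p_y)
    print(H_x, H_y, H_xy)          # ~1.56, ~1.56, 2.25
    print(H_xy - H_y, H_xy - H_x)  # H(X|Y), H(Y|X) ~0.69
    print(H_y - (H_xy - H_x))      # I(X;Y) ~0.87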

Problem 4: Source Coding


A language has an alphabet of eight letters, x_i, i = 1, 2, ..., 8, with probabilities
0.25, 0.20, 0.15, 0.12, 0.10, 0.08, 0.05, and 0.05.
(a) Determine a binary code for the source output.
(b) Determine the average number of binary digits per source letter.
(c) Calculate the entropy for the language given above.
(d) Check your result in (b) with the entropy calculated in (c). Is the code determined
in (a) optimum?
Suggested solution:
a. Many codes are possible; not all of them are necessarily optimum. For example, assign each letter a fixed-length codeword of 3 bits.
b. 3 binary digits per source letter.
c. H(X) = −Σ p_i ld p_i ≈ 2.798 bit.
d. Obviously not: the average length of 3 exceeds the entropy, and it can be improved by using a Huffman code.
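
A quick check of (c) and (d) (a minimal sketch):

    from math import log2

    p = [0.25, 0.20, 0.15, 0.12, 0.10, 0.08, 0.05, 0.05]
    H = -sum(pi * log2(pi) for pi in p)
    print(H)        # ~2.798 bit per letter
    print(3 - H)    # redundancy of the 3-bit fixed-length code, ~0.2 bit per letter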
Problem 5: Entropy
Consider the outcome of a fair dice throw and let D be the number of dots on the top
face. The random variables X, Y, and Z are defined over this sample space as follows:
X ∈ {1, 2, 3, 4, 5, 6} with {X = i} = {the number is i},
Y ∈ {e, o} with {Y = e/o} = {the number is even/odd},
Z ∈ {s, b} with {Z = s/b} = {the number is small/big},
where small means ≤ 3 and big means > 3. Make a table showing the mapping of the
random variables.
Determine H(X),H(Y ), H(Z), H(X, Y ), H(Y, Z).
Determine H(X|Y ), H(Y |X), H(Y |Z).
Suggested solution:
The mapping of the random variables:
D:  1  2  3  4  5  6
X:  1  2  3  4  5  6
Y:  o  e  o  e  o  e
Z:  s  s  s  b  b  b
H(X) = −Σ P ld P = 6 · (1/6) ld 6 = ld 6
H(Y) = H(Z) = 2 · (1/2) ld 2 = 1
Because both Y and Z are determined by X, H(X, Y) = H(X) = ld 6.
H(Y, Z) = −Σ P ld P = 2 · ((1/3) ld 3 + (1/6) ld 6) = ld 3 + 1/3
H(X|Y) = H(X, Y) − H(Y) = ld 6 − 1 = ld 3
H(Y|X) = H(X, Y) − H(X) = 0
H(Y|Z) = H(Y, Z) − H(Z) = ld 3 − 2/3
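
The values can be verified numerically from the die mapping (a minimal sketch):

    from math import log2
    from collections import Counter

    def H(counter, n=6):
        return -sum((c / n) * log2(c / n) for c in counter.values())

    outcomes = range(1, 7)
    X = {d: d for d in outcomes}
    Y = {d: 'e' if d % 2 == 0 else 'o' for d in outcomes}
    Z = {d: 's' if d <= 3 else 'b' for d in outcomes}

    H_X = H(Counter(X.values()))
    H_Y = H(Counter(Y.values()))
    H_Z = H(Counter(Z.values()))
    H_XY = H(Counter((X[d], Y[d]) for d in outcomes))
    H_YZ = H(Counter((Y[d], Z[d]) for d in outcomes))
    print(H_X, H_Y, H_Z, H_XY, H_YZ)            # ld6, 1, 1, ld6, ld3 + 1/3
    print(H_XY - H_Y, H_XY - H_X, H_YZ - H_Z)   # H(X|Y), H(Y|X), H(Y|Z)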
Problem 6: Entropy

Consider the following figure, which shows the possible points in the x-y plane.

The random pair (X, Y ) can only take values (0,0), (3,3), (4,0), (1,1), (1,3), and (3,2)
equally likely. Determine H(X), H(Y ), H(X|Y ), and H(X, Y ).

Problem 7: Entropy
[Figure: channel with inputs X1 and X2, noise Z, and output Y.]

X1 ∈ {0, 1} and X2 ∈ {0, 1} are the inputs, where P(X1 = 1) = 1 − P(X1 = 0) = p1
and P(X2 = 1) = 1 − P(X2 = 0) = p2. Z represents the noise, where Z ∈ {0, 1} and
Pr{Z = 1} = 1 − Pr{Z = 0} = 1/3. Here · represents the real multiplication and ⊕ represents the XOR operation. The channel output Y ∈ {0, 1} is given as

Y = (X1 · Z) ⊕ X2.
(Q1 ) Compute H(X2 |Y ) and H(Y ).
(Q2 ) Let p(X1 = 1) = p(X1 = 0) = p(X2 = 0) = p(X2 = 1) = 1/2. Calculate the mutual
information I(X2 ; Y ).
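
For (Q2), the quantities can be checked by enumerating all combinations of (X1, X2, Z). The sketch below assumes the channel law Y = (X1 · Z) ⊕ X2 as reconstructed above:

    from itertools import product
    from math import log2
    from collections import defaultdict

    p1, p2, pz = 0.5, 0.5, 1/3           # P(X1=1), P(X2=1), P(Z=1)

    p_x2y = defaultdict(float)
    for x1, x2, z in product((0, 1), repeat=3):
        prob = ((p1 if x1 else 1 - p1) * (p2 if x2 else 1 - p2)
                * (pz if z else 1 - pz))
        y = (x1 * z) ^ x2                # assumed channel law
        p_x2y[(x2, y)] += prob

    def H(dist):
        return -sum(p * log2(p) for p in dist.values() if p > 0)

    p_y, p_x2 = defaultdict(float), defaultdict(float)
    for (x2, y), p in p_x2y.items():
        p_y[y] += p
        p_x2[x2] += p

    print(H(p_y))                        # H(Y)
    print(H(p_x2y) - H(p_y))             # H(X2|Y)
    print(H(p_x2) + H(p_y) - H(p_x2y))   # I(X2;Y)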
Problem 8: Entropy
Consider the following transmission channel
[Figure: X → BSC → Y, with an observer that outputs Z.]
A binary symmetric channel (BSC) with crossover probability p has input X and
output Y . The input X = 0 is used with probability q. The observer indicates Z = 0
whenever X = Y and Z = 1 otherwise.
(Q1 ) What is the uncertainty H(Z) in the observer output?
(Q2 ) What is the capacity and capacity achieving input distribution if the receiver is
provided with both Y and Z?

Problem 9: Fano Inequality


Let the following channel from X to Y be given, see figure. Pr{X = 0} = 1 − Pr{X = 1} = 2/3.

[Figure: binary channel with transition probabilities P(Y = 0|X = 0) = 3/4, P(Y = 1|X = 0) = 1/4, P(Y = 0|X = 1) = P(Y = 1|X = 1) = 1/2.]

1. Compute H(X|Y ).
2. Calculate the error probability Pe .
3. Compute the Fano inequality for H(X|Y ).
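
A numerical sketch of the three steps, assuming the transition probabilities read from the figure and a MAP decision rule for Pe:

    from math import log2

    p_x = {0: 2/3, 1: 1/3}
    p_y_given_x = {0: {0: 3/4, 1: 1/4}, 1: {0: 1/2, 1: 1/2}}

    p_xy = {(x, y): p_x[x] * p_y_given_x[x][y] for x in p_x for y in (0, 1)}
    p_y = {y: sum(p_xy[(x, y)] for x in p_x) for y in (0, 1)}

    # 1. conditional entropy H(X|Y)
    H_x_given_y = -sum(p * log2(p / p_y[y]) for (x, y), p in p_xy.items())

    # 2. MAP decision: for each y, guess the most probable x; Pe is the residual mass
    pe = sum(p_y[y] - max(p_xy[(x, y)] for x in p_x) for y in (0, 1))

    # 3. Fano: H(X|Y) <= Hb(Pe) + Pe * ld(|X|-1); the last term vanishes for binary X
    def Hb(p):
        return 0.0 if p in (0, 1) else -p * log2(p) - (1 - p) * log2(1 - p)

    print(H_x_given_y, pe, Hb(pe))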
Problem 10: Connection probabilities
Consider the relation between height L and weight W shown in the figure, indicating that
tall people tend to be heavier than short people.
[Figure: channel from height L to weight W.
Height categories and prior probabilities: Very tall 1/8, Tall 1/4, Average 1/4, Short 1/4, Very short 1/8.
Weight categories: Very heavy, Heavy, Average, Light, Very light.
Transition probabilities as read from the figure:
Very tall:  Very heavy 1/2, Heavy 1/2
Tall:       Very heavy 1/4, Heavy 1/2, Average 1/4
Average:    Heavy 1/4, Average 1/2, Light 1/4
Short:      Average 1/4, Light 1/2, Very light 1/4
Very short: Light 1/2, Very light 1/2]

(a) What is the entropy of L?


(b) What is the conditional entropy H(W |L)?
(c) Find the probabilities of the weight categories.
(d) Flip the channel to find the reverse transition probabilities.
(e) Find the mutual information I(L; W), which is how much information, on average,
about a person's height is given by his or her weight.
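
A numerical sketch of (a)-(e), using the prior and transition probabilities as read from the figure above (treat the transition matrix as an assumption):

    from math import log2

    p_L = {'very tall': 1/8, 'tall': 1/4, 'average': 1/4, 'short': 1/4, 'very short': 1/8}
    p_W_given_L = {
        'very tall':  {'very heavy': 1/2, 'heavy': 1/2},
        'tall':       {'very heavy': 1/4, 'heavy': 1/2, 'average': 1/4},
        'average':    {'heavy': 1/4, 'average': 1/2, 'light': 1/4},
        'short':      {'average': 1/4, 'light': 1/2, 'very light': 1/4},
        'very short': {'light': 1/2, 'very light': 1/2},
    }

    def H(dist):
        return -sum(p * log2(p) for p in dist.values() if p > 0)

    H_L = H(p_L)                                                        # (a)
    H_W_given_L = sum(p_L[l] * H(c) for l, c in p_W_given_L.items())    # (b)

    p_W = {}                                                            # (c)
    for l, cond in p_W_given_L.items():
        for w, p in cond.items():
            p_W[w] = p_W.get(w, 0) + p_L[l] * p

    # (d) reverse transition probabilities p(l|w) via Bayes' rule
    p_L_given_W = {w: {l: p_L[l] * p_W_given_L[l].get(w, 0) / p_W[w] for l in p_L}
                   for w in p_W}

    I_LW = H(p_W) - H_W_given_L                                         # (e)
    print(H_L, H_W_given_L, p_W, I_LW)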

Problem 11: Mutual Information


Consider the points A, B, C, and D in the figure below.
[Figure: coordinate axes X, Y, Z; the four points A, B, C, D are located at (0,0,0), (1,0,0), (0,1,0) and (0,0,1).]

A point P = (X, Y, Z) is selected with probabilities Pr{P = A} = Pr{P = B} =
Pr{P = C} = Pr{P = D} = 1/4, so X, Y and Z are the coordinates of P. Compute
I(X; Y), I(X; Y|Z), and I(X; Y, Z).
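
A small numerical sketch; the labeling of the four points is taken from the figure and assumed to be the origin plus the three unit points:

    from math import log2
    from collections import Counter

    points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]   # A, B, C, D (labeling assumed)

    def H(*index):
        """Entropy (in bits) of the selected coordinates of the uniformly chosen point."""
        counts = Counter(tuple(p[i] for i in index) for p in points)
        n = len(points)
        return -sum((c / n) * log2(c / n) for c in counts.values())

    I_xy   = H(0) + H(1) - H(0, 1)                     # I(X;Y)
    I_xy_z = H(0, 2) + H(1, 2) - H(0, 1, 2) - H(2)     # I(X;Y|Z)
    I_x_yz = H(0) + H(1, 2) - H(0, 1, 2)               # I(X;Y,Z)
    print(I_xy, I_xy_z, I_x_yz)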
Problem 12: Inequalities
Let X, Y and Z be joint random variables. Prove the following inequalities and find
conditions for equality.
H(X, Y|Z) ≥ H(X|Z).
I(X, Y; Z) ≥ I(X; Z).
H(X, Y, Z) − H(X, Y) ≤ H(X, Z) − H(X).
I(X; Z|Y) = I(Z; Y|X) − I(Z; Y) + I(X; Z).
Problem 13: Fano Inequality
Let the random variables X and Y denote the input and the output of a channel, where
X, Y ∈ {0, 1, 2, 3, 4}. All input values X are equally probable. The channel is characterized by the transition probabilities

p_{Y|X}(y|x) = 1/2 for y = x, and 1/8 for y ≠ x,    (1)

for all x. Apply the Fano inequality to this example and interpret the result as an asking
strategy.

Problem 14: Typical Sequences


A binary memoryless source U ∈ {0, 1} defined by the probabilities p_U(0) = 0.98, p_U(1) = 0.02 generates sequences of length L. Let L = 100 and ε = (ld 7)/50.
Which sequence is the most likely sequence?
Is the most likely sequence an element of the typical set A? Prove your result numerically.
How many typical sequences exist?
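
A numerical sketch of the typicality check, assuming ε = (ld 7)/50 as reconstructed above:

    from math import log2, comb

    p0, p1, L = 0.98, 0.02, 100
    eps = log2(7) / 50                              # assumed value of epsilon
    Hu = -p0 * log2(p0) - p1 * log2(p1)             # source entropy, ~0.1414 bit

    def per_symbol_info(k):
        """-(1/L) ld p(u^L) for a sequence containing k ones."""
        return -((L - k) * log2(p0) + k * log2(p1)) / L

    # The most likely sequence is the all-zeros sequence (k = 0):
    print(per_symbol_info(0), Hu - eps, Hu + eps)   # ~0.029 lies outside [H-eps, H+eps]

    # Number of typical sequences: sum over the admissible numbers of ones
    # (a small tolerance guards against floating-point effects at the boundary).
    typical = sum(comb(L, k) for k in range(L + 1)
                  if abs(per_symbol_info(k) - Hu) <= eps + 1e-12)
    print(typical)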
Problem 15:
Given the following channel with two inputs X1 and X2 and the output Y.

[Figure: X1 ∈ {0, 1}, X2 ∈ {0, 1}, Y ∈ {0, 1, 2}; + = real addition, * = real multiplication.]

Also we have
Pr(X1 = 1) = 1 − Pr(X1 = 0) = p1, 0 ≤ p1 ≤ 1,
Pr(X2 = 1) = 1 − Pr(X2 = 0) = p2, 0 ≤ p2 ≤ 1.
(a) Compute H(Y), H(Y|X1), H(Y|X2), and I(X1; Y|X2) in bits.
(b) Determine the input probabilities for X1 and X2 that maximize H(Y).
Problem 16: Typical sequences
An information source produces independent binary symbols with p(0) = p and p(1) = 1 − p, where p > 0.5, and emits information sequences of 16 binary symbols. A typical sequence
is defined to have two or fewer 1-symbols.
(Q1 ) What is the most probable sequence that can be generated by this source and what
is its probability?
(Q2 ) What is the number of typical sequences that can be generated by this source?
We assign a unique binary codeword for each typical sequence and neglect the nontypical sequences.
(Q3 ) If the assigned codewords are all of the same length, find the minimum codeword
length required to provide the above set with distinct codewords.
(Q4 ) Determine the probability that a sequence is not assigned a codeword.
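
A small sketch for counting the typical sequences and the resulting codeword length; the value of p used for (Q4) is only an example, since p is not specified beyond p > 0.5:

    from math import comb, ceil, log2

    n = 16
    typical = sum(comb(n, k) for k in range(3))     # k = 0, 1, 2 ones
    print(typical, ceil(log2(typical)))             # 137 sequences -> 8-bit codewords

    # Probability that a sequence gets no codeword, as a function of p = p(0):
    p = 0.9                                         # example value only, p > 0.5
    p_not_typical = 1 - sum(comb(n, k) * (1 - p) ** k * p ** (n - k) for k in range(3))
    print(p_not_typical)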

Problem 17: Channel Capacity


Determine the channel capacity of the following channels.
Channel 1:
[Figure: transition diagram from X to Y with the transition probabilities 1, 1/2 and 1/2; the output alphabet contains the symbols 0, 1, 2.]

Channel 2:
[Figure: transition diagram from X to Y with the transition probabilities 1, 1/2 and 1/2; input and output alphabets are {0, 1}.]

Problem 18: Cascaded Channel Capacity


Consider the two given discrete memoryless channel (DMC) models. The two channels can
be cascaded such that the output of the first one is the input of the second one. Let X
denote the input of the first channel, Y the output of the first and the input of the second
channel, and Z the output of the second channel.
[Figure: DMC1 with input X and DMC2 in cascade; the labeled transition probabilities are p, 1/2, 1/2 and p; the remaining transition probabilities are not shown.]

Determine the missing transition probabilities.


Determine the transition probabilities of the concatenated DMC with input X and
output Z.
DMC1 and DMC2 are split up again and a channel encoder is used between DMC1
and DMC2, which controls the input distribution of DMC2 such that the channel
capacity of DMC2 is achieved.
Determine the channel capacity of this system.
Problem 19: Which of the following models has/have a channel capacity different
from C=1 bit/channel symbol?
[Figure: four channel models with input X and output Y; the output alphabets contain the symbols 0, 1, 2, 3.]

Problem 20: Capacity


Given the following channel.
[Figure: the transmitter sends X ∈ {0, 1}; the channel adds noise Z ∈ {−1, 0, 1} with p(z = −1) = p(z = 0) = p(z = 1) = 1/3 by real addition, i.e. Y = X + Z; the side information |Z| ∈ {0, 1} is available at the receiver.]

(Q1 ) If the receiver uses the side information, i.e. the absolute value of Z, what is the capacity C1 of the channel in bits per transmission?
(Q2 ) If the receiver cannot access the side information, i.e. the receiver does not know the
absolute value of Z, what is the capacity C2 of the channel in bits per transmission?
(Q3 ) Now, let the transmitter change its alphabet to {0, 2}. Determine again C1 and C2
in this case.
Problem 21: Huffman Code
Consider a random source with statistically independent source symbols q_i, 1 ≤ i ≤ 8.
The distribution of the source is given as follows:
Q      q1     q2     q3     q4     q5     q6      q7      q8
p(q)   0.5    0.1    0.1    0.1    0.1    0.05    0.025   0.025

a) Determine the entropy of the source and compare the result to a source with eight
equally probable symbols. (Hint: ld 10 ≈ 3.32.)
b) Construct an optimal binary prefix-free code for the given source.
c) Determine the average code word length of the constructed code by means of path
length lemma. Compare the result to the entropy.
d) Determine the sequence of code bits for the following sequence of source symbols:
q = [q1 q4 q8 q6 q1 q7 ].
e) Determine the code word length for the given sequence. Compare the result to the
average code word length.
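
A minimal Huffman-construction sketch for this source. The resulting code tree is not necessarily the same as the one obtained by hand, but it is an optimal prefix-free code with the same average length:

    import heapq, itertools
    from math import log2

    p = {'q1': 0.5, 'q2': 0.1, 'q3': 0.1, 'q4': 0.1,
         'q5': 0.1, 'q6': 0.05, 'q7': 0.025, 'q8': 0.025}

    tie = itertools.count()                           # tie-breaker for equal probabilities
    heap = [(prob, next(tie), {sym: ''}) for sym, prob in p.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: '0' + w for s, w in c0.items()}
        merged.update({s: '1' + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))

    code = heap[0][2]
    avg_len = sum(p[s] * len(w) for s, w in code.items())
    entropy = -sum(pi * log2(pi) for pi in p.values())
    print(code)
    print(avg_len, entropy)                           # average code word length vs. entropy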


Problem 22: Lempel-Ziv


An alphabet {a, b, c} is given. The code table is initialized with the entries a, b, c at the indices (#1, #2, #3).

a) Decode the message #1#2#3#4#5 and construct a code table.


b) Encode the string aacbacaca over the alphabet {a, b, c} with the Lempel-Ziv algorithm.
Show the construction of the code table and the coded string in detail.
c) Decode the code you obtained and show how the code table is built up
dynamically.
Problem 23: Shannon Code
Consider the following method for generating a code for a random variable X which
takes on m values {1, 2, ..., m} with probabilities p1, p2, ..., pm. Assume that the probabilities are ordered so that p1 ≥ p2 ≥ ... ≥ pm. Define

Fi = Σ_{k=1}^{i−1} pk,    (2)

the sum of the probabilities of all symbols less than i. Then the codeword for i is the
number Fi ∈ [0, 1] rounded off to li bits, where li = ⌈ld(1/pi)⌉.
a) Show that the code constructed by this process is prefix-free and the average length
satisfies
H(X) ≤ L < H(X) + 1.    (3)

b) Construct the code for the probability distribution (0.5, 0.25, 0.125, 0.125).
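
A minimal sketch of the construction described above, applied to the distribution in b):

    from math import ceil, log2

    p = [0.5, 0.25, 0.125, 0.125]          # ordered: p1 >= p2 >= ... >= pm

    F = 0.0
    for pi in p:
        li = ceil(log2(1 / pi))            # codeword length li = ceil(ld 1/pi)
        # Fi rounded off to li bits: take the first li binary digits of Fi
        codeword = ''
        f = F
        for _ in range(li):
            f *= 2
            bit, f = divmod(f, 1)
            codeword += str(int(bit))
        print(pi, li, codeword)            # 0.5 -> 0, 0.25 -> 10, 0.125 -> 110, 111
        F += pi                            # F_{i+1} = F_i + p_i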
Problem 24: Enumerative Coding
Let S be a set of binary sequences of length 13 with 3 ones. What is the sequence for
index 95 in the lower lexicographical ordering of S? Hint: Apply Enumerative decoding.
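
A sketch of enumerative decoding for this problem. It assumes 0-based indexing with '0' < '1'; if the intended ordering starts at index 1, decode index 94 instead:

    from math import comb

    def enumerative_decode(index, n, ones):
        """Return the binary sequence of length n with the given number of ones
        at the given lexicographic index (0-based, '0' < '1')."""
        bits = []
        for pos in range(n):
            remaining = n - pos - 1
            count0 = comb(remaining, ones)    # sequences that continue with '0' here
            if index < count0:
                bits.append('0')
            else:
                bits.append('1')
                index -= count0
                ones -= 1
        return ''.join(bits)

    print(enumerative_decode(95, 13, 3))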


Problem 25: Quantization


Let X denote a source which produces the following values with the given distribution:

x      0.3   0.7   0.5   1.8   1.1   0.45   1.2   0.1
p(x)   1/8   1/8   1/8   1/8   1/8   1/8    1/8   1/8

Assume that the source X is followed by a quantizer which uses four levels of quantization
given as

quantized value    interval
0.45               0 < x ≤ 0.5
0.7                0.5 < x ≤ 1
1.15               1 < x ≤ 1.5
1.8                1.5 < x ≤ 2

Find the entropy of the quantized source.
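
A small sketch that passes the source values through the quantizer and computes the entropy of the quantizer output:

    from math import log2
    from collections import Counter

    values = [0.3, 0.7, 0.5, 1.8, 1.1, 0.45, 1.2, 0.1]          # each with probability 1/8
    levels = [(0.5, 0.45), (1.0, 0.7), (1.5, 1.15), (2.0, 1.8)]  # (upper bound, quantized value)

    def quantize(x):
        for upper, q in levels:
            if x <= upper:
                return q

    counts = Counter(quantize(x) for x in values)
    H = -sum((c / 8) * log2(c / 8) for c in counts.values())
    print(counts, H)     # entropy of the quantized source in bits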


Problem 26: Data Reduction
Recall the chapter on data reduction. Apply the given system to the input:
b b b b
b b b b
g g o o
g g o o

where b represents blue, g represents green and o represents orange. Use the
transform matrix T given as:

T = 1/2 · M, where M is a 4 × 4 matrix with entries ±1 (the signs of the entries are not recoverable from this copy).
Use the same quantization levels given in the slide and show all your steps.
Problem 27: Error Detection
A binary code has block length 6 and is given as:
A: 000000
B: 001111
C: 111100
D: 111111
The information is transmitted over a binary symmetric channel with cross-over probability given as p. Calculate the probability of a detection error for A, B, C, and D.
Problem 28: Data Reduction
Check slide number 17 in the chapter on error detection. Why is the number of 1s in C(x)
even?

Problem 29: Error Detection


The information packet (1 0 1 1) is written as A(x) = 1 + x^2 + x^3. Given that A(x)
divides x^i + 1, what is the smallest i?
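
A small GF(2) check (a sketch): polynomials are represented as bit masks, with bit k corresponding to x^k:

    A = 0b1101                              # A(x) = 1 + x^2 + x^3

    def gf2_mod(num, div):
        """Remainder of num / div with polynomial arithmetic over GF(2)."""
        while num.bit_length() >= div.bit_length():
            num ^= div << (num.bit_length() - div.bit_length())
        return num

    i = 1
    while gf2_mod((1 << i) | 1, A) != 0:    # test whether A(x) divides x^i + 1
        i += 1
    print(i)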
Problem 30: Huffman Code
Consider a random source with statistically independent source symbols q_i, 1 ≤ i ≤ 8.
The distribution of the source is given as follows:
Q      q1     q2     q3     q4     q5     q6      q7      q8
p(q)   0.5    0.1    0.1    0.1    0.1    0.05    0.025   0.025

a) Determine the entropy of the source and compare the result to a source with eight
equally probable symbols. (Hint: ld 10 ≈ 3.32.)
b) Construct an optimal binary prefix-free code for the given source.
c) Determine the average code word length of the constructed code by means of path
length lemma. Compare the result to the entropy.
d) Determine the sequence of code bits for the following sequence of source symbols:
q = [q1 q4 q8 q6 q1 q7 ].
e) Determine the code word length for the given sequence. Compare the result to the
average code word length.
Problem 31: Huffman Code
Let Q denote a source with the following distribution:
Q      q1     q2     q3     q4     q5     q6     q7
p(q)   0.3    0.2    0.1    0.1    0.1    0.1    0.1

a) Construct a binary Huffman code.


b) Determine the sequence of code bits for the following sequence of source symbols:
q = [q2 q1 q4 q1 q1 q3 q7 ].
c) Decode the resulting code bit sequence.
d) Introduce a bit error in the sequence of code bits by flipping the 4th code bit. Decode
the resulting code bit sequence.
Problem 32: Lempel-Ziv Code
A source bit sequence is given as
[00101010011001001100111111100100]
Assume that the codebook is initialized with 0 and 1 and is limited to 16 entries at the
transmitter as well as at the receiver.
Encode this sequence according to the LZ78 Lempel-Ziv algorithm.

The coded bits are transmitted error-free. Recover the original sequence of source
bits back from the sequence of code bits.
Introduce a bit error in the code bit sequence by flipping the 5th code bit. Decode
the resulting (erroneous) code bit sequence.
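
A minimal LZ78-style encoder sketch. The output format (dictionary index plus next source bit) and the handling of the 16-entry limit are assumptions; the variant used in the lecture may differ in these details:

    def lz78_encode(bits, max_entries=16):
        book = {'0': 0, '1': 1}             # codebook initialized with 0 and 1
        output = []                         # list of (index, next_bit) pairs
        phrase = ''
        for b in bits:
            if phrase + b in book:
                phrase += b
            else:
                output.append((book[phrase], b))
                if len(book) < max_entries:
                    book[phrase + b] = len(book)
                phrase = ''
        if phrase:
            output.append((book[phrase], None))   # leftover phrase, no new bit
        return output, book

    pairs, book = lz78_encode('00101010011001001100111111100100')
    print(pairs)
    print(book)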