
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 4, NO. 6, NOVEMBER 2005

Transmission of Adaptive MPEG Video Over Time-Varying Wireless Channels: Modeling and Performance Evaluation
Laura Galluccio, Giacomo Morabito, Member, IEEE, and Giovanni Schembra

Abstract: Wireless channels are characterized by highly time-varying bit-error rates (BERs). To cope with this problem, several
adaptive forward-error-correction (AFEC) schemes have been
proposed in the literature. They work locally at the wireless link,
adding a variable amount of redundancy to the transmitted data in
order to maintain the packet error rate below an acceptable level.
However, when such schemes are utilized, the bandwidth offered to
the applications changes when channel conditions change. In this
paper, the effects of these bandwidth variations are investigated
in the case of real-time Motion Picture Experts Group (MPEG)
video transmission. The MPEG encoder is controlled in order to
adapt its emission rate to the current bandwidth offered by the
wireless link. To this end, the encoding quality is diminished by
the source rate controller when the transmission rate has to be
decreased due to an increase in the channel BER, whereas it is
improved when the transmission rate can be increased due to a
decrease in the channel BER. A Markov-based model, denoted as
SBBP/SBBP/1/K, has been introduced to model the scenario being
considered. The analytical framework allows evaluation of the
performance of the system and can be used to optimize the design
of a video transmission system for wireless channels, providing the
instruments to derive the tradeoff between information corruption
in the wireless channel and MPEG video encoding quality.
Index Terms: Forward error correction (FEC), Motion Picture
Experts Group (MPEG), quality of service (QoS), switched batch
Bernoulli process (SBBP), wireless channels.

I. INTRODUCTION

THE NEED for supporting multimedia applications in
dynamic environments where users are equipped with
wireless terminals is one of the most challenging research
topics today. In fact, it is known that wireless channels are
characterized by bit-error rates (BERs) that are several orders of
magnitude higher than the corresponding values for terrestrial
networks. Accordingly, data packets may arrive at their destination corrupted, thus becoming useless.

Manuscript received August 15, 2003; revised September 3, 2004; accepted September 13, 2004. The editor coordinating the review of this
paper and approving it for publication is V. K. Bhargava. The work of
L. Galluccio and G. Morabito was supported by Ministero dell'Istruzione,
dell'Università e della Ricerca (MIUR) under contract VICOM. The work of
G. Schembra was supported by MIUR under contract TANGO.
The authors are with the Dipartimento di Ingegneria Informatica e delle
Telecomunicazioni (DIIT), University of Catania, 95124 Catania, Italy (e-mail:
laura.galluccio@diit.unict.it; giacomo.morabito@diit.unict.it; schembra@diit.unict.it).
Digital Object Identifier 10.1109/TWC.2005.858028

To overcome this problem, one of the solutions most widely
adopted today is using forward error correction (FEC). FEC
algorithms introduce a chosen amount of redundancy: the
higher the BER, the higher the amount of redundancy introduced. However, in wireless channels, the BER is characterized
by high time variability: There are periods when channel
conditions are good, that is, the BER is low, and periods when
channel conditions are bad, that is, the BER is high. In order to
maintain a high level of resource efficiency while guaranteeing
the information accuracy required by applications, several
adaptive FEC (AFEC) schemes have been introduced in the
recent past [1], [2], [6], [7]. According to these schemes, the
amount of redundancy at any time depends on the channel
conditions, being low if channel conditions are good, and high
if channel conditions are bad. One consequence is that AFEC
schemes cause variations in the bandwidth offered to user
applications, which therefore have to adapt their output rate
accordingly.
This paper focuses on video applications that are destined
to become very common in wireless-communication scenarios.
More specifically, the target of the paper is the definition of
an analytical framework for the design of a real-time Motion
Picture Experts Group (MPEG) video transmission system over
a wireless link that applies AFEC to keep the packet corruption probability acceptable, i.e., below a given threshold. The
MPEG encoder uses a rate controller that adapts the output
rate by appropriately setting the quantizer scale parameter
(QSP) [8], [12], [29] to follow the bandwidth variations, while
maximizing encoding quality and stability. In order to achieve
this target, the rate controller monitors the activity of the frame
that is being encoded, its encoding mode, and the number of
bytes used to encode the previous frames. Then, it chooses the
appropriate QSP in such a way that the transmission buffer at
the sender site never saturates, even during periods with low
available bandwidth. The whole system can be modeled by an
emission process that feeds the transmission buffer. The server
of this buffer behaves according to the channel conditions
estimated by the adaptive error controller: The serving rate
is higher when channel conditions are good and lower when
channel conditions are bad.
Switched batch Bernoulli processes (SBBPs) are used to
model both the MPEG source [4], [15], [17], and the server
process of the transmission buffer that coincides with the time-varying bandwidth available in the wireless channel [20], [24]-[28].
Accordingly, an SBBP/SBBP/1/K model is introduced to
describe the whole system.
The analytical framework proposed in the paper is used to
evaluate the performance in terms of the distortion introduced
by the quantization mechanism in the encoding process, together with
the loss and mean delay in the transmission buffer, at
different target packet error probabilities (PEPs) achieved using
AFEC. Results obtained in the paper can be used to obtain
the best tradeoff between encoding quality, which requires a
high available bandwidth, and information correctness at the
destination, which requires a high level of redundancy, thus
causing bandwidth reduction.

1536-1276/$20.00 © 2005 IEEE

Fig. 1. Mobile terminal system architecture.

The rest of the paper is organized as follows. Section II
describes the wireless MPEG transmission system considered
in this paper. Section III proposes an analytical framework of
the whole video transmission system, accounting for both the
video source and the transmission channel. Section IV provides
a derivation of the performance parameters. Section V applies
the analytical framework to a case study in order to demonstrate the model's capability of providing performance insights
for the system design. Finally, Section VI concludes the paper.
II. DESCRIPTION OF THE SYSTEM
The architecture of the video transmission system in the
mobile terminal considered in this paper is shown in Fig. 1.
The adaptive-rate source is an adaptive-rate MPEG video source
over a User Datagram Protocol (UDP)/IP protocol suite. The
video stream generated by the video source is encoded by
the MPEG encoder according to the MPEG video standard
[30], [31]. In the MPEG encoding standard, the frame, which
corresponds to a single picture in a video sequence, is the basic
displaying unit. Three encoding modes are available for each
frame: intraframes (I), predictive frames (P), and interpolative
frames (B). The basic idea behind MPEG video compression
is to remove spatial redundancy within a video frame and
temporal redundancy between successive video frames. The
encoder output is a deterministic periodic sequence in which the
period is a group of pictures (GoP) made up of three types of
encoded frames.
1) I frames coded using only information present in the
picture itself in order to provide potential random access points in the compressed video sequence. The coding is based on the discrete-cosine transform according
to the Joint Photographic Experts Group (JPEG) coding
technique.
2) P frames coded using a coding algorithm similar to the
one used for I frames, but with the addition of motion
compensation with respect to the previous I or P frame
(forward prediction).
3) B frames coded with motion compensation with respect
to the previous I or P frame, and the next I or P frame, or
an interpolation between them (bidirectional prediction).
Typically, I frames require more bits than P frames, while B
frames have the lowest bandwidth requirement.
In encoding each frame, it is possible to tune the number
of bits needed to represent the frame and, thus, its quality, by
appropriately choosing the so-called QSP. Its value can range
within the set [1, 31]: 1 being the value giving the best encoding
quality but requiring the maximum number of bits to encode the
frame, and 31 being the value giving the worst encoding quality,
but requiring the minimum number of bits.
The QSP can be dynamically changed according to the
feedback law implemented by the rate controller in order to
achieve a given target. The MPEG encoder emits one frame
per frame interval, and its output is packetized in the packetizer according to the UDP/IP protocol suite: the packetizer
fragments the information flow into blocks of $U_P$ bytes$^1$; these
blocks constitute the payloads for the UDP, which adds a header
of 8 bytes; each UDP packet is then put in the payload field of
an IP packet.
The IP packets are then sent to a transmission buffer whose
service rate is time varying and depends on the channel condition estimated by the adaptive error controller, as will be
explained below. The main target of the rate controller is to
avoid buffer saturation, which causes losses and long delays,
while maximizing the encoding quality and stability. To this
end, it chooses the QSP parameter according to a feedback
law monitoring the activity of the frame being encoded, its
encoding mode (I, P, or B), and the current number of packets
in the transmission buffer. The model introduced in the paper
is general enough to be applied with any feedback law.
The feedback law used in the paper was introduced in [4] and
[17] and, for the sake of completeness, will be reported in
Section V-A. It has been defined in such a way that a controlled
number of packets are present in the transmission buffer at the
end of each GoP, while pursuing a constant distortion level
within the GoP. Packets leaving the transmission buffer enter
the adaptive error controller. Its main target is to use FEC to
partially solve the problem of wireless-link unreliability. The
FEC block creator divides packets into sets of k blocks. These
blocks are given as input to the AFEC encoder and encoded
in sets of m blocks, with $m \geq k$. If any set of k or more
blocks belonging to the same packet is received correctly, then
the original packet can be reconstructed properly. Obviously,
the larger the value of m, the higher the probability that the
information can be reconstructed at the receiver station, but the
lower the wireless-link bandwidth available at the video source.
The value of m is chosen by the FEC controller in such a
way that the PEP, i.e., the probability that a packet cannot be
reconstructed at the receiver station, is no higher than a target
value $P^{(C)}_{\mathrm{PEP}}$. Given that wireless channel conditions change
dynamically, AFEC encoding is applied, as proposed in [1],
[2], [6], and [7]. This encoding technique requires knowledge
of the current BER on the link. This estimation is performed
by the wireless channel estimator. The estimated BER value
is given as input to the FEC controller, which evaluates m so
that the requirement on the PEP is satisfied. The value of m
therefore changes in time and, as a consequence, the available
link capacity c(t) also changes in time as

$$c(t) = \frac{k}{m(t)}\, c \tag{1}$$

where c is the capacity (in packets/s) when FEC is not used.
At any time, the service rate of the transmission buffer is set
equal to c(t). Accordingly, both the MPEG encoder output
process and the transmission-buffer service process are stochastic processes, the first depending on the behavior of the source
and the rate controller, and the second on the BER behavior of
the wireless channel. These processes will be modeled with two
discrete-time SBBP processes $Y(n)$ and $\tilde{N}(n)$, respectively, as
described in detail in Section III.

1 If Real-time Transport Protocol (RTP)/RTP Control Protocol (RTCP) protocols
are also used over the UDP/IP protocol suite, the related overhead should be
considered.

III. SYSTEM MODEL
In this section, we derive a discrete-time analytical model of
the system described in the previous section. We will set the
slot duration equal to the video-frame interval.
As a first step, Sections III-B and III-C will describe the models of the noncontrolled MPEG encoder output and the available
capacity of the channel as SBBPs [9]. Then, the whole system
will be modeled as an SBBP/SBBP/1/K queueing system in
Section III-D, where K is the maximum number of packets the
transmission buffer can contain. For the sake of completeness,
Section III-A provides a brief outline of SBBP processes.
A. Switched Batch Bernoulli Processes (SBBPs)
An SBBP Y (n) is a discrete-time emission process
modulated by an underlying Markov chain [9], and represents
a special case of the family of the hidden Markov model
processes [19].
Each state of the Markov chain is characterized by an emission probability density function (pdf): The SBBP emits data
units according to the pdf of the current state of the underlying
Markov chain. Therefore, the SBBP $Y(n)$ is fully described
by the state space $\Im^{(Y)}$ of the underlying Markov chain, the
maximum number of data units the SBBP can emit in one
slot, $r^{(Y)}_{\mathrm{MAX}}$, and the matrix set $(Q^{(Y)}, B^{(Y)})$, where $Q^{(Y)}$
is the transition probability matrix of the underlying Markov
chain, while $B^{(Y)}$ is the emission probability matrix whose
rows contain the emission pdfs for each state of the underlying
Markov chain.
If we indicate the state of the underlying Markov chain in the
generic slot n as $S^{(Y)}(n)$, the generic elements of the matrices
$Q^{(Y)}$ and $B^{(Y)}$ are defined as follows:


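$$Q^{(Y)}_{[s'_Y, s''_Y]} = \mathrm{Prob}\left\{ S^{(Y)}(n+1) = s''_Y \mid S^{(Y)}(n) = s'_Y \right\} \quad \forall s'_Y, s''_Y \in \Im^{(Y)} \tag{2}$$

$$B^{(Y)}_{[s'_Y, r]} = \mathrm{Prob}\left\{ Y(n) = r \mid S^{(Y)}(n) = s'_Y \right\} \quad \forall s'_Y \in \Im^{(Y)},\ \forall r \in \left[0, r^{(Y)}_{\mathrm{MAX}}\right]. \tag{3}$$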
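
As an illustration of definitions (2) and (3), the following minimal sketch samples an emission trace from an SBBP given its transition matrix and emission matrix. The function name `simulate_sbbp`, the two-state matrices, and the choice of emitting before moving the chain are assumptions made for the example; none of the numerical values come from the paper.

```python
import random

def simulate_sbbp(Q, B, n_slots, state=0, seed=0):
    """Sample an SBBP trace.

    Q[s][t] is the one-slot transition probability from state s to state t,
    B[s][r] is the probability of emitting r data units while in state s.
    """
    rng = random.Random(seed)
    emissions = []
    for _ in range(n_slots):
        # Emit according to the pdf of the current state (row s of B) ...
        r = rng.choices(range(len(B[state])), weights=B[state])[0]
        emissions.append(r)
        # ... then advance the underlying Markov chain by one slot (row s of Q).
        state = rng.choices(range(len(Q[state])), weights=Q[state])[0]
    return emissions

# Hypothetical two-state SBBP (illustrative values only).
Q = [[0.9, 0.1],
     [0.2, 0.8]]
B = [[0.5, 0.5, 0.0, 0.0],   # state 0 emits 0 or 1 data units
     [0.0, 0.0, 0.3, 0.7]]   # state 1 emits 2 or 3 data units
trace = simulate_sbbp(Q, B, n_slots=1000)
```

Each row of `B` plays the role of the emission pdf of one state, so the maximum emission per slot is the number of columns minus one.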

We will introduce an extension to the meaning of the
SBBP to model not only a source emission process, but also
a video-sequence activity process, and an available wireless-channel-capacity process. In the latter cases, we will indicate them as an activity SBBP and a transmission-channel
SBBP, respectively, and their matrices $B^{(Y)}$ as the activity
probability matrix and the channel-transmission probability
matrix, respectively.
B. Noncontrolled MPEG Source Model
The noncontrolled MPEG video source is part of the
adaptive-rate source shown in Fig. 1 comprising the video
source, the MPEG encoder, and the packetizer. We denote it as
noncontrolled because we are assuming it works with a constant
QSP q not controlled by the rate controller.
The first step in modeling the whole video transmission
system shown in Fig. 1 is the derivation of the SBBP process
Yq (n), modeling the emission of the noncontrolled MPEG
video source at the packetizer output for each QSP q.
This model was calculated by the authors in [4] and [17].
Here, for the sake of brevity, we will refer to those works
in order to define the notation. The model captures two different components: the activity-process behavior and the activity/emission relationships. As input, it takes the first- and
second-order statistics of the activity process, and the three
functions, one for each encoding mode (I, P, or B), characterizing the activity/emission relationships. The state of the
underlying Markov process of $Y_q(n)$ is a double variable,
$S^{(Y)}(n) = (S^{(G)}(n), S^{(F)}(n))$, where $S^{(G)}(n) \in \Im^{(G)}$ is the
state of the underlying Markov chain of the activity process
$G(n)$, and $S^{(F)}(n) \in J$ is the frame to be encoded in the GoP at
the slot n. The state set $\Im^{(G)}$ represents the set of activity levels
to be captured. For example, according to [5], we have $\Im^{(G)} =$
{Very Low, Low, High, Very High}. Set J, on the other hand,
represents the set of frames in the GoP and depends on the GoP
structure. For example, if the movie is encoded with the GoP
structure IBBPBB, set J is defined as J = {I, B, B, P, B, B}.
As demonstrated in [4] and [17], the underlying Markov
chain of $Y_q(n)$ is independent of q. Therefore, we will indicate
its transition probability matrix as $Q^{(Y)}$ instead of $Q^{(Y_q)}$,
and set $(Q^{(Y)}, B^{(Y_q)})$, for each $q \in [1, 31]$, defines the SBBP
emission process modeling the output flow of the noncontrolled
MPEG encoder, when it uses a constant QSP value q.
C. Service SBBP Model
The target of this section is to derive the SBBP model of
the process $\tilde{N}(n)$, which represents the service process of the
transmission buffer when AFEC is employed. As stated above,
it closely depends on the amount of redundancy the AFEC
encoder introduces to achieve the target maximum PEP $P^{(C)}_{\mathrm{PEP}}$
due to the wireless channel.
As usual (e.g., [14], [24], and [26]), we assume that the
channel behavior can be described by means of an M-state
Markov process. Accordingly, channel statistical behavior can
be described by an $M \times M$ transition probability matrix $Q^{(C)}$
and by $\mathrm{BER}_i$, the BER for each state $i \in [1, M]$ of the process.
Thus, the service SBBP model is represented by the following parameters:
1) the maximum number of packets that can be transmitted
in a time slot, $r^{(\tilde{N})}_{\mathrm{MAX}}$;
2) the state space $\Im^{(\tilde{N})}$;
3) the matrix set $(Q^{(\tilde{N})}, B^{(\tilde{N})})$ containing the transition
probability matrix and the channel-emission probability
matrix.

Obviously, the transition probability matrix $Q^{(\tilde{N})}$ of the underlying
Markov chain of the process $\tilde{N}(n)$ coincides with
the channel-transition probability matrix $Q^{(C)}$, as calculated
in [26]. The state space $\Im^{(\tilde{N})}$ coincides with the channel state
space, i.e., $\Im^{(\tilde{N})} = [1, M]$. Instead, in order to derive $B^{(\tilde{N})}$,
we have to calculate the bandwidth reduction due to the AFEC
redundancy for each state i of the channel SBBP. This depends
on the BER characterizing the state, $\mathrm{BER}_i$.
The FEC redundancy to be introduced to achieve the target
value for the maximum PEP $P^{(C)}_{\mathrm{PEP}}$ should be such that the
resulting PEP for any state i of the channel, $P^{(C)}_{\mathrm{PEP},i}$, is lower
than or equal to the target one, i.e.,

$$P^{(C)}_{\mathrm{PEP},i} \leq P^{(C)}_{\mathrm{PEP}}. \tag{4}$$


According to the notation introduced in Section II, indicating
the size of each block expressed in bits as R, and assuming
that losses introduced by the wireless channel are independent
and uniformly distributed within a block,$^2$ the PEP, when the
channel is in the generic state i, can be calculated as follows:

$$P^{(C)}_{\mathrm{PEP},i} = \sum_{l=m-k+1}^{m} \binom{m}{l} \left(1 - P_{\mathrm{BEP},i}\right)^{m-l} \left(P_{\mathrm{BEP},i}\right)^{l} \tag{5}$$

where $P_{\mathrm{BEP},i}$ represents the probability that a block is corrupted when the channel is in state i, and can be evaluated as
follows:

$$P_{\mathrm{BEP},i} = 1 - \left(1 - \mathrm{BER}_i\right)^{R}. \tag{6}$$
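
Equations (4)-(6) can be evaluated numerically with a few lines of code. The sketch below is only one possible way to do it: the function names and the search cap `m_limit` are hypothetical, and the search simply increases m from k until inequality (4) holds.

```python
from math import comb

def block_error_prob(ber, block_bits):
    """Eq. (6): probability that a block of block_bits bits is corrupted."""
    return 1.0 - (1.0 - ber) ** block_bits

def packet_error_prob(m, k, p_block):
    """Eq. (5): more than m - k of the m encoded blocks are corrupted."""
    return sum(comb(m, l) * (1.0 - p_block) ** (m - l) * p_block ** l
               for l in range(m - k + 1, m + 1))

def min_encoded_blocks(k, ber, block_bits, target_pep, m_limit=4096):
    """Smallest m >= k satisfying (4); m_limit is a hypothetical search cap."""
    p_block = block_error_prob(ber, block_bits)
    for m in range(k, m_limit + 1):
        if packet_error_prob(m, k, p_block) <= target_pep:
            return m
    raise ValueError("target PEP not reachable within m_limit")
```

For instance, `min_encoded_blocks(16, 1e-3, 320, 1e-3)` searches for the redundancy needed by 16 blocks of 320 bits at a BER of $10^{-3}$; these input values are illustrative choices, not results reported in the paper.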

Now, substituting (5) in (4), we can numerically find the
minimum value of m verifying the inequality in (4) for each
value i of the channel state. Let us indicate this value as $m_i$.
Accordingly, the capacity $\tilde{N}_i$ (in packets/s), which is actually
available for the transmission of data to obtain a PEP lower
than $P^{(C)}_{\mathrm{PEP}}$ in the wireless channel when its state is i, can be
calculated as follows:

$$\tilde{N}_i = \frac{k}{m_i}\, c \tag{7}$$

where c is the channel capacity when no FEC encoding is
applied (in packets/s).
In general, from (7), we obtain a noninteger value for $\tilde{N}_i$.
However, we can assume that, when the channel state is i, in
each slot, the channel is able to transmit either $D_i = \lfloor \tilde{N}_i \rfloor$ packets
with a probability of $p_{D_i} = 1 - (\tilde{N}_i - D_i)$, or $(D_i + 1)$
packets with a probability of $p_{D_i+1} = 1 - p_{D_i}$, where we have
indicated the largest integer no greater than x as $\lfloor x \rfloor$.
2 This assumption is accurate if interleaving is utilized, which is usual in
wireless communications [28].
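
The split of a noninteger capacity between $D_i$ and $D_i + 1$ packets can be sketched as follows. The function name `service_emission_matrix` is hypothetical, and the sketch assumes the per-state capacities have already been expressed in packets per slot.

```python
from math import floor

def service_emission_matrix(capacities):
    """Build one emission row per channel state from its capacity.

    capacities[i] is the capacity of state i in packets per slot (an
    assumption of this sketch). Row i puts probability 1 - frac on
    D_i = floor(capacity) packets and frac on D_i + 1 packets.
    """
    D = [floor(n) for n in capacities]
    r_max = max(d + 1 for d in D)          # largest number served in one slot
    B = []
    for n_i, d_i in zip(capacities, D):
        row = [0.0] * (r_max + 1)          # columns d = 0 .. r_max
        row[d_i] = 1.0 - (n_i - d_i)       # probability of serving D_i packets
        row[d_i + 1] = n_i - d_i           # probability of serving D_i + 1
        B.append(row)
    return B

B_service = service_emission_matrix([2.25, 0.5])
```

With this construction the mean number of packets served in state i equals the capacity of that state, which is the property the two-point split is meant to preserve.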


In summary, the emission probability matrix of the SBBP
modeling the channel is the $M \times r^{(\tilde{N})}_{\mathrm{MAX}}$ matrix $B^{(\tilde{N})}$, and its generic
element can be calculated as follows:

$$B^{(\tilde{N})}_{[i,d]} = \begin{cases} p_{D_i}, & \text{if } d = D_i \\ p_{D_i+1}, & \text{if } d = D_i + 1 \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

where $r^{(\tilde{N})}_{\mathrm{MAX}}$ is the maximum number of packets that can be
transmitted in one slot, i.e.,

$$r^{(\tilde{N})}_{\mathrm{MAX}} = \max_i \{D_i + 1\}. \tag{9}$$

The transition probability matrix and the state space, together
with the channel-emission probability matrix and the maximum
number of packets that can be transmitted in one slot defined in
(8) and (9), completely characterize the channel SBBP model.

D. Video-Transmission-System Model
The adaptive-rate source pursues a given target by implementing a feedback law in the rate controller, which calculates
the value q of the QSP to be used by the MPEG encoder for
each frame. The target of this section is to model the video
transmission system as a whole, indicated here as $\Sigma$. To this
aim, we use a discrete-time queueing system model.
Let K represent the maximum number of packets that can
be contained in the queue of the transmission buffer and its
server. The server capacity of this queueing system, that is, the
number of packets that can leave the queue at each time slot,
is a stochastic process that has been modeled with the channel
SBBP process $\tilde{N}(n)$.
The input of the queue system is the emission process of
the adaptive-rate source, indicated here as $Y(n)$. Therefore, at
slot n, the transmission-buffer queue size is incremented by
$Y(n)$, and decremented by $\tilde{N}(n)$. Both the input and the output
processes can be modeled by means of two SBBP processes,
as discussed above, where the slot duration is the frame
duration.
To model the queueing system, we assume a late-arrival-system-with-immediate-access time diagram [3], [11]: Packets
arrive in batches, and can enter the service facility if it is
free, with the possibility of them being ejected almost instantaneously. Note that in this model, a packet service time is
counted as the number of slot boundaries from the point of entry
to the service facility up to the packet departure time. Therefore,
even though we allow the arriving packet to be ejected almost
instantaneously, its service time is counted as 1, not 0.
A complete description of $\Sigma$ at the nth slot requires a
three-dimensional Markov process, whose state is defined as
$S^{(\Sigma)}(n) = (S^{(Q)}(n), S^{(\tilde{N})}(n), S^{(Y)}(n))$, where:
1) $S^{(Q)}(n) \in [0, K]$ is the transmission-buffer queue state
in the nth slot, i.e., the number of packets in the queue
and in the service facility at the observation instant;
2) $S^{(\tilde{N})}(n)$ is the state of the underlying Markov chain of
the channel SBBP $\tilde{N}(n)$;
3) $S^{(Y)}(n)$ is the state of the underlying Markov chain of
$Y(n)$, which coincides with that of $Y_q(n)$, for any $q \in [1, 31]$.
According to the late-arrival-system-with-immediate-access
time diagram, the transmission-buffer state in slot (n + 1) can
be obtained through the Lindley equation [13]

$$s''_Q = \max\left\{ \min\left\{ s'_Q + r, K \right\} - d,\ 0 \right\} \tag{10}$$

where $s'_Q$ is the transmission-buffer state in the generic slot n,
while r and d are the number of arrivals and the server capacity
at slot n + 1, respectively.
The channel SBBP $\tilde{N}(n)$, modeled in Section III-C, can
be equivalently characterized through the set of transition
probability matrices $M^{(\tilde{N})}(d)$, which are transition probability
matrices including the probability that the server capacity is
d (in packets/slot). These matrices can be obtained from the
parameter set $(Q^{(\tilde{N})}, B^{(\tilde{N})})$ as follows:

$$\left[M^{(\tilde{N})}(d)\right]_{[s'_N, s''_N]} = \mathrm{Prob}\left\{ \tilde{N}(n+1) = d,\ S^{(\tilde{N})}(n+1) = s''_N \mid S^{(\tilde{N})}(n) = s'_N \right\} = Q^{(\tilde{N})}_{[s'_N, s''_N]}\, B^{(\tilde{N})}_{[s''_N, d]} \quad \forall d \in \left[0, r^{(\tilde{N})}_{\mathrm{MAX}}\right]. \tag{11}$$
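
The queue recursion (10) and the decomposition (11) can be sketched in a few lines. The function names are hypothetical, and the second function assumes that the capacity offered in slot n + 1 is drawn from the pdf of the state reached in that slot, so the emission matrix is indexed by the arrival state.

```python
def lindley_step(s_q, arrivals, served, K):
    """Eq. (10): arrivals are clipped at K, then up to `served` packets leave."""
    return max(min(s_q + arrivals, K) - served, 0)

def service_matrices(Q, B):
    """Eq. (11): one matrix per capacity value d.

    M[d][s1][s2] = Q[s1][s2] * B[s2][d], under the assumption that the
    capacity in the next slot follows the pdf of the destination state s2.
    """
    n_states, n_cols = len(Q), len(B[0])
    return [[[Q[s1][s2] * B[s2][d] for s2 in range(n_states)]
             for s1 in range(n_states)]
            for d in range(n_cols)]

# Hypothetical two-state channel (illustrative values only).
Q_ch = [[0.9, 0.1],
        [0.2, 0.8]]
B_ch = [[0.0, 0.0, 0.75, 0.25],
        [0.5, 0.5, 0.0, 0.0]]
M = service_matrices(Q_ch, B_ch)
```

Summing the matrices in `M` over all values of d gives back `Q_ch`, because each row of `B_ch` sums to one; this is a convenient sanity check on any implementation of (11).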

The adaptive-rate source emission process is modeled by
an SBBP whose emission probability matrix depends on the
transmission-buffer state. In order to model this process, we
use the SBBP models of the noncontrolled MPEG video source
described in Section III-B, $Y_q(n)$, for each $q \in [1, 31]$. So, we
have a parameter set $(Q^{(Y)}, B^{(Y_1)}, B^{(Y_2)}, \ldots, B^{(Y_{31})})$, which
represents an SBBP whose transition matrix is $Q^{(Y)}$, and
whose emission process is characterized by a set of emission
matrices $\{B^{(Y_q)}\}_{q=1,2,\ldots,31}$. Consequently, at each time slot,
the emission of the MPEG video source is characterized by an
emission probability matrix chosen according to the QSP value
defined by the feedback law $q = \phi(s_Q, a, j)$.
More concisely, as in (11) for the channel SBBP, we
characterize the emission process of the adaptive-rate source
through the set of matrices $\{M^{(Y)}_{s_Q}(r)\}$, $r \in [0, r^{(Y)}_{\mathrm{MAX}}]$, each
matrix representing the transition probability matrix including
the probability of r packets being emitted when the buffer
state is $s_Q$. Accordingly, the generic element of the matrix
$M^{(Y)}_{s'_Q}(r)$ can be obtained from the above parameter set
$(Q^{(Y)}, B^{(Y_1)}, B^{(Y_2)}, \ldots, B^{(Y_{31})})$ as follows:

$$\left[M^{(Y)}_{s'_Q}(r)\right]_{[(i',j'),(i'',j'')]} = Q^{(Y)}_{[(i',j'),(i'',j'')]} \sum_{a'' \in \Im^{(\mathrm{Act})}} B^{(Y_{q''})}_{[(i'',j''),r]}\, f_{\mathrm{Act}}(a'' \mid i'', j'') \tag{12}$$


where the following hold:
1) $q''$ is the QSP chosen when the frame to be encoded is the
$j''$th in the GoP, the activity is $a''$, and the transmission-buffer state before encoding this frame is $s'_Q$. The value
of $q''$ is determined by the feedback law $\phi(\cdot)$, i.e.,

$$q'' = \phi\left(s'_Q, a'', j''\right). \tag{13}$$

2) $f_{\mathrm{Act}}(a'' \mid i'', j'')$ is the probability that the generic frame $j''$
in the GoP has an activity $a''$ when its activity level is $i''$.
This function, as demonstrated in [15], [16], and [21], is a
Gamma pdf, whose mean value and variance characterize
the video trace.
3) $\Im^{(\mathrm{Act})}$ is the set of all the possible activities.
Finally, we can model the video transmission system as a
whole. If we indicate two generic states of the system as $s'_\Sigma =
(s'_Q, s'_N, s'_Y)$ and $s''_\Sigma = (s''_Q, s''_N, s''_Y)$, the generic element of
the transition matrix of the video transmission system as a
whole, $Q^{(\Sigma)}$, can be calculated, due to (11) and (12), as follows:

$$Q^{(\Sigma)}_{[(s'_Q,s'_N,s'_Y),(s''_Q,s''_N,s''_Y)]} = \mathrm{Prob}\left\{ S^{(Q)}(n+1) = s''_Q,\ S^{(\tilde{N})}(n+1) = s''_N,\ S^{(Y)}(n+1) = s''_Y \mid S^{(Q)}(n) = s'_Q,\ S^{(\tilde{N})}(n) = s'_N,\ S^{(Y)}(n) = s'_Y \right\}$$
$$= \sum_{d=0}^{d_{\mathrm{MAX}}} \sum_{r=0}^{r_{\mathrm{MAX}}} \left[M^{(\tilde{N})}(d)\right]_{[s'_N,s''_N]} \left[M^{(Y)}_{s'_Q}(r)\right]_{[s'_Y,s''_Y]} \varphi\left(s'_Q, s''_Q, K, r, d\right) \tag{14}$$

where $\varphi(s'_Q, s''_Q, K, r, d)$ is a Boolean condition for the queue
state behavior, and is defined as follows:

$$\varphi\left(s'_Q, s''_Q, K, r, d\right) = \begin{cases} 1, & \text{if } \max\left\{\min\left\{s'_Q + r, K\right\} - d,\ 0\right\} = s''_Q \\ 0, & \text{otherwise.} \end{cases} \tag{15}$$

Once the matrix $Q^{(\Sigma)}$ is known, we can calculate the steady-state probability array of the system $\Sigma$ as the solution of the
following linear system

$$\begin{cases} \underline{\pi}^{(\Sigma)} Q^{(\Sigma)} = \underline{\pi}^{(\Sigma)} \\ \underline{\pi}^{(\Sigma)}\, \underline{1} = 1 \end{cases} \tag{16}$$

where $\underline{1}$ is a column array whose elements are equal to 1,
and $\underline{\pi}^{(\Sigma)}$ is the steady-state probability array, whose generic
element is

$$\pi^{(\Sigma)}_{[(s_Q,s_N,s_Y)]} = \mathrm{Prob}\left\{ S^{(Q)}(n) = s_Q,\ S^{(\tilde{N})}(n) = s_N,\ S^{(Y)}(n) = s_Y \right\}. \tag{17}$$

A direct solution of the system in (16) may be difficult
since the number of states grows explosively as the maximum
transmission buffer size K increases. Nevertheless, many algorithms, e.g., [10], [18], and [23], enable us to calculate the array
$\underline{\pi}^{(\Sigma)}$, while maintaining a linear dependence on K.

IV. QUANTIZATION-DISTORTION ANALYSIS
In this section, we evaluate both the static and time-varying
statistics of the quantization distortion, represented by the process PSNR(n).
More specifically, we will quantize the PSNR process with
a set of L different levels of distortion, $\{\psi_1, \psi_2, \ldots, \psi_L\}$,
each representing an interval of distortion values where the
quality perceived by the users can be considered constant.
As an example, for the movie "Evita," from a subjective
analysis obtained with 300 tests, the following L = 5 levels of distortion were envisaged: $\psi_1 = [31.2, 34.2]$ dB, $\psi_2 =
[34.2, 35.0]$ dB, $\psi_3 = [35.0, 36.2]$ dB, $\psi_4 = [36.2, 38.4]$ dB,
and $\psi_5 = [38.4, 52.1]$ dB.
The pdf $f_{\mathrm{PSNR}}(p)$ can be easily calculated from the transition probability matrix and the steady-state probability array of
the whole system $\Sigma$, which have been derived in (14) and (16),
respectively:

$$f_{\mathrm{PSNR}}(p) \equiv \mathrm{Prob}\{\mathrm{PSNR}(n) = p\} = \sum_{s'_Q=0}^{K} \sum_{s'_N \in \Im^{(\tilde{N})}} \sum_{i' \in \Im^{(G)}} \sum_{j' \in J} \sum_{s''_Q=0}^{K} \sum_{s''_N \in \Im^{(\tilde{N})}} \sum_{i'' \in \Im^{(G)}} \sum_{a'' \in \Im^{(\mathrm{Act})}} \pi^{(\Sigma)}_{[(s'_Q,s'_N,(i',j'))]}\, Q^{(\Sigma)}_{[(s'_Q,s'_N,(i',j')),(s''_Q,s''_N,(i'',j''))]}\, f_{\mathrm{Act}}(a'' \mid i'', j'')\, \chi_{[s'_Q,a'',j'']}(p) \tag{18}$$

(with $j''$ the frame that follows $j'$ in the GoP), where the
following hold:
1) $\chi_{[s'_Q,a'',j'']}(p)$ is a Boolean condition defined as follows

$$\chi_{[s'_Q,a'',j'']}(p) = \begin{cases} 1, & \text{if } F^{(j'')}\left(\phi\left(s'_Q, a'', j''\right)\right) = p \\ 0, & \text{otherwise.} \end{cases} \tag{19}$$

2) $F^{(j)}(q)$ in (19) is the so-called distortion curve [5], [17],
[22] for the generic frame j, which is the curve linking
the average PSNR to the QSP value, q, used to encode
the frame.
Now, in order to calculate the statistics of the quantized PSNR
process, let us define the array $\underline{\theta}^{(\varsigma)}$ in which the generic
element $\theta^{(\varsigma)}_l$, for each $l \in [1, L]$, is the QSP range giving a
distortion belonging to the lth level for a frame encoded with
encoding mode $\varsigma \in \{I, P, B\}$. Of course, by so doing, we
assume that a variation of q within the interval $\theta^{(\varsigma)}_l$ does not
cause any appreciable distortion. From the distortion curves
for the movie "Evita," we have calculated the following QSP
ranges corresponding to the above distortion levels $\psi_l$, for each
$l \in [1, 5]$.
1) For I frames: $\underline{\theta}^{(I)}$ = [[16, 31], [13, 15], [10, 12], [6, 9],
[1, 5]].
2) For P frames: $\underline{\theta}^{(P)}$ = [[15, 31], [13, 14], [10, 12], [6, 9],
[1, 5]].
3) For B frames: $\underline{\theta}^{(B)}$ = [[17, 31], [14, 16], [11, 13], [7, 10],
[1, 6]].

Let $q = \phi(s'_Q, a'', j'')$ be the feedback law, linking the
transmission-buffer state at the beginning of a generic slot
n, $s'_Q \in [0, K]$, the activity of the frame in the same slot,
$a'' \in \Im^{(G)}$, and the position in the GoP of the frame to
be encoded, $j'' \in J$, to the QSP to be used to encode the
current frame. Moreover, for each $a''$ and $j''$, let $\Omega^{(a'',j'')}_l =
\{s'_Q \text{ such that } \phi(s'_Q, a'', j'') \in \theta^{(\varsigma)}_l\}$ be the range of values
of the transmission-buffer state for which the rate controller
chooses QSP values belonging to the level $\psi_l$, according to the
adopted feedback law. By definition, it follows that a variation
of the transmission-buffer state within $\Omega^{(a'',j'')}_l$ does not cause
any appreciable distortion variation.
Let us now calculate the probability that the value of the
process PSNR(n) is in the generic interval $\psi_l$, $\pi^{(\mathrm{PSNR})}_{[l]}$, and
the pdf $f_{\tau_l}(m)$ of the stochastic variable $\tau_l$, representing the
duration of the time the process PSNR(n) remains in the
generic interval $\psi_l$ without interruption. They are defined as

$$\pi^{(\mathrm{PSNR})}_{[l]} = \mathrm{Prob}\{\mathrm{PSNR}(n) \in \psi_l\} \tag{20}$$

$$f_{\tau_l}(m) = \mathrm{Prob}\left\{ \mathrm{PSNR}(n+1) \in \psi_l, \ldots, \mathrm{PSNR}(n+m-1) \in \psi_l,\ \mathrm{PSNR}(n+m) \notin \psi_l \mid \mathrm{PSNR}(n-1) \notin \psi_l,\ \mathrm{PSNR}(n) \in \psi_l \right\}. \tag{21}$$

The term $\pi^{(\mathrm{PSNR})}_{[l]}$ in (20) can be calculated from the pdf $f_{\mathrm{PSNR}}(p)$ obtained in (18)
as follows:

$$\pi^{(\mathrm{PSNR})}_{[l]} = \sum_{p \in \psi_l} f_{\mathrm{PSNR}}(p). \tag{22}$$

In order to calculate the pdf $f_{\tau_l}(m)$ in (21), let us indicate
the matrix containing the one-slot probabilities of transition
towards system states in which the distortion level is $\psi_l$ as
$Q^{(\Sigma)}_{\psi_l}$. It can be obtained from the transition probability matrix
of the system $\Sigma$, $Q^{(\Sigma)}$, as

$$\left[Q^{(\Sigma)}_{\psi_l}\right]_{[(s'_Q,s'_N,(i',j')),(s''_Q,s''_N,(i'',j''))]} = \sum_{a'' \in \Im^{(\mathrm{Act})}} \begin{cases} Q^{(\Sigma)}_{[(s'_Q,s'_N,(i',j')),(s''_Q,s''_N,(i'',j''))]}\, f_{\mathrm{Act}}(a'' \mid i'', j''), & \text{if } s'_Q \in \Omega^{(a'',j'')}_l \\ 0, & \text{otherwise.} \end{cases} \tag{23}$$

Therefore, the pdf $f_{\tau_l}(m)$ can be calculated as the probability
that the system $\Sigma$, starting from a distortion level $\psi_l$, remains in
the same level for (m − 1) consecutive slots, and leaves this
level at the mth slot, that is

$$f_{\tau_l}(m) = \underline{\pi}^{(1,\psi_l)} \left(Q^{(\Sigma)}_{\psi_l}\right)^{m-1} Q^{(\Sigma)}_{(\neq\psi_l)}\, \underline{1}^T. \tag{24}$$

The array $\underline{\pi}^{(1,\psi_l)}$ in (24) is the steady-state probability array in
the first slot of a period in which the distortion level is $\psi_l$. The
array $\underline{\pi}^{(\Sigma,\neq\psi_l)}$, on the other hand, is the steady-state probability
array in a generic slot in which the distortion level is other than
$\psi_l$, and is defined as

$$\underline{\pi}^{(1,\psi_l)} = \frac{\underline{\pi}^{(\Sigma,\neq\psi_l)}\, Q^{(\Sigma)}_{\psi_l}}{\underline{\pi}^{(\Sigma,\neq\psi_l)}\, Q^{(\Sigma)}_{\psi_l}\, \underline{1}^T}, \qquad \underline{\pi}^{(\Sigma,\neq\psi_l)} = \frac{\underline{\pi}^{(\Sigma)}\, Q^{(\Sigma)}_{(\neq\psi_l)}}{\underline{\pi}^{(\Sigma)}\, Q^{(\Sigma)}_{(\neq\psi_l)}\, \underline{1}^T}. \tag{25}$$

V. CASE STUDY
A. System Characterization
We analyzed the statistical characteristics of 1 hour of MPEG
video sequences of the movie "Evita." To encode this movie, we
used a frame rate of F = 25 frames/s, and a frame size of 180
macroblocks. The GoP structure IBBPBB was used, selecting a
ratio of total frames to intraframes of $G_I = 6$, and the distance
between two successive P frames or between the last P frame in
the GoP and the I frame in the next GoP as $G_P = 3$. The size
of the transmission buffer has been set to K = 60 packets. The
gross link capacity assigned to the video application is 2 Mb/s.
The IP packets at the wireless terminal are divided into 40-byte
blocks, as usual in the universal mobile telecommunications
system (UMTS) environment. The AFEC module encodes sets
of k = 16 blocks into sets of m blocks.
In this case study, we use the eight-state finite-state Markov
channel (FSMC) model introduced in [26] for the wireless
channel and consider two different cases.
1) Pedestrian: The mobile user's velocity is 5 km/h.
2) Driver: The mobile user's velocity is 55 km/h.
Assuming that wireless transmission is performed in the 2-GHz
band, which is the value used in UMTS, the maximum Doppler
frequency is $f_m = 10$ Hz in the first case and $f_m = 100$ Hz in
the second.
The values that characterize $Q^{(C)}$ are given in Table I for the
pedestrian and driver cases. The above matrices were calculated

2784

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 4, NO. 6, NOVEMBER 2005

TABLE I
Q(C) PARAMETERS IN THE PEDESTRIAN CASE (fm = 10 Hz) AND DRIVER CASE (fm = 100 Hz)

TABLE II
REDUNDANCY BLOCKS AND NET LINK CAPACITY OFFERED TO THE APPLICATION FOR DIFFERENT CHANNEL STATES AND TARGET ERROR
(C)
PROBABILITIES PPEP IN THE DRIVER CASE, WHEN THE GROSS LINK CAPACITY IS c = 2 Mb/s

assuming that the video-frame rate is 25 frames/s and therefore,


the slot duration is = 40 ms.
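The Doppler frequencies and the slot duration quoted above can be checked with a few lines of arithmetic. The sketch below assumes the classical relation f_m = v f_c / c for the maximum Doppler shift, which the text does not state explicitly; the function and constant names are illustrative only.

```python
# Sanity check of the case-study parameters (a sketch, not the authors' code).
# Assumption: maximum Doppler shift f_m = v * f_c / c.
SPEED_OF_LIGHT_M_S = 299_792_458.0
CARRIER_HZ = 2.0e9        # 2-GHz band, as stated in the text
FRAME_RATE_FPS = 25.0     # video frame rate, as stated in the text


def max_doppler_hz(speed_kmh, carrier_hz=CARRIER_HZ):
    """Maximum Doppler shift in Hz for a terminal moving at speed_kmh."""
    speed_m_s = speed_kmh / 3.6
    return speed_m_s * carrier_hz / SPEED_OF_LIGHT_M_S


pedestrian_fm = max_doppler_hz(5.0)     # close to the 10 Hz used in the paper
driver_fm = max_doppler_hz(55.0)        # close to the 100 Hz used in the paper
slot_duration_s = 1.0 / FRAME_RATE_FPS  # one slot per video frame

print(f"pedestrian f_m = {pedestrian_fm:.2f} Hz")
print(f"driver     f_m = {driver_fm:.2f} Hz")
print(f"slot duration  = {slot_duration_s * 1000:.0f} ms")
```

Under this assumption, the two velocities give roughly 9.3 Hz and 102 Hz, which is consistent with the rounded values of 10 Hz and 100 Hz used in the case study.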
The target error-probability values considered are $P^{(C)}_{PEP} = 10^{-5}$, $P^{(C)}_{PEP} = 10^{-4}$, $P^{(C)}_{PEP} = 10^{-3}$, and $P^{(C)}_{PEP} = 10^{-2}$. Table II lists, for each state i of the server SBBP model, the values of $m_i$ and the resulting available link capacities $c_i$ for these $P^{(C)}_{PEP}$ values in the driver case, taken as an example.
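To make the relation between redundancy and available capacity concrete, the following sketch assumes that the net capacity scales with the code rate, that is, c_i = c k / m_i. This relation is an assumption made here for illustration, and the m_i values are hypothetical placeholders rather than the entries of Table II.

```python
# Net link capacity under AFEC (illustrative sketch).
# Assumption: c_i = c * k / m_i, i.e. the net capacity scales with the code rate.
GROSS_CAPACITY_BPS = 2_000_000   # c = 2 Mb/s, from the text
K_BLOCKS = 16                    # k = 16 information blocks, from the text


def net_capacity_bps(m_blocks, gross_bps=GROSS_CAPACITY_BPS, k_blocks=K_BLOCKS):
    """Capacity left to the application when k blocks are encoded into m blocks."""
    if m_blocks < k_blocks:
        raise ValueError("m must be at least k: redundancy cannot be negative")
    return gross_bps * k_blocks / m_blocks


# Hypothetical redundancy choices, ordered from a good to a bad channel state.
HYPOTHETICAL_M = [16, 18, 20, 24, 32]

for m in HYPOTHETICAL_M:
    print(f"m = {m:2d} -> net capacity = {net_capacity_bps(m) / 1e6:.3f} Mb/s")
```

With no redundancy (m = k) the application sees the full 2 Mb/s, while doubling the number of transmitted blocks (m = 2k) halves the capacity offered to the video source.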
In this case study, we will consider a feedback law obtained from the statistics of the movie Evita, expressed in terms of rate and distortion curves [5], [17], [23]. The rate curves $R_{a,j}(q)$ give the expected number of packets which will be emitted when the jth frame in the GoP has to be encoded, if its activity value is a and it is encoded with a QSP value q. The distortion curves $F^{(j)}(q)$ give the expected encoding PSNR, and have been defined in Section IV. The rate and the distortion curves for the movie Evita are shown in Fig. 2.
The considered feedback law aims to maintain the number of packets in the transmission-buffer queue lower than a given threshold K at the end of each GoP interval, while keeping the PSNR stable during the whole GoP. In this case, both the rate curves $R_{a,j}(q)$ and the distortion curves $F^{(j)}(q)$ are used.

More specifically, if we indicate the transmission-buffer queue length and the channel available capacity when the jth frame in the GoP has to be encoded as $s_Q$ and $s_N$, respectively, and the activity of this frame as a, the QSP is chosen assuming the following.
1) The activity will remain constant during the rest of the GoP, that is, Act(n) = a, for each frame $h \in [j + 1, G_I]$.
2) The channel behavior, and therefore the available network bandwidth $\tilde{N}(n)$, remains constant during the rest of the GoP, that is, for each frame $h \in [j + 1, G_I]$.
Under these assumptions, the QSP is chosen as the minimum QSP $\bar{q}$, such that it is possible to find a set of QSP values for the next frames of the GoP, $[q_{j+1}, \ldots, q_{G_I}]$, so that the following hold.
1) The PSNR of those frames is constant, and equal to the value that should be achieved for frame j.
2) The number of emitted packets expected for the next frames of the GoP, if these QSP values are used, added to the current queue, minus the number of packets that will leave the queue until the end of the GoP, is lower than the given threshold K.
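The selection rule described above can be sketched as a simple search over the QSP range. This is only an illustration under stated assumptions: the rate and distortion curves below are hypothetical stand-ins for the measured Evita curves of Fig. 2, and the "constant PSNR" condition is approximated by picking, for each remaining frame, the QSP whose expected PSNR is closest to the target.

```python
# Sketch of the feedback-law search; curve shapes are hypothetical.
GOP = "IBBPBB"             # GoP structure from the case study (G_I = 6)
QSP_RANGE = range(1, 32)   # admissible QSP values 1..31


def rate(activity, pos, q):
    """Hypothetical rate curve: expected packets for frame `pos` at QSP `q`."""
    base = {"I": 60.0, "P": 30.0, "B": 15.0}[GOP[pos - 1]]
    return activity * base / q


def psnr(pos, q):
    """Hypothetical distortion curve: expected PSNR for frame `pos` at QSP `q`."""
    offset = {"I": 0.0, "P": 0.5, "B": 1.0}[GOP[pos - 1]]
    return 45.0 - 0.4 * q - offset


def matching_qsp(pos, target_psnr):
    """QSP for frame `pos` whose expected PSNR is closest to the target."""
    return min(QSP_RANGE, key=lambda q: abs(psnr(pos, q) - target_psnr))


def choose_qsp(s_q, activity, pos, bandwidth, threshold):
    """Smallest QSP for frame `pos` keeping the end-of-GoP backlog within threshold.

    Activity and bandwidth (packets per slot) are assumed constant until
    the end of the GoP, as in the two assumptions listed in the text.
    """
    g_i = len(GOP)
    for q_bar in QSP_RANGE:
        target = psnr(pos, q_bar)
        backlog = s_q + rate(activity, pos, q_bar)
        for k in range(pos + 1, g_i + 1):
            backlog += rate(activity, k, matching_qsp(k, target))
        backlog -= (g_i - pos + 1) * bandwidth
        if backlog <= threshold:
            return q_bar
    return QSP_RANGE[-1]   # fall back to the coarsest quantization


print(choose_qsp(s_q=30, activity=2.0, pos=1, bandwidth=5.0, threshold=60))
print(choose_qsp(s_q=30, activity=2.0, pos=1, bandwidth=20.0, threshold=60))
```

Because the expected backlog decreases as the available bandwidth grows, the QSP returned for a larger bandwidth is never coarser than the one returned for a smaller bandwidth, which mirrors the qualitative behavior of the rate controller.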

Fig. 2. Rate-distortion curves for I, P, and B frames. Rate curves for (a) frame I, (b) frame B, and (c) frame P. (d) Distortion curves.

In other words, the feedback law works by choosing the QSP as

$$q = \phi(s_Q, a, j) = \min_{\bar{q} \in [1, 31]} \left\{ \bar{q} \ \text{such that}\ \exists\, [q_{j+1}, \ldots, q_{G_I}] \ \text{for which:}\ \begin{array}{l} F^{(k)}(q_k) = F^{(j)}(\bar{q}) \quad \forall k \in [j + 1, \ldots, G_I] \\ s_Q + R_{a,j}(\bar{q}) + \sum_{k=j+1}^{G_I} R_{a,k}(q_k) - (G_I - j + 1)\, \tilde{N}(n) \le K \end{array} \right\}. \qquad (26)$$

B. Numerical Results
Fig. 3 shows the pdfs of the transmission-buffer queue size for the two values of the Doppler frequency $f_m$ and for a given value of the target error probability among those being considered. The values shown have been calculated as follows:

$$\mathrm{Prob}\left\{ S^{(Q)}(n) = s_Q \right\} = \sum_{s_N \in \Im^{(\tilde{N})}} \sum_{s_Y \in \Im^{(Y)}} \pi^{(\Sigma)}_{[s_Q, s_N, s_Y]}. \qquad (27)$$

We can observe that the curves are basically Gamma distributions and are very similar to each other independently of the $P^{(C)}_{PEP}$ value. This is evidence that the feedback law works properly. This is further demonstrated in Fig. 4, where we show the average queue size as well as the mean delay in the transmission buffer. The value of the average queue size does not change significantly when $P^{(C)}_{PEP}$ changes and is higher in the driver case. This can be explained by the fact that, in the driver case, the wireless medium quality is lower and, therefore, the transmission-buffer service rate is lower. Similar considerations apply to Fig. 5, where the performance in terms of loss probability in the transmission buffer is shown and calculated as in [4].
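Equation (27) is a marginalization of the joint steady-state array over the server and source states. The sketch below illustrates it on a toy joint distribution; the state-space sizes and the probability values are hypothetical and are not taken from the case study.

```python
# Marginal queue-size distribution from a joint steady-state array (toy example).
from itertools import product


def queue_marginal(joint, k_max):
    """Sum the joint probabilities over (s_N, s_Y) for each queue state s_Q."""
    marginal = [0.0] * (k_max + 1)
    for (s_q, _s_n, _s_y), prob in joint.items():
        marginal[s_q] += prob
    return marginal


def mean_queue_size(marginal):
    """Average number of packets in the transmission buffer."""
    return sum(s_q * prob for s_q, prob in enumerate(marginal))


# Hypothetical joint distribution over (s_Q, s_N, s_Y), normalized to one.
K_MAX, N_STATES, Y_STATES = 2, 2, 3
weights = {
    (s_q, s_n, s_y): (s_q + 1) * (s_n + 1) * (s_y + 1)
    for s_q, s_n, s_y in product(range(K_MAX + 1), range(N_STATES), range(Y_STATES))
}
total = sum(weights.values())
joint = {state: w / total for state, w in weights.items()}

marginal = queue_marginal(joint, K_MAX)
print(marginal)
print(mean_queue_size(marginal))
```

The same marginal array is the starting point for averages such as the mean queue size plotted in Fig. 4; how the mean delay and the loss probability are derived from it follows [4] and is not reproduced here.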


Fig. 3. Transmission-buffer size pdf for $P^{(C)}_{PEP} = 10^{-2}$ (a) in the pedestrian case and (b) in the driver case.

Fig. 4. Average transmission-buffer size and mean delay versus the target error probability $P^{(C)}_{PEP}$.

Fig. 6 shows the performance related to the encoding quality. In particular, it can be observed that, for high values of $P^{(C)}_{PEP}$, due to the high amount of available bandwidth, the most likely PSNR level is the highest. On the contrary, for low values of $P^{(C)}_{PEP}$, as a result of the large amount of redundancy introduced by AFEC, the available bandwidth is low and, therefore, the video source reduces the encoding quality. For this reason, the lower the value of the target PEP in the wireless link, $P^{(C)}_{PEP}$, the greater the probability of poorer PSNR levels. In order to better quantify the influence of the choice of the target value $P^{(C)}_{PEP}$ on the encoding performance, in Fig. 6, the average PSNR level is shown. As expected, the worst case for the average PSNR level is given when the AFEC has a very stringent target for the maximum PEP $P^{(C)}_{PEP}$. When a less stringent target value for the PEP is required, the encoding quality increases. Obviously, the average PSNR value, and thus encoding quality, is higher in the pedestrian case.
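The PSNR levels discussed above are tied to the QSP through the per-frame-type QSP ranges listed before the case study. A minimal sketch of that mapping follows; the range values are those given in the text, while the sample QSP sequence and the helper names are hypothetical.

```python
# Map a (frame type, QSP) pair to its distortion level l in 1..5.
# Ranges are copied from the text; level 1 has the coarsest quantization.
QSP_RANGES = {
    "I": [(16, 31), (13, 15), (10, 12), (6, 9), (1, 5)],
    "P": [(15, 31), (13, 14), (10, 12), (6, 9), (1, 5)],
    "B": [(17, 31), (14, 16), (11, 13), (7, 10), (1, 6)],
}


def distortion_level(frame_type, qsp):
    """Return the level index l whose QSP range contains `qsp`."""
    for level, (low, high) in enumerate(QSP_RANGES[frame_type], start=1):
        if low <= qsp <= high:
            return level
    raise ValueError(f"QSP {qsp} is outside the admissible range 1..31")


def level_probabilities(frames):
    """Empirical probability of each level over a list of (type, QSP) pairs."""
    counts = [0] * 5
    for frame_type, qsp in frames:
        counts[distortion_level(frame_type, qsp) - 1] += 1
    return [count / len(frames) for count in counts]


# Hypothetical QSP choices for one IBBPBB GoP.
sample_gop = [("I", 4), ("B", 8), ("B", 12), ("P", 5), ("B", 6), ("B", 17)]
probabilities = level_probabilities(sample_gop)
average_level = sum(l * p for l, p in enumerate(probabilities, start=1))
print(probabilities, average_level)
```

In the analytical model, the level probabilities come from (22) rather than from an empirical count; the sketch only shows how a QSP choice translates into a level of the kind averaged in Fig. 6.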
Fig. 5. Packet loss probability in the transmission buffer versus the target error probability $P^{(C)}_{PEP}$.

Fig. 6. Average PSNR level versus the target error probability $P^{(C)}_{PEP}$.

VI. CONCLUSION
In this paper, we have defined an analytical framework for the evaluation of the performance of real-time MPEG video transmission over a wireless link that applies AFEC to keep the PEP below a given threshold.
The MPEG encoder uses a rate controller that adapts the output rate by appropriately setting the QSP to follow the bandwidth variations while maximizing encoding quality and stability. The whole system has been modeled by an emission process that feeds the transmission buffer; the server of this buffer behaves according to the channel conditions, i.e., the service rate is higher when channel conditions are good and lower when channel conditions are bad.
SBBPs have been used to model both the MPEG video source
[4], [15], [17] and the server process of the transmission buffer
that coincides with the time-varying available bandwidth in the
network. Accordingly, the whole system has been modeled as
an SBBP/SBBP/1/K process.
The analytical framework proposed in the paper has been used to evaluate the performance in terms of the distortion introduced by the quantization mechanism in the encoding process, as well as the loss and mean delay in the transmission buffer. Numerical results show that our system is very robust and reliable due to the implemented feedback law, which keeps the mean delay and the loss probability in the output buffer almost constant. Moreover, the corruption probability in the wireless channel is also limited in spite of possible variations over time in the wireless-channel BER. The proposed model allows the designer to evaluate the resulting encoding-quality variation, which represents the cost of using this approach. The results obtained in the paper can be used to obtain the best tradeoff between encoding quality and information correctness.


REFERENCES
[1] I. F. Akyildiz, I. Joe, H. Driver, and Y. L. Ho, "A new adaptive FEC scheme for wireless ATM networks," in Proc. IEEE Military Communications Conf. (MILCOM), Boston, MA, Oct. 1998, pp. 277-281.
[2] E. Altman, C. Barakat, and V. M. Ramos, "Queueing analysis of simple FEC schemes for IP telephony," in Proc. IEEE Information Communications (INFOCOM), Anchorage, AK, Apr. 2001, pp. 796-804.
[3] J. J. Bae, T. Suda, and R. Simha, "Analysis of individual packet loss in a finite buffer queue with heterogeneous Markov modulated arrival processes: A study of traffic burstiness and a priority packet discarding," in Proc. IEEE Information Communications (INFOCOM), Florence, Italy, Apr. 1992, pp. 219-230.
[4] A. Cernuto, F. Cocimano, A. Lombardo, and G. Schembra, "A queueing system model for the design of feedback laws in rate-controlled MPEG video encoders," IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 4, pp. 238-255, Apr. 2002.
[5] C. F. Chang and J. S. Wang, "A stable buffer control strategy for MPEG coding," IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 6, pp. 920-924, Dec. 1997.
[6] S. R. Cho, "Adaptive error control scheme for multimedia applications in integrated terrestrial-satellite wireless networks," in Proc. IEEE Wireless Communications and Networking Conf. (WCNC), Chicago, IL, Sep. 2000, pp. 629-633.
[7] A. Chockalingam and M. Zorzi, "Wireless TCP performance with link layer FEC/ARQ," in Proc. IEEE Int. Conf. Communications (ICC), Vancouver, BC, Canada, Jun. 1999, pp. 1212-1216.
[8] W. Ding and B. Liu, "Rate control of MPEG video coding and recording by rate-quantization modeling," IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 1, pp. 12-20, Feb. 1996.
[9] O. Hashida et al., "Switched batch Bernoulli process (SBBP) and the discrete-time SBBP/G/1 queue with application to statistical multiplexer," IEEE J. Sel. Areas Commun., vol. 9, no. 3, pp. 394-401, Apr. 1991.
[10] A. E. Kamal, "Efficient solution of multiple server queues with application to the modeling of ATM concentrators," in Proc. IEEE Information Communications (INFOCOM), San Francisco, CA, 1996, pp. 248-254.
[11] A. La Corte, A. Lombardo, and G. Schembra, "An analytical paradigm to calculate multiplexer performance in an ATM multimedia environment," Comput. Netw. ISDN Syst., vol. 29, no. 16, pp. 1881-1900, Dec. 1997.
[12] L. J. Lin and A. Ortega, "Bit-rate control using piecewise approximated rate-distortion characteristics," IEEE Trans. Circuits Syst. Video Technol., vol. 8, no. 4, pp. 446-459, Aug. 1998.
[13] D. V. Lindley, "The theory of queues with a single server," Proc. Cambridge Philos. Soc., vol. 48, pp. 277-289, 1952.
[14] H. Liu and M. El Zarki, "Performance of H.263 video transmission over wireless channels using hybrid ARQ," IEEE J. Sel. Areas Commun., vol. 15, no. 9, pp. 1775-1786, Dec. 1997.
[15] A. Lombardo, G. Morabito, and G. Schembra, "An accurate and treatable Markov model of MPEG-video traffic," in Proc. IEEE Information Communications (INFOCOM), San Francisco, CA, Mar./Apr. 1998, pp. 217-224.
[16] A. Lombardo, G. Morabito, S. Palazzo, and G. Schembra, "A Markov-based algorithm for the generation of MPEG sequences matching intra- and inter-GoP correlation," Eur. Trans. Telecommun. J., vol. 12, no. 2, pp. 127-142, Mar./Apr. 2001.
[17] A. Lombardo and G. Schembra, "Performance evaluation of an adaptive-rate MPEG encoder matching intServ traffic constraints," IEEE/ACM Trans. Netw., vol. 11, no. 1, pp. 47-65, Feb. 2003.
[18] M. F. Neuts, Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach. Baltimore, MD: The Johns Hopkins Univ. Press, 1981.
[19] L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
[20] A. Ramesh, A. Chockalingam, and L. B. Milstein, "A first-order Markov model for correlated Nakagami-m fading channels," in Proc. IEEE Int. Conf. Communications (ICC), New York, Apr. 2002, pp. 3413-3417.
[21] O. Rose, "Statistical properties of MPEG video traffic and their impact on traffic modeling in ATM systems," Univ. Würzburg, Inst. Comput. Sci., Würzburg, Germany, Tech. Rep. 101, Feb. 1995.
[22] G. M. Schuster and A. K. Katsaggelos, Rate-Distortion Based Video Compression, Optimal Video Frame Compression and Object Boundary Encoding. Norwell, MA: Kluwer, 1997.
[23] T. Takine, T. Suda, and T. Hasegawa, "Cell loss and output process analyses of a finite-buffer discrete-time ATM queueing system with correlated arrivals," in Proc. IEEE Information Communications (INFOCOM), San Francisco, CA, Mar. 1993, pp. 1259-1269.
[24] C. C. Tan and N. C. Beaulieu, "On first-order Markov modeling for the Rayleigh fading channels," IEEE Trans. Commun., vol. 48, no. 12, pp. 2032-2040, Dec. 2000.
[25] B. Vucetic, "An adaptive coding scheme for time-varying channels," IEEE Trans. Commun., vol. 39, no. 5, pp. 653-663, May 1991.
[26] H. S. Wang and N. Moayeri, "Finite-state Markov channel: A useful model for radio communication channels," IEEE Trans. Veh. Technol., vol. 44, no. 1, pp. 163-171, Feb. 1995.
[27] M. Zorzi and R. R. Rao, "On the statistics of block errors in bursty channels," IEEE Trans. Commun., vol. 45, no. 6, pp. 660-667, Jun. 1997.
[28] M. Zorzi, R. R. Rao, and L. B. Milstein, "Error statistics in data transmission over fading channels," IEEE Trans. Commun., vol. 46, no. 11, pp. 1468-1477, Nov. 1998.
[29] Coded Representation of Picture and Audio Information, MPEG Test Model 5. ISO-IEC/JTC1/SC29/WG11, Apr. 1993.
[30] Coded Representation of Picture and Audio Information, MPEG Test Model 2. International Standard ISO-IEC/JTC1/SC29/WG11, Jul. 1992.
[31] Coding of Moving Pictures and Associated Audio for Digital Storage Media up to 1.5 Mb/s Part 2, Video, International Standard ISO-IEC/JTC1/SC29/WG11, DIS11172-1, Mar. 1992.

Laura Galluccio received the Laurea degree in electrical engineering and the Ph.D. degree in electrical,
computer and telecommunications engineering, both
from the University of Catania, Catania, Italy, in
2001 and 2005, respectively.
Since 2002, she has been with the Italian National
Consortium of Telecommunications (CNIT), where
she is working as a Research Fellow within the Virtual Immersive Communications (VICOM) Project.
From May to July 2005, she was a Visiting Scholar
at the COMET Group, Columbia University, New
York, NY. Her research interests include ad hoc and sensor networks, protocols
and algorithms for wireless networks, and network performance analysis.
Dr. Galluccio served and will serve in the Program Committee of the
4th Academic Network for Wireless Internet Research in Europe (ANWIRE)
International Workshop on Wireless Internet and Reconfigurability, the 20th
International Symposium on Computer and Information Sciences (ISCIS '05),
and Networking 2006.

Giacomo Morabito (M'02) received the Laurea degree in electrical engineering and the Ph.D.
degree in electrical, computer, and telecommunications engineering from the University of Catania,
Catania, Italy, in 1996 and 2000, respectively.
From November 1999 to April 2001, he was with
the Broadband and Wireless Networking Laboratory
of the Georgia Institute of Technology as a Research
Engineer. Since May 2001, he has been with the
School of Engineering at Enna of the University of
Catania, where he is currently an Assistant Professor.
He is serving as a Guest Editor on the editorial board of Computer Networks
and Mobile Networks and Applications (MONET). He is also a Member of the
technical program committee of several conferences. Moreover, he has been
the Technical Program Co-Chair of Med-Hoc-Net 2004. His research interests
include mobile and satellite networks, self-organizing networks, quality of
service (QoS), and traffic management.
Dr. Morabito is serving on the Editorial Board of IEEE Wireless Communications Magazine.

Giovanni Schembra received the degree in electrical engineering from the University of Catania,
Catania, Italy, in 1991. Working in the telecommunications area, he received the Master's degree
from CEFRIEL, Milan, Italy, in 1992, with his thesis
focusing on the analytical performance evaluation in
an ATM network. He received the Ph.D. degree in
electronics, computer science, and telecommunications engineering with a dissertation on multimedia
traffic modeling in a broadband network.
He is currently an Assistant Professor in Telecommunications at the University of Catania.
