COMPRESSION
International Standardized Video Encoders
Application | Bitrate | Encoders
Videoconference, video telephony | 20-320 kbps | H261, H263, MPEG-4/H264 AVC
Video broadcast (digital television) | 2-5 Mbps (10-20 Mbps for HD) | MPEG-2/H264 AVC
Video delivery (DVD video, HD DVD, Blu-Ray Disk) | 4-8 Mbps (10-20 Mbps for HD) | MPEG-2/H264 AVC, VC-1
Internet streaming | 20-600 kbps | Proprietary encoders, H263, MPEG-4/H264 AVC, VC-1
Video over 3G radio networks | 20-200 kbps | H263, MPEG-4/H264 AVC, VC-1
[Figure: timeline of the ITU-T and MPEG standards, 1988-2004. ITU-T: H261 V1, H261 V2, H263, H263+, H263++. Joint ITU-T/MPEG: H262/MPEG2, H264/MPEG4 AVC. MPEG: MPEG1, MPEG4 V1.]
Universitatea Politehnica din București, Prof. dr. ing. Cristian Negrescu
H261 Video Encoder
Intended use: video telephony and videoconferencing:
Developed by CCITT (now ITU-T) during 1988-1990
Intended for use on ISDN lines as part of the H320 protocol
Delivers encoded streams with a bitrate of p x 64 kbps
Low-latency encoder
CBR encoder (constant bitrate for the encoded stream)
Accepted image formats: CIF (352x288) and QCIF (176x144)
Characteristics of the encoder:
Predictive video encoder
Based on motion compensation
Compression unit: the macroblock (MB)
Used in stand-alone or PC-based videoconferencing systems
[Figure: block diagram of the predictive encoder and decoder. Encoder: the estimate of the current frame is subtracted from the current input frame C to give the coded error E (innovation), followed by transform (T), quantizer (Q) and coding (C); the prediction loop contains the inverse quantizer, the inverse transform, the frame memory (R), prediction by motion compensation (MC) and motion estimation (ME), which supplies the vectors V. Auxiliary information: MV, CR, etc. Decoder: decoding, inverse quantizer, inverse transform, addition of the estimate of the current frame, decoded frame (display), frame memory (R) and prediction by motion compensation (MC).]
H261 Video Encoder
Structure of the H261 video sequence:
Level 1: Frame
Level 2: Group of blocks (GOB)
Helpful in synchronization recovery
1 CIF frame = 2 x 6 = 12 GOB
1 QCIF frame = 1 x 3 = 3 GOB
Level 3: The macroblock (MB)
1 GOB = 11 x 3 = 33 MB
Represents the compression unit
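The hierarchy above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming only the figures quoted on this slide (16x16 macroblocks, 11 x 3 macroblocks per GOB); the function name `h261_layout` is hypothetical and is not part of any standard or library.

```python
# Sketch: derive the H261 picture hierarchy counts from the frame size.
MB_SIZE = 16                        # a macroblock covers 16x16 luma pixels
GOB_MB_COLS, GOB_MB_ROWS = 11, 3    # 1 GOB = 11 x 3 = 33 macroblocks


def h261_layout(width, height):
    """Return macroblock and GOB counts for a frame of the given size."""
    mb_cols = width // MB_SIZE
    mb_rows = height // MB_SIZE
    gob_cols = mb_cols // GOB_MB_COLS
    gob_rows = mb_rows // GOB_MB_ROWS
    return {
        "macroblocks": mb_cols * mb_rows,
        "gob_grid": (gob_cols, gob_rows),
        "gobs": gob_cols * gob_rows,
    }


cif = h261_layout(352, 288)
qcif = h261_layout(176, 144)
print(cif)
print(qcif)
```

Both formats divide evenly, which is consistent with the 2 x 6 and 1 x 3 GOB grids listed above.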
H261 Video Encoder
[Figure: H261 encoder block diagram. The estimate of the current frame is subtracted from the current input frame C to give the coded error E (innovation), which passes through the transform (DCT), the quantizer (Q), variable length coding (VLC: run length + Huffman), the buffer and error correction. The prediction loop contains the inverse quantizer, the inverse transform (IDCT), the frame memory (R), prediction by motion compensation (MC) and motion estimation (ME), which supplies the vectors V. Auxiliary information: MV, CR, etc.]
Macroblock characteristics:
Each macroblock corresponds to a 16x16 pixel image area
Color space: YCbCr
Format: 4:2:0
16x16 pixels for Y'
1 block of 8x8 pixels for CB + 1 block of 8x8 pixels for CR
Intra-MB Coding:
Coding is applied to the current frame itself
8x8 DCT
Uniform quantization:
Step size = 8 for DC coefficients
Step size = 2, 4, ..., 62 for AC coefficients. Unlike JPEG, the same step size is applied to all the coefficients
Zig-zag scanning:
RLE for Huffman coding
Symbols (run-length, value)
Huffman entropy coding of the non-zero coefficients
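The quantization, zig-zag scanning and run-length steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the normative H261 procedure: the quantizer is a plain truncating division (the exact rounding and reconstruction rules of the standard are not reproduced), the coefficient values are made up, and all function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) pairs of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):            # s = row + col selects one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # even diagonals are walked from bottom-left to top-right, odd ones the other way
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def quantize_intra(block, ac_step):
    """DC uses the fixed step 8; every AC coefficient shares the same ac_step."""
    out = [[0] * 8 for _ in range(8)]
    for r in range(8):
        for c in range(8):
            step = 8 if (r, c) == (0, 0) else ac_step
            out[r][c] = int(block[r][c] / step)   # simplification: truncate toward zero
    return out


def run_length(levels):
    """Turn zig-zag ordered AC levels into (run of zeros, value) symbols."""
    symbols, run = [], 0
    for v in levels:
        if v == 0:
            run += 1
        else:
            symbols.append((run, v))
            run = 0
    return symbols   # trailing zeros are left to an end-of-block marker


# Illustrative DCT coefficients (assumed values, mostly zero).
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[2][2] = 400, 50, -33, 20

q = quantize_intra(block, ac_step=10)
scan = [q[r][c] for r, c in zigzag_order()]
dc, symbols = scan[0], run_length(scan[1:])
print(dc, symbols)
```

The symbols produced by `run_length` are what the Huffman stage would then map to variable-length codes.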
[Figure: H261 encoder block diagram, inter-MB coding]
Inter-MB Coding:
Coding is applied to the motion-compensated prediction error (MC/ME)
8x8 DCT
Uniform quantization:
Step size = 2, 4, ..., 62 for AC coefficients. Unlike JPEG, the same step size is applied to all the coefficients
Zig-zag scanning:
RLE for Huffman coding
Symbols (run-length, value)
Huffman entropy coding of the non-zero coefficients
Motion estimation:
The search algorithm operates at pixel accuracy
The largest search window is -15 ... +15 pixels
H261 operates between 64 kbps and 1984 kbps
The search window is application dependent
Motion on small areas (head, shoulders, ...)
Smaller, rhomboidal search windows can be used
The final perceptual quality is similar
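Pixel-accurate motion estimation within a bounded search window can be sketched as an exhaustive block-matching search. The slides do not specify a matching criterion or search strategy, so the sum of absolute differences (SAD) and the full search used here are assumptions, as are the function names and the tiny test frames. The vector returned is the displacement from the current block to its best match in the reference frame.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref, bx, by, n=16, search=15):
    """Return (dx, dy, sad) of the best integer-pixel match within +/- search."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside_x = 0 <= bx + dx and bx + dx + n <= w
            inside_y = 0 <= by + dy and by + dy + n <= h
            if not (inside_x and inside_y):
                continue              # candidate falls outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def make_frame(x0, y0, size=12, n=4):
    """Black frame with one bright n x n square whose top-left corner is (x0, y0)."""
    f = [[0] * size for _ in range(size)]
    for y in range(y0, y0 + n):
        for x in range(x0, x0 + n):
            f[y][x] = 200
    return f


ref = make_frame(3, 2)     # previous (reference) frame
cur = make_frame(5, 3)     # the square moved by (+2, +1)
mv = full_search(cur, ref, bx=5, by=3, n=4, search=3)
print(mv)
```

A smaller or differently shaped window, as mentioned above, only changes which `(dx, dy)` candidates are visited; the matching step stays the same.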
[Figure: H261 encoder block diagram with a loop filter added in the prediction loop, inter-MB coding]
MPEG-1 Encoder
A GOP consists of I, P and B frames:
The I frames (see H261) are frames encoded in intra mode, with ...
The P frames (see H261) are frames to which forward MC/ME predictive coding is applied
Pros:
Increased coding efficiency
The frame rate (FPS) can be increased with only a slight increase of the bitrate
MPEG-1 Encoder
The I and P frames of a GOP are called anchors (references)
Between two anchors there can be any number of B frames
B frames can even be placed in front of the I frame
In a GOP with B frames, the encoding/decoding order of the frames can differ from their order in the original sequence or from the presentation order
Before coding/decoding, the frames must be reordered so that, for each B frame, its anchors are already available
The GOP parameters:
N: number of frames in the GOP
M: 1 + number of B frames between anchors (I or P)
Default values: N = 18, 15, 12, 7
Usual: N = 15, M = 3
GOP types:
Closed GOP: all the B frames of the GOP have their references inside the current GOP (the GOP ends with a P frame)
Examples: I BB ... BBP | I BB ... BBP |
I B P | I B P |
I B B B B B B B B B P | I B B B B B B B B B P |
Open GOP: some of the B frames have references outside the current GOP (the GOP ends with a B frame)
Example: B I B P | B I B P |
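The reordering requirement for B frames can be illustrated with a short sketch. It assumes a GOP described only by the N and M parameters, with an anchor every M frames in display order; the function names are hypothetical and the code is an illustration, not an excerpt of any MPEG-1 implementation.

```python
def gop_pattern(n, m):
    """Frame types of one GOP in display order: I first, then an anchor every m frames."""
    return ["I" if i == 0 else ("P" if i % m == 0 else "B") for i in range(n)]


def coding_order(display):
    """Reorder frames so that each B frame follows both of its anchors.

    Returns (coded, leftover): coded is a list of (display index, type) pairs,
    leftover holds trailing B frames that still wait for the next GOP's I frame.
    """
    coded, pending_b = [], []
    for idx, ftype in enumerate(display):
        if ftype == "B":
            pending_b.append((idx, ftype))   # wait for the following anchor
        else:
            coded.append((idx, ftype))       # the anchor is coded first
            coded.extend(pending_b)          # then the B frames it closes
            pending_b = []
    return coded, pending_b


closed = gop_pattern(7, 3)            # I B B P B B P
coded, leftover = coding_order(closed)
print([i for i, _ in coded], leftover)

usual = gop_pattern(15, 3)            # the usual N = 15, M = 3
coded15, leftover15 = coding_order(usual)
print(leftover15)
```

With N = 7 and M = 3 the GOP ends on a P frame and nothing is left over, while with N = 15 and M = 3 the last two B frames depend on an anchor outside the GOP.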
[Figure: MPEG-1 encoder block diagram. Input video sequence, pre-processing, frame reordering; the estimate of the current frame is subtracted from the current frame C to give the error E, followed by transform (DCT), weighting (W), quantizer (Q), variable length coding (VLC: run length + Huffman) and the buffer. The prediction loop contains the inverse quantizer, inverse weighting, the inverse transform (IDCT), the decoded current frame, two reference memories (R1, R2) with prediction by motion compensation (MC), averaging (1/2) of the two predictions for B, and motion estimation (ME), which supplies the vectors V1 and V2.]
P frames coding:
Encodes the difference between the current frame and the motion-compensated reference frame (the closest previous I or P frame)
The motion compensation has sub-pixel accuracy (1/2 sample). It uses a bilinear motion model
For each macroblock, a single motion vector is encoded
8x8 DCT, weighting for bitrate control, uniform quantization, scanning, entropy coding (similar to the I frames)
There is a standard quantization table (different from the table used for I frames): an 8x8 table with all entries equal to 16. Other optional quantization tables are allowed.
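Half-sample motion compensation can be sketched with bilinear averaging of the neighbouring reference samples. This is a minimal illustration, assuming motion vectors expressed in half-sample units and a round-half-up integer average; the function name and the sample values are hypothetical, and border handling is omitted.

```python
def predict_half_pel(ref, x, y, mvx2, mvy2):
    """Predicted sample for pixel (x, y); the motion vector is in half-sample units."""
    ix, fx = x + (mvx2 >> 1), mvx2 & 1    # integer part and half-sample flag (0 or 1)
    iy, fy = y + (mvy2 >> 1), mvy2 & 1
    a = ref[iy][ix]
    b = ref[iy][ix + fx]
    c = ref[iy + fy][ix]
    d = ref[iy + fy][ix + fx]
    # With both flags at 0 this returns a; with one flag set it is the rounded
    # average of two samples; with both set, of four samples.
    return (a + b + c + d + 2) // 4


ref = [
    [10, 20, 30],
    [30, 40, 50],
    [50, 60, 70],
]
full = predict_half_pel(ref, 0, 0, 0, 0)      # integer position
horiz = predict_half_pel(ref, 0, 0, 1, 0)     # half a sample to the right
diag = predict_half_pel(ref, 0, 0, 1, 1)      # half a sample right and down
left = predict_half_pel(ref, 1, 1, -1, 0)     # half a sample to the left
print(full, horiz, diag, left)
```

The arithmetic shift keeps the decomposition correct for negative vectors: -1 half-samples is treated as integer offset -1 plus one half-sample.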
B frames coding:
Uses reference frames (I or P) from which the prediction is formed by motion compensation
2 motion vectors/MB are used for bidirectional prediction and a single motion vector/MB is used for forward or backward prediction
Encodes the difference between the current frame and its bidirectionally or unidirectionally predicted estimate
The motion compensation has sub-pixel accuracy (1/2 sample). It uses a bilinear motion model
8x8 DCT, weighting for bitrate control, uniform quantization, scanning, entropy coding (similar to the P frames)
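The three prediction choices for a B macroblock can be sketched as follows. The sketch assumes that the forward and backward blocks have already been motion-compensated and that the bidirectional estimate is the rounded average of the two; the function names and the 2x2 sample blocks are illustrative only.

```python
def predict_b_block(fwd=None, bwd=None):
    """Combine motion-compensated blocks from the past and/or the future anchor."""
    if fwd is not None and bwd is not None:
        # bidirectional: rounded average of the two predictions (2 motion vectors)
        return [[(f + b + 1) // 2 for f, b in zip(fr, br)] for fr, br in zip(fwd, bwd)]
    if fwd is None and bwd is None:
        raise ValueError("at least one prediction is required")
    # forward-only or backward-only prediction (1 motion vector)
    return [row[:] for row in (fwd if fwd is not None else bwd)]


def residual(cur, pred):
    """Prediction error that goes on to the DCT and quantizer."""
    return [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(cur, pred)]


fwd = [[100, 100], [100, 100]]     # block fetched from the previous anchor
bwd = [[120, 121], [120, 120]]     # block fetched from the next anchor
cur = [[111, 110], [109, 110]]     # block of the B frame being coded

pred = predict_b_block(fwd, bwd)
err = residual(cur, pred)
print(pred, err)
```

In this example the averaged estimate leaves a much smaller error than either single-direction prediction would.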
Comparison MPEG-1 / H261
MPEG-1 | H261
Allows multiple spatial resolutions (CIF, SIF, QCIF, SQCIF, etc) | Allows only QCIF and CIF
Variable aspect ratio (defined in the header) | Aspect ratio: only 4:3
Introduces the GOP concept | No GOP
Macroblocks I, P, B | Macroblocks I, P
Optimized for 1.2 Mbps | Optimized for 384 kbps
No restrictions on the number of successive skipped frames | Only 1, 2 or 3 successive skipped frames are allowed
Motion estimation algorithm with sub-pixel accuracy | Motion compensation algorithm with pixel accuracy
Usual search window (for motion compensation): ±15 pixels | Usual search window (for motion compensation): ±7 pixels
Table-based uniform quantization (similar to JPEG) | Uniform quantization, the same quantization step for all AC coefficients
For applications where the end-to-end latency is not critical | For applications where the end-to-end latency is critical
MPEG-2 Video Encoder
A backward-compatible superset of the MPEG-1 encoder:
Delivers video with better quality (at least conforming to the CCIR 601 quality standard)
Larger image formats
Imposes restrictions at the slice level
Offers support for interlaced video sequences (TV)
Introduces frame pictures/field pictures
Allows other chroma formats (4:4:4, 4:2:2)
Introduces new prediction modes
Allows alternate scanning of the DCT coefficients
Allows applying the DCT to frames/fields
Allows motion compensation on blocks of 16x8 pixels
Enhances the entropy coding
Offers support for film-to-video transfer
Video scalability (allows efficient delivery of simultaneous SD and HD versions of the same content, or optimisation of the perceived quality when error-prone transmission channels are used)
MPEG-2 Video Encoder
Support for coding of interlaced video sequences (TV):
Frame Mode
Each 8x8 DCT block contains both phases of the motion (the two fields are combined into one single frame and subsequently coded)
Artefacts for scenes containing fast motion (the vertical edges become jagged, which produces large HF components and a low compression rate)
Field Mode
Each 8x8 DCT block contains only one single phase of motion (each field is encoded
separately)
More efficient for the compression of static images or scenes containing slow motion
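The frame-mode artefact described above can be illustrated with a toy interlaced frame. The sketch assumes a vertical edge that moves two pixels between the capture of the two fields, and uses the sum of absolute differences between adjacent lines as a crude stand-in for the vertical high-frequency content seen by the DCT; all names and values are hypothetical.

```python
def split_fields(frame):
    """Top field = even lines, bottom field = odd lines of an interlaced frame."""
    return frame[0::2], frame[1::2]


def vertical_activity(rows):
    """Sum of absolute differences between vertically adjacent samples."""
    return sum(
        abs(a - b)
        for r0, r1 in zip(rows, rows[1:])
        for a, b in zip(r0, r1)
    )


def edge_line(pos, width=8):
    """One image line with a vertical edge at column pos."""
    return [255 if x >= pos else 0 for x in range(width)]


# Even lines see the edge at column 3, odd lines (captured later) at column 5.
frame = [edge_line(3) if y % 2 == 0 else edge_line(5) for y in range(8)]
top, bottom = split_fields(frame)

frame_activity = vertical_activity(frame)
field_activity = vertical_activity(top) + vertical_activity(bottom)
print(frame_activity, field_activity)
```

In the interleaved frame every pair of adjacent lines disagrees around the edge, while each field taken on its own is vertically uniform.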