
Multimedia Data Compression

• Lossy and lossless compression

• Huffman coding

• Entropy coding

• Adaptive coding

• Dictionary-based coding (LZW)

chapter3: Multimedia Compression 1

Data Compression

• Branch of information theory
  – minimize the amount of information to be transmitted
• Transform a sequence of characters into a new string of bits
  – same information content
  – length as short as possible


Why Compress

• Raw data are huge.
• Audio: CD-quality music
  44.1 kHz * 16 bit * 2 channels ≈ 1.4 Mbps
• Video: near-DVD-quality true-color animation
  640 px * 480 px * 30 fps * 24 bit ≈ 221 Mbps
• Impractical for storage and bandwidth
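The raw-rate arithmetic above can be checked with a few lines of Python (the helper names are illustrative, not from the slides):

```python
# Back-of-the-envelope raw bitrates for uncompressed audio and video.
def audio_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    return sample_rate_hz * bits_per_sample * channels

def video_bitrate_bps(width, height, fps, bits_per_pixel):
    return width * height * fps * bits_per_pixel

cd_audio = audio_bitrate_bps(44_100, 16, 2)       # CD-quality stereo
raw_video = video_bitrate_bps(640, 480, 30, 24)   # near-DVD true color

print(f"CD audio: {cd_audio / 1e6:.2f} Mbps")   # CD audio: 1.41 Mbps
print(f"Video:    {raw_video / 1e6:.1f} Mbps")  # Video:    221.2 Mbps
```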

Compression

Graphic file formats can be regarded as being of three types.

• The first type stores graphics uncompressed.
  – Windows BMP (.bmp) files are an example of this sort of format.
• The second type uses "non-lossy" or "lossless" compression.
  – Most graphic formats use lossless compression; the GIF format is among them.
• The third type of bitmapped graphic file format uses "lossy" compression.
  – Fine details are what prevent areas from being all the same color, and hence from responding well to compression; lossy formats discard some of them.
  – The discarded details are perhaps too subtle to be discernible by eye.


Lossless is not enough!

• The best lossless audio and image compression ratio is normally about one half
• Lossy audio compression like MP3 or Ogg achieves a 1/20 ratio with acceptable quality, and a 1/5 ratio with near-perfect quality
• Lossy video compression can reduce a film to 1/300 of its size

Lossy Compression

• Massively reduces information we don't notice
• Highly content-specific
• Relies on psychology (human perception)

Lossy Audio Compression

• Frequency domain
• Quantization
  – Importance varies across frequency bands
  – Higher frequency, larger quantum
• Psychoacoustics
  – Pitch resolution of the ear is only about 2 Hz (without beating)
  – The threshold of hearing varies across bands
  – Simultaneous and temporal masking effects

Lossy Image Compression

• Frequency domain
  – Discrete Cosine Transform (in JPEG)
  – Discrete Wavelet Transform (in JPEG 2000)
• Quantization
  – Reduce less important data

data → Transform → Quantization → Coding → compressed data
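The transform-then-quantize pipeline can be sketched in one dimension with a toy orthonormal DCT (a stand-in for the 2-D block transforms JPEG actually uses; all names and the quantum values here are illustrative):

```python
import math

# Orthonormal 1-D DCT-II and its inverse (DCT-III).
def dct(x):
    N = len(x)
    out = []
    for k in range(N):
        s = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(s * sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k)
                           for n in range(N)))
    return out

def idct(X):
    N = len(X)
    out = []
    for n in range(N):
        acc = 0.0
        for k in range(N):
            s = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
            acc += s * X[k] * math.cos(math.pi / N * (n + 0.5) * k)
        out.append(acc)
    return out

def quantize(X, quanta):
    # Larger quantum for higher frequencies -> coarser storage, more loss.
    return [round(c / q) * q for c, q in zip(X, quanta)]

signal = [52, 55, 61, 66, 70, 61, 64, 73]   # one row of pixel values
coeffs = dct(signal)
quanta = [1, 2, 4, 8, 16, 32, 64, 128]      # quantum grows with frequency
restored = idct(quantize(coeffs, quanta))
```

Most of the energy sits in the low-frequency coefficients, so coarsely quantizing the high ones changes the restored row only slightly.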

Broad Classification

• Entropy Coding (statistical)
  – lossless; independent of data characteristics
  – e.g. RLE (Run-Length Encoding), Huffman, LZW, Arithmetic coding
• Source Coding
  – lossy; may consider semantics of the data
  – depends on characteristics of the data
  – e.g. DCT, DPCM, ADPCM, color model transform
• Hybrid Coding (used by most multimedia systems)
  – combines entropy coding with source coding
  – e.g. JPEG-2000, H.264, MPEG-2, MPEG-4


Huffman Coding

• Huffman codes can be used to compress information
  – Like WinZip, although WinZip doesn't use the Huffman algorithm
  – JPEGs do use Huffman as part of their compression process
• The basic idea is that instead of storing each character in a file as an 8-bit ASCII value, we will instead store the more frequently occurring characters using fewer bits and the less frequently occurring characters using more bits
  – On average this should decrease the file size (usually by about half)

Huffman Coding

“duke blue devils”

• We first do a frequency count of the characters:
  e:3, d:2, u:2, l:2, space:2, k:1, b:1, v:1, i:1, s:1
• Next we use a greedy algorithm to build up a Huffman tree
  – We start with a node for each character:

e,3  d,2  u,2  l,2  sp,2  k,1  b,1  v,1  i,1  s,1


Huffman Coding

• We then pick the two nodes with the smallest frequency and combine them to form a new node
  – The selection of these nodes is the greedy part
• The two selected nodes are removed from the set, but replaced by the combined node
• This continues until we have only 1 node left in the set

Huffman Coding

Combining the two smallest nodes, step by step (shown across several slides):

e,3  d,2  u,2  l,2  sp,2  k,1  b,1  v,1  i,1  s,1
1. i,1 + s,1 → 2
2. b,1 + v,1 → 2
3. k,1 + (b,v) 2 → 3
4. d,2 + u,2 → 4
5. l,2 + sp,2 → 4        (remaining: e,3  4  4  3  2)
6. (i,s) 2 + 3 → 5       (remaining: e,3  4  4  5)
7. e,3 + (d,u) 4 → 7     (remaining: 7  4  5)
8. (l,sp) 4 + 5 → 9      (remaining: 7  9)
9. 7 + 9 → 16 (root)

Huffman Coding

• Now we assign codes to the tree by placing a 0 on every left branch and a 1 on every right branch
• A traversal of the tree from root to leaf gives the Huffman code for that particular leaf character
• Note that no code is the prefix of another code

Huffman Coding

The resulting code table (tree root weight 16):

e    00
d    010
u    011
l    100
sp   101
i    1100
s    1101
k    1110
b    11110
v    11111
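The greedy construction can be sketched with Python's heapq (illustrative code, not from the slides; heap tie-breaking may assign different codes than the table above, but any Huffman tree gives the same 52-bit total for "duke blue devils"):

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    tiebreak = count()                      # keeps heap comparisons on ints
    heap = [(w, next(tiebreak), sym) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # greedy: pop two smallest weights
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):         # internal node: recurse
            walk(node[0], prefix + "0")     # 0 on every left branch
            walk(node[1], prefix + "1")     # 1 on every right branch
        else:
            codes[node] = prefix or "0"     # leaf: emit accumulated code
    walk(heap[0][2], "")
    return codes

freqs = {"e": 3, "d": 2, "u": 2, "l": 2, " ": 2,
         "k": 1, "b": 1, "v": 1, "i": 1, "s": 1}
codes = huffman_codes(freqs)
total_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
print(total_bits)   # 52
```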

Huffman Coding

• Thus, “duke blue devils” turns into:

010 011 1110 00 101 11110 100 011 00 101 010 00 11111 1100 100 1101

packed into bytes: 01001111 10001011 11101000 11001010 10001111 11100100 1101xxxx

• Uncompressed: 16 characters * 1 byte/char = 16 bytes; compressed: 52 bits, or 7 bytes

Huffman Coding

• Uncompressing works by reading in the file bit by bit
  – Start at the root of the tree
  – If a 0 is read, head left
  – If a 1 is read, head right
  – When a leaf is reached, decode that character and start over again at the root of the tree
• Thus, we need to save the Huffman table information as a header in the compressed file
  – This doesn't add a significant amount of size for large files (which are the ones you want to compress anyway)
  – Or we could use a fixed, universal set of codes/frequencies
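The bit-by-bit walk can be sketched directly from the code table above (a minimal sketch; accumulating bits until a codeword matches is equivalent to the root-to-leaf walk because the code is prefix-free):

```python
# Code table from the slides; " " stands for the space character.
CODES = {"e": "00", "d": "010", "u": "011", "l": "100", " ": "101",
         "i": "1100", "s": "1101", "k": "1110", "b": "11110", "v": "11111"}
DECODE = {bits: sym for sym, bits in CODES.items()}

def huffman_decode(bitstream):
    out, buf = [], ""
    for bit in bitstream:
        buf += bit                 # head left on 0, right on 1
        if buf in DECODE:          # reached a leaf: emit, restart at the root
            out.append(DECODE[buf])
            buf = ""
    return "".join(out)

bits = ("010" "011" "1110" "00" "101" "11110" "100" "011"
        "00" "101" "010" "00" "11111" "1100" "100" "1101")
print(huffman_decode(bits))   # duke blue devils
```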


Entropy Coding Algorithms (Content-Dependent Coding)

• Run-length Encoding (RLE)
  – Replaces a sequence of identical consecutive bytes with the number of occurrences
  – The number of occurrences is indicated by a special flag (e.g., !)
  – Example:
    • abcccccccccdeffffggg (20 bytes)
    • abc!9def!4ggg (13 bytes)
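The example above can be reproduced with a short sketch (my own reading of the slide's format: runs of 4 or more bytes become <char><flag><count>, shorter runs are left alone, and the input is assumed never to contain the flag character itself):

```python
from itertools import groupby

FLAG = "!"

def rle_encode(data, min_run=4):
    out = []
    for ch, group in groupby(data):
        n = len(list(group))
        out.append(ch + FLAG + str(n) if n >= min_run else ch * n)
    return "".join(out)

def rle_decode(data):
    out, i = [], 0
    while i < len(data):
        if i + 1 < len(data) and data[i + 1] == FLAG:
            j = i + 2
            while j < len(data) and data[j].isdigit():
                j += 1                        # count may span several digits
            out.append(data[i] * int(data[i + 2:j]))
            i = j
        else:
            out.append(data[i])
            i += 1
    return "".join(out)

encoded = rle_encode("abcccccccccdeffffggg")
print(encoded)   # abc!9def!4ggg
```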


Variations of RLE (Zero-Suppression Technique)

• Assumes that only one symbol appears often (the blank)
• Replace each blank sequence by an M-byte and a byte giving the number of blanks in the sequence
  – Example: M3, M4, M14, …
• Some other definitions are possible
  – Example: M4 = 8 blanks, M5 = 16 blanks, M4M5 = 24 blanks
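A minimal sketch of the simple variant (run of blanks → "M" plus the run length, as in the slide's M3/M4/M14 examples; it assumes the text contains no literal "M<digits>" sequences):

```python
import re

def suppress_blanks(text):
    # Each run of one or more blanks becomes "M" followed by the run length.
    return re.sub(" +", lambda m: f"M{len(m.group())}", text)

def restore_blanks(text):
    return re.sub(r"M(\d+)", lambda m: " " * int(m.group(1)), text)

packed = suppress_blanks("a   b    c")
print(packed)   # aM3bM4c
```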


Adaptive Coding

Motivations:
– The previous algorithms (e.g., Huffman) require statistical knowledge that is often not available (e.g., live audio or video).
– Even when it is available, it can be a heavy overhead.
– Higher-order models incur more overhead. For example, a 256-entry probability table is required for an order-0 model, while an order-1 model requires 256 such probability tables. (An order-1 model considers probabilities of occurrences of 2-symbol sequences.)

The solution is to use adaptive algorithms. Adaptive Huffman Coding is one such mechanism that we will study.

The idea of "adaptiveness" is, however, applicable to other compression algorithms as well.

Adaptive Coding

ENCODER:

    Initialize_model();
    do {
        c = getc( input );
        encode( c, output );
        update_model( c );
    } while ( c != eof );

DECODER:

    Initialize_model();
    while ( (c = decode( input )) != eof ) {
        putc( c, output );
        update_model( c );
    }

The key is that both encoder and decoder use exactly the same initialize_model and update_model routines.
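Adaptive Huffman itself is lengthy, so here is a deliberately simpler adaptive coder (a move-to-front scheme, not Huffman) that illustrates the same contract: both sides start from the same initial model and apply the identical update_model step, so no statistics ever need to be transmitted:

```python
def initialize_model():
    return [chr(i) for i in range(256)]   # both sides start identically

def update_model(model, index):
    model.insert(0, model.pop(index))     # move the seen symbol to the front

def encode(text):
    model, out = initialize_model(), []
    for c in text:
        i = model.index(c)                # frequent symbols drift to small codes
        out.append(i)
        update_model(model, i)
    return out

def decode(codes):
    model, out = initialize_model(), []
    for i in codes:
        out.append(model[i])              # same lookup, same update as encoder
        update_model(model, i)
    return "".join(out)

print(decode(encode("abracadabra")))   # abracadabra
```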

The Sibling Property

The node numbers will be assigned in such a way that:

1. A node with a higher weight will have a higher node number.
2. A parent node will always have a higher node number than its children.

In a nutshell, the sibling property requires that the nodes (internal and leaf) are arranged in order of increasing weight.

The update procedure swaps nodes that violate the sibling property.
– Nodes in violation of the sibling property are identified using the notion of a block.
– All nodes that have the same weight are said to belong to one block.

Flowchart of the update procedure

The Huffman tree is initialized with a single node, known as the Not-Yet-Transmitted (NYT) or escape code. This code will be sent every time that a new character, which is not in the tree, is encountered, followed by the ASCII encoding of the character. This allows the decoder to distinguish between a code and a new character. The procedure also creates a new external node for the character and a new NYT from the old NYT node.

[Flowchart, reconstructed as steps:]
1. START. If this is the first appearance of the symbol: the NYT gives birth to a new NYT and an external node for the symbol; increment the weight of the external node; adjust node numbers; go to the old NYT node. Otherwise, go to the symbol's external node.
2. If the node does not have the maximum number in its block, switch it with the highest-numbered node in the block.
3. Increment the node's weight.
4. If this is the root node, STOP; otherwise go to the parent node and repeat from step 2.

Example

Counts (number of occurrences): B:2, C:2, D:2, E:10

Initial Huffman tree, with nodes numbered in accordance with the sibling property:

Root #8 (W=16)
  #6 (W=6)
    #4 (W=2): NYT #0, B #1 (W=2)
    #5 (W=4): C #2 (W=2), D #3 (W=2)
  E #7 (W=10)

Example: first appearance of symbol A

Counts: A:1, B:2, C:2, D:2, E:10

The old NYT gives birth to a new NYT #0 and an external node A #1 (W=1); node numbers are adjusted and weights on the path to the root are incremented:

Root #10 (W=16+1)
  #8 (W=6+1)
    #6 (W=2+1)
      #2 (W=1): NYT #0, A #1 (W=1)
      B #3 (W=2)
    #7 (W=4): C #4 (W=2), D #5 (W=2)
  E #9 (W=10)

Increment propagates to the root

Counts: A:1+1, B:2, C:2, D:2, E:10

The increment for the second A propagates up the tree:

Root #10 (W=17+1)
  #8 (W=7+1)
    #6 (W=3+1)
      #2 (W=1+1): NYT #0, A #1 (W=1+1)
      B #3 (W=2)
    #7 (W=4): C #4 (W=2), D #5 (W=2)
  E #9 (W=10)

Swapping

Another increment in the count for A (A:2+1) would violate the sibling property, so nodes 1 and 5 are swapped: A (W=2) changes places with D (W=2).

Root #10 (W=18)
  #8 (W=8)
    #6 (W=4)
      #2 (W=2): NYT #0, D #1 (W=2)
      B #3 (W=2)
    #7 (W=4): C #4 (W=2), A #5 (W=2)
  E #9 (W=10)

(Swap nodes 1 and 5)

Counts: A:3, B:2, C:2, D:2, E:10

After the swap, the increment is applied and propagates up:

Root #10 (W=18+1)
  #8 (W=8+1)
    #6 (W=4)
      #2 (W=2): NYT #0, D #1 (W=2)
      B #3 (W=2)
    #7 (W=4+1): C #4 (W=2), A #5 (W=2+1)
  E #9 (W=10)


Swapping … contd.

Counts: A:3+1, B:2, C:2, D:2, E:10

The next increment for A again propagates up without a swap:

Root #10 (W=19+1)
  #8 (W=9+1)
    #6 (W=4)
      #2 (W=2): NYT #0, D #1 (W=2)
      B #3 (W=2)
    #7 (W=5+1): C #4 (W=2), A #5 (W=3+1)
  E #9 (W=10)

Swapping … contd.

Another increment in the count for A (A:4+1) causes a swap of a sub-tree: A (W=4) changes places with internal node #6 (W=4).

Before the swap:

Root #10 (W=20)
  #8 (W=10)
    #6 (W=4)
      #2 (W=2): NYT #0, D #1 (W=2)
      B #3 (W=2)
    #7 (W=6): C #4 (W=2), A #5 (W=4)
  E #9 (W=10)

(Swap nodes 5 and 6)

Swapping … contd.

Further swapping is needed to fix the tree. After the sub-tree swap, A (#6) receives its increment (W=4+1), which makes node #8 (W=10+1) outweigh E #9 (W=10):

Root #10 (W=20)
  #8 (W=10+1)
    A #6 (W=4+1)
    #7 (W=6)
      C #4 (W=2)
      #5 (W=4): #2 (W=2) [NYT #0, D #1 (W=2)], B #3 (W=2)
  E #9 (W=10)

(Swap nodes 8 and 9)

Swapping … contd.

After swapping nodes 8 and 9, the increment reaches the root. Final counts: A:5, B:2, C:2, D:2, E:10.

Root #10 (W=20+1)
  E #8 (W=10)
  #9 (W=10+1)
    A #6 (W=5)
    #7 (W=6)
      C #4 (W=2)
      #5 (W=4): #2 (W=2) [NYT #0, D #1 (W=2)], B #3 (W=2)

Lempel-Ziv-Welch (LZW) Compression Algorithm

Introduction to LZW

As mentioned earlier, static coding schemes require some knowledge about the data before encoding takes place.

Universal coding schemes, like LZW, do not require advance knowledge and can build such knowledge on-the-fly.

LZW is a widely used technique for general-purpose data compression due to its simplicity and versatility.

It is the basis of many PC utilities that claim to “double the capacity of your hard drive”.

LZW compression uses a code table; 4096 is a common choice for the number of table entries.


Introduction to LZW (cont'd)

Codes 0-255 in the code table are always assigned to represent single bytes from the input file.

When encoding begins, the code table contains only these first 256 entries, with the remainder of the table being blanks.

Compression is achieved by using codes 256 through 4095 to represent sequences of bytes.

As the encoding continues, LZW identifies repeated sequences in the data and adds them to the code table.

Decoding is achieved by taking each code from the compressed file and translating it through the code table to find what character or characters it represents.


LZW Encoding Algorithm

1  Initialize table with single-character strings
2  P = first input character
3  WHILE not end of input stream
4      C = next input character
5      IF P + C is in the string table
6          P = P + C
7      ELSE
8          output the code for P
9          add P + C to the string table
10         P = C
11 END WHILE
12 output the code for P
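The pseudocode above transcribes directly into a Python sketch (single-pass, table limit ignored for brevity):

```python
def lzw_compress(data):
    table = {chr(i): i for i in range(256)}   # codes 0-255: single bytes
    next_code = 256
    P, out = data[0], []
    for C in data[1:]:
        if P + C in table:
            P = P + C                         # grow the current match
        else:
            out.append(table[P])              # emit code for the longest match
            table[P + C] = next_code          # learn the new sequence
            next_code += 1
            P = C
    out.append(table[P])                      # flush the final match
    return out

print(lzw_compress("BABAABAAA"))   # [66, 65, 256, 257, 65, 260]
```

The output matches the step-by-step trace in Example 1 below.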

Example 1: Compression using LZW

BABAABAAA

Example 1: LZW Compression Step 1
BABAABAAA                         P = A, C = empty

ENCODER OUTPUT            STRING TABLE
output code  representing codeword  string
66           B            256       BA

Example 1: LZW Compression Step 2
BABAABAAA                         P = B, C = empty

ENCODER OUTPUT            STRING TABLE
output code  representing codeword  string
66           B            256       BA
65           A            257       AB

Example 1: LZW Compression Step 3
BABAABAAA                         P = A, C = empty

ENCODER OUTPUT            STRING TABLE
output code  representing codeword  string
66           B            256       BA
65           A            257       AB
256          BA           258       BAA

Example 1: LZW Compression Step 4
BABAABAAA                         P = A, C = empty

ENCODER OUTPUT            STRING TABLE
output code  representing codeword  string
66           B            256       BA
65           A            257       AB
256          BA           258       BAA
257          AB           259       ABA

Example 1: LZW Compression Step 5
BABAABAAA                         P = A, C = A

ENCODER OUTPUT            STRING TABLE
output code  representing codeword  string
66           B            256       BA
65           A            257       AB
256          BA           258       BAA
257          AB           259       ABA
65           A            260       AA

Example 1: LZW Compression Step 6
BABAABAAA                         P = AA, C = empty

ENCODER OUTPUT            STRING TABLE
output code  representing codeword  string
66           B            256       BA
65           A            257       AB
256          BA           258       BAA
257          AB           259       ABA
65           A            260       AA
260          AA


LZW Decompression

The LZW decompressor builds the same string table during decompression.

The table starts with the first 256 entries initialized to single characters.

The string table is updated for each code in the input stream, except the first one.

Decoding is achieved by reading codes and translating them through the code table that is being built.

LZW Decompression Algorithm

1  Initialize table with single-character strings
2  OLD = first input code
3  output translation of OLD
4  WHILE not end of input stream
5      NEW = next input code
6      IF NEW is not in the string table
7          S = translation of OLD
8          S = S + C
9      ELSE
10         S = translation of NEW
11     output S
12     C = first character of S
13     add translation of OLD + C to the string table
14     OLD = NEW
15 END WHILE
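The decoding pseudocode, including the special case on line 6 (a code that is not yet in the table), can be sketched as:

```python
def lzw_decompress(codes):
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    OLD = codes[0]
    out = [table[OLD]]
    C = table[OLD][0]
    for NEW in codes[1:]:
        if NEW not in table:
            S = table[OLD] + C                # code not known yet: OLD + C
        else:
            S = table[NEW]
        out.append(S)
        C = S[0]                              # first character of S
        table[next_code] = table[OLD] + C     # add translation of OLD + C
        next_code += 1
        OLD = NEW
    return "".join(out)

print(lzw_decompress([66, 65, 256, 257, 65, 260]))   # BABAABAAA
```

The special case fires on the final code 260, which the encoder emitted in the same step in which it added AA to its table.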

Example 2: LZW Decompression

Decompress the output of Example 1: <66><65><256><257><65><260>

Example 2: LZW Decompression Step 1
<66><65><256><257><65><260>       OLD = 66, NEW = 65, S = A, C = A

DECODER OUTPUT    STRING TABLE
string            codeword  string
B
A                 256       BA

Example 2: LZW Decompression Step 2
<66><65><256><257><65><260>       OLD = 65, NEW = 256, S = BA, C = B

DECODER OUTPUT    STRING TABLE
string            codeword  string
B
A                 256       BA
BA                257       AB

Example 2: LZW Decompression Step 3
<66><65><256><257><65><260>       OLD = 256, NEW = 257, S = AB, C = A

DECODER OUTPUT    STRING TABLE
string            codeword  string
B
A                 256       BA
BA                257       AB
AB                258       BAA

Example 2: LZW Decompression Step 4
<66><65><256><257><65><260>       OLD = 257, NEW = 65, S = A, C = A

DECODER OUTPUT    STRING TABLE
string            codeword  string
B
A                 256       BA
BA                257       AB
AB                258       BAA
A                 259       ABA

Example 2: LZW Decompression Step 5
<66><65><256><257><65><260>       OLD = 65, NEW = 260 (not yet in the table), S = AA, C = A

DECODER OUTPUT    STRING TABLE
string            codeword  string
B
A                 256       BA
BA                257       AB
AB                258       BAA
A                 259       ABA
AA                260       AA


LZW: Some Notes

This algorithm compresses repetitive sequences of data well.

Since the codewords are 12 bits, any single encoded character will expand the data size rather than reduce it.

Compression is poor while the table is still filling; after a reasonable string table is built, compression improves dramatically.

LZW requires no prior information about the input data stream.

LZW can compress the input stream in one single pass.

Another advantage of LZW is its simplicity, allowing fast execution.

LZW: Limitations

What happens when the dictionary gets too large (i.e., when all the 4096 locations have been used)?

Here are some options usually implemented:
– Simply forget about adding any more entries and use the table as is.
– Throw the table away and rebuild it when it no longer gives good compression.
– Re-initialize the table with the 256 single input characters.

