
EE #652 FINAL PROJECT REPORT

JPEG COMPRESSION TECHNIQUE


BY SUJITH REDDY GUDISA [RED ID # 812741898]

ABSTRACT: JPEG is an image compression standard developed by a joint ISO/CCITT committee known as the Joint Photographic Experts Group. JPEG defines two basic compression methods: a DCT-based method for lossy compression and a predictive method for lossless compression. This project gives an overview of the DCT-based lossy JPEG compression technique. The steps of the algorithm are explained briefly in this report and implemented in MATLAB.

INDEX TERMS: Joint Photographic Experts Group (JPEG), image compression, Discrete Cosine Transform (DCT), Inverse Discrete Cosine Transform (IDCT), quantization, inverse quantization, PSNR.

INTRODUCTION

An image is a two-dimensional array (matrix) of picture elements, called pixels. Each pixel represents the image intensity at the corresponding location. The size of the image is the number of pixels (width x height). The growing availability of digital imaging devices such as scanners and digital cameras has led to an enormous increase in the number of digital images in recent years. Representing these images requires a vast amount of data, and hence large storage capacity to store them and high bandwidth to transfer them across networks. This has motivated the development of many image compression techniques for various applications and needs. Generally, there are two types of compression: lossless compression and lossy compression.

Lossless Compression: In this type of compression the decompressed file is identical to the original file. For images the compression ratio goes up to about 4:1. Example: Huffman coding.

Lossy Compression: This type of compression maximizes the compression ratio while maintaining the required level of quality. There is some degradation in the quality of the decompressed file compared with the original, although at moderate compression ratios the result can be visually lossless. The compression ratio goes up to about 80:1. Example: DCT-based transform coding.

The level of compression is measured by the Compression Ratio (CR), defined as the ratio of the number of bits required to represent the data before compression to the number of bits required to represent the data after compression:

Compression Ratio = (# of bits before compression) / (# of bits after compression)
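As a worked example of the compression ratio formula, the sketch below uses hypothetical numbers that are not taken from this project: a 512 x 512 RGB image stored at 24 bits per pixel, and an assumed compressed size of 40,000 bytes.

```matlab
% Hypothetical example values (not measured in this project).
width = 512; height = 512;
bitsPerPixel = 24;                 % 3 bytes per pixel (R, G, B)
compressedBytes = 40000;           % assumed size of the compressed file

bitsBefore = width * height * bitsPerPixel;   % 6,291,456 bits
bitsAfter  = compressedBytes * 8;             % 320,000 bits
CR = bitsBefore / bitsAfter;                  % about 19.66, i.e. roughly 20:1
fprintf('Compression ratio = %.2f:1\n', CR);
```

With these assumed sizes the ratio is about 20:1, which is above the range quoted for lossless coding and within the range quoted for lossy coding.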

Images usually contain large amounts of redundant data, and compression is achieved by removing these redundancies. There are four main types of redundancy in images:

Interpixel Redundancy: Also known as spatial redundancy. The values of adjacent pixels in an image are highly correlated, so much of the information carried by a pixel can be predicted from its neighbors.

Spectral Redundancy: This refers to the correlation between different color planes or spectral bands.

Psychovisual Redundancy: The human visual system has unequal sensitivity to different visual information; for example, the human eye is less sensitive to very high and very low intensities than to mid-range intensities. Information that the eye barely perceives is psychovisually redundant.

Coding Redundancy: This arises when the codes used to represent the data use more bits than the entropy of the source requires, for example when frequent and rare values are given codewords of the same length.

In this project, we deal with the DCT-based JPEG compression technique, which is a lossy compression technique. JPEG is one of the most important current standards for image compression. The standard is named after the Joint Photographic Experts Group (JPEG), the joint ISO/CCITT committee that created it. JPEG takes advantage of the redundancies in images to achieve higher compression ratios. It has several modes of operation, and in this project we deal with the sequential mode, which is the default mode.

Sequential Mode: Each gray-level image or color image component is encoded in a single left-to-right, top-to-bottom scan.

SYSTEM DESCRIPTION

The basic image encoding and decoding block diagrams are discussed below.

[Figure: block diagram of a general image compression system. ENCODER: Original Data -> Mapper -> Quantizer -> Symbol Encoder -> Compressed Data. DECODER: Compressed Data -> Symbol Decoder -> Dequantizer -> Inverse Mapper -> Decompressed Data.]

Mapper: It transforms the input data into a format designed to reduce interpixel redundancy. The output of the mapper is a set of transform coefficients. It is a reversible operation, and it facilitates the exploitation of redundancies in the later stages. Example: DCT in JPEG.

Quantizer: It exploits psychovisual redundancy. It reduces the accuracy of the mapper's output, i.e. it quantizes the mapper output coefficients. Hence, it is the lossy part of the encoder and is irreversible.

Symbol Encoder: This unit represents the output of the quantizer with fixed-length or variable-length codes. It is a reversible process.

The decoder performs the reverse operations of the encoder, in the opposite order.

JPEG ENCODER: The JPEG encoder consists of four phases:
1. Conversion of the image from RGB format to a luminance/chrominance format (YIQ or YUV; YCbCr in this project) and division of the image into 8 x 8 pixel blocks.
2. DCT of each image block.
3. Quantization.
4. Entropy coding.

The block diagram of the JPEG compression technique implemented in this project is shown below.

[Figure: block diagram of the implemented system. Encoding: Original RGB Image -> Transform RGB to YCbCr -> Subsampling -> DCT -> Quantization. Decoding: Inverse Quantization -> IDCT -> Up Sampling -> Reconstructed Image.]

These blocks are explained in detail below.

Transform RGB to YCbCr format: Color images are represented by three components: red (R), green (G) and blue (B). This is called the RGB format, and each pixel is represented with 3 bytes, one for each component. JPEG converts the RGB color space to the YCbCr color space because YCbCr separates brightness from color, so it tends to compress more tightly than RGB. In YCbCr, Y is the luminance, which represents the brightness of the pixel. Cb and Cr are the chrominance values, which describe the color itself. To implement this, we use the MATLAB function rgb2ycbcr (Image Processing Toolbox) to convert the image to the YCbCr color space.

Division of image: In an image the contents change relatively slowly across small areas, i.e. it is unusual for intensity values to vary widely several times within a small area. Compressing the entire image as a single unit therefore does not yield optimal results, so JPEG divides the image into 8 x 8 blocks. We choose a block size of 8 x 8 because it keeps the DCT (and IDCT) computation fast. An example of an image being divided into blocks is shown below.

[Figure: an image divided into 8 x 8 blocks.]
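The division into 8 x 8 blocks can be sketched with the base MATLAB function mat2cell. This is only an illustration and not the method used in the project code, which relies on blkproc; the 16 x 24 test matrix is a hypothetical stand-in for one image channel whose dimensions are multiples of 8.

```matlab
% Hypothetical 16 x 24 test channel (values 1..384) standing in for Y, Cb or Cr.
img = reshape(1:16*24, 16, 24);
blockSize = 8;
[rows, cols] = size(img);

% mat2cell needs the height of every block row and the width of every block column.
rowSizes = repmat(blockSize, 1, rows / blockSize);   % [8 8]
colSizes = repmat(blockSize, 1, cols / blockSize);   % [8 8 8]
blocks = mat2cell(img, rowSizes, colSizes);          % 2 x 3 cell array of 8 x 8 blocks

lastBlock = blocks{2, 3};    % same data as img(9:16, 17:24)
```

Each cell then holds one 8 x 8 block that can be transformed and quantized independently of the others.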

Subsampling: A YCbCr image consists of a luma component and two chroma components. We can reduce the data in the chroma components while keeping the full detail of the luma component; this is called chroma subsampling. In this project we use the 4:2:0 subsampling scheme: every second row and every second column of the Cb and Cr components is discarded, so each chroma component is reduced to half the width and half the height of the original.
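A minimal sketch of this 4:2:0 decimation is given below. The 4 x 4 matrix is a hypothetical chroma plane used only for illustration; keeping the odd-numbered rows and columns has the same effect as deleting the even-numbered ones.

```matlab
% Hypothetical 4 x 4 chroma plane; in the project this would be Cb or Cr.
Cb = magic(4);                       % [16 2 3 13; 5 11 10 8; 9 7 6 12; 4 14 15 1]

% Keep rows 1, 3, ... and columns 1, 3, ...
Cb_sub = Cb(1:2:end, 1:2:end);       % 2 x 2 result: [16 3; 9 6]

% Sample count for this region: Y keeps all 16 samples, Cb and Cr keep 4 each.
samplesBefore = 3 * numel(Cb);                    % 48
samplesAfter  = numel(Cb) + 2 * numel(Cb_sub);    % 24
```

The total number of samples is halved before any transform coding is applied.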

Discrete Cosine Transform (DCT): We perform the DCT on the image blocks so that the high-frequency content can be reduced and the result coded efficiently into a bit string. Each 8 x 8 image block is a 64-point discrete signal that is a function of the two spatial dimensions x and y. The 2D DCT is applied to each of these blocks, and the outputs obtained are the basis-signal amplitudes, or DCT coefficients. The coefficient with zero frequency in both dimensions is called the DC coefficient; it is proportional to the average intensity of that particular block. The remaining 63 coefficients are called the AC coefficients. The DCT equation is shown below.

$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$

where $C(u), C(v) = 1/\sqrt{2}$ for $u, v = 0$, and $C(u), C(v) = 1$ otherwise.

Quantization: Quantization is the process of representing a large set of values with a smaller set. In JPEG, quantization is the lossy part. We divide each DCT coefficient by the corresponding quantization coefficient and round the result; as a result, the DCT values are represented less accurately. In this project, we used the default 8 x 8 quantization matrices for luminance and chrominance. The divisor applied to a coefficient is called the quantization step, and quantization aims at reducing the total number of bits needed for the compressed image. The quantization operation is given by

$$\hat{F}(u,v) = \mathrm{round}\!\left(\frac{F(u,v)}{Q(u,v)}\right)$$

where $F(u,v)$ represents the DCT coefficient, $Q(u,v)$ is the quantization matrix, and $\hat{F}(u,v)$ represents the quantized DCT coefficient. The user can choose the amount of compression depending on the application: more compression means larger entries in Q. There is a trade-off between quality and compression. Hence, quantization represents the DCT coefficients with no greater precision than is required to achieve the desired level of quality.
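The DCT and quantization of a single block can be sketched in base MATLAB as follows. This is an illustration under stated assumptions rather than the project's implementation, which calls dct2 through blkproc: the DCT is written as a matrix product F = T*f*T', the test block is a hypothetical horizontal ramp, and Q is the default luminance matrix that also appears as lum_matrix in the MATLAB code of this report.

```matlab
% Build the 8 x 8 DCT matrix T so that F = T*f*T' matches the DCT equation.
N = 8;
[x, u] = meshgrid(0:N-1, 0:N-1);          % x varies across columns, u down rows
T = sqrt(2/N) * cos((2*x + 1) .* u * pi / (2*N));
T(1, :) = T(1, :) / sqrt(2);              % C(0) = 1/sqrt(2)

% Hypothetical test block: horizontal ramp 100, 108, ..., 156 (mean 128).
f = repmat(100:8:156, N, 1);
F = T * f * T';                           % forward 2D DCT of the block

% Default luminance quantization matrix.
Q = [16 11 10 16  24  40  51  61;
     12 12 14 19  26  58  60  55;
     14 13 16 24  40  57  69  56;
     14 17 22 29  51  87  80  62;
     18 22 37 56  68 109 103  77;
     24 35 55 64  81 104 113  92;
     49 64 78 87 103 121 120 101;
     72 92 95 98 112 100 103  99];
Fq = round(F ./ Q);                       % quantized coefficients
```

MATLAB indices start at 1, so F(1,1) is the DC coefficient F(0,0). For this block it equals 8 times the block mean (1024) and quantizes to 64, and only the first row of F is non-zero because the block does not vary vertically.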

JPEG DECODER: The JPEG decoder performs the following operations:
1. Inverse Quantization
2. Inverse Discrete Cosine Transform (IDCT)
3. Upsampling

Inverse Quantization: This is the first step in reconstructing the original image. We perform inverse quantization by multiplying the quantized coefficients by the same matrix that was used in the quantization process in the encoder. The function is shown below.

$$F'(u,v) = \hat{F}(u,v)\, Q(u,v)$$

where $\hat{F}(u,v)$ is the quantized DCT coefficient, $Q(u,v)$ is the quantization matrix, and $F'(u,v)$ is the reconstructed DCT coefficient.

Inverse Discrete Cosine Transform: The IDCT is the reverse process of the DCT in the encoder. The equation of the IDCT is shown below.

$$f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$

where $C(u), C(v) = 1/\sqrt{2}$ for $u, v = 0$, and $C(u), C(v) = 1$ otherwise.

Upsampling: The full-size chroma components are reconstructed from the subsampled ones by upsampling. This is done by row-column replication: the intermediate pixel values are replicated from the neighboring pixels. Then the Y component and the upsampled Cb and Cr components are combined to form a YCbCr image, which is converted to an RGB image using the ycbcr2rgb function in MATLAB.
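Row-column replication can be sketched with plain indexing, as below. The 2 x 2 input is a hypothetical subsampled chroma plane, and the indexing is an alternative to the explicit loops used in the project code.

```matlab
% Hypothetical 2 x 2 subsampled chroma plane.
Cb_sub = [16 3; 9 6];

% Repeat every row index and every column index twice: 1 1 2 2.
rowIdx = ceil((1:2*size(Cb_sub, 1)) / 2);
colIdx = ceil((1:2*size(Cb_sub, 2)) / 2);
Cb_up = Cb_sub(rowIdx, colIdx);
% Cb_up = [16 16 3 3; 16 16 3 3; 9 9 6 6; 9 9 6 6]
```

Each decoded chroma sample then covers a 2 x 2 group of pixels, which restores the size of the Y component.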
PSNR: The peak signal-to-noise ratio (PSNR) is a measure of the quality of the reconstructed image. It is the ratio of the maximum signal power to the noise power, where the noise power is the mean squared error (MSE) between the original image and the reconstructed image:

$$\mathrm{PSNR\ (dB)} = 10 \log_{10} \frac{(\text{peak signal})^2}{\mathrm{MSE}}$$

For 8-bit images the peak signal value is 255. The quality of compression is said to be good if the PSNR falls in the range of 30 dB to 45 dB.
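The PSNR computation can be sketched on a pair of small matrices. The pixel values are hypothetical and chosen only to make the arithmetic easy to follow. The cast to double matters because subtraction between uint8 arrays saturates at 0 in MATLAB, which would hide every negative error.

```matlab
% Hypothetical 2 x 2 original and reconstructed 8-bit patches.
orig  = uint8([52 55; 61 59]);
recon = uint8([54 55; 60 62]);

% Cast before subtracting: uint8 arithmetic would clip negative errors to 0.
err = double(orig) - double(recon);      % [-2 0; 1 -3]
mse = mean(err(:).^2);                   % (4 + 0 + 1 + 9) / 4 = 3.5
psnr_db = 10 * log10(255^2 / mse);       % about 42.69 dB

% For comparison, the saturating uint8 difference loses the negative errors.
clipped = orig - recon;                  % uint8([0 0; 1 0])
```

A value of about 42.69 dB falls inside the 30 dB to 45 dB range quoted above.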

SIMULATION RESULTS

Original Image:

Reconstructed Image:

Difference:

The difference between the original image and the reconstructed image is shown above. The details in the difference image become visible only on close inspection.


CONCLUSION

In this project, we successfully implemented the JPEG compression technique in MATLAB. In doing so, we became familiar with basic concepts such as the DCT, quantization, subsampling, upsampling, the IDCT and inverse quantization. We compressed a .bmp image using the JPEG algorithm. As discussed earlier, PSNR is a measure of the quality of the reconstructed image: the higher the PSNR, the higher the quality achieved. The PSNR achieved in this project is 36.47 dB, which falls in the desired range of 30 dB to 45 dB. Hence, the compression achieved is satisfactory, which can also be observed from the difference between the original image and the reconstructed image.

REFERENCES
1. http://en.wikipedia.org/wiki/Jpeg
2. Ze-Nian Li and Mark S. Drew, Fundamentals of Multimedia, Chapter 9: Image Compression Standards.
3. Gregory K. Wallace, The JPEG Still Picture Compression Standard.
4. Khalid Sayood, Introduction to Data Compression.


MATLAB CODE

% Name:   Sujith Reddy Gudisa
% Red Id: 812741898
% EE #652 Final Project
% JPEG Compression Technique
% Date:   05/19/2011

clc; clear all; close all;

% Read and display the original image.
% Assumes the image dimensions are multiples of 16, so that the subsampled
% chroma planes divide evenly into 8 x 8 blocks.
img = imread('landscape', 'bmp');
figure(); imshow(img); title('Original Image');
[rows, columns, color] = size(img);

% RGB to YCbCr conversion
image_ycbcr = rgb2ycbcr(img);
Y   = image_ycbcr(:,:,1);   % Y component
C_b = image_ycbcr(:,:,2);   % Cb component
C_r = image_ycbcr(:,:,3);   % Cr component

% Subsampling (4:2:0): delete every second row and column of Cb and Cr
Cb = C_b; Cr = C_r;
Cb(2:2:rows,:) = []; Cb(:,2:2:columns) = [];
Cr(2:2:rows,:) = []; Cr(:,2:2:columns) = [];

% ENCODER
% DCT on 8 x 8 blocks
DCT = @dct2;
Y_d  = blkproc(Y,  [8 8], DCT);  Y_dct  = fix(Y_d);    % Y component
Cb_d = blkproc(Cb, [8 8], DCT);  Cb_dct = fix(Cb_d);   % Cb component
Cr_d = blkproc(Cr, [8 8], DCT);  Cr_dct = fix(Cr_d);   % Cr component

% Quantization
% Quantization matrices (default)
lum_matrix = [16 11 10 16  24  40  51  61;
              12 12 14 19  26  58  60  55;
              14 13 16 24  40  57  69  56;
              14 17 22 29  51  87  80  62;
              18 22 37 56  68 109 103  77;
              24 35 55 64  81 104 113  92;
              49 64 78 87 103 121 120 101;
              72 92 95 98 112 100 103  99];
chrom_matrix = [17 18 24 47 99 99 99 99;
                18 21 26 66 99 99 99 99;
                24 26 56 99 99 99 99 99;
                47 66 99 99 99 99 99 99;
                99 99 99 99 99 99 99 99;
                99 99 99 99 99 99 99 99;
                99 99 99 99 99 99 99 99;
                99 99 99 99 99 99 99 99];

% Divide each 8 x 8 block of coefficients by the matrix and round
Y_q  = @(blk) round(blk ./ lum_matrix);    Yq  = blkproc(Y_dct,  [8 8], Y_q);
Cb_q = @(blk) round(blk ./ chrom_matrix);  Cbq = blkproc(Cb_dct, [8 8], Cb_q);
Cr_q = @(blk) round(blk ./ chrom_matrix);  Crq = blkproc(Cr_dct, [8 8], Cr_q);

% DECODER
% Inverse quantization: multiply each block by the same matrix
Y_iq  = @(blk) round(blk .* lum_matrix);    Yiq  = blkproc(Yq,  [8 8], Y_iq);
Cb_iq = @(blk) round(blk .* chrom_matrix);  Cbiq = blkproc(Cbq, [8 8], Cb_iq);
Cr_iq = @(blk) round(blk .* chrom_matrix);  Criq = blkproc(Crq, [8 8], Cr_iq);

% IDCT on 8 x 8 blocks
IDCT = @idct2;
Y_i  = blkproc(Yiq,  [8 8], IDCT);  Y_idct  = uint8(fix(Y_i));    % Luminance Y component
Cb_i = blkproc(Cbiq, [8 8], IDCT);  Cb_idct = uint8(fix(Cb_i));   % Chrominance Cb component
Cr_i = blkproc(Criq, [8 8], IDCT);  Cr_idct = uint8(fix(Cr_i));   % Chrominance Cr component

% Upsampling by row-column replication
Cbrc = zeros(rows, columns, 'uint8');
Crrc = zeros(rows, columns, 'uint8');
for k = 2:2:rows
    for i = 2:2:columns
        % Place each decoded chroma sample at the odd row and column positions
        Cbrc(k-1,i-1) = Cb_idct(k/2, i/2);
        Crrc(k-1,i-1) = Cr_idct(k/2, i/2);
    end
end
for k = 2:2:rows        % copy each odd row into the even row below it
    Cbrc(k,:) = Cbrc(k-1,:);
    Crrc(k,:) = Crrc(k-1,:);
end
for i = 2:2:columns     % copy each odd column into the even column next to it
    Cbrc(:,i) = Cbrc(:,i-1);
    Crrc(:,i) = Crrc(:,i-1);
end

% Reconstruction and conversion from YCbCr to RGB format
rec = cat(3, Y_idct, Cbrc, Crrc);
rec_image = ycbcr2rgb(rec);

% Reconstructed image
figure(); imshow(rec_image); title('Reconstructed Image');

% Difference between the original and the reconstructed image.
% imabsdiff avoids the saturation of uint8 subtraction at 0.
diff_img = cat(3, imabsdiff(Y, Y_idct), imabsdiff(C_b, Cbrc), imabsdiff(C_r, Crrc));
figure(); imshow(diff_img); title('Difference');

% Calculation of PSNR on the Y component (cast to double before subtracting)
dif  = (double(Y) - double(Y_idct)).^2;
MSE  = sum(dif(:)) / (rows*columns);
PSNR = 10*log10(255^2 / MSE)
% PSNR reported in this project: 36.4719 dB
