
Multiclass Detect of Current Steganographic Methods for JPEG Format Based Re-stegnography

Xiaozhong Pan, BoTao Yan, Ke Niu
Network and Information Security Key Laboratory, Electronics Department of Engineering College of the APF, Xian, China
E-mail: xzpan@yeah.com; ywbotao@163.com; niuke@163.com

Abstract: The aim of this paper is to classify JPEG stego images according to the steganographic method that produced them (current steganographic methods such as F5, OutGuess, Steghide, JPhide and Jsteg). Although multiclass detection methods have been published previously, they all have various limitations and disadvantages. First, the models of some detection methods are too complicated and their processing is too cumbersome. Second, the performance of some detection methods declines as the embedding rate decreases. Based on re-steganography, the algorithm in this paper extracts 109-dimensional features and trains SVM (support vector machine) multi-class classifiers to classify all kinds of stego images and cover images with very high precision (approximately 100%). The model is simple, and the performance remains excellent as the embedding rate decreases.
Keywords: re-steganography; multiclass detection; JPEG

encrypt the images under test once; then, using the relation between the storage sizes of the original images (cover) and the encrypted ones (stego), construct the features that serve as the basis for deciding whether an image is stego. This method works at a very high accuracy rate (98%). Building on this, we can apply the idea of re-steganography to universal detection: we construct effective features by extracting the storage sizes of the original images and the stego ones separately and, combined with SVM (support vector machine) multi-class classification, realize generic detection of several popular steganography algorithms at high detection rates.

II. CORRELATIVE THEORY

The main basis of Luo's detection method [8] is that the change in storage size caused by re-steganography is much smaller than that caused by the first steganography. Specifically, the storage size of a JPEG image is reduced by about 50% after the first JPhide embedding, whereas a second JPhide embedding changes the storage size very little, almost not at all. In other words, processing a cover image with the JPhide algorithm yields a greater compression ratio than processing a stego image already encrypted by JPhide. Luo et al. obtained this conclusion through a large number of experiments. However, it has been used neither in universal detection nor in multi-class classification. For universal detection and multi-class classification, the conclusion can be extended as follows: for each of the 5 steganography algorithms, processing a cover image yields a greater compression ratio than processing a stego image. This conclusion, which is the basis of this paper, will be demonstrated using a large number of experiments. Since JPhide re-steganography can be used to detect JPhide itself, do other steganography algorithms offer a similar detection possibility? The answer is yes.
According to the conclusion above, we can use the re-steganography idea to achieve universal detection and multi-class classification. Experiments also show that re-steganography gives good results for specific detection of these steganography algorithms, such as F5, OutGuess, Steghide and Jsteg. There are two types of re-steganography: in one, the two steganography algorithms are the same (experiment 1); in the other, the two steganography algorithms are different (experiment 2). First, we conduct experiment 1:

I. INTRODUCTION

The goals of JPEG image steganography detection are not only to distinguish cover images (normal images) from stego images, but also to determine the steganography algorithm of the stego images and to extract the embedded information, for which determining the steganography algorithm is a precondition. Fridrich et al. [1-2] analysed in depth several current steganography algorithms (such as F5 [3], OutGuess [4], Steghide [5], JPhide [6] and Jsteg [7]) and designed a detection system that, through multi-class classification, assigns stego images to the correct steganography algorithm. The detection system detects these steganography algorithms with high probability and realizes universal detection within a certain range (limited to several steganography methods). The shortcomings of the system are as follows: 1. The process of the algorithm is too cumbersome, the models are too complex, the feature sets are too large and the computational complexity is too high, all of which make it more difficult to implement. 2. The performance of the algorithm trends downward as the embedding rate is reduced; once the embedding rate drops to a certain level, the detection results decline sharply, and in extreme circumstances (for example, when the embedded information is only 20 B) it is hard even to distinguish the cover images from the stego ones, let alone perform multi-class classification. Luo et al. [8] proposed a specific detection method based on re-steganography that applies only to the JPhide algorithm: First,
978-1-4244-5848-6/10/$26.00 2011 IEEE

1) Use JPhide to encrypt 2000 images (collection A) twice in succession, with messages of storage size 1k and 0.5k respectively, obtaining 2000 stego images (collection B) and 2000 re-steganography images (collection C).
2) Let p1i = S(afi)/S(ai) and p2i = S(bfi)/S(afi), i = 1, 2, ..., 4000, where S(X) is the storage size of image X, p1i are the features of the cover images and p2i are the features of the stego images. Extract the 2000 feature groups (2000 p1i features and 2000 p2i features) and train the FLD [9] classifier on the features proposed in this paper.
3) Repeat the two steps above on another 2000 images to extract another set of 2000 features, and then test them with the trained classifier.
4) Use the same method to detect each of the other steganography algorithms, for example F5, OutGuess, Steghide and Jsteg. All the experimental results are in table 1:
TABLE I. DETECTION RATE WITH FEATURES USING THE SAME STEGANOGRAPHY TWICE
algorithm   rN(%)    rS(%)    wN(%)   wS(%)
F5          99.45    100.00   0.55    0.00
JPhide      100.00   100.00   0.00    0.00
OutGuess    100.00   100.00   0.00    0.00
Steghide    95.30    97.55    4.70    2.45
Jsteg       100.00   100.00   0.00    0.00

as a typical combination to illustrate the problem. The test images are encrypted with one of the 5 steganography tools the first time and with JPhide the second time. The classifier is the one generated with JPhide in the first experiment. Other conditions are the same; the experimental results are in table 2:

TABLE II. DETECTION RATE WITH FEATURES USING TWO DIFFERENT STEGANOGRAPHY ALGORITHMS
algorithm   rN(%)    rS(%)    wN(%)   wS(%)
F5          97.30    99.00    2.70    1.00
JPhide      100.00   100.00   0.00    0.00
OutGuess    99.20    92.20    0.80    7.80
Steghide    94.77    99.40    5.23    0.60
Jsteg       98.40    97.86    1.60    2.14
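The size-ratio features used in these experiments can be sketched as follows. This is a minimal illustration, assuming the original image, its first embedding and its second embedding are already available as files on disk; the function names, file names and byte counts are hypothetical, and the embedding itself (JPhide or any other tool) is outside the sketch.

```python
import os
import tempfile


def storage_size(path):
    """S(X): storage size of the image file X, in bytes."""
    return os.path.getsize(path)


def size_ratio(before_path, after_path):
    """Ratio S(after) / S(before) for one embedding pass."""
    before = storage_size(before_path)
    if before == 0:
        raise ValueError("empty file: " + before_path)
    return storage_size(after_path) / before


def ratio_features(original, first_pass, second_pass):
    """Return (p1, p2) = (S(af)/S(a), S(bf)/S(af)) for one image triple."""
    return size_ratio(original, first_pass), size_ratio(first_pass, second_pass)


def _write_dummy(directory, name, n_bytes):
    """Create a placeholder file of n_bytes bytes (stands in for a real image)."""
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        handle.write(b"\0" * n_bytes)
    return path


# Made-up sizes: a large change on the first pass, a small one on the second.
with tempfile.TemporaryDirectory() as tmp:
    a = _write_dummy(tmp, "a.jpg", 1000)
    af = _write_dummy(tmp, "af.jpg", 500)
    bf = _write_dummy(tmp, "bf.jpg", 490)
    p1, p2 = ratio_features(a, af, bf)
```

With real images, p1 would be collected from cover inputs and p2 from inputs that already carry a message, as in step 2) of the experiment.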

In table 1, rN (rightNormal) is the correct detection rate of the cover images, rS (rightSteg) is the correct detection rate of stego images, wN (wrongNormal) is the error detection rate of stego images, and wS (wrongSteg) is the error detection rate of normal images. As can be seen, the classification is very good: with the exception of Steghide, the detection rate reaches 99% or more for every algorithm, and three reach 100%. In experiment 1, the cover images A are encrypted to give the stego images B. In actual detection, if the images under test are covers, encrypting them generates B; if the images under test are stego images, encrypting them generates the second-encryption set C, so p1i and p2i are the features of the cover and the stego images respectively. The above is experiment 1, the case where the two steganography algorithms are the same. In practice, we cannot determine in advance whether an image contains hidden information, and even if an image has been identified as stego, we do not know its specific steganography algorithm. Therefore, experiment 1 is only a special-case test and has no universal detection ability. In most cases the two steganography algorithms are different, which is the subject of the second experiment below (experiment 2); this setting has a certain degree of universal detection ability (meaning that the detection is universal only for the five JPEG-based steganography algorithms, not for all known steganography algorithms). Because the number of combinations of two algorithms is $A_5^2 = 5 \times 4 = 20$ and

In table 2, the JPhide row corresponds to the same setting as experiment 1, so its performance is naturally good. The performance of the other rows declines because the two embeddings use different steganography algorithms and the classifier is a dedicated one that, except for JPhide, does not perform very well on the other algorithms. As a result, we must use all five steganography algorithms to generate integrated features and train a better classifier. By adopting the multi-class SVM classifier and combining it with the storage-size relationship before and after steganography as a feature, we can design a multi-class classification algorithm that achieves general detection. A large number of experiments with the five steganography algorithms were conducted to find the most suitable second steganography algorithm. After comparison, we found that Steghide performs best as the second steganography tool and gives the highest classification accuracy. Its classification results are shown in figure 1; the figures for the other algorithms are omitted.

Figure 1. Detection rate with Steghide re-steganography and FLD.
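On a single scalar feature such as the size ratio, Fisher's linear discriminant reduces to a threshold between the two class means. The sketch below illustrates that special case only; it assumes equal class priors (midpoint threshold), and the sample ratios and function names are invented for illustration rather than taken from the experiments.

```python
from statistics import mean, pvariance


def train_fld_1d(class0, class1):
    """Fit Fisher's linear discriminant on scalar features.

    Returns (w, threshold); a sample x is assigned to class 1
    when w * x > threshold, otherwise to class 0.
    """
    m0, m1 = mean(class0), mean(class1)
    # Within-class scatter: summed squared deviations of both classes.
    scatter = pvariance(class0, m0) * len(class0) + pvariance(class1, m1) * len(class1)
    if scatter == 0:
        raise ValueError("within-class scatter is zero")
    w = (m1 - m0) / scatter
    # Project the midpoint between the class means (equal-prior assumption).
    threshold = w * (m0 + m1) / 2
    return w, threshold


def predict_fld_1d(model, x):
    """Classify one scalar feature with a model from train_fld_1d."""
    w, threshold = model
    return 1 if w * x > threshold else 0


# Invented ratios: class 0 = cover inputs (p1), class 1 = stego inputs (p2).
cover_ratios = [0.48, 0.52, 0.50, 0.55]
stego_ratios = [0.97, 0.99, 0.98, 0.96]
model = train_fld_1d(cover_ratios, stego_ratios)
```

Here `predict_fld_1d(model, 0.51)` falls on the cover side of the midpoint and `predict_fld_1d(model, 0.95)` on the stego side.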

the experimental data are too extensive, we take just 5 of these combinations

Under the same conditions, we conduct the other four experiments. The training features are the storage size ratios before and after steganography, 2,000 features in total, with 400 groups obtained for each algorithm (p1 are cover features, p2 are stego features). With 3000 images


for testing, each algorithm takes 600 groups (p1 are cover features, p2 are stego features). The experimental results are in table 3:
TABLE III. DETECTION RATE WITH INTEGRATED TRAINING
algorithm   rN(%)    rS(%)    wN(%)   wS(%)
Steghide    99.967   100      0.033   0
JPhide      98.60    96.00    1.40    4.00
OutGuess    96.03    95.96    3.97    4.04
F5          100      99.67    0       0.33
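The four rates reported in tables 1 to 3 can be computed from true and predicted labels as sketched below. The sketch assumes, as the table rows suggest, that wN = 100 - rN and wS = 100 - rS; the label strings and the function name are hypothetical.

```python
def detection_rates(true_labels, predicted_labels):
    """Return (rN, rS, wN, wS) in percent for 'cover' / 'stego' labels."""
    pairs = list(zip(true_labels, predicted_labels))
    n_cover = sum(1 for t, _ in pairs if t == "cover")
    n_stego = sum(1 for t, _ in pairs if t == "stego")
    if n_cover == 0 or n_stego == 0:
        raise ValueError("need at least one cover and one stego sample")
    right_cover = sum(1 for t, p in pairs if t == "cover" and p == "cover")
    right_stego = sum(1 for t, p in pairs if t == "stego" and p == "stego")
    r_n = 100.0 * right_cover / n_cover
    r_s = 100.0 * right_stego / n_stego
    # Assumption: each error rate is the complement of the matching correct rate.
    return r_n, r_s, 100.0 - r_n, 100.0 - r_s


# Toy data: one of four covers is misclassified, all four stego images are found.
truth = ["cover"] * 4 + ["stego"] * 4
guess = ["cover", "cover", "cover", "stego", "stego", "stego", "stego", "stego"]
rates = detection_rates(truth, guess)
```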

III. MULTICLASS DETECT SYSTEM DESIGN

A. Feature Construction and Extraction

As can be seen from Section II, the one-dimensional storage size ratio can not only distinguish cover from stego images with a probability close to 1, but also has a certain ability for multi-class classification. Combining this one-dimensional feature with the 108-dimensional features of the improved method of Farid [10], we construct a 109-dimensional feature vector in total.

B. Classifier Design

In [11], a new multi-class support vector machine algorithm was proposed based on the one-class support vector machine (OCSVM), namely a hypersphere multi-class support vector machine. For a k-class problem, k hyperspheres are used to cover the k classes of samples and construct a k-class classifier, each hypersphere being trained on the samples of one class. The algorithm of this paper continues to use multi-class support vector machine technology to implement multi-class classification.

C. System Design

The detection system is divided into two parts: classifier training and detection. Classifier training consists of the following steps:
1) First steganography. We take 4000 cover JPEG images from the library [12] (denote the collection A(ai), i = 1, 2, ..., 4000) and encrypt every image in A using the F5, OutGuess, Steghide and JPhide software, with a 100 B text file as the hidden information. This gives four image sets, each containing 4000 first stego images: AF(afi), AO(aoi), AS(asi) and AJ(aji). Under the same conditions, we use Jsteg to encrypt the set of 4000 GIF images corresponding to A(ai), i = 1, 2, ..., 4000, and obtain a set of 4000 images, its first stego images.
2) Re-steganography. Using Steghide to encrypt the first stego images (5 * 4000 = 20000 in total) with a 20 B text file gives 20000 stego images (referred to as re-steganography images), collected as BF(bfi), BO(boi), BS(bsi), BJ(bji) and BD(bdi) (i = 1, 2, ..., 4000).
3) Extract the ratio of the image storage sizes.
For an image X, let S(X) be its storage size. Taking the 3 images ai, afi and bfi as an example, the storage size ratios are defined as p1i = S(afi)/S(ai) and p2i = S(bfi)/S(afi); the definitions for the other images are similar. Every three related images constitute a group, and by this definition we compute all 20000 (= 4000 * 5) feature values p1i, p2i (i = 1, 2, ..., 4000), in which p1i are the cover features and p2i the stego ones.
4) Following [1], extract 108-dimensional wavelet features from each of six sets of images: the cover set A(ai) and the image sets encrypted by the five steganography tools, AF(afi), AO(aoi), AS(asi), AJ(aji) and AD(adi) (i = 1, 2, ..., 4000). Coupled with p1i or p2i, each image (cover or stego) then has a 109-dimensional feature vector.
5) The four steps above form six collections of images of 4000 images each, with one-to-one correspondence among the six sets, and each image has a 109-dimensional feature vector. These 24000 (4000 * 6) feature vectors are input into support vector machine software (Libsvm [13]) to train a multi-class support vector machine classifier.

Image detection is divided into the following steps:
1) Compute the ratio of the image storage sizes. Use the Steghide software to embed a 20 B text file in the image under test, record the storage size before and after steganography, and take their ratio as a one-dimensional feature.
2) Compute the wavelet statistical features. Perform a wavelet decomposition of the image to form the primitive features, and then obtain a 108-dimensional feature vector after feature selection.
3) Classification. Combine the features and use the multi-class SVM to assign the image to its category, that is, determine which steganography algorithm encrypted the image. The system is shown in figure 2:
[Flowchart: Image to test -> Proportion of image size + Wavelet statistical features -> Multi-class SVM -> Cover / Certain stego image]

Figure 2. JPEG steganography detection system.
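The feature combination step can be sketched as follows: the 108 wavelet features and the one-dimensional size ratio are joined into a 109-dimensional vector and written as one line of the sparse text format read by LIBSVM (`<label> <index>:<value> ...`, with indices starting at 1). The wavelet features are taken as given; the numeric class labels and function names are assumptions made for this illustration.

```python
# Hypothetical numeric labels for the six classes; any consistent mapping works.
CLASS_LABELS = {
    "cover": 0, "F5": 1, "OutGuess": 2,
    "Steghide": 3, "JPhide": 4, "Jsteg": 5,
}


def build_feature_vector(wavelet_features, size_ratio):
    """Append the storage size ratio to the 108 wavelet features."""
    if len(wavelet_features) != 108:
        raise ValueError("expected 108 wavelet features")
    return [float(v) for v in wavelet_features] + [float(size_ratio)]


def to_libsvm_line(class_name, features):
    """Format one sample as a LIBSVM text line with 1-based feature indices."""
    label = CLASS_LABELS[class_name]
    pairs = " ".join(f"{i}:{v:.6g}" for i, v in enumerate(features, start=1))
    return f"{label} {pairs}"


# Placeholder wavelet features (all zero) and an invented size ratio.
vector = build_feature_vector([0.0] * 108, 0.98)
line = to_libsvm_line("Steghide", vector)
```

Lines produced this way for all training images would form the training file; at detection time the same vector layout is used for the image under test.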

IV. PERFORMANCE ANALYSIS AND EVALUATION

Based on the multi-class SVM classifier generated in Section III, we classify these 3000 * 6 images. All of the stego images have been embedded with only 100 B of information. The experimental results are in table 4:

TABLE IV. DETECTION RATE OF OUR METHOD
Embed     Embed       Classified as (%)
size (B)  algorithm   cover   Jsteg   Steghide  JPhide  OutGuess  F5
-         cover       100     0.00    0.00      0.00    0.00      0.00
200       Jsteg       0.00    99.02   0.00      0.00    0.00      0.98
200       Steghide    0.00    0.00    97.89     0.00    1.19      0.92
200       JPhide      0.00    0.00    0.00      100     0.00      0.00
200       OutGuess    0.00    0.00    0.87      0.00    98.24     0.89
200       F5          0.00    0.92    0.93      0.00    0.03      98.12
100       Jsteg       0.00    98.63   0.00      0.00    0.00      1.37
100       Steghide    0.00    0.00    98.06     0.00    0.97      0.97
100       JPhide      0.00    0.00    0.00      100     0.00      0.00
100       OutGuess    0.00    0.00    0.92      0.00    98.70     0.38
100       F5          0.00    0.95    0.96      0.00    0.02      98.07

state of the art in detecting the modern steganographic methods Steghide, JPhide, Jsteg, F5 and OutGuess. In the future, we plan to further improve the detection reliability of the proposed steganalytic algorithm. First, to expand the application range of the algorithm, we want it to perform well on other steganographic methods: MB1, MB2, SSIS and H4PGP. Second, we want to simplify the algorithm further and make it run more efficiently.

REFERENCES
[1] T. Pevny and J. Fridrich, Multiclass Detector of Current Steganographic Methods for JPEG Format, IEEE Trans. Inf. Forensics Security, vol. 3, no. 4, pp. 635-650, Dec. 2008.
[2] T. Pevny and J. Fridrich, Detection of double-compression for applications in steganography, IEEE Trans. Inf. Forensics Security, vol. 3, no. 2, pp. 247-258, June 2008.
[3] A. Westfeld, F5 [Online]. Available: wwww.rn.inf.tudresden.de/westfeld/f5
[4] N. Provos, Outguess [Online]. Available: http://www.outguess.org
[5] S. Hetzl, Steghide [Online]. Available: http://steghide.sourceforge.net/
[6] A. Latham, JPHIDE and JPSEEK [Online]. Available: http://linux01.gwdg.de/~alatham/stego.html/
[7] D. Upham, Jsteg [Online]. Available: ftp://ftp.funet.fi/pub/crypt/steganography/
[8] Dong Luo, Yue Xu and Li Ya Chen, Steganographic Detection Based on the Second Encryption JPhide, Computer Engineering, July 2007.
[9] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001.
[10] H. Farid and S. Lyu, Detecting Hidden Messages Using Higher-Order Statistics and Support Vector Machines, in F.A.P. Petitcolas (ed.): 5th International Workshop on Information Hiding, LNCS vol. 2578, Springer-Verlag, Berlin-Heidelberg, New York, pp. 340-354, 2002.
[11] C.-W. Hsu and C.-J. Lin, A Comparison of methods for multi-class support vector machines, IEEE Trans. Neural Networks, vol. 13, no. 2, pp. 415-425, Mar. 2002.
[12] [Online]. www.freefoto.com
[13] C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines, 2001. Available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm

The hidden information with which the proposed algorithm re-encrypts images is very small (only 100 B); the multi-class detection rate can exceed 98%, and the detection results remain almost constant, without dramatic change as the amount of hidden information is reduced. Under the same experimental conditions, the performance of the algorithm in [1] gradually decreases as the amount of information is reduced; the specific data are in table 5:
TABLE V. DETECTION RATE OF FRIDRICH'S METHOD
Embed     Embed       Classified as (%)
size (B)  algorithm   cover   Steghide  JPhide  OutGuess  F5
-         cover       99.05   0.00      0.02    0.19      0.74
1000      Steghide    0.07    94.65     0.00    4.73      0.59
1000      JPhide      0.68    0.00      96.37   1.62      1.33
1000      OutGuess    0.08    0.03      0.00    94.34     5.55
1000      F5          0.12    0.00      0.00    1.72      98.16
500       Steghide    0.13    92.83     0.07    1.25      5.72
500       JPhide      1.41    0.00      90.96   3.26      4.37
500       OutGuess    0.29    0.18      0.00    91.27     8.26
500       F5          0.77    0.00      0.00    3.65      95.58
200       Steghide    1.01    91.13     0.18    2.01      6.67
200       JPhide      1.54    0.00      87.52   4.83      6.13
200       OutGuess    0.51    0.37      0.00    89.89     10.23
200       F5          1.51    0.00      0.00    3.68      94.81
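Tables 4 and 5 are confusion matrices whose rows are normalized to percentages. A minimal sketch of that computation is given below; the function name and the toy labels are hypothetical, and predictions outside the listed classes are simply ignored in this version.

```python
from collections import Counter


def confusion_percentages(true_labels, predicted_labels, classes):
    """Return {actual: {predicted: percent}} with each row summing to 100."""
    counts = Counter(zip(true_labels, predicted_labels))
    table = {}
    for actual in classes:
        total = sum(counts[(actual, predicted)] for predicted in classes)
        if total == 0:
            continue  # no samples of this class; leave the row out
        table[actual] = {
            predicted: 100.0 * counts[(actual, predicted)] / total
            for predicted in classes
        }
    return table


# Toy data: one of four Steghide images is mistaken for F5.
classes = ["cover", "Steghide", "F5"]
truth = ["cover"] * 2 + ["Steghide"] * 4 + ["F5"] * 2
guess = ["cover", "cover", "Steghide", "Steghide", "Steghide", "F5", "F5", "F5"]
matrix = confusion_percentages(truth, guess, classes)
```

Each row of `matrix` corresponds to one "Embed algorithm" row of the tables, and each inner key to one "Classified as" column.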

Comparing tables 4 and 5, the proposed algorithm has the following advantages:
1) More effective and very stable detection. The detection rate does not decrease as the embedding rate is reduced. Even if the embedded information is only 100 B (a few dozen characters), we still obtain a rather high detection rate.
2) More efficient to run. The framework of the algorithm in [1] is too complex and its process too cumbersome; its feature vector has too many dimensions (274), so it runs slowly and inefficiently. The algorithm presented in this paper is simple, its feature dimension is small (only 109), its computing time is short, and it is more efficient.
3) Wider range of application. The algorithm presented in this paper detects Jsteg stego images with excellent performance, while the algorithm in [1] cannot detect Jsteg with an acceptable detection rate.

V. CONCLUSION

We have proposed an effective steganalysis multi-class classification algorithm in this paper, which outperforms the
