Abstract

A novel tamper-proof model for web pages based on virus-based watermarking is proposed in this paper. The model judges web page tampering securely and accurately. The classification theory based on the virus idea is applied when the watermark is embedded and extracted. The proposed scheme applies to all kinds of HTML or XML files, and to all characters, not only English letters. More importantly, the original file can be restored completely when the watermark is extracted. Therefore the proposed scheme, combined with the 3rd generation tamper-proof technology for web pages, exhibits good real-time performance and security. Experimental results show that it outperforms existing tamper-proof schemes in that it neither increases the file size nor spends as much computing time as cryptography does.

... of time polling and cryptography falls far short of what we need. A primary drawback of the Hash algorithm [1] for website tamper-proofing is that it requires extra storage and an extra channel to transmit the Hash value. Recently developed watermarking techniques provide alternatives for the integrity protection of digital documents [2]. Katzenbeisser et al. [3] proposed a watermark-based method that adds spaces and tags to the source code of web pages. However, it expands the file size. The watermark scheme based on PCA [4] consumes a great deal of computation, though it does not expand the file size. The information hiding technology based on web page tags [5] inserts the information at predicted positions, and the tags may be executed by the browser.

This paper provides a novel watermark scheme that can be combined with the 3rd generation technique. It overcomes these shortcomings and exhibits good real-time performance and security.
Authorized licensed use limited to: University of South Australia. Downloaded on March 24,2010 at 23:25:02 EDT from IEEE Xplore. Restrictions apply.
... in an event-triggered way. The watermark information is extracted from all the files in the folder, which have been sorted by a fast algorithm, and is compared in time with the information embedded in advance. If they do not match, the corresponding backup file content is copied to the location of the tampered file. The copy is carried out as pure text without a protocol, so it is highly secure. Besides, the process lasts only milliseconds, so the running performance and real-time detection reach a relatively high standard. When a user wants to browse a web page, the request is sent to the web server. Once the server responds, it calls the program to extract the watermark from the corresponding file. The file is then restored to the original one, which is sent to the user. Please see Fig.1.

3. Watermark embedded and tamper detection

3.1. Watermark embedded and extracted

Fig.2 and Fig.3 show the methods of watermark embedding and extraction respectively.

3.2. Watermark embedded algorithm

We select a character that occurs with high frequency as the key. The key acts as the dividing point, and the text, such as HTML or XML, is segmented into many sections. We call each section an element. All the elements are divided into 32 classes by Hash classification, so one class may contain several elements. Meanwhile, a 32-item random sequence of ASCII values is produced from a seed, where each ASCII value ranges from 00 to 31.

The 32 pieces of information to be embedded correspond to the 32-item sequence in a certain way. Then some spaces in each class are replaced by the ASCII value according to the statistical characteristics of the spaces. A relation table containing the class, the embedded character and the significant letters is created for extracting the significant information.

The key issue is where the embedding locations should be defined. In this paper, the location is defined according to the statistics of “<” and “>” in each aggregate. If the aggregate does not contain any “<” or “>”, all the spaces are replaced by the corresponding ASCII value. Otherwise, the spaces in front of “<” and behind “>” are replaced by the corresponding ASCII value.
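The embedding steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the key is taken as the most frequent character other than the space, "<" and ">"; the Hash classification is replaced by the element length modulo 32; the seeded sequence is a permutation of the codes 0 to 31; "in front of" and "behind" are read as immediately adjacent; and the cover text is assumed to contain no characters below code 32. The function names (pick_key, marker_table, embed, restore, is_intact) are hypothetical.

```python
import random
from collections import Counter

NUM_CLASSES = 32  # the text divides the elements into 32 classes


def pick_key(text):
    """Most frequent character other than space, '<' and '>' (assumed rule)."""
    counts = Counter(ch for ch in text if ch not in " <>")
    return counts.most_common(1)[0][0]


def marker_table(seed):
    """Seeded permutation of codes 0..31: one control character per class."""
    codes = list(range(NUM_CLASSES))
    random.Random(seed).shuffle(codes)
    return [chr(c) for c in codes]


def embed_element(element, marker):
    """Replace the spaces selected by the '<' / '>' rule with the marker."""
    if "<" not in element and ">" not in element:
        return element.replace(" ", marker)  # no brackets: every space
    chars = list(element)
    for i, ch in enumerate(chars):
        if ch != " ":
            continue
        before_open = i + 1 < len(chars) and chars[i + 1] == "<"
        after_close = i > 0 and chars[i - 1] == ">"
        if before_open or after_close:
            chars[i] = marker
    return "".join(chars)


def embed(cover, key, seed):
    """Split at the key, classify each element, replace the chosen spaces."""
    if any(ord(ch) < 32 for ch in cover):
        raise ValueError("sketch assumes a cover text without control characters")
    table = marker_table(seed)
    # Assumed stand-in for the Hash classification: element length mod 32.
    # Replacing a space by one character keeps the length, so the class can
    # be recomputed from the watermarked file alone.
    marked = [embed_element(e, table[len(e) % NUM_CLASSES]) for e in cover.split(key)]
    return key.join(marked)


def restore(marked):
    """Turn every control character back into a space."""
    return "".join(" " if ord(ch) < 32 else ch for ch in marked)


def is_intact(marked, key, seed):
    """Blind check: re-embedding the restored text must reproduce the file."""
    return embed(restore(marked), key, seed) == marked


cover = "<html><body><p>Hello web page</p> <p>tamper proof</p></body></html>"
key = pick_key(cover)
marked = embed(cover, key, seed=2007)
tampered = marked.replace("tamper", "tamper<td></td>", 1)

print(len(marked) == len(cover), restore(marked) == cover)
print(is_intact(marked, key, 2007), is_intact(tampered, key, 2007))
```

Because each replaced space is swapped for exactly one invisible character, the file size is unchanged and the cover text is recovered by the reverse substitution. In this simplified form only elements that carry at least one replaced space contribute to detection, so an edit confined to an element without such a space would go unnoticed.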
3.3. Tamper detection
... the original file is running on the server and sent to the user.

4. Experiments and analysis

In the experiment, we choose a simple HTML file as the original file; its source code is also called the cover file. Before the information is embedded, the cover file is read as a .txt file. The experiment and analysis are as follows.

Fig.4 Original cover file

In the experiment, the detection program is simulated. The program reads the source code of the web page shown in Fig.4. The secure file is obtained once the watermark information is embedded. It is this file that is saved in the path on the server and becomes the target of attack by hackers. Fig.5 presents the watermark-embedded file. We can see that there is no distinct difference between the original file and the watermark-embedded file. Because the embedded information consists of invisible characters whose ASCII values are less than 32, the editor cannot recognize them.

Fig.6 Watermark

Fig.6 shows the complete watermark information extracted with good effect, which consists of meaningful phrase information. It not only asserts copyright but, more importantly, is used to detect by watermark matching whether the web page file has been tampered with.

Fig.7 Watermark tampered

Fig.7 shows the watermark information extracted from a file tampered with by adding the tags <td> and </td>. Three information bits have changed, so the watermarks do not match, and a hint is given that the file has been tampered with. The watermark information is related to the length of each item in the aggregate, so the watermark differs if code is added or deleted.

5. Conclusions

The paper provides a new watermark scheme applied to the tamper-proofing of web pages. It has the following good properties.

(1) It requires less computation than the cryptography that is always used in the second generation technique [7]. It can also judge accurately whether the file has been tampered with, which cryptography does not do. Table 1 presents the running time of both algorithms.

(2) The size of the embedded files does not expand; it stays the same as before although the watermark is embedded.

(3) The scheme, combined with the 3rd generation technique, provides good security for the website. The user cannot view the tampered web page because the restored file is not sent to the user when the extracted watermark does not match the embedded watermark. Besides, the program can check the secure file in time and, once tampering is detected, copy the file from the bottom layer over the tampered file. Because information travels across the Internet so fast, the requirement on security is high. By contrast, without the 3rd generation technique, the program must match the information against the embedded one when the server responds to the user, which adds time consumption and slows down the browsing of the web page.
(4) It accomplishes blind detection, which decreases the overhead and resource consumption on the OS.

One point to mention is that the new scheme uses a fragile watermark, which gives tamper detection good robustness and security. However, it demands a server with good performance and high speed. In addition, the database linked to the file should be copied and modified in time.

6. Acknowledgements

This work was supported by the Natural Science Foundation of Hubei (China) under Grant No.2007ABA119.

7. References

[1] W. Stallings. Cryptography and network security: principles and practice. Prentice-Hall, Englewood Cliffs, NJ, 1999.

[2] Guorui Feng, Lingge Jiang and Chen He. Orthogonal transformation to enhance the security of the still image watermarking system. IEICE Trans. Fundamentals, E87-A(4): 949-951, 2004.

[3] S. Katzenbeisser, A.P. Petitcolas. Information hiding techniques for steganography and digital watermarking. Boston, Artech House, 2000.

[4] Qijun Zhao and Hongtao Lu. A PCA-based watermarking scheme for tamper-proof of web pages. Pattern Recognition, 38: 1321-1323, 2005.

[5] Changzheng Wang and Jianhui Liu. Research and implementation of the information hiding technology based on web page tags. 2007.

[6] Haiyan Zhou, Fengsong Hu and Can Chen. English text digital watermarking algorithm based on idea of virus. Computer Engineering and Applications, 43(7): 78-80, 2007.

[7] Liu Gu. Research and implementation of information hiding based on web page. Microcomputer Information, 22: 186-187, 2006.
[Flowchart labels: User, Request, Respond, Watermark extracted, Tamper detection, The file of information embedded, Copy, Backup Files, N, End]