Prepared By: Sirajuddin bin Ab Aziz
Prepared For: Miss Sri Intan binti Shahrul Asaari
Introduction
- Web archiving is the process of collecting portions of the World Wide Web and preserving them in an archive for future use by researchers, historians, and the public.
- Web archivists collect the material with web crawlers, in forms such as:
  - HTML web pages
  - Style sheets
  - JavaScript
  - Images
  - Video
Current Issue
- Copyright issues
  - As Peter Lyman has stated, although the Web is popularly regarded as a public domain resource, it is copyrighted; thus, archivists have no right to copy the Web.
- Management issues
  - How should the Web content be managed?
  - Has the content been obtained legally?
  - Policy
Challenges
- Web crawler limitations:
  - A large portion of a website may be hidden in the Deep Web.
  - A crawler trap may cause the crawler to download an infinite number of pages.
- The constant change and practically unbounded size of the Web.
- Crawling can consume a lot of bandwidth if it is not managed carefully.
- Virus attacks.
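
To make the crawler-trap problem concrete, here is a minimal Python sketch of how a crawler might flag suspicious URLs before downloading them. The function name `looks_like_trap` and the three thresholds are assumptions chosen for illustration; they are not taken from any particular crawler, and real crawlers use more elaborate rules.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed thresholds, for illustration only; tune them per site.
MAX_URL_LENGTH = 200
MAX_PATH_DEPTH = 10
MAX_SEGMENT_REPEATS = 2


def looks_like_trap(url: str) -> bool:
    """Return True if the URL matches simple crawler-trap heuristics."""
    # Very long URLs are often generated endlessly by scripts.
    if len(url) > MAX_URL_LENGTH:
        return True

    # Split the path into its non-empty segments.
    segments = [s for s in urlparse(url).path.split("/") if s]

    # Very deep paths suggest an endlessly nested directory structure.
    if len(segments) > MAX_PATH_DEPTH:
        return True

    # The same segment repeated many times (e.g. /next/next/next)
    # suggests a loop that keeps producing new URLs.
    counts = Counter(segments)
    return any(n > MAX_SEGMENT_REPEATS for n in counts.values())
```

A crawler could call `looks_like_trap(url)` on every newly discovered link and skip the ones that return True. Heuristics like these can also reject legitimate pages, so the thresholds are a trade-off rather than a fixed rule.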
Solution?
- Obtain web material through a legal deposit act, whereby a person or group must submit a copy of their publications to a repository.
- Configure the web crawler to limit the pages it can crawl.
- Provide a better standard for organizing web content.
- Create a backup of all web content.
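
Limiting the pages a crawler can visit can be sketched as a breadth-first crawl with a page cap and a depth cap. The Python example below is only an illustration under stated assumptions: `bounded_crawl`, `get_links`, `max_pages`, and `max_depth` are hypothetical names, and the `SITE` dictionary stands in for a real website so that the example runs without network access.

```python
from collections import deque


def bounded_crawl(start, get_links, max_pages=100, max_depth=3):
    """Visit pages breadth-first, stopping at max_pages or max_depth."""
    seen = {start}                 # URLs already queued, to avoid revisits
    queue = deque([(start, 0)])    # (url, depth from the start page)
    visited = []

    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)        # a real crawler would download here

        # Do not follow links beyond the depth limit.
        if depth >= max_depth:
            continue

        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))

    return visited


# Hypothetical in-memory site used in place of real HTTP requests.
SITE = {
    "/": ["/about", "/news"],
    "/about": ["/"],
    "/news": ["/news/1"],
    "/news/1": ["/news/2"],
    "/news/2": ["/news/3"],
}


def site_links(url):
    """Return the outgoing links of a page in the hypothetical site."""
    return SITE.get(url, [])
```

For example, `bounded_crawl("/", site_links, max_pages=10, max_depth=2)` stops following the `/news/...` chain after two levels, while `max_pages=2` stops after two pages regardless of depth. The two caps together keep the crawl finite even when a site generates links endlessly.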
Conclusion
- Just as the present era takes great care of its archived materials, the same care will be expected in the future, when Web content takes its place alongside other archives. Web archiving is important for protecting corporate heritage and for regulatory and legal purposes.