
Archive of Anime Fan Fiction and Fan Art: A Proposal

Cynthia Smith
July 9, 2007
IRLS 540

Introduction
One cultural phenomenon that has been underground in the United States for
decades and has only within the past decade or so become acknowledged is Japanese
Animation, commonly known as Anime. Fans have been creating stories and art based on
Anime (known as fan fiction and fan art) for decades as well, but with the
introduction of the internet it has become easier than ever for them to share their works
with each other. Many of these works are created digitally and are available only digitally,
and they are arguably a part of our cultural heritage from the late 1990s to now. Yet no
one seems to be concerned with preserving these items of possible cultural interest.
Therefore, I propose the creation of a digital archive whose task will be to preserve these
cultural artifacts for posterity. The purpose of this paper is to look at some of the
issues such a project would encounter, what it would take to implement, and some of
the different options that are available. Specifically, it examines selection, copyright,
cataloging, access, and preservation itself.
Selection
There are several issues arising from selecting fan fiction and fan art to preserve.
One is finding these works in the first place. A web crawler could be created to search
specifically for pages tagged or titled as fan fiction or fan art. This could result, however,
in some duplication and some misses. Naturally, human monitoring would be necessary as
well, both to keep track of what the web crawler is not finding and to identify ways either
to improve the web crawler or to add points of human intervention. Even with human
searching, however, some items are still bound to be missed. The digital archive could also
spread word of its existence and outline a process by which creators of fan fiction/fan art
could submit their works directly if they so choose. Part of the problem of archiving fan
fiction in particular is that authors of long works often post chapters in
installments, meaning that a web crawler might need several passes to capture all
of a work, the first part of which may already be archived by the time the final
installment is submitted. Or, with the ease of posting to the internet these days, a work
may have been submitted to several different sites, leading the web crawler to record
multiple copies of it. Thus, while web crawlers may be able to take much of the
initial burden off humans, close human monitoring will be necessary.
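The duplication and installment problems described above can be illustrated with a minimal Python sketch. It assumes the crawler already supplies an author, title, chapter number, and text for each page; the class and function names (Archive, ArchivedWork, fingerprint, ingest) and the example URLs are hypothetical and chosen only for illustration. Copies are detected by hashing normalized text, which catches only near-identical copies and is no substitute for the human monitoring recommended here.

```python
import hashlib
import re
from dataclasses import dataclass, field


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and case, so trivially
    different copies of the same chapter produce the same hash."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class ArchivedWork:
    title: str
    author: str
    chapters: dict = field(default_factory=dict)  # chapter number -> text
    sources: set = field(default_factory=set)     # URLs where copies were seen


class Archive:
    def __init__(self):
        self.works = {}  # (author, title) in lower case -> ArchivedWork
        self.seen = {}   # fingerprint -> key of the work that owns that text

    def ingest(self, url, author, title, chapter_no, text):
        """Record one crawled page; return 'new', 'duplicate', or 'conflict'."""
        digest = fingerprint(text)
        if digest in self.seen:
            # Same text found again (for example on a second site):
            # note the extra source but keep only one copy.
            self.works[self.seen[digest]].sources.add(url)
            return "duplicate"
        key = (author.strip().lower(), title.strip().lower())
        work = self.works.setdefault(key, ArchivedWork(title, author))
        work.sources.add(url)
        if chapter_no in work.chapters:
            # Same chapter slot but different text: leave it for human review.
            return "conflict"
        work.chapters[chapter_no] = text
        self.seen[digest] = key
        return "new"


archive = Archive()
print(archive.ingest("http://site-a.example/1", "Author A", "Story", 1, "Once upon a time."))
print(archive.ingest("http://site-b.example/x", "author a", "STORY", 1, "Once  upon a\ntime."))
print(archive.ingest("http://site-a.example/2", "Author A", "Story", 2, "The next day."))
```

Later installments are added to the same work on later crawler passes, and items returned as "conflict" mark the points where a person would need to step in.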
Another issue is the fact that the quality of these works ranges from near
professional to so poor that one wonders why the person bothered to post the work on the
internet in the first place. Yet the time it would take to sort the good
from the bad would probably be more than the archive could afford to spend. Webmasters
of fan fiction sites are of little help, since they rarely have many criteria for
quality beyond basic spelling and grammar checking. There is also the consideration that
many of these creators are young and do well for their level, which may be low in
the beginning but may rise over time. For these reasons the archive should not screen every
item for quality, though there should be a review process for items that are
obviously of questionable value. This process should have clearly
defined steps with clearly defined criteria having nothing to do with the content of the
piece in question.
There is one criterion as far as content is concerned that must be absolute in the
archive: Any work under consideration by the archive must be related to Anime in some
way. This rule may be reconsidered in the future to allow other fandoms, such as
Star Wars or Harry Potter, to be included, but as a starting point Anime will be the
focus.
There are possible legal problems with content, considering that many potential
visitors to this digital archive may be minors, but these could be effectively handled by a
combination of cataloging, detailed below under Cataloging, and
sectioning off a part of the site for use only by those over eighteen, detailed
below under Access.
Copyright
Most sites that host fan fiction/fan art have copyright statements to the effect that
the works are the property of their creators and that anyone who wishes to use them in any
way (other than reading on the hosting site) must ask the creator for permission. Legally,
it would probably be best to get permission to archive from each individual, though this
becomes logistically problematic. The archive could simplify some of this by
gaining permission to preserve and display not only all current works the author has created
but also all future works, unless specifically prohibited by the individual. Initiatives could
also be begun with prominent fan fiction/fan art sites under which anyone who submits work
would also be asked for permission to have that work archived and displayed in the
digital library. Anyone who directly petitions the digital archive to preserve their work
will automatically be asked for the rights of archiving and display.
In some exceptional cases, an author could possibly request specific restrictions
on their works. A possible example would be if a person wrote fan fiction during their
adolescence and later became a prominent author and wanted their earlier work placed
under restriction for privacy or other reasons. The digital archive would then do its best to
negotiate for the widest possible access, but still respect the wishes and rights of the
creator. In all cases, copyright should be tracked strictly. To assist with this, it would
be advisable to include any copyright information as part of the metadata.
The archive must also have a clear policy about works where the
creator/copyright holder is unknown and also a procedure to follow once an author
becomes known.
Cataloging
The general framework that would be ideal for cataloging this archive would be
the Dublin Core in XML (Stielow 2003, 113). Other alternatives such as MARC are
available, but Dublin Core has the advantage of being fairly comprehensive in metadata
without offering more than is probably needed. XML is the up-and-coming
language of the web, and has the advantage of being able to handle not only content and
format, as HTML does, but also multimedia. This will give the archive the
greatest current possibility of future expansion. Several physical copies of the archive
catalog will be made and distributed to key personnel and locations to be kept off site
in case of disaster.
Cataloging for this digital archive specifically has several different components to
be considered. First, these items need to be searchable by the Anime that is depicted in
the work. It may also be worthwhile to divide by genre, such as comedy, fantasy,
adventure, etc., for further searchability, realizing that this will create more work for the
cataloger. Considering that many possible visitors may be minors, it may also be
worthwhile to note which works are particularly graphic in their description of sex or
violence, etc. Since this is all available online, and no record really needs to be kept of
who views what, this should eliminate the chilling effect while warning viewers of
content they may not be comfortable with. Again, this would create more work for
catalogers. Some websites that host fan fiction and fan art already have such genre
categories and warnings, which could be used as a basis for such descriptions, helping to
eliminate part of the work of the cataloger. Again, the procedures for this process would
need to be explicitly laid out, along with a process for reconsideration.
Some works and some websites specifically cater to shonen ai (romanticized love
between boys) and yaoi (shonen ai that is explicitly sexual). They also have their female
counterparts (Poitras 2007). All of this content should at least be clearly labeled as such,
as it usually is on fan sites, so that those who are uncomfortable with such stories may avoid
them. Further issues regarding yaoi and like works are discussed under Access.
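To make the cataloging components above concrete, the following Python sketch builds one Dublin Core record in XML using the standard library. The mapping is an assumption made for illustration, not an established profile: the Anime and the genres go in dc:subject, content warnings in dc:description, and rights information in dc:rights. The function name dublin_core_record, the wrapper element name "record", and all sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the fifteen-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(title, creator, anime, genres, warnings, rights,
                       media_type="text/html", dcmi_type="Text"):
    """Build one catalog record; repeated elements are allowed in Dublin Core."""
    record = ET.Element("record")  # wrapper element name is an assumption

    def add(name, value):
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value

    add("title", title)
    add("creator", creator)
    add("type", dcmi_type)       # e.g. "Text" for fan fiction, "StillImage" for fan art
    add("format", media_type)
    add("subject", anime)        # the Anime the work is based on
    for genre in genres:
        add("subject", genre)    # genre terms, for further searchability
    for warning in warnings:
        add("description", f"Content warning: {warning}")
    add("rights", rights)        # copyright information kept with the metadata
    return record


example = dublin_core_record(
    title="Example Story",
    creator="Example Author",
    anime="Example Anime Title",
    genres=["comedy", "adventure"],
    warnings=["graphic violence"],
    rights="Copyright held by the author; archived and displayed with permission",
)
print(ET.tostring(example, encoding="unicode"))
```

Keeping warnings and rights in the same record as the descriptive fields means that a single lookup can support searching, content labeling, and copyright tracking.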
Access
There are two problems of access in a digital archive. The first is the more general
one of providing as much access as possible while respecting copyright and privacy
rights, much of which has already been discussed above.
One problem specific to this archive is the fact that a large fraction of visitors will
probably be minors. According to US v. ALA, in which the Supreme Court ruled on the
Children's Internet Protection Act, children do not have the right to view obscenity or child
pornography (adults do not have the right to view these either), or anything harmful to
minors (US v. ALA, 2003). Because of the nature of some of the fan fiction/fan art
(shonen ai and yaoi especially), some could possibly be considered harmful to minors.
In fact, fan fiction sites that cater to such audiences often have warnings at the entrances
to the sections that house such stories. Some even prohibit anyone under
eighteen from entering sections of the site that host such works. Certainly, the archive is
not in a position to be a law enforcement officer. On the other hand, the archive could be
found legally negligent if it ignores this issue, especially considering that a large fraction
of its probable clientele will be minors. First, a clear definition of what is harmful to
minors needs to be established, with a procedure to be followed in making
determinations. Anything that is already restricted by age by a fan fiction site probably
should continue to be restricted. This is in the interest of streamlining the process. Also,
certain categories such as yaoi may automatically be restricted. Finally, there must be a
process in place whereby restricted items may be reconsidered for free access, and
nonrestricted items reconsidered for restriction, at a patron's request.
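The restriction rules just described can be written out as a short Python sketch. It assumes each item records whether its source site restricted it by age, which categories it was cataloged under, and any decision reached on reconsideration. The names Item, is_restricted, and may_view, and the contents of AUTO_RESTRICTED_CATEGORIES beyond yaoi, are hypothetical; the actual definition of what is harmful to minors would have to come from the archive's written policy and legal advice.

```python
from dataclasses import dataclass, field
from typing import Optional

# Categories restricted automatically; the real list is a policy decision.
AUTO_RESTRICTED_CATEGORIES = {"yaoi"}


@dataclass
class Item:
    title: str
    categories: set = field(default_factory=set)
    restricted_at_source: bool = False
    # Result of a reconsideration, if any: True = restrict, False = open.
    review_decision: Optional[bool] = None


def is_restricted(item: Item) -> bool:
    """Apply the rules in order: a reconsideration decision overrides the
    defaults; otherwise source-site restrictions and automatic categories apply."""
    if item.review_decision is not None:
        return item.review_decision
    if item.restricted_at_source:
        return True
    return bool(item.categories & AUTO_RESTRICTED_CATEGORIES)


def may_view(item: Item, viewer_is_adult: bool) -> bool:
    """Adults may view everything; minors may view only unrestricted items."""
    return viewer_is_adult or not is_restricted(item)


story = Item("Example Story", categories={"yaoi"})
print(may_view(story, viewer_is_adult=False))
print(may_view(story, viewer_is_adult=True))
```

Because a reconsideration decision is stored separately from the default rules, an item can be moved in either direction at a patron's request without altering its catalog record.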
Preservation
For the preservation of fan fiction and fan art, there are several format options
available: create a paper copy, convert to TXT/JPEG format, convert to PDF format,
convert to HTML, or convert to XML format. There is also the issue of storage medium:
paper, CD/DVD, or server. Let's take a look at the issues with each of these options
(Gladney 2007, 139-161, 235-249).
The good news about paper is that it lasts. It is an excellent storage and format
medium for this reason. A situation like this, where the object is not originally on paper, is
actually ideal, since acid-free paper could be used from the beginning. Also, no
intervening equipment would be required once it is printed. The bad news about paper is
that the cost in terms of both time and resources may be rather high. Also, of all the
options, paper probably takes the most space to store. Since these objects are primarily
born digital, there is no guarantee that the information will be transferred perfectly to
paper. It would also mean that if the paper copy became the only copy, people would
have to travel to the archive itself to view the information. Still, with a project like this
where the information is mostly static, paper is a viable option that would preserve these
artifacts beyond anything digital. It would, however, limit the archive from expanding
its activities beyond preserving static artifacts.
The benefit of converting all text files to TXT format and all picture files to JPEG
format is that these are non-proprietary, openly documented formats that tend to be the
lowest common denominator for their kinds of content. Every computer today, no matter how
advanced, can still read TXT based on ASCII, and JPEG is similar for still pictures. The
problem with TXT is that it saves only information, not formatting. Unless the item in
question was originally written as a TXT file, information will be lost. In both these cases
as well, external equipment is needed. These file formats would also limit future projects
to static words and pictures.
Adobe Acrobat's PDF file format has the advantage of basically taking a snapshot
of the content and saving it, so little if any formatting or content is lost. Unfortunately,
the software to read these files is proprietary, adding another level of complication to the
needed equipment. Also, PDF does not necessarily capture everything from the original
document. PDF also works primarily with still images, again limiting future
projects.
The one thing that HTML does well, and part of the reason it was created, is to
include format along with content. It is also a non-proprietary, open standard. The one
issue with HTML is that it does not handle multimedia well. Given the nature of this
collection, however, this may not be an issue. Also, since many of these documents are
likely already in HTML format, keeping them in HTML would mean no loss of data from
transferring to a different file format. Since this format does not handle multimedia well,
however, it could again limit future projects.
XML is one of the latest and most promising formats, preserving not only content
but also formatting information, and unlike HTML it can handle multimedia. This means
that in the future the archive could easily expand to begin archiving Anime Music Videos.
It also has the benefit of being non-proprietary. Probably the biggest risk is the fact that it
has had the shortest life thus far, and while its future seems bright today there is no
guarantee that this will in fact be the case.
In all cases, the file format would be uncompressed to help preserve maximum
content from the artifact, keeping in mind that in the future it is unknown what
information from the artifact may be considered important or relevant.
The problem with transferring data to CD/DVD is that it is unknown exactly
how long these discs will last with their information uncorrupted. There is also no
guarantee of how long CD/DVD readers will remain in common use and therefore how long the
equipment needed to play them will be available.
With servers, there are basically two options: migration and emulation. Migration
means continuously moving the information into formats and onto media that
remain readable as technology changes. Emulation means making one computer system
act like another. So the computer of the future could be made to act exactly like a
computer that is used today, meaning it would then be capable of reading the formats of
today, assuming that the format and the software to read it are known. This would
also mean that the information could be seen exactly as it is seen today, as long
as all the specifications are known. Fortunately, some of this information, such as file
format, should already be included in the metadata. Unfortunately, that still
leaves some extra metadata work to be done. Another concern is that emulation is
the newer and therefore less proven method.
In this case, a tiered approach would be best. Testing should be done between
HTML and XML. Since most of the material is likely to be in HTML, staying with this
format would save time. However, XML would allow for greater flexibility in future
projects for the archive. Subject to testing, HTML is the recommended format for initial
use since most items are already in HTML. A secondary copy of the archive's holdings
could be made in XML as the archive has time and resources. One consideration here is
the ease of converting HTML to XML. As for storage medium, servers would be
preferred, migrating from server type to server type as technology progresses.
However, emulation should be reconsidered and implemented once it is proved to be
feasible, cost-effective, and reliable. Also, it is recommended that a duplicate server be
stored and maintained offsite in case of emergency. This duplicate server would house the
XML copy of the holdings. If the time comes when the archive decides to start archiving
multimedia works (Anime Music Videos), the XML copy would be brought to the front.
Technological conditions at that point in the future would determine whether the
secondary copy of the archive would be a straight XML copy or whether another file type
would be more suitable.
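As a rough indication of what converting HTML to XML involves, the Python sketch below turns loosely written HTML into well-formed XML using only the standard library. It is a sketch under stated assumptions, not a tested conversion tool: the class name HtmlToXml, the function html_to_xml, and the wrapper element "document" are hypothetical, comments and doctype declarations are dropped, and attribute names that are not legal in XML are not handled. Real fan fiction pages would need to be tested, as recommended above.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HtmlToXml(HTMLParser):
    """Build an XML tree from HTML, closing any tags the page left open."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("document")  # wrapper element (assumption)
        self.stack = [self.root]            # currently open elements

    def handle_starttag(self, tag, attrs):
        # Attributes without a value (e.g. "disabled") get their own name as value.
        attributes = {name: (value if value is not None else name)
                      for name, value in attrs}
        element = ET.SubElement(self.stack[-1], tag, attributes)
        if tag not in VOID_TAGS:
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, together with any
        # elements left open inside it; ignore stray closing tags.
        for depth in range(len(self.stack) - 1, 0, -1):
            if self.stack[depth].tag == tag:
                del self.stack[depth:]
                return

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):  # text that follows a child element is that child's tail
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_xml(html: str) -> str:
    parser = HtmlToXml()
    parser.feed(html)
    parser.close()
    return ET.tostring(parser.root, encoding="unicode")


# The <b> element is never closed in the input; the output closes it.
print(html_to_xml("<p>Hello <b>world<br>again</p>"))
```

A structural conversion of this kind keeps the text and the markup, but whether the result preserves everything the archive cares about is exactly what the recommended testing between HTML and XML would need to establish.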
My Recommendations
As far as selection is concerned, the main consideration will be whether the
artifact deals with Anime in any way, although there will also be a process whereby
artifacts may be brought back for reconsideration under more careful scrutiny. This is
because of the time it would take to scrutinize each artifact for further evaluation, and also
because of the difficulty in setting criteria for quality for such a wide range of submitters.
Copyright will be respected through meticulous record keeping and metadata
keeping track of rights, though during negotiations the archive will strive to procure
rights for the fullest access possible with no restrictions except those required by current
copyright law.
Cataloging will use the Dublin Core in XML for metadata. Special attention will
be given to searchability by Anime and to tagging indicating content especially
questionable for minors.
There will be two public levels of access: minors and adults. While the archive
wishes to promote free access within the realms of copyright and privacy, the archive will
be housing content that minors legally may not have a right to. Since minors will likely
make up a large fraction of visitors to the archive, this is being taken into account by
these two layers.
The archive will host its holdings on two servers. The copy on the onsite server
will be in HTML, this being the most likely format the original documents will be written
in. The secondary server will be off site and will hold the holdings in XML.
One final note: specialists should be made part of the process from the beginning.
Digital preservation experts should be consulted to help detect any preservation issues
that the archive has overlooked and to make sure the archive has considered all options.
Web designers are crucial to making the archive easy for users to navigate. Lawyers
should be consulted to make sure the archive is complying with all laws regarding
copyright and minors.
There are many issues to consider when creating a digital archive. These are some
of the primary issues that should be addressed when creating an archive to house Anime
fan fiction and fan art, along with recommendations on how to implement such an
archive in actuality.

Bibliography

Gladney, Henry M. (2007) Preserving Digital Information. Springer: New York.
Hunter, Gregory S. (2003) Developing and Maintaining Practical Archives, 2nd ed.
Neal-Schuman Publishers Inc.: New York.
Poitras, Gilles. (2007) Gilles' Service to Fans Page: Anime and Manga Terminology.
Accessed 7/9/2007 from http://www.koyagi.com/Terminology.html.
Stielow, Frederick. (2003) Building Digital Archives, Descriptions, and Displays.
Neal-Schuman Publishers Inc.: New York.
United States v. American Library Association, Inc., 539 U.S. 194 (2003).
