The Adaptation of Digital Preservation to Information Centers
When speaking with those outside of the information field about
digital preservation, the question I am most often asked is “Why can’t you just
save everything?” The issue may initially appear to be
that simple, but on further investigation one quickly realizes that it is a
far more complex topic with several competing concerns. By understanding past
conservation methods, current concerns, and future threats, great strides can be
made toward ensuring that long-term archiving is achieved.
To
better understand why digital preservation is needed, it is important to
understand the evolution of the process.
Approximately 500,000 clay tablets dating back to the Bronze Age are
essentially our earliest archives. While the tablets were inadvertently saved,
anyone who has handled a tablet can tell you that they are heavy, unwieldy, and
take up a great deal of space. Several millennia later, a better option came
along in the form of microfilm.
Before the digital revolution was conceived, microfilm was the practical choice because it is nearly static. No special retrieval system, new machinery, or technology is required to access the data, just light and a magnifying glass. With standardized practice and proper storage, it can be expected to last for at least five hundred years. While waiting to be digitized, or even in place of digitization, microfilm can be used to protect information that is vulnerable to loss or damage. The low cost of the process is also a major selling point. While microfilm has many pluses, it also has several minuses. Although the film can be read with a simple magnifier, a reader machine simplifies the work, and that machine needs to be maintained. Also, the user needs to physically visit the facility to access the information.
While microfilm is
still heavily used, the commercial availability of the CD-ROM in the
1980s brought a new option. With its
size and cost benefits, it was thought that the CD-ROM would replace
microfilm. An article written in
December 1999 by Mr. Johnson proclaimed: “Current scanning technology can place
an entire county census record on a single CD-ROM. No more fussing with faulty
microfilm readers, faded microfilms, and money hungry printers. Will this spell
the end of microfilm?” At its peak in the year 2000, the CD-ROM was thought
to be perfect because of its size and portability. As we
now know, the answer to Mr. Johnson’s question is a resounding no. The CD-ROM
format is no longer practical, and large information
institutions are now stuck with discs that are basically unusable. Some laptops, such as
the MacBook Air, no longer come equipped with a CD drive. Microfilm, long
considered the standard in preservation, is still durable and used by many, but
with the boom of the technology age, digital preservation is on the rise and
here to stay.
With digital
preservation, the question of what to save, known as the selection process, is
never ending. We are living in a digital age where instant gratification is the
norm. With the internet and high speed connections, a wealth of information is
available at our fingertips in the blink of an eye. The problem arises when we
realize that even with unlimited funding, it is not possible to save everything.
How much of what we have saved or wish to save will
be used? Much thought and testing are required to ensure a successful
process. Through said process, one will learn quickly that mass preservation
without testing on a smaller sample usually leads to large problems later.
According
to a study conducted by the Centre for Information Behaviour and the Evaluation of
Research (CIBER), an organization run by University College London (UCL),
libraries should also accept that much content will seldom or never be used,
other than perhaps a place from which to bounce. When selecting
information to be digitized, one has to consider the users of the materials as
well as the custodians who maintain and distribute said information. Having explicit requirements from both
perspectives will balance the demands and provide better planning. While items
that are frequently used are often given the most consideration, gray literature
(low-value materials that were never published commercially or offered in
bookstores, and that are unregistered and lack ISBNs) is usually the most endangered.
Most of what appears on the web falls under this category. More attention needs
to be paid to how end users actually use the content they browse. That
could lead to more insight into the selection process and to a better effort to
procure and protect files that will ultimately be used.
Anyone
who has the task of preserving items for a library or other information
facility quickly realizes that mass saving is not only impractical but nearly
impossible. As Hedstrom mentioned in her article, the purpose of preservation
is to protect information of enduring value for access by present and future
generations. Mechanisms that will enable
users to establish the origin, provenance, and authenticity of digital
documents require archives and libraries to preserve contextual and descriptive
information in addition to the content of digital documents. Provenance,
which was mentioned several times during our classes and conference in London,
helps to solidify the integrity of the digital object. Knowing the creator of
the object helps to ensure that the file has not been modified. With that being
the case, would orphan works need to be identified before they can be
preserved? This is an ongoing concern intertwined with copyright issues.
It is not possible to define all of the requirements related
to digital preservation. There are, however, a few key points that must be
addressed. The goal of digital preservation is to preserve the object
over the lifetime of the system. Storing the data
indefinitely without loss is known as reliability.
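One common way to check that a stored object has suffered no loss is a fixity check: a checksum is recorded when the object is ingested and recomputed later for comparison. The sketch below is only an illustration under stated assumptions; the choice of SHA-256 and the function names sha256_of and is_unchanged are mine and do not come from any particular preservation system.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """Compare the file's current digest with the one recorded at ingest.

    Both names are hypothetical helpers for this sketch.
    """
    return sha256_of(path) == recorded_digest
```

A digest that no longer matches shows that the file has changed since ingest; it does not say what changed or why, so a mismatch is a prompt to restore from another copy rather than a repair in itself.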
When we speak of preservation, it is important to remember
the end user. How will they use the file? One way to address this is to deal with threats
of obsolescence. The information needs to be preserved as the creators
intended, and specific software, and often specific hardware as well, needs to be used in
order to access the files. One can save in a simplified format; however, that
usually degrades the overall quality of the work.
The Institute of Electrical and Electronics Engineers (IEEE) defines interoperability as the ability of two or more systems or components to exchange and use information. With native digital information as well as digital video, software dependencies exist. Software preservation becomes a problem when one tries to emulate an older version or migrate software to run on a new platform. This brings us into digital formats. With the cost of digitizing information, one strong format should be chosen. Saving one object under many formats is not feasible due to space limitations and the expenses related to the process.
Storage cannot be ignored when speaking about preservation. Planning for scalability helps to ensure that as the technology evolves, there will be adequate space to allow for updates. If the collection is static, as is often the case with historical archives, the size will be fixed, since no new items will be added. Even if no new items are added, the components may need to be updated to keep pace with newer technology. This practice is known as supporting heterogeneity. While data updates can occur, they are uncommon because the objects that are preserved are meant to remain unchanged. The main changes are usually to add new objects and to update the formats. The type of storage is also important. Cheap storage tends to use a large amount of energy, is apt to fail, and can also use large amounts of water for maintaining a cool temperature.
Another benefit of digitizing works is that, with a proper Digital
Object Identifier (DOI) System and metadata, it is quite easy to locate the
information later. Two popular standards have been adopted in terms
of metadata: the Open Archival Information System (OAIS) Reference Model and
Preservation Metadata: Implementation Strategies (PREMIS).
Defined metadata standards help to support integrity, authenticity,
reliability, and archiving standards.
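To make the idea of preservation metadata concrete, the sketch below shows the kind of descriptive and contextual information that might travel with a digital object. It is an assumption-laden illustration, not the PREMIS schema: the class name PreservationRecord, its fields, and the to_json helper are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PreservationRecord:
    """Hypothetical, simplified record; not the actual PREMIS data model."""

    identifier: str   # persistent identifier, such as a DOI
    creator: str      # provenance: who produced the object
    file_format: str  # format the object is stored in
    sha256: str       # checksum recorded at ingest
    events: list = field(default_factory=list)

    def log_event(self, event_type: str, date: str) -> None:
        """Append a preservation event such as ingest or migration."""
        self.events.append({"type": event_type, "date": date})


def to_json(record: PreservationRecord) -> str:
    """Serialize the record so it can be stored next to the object."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

Keeping the event history in the same record as the identifier and checksum means that a later reader can see not only what the object is, but also what has been done to it since it was ingested.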
With the rapid acceptance of digital technologies and growth
of digital libraries, that growth is outpacing the standardization process.
Without standard formats and methodologies across the board, will the
information need to be updated, reformatted, or restructured later? Future issues
are likely to include storage facilities as well as the growing popularity of
cloud storage. It works for personal accounts, but will large facilities also be
able to utilize this service?
For risk management to be effective, it must
follow a trickle-down approach. The context must be
established, and risks must be identified, analyzed, evaluated, and then treated,
with constant monitoring and review throughout the cycle. We must remember that all current methods of
preservation have tradeoffs and must balance functionality, dependability, and
cost based on current technologies and methods. With the rapid changes in technology
and a lack of funds for digitization, information centers
will remain behind the curve for the foreseeable future.