
PREMIS, for Digital Preservation


Note:  We will occasionally post material to The Signal, with updates, that was previously published only on our website.  The following is an article from our “Meeting the Challenge” series, October, 2010.

Behind every digital object, there is usually metadata with descriptive information about the object.  But the library world is all too aware that metadata for access and discovery is no longer enough.  Now, digital library professionals are looking to the future with an eye towards preservation, needing to preserve not only the digital objects themselves but also the valuable metadata that goes along with them.

Enter PREMIS, which stands for Preservation Metadata: Implementation Strategies. According to the publication Understanding PREMIS written by Priscilla Caplan and issued by the Library of Congress, preservation metadata “supports activities intended to ensure the long-term usability of a digital resource.”

The Library of Congress sponsors the PREMIS Maintenance Activity, and thereby promotes the use and development of this preservation metadata standard as a regular part of the digital library process.

PREMIS is motivated by the needs of implementing a digital preservation repository, which must keep important information about its digital objects to enable long-term management.  As stated in Understanding PREMIS, “the primary uses of PREMIS are for repository design, repository evaluation and exchange of archived information packages among preservation repositories.”

So why is this important?  Rebecca Guenther, senior networking and standards specialist at the Library of Congress, illustrates this by the following comparison:  “In addition to being able to find books, you need to be able to bind the books so they don’t fall apart and perform other preservation actions that keeps their pages readable and intact.”  PREMIS provides the information to ensure that the object can be preserved – as a sort of digital “binding” – to keep the items, through the metadata, useable over time.

PREMIS grew out of an effort by the cultural heritage community to build on the Open Archival Information System repository model for digital resources.   Discussions about the need for preservation metadata led to the formation of a PREMIS Working Group in 2003, an international collaboration of experts involved in digital preservation activities, jointly sponsored by OCLC and RLG.  The initial result was a tool that has been in increasing use ever since – the PREMIS Data Dictionary for Preservation Metadata.

The Data Dictionary is a comprehensive resource for the implementation of preservation metadata in digital library systems.  It consists of a core set of standardized data elements that are recommended for repositories to manage and perform the preservation function.  That function includes actions to keep digital objects useable over time, meaning that they stay viable, readable, displayable and intact, all for the purpose of future access.

The digital preservation community recognized the importance of the Data Dictionary from the start:  in 2005 the Working Group received the prestigious Digital Preservation Award from the Digital Preservation Coalition in the UK, and a year later it was given the Preservation Publication Award by the Society of American Archivists.  The judges for this last award noted, “The work is intellectually sophisticated, groundbreaking, truly collaborative and international in scope, and is of great significance for the archival preservation community.”

To illustrate the general need for preservation metadata, consider that certain file formats can become obsolete and no longer accessible by current applications.  This would require either transforming older formats to new ones (migration), or reproducing the original experience with newer technology (emulation).  In order to succeed, both of these strategies would require the following: technical metadata about the original files, information about the older hardware and software that they ran on, and a record of what actions had been performed on them – in other words, preservation metadata.
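As a rough illustration of the three kinds of information named above, the following Python sketch keeps technical metadata, the original environment, and a history of actions together in one record.  It is only a sketch under stated assumptions: the class names, field names and the needs_migration helper are hypothetical and invented for this example; they are not the semantic units defined in the PREMIS Data Dictionary.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PreservationEvent:
    """One action performed on a file (hypothetical structure, not PREMIS)."""
    event_type: str  # e.g. "migration" or "fixity check"
    date: str        # date of the action, as an ISO 8601 string
    outcome: str     # e.g. "success" or "failure"


@dataclass
class PreservationRecord:
    """Illustrative record; all field names are assumptions for this sketch."""
    identifier: str
    # Technical metadata about the original file
    file_format: str
    format_version: str
    size_bytes: int
    # The older software and hardware the file ran on
    software_environment: str
    hardware_environment: str
    # What actions have been performed on the file
    events: List[PreservationEvent] = field(default_factory=list)

    def record_event(self, event_type: str, date: str, outcome: str) -> None:
        """Append an action to the file's history."""
        self.events.append(PreservationEvent(event_type, date, outcome))


def needs_migration(record: PreservationRecord, obsolete_formats: Set[str]) -> bool:
    """Return True if the record's format is in the caller's obsolete list."""
    return record.file_format in obsolete_formats
```

A repository might create one such record per file, call record_event each time it migrates or checks the file, and periodically run needs_migration against its own list of at-risk formats.  The point is simply that none of these decisions can be made unless the information was captured and kept in one place.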

The implementation of PREMIS has matured since the Data Dictionary was first issued.  It is increasingly being seen as a key tool for developing a preservation infrastructure. Even though PREMIS is just beginning to be adopted by some institutions, it has had an impact on their overall preservation strategies.

Government agencies, such as the Government Printing Office and the National Archives and Records Administration, have adopted or are planning to adopt PREMIS.  Ex Libris, a library information technology company, has also integrated PREMIS into its preservation product. Translations of Understanding PREMIS into Spanish, Italian and German, and of the entire Data Dictionary into Japanese, demonstrate broad international appeal.  In some countries, use of the Data Dictionary is mandated for digital objects in cultural heritage digital repositories.

The Library’s Network Development and Standards Office hosts the PREMIS website, which provides documentation and discussion lists, and serves as a central information point for all things PREMIS.  The initial working group has now been replaced by the ongoing PREMIS Maintenance Activity, which includes a PREMIS Editorial Committee.  This activity supports the maintenance of the Data Dictionary as well as the XML schema, maintains centralized discussion groups and forums, and provides tutorials and workshops on PREMIS. The PREMIS folks are not without a sense of humor, either – part of this ongoing activity is the PREMIS Implementers Group (the “PIG”), which hosts a wiki called “The PigPen.”

The Library of Congress contracted with the Florida Center for Library Automation to build a tool that supports PREMIS implementation by automating the preservation metadata creation process.  The result was PREMIS in METS, a free toolbox that is available on SourceForge.

According to Rebecca Guenther, there are many near-term goals for the overall PREMIS activity, including revisions of the Data Dictionary based on user experiences, explorations of changes to the underlying data model, experimentation with the exchange of objects between repositories and implementation at the Library of Congress in some digital projects.  Guenther says, “a lot of these data elements are already available from the objects, but the challenge is capturing them and putting them in one place to be used for the preservation process.”

(UPDATE:  Since this article was originally published, an updated version of the Data Dictionary has become available, as well as a new OWL ontology.  Rebecca Guenther has left the Library and is now serving as an independent consultant to the project.  See the PREMIS page for the latest information.)


Comments (2)

  1. Does PREMIS or any of the associated groups have plans to provide an educational service which can be made available to schools? I lecture at art schools about copyright and business practices and have discovered that many of the schools producing creators of content, photographers, illustrators, designers, etc. are weak in this area.
    Wayne Eastep

    • In checking into your question, Wayne, I learned there is nothing specific planned for schools. However, there are several ways to get more information, all available through the PREMIS website (http://www.loc.gov/standards/premis/): For an overview of PREMIS, there is the publication “Understanding PREMIS” (http://www.loc.gov/standards/premis/understanding-premis.pdf); there are also occasionally PREMIS-related events, and if you have further questions, you can submit directly to the PREMIS folks through the “comments” link on the site. I hope this helps.
