Seeking Comment on Migration Checklist

The NDSA Infrastructure Working Group’s goals are to identify and share emerging practices around the development and maintenance of tools and systems for the curation, preservation, storage, hosting, migration, and similar activities supporting the long-term preservation of digital content. One of the ways the IWG strives to achieve its goals is to collaboratively develop and publish technical guidance documents about core digital preservation activities. The NDSA Levels of Digital Preservation and the Fixity document are examples of this.

Birds. Ducks in pen. (Photo by Theodor Horydczak, 1920) (Source: Horydczak Collection Library of Congress Prints and Photographs Division, http://hdl.loc.gov/loc.pnp/thc.5a37506)

The latest addition to this guidance is a migration checklist. The IWG would like to share a draft of the checklist with the larger community in order to gather comments and feedback that will ultimately make this a better and more useful document. We expect to formally publish a version of this checklist later in the Fall, so please review the draft below and let us know by October 15, 2015 in the comments below or in email via ndsa at loc dot gov if you have anything to add that will improve the checklist.

Thanks, in advance, from your IWG co-chairs Sibyl Schaefer from University of California, San Diego, Nick Krabbenhoeft from Educopia and Abbey Potter from the Library of Congress. Another thank you to the former IWG co-chairs Trevor Owens from IMLS and Karen Cariani from WGBH, who led the work to initially develop this checklist.

Good Migrations: A Checklist for Moving from One Digital Preservation Stack to Another

The goal of this document is to provide a checklist of things you will want to do or think through before and after moving digital materials and metadata forward to new digital preservation systems/infrastructures. This could entail switching from one system to another system in your digital preservation and storage architecture (various layers of hardware, software, databases, etc.). This is a relatively expansive notion of system. In some cases, organizations have adopted turn-key solutions whereby the requirements for ensuring long-term access to digital objects are taken care of by a single system or application. However, in many cases, organizations make use of a range of built and bought applications and core functions of interfaces to storage media that collectively serve the function of a preservation system. This document is intended to be useful both for migrations between comprehensive systems and for situations where one is swapping out individual components in a larger preservation system architecture.

Issues around normalization of data or of moving content or metadata from one format to another are out of scope for this document. This document is strictly focused on checking through issues related to moving fixed digital materials and metadata forward to new systems/infrastructures.

Before you Move:

  1. Review the state of data in the current system, clean up any data inconsistencies or issues that are likely to create problems on migration, and identify and document key information (database naming conventions, nuances and idiosyncrasies in system/data structures, use metrics, etc.).
  2. Make sure you have fixity information for your objects and make sure you have a plan for how to bring that fixity information over into your new system. Note that different systems may use different algorithms/instruments for documenting fixity information, so check to make sure you are comparing the same kinds of outputs.
  3. Make sure you know where all the metadata/records for your objects are stored and, if you are moving that information, that you have plans in place to ensure its integrity.
  4. Check/validate additional copies of your content stored in other systems; you may need to rely on some of those copies for repair if you run into migration issues.
  5. Identify any dependent systems using API calls into your system or other interfaces which will need to be updated and make plans to update, retire, or otherwise notify users of changes.
  6. Document feature parity and differences between the new and old system and make plans to change/revise and refine workflows and processes.
  7. Develop new documentation and/or training for users to transition from the old to the new system.
  8. Notify users of the date and time the system will be down and not accepting new records or objects. If the process will take some time, provide users with a plan for expectations on what level of service will be provided at what point and take the necessary steps to protect the data you are moving forward during that downtime.
  9. Have a plan for where to put items that need ingestion during the migration. You may not be able to tell people to just stop and wait.
  10. Decide on what to do with your old storage media/systems. You might want to keep them for a period just in case, reuse them for some other purpose or destroy them. In any event it should be a deliberate, documented decision.
  11. Create documentation recording what you did and how you approached the migration (including any issues or failures that arose) to provide provenance information about the migration of the materials.
  12. Test migration workflow to make sure it works – both single records and bulk batches of varying sizes to see if there are any issues.
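
As one illustration of item 2 above, here is a minimal sketch of how you might capture fixity information before a move. It assumes your objects are plain files under a single directory and that SHA-256 is an acceptable algorithm for both the old and the new system; the function names `sha256_of_file` and `build_manifest` are hypothetical and not part of any particular preservation tool.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large objects do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root, with forward slashes) to its digest.

    Assumption: every object is a regular file somewhere under `root`.
    """
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = sha256_of_file(path)
    return manifest
```

Recording paths relative to the root, rather than absolute paths, is a deliberate choice here: it lets a manifest produced on the old system be compared with one produced on the new system even when the storage locations differ.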

After you Migrate:

  1. Check your fixity information to ensure that your new system has all your objects intact.
  2. If any objects did not come across correctly, as identified by comparing fixity values, then repair or replace the objects via copies in other systems. Ideally, log this kind of information as events for your records.
  3. Check to make sure all your metadata has come across, and spot check to make sure it hasn’t been mangled.
  4. Notify your users of the change and again provide them with new or revised user documentation.
  5. Record what is done with the old storage media/systems after migration.
  6. Assemble all documentation generated and keep with other system information for future migrations.
  7. Establish timeline and process for reevaluating when future migrations should be planned for (if relevant).
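
To make items 1 and 2 above more concrete, the following is a rough sketch of a post-migration fixity comparison. It assumes you already hold two manifests, one from the old system and one from the new, each a dictionary mapping an object's relative path to its digest, and that both were produced with the same algorithm. The function name `compare_manifests` and the sample file names and digests are purely illustrative.

```python
def compare_manifests(source, target):
    """Compare two {relative_path: digest} manifests.

    Returns a dict with three sorted lists:
      missing    - paths in source that are absent from target
      mismatched - paths present in both whose digests differ
      unexpected - paths in target that were not in source
    """
    missing = sorted(p for p in source if p not in target)
    unexpected = sorted(p for p in target if p not in source)
    mismatched = sorted(
        p for p in source if p in target and source[p] != target[p]
    )
    return {
        "missing": missing,
        "mismatched": mismatched,
        "unexpected": unexpected,
    }


# Illustrative data only; real digests would be full-length hash values.
old_system = {"a.tif": "aaa", "b.tif": "bbb", "c.tif": "ccc"}
new_system = {"a.tif": "aaa", "b.tif": "xxx", "d.tif": "ddd"}

report = compare_manifests(old_system, new_system)
print(report)
# {'missing': ['c.tif'], 'mismatched': ['b.tif'], 'unexpected': ['d.tif']}
```

Anything listed under `missing` or `mismatched` is a candidate for repair or replacement from copies held in other systems, and the report itself can be kept as part of the migration's event records.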

Relevant resources and tools:

This post was updated 9/3/2015 to fix formatting and add email information.
