Seeking Comment on Migration Checklist

The NDSA Infrastructure Working Group’s goals are to identify and share emerging practices around the development and maintenance of tools and systems for the curation, preservation, storage, hosting, migration, and similar activities supporting the long-term preservation of digital content. One of the ways the IWG strives to achieve its goals is to collaboratively develop and publish technical guidance documents about core digital preservation activities. The NDSA Levels of Digital Preservation and the Fixity document are examples of this.

Birds. Ducks in pen. (Photo by Theodor Horydczak, 1920) (Source: Horydczak Collection, Library of Congress Prints and Photographs Division, http://hdl.loc.gov/loc.pnp/thc.5a37506)

The latest addition to this guidance is a migration checklist. The IWG would like to share a draft of the checklist with the larger community in order to gather comments and feedback that will ultimately make this a better and more useful document. We expect to formally publish a version of this checklist later in the fall. Please review the draft below and, if you have anything to add that will improve the checklist, let us know by October 15, 2015, in the comments below or by email via ndsa at loc dot gov.

Thanks, in advance, from your IWG co-chairs Sibyl Schaefer from University of California, San Diego, Nick Krabbenhoeft from Educopia and Abbey Potter from Library of Congress. Another thank you to the former IWG co-chairs Trevor Owens from IMLS and Karen Cariani from WGBH, who led the work to initially develop this checklist.

Good Migrations: A Checklist for Moving from One Digital Preservation Stack to Another

The goal of this document is to provide a checklist of things you will want to do or think through before and after moving digital materials and metadata forward to new digital preservation systems/infrastructures. This could entail switching from one system to another system in your digital preservation and storage architecture (various layers of hardware, software, databases, etc.). This is a relatively expansive notion of system. In some cases, organizations have adopted turn-key solutions whereby the requirements for ensuring long-term access to digital objects are taken care of by a single system or application. However, in many cases, organizations make use of a range of built and bought applications and core functions of interfaces to storage media that collectively serve the function of a preservation system. This document is intended to be useful both for migrations between comprehensive systems and for situations where one is swapping out individual components in a larger preservation system architecture.

Issues around normalization of data or of moving content or metadata from one format to another are out of scope for this document. This document is strictly focused on checking through issues related to moving fixed digital materials and metadata forward to new systems/infrastructures.

Before you Move:

  1. Review the state of data in the current system, clean up any data inconsistencies or issues that are likely to create problems on migration, and identify and document key information (database naming conventions, nuances and idiosyncrasies in system/data structures, use metrics, etc.).
  2. Make sure you have fixity information for your objects and make sure you have a plan for how to bring that fixity information over into your new system. Note that different systems may use different algorithms/instruments for documenting fixity information, so check to make sure you are comparing the same kinds of outputs.
  3. Make sure you know where all the metadata/records for your objects are stored and, if you are moving that information, that you have plans in place to ensure its integrity.
  4. Check/validate additional copies of your content stored in other systems; you may need to rely on some of those copies for repair if you run into migration issues.
  5. Identify any dependent systems using API calls into your system or other interfaces which will need to be updated and make plans to update, retire, or otherwise notify users of changes.
  6. Document feature parity and differences between the new and old system and make plans to change/revise and refine workflows and processes.
  7. Develop new documentation and/or training for users to transition from the old to the new system.
  8. Notify users of the date and time the system will be down and not accepting new records or objects. If the process will take some time, provide users with a plan for expectations on what level of service will be provided at what point and take the necessary steps to protect the data you are moving forward during that downtime.
  9. Have a plan for where to put items that need ingestion during the migration. You may not be able to tell people to just stop and wait.
  10. Decide on what to do with your old storage media/systems. You might want to keep them for a period just in case, reuse them for some other purpose or destroy them. In any event it should be a deliberate, documented decision.
  11. Create documentation recording what you did and how you approached the migration (including any issues or failures that arose) to provide provenance information about the migration of the materials.
  12. Test the migration workflow to make sure it works, using both single records and bulk batches of varying sizes, to see if there are any issues.
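
Item 2 above calls for fixity information that can be carried into the new system. As a minimal sketch only, assuming your objects are reachable as ordinary files under one directory and that SHA-256 is an algorithm both systems can produce (both are assumptions, and the function names here are illustrative rather than part of any NDSA tool), a baseline manifest could be generated in Python along these lines:

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of one file, reading in chunks to limit memory use."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def build_manifest(root, algorithm="sha256"):
    """Map each file's path (relative to root) to its digest.

    Hypothetical helper: a real system may store fixity values in a
    database or in packaging metadata instead of deriving them from a
    directory tree.
    """
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_digest(path, algorithm)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Passing the algorithm explicitly makes it easier to match whatever the new system reports, so that you are comparing the same kinds of outputs. Saving the resulting manifest with your migration documentation gives you a baseline to check against after the move.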

After you Migrate:

  1. Check your fixity information to ensure that your new system has all your objects intact.
  2. If any objects did not come across correctly, as identified by comparing fixity values, then repair or replace the objects via copies in other systems. Ideally, log this kind of information as events for your records.
  3. Check to make sure all your metadata has come across, and spot check to make sure it hasn’t been mangled.
  4. Notify your users of the change and again provide them with new or revised user documentation.
  5. Record what is done with the old storage media/systems after migration.
  6. Assemble all documentation generated and keep with other system information for future migrations.
  7. Establish a timeline and process for reevaluating when future migrations should be planned (if relevant).
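
Items 1 and 2 above come down to comparing fixity values from before and after the move. The following is a rough sketch, assuming you hold two manifests as Python dictionaries that map an object's relative path to its checksum, both produced with the same algorithm (the dictionary structure and the function name are assumptions made for illustration):

```python
def compare_manifests(before, after):
    """Compare two {relative_path: checksum} mappings.

    Returns three sorted lists of paths:
      missing    - present before the migration but absent afterwards
      unexpected - present afterwards but not in the original manifest
      mismatched - present in both, but with different checksums
    """
    before_paths = set(before)
    after_paths = set(after)
    return {
        "missing": sorted(before_paths - after_paths),
        "unexpected": sorted(after_paths - before_paths),
        "mismatched": sorted(
            path
            for path in before_paths & after_paths
            if before[path] != after[path]
        ),
    }
```

Anything listed under "missing" or "mismatched" is a candidate for repair or replacement from copies held in other systems, and the report itself can be kept as an event record with the rest of your migration documentation.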

Relevant resources and tools:

What is Fixity, and When Should I be Checking It? http://digitalpreservation.gov/ndsa/working_groups/documents/NDSA-Fixity-Guidance-Report-final100214.pdf

Fixity Checkers http://e-records.chrisprom.com/checksum-verification-tools/

COPTR (Community Owned digital Preservation Tools Registry) http://coptr.digipres.org/Main_Page

POWRR tool grid http://digitalpowrr.niu.edu/tool-grid/

 

This post was updated 9/3/2015 to fix formatting and add email information.
