Beyond Us and Them: Designing Storage Architectures for Digital Collections 2014

The following post was authored by Erin Engle, Michelle Gallinger, Butch Lazorchak, Jane Mandelbaum and Trevor Owens from the Library of Congress.

The Library of Congress held the 10th annual Designing Storage Architectures for Digital Collections meeting September 22-23, 2014. This meeting is an annual opportunity for invited technical industry experts, IT professionals, digital collections and strategic planning staff and digital preservation practitioners to discuss the challenges of digital storage and to help inform future decision-making. Participants come from a variety of government agencies, cultural heritage institutions and academic and research organizations.

The DSA Meeting. Photo credit: Peter Krogh/DAM Useful Publishing.

Throughout the two days of the meeting the speakers took the participants back in time and then forward again. The meeting kicked off with a review of the origins of the DSA meeting. It started ten years ago with a gathering of Library of Congress and external experts who discussed requirements for digital storage architectures for the Library’s Packard Campus of the National Audio-Visual Conservation Center. Now, ten years later, the speakers included representatives from Facebook and Amazon Web Services, both of which manage significant amounts of content and neither of which existed in 2004 when the DSA meeting started.

The theme of time passing continued with presentations by strategic technical experts from the storage industry, who began with an overview of the capacity and cost trends in storage media over recent years. Two of the storage media being tracked weren’t on anyone’s radar in 2004 but loom large for the future: flash memory and Blu-ray discs. Moving quickly from the past to the future, the experts then offered predictions, with the caveat that predictions beyond a few years are predictably unpredictable in the storage world.

Another facet of time, “back to the future,” came up in a series of discussions on the emergence of object storage in up-and-coming hardware and software products. With object storage, hardware and software can deal with data objects (like files), rather than physical blocks of data. This is a concept familiar to those in the digital curation world, and it turns out that it was also familiar to long-time experts in the computer architecture world, because the original design for this was done ten years ago. Here are some of the key meeting presentations on object storage:

Several speakers talked about the impact of the passage of time on existing digital storage collections in their institutions and the need to migrate content from one set of hardware or software to another as time passes. The lessons of this were made particularly vivid by one speaker’s analogy, which compared the process to the travails of someone trying to manage the physical contents of a car over a lifetime.

Even more vivid was the “Cost of Inaction” calculator, which provides black-and-white evidence of the costs of not preserving analog media over time. It begins from the undeniable premise that you have to pick an actual date in the future for the “doomsday” when all your analog media will be unreadable.

The DSA Meeting. Photo Credit: Trevor Owens

Several persistent time-related themes engaged the participants in lively interactive discussions during the meeting. One topic was the practical methods for checking the data integrity of content in digital collections. This concept, called fixity, has been a common topic of interest in the digital preservation community. Similarly, a thread of discussion on predicting and dealing with failure and data loss over time touched on a number of interesting concepts, including “anti-entropy,” a type of computer “gossip” protocol designed to query, detect and correct damaged distributed digital files. Participants agreed it would be useful to find a practical approach to identifying and quantifying types of failures. Are the failures relatively regular but small enough that the content can be reconstructed? Or are the data failures highly irregular but catastrophic in nature?
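For readers who want a concrete picture of the fixity and repair ideas discussed above, here is a minimal Python sketch. It was not presented at the meeting and does not describe any participant's system. It assumes content is held as in-memory byte strings, that SHA-256 is the chosen digest, and that a simple majority vote among replicas stands in for a real anti-entropy gossip protocol. The names `digest`, `check_fixity` and `repair_from_replicas` are hypothetical.

```python
from __future__ import annotations

import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest, used here as the fixity value (an assumption)."""
    return hashlib.sha256(data).hexdigest()


def check_fixity(store: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return sorted names of objects that are missing or no longer match the manifest."""
    return sorted(
        name
        for name, expected in manifest.items()
        if name not in store or digest(store[name]) != expected
    )


def repair_from_replicas(replicas: list[dict[str, bytes]], name: str) -> bool:
    """Copy the majority version of `name` into every replica.

    A simplified stand-in for anti-entropy: replicas "vote" with their digests.
    Returns True only if more than half of all replicas agree on one digest.
    """
    counts = Counter(digest(r[name]) for r in replicas if name in r)
    if not counts:
        return False
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(replicas):
        return False  # no strict majority, so nothing can be trusted
    good = next(r[name] for r in replicas if name in r and digest(r[name]) == winner)
    for r in replicas:
        r[name] = good  # also restores the object where it was missing
    return True


# Hypothetical example data: three replicas, one silently corrupted.
original = {"report.pdf": b"annual report", "photo.tif": b"image bytes"}
manifest = {name: digest(data) for name, data in original.items()}
replicas = [dict(original) for _ in range(3)]
replicas[1]["photo.tif"] = b"image bytez"  # simulated bit rot

print(check_fixity(replicas[1], manifest))  # ['photo.tif']
repair_from_replicas(replicas, "photo.tif")
print(check_fixity(replicas[1], manifest))  # []
```

The sketch also hints at the two failure types raised in the discussion: a small, isolated error is outvoted and repaired, while a failure that damages most copies leaves no majority and the function declines to repair.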

Another common theme that arose was how to test and predict the lifetime of storage media. For example, how would one test the lifetime of media projected to last 1,000 years without having a time-travel machine available? Participants agreed to continue the discussions of these themes over the next year with the goal of developing practical requirements for communication with storage and service providers.

The meeting closed with presentations from vendors working on the cutting edge of new archival media technologies. One speaker dealt with questions about the lifetime of media by serenading the group with accompaniment from a 32-year-old audio CD copy of Pink Floyd’s “Dark Side of the Moon.” The song “Us and Them” underscored how the DSA meeting strives to bridge the boundaries placed between IT conceptions of storage systems and architectures and the practices, perspectives and values of storage and preservation in the cultural heritage sector. The song playing back from three-decade-old media on a contemporary device was a fitting symbol of the objectives of the meeting.

Background reading (PDF) was circulated prior to the meeting, and the meeting agenda and copies of the presentations are available at http://www.digitalpreservation.gov/meetings/storage14.html.
