The following is a guest post from Lee Nilsson, a National Digital Stewardship Resident working with the Repository Development Center at The Library of Congress.
The 2014 National Agenda for Digital Stewardship makes a clear-cut case for the development of File Format Action Plans to combat format obsolescence issues. “Now that stewardship organizations are amassing large collections of digital materials,” the report says, “it is important to shift from more abstract considerations about file format obsolescence to develop actionable strategies for monitoring and mining information about the heterogeneous digital files the organizations are managing.” The report goes on to detail the need for organizations to better “itemize and assess” the content they manage.
Just what exactly is a File Format Action Plan? What does it look like? What does it do? As the new National Digital Stewardship Resident I undertook an informal survey of a selection of divisions at the library. Opinions varied as to what should constitute a file format action plan, but the common theme was the idea of “a pathway.” As one curator put it, “We just got in X. When you have X, here’s the steps you need to take. Here are the tools currently available. Here is the person you need to go to.”
For the dedicated digital curator, there are many different repositories of information about the technical details of digital formats. The Library of Congress’ excellent Sustainability of Digital Formats page goes into exhaustive detail about dozens of different file format types. The National Archives of the UK’s now ubiquitous PRONOM technical registry is an indispensable resource. That said, specific file format action plans are not very common.
Probably the best example of File Format Action Plans in practice is provided by the Florida Digital Archive. The FDA attempted to create a plan for each type of file format they preserve digitally. The result is a list of twenty-one digital formats, ranked by “confidence” as high, medium, or low for their long-term storage prospects. Attached to each is a short Action Plan giving basic information about what to do with the file at ingest, its significant properties, a long-term preservation strategy, and timetables for short-term actions and review. Below that is a more technically detailed “background report” explaining the rationale behind each decision. Some of the action plans are incomplete, recommending migration to an as-yet-unspecified format at some point in the future. The plans have not been updated in some time, with many stating that they are “currently under discussion and subject to change.”
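To make the shape of such a plan concrete, here is a minimal sketch of how one might record the elements described above (confidence ranking, ingest action, significant properties, long-term strategy, review timetable) as a simple data structure. This is not the FDA's actual schema or tooling; the class, the field names, the `needs_attention` helper, and the sample entries are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Confidence(Enum):
    """Confidence in a format's long-term storage prospects."""
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class FormatActionPlan:
    """One hypothetical action plan record; field names are assumptions."""
    format_name: str
    confidence: Confidence
    ingest_action: str
    significant_properties: List[str] = field(default_factory=list)
    long_term_strategy: str = ""   # empty string means "not yet decided"
    review_timetable: str = ""     # free text, e.g. "review annually"


def needs_attention(plans: List[FormatActionPlan]) -> List[str]:
    """Return names of formats ranked low or lacking a long-term strategy."""
    return [
        p.format_name
        for p in plans
        if p.confidence is Confidence.LOW or not p.long_term_strategy.strip()
    ]


# Placeholder entries, not real rankings from any archive.
plans = [
    FormatActionPlan(
        format_name="Format A",
        confidence=Confidence.HIGH,
        ingest_action="store as received",
        significant_properties=["page count", "embedded fonts"],
        long_term_strategy="retain original",
        review_timetable="review annually",
    ),
    FormatActionPlan(
        format_name="Format B",
        confidence=Confidence.MEDIUM,
        ingest_action="store as received",
        # No long-term strategy yet: an incomplete plan.
    ),
    FormatActionPlan(
        format_name="Format C",
        confidence=Confidence.LOW,
        ingest_action="normalize at ingest",
        long_term_strategy="migrate when a target format is chosen",
    ),
]

flagged = needs_attention(plans)
```

Even a record this small makes incomplete plans easy to spot: anything ranked low, or still missing a long-term strategy, surfaces as a candidate for the next review.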
A related project was undertaken by the University of Michigan’s institutional repository, which organizes file formats into three specific targeted support levels.
Clicking on “best practices” for a format type (audio formats, for example) will take you to a page detailing more specific preservation actions and recommendations. This design is elegant and simple to understand, yet it offers little detailed information about the formats themselves.
An even broader approach was taken by the National Library of Australia. The NLA encourages its collection curators to make “explicit statements about which collection materials, and which copies of collection materials, need to remain accessible for an extended period, and which ones can be discarded when no longer in use or when access to them becomes troublesome.” They call these outlines “Preservation Intent Statements.” Each statement outlines the goals and issues unique to a library division. The Preservation Intent Statement for NLA’s newspaper digitization project, for example, goes into the details of what they intend to save, in what format, and what preservation issues to expect. This very top-down approach does not go into great detail about file formats themselves, but it may be useful in clarifying just what the mission of a curatorial division is, as well as providing some basic guidance.
The idea of file format action plans based on risk assessment has had notable critics. Johan van der Knijff, writing on the Open Planets Foundation blog, compared the process of assessing file format risks to “Searching for Bigfoot,” arguing that these activities always rest on a theoretical framework and that scarce resources could be better spent solving problems that do not require any soothsaying or educated guesswork. Tim Gollins of the National Archives of the UK argues that while digital obsolescence issues may be real in some cases, resources may be better spent addressing the more basic needs of capture and storage.
While taking those critiques seriously, it may be wise to take a longer view. It is valuable to develop a way to think about and frame these issues going forward. Sometimes getting something on paper is a necessary first step, even if it is destined to be revised again and again. Based on my discussions with curators at the Library of Congress, a format action plan could be more than just an “analysis of risk.” It could contain actionable information about software and formats that could be a major resource for the busy data manager. In a sprawling and complex organization like the Library of Congress, getting everyone on the same page is often impossible, but maybe we can get everyone on the same chapter with regard to digital formats.
Over the next six months I’ll be taking a look at some of these issues for the Office of Strategic Initiatives at the Library. As a relative novice to the world of library issues, I have been welcomed by the friendly and accommodating professionals here at the Library. I hope to get to know more of the fascinating people working in the digital preservation community as the project progresses.