The following is a guest post by Abbie Grotke, Web Archiving Team Lead at the Library of Congress, serving as Content Working Group co-chair for the National Digital Stewardship Alliance.
Digital preservationists are generally peaceful types, so no actual dragons were harmed in the National Digital Stewardship Alliance Content Working Group’s “Slaying the Dragons: What is at risk and how do we rescue it?” workshop. Instead, attendees at our workshop took a quieter approach, through thoughtful discussion, and together we spent just over an hour mulling over what types of digital information might be particularly in need of preservation.
One of the things the Content Working Group is tasked with doing is developing a clearinghouse that will enable a variety of stakeholders to determine what types of content or collections are at risk, identify at-risk content or collections for preservation, and match orphan collections with trusted partners for access and preservation. Prior to the workshop, the group working on the clearinghouse had drafted a list of types of content that are at risk. That draft was the focus of the workshop. The goal of the workshop was to ensure that this was the right list (should anything be added to or removed from it?) and to consider how we might move forward with these categories in terms of identifying ways to preserve this content. Examples of at-risk content were shown to help illustrate some of the issues and concerns.
So how do we define “at risk”? Fortunately there’s been quite a bit of work in this area already. The NDIIPP-sponsored Blue Ribbon Task Force on Sustainable Digital Preservation and Access identified these different categories: technical risks (due to obsolete, complicated, or proprietary formats); legal risks (who has the right to preserve it?); economic risks (who has the resources to preserve it?); organizational risks (organizations go away, or reorganize); risk of ignorance (lack of expertise or best practices); and uncertainty of the long-term value.
During the workshop, we broke out into four groups and discussed various types of content that had been identified already as having potential risk in terms of preservation. The groups were pretty arbitrary, just for the purposes of discussion, and not meant to represent any specific groupings of at-risk content.
• State, local and federal government information
• News, citizen journalism, events and political
• Datasets, directories, and software
• Cultural, creative, and unique (or, as we described it, the “grab bag”)
• We need to reach out to communities that create the content, have the content, and know about the content that is at risk. Some that were mentioned in our workshop were scientists, citizens, government agencies, artists, and enthusiasts. Who else should be involved in these discussions? Can they help us define value and select content for preservation?
• We need to be proactive about education and raising awareness of digital preservation: Why is this important? How can others ensure content is preserved? How can we work this into the concept of the clearinghouse?