Bit By Bit: Recent Projects on Digital Forensics for Collecting Institutions

This is a guest post by Bradley Daigle, Director of Digital Curation Services and Digital Strategist for Special Collections, University of Virginia; Matthew Kirschenbaum, Associate Professor of English and Associate Director, Maryland Institute for Technology in the Humanities (MITH), University of Maryland; and Christopher (Cal) Lee, Associate Professor at the School of Information and Library Science at the University of North Carolina at Chapel Hill.

For the last several years there has been increased attention to the role and application of digital forensic technologies in cultural heritage institutions. Individual practitioners, especially in the US, UK, Canada, and Australia, have worked the conference circuit, demonstrating capabilities and discussing the issues and challenges around the adoption of forensic methods and tools.

A 2010 report on Digital Forensics and Born-Digital Content in Cultural Heritage Institutions published by the Council on Library and Information Resources surveyed the landscape and reported that “the methods and tools developed by forensics experts represent a novel approach to key issues and challenges in the archives and curatorial community” (1), yet also concluded that “digital forensics should not simply be imported and adopted in toto into manuscript archives and the broader cultural heritage and scholarly communities” (60).

The Library of Congress, meanwhile, named the growing visibility of digital forensics as one of the “Top 10 Digital Preservation Developments of 2010.”

In addition to this activity and advocacy, there have been several larger funded projects that have begun addressing the integration of forensic technologies within the specific context of the archival workflows at cultural heritage institutions.

Day 11 Forensics by user Tojosan on Flickr

To an outsider, these projects may appear either redundant or self-contained; however, these first few large-scale efforts complement each other remarkably well, addressing key aspects of archival practice with notable compatibility at both technical and theoretical levels.

This post surveys several of these organized efforts to date. Others, such as the important futureArch project currently underway at the Bodleian Library, likewise merit attention.

Born Digital Collections: An Inter-Institutional Model for Stewardship (AIMS) was a two-year grant funded by The Andrew W. Mellon Foundation. It established an international team consisting of the University of Hull (UK), Stanford University, Yale University, and the University of Virginia (as project lead). Its goal was to find common ground among these partners for creating a methodological approach to stewarding born-digital materials.

Unlike many solutions that are local and customized to suit a particular organization’s needs, the AIMS framework set out to establish the parameters and questions one needs to ask before taking on born-digital archives. With four partners, the project set out to establish a shared vocabulary and method (no easy task) and applied it to the thirteen collections identified as part of the grant. The goal was to create a stewardship model informed by sound archival practice that could be repurposed by any organization. AIMS has published a white paper of its work: http://www2.lib.virginia.edu/aims/whitepaper/.

BitCurator, meanwhile, a joint effort led by the School of Information and Library Science (SILS) at the University of North Carolina at Chapel Hill and the Maryland Institute for Technology in the Humanities (MITH) at the University of Maryland, is also a two-year project funded by the Andrew W. Mellon Foundation. It aims to create and analyze systems for archivists, librarians and other information professionals to incorporate digital forensics methods.

BitCurator builds on both the CLIR Digital Forensics report and AIMS. It also follows a smaller Mellon-funded project at SILS called the Digital Acquisition Learning Laboratory (DALL), which established and implemented hands-on digital forensics learning experiences for library and information science students. Whereas AIMS focused on workflows and shared practices, BitCurator is focused on packaging and disseminating relatively self-contained, modular tools that collecting institutions can use to generate disk images (bit-identical copies of disks), and then process and export the data in ways that can be incorporated into existing or emerging workflows.
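For readers unfamiliar with the term, one common way to check that a disk image really is a bit-identical copy is to compare cryptographic checksums of the source and the image. The Python sketch below illustrates that general idea only; it is not drawn from BitCurator, and the function names (sha256_of, is_bit_identical, demo) and file names are hypothetical. It uses small temporary files as stand-ins for a real disk and its images.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_bit_identical(source_path, image_path):
    """Treat two files as bit-identical when their SHA-256 digests match."""
    return sha256_of(source_path) == sha256_of(image_path)


def demo():
    """Compare a stand-in 'disk' with a faithful copy and a one-bit-off copy."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "source.bin")   # hypothetical source
        good = os.path.join(workdir, "good.img")       # exact copy
        bad = os.path.join(workdir, "bad.img")         # copy with one bit flipped

        data = bytes(range(256)) * 16
        flipped = bytearray(data)
        flipped[0] ^= 0x01  # flip a single bit in the first byte

        for path, payload in ((source, data), (good, data), (bad, bytes(flipped))):
            with open(path, "wb") as f:
                f.write(payload)

        return is_bit_identical(source, good), is_bit_identical(source, bad)


if __name__ == "__main__":
    print(demo())  # prints (True, False)
```

A single flipped bit is enough to change the checksum, which is why a recorded checksum can later serve as evidence that an image has not been altered since it was made.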

Two groups of external partners are contributing to this process and have already held meetings with the project team: a professional expert panel of individuals who are at various stages of implementing digital forensics tools and methods in their collecting institutions, and a development advisory group of individuals who have significant experience with software development.

Other aspects of the CLIR report still require further investigation and project-based research. In particular, much work remains to be done around articulating the nature of the kind of scholarly inquiries born-digital materials can support and facilitate, and the potential role of digital forensics tools in the hands of patrons as well as archivists.

Likewise, there are a host of ethical issues that are acknowledged but by no means resolved in the CLIR report, which notes that “As born-digital materials become commonplace within libraries and archives, the librarian’s and the archivist’s commitment to professional ethics is being tested under a new and constantly changing set of technical circumstances” (50).

Clearly much work remains to be done. However, as recent events such as the Digital Forensics symposium hosted by the University of Maryland in 2010, the AIMS partners meeting in 2011 (whose centerpiece was a participant-driven “unconference”), the Day of Digital Archives (organized by one of the AIMS partner institutions), a DPC Briefing held at Oxford, and the BitCurator advisory meetings in late 2011 and early 2012 all demonstrate, there is an energized and well-networked community of practitioners who are committed to doing that work.
