This is a guest post written by Amanda May, Digital Projects Specialist in the Preservation Services Division. Her work includes managing digital files for the division, recovering data from removable media in Library collections, and providing consultation and services for born-digital collections data.
Born-digital preservation work most often begins with a physical object – a CD, a hard drive, or a laptop. This case study documents preservation work on two laptops in a collection from the Music Division.
The first step is to document details about the original object. Physical details such as the laptop’s make, model, and original condition help me plan how best to retrieve the data I want to preserve, and they help tie the eventual digital data package to its context. For this project, I was looking at an Apple MacBook Pro 13”, circa 2008-2012, running Mac OS X, and a Toshiba Satellite laptop running Windows XP. Both came with their original power cords, but I keep a collection of power cords that can help when they’re not included.
I placed the MacBook on an anti-static mat and used a surge protector to plug in the charging cable. Errant electricity is the first enemy of born-digital preservation. The MacBook was dirty and damaged – I used latex gloves when handling it to protect myself and later used antibacterial wipes on my workspace. The trackpad was swollen and misaligned. I worried that the laptop would be too damaged to work on, but it turned on and booted up, making some weird high-pitched noises as it did. The trackpad worked to move the cursor, but the ability to “click” was gone. I attached a wired mouse so that I could navigate – another tool that I keep in my arsenal.
My first actions were to turn off the Wi-Fi and Bluetooth connectivity. Forensic investigators work in Faraday cages to prevent outside forces from remotely erasing evidence. I’m not working against hostile forces, but turning off this connectivity keeps the collection items safe from system updates, account synchronization, or other automatic functions that could overwrite the data I want to preserve.
I created a list of every file on the laptop and its path using Terminal. This created a record of the entire laptop. I could not create a disk image using Disk Utility because I did not know the laptop’s password, so creating a file listing was a good way of recording the contents in situ. One option that I dismissed was building a portable USB drive that could run forensic imaging software and using that software to create a disk image. However, this would actually make changes to the original computer, both in memory and in its equivalent of registry files (preference and .plist files in macOS). From my initial look at the files on the computer, I was confident that I could find and copy the files I wanted without creating a disk image.
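The post does not record which Terminal command produced the file listing. For readers who want to try something similar, here is a minimal Python sketch of the same idea: walk a directory tree and write each file's path, size, and last-modified time to a CSV file. The function name `list_files` and the CSV columns are assumptions made for illustration, not the workflow used on these laptops.

```python
import csv
import os
from datetime import datetime, timezone


def list_files(root, listing_path):
    """Write one CSV row per file under root and return the number of files seen.

    Hypothetical helper for illustration; the column layout is an assumption.
    """
    count = 0
    # backslashreplace keeps the listing writable even if a file name
    # contains bytes that are not valid UTF-8
    with open(listing_path, "w", newline="", encoding="utf-8",
              errors="backslashreplace") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "size_bytes", "modified_utc"])
        # os.walk does not follow symbolic links to directories by default
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # visit subfolders in a stable order
            for name in sorted(filenames):
                full = os.path.join(dirpath, name)
                count += 1
                try:
                    info = os.lstat(full)  # reads metadata only, not content
                except OSError:
                    # Record the path even when its metadata cannot be read
                    writer.writerow([full, "", ""])
                    continue
                modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
                writer.writerow([full, info.st_size, modified.isoformat()])
    return count
```

In practice the listing file would be written somewhere outside the tree being listed, such as an external drive, so that the listing does not end up recording itself.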
I then went through the laptop and copied all the user-created files I could find onto an external hard drive, maintaining the folder structure as much as possible. Most documents were in the Documents folder, but others were hiding on the desktop, in a Library folder for a specific app, or in other surprising locations. I exported an archive of the Mail inbox and the Address Book, even though that information was likely duplicated in files I had already copied from elsewhere on the hard drive. It was okay to duplicate work at this stage because I was building a complete body of files, and de-duplication and weeding could be completed later. I documented all of my work, both so that I would know where I had already searched and to include with the final preservation package. When I was confident that I had captured everything, I shut the laptop down and put it away.
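The post does not say which tool did the copying. As an illustration only, the sketch below shows one way to copy selected files and folders to another drive while keeping their relative paths, and to return a simple log of what was copied. The function name `copy_preserving_structure` and the shape of the log are assumptions.

```python
import shutil
from pathlib import Path


def copy_preserving_structure(source_root, relative_paths, dest_root):
    """Copy each listed file or folder from source_root into dest_root,
    keeping the same relative path, and return a list of (source, destination).

    Hypothetical helper for illustration.
    """
    source_root = Path(source_root)
    dest_root = Path(dest_root)
    log = []
    for rel in relative_paths:
        src = source_root / rel
        dst = dest_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            # copytree copies files with shutil.copy2 by default;
            # dirs_exist_ok requires Python 3.8 or later
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            # copy2 copies the content and attempts to preserve timestamps
            shutil.copy2(src, dst)
        log.append((str(src), str(dst)))
    return log
```

A call such as `copy_preserving_structure("/Users/someone", ["Documents", "Desktop/todo.txt"], "/Volumes/External/laptop1")` uses made-up paths. The returned log could be saved alongside the processing notes.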
I repeated all of these steps for the Toshiba Satellite laptop, which, surprisingly, was mostly empty of user-created files. It is possible that the owner cleaned up their files before donating the laptop to the collection, but then why did they donate the laptop, and why did the archive accept it?
My work with the laptops done, I set to work on finalizing the digital preservation package.
I looked through each file listing to determine whether I had gotten everything I needed, then set to evaluating the files I had copied from each laptop. System files and application executables were weeded out, as well as files that were not deliberately downloaded or created by the user (cookies and viruses, for example). I ran forensic reports to look for duplicates, encrypted files, and personally-identifiable information, but I provided these reports to the custodial division for their own judgement and weeding instead of doing that work myself.
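The tool that produced these reports is not named here, and the sketch below is not that tool. It illustrates only the duplicate-detection part of such a report, using a common approach: compute a SHA-256 checksum for every file and group the files whose checksums match. Detecting encrypted files or personally-identifiable information is not covered. The function names are assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by checksum; keep only groups with 2+ files.

    Hypothetical helper for illustration. Returns {checksum: [paths]}.
    """
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        # Skip folders and symbolic links; hash regular files only
        if path.is_file() and not path.is_symlink():
            by_hash[sha256_of(path)].append(path)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```

Because the function only reports groups and deletes nothing, its output could be passed along for someone else to review, in keeping with leaving weeding decisions to the custodial division.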
Finally, I packaged all of these files together with preservation and descriptive metadata and uploaded them to our digital repository. I sent the forensic reports to the custodial division along with notes on the processing methodology, then returned the original objects to the division as well.
Working with complex objects like a laptop can mean making a lot of informed decisions about how to best perform the job. Each item is unique and presents its own special challenges. Preserving born-digital data is an important and growing part of the Library of Congress’ duty to safeguard its collections.