This is a guest post by Kathleen O’Neill, Senior Archives Specialist in the Manuscript Division.
The Manuscript Division, located in the Library’s James Madison Building in Washington, D.C., is excited to announce the launch of a born-digital access workstation in the Manuscript Reading Room.
Born-digital collection materials are files created and maintained in digital form. Unlike digitized content, born-digital files are not surrogates for physical materials; their original format is digital. Word processing documents, websites, email, digital photographs, and databases are all examples of born-digital files. The division’s born-digital holdings span from the late 1970s to the present and encompass many of the media formats, file formats, software, and operating systems in use during this period. Files in obsolete formats (e.g., WordPerfect for DOS) or created with obsolete operating systems and software (e.g., Mac OS 9 and MacDraw Pro v.1.5) require specialized tools for access, and that’s where the born-digital access workstation comes in. While an estimated 85% of the born-digital collection material is accessible using modern software and file viewers, many files created before 2001, particularly Mac files, cannot be rendered using modern software.
Access to science and journalism collections in particular is negatively impacted by obsolescent operating systems and file format issues. For example, up to 40% of the born-digital files in the papers of molecular biologist Nina V. Fedoroff are inaccessible without specialized tools capable of running outdated software. One approach to providing access is through emulation, a process by which modern software can imitate legacy operating systems and software. Emulators have been around since the mid-1960s, but were popularized more recently by the video gaming community. Since the mid-1990s, video game enthusiasts have used emulators to preserve the look and feel of classic video games such as Nintendo’s Excitebike. The tools installed on the Manuscript Division’s born-digital access workstation can emulate a range of Apple and DOS operating systems. These emulation tools include Basilisk II (Mac OS 0.x through 8.1), SheepShaver (Mac OS 7.5.3 to OS 9.0.4), QEMU (Mac OS 9.2 to 10.4), and DOSBox (MS-DOS).
Practically speaking, the availability of these emulation tools means that thousands of previously inaccessible files are now available to researchers in the Manuscript Division’s reading room. Fedoroff, for instance, initially used paper index cards to document hybridization of “jumping genes” in corn. Around 1988, she abandoned those paper index cards and began to use a Mac-based HyperCard program called HyperMaize to document this information. With the new workstation, researchers can now access and interact with more than 1,000 HyperMaize files in her papers.
Similarly, the papers of chaos theorist Edward N. Lorenz contain several DOS-based data visualization programs created for Lorenz attractor data. Without the DOSBox emulator, researchers could only look at the program and data files as text files (Fig. 2) or watch a screen capture of the data visualization running (Fig. 3). Now, researchers can interact with the software by changing inputs or even using their own data with the program.
The HyperMaize cards and the Chaos Water Wheel software are examples of file formats that are completely inaccessible without emulation. Many more files are only partially accessible using modern software, but emulation enables us to access the full information and formatting of a file. The example below, again from the Fedoroff Papers, shows an image rendered with modern software on the left and a fully emulated version on the right.
The newly installed tools on the workstation focus on supporting access by allowing researchers to view and copy files and extract metadata. Researchers do not have permission to copy software. Soon, additional tools to support digital humanities research methods will be installed, including tools for text analysis, topic modelling, and data visualization.
The workstation represents the fruition of the 2020 Staff Innovator project, Born Digital Access Now!, a 120-day detail focused on investigating barriers to born-digital access. The project enabled staff innovators Chad Conrady and Kathleen O’Neill, senior archives specialists in the Manuscript Division, to identify existing tools and technology to support access to the full range of the division’s born-digital holdings. The Manuscript Division would like to thank the Digital Strategies Directorate and LC Labs for creating, supporting, and hosting the Staff Innovator project and, in particular, Eileen Jakeway, for her tenacity and creativity as project lead. The project was essential for the Manuscript Division to reach its goal of improving access to and engaging researchers with born-digital material.
The Manuscript Division looks forward to welcoming researchers interested in working with our born-digital collections. The division holds more than 120 collections containing born-digital files, which have been processed and described, with new finding aids added monthly. Only processed, open collections are accessible. To find out more about what materials are currently available, please browse or search the division’s finding aids or ask a librarian in the Manuscript Division. When searching the Finding Aids database, type the search term “digital ID” in the search box, select “All Words” from the drop-down menu, and limit the search to “Manuscript” in the Collections drop-down list. You may also call (202) 707-5387 directly to set up an appointment to use the workstation in the reading room. Please share widely!