Screenshot of a data visualization with a swirling "pinwheel" graphic. Blue lines on a black computer screen.
Still image of chaos water wheel visualization. Digital ID: mss85426_060_003, “Chaos Water Wheel,” software by Page, Mike & Jim Holsapple, 1990-1992. Edward N. Lorenz Papers, Manuscript Division, Library of Congress, Washington, D.C.

Accessing Our Digital Past in the Manuscript Division Reading Room

This is a guest post by Kathleen O’Neill, Senior Archives Specialist in the Manuscript Division.

The Manuscript Division, located in the Library’s James Madison Building in Washington, D.C., is excited to announce the launch of a born-digital access workstation in the Manuscript Reading Room.

Born-digital collection materials are files created and maintained in digital form. Unlike digitized content, born-digital files are not surrogates for physical materials; their original format is digital. Word processing documents, websites, email, digital photographs, and databases are all examples of born-digital files. The division’s born-digital holdings span from the late 1970s to the present and encompass many of the media formats, file formats, software, and operating systems in use during this period. Files in obsolete formats (e.g., WordPerfect for DOS) or created with obsolete operating systems and software (e.g., Mac OS 9 and MacDraw Pro v.1.5) require specialized tools for access, and that’s where the born-digital access workstation comes in. While an estimated 85% of the born-digital collection material is accessible using modern software and file viewers, many files created before 2001, particularly Mac files, cannot be rendered using modern software.

Access to science and journalism collections in particular is negatively impacted by obsolescent operating systems and file format issues. For example, up to 40% of the born-digital files in the papers of molecular biologist Nina V. Fedoroff are inaccessible without specialized tools capable of running outdated software. One approach to providing access is through emulation, a process by which modern software can imitate legacy operating systems and software. Emulators have been around since the mid-1960s, but were popularized more recently by the video gaming community. Since the mid-1990s, video game enthusiasts have used emulators to preserve the look and feel of classic video games such as Nintendo’s Excitebike. The tools installed on the Manuscript Division’s born-digital access workstation can emulate a range of Apple and DOS operating systems. These emulation tools include Basilisk II (Mac OS 0.x through 8.1), SheepShaver (Mac OS 7.5.3 to OS 9.0.4), QEMU (Mac OS 9.2 to 10.4) and DOSBox (MS-DOS).
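For readers who like to see the coverage spelled out, the emulator list above can be written as a small lookup table. The Python sketch below is illustrative only and is not software installed on the workstation: the version ranges are copied from this post, while the names `EMULATORS`, `parse_version`, and `candidate_emulators` are hypothetical. It also assumes that a range ending in "10.4" covers point releases such as 10.4.11, which the post does not state.

```python
from typing import List, NamedTuple, Optional, Tuple

Version = Tuple[int, ...]


class EmulatorRange(NamedTuple):
    name: str
    family: str                # "mac" or "dos" (labels chosen for this sketch)
    low: Optional[Version]     # lowest supported version, None if unversioned
    high: Optional[Version]    # highest supported version, None if unversioned


# Ranges as listed in the post; "0.x" is recorded as (0,).
EMULATORS = [
    EmulatorRange("Basilisk II", "mac", (0,), (8, 1)),
    EmulatorRange("SheepShaver", "mac", (7, 5, 3), (9, 0, 4)),
    EmulatorRange("QEMU", "mac", (9, 2), (10, 4)),
    EmulatorRange("DOSBox", "dos", None, None),
]


def parse_version(text: str) -> Version:
    """Turn a dotted version string such as '9.0.4' into (9, 0, 4)."""
    return tuple(int(part) for part in text.split("."))


def _fit(version: Version, width: int) -> Version:
    """Pad with zeros or truncate so the version has exactly `width` parts."""
    return (version + (0,) * width)[:width]


def candidate_emulators(family: str, version: Optional[str] = None) -> List[str]:
    """Return every listed emulator whose range covers the requested system."""
    wanted = parse_version(version) if version else None
    matches = []
    for emu in EMULATORS:
        if emu.family != family:
            continue
        if emu.low is None or emu.high is None or wanted is None:
            # No version range to check (e.g. DOSBox for MS-DOS).
            matches.append(emu.name)
            continue
        width = max(len(wanted), len(emu.low))
        above_low = _fit(wanted, width) >= _fit(emu.low, width)
        # Assumption: compare only as many parts as the upper bound states,
        # so a bound of 10.4 also admits 10.4.x point releases.
        below_high = _fit(wanted, len(emu.high)) <= emu.high
        if above_low and below_high:
            matches.append(emu.name)
    return matches


if __name__ == "__main__":
    for fam, ver in [("mac", "8.0"), ("mac", "9"), ("mac", "9.2.2"), ("dos", None)]:
        print(fam, ver, candidate_emulators(fam, ver))
```

Because the listed ranges overlap, a request can return more than one emulator, and a version that falls between ranges (9.1, for instance) returns none. Which tool is actually used for a given file is a decision for reading room staff, not something this table settles.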

Practically speaking, the availability of these emulation tools means that thousands of previously inaccessible files are now available to researchers in the Manuscript Division’s reading room. Fedoroff, for instance, initially used paper index cards to document hybridization of “jumping genes” in corn. Around 1988, she abandoned those paper index cards and began to use a Mac-based HyperCard program called HyperMaize to document this information. With the new workstation, researchers can now access and interact with more than 1,000 HyperMaize files in her papers.

Black, white, and gray image of a single HyperMaize card, showing multiple fields documenting hybridized corn genetic data.
Fig. 1. Emulated HyperMaize Card. Digital ID: mss85579_042_006_disk_image_ver01, HyperMaize/Ear information/EAR CARDS, June 15, 1995, 7:13 am, Nina V. Fedoroff Papers, Manuscript Division, Library of Congress, Washington, D.C.

Similarly, the papers of chaos theorist Edward N. Lorenz contain several DOS-based data visualization programs created for Lorenz attractor data. Without the DOSBox emulator, researchers could only look at the program and data files as text files (Fig. 2) or watch a screen capture of the data visualization running (Fig. 3). Now, researchers can interact with the software by changing inputs or even using their own data with the program.

Two side-by-side images. The left-hand image shows a black and white directory listing of the files for the data visualization. The file dates span from 1990 to 1992. The right-hand image shows a screen capture of the visualization program running. The image has a black background with fine blue lines forming the distinctive butterfly shape.
Left: Fig. 2. File list for chaos water wheel. Right: Fig. 3. Still image of chaos water wheel visualization. Digital ID: mss85426_060_003, “Chaos Water Wheel,” software by Page, Mike & Jim Holsapple, 1990-1992. Edward N. Lorenz Papers, Manuscript Division, Library of Congress, Washington, D.C.

The HyperMaize cards and the Chaos Water Wheel software are examples of file formats that are completely inaccessible without emulation. Many more files are only partially accessible using modern software, but emulation enables us to access the full information and formatting of a file. The example below, again from the Fedoroff Papers, shows an image rendered with modern software on the left and a fully emulated version on the right.

Two side by side images. The left hand image shows black rectangles, arrows, and text on a grey background. The image takes up the top half of the page with the remaining space blank. The right hand image is in full color with a bright blue background and light blue, purple, orange, green, and red rectangles. The image on the right has additional information not visible on the left hand image, including an extra row of rectangles and title information.
Left: Fig. 4. 01.Ds transposon slide with modern file viewer. Right: Fig. 5. 01.Ds transposon slide. Emulated with SheepShaver in MacOS 9 and opened in MacDraw. Digital ID: mss85579_042_021_ver01\data\Slides 9_96\Transposon tagging\01.Ds transposon slide-Preservation. Nina V. Fedoroff Papers, Manuscript Division, Library of Congress, Washington, D.C.

The newly installed tools on the workstation focus on supporting access by allowing researchers to view and copy files and extract metadata. Researchers do not have permission to copy software. Soon, additional tools to support digital humanities research methods will be installed, including tools for text analysis, topic modelling, and data visualization.

The workstation represents the fruition of the 2020 Staff Innovator project, Born Digital Access Now!, a 120-day detail focused on investigating barriers to born-digital access. The project enabled staff innovators Chad Conrady and Kathleen O’Neill, senior archives specialists in the Manuscript Division, to identify existing tools and technology to support access to the full range of the division’s born-digital holdings. The Manuscript Division would like to thank the Digital Strategies Directorate and LC Labs for creating, supporting, and hosting the Staff Innovator project and, in particular, Eileen Jakeway, for her tenacity and creativity as project lead. The project was essential for the Manuscript Division to reach its goal of improving access to and engaging researchers with born-digital material.

The Manuscript Division looks forward to welcoming researchers interested in working with our born-digital collections. The division holds more than 120 collections containing born-digital files, which have been processed and described, with new finding aids added monthly. Only processed, open collections are accessible. To find out more about what materials are currently available, please browse or search the division’s finding aids or ask a librarian in the Manuscript Division. When searching the Finding Aids database, type the search term “digital ID” in the search box; select “All Words” from the drop down menu; and limit the search to “Manuscript” in the Collections drop down list. You may also directly call (202) 707-5387 to set up an appointment to use the workstation in the reading room. Please share widely!
