Most of the conversations I end up in about digital preservation are about the digital versions of analog things. Discussions of documents, still and moving images and audio recordings are important, but as difficult as the problems surrounding these kinds of digital objects are, there is a harder problem: preserving executable content, aka software. Software isn’t simply what we use to render content; it is an important form of creative expression, a cultural artifact, an important commodity and an entity that is increasingly enmeshed in our economic, political and social systems.
I thought I would start a quick list here of a few readings on preserving software that I think are worth your time. Some of these are posts from our blog, but most are papers and reports that do a nice job of getting into the issues facing those interested in preserving software and some of the ways folks are going about it.
Please consider adding and reacting to these with:
- Additional papers or readings on the topic and similar brief descriptions
- Reactions and comments you have to these readings and any added readings
The Life-Saving Software Reference Library: This interview I did with Doug White from NIST goes into considerable detail on the structure and design of NIST’s software library, which he describes as a library of software, a database of metadata, a NIST publication and a research environment. Here is a bit of how Doug explained it: “The research environment allows NSRL to collaborate with researchers who wish to access the contents of the virtual library. Researchers may perform tasks on the NSRL isolated network that involve access to the copies of media, to individual files, or to snapshots of software installations. In addition to the media copies, NSRL has compiled a corpus of the 25,000,000 unique files found on the media, and examples of software installation and execution in virtual machines.”
The Geeks Who Saved Prince of Persia’s Source Code From Digital Death: This is the most fun story on this list. Be sure to follow the dramatic events as the original source code for the Apple II version of Prince of Persia makes its way off its original media and up onto GitHub.
Toward a Library of Virtual Machines: Insights interview with Vasanth Bala and Mahadev Satyanarayanan: This interview goes into some depth on the design of the Olive Library project. One quote is particularly salient on the potential importance of software preservation: “as all fields of scientific investigations rely on complex simulation and visualization software, the ability to archive these software artifacts in executable form becomes essential for reproducibility of scientific results. Software preservation also enables long term data preservation. Today’s data formats may become obsolete tomorrow, unless the software applications that process those formats are also preserved.”
Emulation: From Digital Artefact to Remotely Rendered Environments: Dirk von Suchodoletz and Jeffrey van der Hoeven (2009). While not directly focused on software preservation, a section of the paper articulates some of the needs for, and various problems in, constituting software archives. Here is a valuable quote: “the original software also needs to be preserved if digital objects are to be kept alive via emulation. Guidelines similar to those created for digital objects themselves must be brought to bear in order to safeguard emulators, operating systems, applications and utilities. That is, software should be stored under the same conditions as other digital objects by preserving them in a OAIS-based (ISO 14721:2003) digital archive.”
Preserving Virtual Worlds Final Report: McDonough, J., Olendorf, R., Kirschenbaum, M., Kraus, K., Reside, D., Donahue, R., Phelps, A., Egert, C., Lowood, H., & Rojo, S. (2010). At 187 pages, this is more of a book than an essay, but it’s full of valuable exploration and discussion of the various issues, problems, and opportunities around preserving video games.
The Attic & the Parlor: Notes from a Workshop on Software Collection, Preservation & Access: The Computer History Museum’s Software Preservation Group hosted what looked to be a fascinating workshop in 2006. You can find the proceedings and presentations online, and their wiki also includes a rather extensive directory of software collections. The Attic & Parlor notion in the title refers to a distinction between highly curated collections and sprawling “gather it all up” collections. This workshop, like the Preserving Virtual Worlds report, focuses on the value of collecting source code.
What should we collect to preserve the history of software? Shustek, L. (2006). IEEE Annals of the History of Computing, 28(4), 112 – 111. Another strong argument for preserving source code: “I argue that unless we collect, preserve, and interpret the software code in addition to the related artifacts, we have discarded the software’s intellectual essence. Emphasizing collateral materials puts the focus on the history of products and downplays the development of the scientific and engineering accomplishments that underlie them.”
Preserving Software: Why and How: John G. Zabolitzky, Iterations: An Interdisciplinary Journal of Software History 1 (September 13, 2002): 1-8. Zabolitzky makes an impassioned argument for urgent action on software preservation and similarly appeals for the preservation of original source code: “the evolution of software methods, techniques, styles, etc., is described in many books and articles. However, all of that is essentially hearsay: what actually has been done (and what may be different from what the active players in this area may report since they might have wished to do something different) can only be discerned and proven by examining the source code. The source code of any piece of software is the only original, the only artifact containing the full information. Everything else is an inferior copy.”
What essential readings would you add to a list like this? Please consider taking a moment to add them in the comments. Also, feel free to use this comment thread as a place to discuss the various ideas and approaches advocated in these readings.