Preserving.exe: A Short List of Readings on Software Preservation

Most of the conversations I end up in about digital preservation are about the digital versions of analog things. Discussions of documents, still and moving images, and audio recordings are important, but as difficult as the problems surrounding these kinds of digital objects are, there is a harder problem: preserving executable content, aka software. Software isn’t simply what we use to render content; it is an important form of creative expression, a cultural artifact, an important commodity and an entity that is increasingly enmeshed in our economic, political and social systems.

I thought I would start a quick list here of a few nice reads on preserving software. Some of these are posts from our blog, but most are papers and reports that I think do a nice job of getting into the issues facing those interested in preserving software and some of the ways folks are going about it.

Please consider adding and reacting to these with:

  1. Additional papers or readings on the topic and similar brief descriptions
  2. Reactions and comments you have to these readings and any added readings

The Life-Saving Software Reference Library: This interview I did with Doug White from NIST goes into considerable detail on the structure and design of NIST’s software library, which he describes as a library of software, a database of metadata, a NIST publication and a research environment. Here is a bit of how Doug explained it: “The research environment allows NSRL to collaborate with researchers who wish to access the contents of the virtual library. Researchers may perform tasks on the NSRL isolated network that involve access to the copies of media, to individual files, or to “snapshots” of software installations. In addition to the media copies, NSRL has compiled a corpus of the 25,000,000 unique files found on the media, and examples of software installation and execution in virtual machines.”

Diagram of the NSRL workflow and work products

The Geeks Who Saved Prince of Persia’s Source Code From Digital Death: This is the most fun story of any of those in this list. Be sure to follow the dramatic events as the original source code for the Apple II version of Prince of Persia makes its way off its original media and up onto GitHub.

Toward a Library of Virtual Machines: Insights interview with Vasanth Bala and Mahadev Satyanarayanan: This interview goes into some depth on the design of the Olive Library project. One quote is particularly salient on the potential importance of software preservation: “as all fields of scientific investigations rely on complex simulation and visualization software, the ability to archive these software artifacts in executable form becomes essential for reproducibility of scientific results. Software preservation also enables long term data preservation. Today’s data formats may become obsolete tomorrow, unless the software applications that process those formats are also preserved”

Emulation: From Digital Artefact to Remotely Rendered Environments: Dirk von Suchodoletz, Jeffrey van der Hoeven, 2009. While not directly focused on software preservation, a section of the paper articulates some of the needs for, and various problems in, constituting software archives. Here is a valuable quote: “the original software also needs to be preserved if digital objects are to be kept alive via emulation. Guidelines similar to those created for digital objects themselves must be brought to bear in order to safeguard emulators, operating systems, applications and utilities. That is, software should be stored under the same conditions as other digital objects by preserving them in a OAIS-based (ISO 14721:2003) digital archive.”

Preserving Virtual Worlds Final Report: McDonough, J., Olendorf, R., Kirschenbaum, M., Kraus, K., Reside, D., Donahue, R., Phelps, A., Egert, C., Lowood, H., & Rojo, S. (2010). At 187 pages, this is more of a book than an essay, but it’s full of valuable exploration and discussion of the various issues, problems, and opportunities around preserving video games.

The Attic & the Parlor: Notes from a Workshop on Software Collection, Preservation & Access: The Computer History Museum’s Software Preservation Group hosted what looked to be a fascinating workshop in 2006. You can find the proceedings and presentations online, and their wiki also includes a rather extensive directory of software collections. The Attic & Parlor notion in the title refers to a distinction between highly curated collections and sprawling “gather it all up” collections. This workshop, like the Preserving Virtual Worlds report, focuses on the value of collecting source code.

What should we collect to preserve the history of software? Shustek, L. (2006). IEEE Annals of the History of Computing, 28(4), 110–112. Another strong argument for preserving source code. “I argue that unless we collect, preserve, and interpret the software code in addition to the related artifacts, we have discarded the software’s intellectual essence. Emphasizing collateral materials puts the focus on the history of products and downplays the development of the scientific and engineering accomplishments that underlie them.”

Preserving Software: Why and How: John G. Zabolitzky, Iterations: An Interdisciplinary Journal of Software History 1 (September 13, 2002): 1-8. Zabolitzky makes an impassioned argument for urgent action on software preservation and similarly makes an appeal for the preservation of original source code. “the evolution of software methods, techniques, styles, etc., is described in many books and articles. However, all of that is essentially hearsay: what actually has been done (and what may be different from what the active players in this area may report since they might have wished to do something different) can only be discerned and proven by examining the source code. The source code of any piece of software is the only original, the only artifact containing the full information. Everything else is an inferior copy.”

What essential readings would you add to a list like this? Please consider taking a moment to add them in the comments. Also, feel free to use this comment thread as a place to discuss the various ideas and approaches advocated for in these readings.

4 Comments

  1. Paul Wheatley
    November 16, 2012 at 11:34 am

    There are lots of excellent software preservation resources available on the Software Sustainability Institute website that are also worth checking out.

  2. Bill LeFurgy
    November 16, 2012 at 12:04 pm

    The Computer History Museum/Software Preservation Group “is exploring how to collect software in support of the museum’s overall mission,” http://www.softwarepreservation.org/. There is lots of info on C++, LISP, FORTRAN, etc. Doesn’t look like it’s been especially active in the last couple of years, however.

    A Framework for Software Preservation, Brian Matthews, Arif Shaon, Juan Bicarregui and Catherine Jones, The International Journal of Digital Curation, Issue 1, Volume 5 | 2010 is also useful. http://www.ijdc.net/index.php/ijdc/article/view/148/210

  3. Charles W. Bailey, Jr.
    November 17, 2012 at 11:08 am

    Barwick, Joanna, James Dearnley, and Adrienne Muir. “Playing Games with Cultural Heritage: A Comparative Case Study Analysis of the Current Status of Digital Game Preservation.” Games and Culture 6, no. 4 (2011): 373-390.

    Gooding, Paul, and Melissa Terras. “‘Grand Theft Archive’: A Quantitative Analysis of the State of Computer Game Preservation.” International Journal of Digital Curation 3, no. 2 (2008). http://www.ijdc.net/index.php/ijdc/article/view/85

    McDonough, Jerome P. “Packaging Videogames for Long-Term Preservation: Integrating FRBR and the OAIS Reference Model.” Journal of the American Society for Information Science and Technology 62, no. 1 (2011): 171-184.

    Winget, Megan A. “Videogame Preservation and Massively Multiplayer Online Role-Playing Games: A Review of the Literature.” Journal of the American Society for Information Science and Technology 62, no. 10 (2011): 1869-1883.

  4. Euan Cochrane
    November 18, 2012 at 12:27 am

    Thanks for the great post. Software preservation is essential to a robust emulation-based digital preservation strategy, so it’s great to see the progress being made in this space.

    Another group that readers might be interested in is the Play it Again team led by Melanie Swalwell at Flinders University in Australia. The team also includes many New Zealand members.
    They are working to produce a database of Australasian heritage software and to preserve games and other software from Australia and New Zealand’s past.

    The Internet Archive is also doing quite a lot with software preservation. I have heard anecdotally that they are also trying to address commercial (rights-constrained) software, though this is of course a challenge due to licensing issues.
