For a while now, Stanford University's special collections have had the distinct honor of holding one of the largest historical collections of interactive software in the world. The Stephen M. Cabrinety Collection in the History of Microcomputing at Stanford University consists of several thousand pieces of computer hardware and software.
At a recent advisory board meeting for the ongoing Preserving Virtual Worlds project, I was thrilled to hear from Henry Lowood, Curator for History of Science & Technology Collections and Film & Media Collections in the Stanford University Libraries, that Stanford had entered into a partnership with NIST's National Software Reference Library (NSRL) to disk image and digitize related materials for a large portion of the collection. I'm delighted to have this opportunity to discuss the project here with Henry.
Trevor: First, could you tell us a bit about the Cabrinety Collection? What's the backstory? What is its scale in terms of numbers of items, date ranges, types of items, etc.?
Henry: The Stephen M. Cabrinety Collection in the History of Microcomputing was a gift to Stanford University. Stephen Cabrinety was a collector who began a very systematic effort to document the history of videogames while a teenager. Documents in the collection tell the story of his remarkably prescient vision of a historical collection or museum of the history of microcomputing. He attended Stanford briefly, stopped out and unfortunately passed away at a much too early age. His family then was faced with the decision about what to do with his collection. His sister contacted Stanford based on her discovery of our Silicon Valley Archives while searching on the web, and we negotiated the terms of the transfer by gift in 1998.
Note that this is a history of microcomputing collection, not a game collection per se, though probably more than 80 percent of the collection is console and computer games or related forms of interactive entertainment or education. The collection covers the period from the Magnavox Odyssey (1972) to just before DOOM (1993); it includes a substantial portion of the microcomputer (including console) software produced in this period. We do not have an exact count for various reasons, but the number of software titles is in the neighborhood of 12,000 to 15,000, plus more than 70 platforms, other hardware, books, magazines, ephemera, and archival documentation.
Trevor: Could you tell us about a few of the items in the collection that you think are particularly special, exciting and unique?
Henry: I can answer this question from several different perspectives. From a personal perspective, nearly every box I open from the collection catches my eye with an interesting strategy title or historical simulation: Avalon Hill games like Nukewar, Chris Crawford's early games, or the amazing run of games from SSI during the 1980s. Another criterion might be rarity, and we find that games like Ultima: Escape from Mt. Drash or hardware like a Magnavox Odyssey in the original (unopened) box are in the collection. Then of course there are the many classics in pristine condition from Atari, Electronic Arts, Nintendo, and so on, as well as versions of productivity software such as VisiCalc (even a dealer demo) and WordStar.
Trevor: Could you tell us about the partnership with NIST? What is the plan? What is Stanford doing and getting out of the relationship and what is NIST getting?
Henry: The plan is to image as much of the software collection as possible and add the images and hashes to the NSRL database. In a nutshell, we will move through the collection beginning with the items on relatively familiar, more recent formats (e.g., CD-ROM, 3.5-inch floppy diskettes, etc.) and ending with now unfamiliar formats from the 1970s, such as data on audio cassettes. NSRL will capture disk images from original media, along with photographic images of media, boxes and box inserts (manuals, registration cards, etc.); Stanford will provide the software, manage metadata, carry out quality control, and archive the images in our digital repository. We will be able to compile success rates for data capture from the various media formats, and finally, we will contact rights owners to request permission to provide access to items in the collection for which they hold copyright.
Trevor: You guys are going big on disk images here. Do you imagine keeping these as is or do you imagine migrating them in the future? Do you see disk images as a preservation format?
Henry: First, I should say that I am using the term "disk image" in a very broad sense to include all formats in the collection. This stretches the meaning quite a bit for cases such as cassettes or Atari 2600 cartridges. I do think that a case can be made for disk images as a preservation format, especially in combination with the NSRL approach to providing hashes based on verified copies of the original media. So, we will end up with images that have a good provenance trail, can be audited, and can serve as baselines for future data preservation activities. I am more skeptical about disk images as the final mechanism for delivery of the software to collection users; however, first things first: we need to pull the data off these media, which in many cases are well past their use-by dates. As your question suggests, it is likely that there will be other, derivative formats created from the images for delivery to patrons. I could imagine, for example, that delivery might involve installed, executable versions of software, perhaps in conjunction with environments under development elsewhere, such as the Olive Executable Archive. Clearly, much work, both technical and legal, lies ahead before we can deliver the content.
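To make the idea of an auditable baseline concrete, here is a minimal sketch of how a stored image might be checked against hashes recorded at capture time. It is an illustration only, not NSRL's or Stanford's actual workflow: the function names `hash_image` and `audit_image`, the choice of MD5, SHA-1 and SHA-256, and the file name in the usage note are all assumptions.

```python
import hashlib


def hash_image(path, algorithms=("md5", "sha1", "sha256"), chunk_size=1 << 20):
    """Return {algorithm: hex digest} for the file at `path`.

    The file is read in chunks so large images need not fit in memory.
    The algorithm list is an assumption made for this sketch.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as image:
        while chunk := image.read(chunk_size):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def audit_image(path, baseline):
    """Compare the file's current digests with a recorded baseline.

    `baseline` is a dict like the one returned by hash_image at capture
    time. Returns {algorithm: True/False}; any False means the stored
    image no longer matches what was originally captured.
    """
    current = hash_image(path, algorithms=tuple(baseline))
    return {name: current[name] == expected for name, expected in baseline.items()}
```

In use, the baseline would be computed once when the image is captured (for example `baseline = hash_image("title_0001.img")`, where the file name is hypothetical) and stored alongside the image's metadata; a later call to `audit_image("title_0001.img", baseline)` then reports whether the copy in the repository is still bit-for-bit identical.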
Trevor: What is your plan for storing and managing the resulting digital collection over the long term?
Henry: The collection will be stored and managed in the Stanford Digital Repository. As a collections curator, I believe the important next step for us will be attention to item-level descriptive metadata, which in turn will require work on metadata schemas, ontologies, terminology, etc. I expect that will be the next big area for game preservation; following the technical work that has been done on data capture and emulation, metadata for search, discovery, citation, etc. is the missing link. We are tossing around some ideas about how to address metadata and are opening up the conversation with other institutions around the world with similar interests, such as the National Videogame Archive in the UK or the Game Archive Project group at Ritsumeikan University in Japan.
Trevor: Aside from the value of being able to preserve the games, you had mentioned that you were excited about the project producing a data set on disk media failures. Could you explain that value?
Henry: Keep in mind that nearly all of the Cabrinety collection came to us in pristine condition. We took off a lot of shrink-wrap! This means that we have a collection of virtually unused, original media from the 1970s through the early 1990s. By tracking the success rates of our efforts to pull data off these media, we will produce a unique data set relating to their viable lifetimes. This will not be a laboratory experiment with conclusions about media longevity based on tossing a disk into a microwave or stomping on it to simulate aging; it will be real-world data. I am really not sure what to expect. After all, many of the items are over thirty years old.
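As a rough illustration of the kind of tally this tracking implies, the sketch below computes a per-format success rate from a log of capture attempts. The record layout (a media format paired with a success flag), the function name `success_rates`, and the sample records are all hypothetical placeholders; they are not data from the project.

```python
from collections import defaultdict


def success_rates(attempts):
    """Compute the share of successful captures for each media format.

    `attempts` is an iterable of (media_format, succeeded) pairs, a
    record layout assumed for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])  # format -> [successes, attempts]
    for media_format, succeeded in attempts:
        totals[media_format][1] += 1
        if succeeded:
            totals[media_format][0] += 1
    return {fmt: ok / count for fmt, (ok, count) in totals.items()}


if __name__ == "__main__":
    # Placeholder records for illustration only; not real project results.
    sample_log = [
        ("3.5-inch floppy", True),
        ("3.5-inch floppy", False),
        ("audio cassette", True),
    ]
    for fmt, rate in success_rates(sample_log).items():
        print(f"{fmt}: {rate:.0%} of attempts succeeded")
```

A fuller log would presumably also record each item's age and the reason a capture failed, so that failure rates could be related to the viable lifetimes Henry describes.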
Trevor: What is your ultimate vision for what you will do with the disk images of the games? Are you imagining making them available in a reading room? Or, do you have other ideas and plans for access?
Henry: I wish that the answer to this question could be expressed in terms of my vision or even in terms of what our current technology can offer. However, the reality is that intellectual property considerations will be a major constraint. As I mentioned earlier, we intend to meet this problem head-on by engaging in a conversation with rights holders and asking for permission to provide full access to data. My guess is that some publishers and developers will be fine with doing that, and thus we will have good options for providing access to a portion of the collection. In other cases, we will be restricted, and reading-room access will be as far as we can go. With orphaned titles and other situations that are less clear, we have not decided yet what we will do. Whatever the results, this project will give us a good opportunity to find out more about the willingness of the game industry to support game preservation activities, at least insofar as they are willing to grant permission to provide access to historical titles.