Tag and Release: Acquiring & Making Available Infinitely Reproducible Digital Objects

What does it mean to acquire something, like a set of animated GIFs, that is already widely available on the web? Archives and museums are often focused on acquiring, preserving and making accessible rare or unique documents, records, objects and artifacts. While someone might take a photo of an object, or reproduce it in any number of ways, the real object would reside in the institution. How does this perspective shift when we switch to working with rare and unique born-digital materials?

"I code therefore I am" by user circle_hk on Flickr.

Given that digital objects function on a completely different logic, where (for nearly all intents and purposes) any copy is as original as the original, "rare and unique" is a somewhat outmoded notion for digital material. Any accurate copy of a digital object is as much the object as the original. So, if it is trivial to create lots of copies of unique materials, how does that change what it means to acquire and make them available?

Consideration of Cooper Hewitt’s acquisition of the source code of an iPad application offers an opportunity to rethink some of what acquisition can mean for digital materials, and in the process rethink part of what the functions of cultural heritage organizations are and can be in this area. What follows are reflections largely inspired by thinking through Sebastian Chan and Aaron Straup Cope’s recent essay Collecting the present: digital code and collections and Doug Reside’s recent essay File Not Found: Rarity in the Age of Digital Plenty (pdf). Together, I think these two essays suggest a potential shift in thinking about digital artifacts: a shift away from a mindset of Acquire and Make Available to a mindset of Tag and Release. It may be that the best thing cultural heritage organizations can do with rare and unique born-digital materials is to make it so that they are no longer rare and unique at all, and to make it easy for anyone to interact with and validate copies of these materials. This is some formative thinking on the topic, so I look forward to discussing these issues in the comments with anyone interested.

Copy the Source, Let Others Copy the Source

In 2013 the Smithsonian Cooper-Hewitt National Design Museum acquired Planetary, an iPad application that creates visualizations of collections of music. In practice, this involved acquiring its source code and making that code available through the museum’s GitHub account. Note that the acquisition did not involve committing resources to ensure that people will be able to experience the application as users did on iPads. In fact, in that sense, the Planetary software is already obsolete, in that new versions of the iOS software will not run it.

However, by acquiring the source code under version control, along with all of the bug reports and tickets associated with its development, Cooper Hewitt is preserving and making available both the raw material for anyone to make use of and an extensive record of the design and development process. As Doug Reside, digital curator for the performing arts at the New York Public Library, recently suggested in an essay in Rare Books and Manuscripts, “the source code behind the program might be considered a manuscript.” In a case like this, where documentation of the entire history of the software’s development is present, the Planetary files might be better understood as an archive, a manuscript collection or, as they are textual in nature, even a documentary edition.

Each commit message with changes and edits to the source is itself a record of the production and creation of the software. In this vein, the acquisition, in a way, escapes the limitations of screen essentialism, i.e., privileging the single representation of a digital object on a screen as its essential form instead of respecting the myriad ways that digital objects manifest themselves. To this end, forgoing the complex issues of attempting to keep the software functional, and instead focusing on the ease of collecting the source code and representative documentary materials such as screen captures, will provide future users a base from which to understand and potentially recreate or expand the app.

Anyone can download the entirety of the Planetary acquisition. You can save it to your computer and you too will have, in a sense, acquired the application. That is, the copy of the “real” object on the shelf, or on Cooper-Hewitt’s servers, is no more or less authentic than any other copy of it. The digital objects that make up the acquisition are themselves infinitely, perfectly reproducible. As with the GeoCities special collection, anyone is welcome to do what they like with it: exhibit it, revise it, etc. So, what role does the museum as repository play in this case? By using GitHub to provide access to the source code and its history, Cooper Hewitt has put a stake in the ground and offered resources to steward the code, but this opens up a broader question about what it means to acquire something when anyone can have a perfect copy, indistinguishable from the original.

The Acquisition of a Sequence of Symbols

In Collecting the present: digital code and collections, Sebastian Chan and Aaron Straup Cope of the Cooper-Hewitt Design Museum offer a wealth of information contextualizing and explaining the acquisition of the Planetary app. Of particular relevance to the question of uniqueness and acquisition, they point to an even more symbolic acquisition: the Museum of Modern Art’s acquisition of the @ symbol.

In 2010, the Museum of Modern Art acquired the @ symbol. Not a representation of it, but the symbol itself. As Paola Antonelli, Senior Curator, Department of Architecture and Design, explains, the acquisition of @ “relies on the assumption that physical possession of an object as a requirement for an acquisition is no longer necessary, and therefore it sets curators free to tag the world and acknowledge things that ‘cannot be had’—because they are too big (buildings, Boeing 747’s, satellites), or because they are in the air and belong to everybody and to no one, like the @—as art objects befitting MoMA’s collection.” While the @ symbol is significantly more ethereal than a digital object, I think the story of this acquisition has some interesting lessons for thinking about acquiring digital materials which are infinitely and perfectly reproducible.

Software’s source code is much more concrete than the @ symbol. A program’s source code consists of a range of digital files. With that said, the acquisition of source code is functionally the acquisition of a sequence of symbols. The non-rivalrous nature of digital objects means that one organization having a copy of a file doesn’t in any way preclude another organization or individual from having exactly the same thing. The logic of the Planetary acquisition is one of pinning these digital objects down, providing some context, and making some commitments to ensuring access to the data. It is a logic of non-rivalrous acquisition: simply making a commitment to ensure long-term access to these materials.

Tag and Release

The idea of “tagging the world,” in Antonelli’s remarks about the acquisition of the @ symbol, can open up a fruitful way of thinking about digital acquisitions. As I’ve suggested before, I think it’s important for cultural heritage organizations to start letting go of the requirement to provide the definitive interface. Instead, cultural heritage organizations can focus more on selection and working to ensure long term preservation and integrity of data. The Planetary case pushes that idea even further. The Planetary acquisition includes a set of materials that document the experience of the application. They include things like screenshots and descriptions of how it functioned. While these assets offer a sense of what the experience of using the app was, the source code provides a rich set of materials for future users to use to understand how it worked and potentially reenact it.

Instead of wading into the complex issues of attempting to keep the software functional in perpetuity, Cooper Hewitt has acquired a copy of the app’s source code, made a commitment to ensure long-term access to the data, and made it available under the most liberal license it could. The curatorial function of selection, identifying digital objects that matter and should be preserved, persists without the need to be the only entity that “owns” the object.

In this scenario, the library, archive or museum identifies objects of significance (tagging them, in Antonelli’s terms) and then works to broker the right to collect and acquire records and other artifacts that document the object, and to provide as unrestricted access as possible to what it acquires. Just as a design museum might collect the blueprints for a building instead of collecting the building itself, an institution can collect the source code of a piece of software instead of, or alongside, a copy of the software in its executable form, and then work to make that material available in the broadest way possible. From there on, the institution serves to provide authentic copies and validate the authenticity of copies, while also providing provenance and context and ensuring ongoing preservation of an authentic copy.

The future of collecting and preserving born-digital special collections, collections of rare or unique materials like manuscripts, drafts and original source code, might upset some of the core ideas of custodianship. I think the best thing cultural heritage organizations can do with these rare and unique born-digital materials may be to make it so that they are no longer rare and unique at all. By making a set of unique and rare materials easy for anyone to see and copy, the institution can help ensure the broadest use of and access to the materials.

7 Comments

  1. Jon Ippolito
    July 8, 2014 at 1:50 pm

    Some great ideas here, Trevor, especially regarding the importance for contemporary institutions of sharing to preserve. (Richard Rinehart and I make the case for such “proliferative preservation” in the book Re-collection.)

    I like the way your “Tag and Release” dynamic frees curators from the obligation to replicate or replace native habitats for digital artifacts such as the Internet or App Store. Instead they can add value to them–in the form of metadata, promotion, and the like–and insert them back into circulation.

    My only qualm is with the metaphor of source code as manuscript. Rinehart argues that the better analogy is a score, which requires greater effort to interpret and perform.

    Don’t get me wrong: Chan and Cope have done a great service in collecting and distributing the source code for Planetary–less for rescuing a particular iPad app than for offering a preservation model that goes beyond storing executable files on a drive. I just believe forward-thinking museums need to be more than GitHub. They need to document the context of creation (why did the creators make it?) as well as the instruments necessary to achieve it (what are the iPad’s salient properties?).

    That way, when the work is resuscitated in future, we will be able to make tough choices like whether to emulate the “original instruments” or migrate to contemporary hardware that has become the vernacular.

    jon

  2. Gary McGath
    July 8, 2014 at 2:06 pm

    Should that be “Tag, sign, and release”? A digital signature helps to insure that a document is a true copy from a trusted source. People will sometimes want to modify documents for legitimate reasons, so they’ll lose the benefit of the signature, but that leaves them no worse off than before.

  3. Seb Chan
    July 8, 2014 at 2:37 pm

    @jonippolito: Agreed. There is a lot of extra contextual work that we need to do around Planetary that is ongoing. We did try to deal with some of this at acquisition by the collection (and release) of the ancillary documents in the Planetary Extras repository. We had to do this on Github because our collection management system wasn’t geared to handle this sort of object or the materials we needed to make available.

    I’d expect that at some point a book like Monfort & Bogost’s Racing The Beam will need to be written on the affordances of iOS (and hardware) and I’d hope that examples like Planetary in public collections have been collected in ways that are useful enough to be able to be used in such scholarship.

    Related to this, I found the release of the development team’s emails for iOS game Threes (http://asherv.com/threes/threemails/) to be quite interesting. Should a museum acquiring Threes in the future acquire these? Definitely. Again a collapsing of the boundaries between museum, library and archival practices?

  4. Trevor Owens
    July 8, 2014 at 3:18 pm

    @Gary Great point on “sign”! Yes, providing fixity information to validate authentic copies is probably the most important role in this story so it makes sense to make sure that is in the banner headline.

    @jonippolito: I like the music score comparison! Software is the performance of the compiled code, so it makes sense to think about how folks who work in preservation in the performing arts think about these issues. With that said, the manuscript comparison brings another element to this. A manuscript collection often includes a range of drafts of a work. So, this is the second of twenty drafts of Carl Sagan’s book Pale Blue Dot. When you look at the sequence of drafts you can see the history of the development of a work. This parallels what you get with a versioned source code library that includes things like comments.

    That said, we can also combine these two analogies and suggest that the versioned copies of software source code are like a manuscript collection of a composer or playwright. That is, you have substantive documentation and information about how the idea of the performance developed over time but you don’t necessarily have a rendering of an original performance.

  5. Doug Reside
    July 8, 2014 at 5:17 pm

    .@jon My manuscript metaphor which Trevor quotes was meant to connect the pre-published nature of the code with the pre-published nature of manuscripts/typescripts. Doesn’t work as well for scripts that are interpreted at run time as for compiled languages, though.

    @Trevor – The “release” part doesn’t quite fit for me as it seems to imply that we redistribute only once. Another analogy, equally imperfect but in different ways, might be the library as the sun. We keep shining out the data to everyone all the time to the point that they always know it’s there and can access it (and possibly trap and modify/reproduce it through solar panels or photosynthesis or whatever) but they know they can always count on there being a source copy at a predictable location.

  6. Trevor Owens
    July 9, 2014 at 4:01 pm

    @Doug, yeah I see what you are saying about release sounding like a one time thing which isn’t the right notion there. I mostly intended it to be a shift in perspective from tightly controlling and holding close precious and rare things to letting perfectly reproducible copies of them proliferate as freely as possible. That said, the long term value there is of course not just about access but about persistent access, so the term is imperfect in capturing that component.

  7. Kathi Inman Berens
    July 17, 2014 at 4:02 am

    I’m struck that conversation here pivots on metaphor. Is source code a script, a musical score or a manuscript collection of a composer or playwright? Is a cultural heritage organization that makes source code available the sun, or a fisherman tagging & releasing? As shorthands for parsing an object’s evolution or origin, metaphor’s work by proximation suggests how disruptive it is to crack open the curated object into code and performance of that code (and human interaction with both).

    As a literary critic examining the protean, emergent e-literature canon, I’m hearing in this post’s conversation the kind of echo-location techniques we’re using too to test whether fundamental assumptions are still valid. Does a literary canon, with its foundational dyad of continuity and rupture, give us suitable tools to examine computational literary works that are *always* a copy (every time it is called in the machine) and whose outputs are often irreproducible? For e-lit, obsolescence and ephemerality, not continuity and rupture, are the agon. The literary canon may be a relic of the print era.

    I love the idea of MOMA curating the @. But “tagging the world” replicates the heritage organization’s habit of looking at curation as centered on its own activity, when really the freedom is for guests, users, readers, interactors to do stuff with the curated objects, within or outside the auspices of the cultural steward.
