Fixity and Fluidity in Digital Preservation

Kent Anderson offers a provocative post, The Mirage of Fixity — Selling an Idea Before Understanding the Concept. Anderson takes Nicholas Carr to task for an article in the Wall Street Journal bemoaning the death of textual fixity. Here’s a quote from Carr:

Once digitized, a page of words loses its fixity. It can change every time it’s refreshed on a screen. A book page turns into something like a Web page, able to be revised endlessly after its initial uploading… [Beforehand] “typographical fixity” served as a cultural preservative. It helped to protect original documents from corruption, providing a more solid foundation for the writing of history. It established a reliable record of knowledge, aiding the spread of science.

Example of a file fixity error, by Wlef70, on Flickr

To my mind, Anderson does a good job demonstrating that not only is “file fluidity” a modern benefit of the digital age, it has long existed in the form of revised texts, different editions and even different interpretations of canonical works, including the Bible. Getting to the root of textual fixity, according to Anderson, means getting extremely specific–“almost to the level of the individual artifact and its reproductions.”

In the world of digital stewardship, file fixity is a very serious matter.  It’s regarded as critical to ensure that digital files are what they purport to be, principally through using checksum algorithms to verify that the exact digital structure of a file remains unchanged as it comes into and remains in preservation custody. The technology behind file fixity is discussed in an earlier post on this blog; a good description of current preservation fixity practices is outlined in another post.
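The checksum verification mentioned above can be sketched briefly. The following is a minimal illustration of the general technique, not a description of any particular repository's tooling; the function names and the choice of SHA-256 are illustrative assumptions:

```python
import hashlib

def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Compute a checksum by streaming the file in chunks,
    so even very large files can be verified in constant memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def fixity_check(path, recorded_checksum, algorithm="sha256"):
    """Recompute the file's checksum and compare it to the value
    recorded when the file entered preservation custody."""
    return file_checksum(path, algorithm) == recorded_checksum
```

At ingest the checksum is recorded alongside the file; on every later audit it is recomputed and compared, and any mismatch signals that the bits have changed.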

It is well and good to strive for file fixity in this context, and it is indeed “to the level of the individual artifact and its reproductions.” The question arises about the degree of fidelity that needs to be maintained with respect to the original look, feel and experience of a digital file or body of interrelated files. Viewing a particular set of files depends on a particular stack of hardware, software and contextual information, all of which will change over time. Ensuring access to preserved files is generally assumed to eventually require: 1) migrating to another format, which changes the file in some way by keeping some of its properties and discarding others, or 2) emulating the original computing environment.

Each has advantages and disadvantages, but the main issue comes down to the importance placed on the integrity of the original files. Euan Cochrane, in a comment on an earlier post on this blog, noted that “I think it is important to differentiate between preventable and non-preventable change. I believe that the vast majority of change in the digital world is preventable (e.g. by using emulation strategies instead of migration strategies).” He noted that the presumed higher cost of emulation works against it, even though we currently lack reliable economic models for preservation.

I wonder, however, if the larger issue is that culturally we are still struggling with the philosophical concepts of fixity and fluidity. Do we aim for the kind of substantive finality that Carr celebrates or do we embrace and accept an expanded degree of derivation–ideally documented as such–in our digital information?  Kari Kraus, in a comment on a blog post last week, put the question a different way:

[Significant properties] are designed to help us adopt preservation strategies that will ensure the longevity of some properties and not others. But if we concede that all properties are potentially significant within some contexts, at some time, for some audiences, then we are forced into a preservation stance that brooks no loss. What to do?

Ultimately I think wider social convention will determine the matter.  Until then it makes good sense to continue to explore all the options open to us for digital preservation.

12 Comments

  1. Ulrich Tibaut Houzanme
    October 31, 2012 at 1:58 pm

    Interesting article, Bill, that does a good summary of dilemmas and furthers questions worth asking.
    To me, the level of fixity (100-0%?) to maintain throughout the digital object’s life-cycle (including long-term preservation and the transformations occurring particularly at this stage) will have to draw directly on the initial appraisal recommendation, which, as part of the preservation metadata, should follow the object ad infinitum or for as long as the record is preserved.
    Subsequent decisions to reformat should refer to the appraisal recommendation, and if a need to deviate arises, a notation should be added to the initial decision along with an explanation as to why. So on and so forth. This does seem a bit process- and documentation-heavy, so the expectation would be that it becomes automated for better performance.

    For now, initial appraisal of electronic records is yet to be established and conducted.

    Tibaut Houzanme

  2. Bill LeFurgy
    October 31, 2012 at 2:07 pm

    Tibaut: Good point about keeping a continuous record of both intention and action regarding the files in question. Users will want to know how (and why) files came to be presented as they are.

    –Bill

  3. Andrew Wilson
    November 1, 2012 at 1:00 am

    Great post Bill. I do wonder (without intending to be offensive) if the focus on “look and feel” and keeping digital objects completely unchanged for ever is something that concerns the library community more than many others (such as the archival community). It’s possibly to do with treating bibliographic resources as artefacts, which seems to me to dominate the management of physical resources.

    It was observed in 2001 by Su-Sing Chen, then Professor of computer engineering and computer science at the University of Missouri-Columbia, that “our digital environment has fundamentally changed our concept of preservation requirements. If we hold on to digital information without modifications, accessing the information will become increasingly difficult, if not impossible.” [http://goo.gl/wCTGu]

    As an archivist I’m comfortable with the idea that to preserve digital records over long periods of time (forever, in theory) we might have to change aspects of the original to ensure that the information the record was intended to convey remains accessible, useable and understandable. Our responsibility is to ensure that all changes that happen to the record and its metadata are recorded and, themselves, kept permanently so there can be no doubt about what (if anything) has changed over time. Archivally, this is part of the process of ensuring authenticity, which is about a lot more than data integrity.

    I agree wholeheartedly with the view expressed by Kari Kraus, which you quote in the post, and made the same point myself in a keynote talk I gave at a significant properties workshop at the British Library in April 2008. In such a situation where what is seen to be significant is a function of the community doing the looking all we can do is our best to ensure that there is as little change as possible over time, and what change there is is documented fully so future users will be able to know what has happened to the record.

    Although this might be read as advocating emulation as the only possible preservation strategy, that is not what I’m saying. I am not as convinced as Euan that emulation is as successful or as viable in most contexts as he thinks. I would not deny that it has its place in digital preservation strategies as one element in a mix of preservation approaches but the idea of putting all one’s archival eggs in the emulation basket is not a prospect I would consider.

  4. David Underdown
    November 1, 2012 at 4:57 am

    An interesting example of how even the printed word may not be quite so fixed can be found in the 2004 (and subsequent) setting of the 1662 Book of Common Prayer from Cambridge University Press. The process leading to its creation is described in http://www.cambridge.org/bibles/bcp/article.htm – while they were able to ensure that all major elements began and ended on the same pages (i.e. page numbering) as in previous settings, page breaks may occur at different points in the text. This makes the book “backward compatible” – the clergy taking a service can give out the same page number for a psalm or whatever and everyone in the congregation will be able to find it whether they have a new or old copy, but they may notice that people are turning pages at different times.

  5. David Underdown
    November 1, 2012 at 5:00 am

    On reading through it again, I notice that the article I referred to previously also makes an interesting reference to how having a digital version of the text made adjusting the layout much more straightforward.

  6. kiyohisa Tanada
    November 1, 2012 at 7:08 am

    It is a splendid thing to save things digitally.
    If we save materials together with the present-day thought and sense of values around them,
    I think we may help future “historical verisimilitude research” into that knowledge.

  7. Bill LeFurgy
    November 2, 2012 at 12:20 pm

    Andrew: Great comments, thanks. I won’t quibble with your statement comparing the relative concerns of the archival and library community in terms of treating resources as artifacts. As a current policy matter, I also wholeheartedly endorse “all we can do is our best to ensure that there is as little change as possible over time, and what change there is is documented fully so future users will be able to know what has happened to the record.” But, waxing philosophical, I wonder if our culture is moving in a direction where overall concern about the primacy of information fixity (and, dare I say, authenticity) is declining. The baseball manager Casey Stengel was fond of saying “you can look it up” after uttering a supposedly factual statement. The problem now is that it’s hard to look anything up and get a final answer–the internet has made everything “too big to know,” as David Weinberger says. Increasingly we settle for contextually close-enough evidence. While our current principles are solid, and we are obligated to carry them forward, I do wonder what future users really expect from archives and libraries.

  8. Andrew Wilson
    November 4, 2012 at 10:13 pm

    Thanks for the reply Bill. I think I share your view about the way society is moving in regards to fixity – that’s why I have no problem with the idea of changing file formats as a preservation approach.

    I think authenticity is a little different. When archivists use the term authenticity they do not mean the usual dictionary definition, which is bound up with ideas of truth and sincerity. For archivists, authenticity is about other things than just fixity, i.e. reliability and useability as well. The core, critical thing to ensure these characteristics is metadata which documents all the contextual, technical, and administrative information needed to use records over time. So, I agree with you that society generally may be moving away from the notion of authenticity inherent in the usual meaning of the word. However, information without metadata is mostly data with little meaning, so I don’t think I could agree that in the archival domain authenticity will become less important or necessary.

  9. Paul Wheatley
    November 6, 2012 at 6:42 am

    Loads of interesting thoughts here, but I’m tempted to play devil’s advocate to Andrew’s position (not for the first time ;-)).

    If we’re going to accept some changes to our digital files as we preserve them, how are we going to ensure that: “all changes that happen to the record and its metadata are recorded and, themselves, kept permanently so there can be no doubt about what (if anything) has changed over time”? The process of identifying change and capturing it (not to mention ensuring that change hasn’t happened in the wrong places) is incredibly difficult. Perhaps one of the biggest challenges in this field. We don’t have the tools to do it, and I’m unsure of where they will come from. If this process just isn’t practical, it may actually be easier to maintain access to a record identical in bits to the original. And perhaps a record that *looks* identical to the original, enabling reasonably straightforward automatic comparison of then and now renderings.

    QA came up as a top practitioner need from our SPRUCE work, as I outlined here.

  10. Bill LeFurgy
    November 6, 2012 at 12:36 pm

    Paul: I agree with you, much as I sympathize with Andrew in a rhetorical sense. As our digital collecting continues to gallop ahead, it’s increasingly difficult to mind files in anything approaching an itemized, fine-grain manner (barring miraculous advances in AI). This is an area in the digital preservation/curation/stewardship field that is unsettled: how much is enough in terms of properly managing collections? I sometimes wonder if we are looking at the issue from the right end of the telescope. The Blue Ribbon Task Force on Sustainable Digital Preservation and Access strongly asserts that use drives preservation. If we accept that, then we ought to seriously consider what users want, now and in the future. We’ve long assumed that issues of trust are critical for users–but is that really the major consideration? Maybe. Or maybe users especially care about issues such as collection scope, availability, completeness, etc. Given that stewardship resources are finite, it’s important to make sure our priorities are oriented correctly.

  11. Euan Cochrane
    November 12, 2012 at 12:46 am

    As I’ve said in the past, I’m worried about a tendency to presume that more change is unpreventable than is actually the case.

    Andrew quoted Su-Sing Chen saying that accessing information will become difficult or impossible if we hold onto it without any changes. He is right, of course; we do need some change. However, I would argue that in most cases solutions to that problem based around a continual migration approach introduce excessive, unnecessary and expensive changes when compared to emulation-based solutions. Emulation-based solutions require little in the way of change to anything over time, aside from migrating the emulator software or the emulator module assembler software. Continual migration, on the other hand, introduces changes to every file, over and over, over time.

    But more fundamentally I believe Su-Sing is simply wrong if interpreted as saying we need to change the files that we get given for preservation in order to maintain access to them. We don’t need to. We need to if we assume the only option for maintaining access is to migrate content from files over time. However that is simply not the case and I suspect is actually going to turn out to be a more expensive option over time than alternatives such as emulation based strategies.

    We have fairly strong circumstantial evidence to show that migration for most types of content is extremely problematic, often ineffective, and almost impossible to verify. It’s not just a case of the “look and feel” not being maintained post migration; fundamental content is often lost when migration actions are performed. I’d go as far as to say that I have more confidence in using emulation strategies to preserve content than migration strategies. Emulators and related tools are well established and used on a regular basis throughout many industries. The business model of http://www.gog.com relies on emulators. Mobile phone development relies on emulators. To me it is not a case of putting all our eggs into one basket; it’s a case of finding any basket that actually holds our eggs. In this case, a migration strategy is not that basket. For long term digital preservation I really believe that an emulation-based strategy is likely to be the only one that will actually achieve the goal of preserving the fundamental content. It’s not that it’s better than a migration-based strategy; it’s that it works. And migration-based strategies, more often than not, do not, or can’t be proven to have worked.

    In addition to Su-Sing being wrong (if I have interpreted him correctly), I’d say he’s (unintentionally, I’m sure) letting down the whole digital preservation community by just rolling over and assuming there is nothing we can do about change over time. Yes, there may be a trend toward reluctantly accepting change in content due to constant technological change, but that doesn’t mean the digital preservation community should be replicating that. We should be leading the revolt against it from a cultural heritage perspective. Imagine if we only had the King James Version of the Bible? Or Miley Cyrus’s cover of Nirvana’s seminal song “Smells Like Teen Spirit”? We would not have the same content and wouldn’t/shouldn’t be able to claim that we did. That is the potential outcome of making such assumptions about our options in the digital preservation space. To me that would be a sad world to be living in.

  12. Andrew Wilson
    November 12, 2012 at 1:20 am

    Paul, QA is the elephant in the room of course. And I couldn’t agree more that post-preservation QA is un-doable at the moment, in any meaningful way. Hey – another research project suggests itself!
