Digitization (making a digital copy of a non-digital object) is a bedeviling topic for digital preservationists. Establishing a clear line of demarcation between the process of creating the digital copy and the process of keeping the copy over time is the central issue.
I’ve always thought this was semantics. Well-meaning, but ill-informed, people said “digital preservation” when they meant “digitization” in reference to something like scanning a book or stack of family pictures. In my mind there was a bright line between creating the object and preserving it. Helping people understand this distinction would help bridge the gap.
I still think talking about the issue is critically important, but resolution is proving complicated, even within our own community. The Digital Preservation forum on Stack Exchange will, we are told, die as a non-public beta, in part because those of us who participated were unable to agree whether digitization was in scope. This is how it appeared to the presumably impartial site administrator who is pulling the plug:
It seems fairly clear that this proposal was backed by several distinct groups of people, and “digital preservation” means something different to all of them. To some, it means asset management within a company. To others, it’s about preserving legacy software. To others still, it is about manipulating file formats. The end result is that none of these groups is served well by the site we have now.
There is an element of truth here. Participants differed in how they defined digital preservation in connection with digitization. Here’s a shortened version of the question/comment that kicked off the discussion (spelling has been Americanized):
I would propose that this community should be involved in decisions related to the results of a digitization initiative. For example, the file formats and metadata schemas used, where/how the results are stored, and so on. However, questions focused on how to engage in digitization, what types of scanners should be used or resolution to digitize at, will be off topic…. As many of us in the digitization community are well aware, confusing digital preservation for digitization is a common mistake. I’d suggest adding some clear scoping detail on this to the “What kind of questions should I not ask here”?
This is an essential issue. Anyone involved in digital preservation has had experience with people who think the work is purely about scanning, as well as experience with people who are deeply interested in creating optimally preservable content. A public forum absolutely must be clear about whether a distinction is to be drawn between scanning and preserving and, if so, how.
That turned out to be a difficult task. Of the seven responses offered, none were chosen as “the best.” Here is my attempt to distill the essence of each answer, ranked in the order of up-votes.
- The mechanics of digitization are not in scope of this site, but the preservation of the results is.
- The answer should be a conditional yes, rather than an unqualified yes… It seems to me that if someone wants to digitally preserve something, and they have a legitimate question about the digitization process, especially if it relates to the preservation aspect, then we shouldn’t be turning them away. They’re a legitimate part of the constituency of this site and we should be taking their questions seriously.
- There’s a continuum of relevance here. “What’s the best scanner to use” and “what DPI should I scan at” seem obviously out of scope, “should I save as TIFF or JPG” less so, “how should I organize and describe the files once I’ve scanned them” clearly on-topic.
- I think digitization of analog objects with the goal of digitally preserving them, or keeping them easily-preservable in the future is on topic.
- Yes. The phrasing is key however and it should be focused on the results of digitization – the formats, the metadata associated with it, the storage etc. and beyond that, errors that appear in the data stream.
- If you’re not digitizing, how can you be doing digital preservation? Seems like it has to be on-topic.
- Let’s think about it in narrow terms as creation of digital content from a preservation perspective, which would be in scope. Anything that doesn’t relate to creating content for clear-cut preservation purposes (equipment, throughput, QC) would be out of scope. (Full disclosure: this was my suggestion).
The range of responses was interesting. Some were black and white (on both sides!) but some hedged, acknowledging a gray area. And, I have to say, the more I considered the comments, the grayer things got. Take my own answer. Is it always true that equipment and quality control are out of scope for digital preservation? On reflection, some scanning workflows will create clearer, more detailed and accurate images than others. Better quality images, for certain types of content at least, presumably have higher value, which arguably could influence their level of preservation.
Digital preservation is an emergent enterprise, and advocates are naturally eager to promote it as a distinct and deliberate activity. But we have to consider the value of a more general approach to framing the objective. Narrowly drawn questions and answers are important, but they apply narrowly as well. As a community, we still have quite a bit left to do in terms of fleshing out digital preservation methods, concepts and outcomes. More importantly, a huge amount of work is needed to raise public awareness about the need for preserving digital materials.
We should think about the value of engaging anyone with a question that touches even remotely on digital preservation. So what if someone asks about how to “digitally preserve” their family slides by scanning them? Instead of dismissing them as a hopeless noob, why not take the opportunity to explain that scanning is only the first step? They might even come away with a new appreciation for what preservation is all about.