Libraries, archives, and museums are acquiring increasing numbers of born-digital collections. I’ve been interested to see the growing use of digital forensics tools in the appraisal, processing, and access of such collections. But there are challenges.
Some of the software tools come from the realm of legal forensics, where chain of custody and recovery of maliciously destroyed or intentionally deleted files are among the key goals. Some of the software introduces new technical concepts – what are disk images? Or checksums?
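For readers new to the second of those terms, here is a minimal sketch of what a checksum does in practice. It assumes a disk image is simply a file holding a byte-for-byte copy of a piece of media, and that SHA-256 is an acceptable algorithm; the function names `file_checksum` and `verify_fixity` are my own illustrations, not part of any forensics tool.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so that a large disk image
    never has to fit in memory all at once.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_hex, algorithm="sha256"):
    """Return True if the file still matches a previously recorded checksum."""
    return file_checksum(path, algorithm) == expected_hex.lower()
```

The idea is to record the checksum when the image is first created and recompute it later. If even one byte has changed, the two values will almost certainly differ, which is what makes checksums useful for both chain of custody and long-term preservation.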
Archives are looking at vintage media, which often requires vintage hardware and software, or specialized hardware. What’s a FRED? And what could a Catweasel possibly be? Libraries are setting up forensics labs to deal with these new collections (Stanford, the Bodleian, among others). The collections at the Library’s Packard Campus or at the Computer History Museum are something to behold, but I shudder at what it will take to keep the equipment operational.
There is definitely a need to document computing history in aid of digital preservation. There are multiple initiatives to document and verify file formats (Sustainability of Digital Formats, GDFR, UDFR, JHOVE2, PRONOM, DROID). There is at least one initiative to document carrier media (MediaPedia). There are archives of manuals (University of Minnesota Charles Babbage Institute). I am thinking a lot about what other sorts of documentation are needed – operating systems, application software, hardware of all types… I heard these challenges subtly woven through many presentations and discussions at the Library’s 2010 storage architecture meeting. To understand some of the challenges, I recommend two recent key reports: Preserving Virtual Worlds and Digital Forensics in Cultural Heritage.