Digital was everywhere at this year’s Society of American Archivists annual meeting. What is particularly exciting is that many of these sessions were practical and pragmatic. That is, many sessions focused on exactly how archivists are meeting the challenge of born-digital records.
In one such session, Sibyl Schaefer, Head of Digital Programs at the Rockefeller Archive Center, offered exactly this kind of practical advice. I am excited to discuss some of the themes from her talk, “We’re All Digital Archivists: Digital Forensic Techniques in Everyday Practice,” here as part of the ongoing Insights Interview series.
Trevor: Could you unpack the title of your talk a bit for us? Why exactly is it time for all archivists to be digital archivists? What does that mean to you in practice?
Sibyl: We don’t all need to be digital archivists, but we do need to be archivists who work with digital materials. It’s not scalable to have one person, or one team, focus on the “digital stuff.” When I was first considering how to structure the Digital Team (or D-Team) at the RAC, it crossed my mind to mirror the structure of my organization, which is based on the main functions of an archive: collection development, accessioning, preservation, description, and access. I quickly realized that integrating digital practices into existing functions was essential.
The archivists at my institution take great pride in their knowledge of the collections, and not tapping into that knowledge would disadvantage the digital collections. We also don’t have many purely digital collections; the vast majority are hybrid. It wouldn’t make sense for one person to arrange and describe analog materials and another the digital materials. The principles of arrangement and description don’t change due to the format of the materials. Our archivists just need guidance in how to be effective in handling digital records, they need experience using tools so they feel comfortable with them, and they need someone available to ask if they have questions. So the digital archivists on my team are figuring out which software and tools to adopt, which workflows are the most efficient, and how to best educate the rest of the staff so they can do the actual archival work. The digital archivists aren’t actually doing traditional archival work and in that sense, “digital archivist” is a misnomer.
Trevor: If an archivist wants to get up to speed on the state and role of digital forensics in his or her work, what would you suggest they read or review? Further, what about these works do you see as particularly important?
Sibyl: The CLIR report, “Digital Forensics and Born-Digital Content in Cultural Heritage Collections,” is an excellent place to start. It clearly outlines what is gained by using forensics techniques in archival practice: namely the ability to capture digital archival materials in a secure manner that preserves more of their context and original order. These techniques also allow archivists to search through and review those materials without worrying about inadvertently altering them and affecting their authenticity.
I was ecstatic when I first saw Peter Chan’s YouTube video on processing born-digital materials using the Forensic Toolkit (FTK) software. It was the first time I saw how functionality in FTK could be mapped to traditional processing activities: weeding duplicates, identifying Personally Identifiable Information (PII) and restricted records, arranging materials hierarchically, etc. It really answers the question of “So you have a disk image, now what do you do with it?” It also conveyed that the program could be picked up fairly easily by processing archivists.
The “From Bitstreams to Heritage: Putting Digital Forensics into Practice in Collecting Institutions” report provides a really good overview of the recent activities in this area and a practical analysis of some of the capabilities and limitations of the forensics tools available.
Trevor: Could you tell us a bit about how the digital team works at the Rockefeller Archive Center? What kinds of roles do people take in the staff? How does the team fit into the structure of the Archive? How do you define the services you provide?
Sibyl: My team takes a user-centered approach in fulfilling our mission of leveraging technology to support all our program areas. We generally start by identifying a need for new technology, whether it be to place our finding aids online, create digital exhibits for our donors, preserve the context and authenticity of materials as they move from one physical medium to another, or increase our efficiency in managing researcher requests. We then try to involve users — both internal and external — as much as possible throughout the process. This involvement is crucial given that we usually aren’t the primary users of the software we implement.
One archivist focuses on delivery and access, which includes managing our online finding aid delivery system, as well as working very closely with our reference staff to develop and integrate tools that will help increase the efficiency of their work. Another team member is focused on digitization and metadata projects which includes things like scanning and outsourced digitization projects, as well as migrating from the Archivists’ Toolkit to ArchivesSpace. We just hired a new digital archivist to really delve into the digital forensics work I discussed in my presentation at SAA. She will be disk imaging and teaching our processing archivists to use FTK for description. In addition to overseeing the work of all the team members, I interface with our donor institutions, create policies and procedures, set team priorities and oversee our digital preservation system.
As I mentioned before, the RAC is divided up into five different archival functional areas: donor services, collections management, processing, reference and the digital team. Certain services, like digital preservation and digital duplication for special projects, are within our realm of responsibility, while with others we take a more advisory role. For example, we’re in the midst of an Aeon special collections management tool implementation, and although we won’t be internally hosting the server, we are helping our reference staff articulate and revise their workflows to take advantage of the efficiencies that system enables.
Our services are quite loosely defined; one of our program goals is to “leverage technology in an innovative way in support of all RAC program areas.” This gives us a lot of leeway in what we choose to do. I prioritize our preservation work based on risk and our systems work based on an evaluation of institutional priorities. For example, over the last year the RAC has been trying to increase the efficiency of our reference services, so we evaluated their workflows, replaced an unscalable method for organizing reference interactions with a user-friendly ticketing system, and are now aiding with the Aeon implementation.
Trevor: Could you tell us a bit about the workflow you have put in place to implement digital forensics in processing digital records? What roles do members of your team play and what roles do others play in that workflow?
Sibyl: My team takes care of inventorying removable media, creating disk images, running virus checks on those images, and providing them to the processing staff for analysis and description. Processing staff then identifies duplicates, restricted materials, and materials that contain PII. They arrange and describe materials within FTK. When they have finished, they notify the D-Team and we add the description to the Archivists’ Toolkit (or ArchivesSpace — we’re preparing to transition over soon) and ingest those files and related metadata into Archivematica.
There are a lot of details we need to add in that will greatly increase the complexity of the process, and some of them will require actual policy decisions to be made. For example, the question of redaction comes up every time I review this process with our archivists. Redaction can be pretty straightforward with certain file formats, but definitely not with all. Also, how do we relay to our researchers that information has been redacted? We need to have a policy that clearly outlines when we redact information (for materials going online? for use in the reading room?), what types of information we redact, and what types of files can securely be redacted.
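To make the duplicate-identification step in this workflow a little more concrete, here is a minimal sketch of one common approach: grouping files by checksum. It is not how FTK or the RAC does it; the function names and the idea of running it over a folder of files exported from a disk image are assumptions made purely for illustration.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by checksum and keep groups of two or more.

    `root` is a hypothetical folder of files exported from a disk image.
    """
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Files with identical content share a checksum regardless of name or location, so each returned group is a candidate set for weeding. The sketch only reads files; deciding what to keep remains an archival judgment.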
Trevor: As your process is established and refined, what do you see as the future role and place of the digital team within the archive? That is, what things are on the horizon for you and your team?
Sibyl: In the years since I joined the RAC we’ve placed our finding aids and digital objects online in an access system, architected a system for digital preservation, and configured forensics workflows. Now that we’ve got that foundation for managing and accessing our digital materials, I want us to start living up to our goals of being innovative and being leaders in the field. One area I think we can contribute to is integrating systems. For example, we’re launching a new project with Artefactual, the developers of Archivematica, to create a self-submission mechanism for donors to transfer records to us. Part of the project includes integrating ArchivesSpace with Archivematica. How cool would it be to have an accession record automatically created in ArchivesSpace when a donor transfers materials to our Archivematica instance?
Likewise, I’ve been talking with a few people about using data in FTK to create interactive interfaces for researchers. We could use directory data captured during imaging or created during analysis (like labeling materials “restricted”) to recreate (but not necessarily emulate) the way files were originally organized, including listing deleted and duplicate files and then linking that directly to their final, archival organization. The researcher would be able to see how the files were originally organized by the donor and what is missing (or restricted) from what is presented as the final archival organization. I get giddy when I think of how we can use technology to increase the transparency of what happens during archival processing. I’m also excited about the prospect of working EAC-CPF records into our discovery interface to bolster our description.
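As a rough illustration of the crosswalk described here, the sketch below pairs each file's original path with its final archival location, or with the reason it is absent. Everything in it is hypothetical: the `OriginalFile` record, the label values, and the sample paths are assumptions for illustration, not FTK's export format or any existing RAC interface.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OriginalFile:
    """One entry from a (hypothetical) directory listing captured at imaging."""
    path: str                    # where the donor kept the file
    label: Optional[str] = None  # e.g. "restricted", "duplicate", "deleted"


def crosswalk(
    originals: list[OriginalFile],
    archival_locations: dict[str, str],
) -> list[tuple[str, Optional[str], str]]:
    """Return (original path, archival path or None, status) for each file."""
    rows = []
    for item in originals:
        destination = archival_locations.get(item.path)
        if item.label is not None:
            status = item.label          # a processing label takes precedence
        elif destination is not None:
            status = "available"
        else:
            status = "not retained"
        rows.append((item.path, destination, status))
    return rows


originals = [
    OriginalFile("Docs/report.doc"),
    OriginalFile("Docs/copy of report.doc", "duplicate"),
    OriginalFile("Mail/inbox.pst", "restricted"),
]
locations = {
    "Docs/report.doc": "Series 1/Reports/report.doc",
    "Mail/inbox.pst": "Series 2/Correspondence/inbox.pst",
}
for row in crosswalk(originals, locations):
    print(row)
```

A researcher-facing interface could render these rows as two linked trees, one for the donor's arrangement and one for the archival arrangement, with the status explaining any gaps.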
We also have a great deal of less innovative but very necessary tasks ahead of us. We need to implement a DAMS to help corral the digitized materials that are created on request and also to provide more granular permissions to materials than what we currently have. We need to create and implement policies to fill in gaps in our policy framework and inch towards TRAC compliance. And lastly, we need to systematize our preservation planning. We have a lot of work to keep us busy! That said, it’s a really great time to be in the archival field. Digital materials may present new and complex challenges, but we also have a chance to be creative and innovative with systems design and applying traditional archival practices to new workflows.
Good information, Trevor. This is helpful for understanding the context of the digital archive, digital collection roles and responsibilities, and the various functional subject matter expertise needed to archive.