Many Goals for One Residency: An NDSR Project Update

The following is a guest post by Jen LaBarbera, National Digital Stewardship Resident at Northeastern University Library.

Jen LaBarbera

It’s hard to believe that I only have two and a half months left in this residency. Despite Boston’s interminable winter (officially the snowiest on record), my time as a National Digital Stewardship Resident at Northeastern University has flown by!

As with many digital preservation projects, while my residency is technically under the auspices of Northeastern’s Archives and Special Collections, my work is closely tied to a few other areas of the library. My desk is in the Library Technology Services suite, but about half of the folks I work most closely with are in the Digital Scholarship Group, and I’m in fairly regular contact with our metadata librarians.

At present, my project involves four distinct but related goals. The first three involve developing workflows for basic categories of digital material in the archives; the fourth is broader in scope:

  1. Born-digital materials (ingest recently born-digital content from Our Marathon to a new digital repository service).
  2. Digitized materials (ingest digitized images and documents for two Latino collections – Inquilinos Boricuas en Acción and La Alianza Hispana – to the new digital repository service).
  3. Legacy digital materials (make accessible the “box of disks” from the Hispanic Office of Planning and Evaluation records).
  4. Develop a digital preservation plan for Northeastern.

Of course, this last goal is enormous, and could easily be its own separate project for a resident. When I started at Northeastern in September, there was a lot of interest in developing a digital preservation plan for the institution. Most of this work will happen after my residency ends, but I am in the process of creating a light framework for Northeastern’s future in planning for digital preservation.

As you can see, the first two goals involve developing workflows as they relate to Northeastern’s new digital repository service (DRS). When I started here in the fall, Northeastern was in the middle of launching this beautiful new custom-built Fedora-based institutional repository. Along with the release of this upgrade from their old repository service, of course, comes the development of new workflows. As the resident at Northeastern, it is my job to develop these workflows for the three types of digital material that the Archives and Special Collections deal with: digitized, recently born-digital and legacy born-digital. The collections mentioned in the list above are to act as test cases for these workflows.

Screenshot of “Our Marathon.”

To date, I’ve spent the most time working on the ingest of recently born-digital material, specifically, ingesting a digital archive on one platform (in this case, Omeka) to the new DRS. The digital archive we’re working with is Our Marathon, which is a crowd-sourced and curated archive of pictures, videos, stories and social media related to the Boston Marathon bombing on April 15, 2013. It was created as a public, community archive, so people were encouraged to submit their own stories, primarily in the form of images and testimonials. It also includes some more curated material from traditional archival sources like WBUR, Boston City Archives and the Countway Medical Library, among others.

Our Marathon was created as a digital humanities project in our Digital Scholarship Group, but unlike a lot of digital humanities projects, it was created in close consultation with the archives and special collections staff from the start. This means that for the most part, we’re luckily working with some very solid and robust metadata. As a crowd-sourced archive and a digital humanities project, though, this collection poses a number of interesting metadata-related and structural challenges. For example, there are some images in the crowd-sourced collection that may not be particularly unique, but include a description that acts as a testimonial; in essence, there are two items that should be captured in one item’s container. We’re of course pulling the individual items and their descriptive and technical metadata over from the Omeka site, but we spent a good deal of time wrestling with the best way to preserve the structure and intellectual choices made by the creators of the digital archive on Omeka.
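To make the "two items in one container" problem concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not the data model used by Omeka or the DRS; the class name, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContributedItem:
    """Hypothetical container for one crowd-sourced submission."""

    identifier: str
    image_filename: str
    description: str = ""
    # Assumption: someone has flagged that the description is itself a testimonial.
    is_testimonial: bool = False

    def components(self):
        """Return the intellectual components held in this one container."""
        parts = [("image", self.image_filename)]
        if self.is_testimonial and self.description:
            parts.append(("testimonial", self.description))
        return parts


# Hypothetical sample data, not taken from the real archive.
item = ContributedItem(
    identifier="example-001",
    image_filename="finish_line.jpg",
    description="I was standing near the finish line that afternoon.",
    is_testimonial=True,
)
parts = item.components()
```

Treating the description as a second component, rather than as plain metadata about the image, is one way to keep the testimonial from being lost when the item moves between platforms.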

It’s pretty clear that each digital humanities project like this has its own unique intellectual and technical challenges for digital preservation, which makes developing a blanket workflow for this ingest a little difficult. I’m attempting to create a framework for a workflow that can be both sturdy enough to provide consistency and versatile enough to be applied to other digital humanities projects as they are created and then ingested into the repository for long-term preservation.

Screenshot from Latino Collections.

I’m also developing a workflow for digitized material in Northeastern’s Latino collections, consisting of images and documents that were digitized from records of two Latina/o community organizations. (The Archives’ collection scope includes not only institutional records but also the records of Boston-area social justice organizations.) This project will be a little simpler, as at least some metadata was assigned to the files as they were digitized. My work on this project has just begun, and ingesting these items into the DRS should be fairly straightforward. I will be providing recommendations to Northeastern for enhancing the metadata on these records to make them more discoverable within the DRS and to ensure that the records adhere to our newly adopted metadata standards for MODS records.
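As a rough illustration of what enhancing these records toward MODS might look like, the sketch below builds a minimal MODS record from a handful of simple descriptive fields using Python's standard library. It is not Northeastern's actual workflow or metadata profile: the input field names, the `dc_to_mods` function, and the sample values are assumptions made for the example, and a real record would need to follow the locally adopted MODS guidelines.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"  # the MODS version 3 namespace
ET.register_namespace("mods", MODS_NS)


def dc_to_mods(fields):
    """Build a minimal MODS element tree from a dict of simple descriptive fields.

    `fields` is a hypothetical input shape: title, creator (list),
    description, and date, loosely modeled on Dublin Core.
    """

    def add(parent, tag, text=None):
        # Create a namespaced MODS child element, optionally with text.
        child = ET.SubElement(parent, f"{{{MODS_NS}}}{tag}")
        if text is not None:
            child.text = text
        return child

    root = ET.Element(f"{{{MODS_NS}}}mods")
    if fields.get("title"):
        add(add(root, "titleInfo"), "title", fields["title"])
    for creator in fields.get("creator", []):
        add(add(root, "name"), "namePart", creator)
    if fields.get("description"):
        add(root, "abstract", fields["description"])
    if fields.get("date"):
        add(add(root, "originInfo"), "dateCreated", fields["date"])
    return root


# Hypothetical sample values, not taken from the real collections.
record = dc_to_mods({
    "title": "Community meeting flyer",
    "creator": ["Example Community Organization"],
    "description": "Flyer announcing a neighborhood meeting.",
    "date": "1975",
})
xml_text = ET.tostring(record, encoding="unicode")
```

Skipping empty fields rather than emitting empty elements keeps sparse records valid-looking and makes it easy to see which records still need enhancement.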

Lastly, the legacy born-digital material I’m working with involves what so many archives and special collections are receiving from donors: boxes of disks. Specifically, in this collection, we have four record boxes that include 48 CDs, 18 Iomega Zip drives, and 177 3.5″ floppy disks. Though we dream about acquiring a FRED workstation, we don’t have high-end, high-powered digital forensics equipment at Northeastern. My work on this, then, has involved a lot of research on other, more economical workarounds for accessing the information that’s trapped in these boxes of disks. I’ve found some promising possibilities, and by the end of my residency, I will provide Northeastern with a report of my findings and recommendations for pulling this information off of these disks and making it accessible to researchers.

These next few months will no doubt be a whirlwind of activity as I wrap up the various aspects of this project. I’ll also be finishing up some work with my cohort of residents, including an exciting project we took on to provide some digital preservation recommendations to the History Project, Boston’s LGBT community archive. I’ve learned so much about digital preservation and digital stewardship during this residency, both in my project and through the NDSR Boston cohort, and I’m excited to bring that knowledge and experience with me when I move on to the next step in my archival career.

Reaching Out and Moving Forward: Revising the Library of Congress’ Recommended Format Specifications

The following post is by Ted Westervelt, head of acquisitions and cataloging for U.S. Serials in the Arts, Humanities & Sciences section at the Library of Congress. Nine months ago, the Library of Congress released its Recommended Format Specifications. This was the result of years of work by experts from across the institution, bringing their […]

Creating Workflows for Born-Digital Collections: An NDSR Project Update

The following is a guest post by Julia Kim, National Digital Stewardship Resident at New York University Libraries. I’m now into the last leg of my nine-month residency, and I’m amazed by what has been accomplished and the major steps still ahead of me. In this post, I’ll give a project update on my primary […]

Introducing the Federal Web Archiving Working Group

The following is a guest post from Michael Neubert, a Supervisory Digital Projects Specialist at the Library of Congress. “Publishing of federal information on government web sites is orders of magnitude more than was previously published in print.  Having GPO, NARA and the Library, and eventually other agencies, working collaboratively to acquire and provide access […]

Boxes of Hard Drives and Other Challenges at WGBH: An NDSR Project Update

The following is a guest post by Rebecca Fraimow, National Digital Stewardship Resident at WGBH in Boston. I have a pretty comprehensive list of goals to accomplish over the course of my time as the National Digital Stewardship Resident at WGBH’s Media, Library and Archives. That is: Document WGBH’s existing ingest workflow for production media […]

DPOE Interview: Three Trainers Launch Virtual Courses

The following is a guest post by Barrie Howard, IT Project Manager at the Library of Congress. This is the first post in a series about digital preservation training inspired by the Library’s Digital Preservation Outreach & Education (DPOE) Program.  Today I’ll focus on some exceptional individuals, who among other things, have completed one of […]

All in the (Apple ProRes 422 Video Codec) Family

We’ve spent a lot of time recently thinking about digital video issues. As mentioned in a previous blog post, the Federal Agencies Digitization Guidelines Initiative published several reports on this topic including “Creating and Archiving Born Digital Video.” Work on the “Eight Federal Case Histories” (PDF) report nudged us to add the Apple ProRes 422 […]

From the Field: More Insight Into Digital Preservation Training Needs

The following is a guest post by Jody DeRidder, Head of Digital Services at the University of Alabama Libraries.  This post reports on efforts in the digital preservation community that align with the Library’s Digital Preservation Outreach & Education (DPOE) Program. Jody, among many other accomplishments, has completed one of the DPOE Train-the-Trainer workshops and […]