Digital Preservation Pioneer: Anne R. Kenney

“Technology has had most of the attention in digital preservation but it is the least of our concerns,” said Anne R. Kenney. That’s a bold declaration. But Kenney has earned the right to make it, based on her 25 years at Cornell University Library, conducting ground-breaking digital research, creating award-winning training resources and fostering national and international digital-library partnerships.

Kenney talked with me about her career, casually tossing out details that hinted at deeper, monumental volumes of work and struggle; at project outcomes that influenced the worldwide library and archive community; and at lessons learned the hard way, through trial and error. She began her career rescuing decaying books by converting their contents to microfilm; these days she is trying to rescue scholarship itself from the unique and daunting institutional challenges that are emerging in the digital age.

Kenney started working at Cornell University Library in 1987 in its newly founded preservation program and took on the massive Brittle Books Program, converting books – made of slowly decaying acidic paper – to microfilm. Though microfilm was widely accepted as a reasonable storage medium, Kenney became intrigued by a high-speed printer from Xerox as a possible preservation alternative. “I’d looked at scanning a little bit before that and the quality didn’t rival microfilm or photocopy,” said Kenney. “But the scanner they developed to complement their high-speed printer was doing quick scanning and at the 600 dpi level.”

She dove into research on digital-imaging technology (which is what scanning is) to see if it could capture book facsimiles better than film. Kenney said, “I started looking at how one could translate microfilm requirements — as in how many line pairs per millimeter — to equivalencies in dots per inch.” She experimented with problematic fonts to determine which technology could faithfully reproduce them. She studied printing technologies of the 19th and early 20th century until finally she found her “lab rat” font.
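To give a feel for the kind of translation Kenney describes, here is a minimal sketch of a line-pairs-per-millimeter to dots-per-inch conversion. It is not her formula, which the article does not give. It assumes the idealized sampling rule that resolving one line pair (one dark line plus one light line) takes at least two samples, and the function names are made up for illustration.

```python
MM_PER_INCH = 25.4


def lp_per_mm_to_dpi(lp_per_mm, samples_per_line_pair=2.0):
    """Idealized dpi needed to resolve `lp_per_mm` line pairs per millimeter.

    Assumes each line pair needs `samples_per_line_pair` samples (2 is the
    theoretical minimum; real scanners usually need more).
    """
    return lp_per_mm * samples_per_line_pair * MM_PER_INCH


def dpi_to_lp_per_mm(dpi, samples_per_line_pair=2.0):
    """Inverse conversion: best-case line pairs per millimeter at a given dpi."""
    return dpi / (samples_per_line_pair * MM_PER_INCH)


if __name__ == "__main__":
    print(f"600 dpi resolves at most {dpi_to_lp_per_mm(600):.1f} lp/mm")
    print(f"10 lp/mm needs at least {lp_per_mm_to_dpi(10):.0f} dpi")
```

Under that assumption, 600 dpi corresponds to a best case of just under 12 line pairs per millimeter. Actual equivalence also depends on optics, film and print quality, which this sketch ignores.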

ITC Bodoni on Wikicommons by GearedBull

Kenney said, “Bodoni italics became my benchmark because it is on a slant, which is difficult to capture on a grid, and the character formation is both seriffed and contains both thick and thin lines. It is a very difficult character to capture. I got a lot of books out of the fine arts library that contained font examples in common use in the early 19th and 20th centuries and we scanned the heck out of them. And then we did comparisons of printouts from the scanned images and found that at 600 dpi they were the functional equivalent to the font sets in the printed books, in photocopies of those pages and microfilmed versions as well.

“We developed formulas and a method to determine the type of imaging one would need to capture text at various sizes. Because lead type, which was used during this period, has a tendency to spread in mass printing, publishers rarely used fonts smaller than 1 mm in height, which 600 dpi can capture very well. Those results were used for the JSTOR project, to define their capture requirements. And they ultimately were adopted by Google in their Google book scanning project for basic text materials.”
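The article does not reproduce the formulas Kenney's team developed, so the following is only a rough illustration of why 600 dpi copes well with 1 mm type. It simply counts how many scanner samples span the height of a character. The function name and the 300 dpi comparison are assumptions added for illustration.

```python
MM_PER_INCH = 25.4


def pixels_across_height(char_height_mm, dpi):
    """Number of scanner samples spanning a character of the given height."""
    return dpi * char_height_mm / MM_PER_INCH


if __name__ == "__main__":
    for dpi in (300, 600):
        px = pixels_across_height(1.0, dpi)
        print(f"{dpi} dpi: about {px:.1f} pixels across a 1 mm character")
```

At 600 dpi a 1 mm character is covered by roughly 24 samples from top to bottom, about twice as many as at 300 dpi, which leaves room to render thin strokes and serifs.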

By 1994, Kenney had gained enough new knowledge and experience from her digital-imaging research that she felt compelled to share it with other professionals. She began offering training programs, intensive week-long workshops for libraries and archives.

The Making of America collection, in 1995, was her first massive digitization project and one of her earliest large-scale projects done with a collaborator. Cornell partnered with the University of Michigan to highlight the era from the mid-19th century to the early 1920s. “We did about 1 million images at Cornell,” said Kenney. “We focused a lot on journal serial literature and popular reading material from that era. I think it was a pretty big success.”

As more collections went online there were display problems. “We worked with vendors to define requirements,” said Kenney. “And a really fascinating part of that was, ‘what do we present on-screen?’ Because the 600 dpi files are huge and we wanted to make them available on screen at the then standard 600 by 400 screen resolution we had to scale them down significantly. Otherwise, you’d see maybe a word or two of the page and given bandwidth at that time, each page would have taken a long time to be retrieved. The files were downsized and compressed using lossy compression to send over those dial-up modems. You didn’t want to wait forever for something to come. Much of that work was driven by the limitations of the technology.”
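The arithmetic behind that scaling problem can be sketched as follows. The 6 by 9 inch page size is a hypothetical example, not a figure from the projects described, and the function name is made up. The 600-pixel screen width is taken from the quote above.

```python
def fit_page_to_width(page_w_in, page_h_in, scan_dpi, screen_w_px):
    """Scale a scanned page so its full width fits the screen width.

    Returns the full-resolution pixel size, the scaled display size, the
    resulting effective dpi, and the fraction of the page width that would
    be visible if the image were shown unscaled.
    """
    full_w = round(page_w_in * scan_dpi)
    full_h = round(page_h_in * scan_dpi)
    display_w = min(screen_w_px, full_w)
    display_h = round(full_h * display_w / full_w)
    return {
        "full_px": (full_w, full_h),
        "display_px": (display_w, display_h),
        "effective_dpi": scan_dpi * display_w / full_w,
        "visible_fraction_unscaled": display_w / full_w,
    }


if __name__ == "__main__":
    # Hypothetical 6 x 9 inch page scanned at 600 dpi, shown 600 pixels wide.
    print(fit_page_to_width(6, 9, 600, 600))
```

For this hypothetical page, the scan is 3600 by 5400 pixels, so an unscaled view 600 pixels wide would show only a sixth of the page width. Fitting the width to the screen drops the image to an effective 100 dpi.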

In 1996, Kenney co-wrote the first of two books on digital imaging for libraries and archives. The Society of American Archivists gave “best book” awards to both and similarly honored the online tutorial on digital imaging. But despite her pioneering digital-library work, Kenney wasn’t paying much attention to digital preservation. She was so wrapped up with the byproducts of digital imaging that digital preservation was not a big consideration to her. But that changed when the Task Force on Archiving of Digital Information published its report, Preserving Digital Information. “Folks began to focus on such things as media rot, file formats and their longevity,” said Kenney.

She teamed up with others at Cornell to research the impact of file migration on digital images. “That was one of the first migration studies to document the impact of migrating across technology in file formats,” said Kenney. A decade earlier her work had been about preserving the content of books; now it was evolving into preserving the content of digital files. Her second book on digital imaging, co-authored by Oya Rieger, included chapters on preservation, management, and costing over time.

In the final years of the 1990s, as she turned her full attention to the challenges of digital preservation, she began developing an educational program that incorporated hard-won, practical knowledge about digital libraries and digital preservation. The essence of her approach is practicality. “The best you can do is figure out what you can do in a time-constrained period,” said Kenney. “That involves the technology, the organizational will and way and fiduciary sustainability.”

Nancy McGovern

In 2001, she teamed up with Nancy McGovern to develop a digital preservation management workshop series, and from the beginning their partnership was complementary and fruitful. It was a chance for both of them to shine and share their accumulated knowledge about technology, workflow, institutions, funding and collaboration.

They taught people how to make prudent decisions as technology evolves, so as to not get hung up on a particular technology, and they encouraged people to develop sustainable programs that work for their organization. Kenney said, “I’ve always thought that it’s not how much you can capture, it’s how little you can capture and get away with doing the things that you need to do. It’s always been how you make managerial decisions where there are trade-offs.” Eventually they put the tutorial “Digital Preservation…Short-Term Solutions for Long-Term Problems” online. It too won an award from the Society of American Archivists.

Over the past decade, Kenney has championed collaboration as a progressive strategy for the future of research libraries. “Gone are the days of one library, one institution,” said Kenney. “It is insufficient for Cornell to have a great university library. It needs to have a great university library that partners really well with other great university libraries.”

2CUL (pronounced “too cool”), for example, is a collaboration between Cornell University Library and Columbia University Library. Kenney talked about how they evaluated the preservation of their e-journals and realized they had to work together to take control of a disturbing situation. “We had an increasing reliance on e-journals but we discovered that across the two campuses, only 15% to 18% of what we rely on is covered by Portico and LOCKSS, the two most prominent third-party digital preservation programs available. That is not okay.”

“I don’t think that we can rely on entities that don’t have as a primary core mission the long-term preservation of material to take on that function. So part of our work these days is focusing on bringing the community together. One institution cannot solve it. Collectively, we need to apply pressure to publishers and our third-party vendors for preservation services.

“But it is hard to do collaboration; it is not a natural act. And the more partners you have the more difficult it becomes…it’s an inverse ratio to how much you actually get done, and process becomes more important than product. With Columbia, we are looking at a very deep integration of our collections, staff and resources. We are working to integrate our technical services operations across the two campuses. We need budgetary transparency and fluidity so that the money at Columbia and Cornell can be variously tapped within one system that works for both institutions.

“In the 20th century, we focused on the product. Around 2000 to 2010 we went from ownership to access. Now, as we move toward 2020, it’s away from the product to be integrated into the process of scholarship — including the stretching out of the scholarly continuum — and being engaged at the front end, middle and back end in a way that we did not concern ourselves with in the past.

“We’ve partnered with a number of other institutions in looking at the long-term storage of big data. We need to do more collectively. Take institutional repositories for example. They are one of those conceptually strong ideas that really have not panned out as well as they might because scholarship and science do not stop at the borders of universities and colleges.

“And so it isn’t so much those individual repositories that are the key to what we do, it’s the sinews between them and the blood vessels in the sinews between them. The globalized services that support them are where the action will be.”
