Reaching Out and Moving Forward: Revising the Library of Congress’ Recommended Format Specifications

The following post is by Ted Westervelt, head of acquisitions and cataloging for U.S. Serials in the Arts, Humanities & Sciences section at the Library of Congress.

Nine months ago, the Library of Congress released its Recommended Format Specifications. This was the result of years of work by experts from across the institution, bringing their own specialized knowledge of the needs and expectations of our patrons, developments in publishing and production and the technical aspects of creation, presentation and distribution. The Library of Congress invested so much effort in this because it is essential to the mission of the institution.

The Library seeks to acquire both broadly and deeply, collecting works from almost every subject area and from every country on earth. This forms one of the world’s foremost collections of creative works and one which the Library is committed to making available to its patrons now and for generations to come. In order to accomplish this, the Library must be able to differentiate between the physical and technical characteristics which will aid it in this effort and those which will have to be overcome to fulfill this goal.

With the Recommended Format Specifications, the Library created hierarchies of these characteristics, such as file formats, in order to provide some guidance. By using the Specifications, Library staff can determine the level of effort involved in managing and maintaining content which they might want to acquire for the collection and use this knowledge to make informed judgments about their actions. In a time of limited funds and unlimited creation, acting strategically in this manner is essential for an institution like the Library of Congress.

Beyond the Books from user difei on Flickr (https://flic.kr/p/53yLwV).

It is not merely the Library of Congress which might benefit from something like the Recommended Format Specifications. The fundamental interest of the Library in creative works is to ensure that those it adds to its collection will last and remain accessible to patrons well into the future. Yet identifying the characteristics which encourage preservation and long-term access is of value to more than the Library of Congress alone. Creators of these works want their creations to last; distributors and vendors want to ensure that the content they are sharing will remain available to their customers long after it is sent to them and will remain available for distribution to future customers.

Libraries and other archiving institutions need works which will last in order to fulfill their charges. So the Library of Congress has attempted to make the Recommended Format Specifications as useful and available as possible for these other participants in the life cycle of creative works. The process of creating the Specifications naturally came from the perspective of the Library of Congress, which has a rather unique role. However, the work which went into the Specifications was not sealed within an ivory tower.

The experts who developed the hierarchies knew that developing guidance which did not have the potential for broad application could not be successful. So they looked at the issue of preservation and long-term access as holistically as possible. Naturally, these teams of experts started with established Library of Congress guidance and documentation, such as the Best Edition Statement (PDF) and the Sustainability of Digital Formats; but they also consulted the recommendations of external groups such as the International Association of Sound and Audiovisual Archivists and the Audio Engineering Society.

Not only did the teams engage with the deep bench of expertise within the Library of Congress, but they took full advantage of experts from outside it as well. As much as the Specifications have to meet the needs of the Library specifically, the basic criteria which informed them were ones which are universally applicable: adoption, transparency, superior technical characteristics, coordination with international standards. The Recommended Format Specifications were written with the broader community in mind. And happily, the broader community has shown a real interest in the Specifications.

World Airline Routes from user josullivan59 on Flickr.

As we have shared them through listservs, articles, blog posts and presentations at various conferences, there has been a great deal of interest expressed in the Specifications. The Library has received comments and feedback from individuals and institutions as far afield as Germany and New Zealand. And we continue to disseminate the Specifications, not to force others to do things exactly as the Library of Congress does them, but to help them address the same issues we all face and, we hope, make their efforts to overcome these obstacles at least somewhat easier.

The Library does, however, seek a rather more tangible goal from sharing the Recommended Format Specifications with other stakeholders around the world. When accomplishing something like the Recommended Format Specifications, which took years of effort on the part of many dedicated individuals, there is a real temptation to lay down one’s tools and be satisfied with a job well done. And there is no denying that the Library considers the Recommended Format Specifications a job well done, and rightfully so.

However, the nature of creative works, especially digital creative works, does not allow us to rest on our laurels. What might be the preferred format for a digital photograph or an eBook today might not be the preferred format tomorrow. Unless we keep reviewing the landscape and the ongoing developments in the world of digital creation, the Specifications will soon be as useful as a map liberally dotted with the phrase ‘here be dragons’.

Thar Be Dragons!!! by user eskimo_jo on Flickr.

The teams of experts within the Library are already looking back at the Specifications, identifying places in which they want to revise, update, tighten and improve them. And we are engaged in further investigation of potential additions to the Specifications. Currently, experts at the Library of Congress are working with colleagues at the National Archives and Records Administration, exploring the potential value of the SIARD format developed by the Swiss Federal Archives as a means of preserving relational databases. So there is more to sharing the Specifications with others than just providing them with an opportunity to take advantage of them; it also gives us a chance to learn more about what might make the Specifications better.

The Library of Congress has committed to a sustained investment in the Recommended Format Specifications, which means an annual review and revision process. To accomplish this, it is actively soliciting feedback and comments from any and all who can help us make them better and more useful, for ourselves and for all of our stakeholders and colleagues in the creative world. This feedback is requested by March 31st, after which date our teams of experts will take the input we have received from others and the results of our own investigations and spend the next three months developing a revised version of the Recommended Format Specifications for the coming year. The greater the input, the better the product, so please do not hesitate to contact us here to share your thoughts and ideas about the Recommended Format Specifications.
