Recommended Formats Statement: Expanding the Use, Expanding the Scope

This is a guest post by Ted Westervelt, head of acquisitions and cataloging for U.S. Serials – Arts, Humanities & Sciences at the Library of Congress.

“Model Photo: Parametric Gridshell.” Photo by James Diewald on Flickr.

As summer has fully arrived now, so too has the revised 2016-2017 version of the Library of Congress’s Recommended Formats Statement.

When the Library of Congress first issued the Recommended Formats Statement, one aim was to provide our staff with guidance on the technical characteristics of formats, which they could consult in the process of recommending and acquiring content. But we were also aware that preservation and long-term access to digital content is an interest shared by a wide variety of stakeholders and not simply a parochial concern of the Library. Nor did we have any mistaken impression that we would get all the right answers on our own or that the characteristics would not change over time. Outreach has therefore been an extremely important aspect of our work with the Recommended Formats, both to share the fruits of our labor with others who might find them useful and to get feedback on ways in which the Recommended Formats could be updated and improved.

We are grateful that the Statement is proving of value to others, as we had hoped. Closest to home, as the Library and the Copyright Office begin work on expanding mandatory deposit of electronic-only works to include eBooks and digital sound recordings, they are using the Recommended Formats as the starting point for the updates to the Best Edition Statement that will result from this. But its value is also being recognized outside of our own institution.

The American Library Association’s Association for Library Collections & Technical Services has recommended the Statement as a resource in one of its e-forums. And even farther afield, the UK’s Digital Preservation Coalition included it in their Digital Preservation Handbook this past autumn, bringing the Statement to a wider international audience.

The Statement has even caught the attention of those who fall outside the usual suspects of libraries, creators, publishers and vendors. Earlier this year, we were contacted by a representative from an architectural software firm. He (and others in the architectural field) has been concerned about the potential loss of architectural plans, as architectural files are now primarily created in digital formats with little thought as to their preservation. Though the Library of Congress has a significant Architecture, Design and Engineering collection, this is a community that overlaps little with our own. But he saw the intersection between the Recommended Formats and the needs of his own field and he came to us to see how the Recommended Formats might relate to digital files and data produced within the fields of architecture, design and engineering and how they might help encourage preservation of those creative works as well. This, in turn, led to the addition of Industry Foundation Classes — a data model developed to facilitate interoperability in the building industry — to the Statement. We hope it will lead to future interest, not simply from the architectural community but from any community of creators of digital content who wish their creations to last and to remain useful.

We have committed to an annual review and revision of the Recommended Formats Statement to ensure its usefulness to as wide a spectrum of stakeholders as possible. In doing so, we hope to encourage others to offer their knowledge and to prevent the Statement from falling out of sync with the technical realities of the world of digital creation. As we progress down this path, one of the benefits is that the changes each year to the hierarchies of technical characteristics and metadata become fewer and fewer. More and more stakeholders have provided their input already and, happily, the details of how digital content is created are not so revolutionary as to need to be completely rewritten annually. This allows for a sense of stability in the Statement without a sense of inertia. It also allows us to engage with types of digital creation that we might not have addressed as closely or directly as was possible. This is proving to be the case with digital architectural plans and it is proving to be even more the case with the biggest change to the Recommended Formats with this new edition: the inclusion of websites as a category of creative content.

At the time of the launch of the first iteration of the Recommended Formats Statement, websites per se were not included as a category of creative content. This omission was the result of various concerns and perspectives held then but there was no gainsaying that it was definitely an omission. Of all the types of digital works, websites are probably the most open to creation and dissemination and probably the most common digital works available to users, but also not something that content creators have tended to preserve.

Unsurprisingly, this also tends to make them the type of digital creation that causes the most concern to those interested in digital preservation. So when the Federal Web Archiving Working Group reached out about how the Recommended Formats Statement might be of use in furthering the preservation of websites, this filled a notable gap in the Statement.

Naturally, the new section of the Statement on websites is not being launched into a vacuum. The prevalence of websites and much of their development is predicated on the enhancement of the user experience, either in creating them or in using them, which is not the same as encouraging their preservation. It is made very clear that the Statement’s section on websites is focused specifically on the actions and characteristics that will encourage their archivability and thereby their preservation and long-term use.

Nor does the Statement ignore the work that has been done already by other groups and other institutions to inform content creators of best practices for preservation-friendly websites, but instead builds upon them and links to them from the Statement itself. The intention of this section on websites is twofold. One is to provide a clear and simple reminder of the importance of considering the archivability of a website when creating it, not merely the ease of creating it and the ease of using it. The other is to bring together those simple actions along with links to other guidance in order to provide website creators with easy steps that they can take to ensure the works in which they are investing their time and energy can be archived and thereby continue to entertain, educate and inform well into the future.

As always, the completion of the latest version of the Recommended Formats Statement means the beginning of a new cycle, in which we shall work to make it as useful as possible. Having the community of stakeholders involved with digital works share a common commitment to the preservation and long-term access of those works will help ensure we succeed in saving these works for future generations.

So, use and share this version of the Statement and please provide any and all comments and feedback on how the 2016-2017 Recommended Formats Statement might be improved, expanded or used. This is for anyone who can find value in it; and if you think you can, we’d love to help you do so.

Co-Hosting a Datathon at the Library of Congress

On June 14 and 15, the Library of Congress hosted Archives Unleashed 2.0, a web archive “datathon” (otherwise known as a “hackathon,” but apparently any term with the word “hack” in it might sound a bit menacing) in which teams of researchers used a variety of analytical tools to query web-archive data sets in the hopes of discovering some intriguing insights before their 48-hour deadline […]

FADGI MXF Video Specification Moves Up an Industry-organization Approval Ladder

The following is a guest post by Carl Fleischhauer, who organized the FADGI Audio-Visual Working Group in 2007. Fleischhauer recently retired from the Library of Congress. The Federal Agencies Digitization Guidelines Initiative Audio-Visual Working Group is pleased to announce a milestone in the development of the AS-07 MXF video-preservation format specification. AS-07 has taken shape […]

DPOE Program Harnesses the Spirit of Kentucky Librarians

This is a guest post by Barrie Howard. The Library of Congress’s Digital Preservation Outreach and Education program delivered a train-the-trainer workshop on June 10, providing professional development in digital preservation to library professionals from Kentucky and West Virginia. The workshop was held at Northern Kentucky University and sponsored by the State Assisted Academic Library […]

Library of Congress Advisory Team Kicks off New Digitization Effort at Eckerd College

This is a guest post by Eckerd College faculty David Gliem, associate professor of Art History, and Nancy Schuler, librarian and assistant professor of Electronic Resources, Collection Development and Instructional Services. On June 3rd, a meeting at Eckerd College in St. Petersburg, Florida, brought key experts and College departments together to begin plans for the […]

The Radcliffe Workshop on Technology & Archival Processing

This is a guest post from Julia Kim, archivist in the American Folklife Center at the Library of Congress. The annual meeting of the Radcliffe Technology Workshop (April 4th – April 5th, #radtech16) brought together historians, (digital) humanists and archivists for an intensive discussion of the “digital turn” and its effect on our work. The […]

O Email! My Email! Our Fearful Trip is Just Beginning: Further Collaborations with Archiving Email

Apologies to Walt Whitman for co-opting the first line of his famous poem O Captain! My Captain! but solutions for archiving email are not yet anchor’d safe and sound. Thanks to the collaborative and cooperative community working in this space, however, we’re making headway on the journey. Email archiving as a distinct research area has […]