The Library of Congress Wants Your File Format Ideas

"Uncle Sam Needs You" painted by James Montgomery Flagg

In June of this year, the Library of Congress announced a list of formats it would prefer for digital collections. This list of recommended formats is an ongoing work; the Library will be reviewing the list and making revisions for an updated version in June 2015. Though the team behind this work continues to put a great deal of thought and research into listing the formats, there is still one more important component needed for the project: the Library of Congress needs suggestions from you.

This request is not half-hearted. As the Library increasingly relies on the list to identify preferred formats for acquisition of digital collections, no doubt other institutions will adopt the same list. It is important, therefore, that as the Library undertakes this revision of the recommended formats, it conducts a public dialog about them in order to reach an informed consensus.

This public dialog includes librarians, library students, teachers, vendors, publishers, information technologists — anyone and everyone with an opinion on the matter and a stake in preserving digital files. Collaboration is essential for digital preservation. No single institution can know everything and do everything alone. This is a shared challenge.

Librarians, what formats would you prefer to receive your digital collections in? What file formats are easiest for you to process and access? Publishers and vendors, what format do you think you should create your digital publications in if you want your stuff to last and be accessible into the future? The time may come when you want to re-monetize a digital publication, so you want to ensure that it is accessible.

Those are general questions, of course. Let’s look at the specific file formats the Library has selected so far. The preferred formats are categorized by:

  • Textual Works and Musical Compositions
  • Still Image Works
  • Audio Works
  • Moving Image Works
  • Software and Electronic Gaming and Learning
  • Datasets/Databases

Take, for example, digital photographs. Here is the list of formats the Library would most prefer to receive for digital preservation:

  • TIFF (uncompressed)
  • JPEG2000 (lossless) (*.jp2)
  • PNG (*.png)
  • JPEG/JFIF (*.jpg)
  • Digital Negative DNG (*.dng)
  • JPEG2000 (lossy) (*.jp2)
  • TIFF (compressed)
  • BMP (*.bmp)
  • GIF (*.gif)

Is there anything you think should be changed in that list, or added to it? If so, why? There’s a section on metadata on that page. Does it say enough, or too little? Is it clear enough? Should the Library describe how to embed photo metadata in the photo files themselves?
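If you want to check a batch of image files against the still-image list above before writing in, a small lookup table may help. The following is only a rough sketch, not a Library of Congress tool: the names `STILL_IMAGE_FORMATS` and `candidate_formats` are made up for illustration, and the `.tif`/`.tiff` extensions are an assumption, since the list names TIFF without an extension.

```python
from pathlib import PurePath

# Entries copied from the still-image list above, in the order given there.
# Assumption: TIFF files use .tif or .tiff (the list gives no extension).
STILL_IMAGE_FORMATS = [
    ("TIFF (uncompressed)", {".tif", ".tiff"}),
    ("JPEG2000 (lossless)", {".jp2"}),
    ("PNG", {".png"}),
    ("JPEG/JFIF", {".jpg"}),
    ("Digital Negative DNG", {".dng"}),
    ("JPEG2000 (lossy)", {".jp2"}),
    ("TIFF (compressed)", {".tif", ".tiff"}),
    ("BMP", {".bmp"}),
    ("GIF", {".gif"}),
]


def candidate_formats(filename):
    """Return every list entry whose extension matches the file name.

    An extension alone cannot tell lossless from lossy JPEG2000, or
    compressed from uncompressed TIFF, so several entries may come back.
    """
    ext = PurePath(filename).suffix.lower()
    return [name for name, exts in STILL_IMAGE_FORMATS if ext in exts]


if __name__ == "__main__":
    for name in ["portrait.dng", "scan.tif", "notes.docx"]:
        print(name, "->", candidate_formats(name) or "not on the list")
```

A file that matches nothing is simply not on the recommended list; that alone does not tell you whether a collection would be accepted.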

Please look over the file categories that interest you and tell us what you think. Help us shape a policy that will affect future digital collections, large and small. Be as specific as you can.

Email your questions and comments to the digital preservation experts below. Your emails will be confidential; they will not be published on this blog post. So don’t be shy. We welcome all questions and comments, great and small.

Send general email about preferred formats to Theron Westervelt (thwe at loc.gov). Send email about specific categories to:

  • Ardie Bausenbach (abau at loc.gov) for Textual Works and Musical Compositions
  • Phil Michel (pmic at loc.gov) for Still Image Works
  • Gene DeAnna (edea at loc.gov) for Audio Works
  • Mike Mashon (mima at loc.gov) for Moving Image Works
  • Trevor Owens (trow at loc.gov) for Software and Electronic Gaming and Learning
  • Donna Scanlon (dscanlon at loc.gov) for Datasets/Databases

They are all very nice people who are up to their eyeballs in digital-preservation work and would appreciate hearing your fresh perspective on the subject.

One last thing. The recommended formats are just that: recommended. They are not a fixed set of standards, and the Library of Congress will not reject any digital collection of value simply because its file formats do not conform to the recommended formats.

2 Comments

  1. Cezar Popescu
    October 4, 2014 at 4:33 am

    The jpg is to be preferred over dng? What sort of madness is this? And tiff on the first position? A privately controlled file format? 🙁

  2. Phil Michel
    October 6, 2014 at 2:32 pm

    I agree that the ordered list of formats can seem illogical. It wouldn’t make sense to create a JPEG from a DNG prior to submitting an image to the Library. We would accept either. It would probably be better to emphasize that the Library is interested in the best version of the file as it was published or created. It is less important that the image is in a format that appears high on the list. We should consider removing the statement that the list is based on “order of preference.”
