Today’s guest post is from Morgan Morel, Laura Davis, Rachel Curtis, and Andrea Leigh of the Library of Congress’s National Audiovisual Conservation Center, Charles Hosale of the American Folklife Center, and members of the Recommended Formats Statement Moving Image content team.
The digital preservation landscape is ever-evolving, and the Library of Congress has recently made a significant update to its Recommended Formats Statement (RFS). Updated yearly, the RFS serves as guidance for content creators in selecting the analog and digital formats most suitable for long-term preservation. In an off-cycle review of video formats, the Library of Congress decided to upgrade FFV1 (version 3) in the Matroska (.mkv) container from an “Acceptable Format” to one of five “Preferred Formats” for the preservation and long-term access of video materials, reflecting its ongoing commitment to staying at the forefront of audiovisual preservation. FFV1/MKV was first added in 2020 as an “Acceptable” format, but only for video content without closed captions and/or timecode information. Recent developments have allowed the upgrade from “Acceptable” to “Preferred” and removed the constraint about captions and timecodes.
What Does “Preferred” Mean?
What does it mean to be a “Preferred” format in the RFS context? As described in the RFS introduction,
“The key underpinning to the RFS remains a focus on both global/community criteria and local/institutional criteria as key to preservation and long-term access. The global/community criteria have been based on the seven sustainability factors developed for the Library’s Sustainability of Digital Formats website: Disclosure, Adoption, Transparency, Self-documentation, External dependencies, Impact of patents and Technical protection mechanisms. Each of these factors may have different emphasis or importance depending on the community of practice and content type. Some may not be applicable or essential for every format. The local/institutional factors estimate the level of resources at The Library of Congress available to preserve and manage the content over time. These include Staff experience and expertise, Software/Hardware/Operating System availability, Representation/extent in LC collections/storage, Established workflow/functionality and, new for 2023, Access options including support on the Library’s website, loc.gov. The outcome of this analytical structure are clearer definitions of “Preferred” and “Acceptable” when categorizing digital file formats in the RFS.”
A “Preferred” format is one which meets or exceeds benchmarks for all relevant global/community sustainability factors and satisfies the local/institutional factors because the Library of Congress has the skills, experience, workflows, tools and systems to manage and preserve these formats in current systems with confidence. With all that in mind, we are happy to say that now FFV1/MKV meets both requirements!
The decision to include FFV1/MKV as a “Preferred” format in the RFS was not taken lightly. FFV1 has been used as an archival format since as early as 2015 by large universities, municipal libraries, small community archives, and many others in between. The wide adoption of the format across the cultural heritage field encouraged the RFS review panel to consider whether FFV1/MKV could be considered a “Preferred” format, and what would need to happen in order to make it so. This led to years of work by federal employees and contractors. The review process considered global and community-based sustainability factors as well as local and institutional factors related to the Library of Congress, and involved a panel of moving image experts from the Library of Congress.
FFV1 / MKV Description
FFV1 is a lossless intra-frame video codec developed by the FFmpeg project. The format is designed to support a wide range of lossless video applications such as long-term audiovisual preservation, scientific imaging, screen recording, and other video encoding scenarios that seek to avoid the generational loss of lossy video encodings.
Matroska (MKV) is an open, non-proprietary multimedia container format based on EBML (Extensible Binary Meta Language). Matroska is designed to carry a variety of payloads including video, audio, subtitles, and metadata, all of which are crucial for the preservation of digitized and born-digital media objects.
Open Source and Affordable
Both FFV1 and MKV adhere to standards developed and maintained by the Internet Engineering Task Force (IETF). In accordance with the open-source ethos, these standards documents can be freely accessed by anyone. The specification for FFV1 versions 0, 1 and 3 is currently published as RFC 9043. While the current specification for Matroska is available as a draft, the status is listed as “Almost Ready” and it is in the “Last Call” for review stage.
FFmpeg, a free and open-source tool, can create FFV1/MKV files using general-purpose hardware, such as a Mac or PC purchased from a consumer-focused vendor. This eliminates the need for proprietary hardware.
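As a rough illustration of what creating such a file with FFmpeg can look like, the Python sketch below assembles an FFmpeg command line for FFV1 version 3 video in a Matroska container. The helper names, the example file names, and the particular option values (intra-frame GOP of 1, 16 slices, slice CRCs on, audio copied unchanged) are assumptions chosen for this example, not settings prescribed by the RFS or used by the Library.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffv1_mkv_command(source: Path, target: Path, slices: int = 16) -> list[str]:
    """Return an ffmpeg argument list that transcodes `source` to FFV1 v3 in Matroska.

    Hypothetical helper for illustration; the option values are assumptions.
    """
    if target.suffix.lower() != ".mkv":
        raise ValueError("target must use the .mkv extension")
    return [
        "ffmpeg",
        "-i", str(source),
        "-c:v", "ffv1",           # FFV1 video encoder
        "-level", "3",            # FFV1 version 3
        "-g", "1",                # every frame is a keyframe (intra-frame only)
        "-slices", str(slices),   # split each frame into independently coded slices
        "-slicecrc", "1",         # store a CRC with each slice
        "-c:a", "copy",           # pass the audio through unchanged
        str(target),
    ]


def transcode(source: Path, target: Path) -> None:
    """Run the command, failing early if ffmpeg is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffv1_mkv_command(source, target), check=True)
```

Calling `transcode(Path("tape_001.mov"), Path("tape_001.mkv"))` would run the encode on an ordinary desktop machine; both file names are placeholders.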
FFV1 / MKV Adoption in the Archiving Field
Several institutions in the archiving field have adopted FFV1/MKV as their target preservation format for moving image content. Lossless compression offers savings in storage space as compared to uncompressed formats. Additionally, built-in fixity and embedded metadata support can be used to enable robust preservation and archival workflows. FFV1/MKV offers a feature-rich solution to institutions working to preserve both digitized and born-digital audiovisual materials that lack the budget or technology infrastructure to support more complex formats used by the broadcast and media entertainment industries.
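The built-in fixity mentioned above works at a finer grain than a whole-file checksum: FFV1 version 3 can store a checksum for each slice of a frame, so damage can be localized to a region of a picture. The sketch below is a simplified model of that idea only. It uses Python's `zlib.crc32` over fixed-size byte ranges, which is an assumption made for illustration and not the actual FFV1 bitstream layout or checksum parameters; the function names are hypothetical.

```python
import zlib


def slice_checksums(frame: bytes, slice_size: int) -> list[int]:
    """Compute a CRC-32 for each consecutive `slice_size`-byte slice of a frame."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    return [
        zlib.crc32(frame[start:start + slice_size])
        for start in range(0, len(frame), slice_size)
    ]


def damaged_slices(frame: bytes, expected: list[int], slice_size: int) -> list[int]:
    """Return the indices of slices whose CRC no longer matches the stored value."""
    actual = slice_checksums(frame, slice_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


# Stand-in for one frame of video data, checksummed in four 256-byte slices.
frame = bytes(range(256)) * 4
stored = slice_checksums(frame, 256)

# Simulate storage damage to a single byte that falls inside the third slice.
corrupted = bytearray(frame)
corrupted[600] ^= 0xFF
bad = damaged_slices(bytes(corrupted), stored, 256)
```

In this model only the slice containing the altered byte fails its check, so the rest of the frame can still be trusted.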
In 2021, Indiana University completed the Media Digitization and Preservation Initiative (MDPI), resulting in the creation and archiving of over 120,000 FFV1 files from their Film and Video collections.
Currently, the Library of Congress has archived many FFV1/MKV files as part of the American Archive of Public Broadcasting (AAPB) project with more expected in the near future. Notably, partners from smaller public broadcasting stations and community archives, often constrained by technology and budget limitations, have contributed significantly to the AAPB.
Road to FFV1/MKV at the Library (with help from FADGI)
Starting in 2018, Laura Davis and Rachel Curtis from NAVCC took a leadership role, along with other colleagues, in advancing the use of FFV1/MKV for audiovisual content at the Library of Congress. This eventually led to FFV1/MKV being added to the RFS as an “Acceptable” format for file-based video in 2020, with some conditions. One obstacle to more widespread adoption was limited support for complex captions and timecodes. To address this, the Federal Agencies Digital Guidelines Initiative (FADGI) Audio Visual Working Group conducted research to define the scope of effort and sponsored contract work with Dave Rice (RiceCapades) and Jérôme Martinez (MediaArea) on FFmpeg development to close the gaps in timecode and caption storage. This development was completed in 2023 and, as first announced at the FADGI meeting at NAVCC in October 2023, FFV1/MKV was moved from an “Acceptable” to a “Preferred” format on the RFS! For a more complete overview of the process, see the poster below discussing this timeline, presented at the 2023 Association of Moving Image Archivists (AMIA) annual conference.
The Library of Congress is enthusiastic about embracing FFV1/MKV in its Video Lab, located at the National Audiovisual Conservation Center in Culpeper, VA. The Video Lab, responsible for creating over 17,000 video preservation files yearly, has initiated testing and development to begin creating and supporting FFV1/MKV files in parallel with the J2K/MXF files currently created in the Lab.
The Library of Congress remains open to accepting FFV1/MKV files from its AAPB partners and eagerly anticipates receiving these files from other donors, contributing to the continued growth and diversification of the audiovisual preservation landscape.
You can read more about the RFS, FFV1/MKV and moving image preservation formats at the links below: