ODF: The Open Document Format

The following is a guest post by Carl Fleischhauer, a Digital Initiatives Project Manager at the Library of Congress.

During December 2015, the Library’s Format Sustainability website added descriptions of eleven members of the Open Document Format family, aka OpenDocument and ODF. These eleven join a number of other format descriptions mounted in 2015, many of which are also carried in the Library’s Recommended Format Statement, first published in 2014 and revised in early 2015.

These two complementary websites support the Library’s ever-increasing acquisition of born-digital content. The Recommended Format Statement is designed to inform staff and external content creators about preferred and acceptable formats to acquire for the Library’s holdings. These formats are ones for which the Library believes that the provision of access and long-term preservation management will be feasible.

Meanwhile, the Format Sustainability website provides technical descriptions about formats of all types, candidates for the recommended list as well as those that may be deemed to be unsuitable for acquisition. This information is intended to aid staff when they assess new content offerings and when they revise and refine the Recommended Format Statement.

In the Recommended Format Statement, ODF is listed as an acceptable type in the “text” category, in the same bullet with OOXML, the XML expression of Microsoft’s family of Office formats. This section of the Recommended Format Statement carries a parenthetical comment that features the term “electronic books.” (For more on OOXML, see my post from February 3, 2015.)

The truth is, however, that examples of ODF and OOXML will be most frequently encountered as born-digital segments within collections of personal papers and organizational records, the types of unpublished materials that are acquired by the Library’s special collection divisions. (In contrast, “e-publications” will most often be acquired in formats like ePub; other publisher-favored, schema-governed XML formats; and as PDF files.)

Detail of the cover page for the OASIS publication of the ODF 1.2 standard.

Like many other complex format families, ODF exists in several versions and “parts.” ODF is developed and maintained under the auspices of the Organization for the Advancement of Structured Information Standards (OASIS), headquartered in Massachusetts. ODF has also been approved as an international standard through the ISO/IEC joint technical committee JTC1, a collaborative effort of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), both headquartered in Switzerland. Our descriptions focus on the versions approved by both standards bodies, with an emphasis on ODF version 1.2. Where version 1.1 differs substantially from version 1.2, separate descriptions have been produced.

Here’s the list of ODF-related formats added to the sustainability website last month:

When sorting out the taxonomy and history of digital formats, the sustainability team has often been intrigued by the intricacy and nuances of a given format’s history. ODF’s ancestry takes us back to the 1980s, but the narrative line sharpens in the early 2000s, when the format was being refined and two factors influenced the development team. First, they drew inspiration from the movement for open government, arguably dating from the eighteenth-century Enlightenment but taking on fresh intensity in the Internet age. The introduction to a 2006 ODF white paper (PDF) states, “In the case of public [governmental] documents . . . no resident should be excluded from data access [and/or] . . . forced to buy software from one particular vendor or for one particular operating system platform.”

A second motivation for the format’s developers was the need–felt by memory institutions in many nations–to preserve documents for the long term, an outcome that was seen as threatened by the widespread use of commercial office software applications and by the proprietary binary document formats they produced at that time. In the words of the white paper cited above, the use of an open-source format “guarantees long-term access to data even if companies cease to operate, change their strategies, or dramatically raise their prices.”

The sustainability team’s principal investigator for ODF, Caroline Arms, has prepared an outline of the format’s history. Her main findings are presented in the ODF family description. The story begins at Sun Microsystems, a private company founded in 1982 and acquired by Oracle in 2010. In 1999, Sun acquired the German StarOffice software suite (first released in 1985) and quickly made it available as a free download. This edition of StarOffice produced binary files but within a year or two, the Sun team had modified the tool to produce output files in XML and, like StarOffice, made this application available at no cost.

In 2002, after more elaborations, the output format–by then referred to as the OpenOffice.org 1.0 format–was submitted for standardization to OASIS. Sun was joined in this standardization effort by Boeing, Stellent, Arbortext, the National Archives of Australia, and the Society of Biblical Literature. To buttress the standard and to encourage wider acceptance, the OASIS ODF technical committee also moved the specification to ISO/IEC JTC1, where it was published as ISO/IEC 26300 in 2006, designated as version 1.0, amended in 2012 to align with version 1.1. In 2015, ISO/IEC published version 1.2 in three parts and, at this writing, OASIS is developing what will be version 1.3.

The development of software to support ODF changed course after Oracle’s 2010 purchase of Sun. Oracle was not interested in continuing the activity and multiple independent efforts soon emerged. Two important examples are the LibreOffice effort, formally coordinated through a German non-profit doing business as The Document Foundation, and the Apache OpenOffice (AOO) project, running under the auspices of the U.S.-based Apache Software Foundation. These two efforts involve a worldwide community of volunteer coders. Their codebases have grown apart since 2010, and the ODF family format description summarizes a number of perspectives on the implications of this sometimes confusing circumstance.

Chronological branching diagram for the Open Office family from 1985 to 2015, created by David Gerard, as presented in the Wikipedia entry for NeoOffice. The StarWriter/StarOffice source material is at left, later abbreviated as SO. The middle set of permutations from 2001-2012 includes OpenOffice.org (OOo), Go-Open Office (Go-oo), as well as Oracle and IBM Workplace and Lotus/Symphony manifestations. The current trio of active applications is at right: NeoOffice (Neo), LibreOffice (LO), and Apache OpenOffice (AOO).

The ODF family format description also identifies some of the organizations, including government bodies, that have adopted the ODF family of formats as mandatory or recommended for documents that must be editable in order to support collaboration within the government or between the government and the public. Success in this area represents a payoff for the creators’ initial goal of supporting open government. Here are a few selected examples:

  • Brazil, 2008. The ePING (Standards for Interoperability for Electronic Government) includes ODF 1.2 and ISO/IEC 26300: 2008 as the only editable formats for office documents.
  • Norway, 2009. Norway adopted a new set of obligatory information technology standards, mandating ODF as the only editable format for exchanging documents between the government and users by email. See announcement and summary in English.
  • Germany, 2011. Version 5.0 of the German Standards and Architecture for e-Government Applications (SAGA) includes ODF and OOXML among its formats under observation.
  • Portugal, 2012. Regulation incorporating a list of mandatory formats. The only editable format for documents listed was ODF 1.1.
  • The United Kingdom, 2014. ODF 1.2 is mandated for sharing or collaborating on government documents.
  • Denmark, 2014. eGovernment recommendation v.16.0 indicates that Denmark continues to accept editable documents in “all common formats (including OpenDocument Format – ODF and Office Open XML – OOXML).”
  • EU, 2014. Statement from the Vice-President of the European Commission Maroš Šefčovič recommended, “For revisable documents, all European institutions are recommended to support as a minimum two ISO standards, the Open Document Format (ODT) and Office Open XML (OOXML).”
  • United States, 2014. ODF 1.0 is recommended in the National Archives transfer guidance statement.
  • Canada, 2015. Library and Archives Canada’s Guidelines on File Formats for Transferring Information Resources of Enduring Value lists ODF 1.0 as a preferred format.
