Introducing the new EPUB reader for e-books at the Library of Congress

Today’s guest post is from Kristy Darby, a Digital Collections Specialist at the Library of Congress.


The Open Access Books Collection on loc.gov includes approximately 6,000 contemporary open access e-books covering a wide range of subjects, including history, music, poetry, technology, and works of fiction. All books in this collection were published under open access licenses, meaning the e-books are available to use and reuse according to the terms of the licenses. Users can access the e-books in the Open Access Books Collection by reading directly online in a browser or downloading the book as a PDF or EPUB file.

Green book cover for Bird Species: How They Arise, Modify and Vanish, edited by Dieter Tietze

Bird Species: How They Arise, Modify and Vanish is now available to view in the new EPUB reader.

When we first made open access e-books available on loc.gov, titles could be downloaded in either PDF or EPUB format, but only PDFs could be read directly on the website; loc.gov did not support viewing EPUBs in the browser. Because many books were available in both formats or in PDF only, most titles were still viewable directly on the website. However, we saw an increase in titles available only in EPUB, so we are happy to share that an EPUB viewer has launched on loc.gov. The viewer makes EPUBs available for reading on loc.gov and provides a richer interface for users.

So, why is an EPUB viewer important? First, it allows users to access the titles (nearly 900!) that are available only in EPUB format without requiring downloads. Among the first open access books available on the Library’s website were eleven new editions of classic works from Standard Ebooks, a project producing high-quality open access editions of public domain classic books as EPUB files. These e-books were selected by the Library’s subject matter experts, cataloged by the U.S. Programs, Law, and Literature Section, and made available on loc.gov as EPUBs. Titles in this group include A Strange Manuscript Found in a Copper Cylinder and The Book of Tea. Second, the EPUB viewer offers features that improve the reading experience, such as full-text searching and bookmarks that can be placed on multiple pages within an e-book.

The EPUB format has a number of characteristics that set it apart from other e-book formats. Reading an EPUB more closely resembles reading a print book, with page turns rather than vertical scrolling. The format supports links within the text, such as tables of contents and linked endnotes, as well as external links, which may point to resources such as informational websites, publisher websites, and license terms. For example, the colophon of The Book of Tea links to a Wikipedia article about Claude Monet, as his painting Bridge over a Pond of Water Lilies is used as the cover. A URL pointing to the Creative Commons license page (CC0) allows users to see the terms of use for this book.
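For readers curious about how the two kinds of links differ in practice, here is a minimal Python sketch that sorts the links in one EPUB content document into internal and external ones. It is an illustration only, not how the loc.gov viewer works. The sample markup, the file name `endnotes.xhtml`, and the names `LinkCollector` and `classify_links` are all hypothetical, and the sketch assumes that any `href` with an `http` or `https` scheme is external.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, split into internal and external."""

    def __init__(self):
        super().__init__()
        self.internal = []
        self.external = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Assumption: http(s) links leave the book; everything else
        # (relative paths, fragment identifiers) stays inside it.
        if urlparse(href).scheme in ("http", "https"):
            self.external.append(href)
        else:
            self.internal.append(href)


def classify_links(xhtml: str):
    """Return (internal_links, external_links) for one content document."""
    collector = LinkCollector()
    collector.feed(xhtml)
    return collector.internal, collector.external


# Hypothetical sample markup, loosely modeled on a linked endnote
# and a colophon link like the one described above.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p>See <a href="endnotes.xhtml#note-1">note 1</a>.</p>
<p>Cover art by <a href="https://en.wikipedia.org/wiki/Claude_Monet">Claude Monet</a>.</p>
</body></html>"""

internal, external = classify_links(SAMPLE)
print("internal:", internal)
print("external:", external)
```

A reading system can resolve internal links without leaving the book, while external links depend on the linked website remaining available.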

Title cover of Okakura Kakuzo's book "The Book of Tea," which features Claude Monet's Water Lilies and the Japanese Bridge. The righthand image is the colophon for the same book.

The colophon (right) in The Book of Tea (left) includes a link to the Wikipedia article about Claude Monet.

The Library of Congress Recommended Formats Statement (RFS) “identifies hierarchies of the physical and technical characteristics of creative formats, both analog and digital, which will best meet the needs of creators, publishers, and cultural heritage institutions, maximizing the chances that creative content will survive and continue to be accessible well into the future.” The RFS is a tool that helps inform acquisition decisions at the Library, and EPUB 3 is included as a preferred format for textual works in digital form. The Format Description Document for the EPUB File Format Family, part of the Sustainability of Digital Formats site, describes the EPUB format in detail and provides information about sustainability, accessibility, and the history and development of the format.

Accessibility is an essential benefit EPUBs offer to users: the format has the capacity to support many features that help ensure all users will be able to enjoy the e-books. EPUBs are compatible with screen reader technology, and they allow for alt text for images in the e-book. The table of contents in an EPUB can serve as a navigation aid, and structured metadata helps with navigation as well as discoverability. The 2017 EPUB Accessibility specification from the IDPF outlines requirements for EPUBs to conform to accessibility standards. While not all EPUBs conform to this specification, many publishers and creators are designing their EPUB e-books with these provisions in mind.
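To make the metadata and alt text points concrete, here is a rough Python sketch that opens an EPUB as a ZIP archive, follows `META-INF/container.xml` to the package document, reads the title and language, and lists images that have no `alt` attribute. It is an illustration under stated assumptions, not a conformance check against the EPUB Accessibility specification and not a description of the loc.gov viewer. The sample book built in memory, its file names, and the functions `build_sample_epub` and `inspect_epub` are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container, package document, and XHTML.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}
XHTML_IMG = "{http://www.w3.org/1999/xhtml}img"


def build_sample_epub() -> bytes:
    """Build a tiny, hypothetical EPUB-like archive in memory for the demo."""
    container = (
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container"><rootfiles>'
        '<rootfile full-path="OEBPS/content.opf" '
        'media-type="application/oebps-package+xml"/>'
        "</rootfiles></container>"
    )
    package = (
        '<package xmlns="http://www.idpf.org/2007/opf" version="3.0">'
        '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:title>Sample Book</dc:title><dc:language>en</dc:language>"
        "</metadata></package>"
    )
    chapter = (
        '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        '<img src="cover.jpg" alt="Green book cover"/>'
        '<img src="figure1.png"/>'
        "</body></html>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("META-INF/container.xml", container)
        zf.writestr("OEBPS/content.opf", package)
        zf.writestr("OEBPS/chapter1.xhtml", chapter)
    return buf.getvalue()


def inspect_epub(data: bytes) -> dict:
    """Read basic metadata and list images that lack an alt attribute."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # container.xml points to the package document (the .opf file).
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", NS).get("full-path")
        package = ET.fromstring(zf.read(opf_path))
        title = package.findtext("opf:metadata/dc:title", namespaces=NS)
        language = package.findtext("opf:metadata/dc:language", namespaces=NS)

        missing_alt = []
        for name in zf.namelist():
            if not name.endswith(".xhtml"):
                continue
            doc = ET.fromstring(zf.read(name))
            for img in doc.iter(XHTML_IMG):
                # An empty alt="" is valid for decorative images, so only
                # a completely absent attribute is flagged here.
                if "alt" not in img.attrib:
                    missing_alt.append((name, img.get("src")))
    return {"title": title, "language": language, "missing_alt": missing_alt}


result = inspect_epub(build_sample_epub())
print(result)
```

A check like this only confirms that alt text is present, not that it is meaningful; judging the quality of a description still requires a human reader.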

We hope the EPUB viewer on loc.gov enhances the reading experience for everyone. The Open Access Books Collection is growing every month, and development of the EPUB viewer is ongoing. Check it out and let us know what you think below!
