More Open eBooks: Routinizing Open Access eBook Workflows

This is a guest post by Kristy Darby, a Digital Collections Specialist in the Digital Content Management Section in Library Services.

Figure 1. Youjeong Oh’s Pop City: Korean Popular Culture and the Selling of Place, one of the open access books available from the Library of Congress collections.

We are excited to share that anyone, anywhere can now access a growing online collection of contemporary open access eBooks from the Library of Congress website. For example, you can now directly access books such as Cory Doctorow’s Little Brother, Yochai Benkler’s The Wealth of Networks, and Youjeong Oh’s Pop City: Korean Popular Culture and the Selling of Place. All of these books have been made broadly available online in keeping with the intent of their creators and publishers, who chose to publish these works under open access licenses.

A key objective of the Library of Congress digital collecting plan is the development and implementation of an acquisitions program for openly available content. We have previously discussed a number of open access book projects, including open access Latin American books and open access children’s books. Significantly, the Library of Congress has long been receiving print copies of open access books through multiple routine acquisition streams. These openly licensed works can be made much more broadly accessible in their digital form.

These books are the result of a pilot effort of the Digital Content Management Section (DCM). DCM staff, in collaboration with the Collection Development Office (CDO), identified books available through the Directory of Open Access Books (DOAB) that the Library already holds in print. DOAB is a digital directory that provides access to academic peer-reviewed books available under open access licenses.

While all the books in DOAB could potentially be considered for addition to the Library of Congress collections, all books added to the collection go through a selection process whereby subject matter experts determine which works are in scope based on the collection policy statements. By identifying matches in DOAB to print holdings of the Library of Congress, we could identify a set of works for which a selection decision had already been made.

Analyzing DOAB Data

Identifying books to include in this pilot project required some data crunching. DOAB provides metadata about all eBooks available through its service, so staff compared ISBNs from DOAB books against ISBNs for books in the Library’s catalog. The matches identified books that the Library holds in print and that are also included in DOAB, which gave DCM staff a list of books to work on as part of this pilot. DCM staff carefully inspected each book to ensure that the creator had licensed it under an open access license, such as Creative Commons.
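The post does not describe the tooling used for this comparison, so the following is only a minimal sketch of how an ISBN match of this kind might be done. The record layout (dictionaries with "title" and "isbns" keys) and the function names are assumptions made for illustration, not DOAB's or the Library's actual data formats. ISBNs are normalized to 13 digits first so that an ISBN-10 in one source can match the equivalent ISBN-13 in the other.

```python
import re


def normalize_isbn(raw):
    """Return a 13-character ISBN string, or None if the input is not ISBN-shaped."""
    cleaned = re.sub(r"[^0-9Xx]", "", raw).upper()
    if len(cleaned) == 13 and cleaned.isdigit():
        return cleaned
    if len(cleaned) == 10 and cleaned[:9].isdigit():
        # Convert ISBN-10 to ISBN-13: prefix 978, drop the old check digit,
        # then recompute the check digit with alternating weights 1 and 3.
        core = "978" + cleaned[:9]
        total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
        return core + str((10 - total % 10) % 10)
    return None


def find_matches(doab_records, catalog_isbns):
    """Return the DOAB records that share at least one ISBN with the catalog.

    doab_records: iterable of dicts with hypothetical keys "title" and "isbns".
    catalog_isbns: iterable of ISBN strings taken from the print catalog.
    """
    catalog = {normalize_isbn(i) for i in catalog_isbns}
    catalog.discard(None)
    matches = []
    for record in doab_records:
        isbns = {normalize_isbn(i) for i in record.get("isbns", [])}
        if isbns & catalog:
            matches.append(record)
    return matches
```

A match found this way is only a candidate: as described above, each title would still need a manual check of its license before processing.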

Processing Open Access eBooks

Figure 2. The ILS record for Pop City : Korean Popular Culture and the Selling of Place which now includes a direct link to the copy of the work in the Library’s digital collections.

DCM staff established strategies for taking the book from the very beginning – identifying titles to process – to the end, which is full and open access on loc.gov. Because the Library already holds the print books as part of its collection, the Library’s catalog includes a MARC bibliographic record for the book. DCM staff, with the help of staff from the Integrated Library Systems Program Office, developed a method of cloning and then transforming the metadata to create new records for the eBook counterparts, making them discoverable in the catalog.

These new eBook records use the MARC 540 field to state the terms of the open license under which the work is provided. Staff made necessary changes to the records to ensure that the books and the accompanying metadata would display correctly on loc.gov. DCM staff downloaded the eBook files from DOAB and processed the files for presentation on loc.gov as well as for long-term preservation. Once processing was complete, the content and metadata went live and the DOAB eBooks became available on loc.gov.
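To make the clone-and-transform idea concrete, here is a simplified sketch that is not the Library's actual method. It models a record as a plain dictionary mapping MARC tags to lists of subfield dictionaries, which is an assumption made for illustration, as are the function name and the sample values. The only details taken from the post are that the eBook record is derived from the print record, carries license terms in field 540, and links to the digital copy; the use of field 856 for that link is this sketch's assumption, based on the standard MARC field for electronic location.

```python
import copy


def make_ebook_record(print_record, license_text, license_url, ebook_url):
    """Clone a print record and add license and link fields for the eBook.

    print_record: dict mapping MARC tags to lists of subfield dicts
    (a simplified, hypothetical stand-in for a real MARC record).
    """
    # Deep copy so the original print record is left untouched.
    ebook = copy.deepcopy(print_record)
    # 540: terms governing use ($a text, $u URI of the license).
    ebook.setdefault("540", []).append({"a": license_text, "u": license_url})
    # 856: electronic location ($u URI of the digital copy).
    ebook.setdefault("856", []).append({"u": ebook_url})
    return ebook
```

A real transformation would touch more of the record than this, since the post notes that staff made further changes so the metadata would display correctly on loc.gov.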

Expanding Access and Enhancing Resilience in the Commons

Figure 3. A view of some of the book covers for open access titles now available through the Library of Congress digital collections.

The books added to the collection through the DOAB pilot are digital versions of print books already held by the Library. The print books are only available to researchers who visit one of the reading rooms at the Library of Congress in Washington, DC and sign up for a reader card. The eBooks are openly available on loc.gov without any restrictions. There is no travel, registration, or authentication necessary.

These eBooks are available to anyone in the world with an internet connection. Also, by collecting the eBooks in addition to the print books, the Library commits to preserving the digital content and providing lasting access to it. While it would be possible to simply link to copies of these books hosted elsewhere, as many libraries do, the Library of Congress is invested in the long-term preservation of content added to its collection. By acquiring the digital files for these works, the Library is helping to support enduring access to them for communities around the world.

To explore how these workflows and processes would scale, DCM staff processed and provided access to over three hundred OA eBooks recorded in DOAB over a three-month period. The workflows created, codified, and documented during the pilot project will now be used to support routine OA eBook processing not only for DOAB, but for any OA eBook projects in which DCM is involved.

We are excited to continue to refine and improve this process. You can find books like these alongside other open access books in the Library’s collection at this link: //www.loc.gov/search/?fa=partof:open+access+books.
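The search link above can also be read programmatically, because adding fo=json to a loc.gov search URL returns the results as JSON. The sketch below shows one way to do that with the Python standard library. The function names are invented for illustration, and the assumption that each item sits in a top-level "results" list with "title" and "url" keys should be checked against the API documentation before relying on it.

```python
import json
import urllib.request


def as_json_url(search_url):
    """Return the search URL with fo=json appended."""
    if search_url.startswith("//"):
        # Protocol-relative link as written in the post; assume https.
        search_url = "https:" + search_url
    separator = "&" if "?" in search_url else "?"
    return search_url + separator + "fo=json"


def fetch_json(url):
    """Download and parse a JSON response (performs a network request)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def titles_and_links(payload):
    """Pull (title, url) pairs from a parsed response.

    Assumes items are listed under a top-level "results" key.
    """
    return [(item.get("title"), item.get("url")) for item in payload.get("results", [])]
```

Calling titles_and_links(fetch_json(as_json_url("//www.loc.gov/search/?fa=partof:open+access+books"))) would list the titles in a single response. Because books are added routinely, the output will change over time, and a full list would likely require following the response's pagination.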

5 Comments

  1. Jennifer Sauer
    March 26, 2020 at 10:48 am

    Fantastic! Thank you for sharing your process. This will be more and more meaningful as OA continues to expand and evolve. Congratulations!

  2. J. Ripp
    March 26, 2020 at 11:46 am

    Well, this is just pretty great, isn’t it?

    Especially pleased at your recognition of the importance of reliable infrastructure: hosting the books at LC is a significant step to consolidating resource location and digital preservation.

  3. JJ Harbster
    March 26, 2020 at 12:00 pm

    So happy to see this launch. For me, the preservation of open access content is extremely important. There are multiple access points to this content including, now, LC. But the question for me is “Who is preserving it?” Answer: Library of Congress!!! Yay!

  4. Emily
    March 26, 2020 at 12:17 pm

    Is it possible to publish a spreadsheet of the titles, OCNs, and URLs? Thank you.

  5. Trevor Owens
    March 26, 2020 at 4:06 pm

    Hi Emily,

    Books are being routinely added, so the list is changing over time.

    The best way to get a list of titles from this is to query the loc.gov API based on the “part of” that the books are associated with.

    For the API, you can take the search URL, like //www.loc.gov/search/?fa=partof:open+access+books, and add &fo=json to it. In this case, that will look like this: //www.loc.gov/search/?fa=partof:open+access+books&fo=json

    The JSON that comes back from that URL has data about each of the titles.

    For more information on how to use the API you can read up on documentation about it here https://libraryofcongress.github.io/data-exploration/
