LC Labs Letter: October 2020

October 2020

A Monthly Roundup of News and Thoughts from the Library of Congress Labs Team

And the Winner Is…

Our Director of Digital Strategy, Kate Zwaard, was awarded a 2020 Theodore Roosevelt (Teddy) Government Leadership Award for her leadership in expanding the Library’s use of technology and encouraging deeper exploration and discovery of its collections.

Kate was recognized as one of two winners in the Pathfinder category for her work leading LC Labs and the launch of the By the People program. After its launch in 2018, the Library’s crowdsourced transcription platform registered over 11,500 volunteers who completed more than 37,000 transcriptions over the course of 2019. From middle school classrooms to college campuses and homes across the country, By the People has given all Americans a pathway to connect with the Library of Congress.

Having guided the program to a permanent programmatic home, Kate continues to bring the Library’s Digital Strategy to life while creating new bridges and strengthening connections with all users. In an interview with a Government Executive senior editor produced for the Teddy’s, Kate discusses the award and the impressive work being done throughout the Library and supported by LC Labs and the Digital Strategy Directorate (that’s us!).


Our Projects

Calling all researchers!

Apply for funds to do big data research with the Library of Congress! We have issued a Broad Agency Announcement (LCCIO20D0112) to recruit and contract with researchers as part of Computing Cultural Heritage in the Cloud (CCHC), a project funded by the Andrew W. Mellon Foundation. The Library is seeking to award contracts to four researchers to experiment with Library data at scale.

  • Proposed budgets should not exceed $77,500.
  • 2-page concept papers are due at 12:00 pm EST on November 30, 2020.
  • The research experiments will take place between May 1, 2021 and January 31, 2022.

See the Frequently Asked Questions (FAQ) for Interested Researchers and the full posting for more information. We will host two online sessions to share information about the BAA and answer questions. These sessions (known as Industry Days) will be held on 10/28/2020 at 1:00 pm Eastern Time and 10/29/2020 at 4:00 pm Eastern Time. Please refer to the Broad Agency Announcement (BAA) for information on how to register before 10/26/2020.

The goal of the CCHC project is to test and analyze a cloud-based service model for data-intensive research. The contracted researchers will join a small team of innovation specialists in LC Labs, along with a project documenter and reporter. Results will be shared throughout the project on the CCHC project page.


Looking Forward: Two New LC Labs Experiments

LC Labs is thrilled to announce two new collaborations on experiments to meet diverse user needs and explore possible models for combining automated methods and human-centric approaches.

We have engaged the UK firm Digirati on the Experimental Access to Digital Collections and Data: Analyzing and Demonstrating Affordances at Scale project to help us analyze how existing LC Labs projects (like Newspaper Navigator, Citizen DJ, and Southern Mosaic) and similar applications and approaches serve user needs. Recommendations and demonstrations based on user research, user testing, and prototyping will help clarify next steps toward realizing key goals in the Digital Strategy. This project will specifically explore how to support emerging styles of research, maximize the use of collections, connect to users where they are, and build toward the horizon.

We are collaborating with data management solutions provider AVP on the Humans in the Loop: Accelerating access and discovery for digital collections project. AVP has over a decade of experience working with the Library of Congress on projects focused on software, digital preservation, and best practices. Humans in the Loop will build on the foundations of the Library’s success with crowdsourcing to explore deepened engagement with collections, while foregrounding the role human expertise plays in machine learning. The project will result in at least two experimental prototypes or proofs of concept for at least two human-in-the-loop workflows, along with training data, code, and recommendations for combining crowdsourcing and machine learning.

Track their project pages in the coming months for updates on these two projects!


AudiAnnotate Audiovisual Extensible Workflow (AWE) Project

LC Labs and the Library’s American Folklife Center are partnering with The University of Texas at Austin and cultural heritage organizations across the country on this recently announced Andrew W. Mellon Foundation grant. The project will assemble a pipeline of open source tools to improve the sharing and annotation of audiovisual materials, while also exposing the human labor necessary to enhance collections and access. The grant will include workshops and openly documented project outcomes, including code and examples. Read more about the project in a press release from UT Austin, and stay tuned!

New Languages for NLP: Building Linguistic Diversity in the Digital Humanities

The National Endowment for the Humanities Office of Digital Humanities recently announced the awards for Institutes for Advanced Topics in the Digital Humanities.

One award went to project directors Natalia Ermolaev (Princeton University) and Andrew Janco (Haverford College), who are collaborating with the Digital Research Infrastructure for the Arts and Humanities (DARIAH) to support coordinated workshops that will teach scholars to create linguistic data and training models for Natural Language Processing for new languages. LC Labs will provide consultation to Institute participants seeking to create corpora from the Library of Congress digital collections for the workshops.


  • The Library of Congress National Book Festival continues! Author talks, Library of Congress booths, and live interviews are still available for your enjoyment.
  • 2020 Innovator in Residence Brian Foo was featured on the Ghana Music Project podcast. In this episode, Brian gives a breakdown and overview of Citizen DJ, the LC Labs experiment that allows visitors to sonically browse and sample free-to-use audio and moving image collections. You can read more about Brian’s design process, including how he incorporated feedback from user testing, in this blog post on the Signal.
  • Remote Workshops for Students! The Library of Congress recently unveiled a new series of virtual workshops for students in grades 3-8. Teachers who’d like to set up a workshop for their school or classroom can read more about them in this blog post and register by visiting this website.

To subscribe to the monthly LC Labs Letter, visit //

For more information about LC Labs, visit us at

Questions? Contact LC Labs at [email protected]


Analyzing the Born-Digital Archive

Kathleen O’Neill is a 2020 Staff Innovator with LC Labs and a Senior Archivist in the Manuscript Division at the Library of Congress. In this post, she discusses her analysis of the various file formats in the Manuscript Division’s born-digital holdings.

Newspaper Navigator Search Application Now Live!

On September 15, 2020, the Library of Congress announced the release of Newspaper Navigator, an experimental web application that makes 1.5 million photographs from the Chronicling America dataset available to the public to explore for the first time. Read more about the design and features of the project below or jump straight to the newly launched application!

Metaphors for Understanding Born Digital Collection Access: Part III

Kathleen O’Neill is currently serving as one of two Staff Innovators at the Library of Congress. Their 2020 project, Born Digital Access Now!, explores existing pathways for accessing born digital materials in the Manuscript Division. In this series of blog posts, Kathleen describes the complexities of gaining access to born digital materials through the lens of three different metaphors. Up first was “Media Format, or, Have Fun Storming the Castle!” The second blog post discussed “Legacy File Formats and Operating Systems or Lost in Translation.” In this third and final post in the series, Kathleen carefully explains the process of emulation and makes it feel less like “strange magic.”

Metaphors for Understanding Born Digital Collection Access: Part II

Kathleen O’Neill is currently serving as one of two Staff Innovators at the Library of Congress. Their 2020 project, Born Digital Access Now!, explores existing pathways for accessing born digital materials in the Manuscript Division. In this series of blog posts, Kathleen describes the complexities of gaining access to born digital materials even before they reach researchers. This is the second post in the series and focuses on legacy file formats through the metaphor of being “lost in translation.”

Metaphors for Understanding Born Digital Collection Access: Part I

The following is a guest post by Senior Archivist Kathleen O’Neill. Kathleen and her colleague Chad Conrady are currently working on a project called Born Digital Access Now! as the 2020 Staff Innovators in LC Labs. Their first blog post introduces the project in greater detail; it aims to provide greater access to born digital materials held in the Manuscript Division. Today’s post is the first in a series of three blog posts in which Kathleen will discuss different challenges or barriers to born digital collection access through the lens of three different metaphors. Up first is: “Media Format, or, Have Fun Storming the Castle!”