LC Labs Letter: December 2022

News from the Library of Congress Labs Team

Announcing the LC Labs Data Sandbox

As readers may remember from the editor’s issue on data and libraries, LC Labs has provided access to the Library’s collections in a machine-readable form since our inception.

With support from the Mellon Foundation, the Computing Cultural Heritage in the Cloud (CCHC) grant has allowed our team to advance these efforts, which began with the resources shared on the LC for Robots page, into a new experimental sandbox space for sharing data packages.

Specifically, the grant team designed the space to host three derivative data packages used in the CCHC Data Jam, an invitation-only event in October 2022 at which outside experts gave their input on what it was like to computationally access and engage with large Library of Congress collections datasets using cloud services. Read more about how we designed these data packages and made them publicly available in this detailed process post on the Signal Blog.


Hearing from Users: Computing Cultural Heritage in the Cloud Data Jam 

The Computing Cultural Heritage in the Cloud initiative pilots ways to combine cutting-edge technology with the collections of the largest library in the world to support digital research at scale.

The CCHC team has continually taken a user-centered approach to meeting our grant goals of recommending service models, cost implications, and technical affordances of providing access to cultural heritage collections as data in cloud-based environments. First, we hosted a cohort of research fellows whose work required them to analyze LC collections at scale. The CCHC Data Jam was our second round of public user engagement, with a heavier emphasis on understanding specific details about the technical setup of cloud-based storage environments and computational access pathways.

The Data Jam participants were experienced data wranglers from all over the world, all of whom were knowledgeable about the complexities of cultural heritage data. In a short, time-bound engagement, they recorded their feedback in real time and as authentically as possible. Now, anyone can watch these impressive feedback presentations via the event recording. For a written summary of event highlights, check out this post recapping the event on the Signal Blog.


How CCHC connects to Labs’ experiments with machine learning

In this end-of-year reflection, Sr. Innovation Specialist Meghan Ferriter shares how the Computing Cultural Heritage in the Cloud initiative is tightly coupled with the multifaceted explorations that are the hallmark of LC Labs work, and, specifically, our investigation of machine learning (ML) and artificial intelligence (AI).

Check out her post on the Signal Blog for a lucid explanation of how Labs experiments inform one another and how we build upon their outcomes in creative and iterative approaches.


  • ICYMI: New collections have made their way online since our last issue in September! Check out the Thanksgiving and Fall editions of What’s New.

