LC Labs Welcomes Computing Cultural Heritage in the Cloud (CCHC) Researchers!

June 2021

LC LABS LETTER
A Monthly Roundup of News and Thoughts from the Library of Congress Labs Team

Welcome, CCHC researchers!

Funded with a 2019 $1 million grant from the Andrew W. Mellon Foundation, the Computing Cultural Heritage in the Cloud (CCHC) initiative aims to better serve research and creative uses of Library of Congress resources. Using the affordances of cloud-based technology, the initiative will document what is required to support this work, from levels of staff support to the costs associated with serving and transforming digital materials.

This year, LC Labs has partnered with three scholars whose individual research projects will use cloud computing services to explore the Library’s digital collections at scale.

The projects of Lincoln Mullen, Lauren Tilton, and Andromeda Yelton are impressively varied in their aims: Mullen attempts to use machine learning to extract biblical quotations across the Library’s collections; Tilton seeks to examine approximately 250,000 early 20th-century images by designing and refining computer vision methods; and Yelton plans to create an interactive data visualization that clusters conceptually similar documents, supporting users who have only a rough idea of the items they’re looking for. This recent article in the Wall Street Journal details how these approaches seek to reimagine what is possible through “search.” Furthermore, each project includes a public humanities focus and intends to engage audiences in transforming access to knowledge.

These projects will collectively inform the Library of Congress’ understanding of the benefits and challenges of using distributed computing environments in large-scale digital library settings. Outcomes from the individual projects will be documented and shared openly as a complement to the findings from the institution’s overarching investigation. Read more in a recent Washington Post piece about how this open approach to innovation supports the Library’s digital transformation.

 

More about the researchers…

Lincoln Mullen, “America’s Public Bible: Machine-Learning Detection of Biblical Quotations Across LOC Collections via Cloud Computing.”

Lincoln Mullen is an associate professor at George Mason University in the Department of History and Art History and director of computational history at the Roy Rosenzweig Center for History and New Media. Dr. Mullen is no stranger to Library collections, having won first place in the 2016 Chronicling America Data Challenge. His current project builds on past work that identified biblical quotations within a vast corpus of historic digitized newspapers. Now, he seeks to diversify the range of sources informing the creation of what he calls “America’s Public Bible.”

Lauren Tilton, “Access & Discovery of Documentary Images”

Lauren Tilton is an assistant professor of digital humanities at the University of Richmond in the Department of Rhetoric & Communication Studies and co-director of Photogrammar and the Distant Viewing Lab. Dr. Tilton’s project will look for ways computer vision methods could be improved to better consider context and enhance discovery of images across collections from the early 20th century.

Andromeda Yelton, “Situating Ourselves in Cultural Heritage: Using Neural Nets to Expand the Reach of Metadata and See Cultural Data on Our Own Terms”

Andromeda Yelton is a software engineer and professionally trained librarian. Her project will use machine learning and “fuzzy search” to help users discover and navigate Library collections in ways that go beyond the methods currently available to them. Ultimately, Yelton aspires to make searching the Library’s collections easier and more accessible for a host of potential users and researchers.

Stay informed

You can read more about the researchers in this recent Library of Congress press release. As the project progresses, updates and calls for public participation will be shared on the CCHC experiments page on labs.loc.gov, and you can always ask us questions at [email protected].

 

To subscribe to the monthly LC Labs Letter, visit //updates.loc.gov/accounts/USLOC/subscriber/new?topic_id=USLOC_182

For more information about LC Labs, visit us at https://labs.loc.gov/

Questions? Contact LC Labs at [email protected]

Developing a New Digital Collections Strategy at the Nation’s Library

Today’s guest post is from Joe Puccio, Collection Development Officer at the Library of Congress. Tremendous progress has been made by the Library of Congress in acquiring born-digital content as part of a coordinated strategy presented in its 2017 Digital Collecting Plan and previously reported in the Signal. With that plan now in its fifth […]

An Archivist’s Perspective on Legacy Files

In this post, 2020 Staff Innovator Chad Conrady discusses his area of expertise, emulation, which imitates older operating systems in order to open outdated or legacy files that are no longer operable with contemporary operating systems or software.

 

Analyzing the Born-Digital Archive

Kathleen O’Neill is a 2020 Staff Innovator with LC Labs and a Senior Archivist in the Manuscript Division at the Library of Congress. In this post, she discusses her analysis of the various file formats in the Manuscript Division’s born-digital holdings.

Newspaper Navigator Search Application Now Live!

On September 15, 2020, the Library of Congress announced the release of Newspaper Navigator, an experimental web application that makes 1.5 million photographs from the Chronicling America dataset available for the public to explore for the first time. Read more about the design and features of the project below or jump straight to the newly launched application at //news-navigator.labs.loc.gov/search!

Metaphors for Understanding Born Digital Collection Access: Part III

Kathleen O’Neill is currently serving as one of two Staff Innovators at the Library of Congress. Their 2020 project, Born Digital Access Now!, explores existing pathways for accessing born-digital materials in the Manuscript Division. In this series of blog posts, Kathleen describes the complexities of gaining access to born-digital materials through the lens of three different metaphors. Up first was “Media Format, or, Have Fun Storming the Castle!” The second blog post discussed “Legacy File Formats and Operating Systems or Lost in Translation.” In this third and final post in the series, Kathleen carefully explains the process of emulation and makes it feel less like “strange magic.”