Looking Back and Forward with LC Labs

Last year, LC Labs worked with partners across the Library and outside its walls to advance the Digital Strategy. Here’s a look back at some of our work on the strategy’s goals of opening the treasure chest, connecting, and investing in our future, and a preview of this year’s plans.

In the coming year, we hope you’ll join us as we experiment and explore! Stay tuned in this space and our social media channels for job and contracting opportunities, collaboration opportunities, and more news of the digital transformation at America’s Library.

Opening the Treasure Chest

Collaboration is key for nearly all of our projects, but especially those whose activities intersect with the Digital Strategy’s call to exponentially grow our collections, to maximize the use of content, and to support emerging styles of research.

Experimenting in the Cloud

For nearly three decades, the Library has designed collections and produced a massive treasure trove of digital content. How can we enable and empower researchers to use that data to its fullest potential? Thanks to a grant from the Andrew W. Mellon Foundation, we are now experimenting with sharing digital collections at scale. The Computing Cultural Heritage in the Cloud project enables us to test models for serving digital content to users in the cloud computing environment. We spent the bulk of 2020 developing a model for documenting our digital collections and doing other behind-the-scenes work to set the stage for incoming staff success. We entered 2021 poised to identify research interests that will help define possibilities for this project and to assess existing service models; more on this work in the coming months.

A diagram showing the proposed grant-funded model where accessing, subsetting, reformatting, and analysis all happen in a zone of shared responsibility between the Library and the researcher.

Experimenting with providing data and content at scale.

Born Digital Access Now!

An increasing amount of cultural heritage material is born digital, but as formats and obsolescence multiply, so do the challenges to accessing this content. Our first LC Labs Staff Innovators, Kathleen O’Neill and Chad Conrady, explored tools and modes of access for born digital materials held by the Library’s Manuscript Division. After recommending tools and potential modes of access tailored to individual collections, they created a prototype digital workstation for processing, preserving, and providing access to these materials.

Connecting

What does it take to connect with all Americans? The Digital Strategy identifies starting points: inspire a relationship with every visit, bring the Library to our users, welcome other voices, and drive momentum in our communities.

Engaging the Crowd

Connecting with and engaging users are key to the Library’s mission. At the start of 2020, the Librarian’s signature crowdsourcing initiative, By the People, continued to flourish and transitioned to a permanent home at the Library. In just two years, By the People has offered hundreds of thousands of pages for transcription across twenty campaigns, including Alan Lomax field materials, Civil War diaries and letters, Branch Rickey’s baseball scouting records, and selections from Rosa Parks’ papers. Become a volunteer today and dig into the papers of spiritualist Frederick Hockley from the Houdini Collection, letters to Theodore Roosevelt, or new campaigns launching in 2021. Sign up for updates from the By the People team and mark your calendar for Douglass Day programming beginning on February 12. The collaboration with the Colored Conventions Project will feature the vision, life, and experience of Mary Church Terrell through her papers and other materials.

A screenshot from the By the People crowdsourcing platform.

Innovation in Residence

LC Labs established the Innovator in Residence program to support innovative uses of our collections and to expand public interest in and engagement with them. This year’s projects, Newspaper Navigator and Citizen DJ, followed through on that promise!

Innovator in Residence Ben Lee recognized that, while the Library’s millions of digitized newspaper pages are easily text-searchable thanks to Optical Character Recognition software, there is no similar tool to find images. Lee built on a corpus of segmented images from newspapers made possible through previous Innovator Tong Wang’s Beyond Words project. With these and hand-annotated classification, Lee developed a workflow and used machine learning to harvest 100 million photographs, illustrations, cartoons, and maps, and created a user interface for the public to explore using machine-learning-assisted searches. Newspaper Navigator is available for you to explore. Interested in digging deeper? You can read Lee’s own critique of the project’s limitations in his data archaeology, as well as spin up your own investigations with code and derivative datasets.

A screenshot of a web application showing images of baseball players from historic newspapers. The top is a selection of photos that closely resemble baseball players. The bottom displays negative examples, images that are not of baseball players.

Newspaper Navigator allows users to refine searches based on visual content, as in this search for images of “baseball players.”

The Library holds major collections of sound and moving image recordings, and hip hop and other musicians work with audio samples to create music. Seeing that connection, Innovator in Residence Brian Foo created the Citizen DJ application, which provides free-to-use sounds culled from Library collections for creating hip hop and other sample-based music. The interface allows users to explore, remix and combine with beats, and download sounds for use with other software; find Foo’s code and documentation in this repo. Along the way, Foo and our colleagues in AFC connected with classrooms and groups including PATH to teach creative expression, music production, and information literacy through Library of Congress collections. Get your headphones on and listen to Brian’s free-to-use album Tracks from the Stacks, and read his guide on ethics and sampling.

Screenshot of Citizen DJ

The Citizen DJ application provides free-to-use sounds culled from Library collections for creating hip hop and other sample-based music.

Investing in our Future

The third part of the Digital Strategy is a call to invest in the future, building on past work and creativity while looking forward to future needs. Connecting a desired future with our capability and understanding current user needs and infrastructure presents opportunities to identify where we can invest near-term resources and attention.

The Ins and Outs of Machine Learning

The Digital Strategy pushes us to look to the future, considering the tools and technologies most likely to play a role in 21st century libraries. Last year’s “Season of Machine Learning” let us explore the opportunities and challenges of applying artificial intelligence to library collections. We collaborated with the University of Nebraska-Lincoln’s Project Aida for insights and recommendations, shared the outcomes of 2019’s Machine Learning + Libraries Summit, and released an expert state-of-the-field report by Professor Ryan Cordell. We also took next steps highlighted in the University of Nebraska-Lincoln’s Aida Intelligent Data Analytics report and Cordell’s Machine Learning + Libraries recommendations, beginning an experiment that integrates crowdsourcing and machine learning in September; we expect to share more about the Humans in the Loop project in spring 2021.

Sounding Out Audiovisual Preservation and Access

One of the next frontiers in digital transformation lies in the vast collections of audiovisual materials held by libraries and cultural heritage institutions, including our own. Last year, we worked with Library partners to test the possibilities for implementing speech-to-text transcription tools, using digital spoken-word collections from the American Folklife Center. We are collaborating on a generous Andrew W. Mellon Foundation grant to The University of Texas at Austin supporting preservation and promotion of audiovisual materials, as well as a partnership between Zooniverse and the Library’s American Folklife Center to improve audiovisual transcription workflows. The Library also hosted the I\V/A\V/ Informal Virtual Audiovisual Summit to share and learn about improving access to A/V content. The I\V/A\V Summit brought together hundreds of people for a full day of discussion of the state of the field, accessibility, and future directions, as well as the impact of operational changes during the COVID-19 crisis.

Junior Fellows Join Remotely

Each year, the Library welcomes a host of Junior Fellows to its Summer Intern Program. In the summer of 2020, LC Labs helped re-imagine the first-ever remote version of the program while hosting five fellows. Fellows Selena Qian, Emily Sienkiewicz, Hibba Khan, Tyler Youngman, and Nina Kostic explored Sanborn maps, Veterans History Project collections of WWI audio interviews, the Political Islam Web Archive and the Puerto Rico at the Dawn of the Modern Age collections, LC for Robots, and Serbian-American history materials contained in Library collections. You can check out their projects, and what their creators had to say about them, in the 2020 Junior Fellows Display Day.

Screenshot of Sanborn Maps Navigator

2020 Library Junior Fellow Selena Qian’s Sanborn Maps Navigator allows users to peruse the past with Sanborn Fire Insurance Company maps and historic newspaper images from Chronicling America.

Looking Ahead

These efforts and more build on decades of work and collaboration at the Library to boost momentum and capacity for the agency’s Digital Strategy.

And there’s so much more ahead! Our fourth Innovator in Residence, Courtney McClellan, is at work designing a collaborative annotation tool for students of all ages. We’re continuing our work with cultural heritage collections at scale, and experimenting with machine learning, crowdsourcing, and alternative access models. Stay tuned for news and opportunities to be involved!

Learn more about LC Labs at //labs.loc.gov, subscribe to the monthly LC Labs Letter, and follow us on Twitter @LC_Labs. You can reach us at [email protected].
