
community managers carlyn osborn, lauren algee, and abby shelton, in the great hall at the library of congress
'By the People' community managers (L-R) Carlyn Grace Osborn, Lauren Algee, and Abby Shelton.

Celebrating 5 Years of By the People


It’s that time again: another By the People anniversary!

By the People (BtP), the Library of Congress crowdsourced transcription program, is taking a moment this winter to look back on how we’ve grown and celebrate our 5th year! As we’ve shared in earlier birthday celebrations, BtP was originally incubated in 2018 by the LC Labs team and was designed to engage volunteers by inviting them to explore and transcribe documents from the Library’s digital collections. Once volunteer-created transcriptions are completed, they are integrated back into the Library’s website, where full-text transcriptions make materials more discoverable and accessible to all users.

selected items from past transcription campaigns. Transcribe history with By the People.
Over the last 5 years, By the People has published nearly one million pages of Library of Congress digital collections for transcription and engagement by volunteers. Visit crowd.loc.gov to learn more!
The last five years

In our five-year history, By the People has released nearly 1 million pages for transcription on crowd.loc.gov. Over 37,000 registered contributors (and even more anonymous transcribers!) have completed 725,000 transcriptions of Library materials, and we’ve published nearly 300,000 of those transcriptions back into loc.gov alongside their original objects. We’ve also released bulk transcription datasets for 19 completed transcription campaigns.

All of our work focuses on finding new ways to further our commitment to volunteers by continually improving the transcription experience, introducing them to interesting collections, and putting their contributions to use. So, what have the last 5 years looked like for By the People?

We launched in 2018 with 5 campaigns from the collections of the Manuscript Division. Since then, we’ve hosted an additional 35 campaigns that have included materials from the American Folklife Center, Music Division, Rare Book and Special Collections Division, and the Law Library of Congress. These campaigns have included diaries, letters, sheet music, theatre programs, subject files, and field notebooks from Presidents, artists, writers, teachers, philosophers, activists, soldiers, and more. We’ve also celebrated both the anniversary of the 19th Amendment and the transcription of a unique 54-foot-long South Carolina voting rights petition signed in 1865 (below), and we marveled when our volunteers completed nearly 100,000 pages of early Copyright records.

Librarian of Congress Carla Hayden and Civil War historian Michelle Krowl stand over an unrolled 54 foot long scroll.
Librarian of Congress Carla Hayden and Civil War historian Michelle Krowl from the Manuscript Division discuss the 1865 petition. Photo by Shawn Miller, Library of Congress.

Our volunteers tell us that, in addition to getting to know these people and eras of history firsthand, they contribute because of the real-world impact they are having on these materials and on future researchers. To hear more from volunteers in their own words, check out these “Volunteer Vignettes” we’ve published over the last year or so: Student and Teacher team up to transcribe Federal Theatre Project playbills, Transcribing Spanish history, and The Youngstown State University Transcribing Club.

New website features in 2023 helped improve the transcription experience

With support from a new development contract in 2022, we’ve rolled out new user-friendly features for our site, including new volunteer hours verification tools, image filtering options, an OCR button, an Ask a Librarian contact form, campaign retirement (more on that below!), and iterative improvements to the transcription user interface. Many of these features come directly from volunteer feedback over the years, and the team continues to conduct regular user testing to make sure the site is serving volunteer needs.

One of those new 2023 features is a suite of image filtering tools. If a transcriber comes across a blurry scan or an old microfilm image, they now have the option of adjusting the image’s brightness, inverting the colors, or adjusting the contrast. These tools have really come in handy for some of our more challenging sets of materials!
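For the curious, here is a rough idea of what filters like these do to an image. This is a minimal Python sketch, not the code that runs on crowd.loc.gov: the function names and the tiny sample "scan" are made up for illustration, and it assumes a grayscale image stored as rows of pixel values from 0 (black) to 255 (white).

```python
def clamp(value):
    """Keep a pixel value inside the valid 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def adjust_brightness(pixels, offset):
    """Add the same offset to every pixel (positive brightens, negative darkens)."""
    return [[clamp(p + offset) for p in row] for row in pixels]


def adjust_contrast(pixels, factor):
    """Stretch (factor > 1) or squeeze (factor < 1) values around mid-gray (128)."""
    return [[clamp(128 + (p - 128) * factor) for p in row] for row in pixels]


def invert(pixels):
    """Swap light and dark, which can help with negative microfilm images."""
    return [[255 - p for p in row] for row in pixels]


# Hypothetical 2x2 "scan", purely for illustration.
scan = [[10, 60], [200, 250]]

brighter = adjust_brightness(scan, 20)
punchier = adjust_contrast(scan, 1.5)
negative = invert(scan)
```

Each filter is a simple per-pixel calculation, which is why adjustments like these can be applied instantly while a volunteer is reading a page.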

Another recent feature we’re particularly excited about is the introduction of an OCR (optical character recognition) tool to help volunteers with typed and printed text. Volunteers can select a new “Transcribe with OCR” button under the transcriptions box to generate text for a page. This text provides a great starting point for things like newspapers, letterhead, book proofs, or printed ephemera. We hope this new feature will free up our volunteers’ time and energy so they can focus on manuscript and non-machine-readable collection items.

Additionally, thanks to our fabulous and intrepid developers, as of this year we can “retire” completed campaigns. Our transcription platform, Concordia, was always intended to be a “passthrough application,” meaning that it wasn’t designed to host Library content and transcriptions forever! We pull materials into crowd.loc.gov, where volunteers transcribe them, and then we extract and publish those transcriptions back to loc.gov. Now that we can retire campaigns, we can remove data for completed campaigns from our site to improve performance and enable sustainable growth, while still ensuring that volunteer contributions persist. Check out our post on the Signal blog to learn more about the retirement process: The crowdsourced transcription lifecycle – from conception to retirement.

Looking forward to 2024

We’re looking forward to introducing even more new tools in the coming year! In 2024, we’ll continue rolling out feature improvements for the website, including enhanced transcription instructions. We’ll also be offering new ways to interact with transcription data through a set of Jupyter notebooks, and with other volunteers through a new series of virtual office hours starting in January 2024.
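To give a flavor of what working with transcription data in a notebook might look like, here is a small Python sketch. It is only an illustration under stated assumptions: the sample rows are invented, and the column names (Item, Asset, Transcription) are hypothetical stand-ins, so check the headers of any real dataset before adapting it.

```python
import csv
import io

# Invented sample data; column names are assumptions, not a documented schema.
SAMPLE = """Item,Asset,Transcription
letter-001,page-1,"Dear Sir, I write to you from Washington."
letter-001,page-2,"Yours truly"
diary-007,page-1,"Rain all day."
"""


def words_per_item(csv_text):
    """Total the number of transcribed words for each item in a CSV export."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        count = len(row["Transcription"].split())
        totals[row["Item"]] = totals.get(row["Item"], 0) + count
    return totals


totals = words_per_item(SAMPLE)
for item, count in totals.items():
    print(f"{item}: {count} words")
```

The same pattern (read rows, group by item, summarize) extends to other questions, such as counting pages per item or searching transcriptions for a name.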

And what’s the latest from By the People? We had a very musical fall, with two new campaign launches in partnership with our friends in the Music Division. In September we released over 100,000 pages of Sheet Music of the Musical Theater. And just last week we published a campaign featuring writings to, from, and by Leonard Bernstein! Check out this great video from the Bernstein curator Mark Horowitz to learn more about the collection and its history here at the Library.

Somewhere... There's a place for us: Somewhere, some place for us. Peace and quiet and sun and air Wait for us Somewhere There's a time for us: We'll find the time for us Time to take what the world can give, Time to love, Time to live - Someday We'll have a city Truer than dreams. Someday, Maria,
This page from the Leonard Bernstein: Writings By, From, and To campaign was transcribed by volunteers after the campaign launched in December 2023. Link to completed transcription.

Be sure to mark your calendars for Wednesday, February 14, 2024, as we celebrate Douglass Day with our collaborators from douglassday.org and the Manuscript Division. We’re teaming up for a transcribe-a-thon of Frederick Douglass correspondence to celebrate Douglass’ chosen birthday, and we hope you can join us.

Stay in touch with us!

If you want to keep up to date with By the People and follow along as we go into year 6, you can subscribe to our newsletter. Let us know what you think or ask questions about By the People on our discussion forum in History Hub or via our Ask a Librarian form. Thank you to everybody who helped us go from Day 1 to Day 1825 – happy new year!


Previous versions of this blog post appeared in the Library of Congress Staff Gazette and the Library of Congress Magazine.
