Let’s go! Explore, transcribe, and tag at crowd.loc.gov

This is a guest post from Lauren Algee, LC Labs Senior Innovation Specialist. Connect with Lauren and her fellow crowd.loc.gov Community Managers Elaine Kamlley and Victoria Van Hyning via History Hub and on Twitter, as well as GitHub.

What yet-unwritten stories lie within the pages of Clara Barton’s diaries, writings of Civil Rights pioneer Mary Church Terrell, or letters written to Abraham Lincoln? With today’s launch of crowd.loc.gov, the Library of Congress is harnessing the power of the public to make these collection items more accessible to everyone.

You are invited to join the Library of Congress via crowd.loc.gov to volunteer to transcribe (type) and tag digitized images of text materials from the Library’s collections. People who join us will journey through history first-hand and help the Library while gaining new skills – like learning how to analyze primary sources and read cursive.

Finalized transcripts will be made available on the Library’s website, improving access to handwritten and typed documents that computers cannot accurately translate without human intervention. The enhanced access will occur through better readability and keyword searching of documents and through greater compatibility with accessibility technologies, such as screen readers used by people with low vision.

Screenshot of album page needing Review

William Oland Bourne, in his own hand, now transcribed and tagged in crowd.loc.gov

The pages awaiting transcription in Campaigns on crowd.loc.gov represent the diversity of the Library’s treasures. Today, volunteers can choose to work on selections from the papers of Mary Church Terrell, letters the public wrote to Abraham Lincoln, Clara Barton’s diaries, Branch Rickey’s baseball scouting reports, or memoirs of disabled Civil War veterans. We’ll continuously add new materials. Coming soon are documents related to women’s suffrage, American poetry, the history of psychiatry, and more.

So how does it work? We import digitized items into the platform from loc.gov using our JSON API. Volunteers type what they see in an image, check transcripts created by others, and tag images. All of these tasks will help enhance existing collections metadata. We expect to release the first set of publicly transcribed materials in early 2019.
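For readers curious about the JSON API mentioned above, here is a minimal sketch of requesting the JSON view of a loc.gov page by adding the `fo=json` query parameter. This is not Concordia's actual importer: the function names `json_url` and `fetch_item` and the placeholder item ID `EXAMPLE-ID` are hypothetical and used only for illustration.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def json_url(page_url: str) -> str:
    """Return the JSON-view URL for a loc.gov page by setting fo=json."""
    parts = urlsplit(page_url)
    # Keep any existing query parameters and add (or overwrite) fo=json.
    query = dict(parse_qsl(parts.query))
    query["fo"] = "json"
    return urlunsplit(parts._replace(query=urlencode(query)))


def fetch_item(page_url: str, timeout: float = 30.0) -> dict:
    """Download and parse the JSON view of a page (requires network access)."""
    with urlopen(json_url(page_url), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # EXAMPLE-ID is a placeholder, not a real item; substitute a real item URL
    # before calling fetch_item().
    print(json_url("https://www.loc.gov/item/EXAMPLE-ID/"))
```

Only the URL is built when the script runs; calling `fetch_item` with a real item URL would return that page's metadata as a Python dictionary.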

Participatory projects like these are known as crowdsourcing, meaning that they invite the public – nonspecialists and specialists alike – to engage with collections and process information. This is not the Library’s first foray into these approaches.

We have long invested in building digitized collections and making them searchable. One of our first attempts to recruit volunteers to make collections easier to find began in 2008, when the Library’s Prints and Photographs Division published thousands of photographs on Flickr Commons. For over 10 years, this project has invited visitors to the photo-sharing site to help identify people and places in the photographs, generating additional rich information about them. Two additional crowdsourcing efforts within the Library – American Archive of Public Broadcasting’s FixIt+ and Library of Congress Labs’ Beyond Words – invited people to transcribe historic public broadcasting programs and to identify cartoons and photographs in the Library’s historic newspaper collections. These projects have demonstrated the passion of volunteers for the Library’s mission and programs, as well as the knowledge and expertise the public has to share with the Library.

Screenshot of GitHub issues page

The crowd.loc.gov team is actively developing and addressing issues in GitHub

Crowd.loc.gov runs on Concordia, open-source software developed using the user-centered design principles of trust and approachability. It has been open source from the beginning and is available in the Library’s GitHub repository. Because Concordia is open source, other libraries and organizations can use the code to create transcription projects focused on their own collections.

Have we got you curious? Good! Consider visiting crowd.loc.gov today to contribute to our Abraham Lincoln Papers Challenge. We hope to inspire volunteers to finish transcribing over 10,000 items from the papers by the end of 2018. Help us meet our goal by transcribing at least one page; then share your work with others in History Hub!

8 Comments

  1. Lynn Davis
    October 27, 2018 at 9:18 am

    What a wonderful opportunity to bring history to the people!

  2. Sharon von See
    October 29, 2018 at 3:36 pm

    I am a Library of Congress certified braille transcriber with time on my hands, so would love to help out in digitizing some of the materials!

  3. John R Barton
    October 29, 2018 at 5:02 pm

    Will transcribe where the need is greatest.

  4. Meghan Ferriter
    October 30, 2018 at 6:41 pm

    Thank you, Lynn!

  5. Meghan Ferriter
    October 30, 2018 at 6:44 pm

    Thank you, Sharon – please let us know if we can help you get started with the Lincoln, Terrell, Rickey, Barton, or Bourne campaigns. We appreciate your generous offer to help!

  6. Meghan Ferriter
    October 30, 2018 at 6:45 pm

    Thank you for this offer, John! You’ll find that our images in the Letters to Lincoln campaign are ready for transcription editing and review, and we’d welcome your help.

  7. Rachel Vogus
    November 2, 2018 at 1:50 pm

    Hello! I am a Master of Information Student at Rutgers doing a small project on Open Source software. I was wondering why you switched to your own platform Concordia over using Scribe (from Beyond Words). Could you tell us more about that?

  8. David Brzostowicki
    November 14, 2018 at 8:18 am

    In agreement with John, I will transcribe where the need is greatest as well. The host of historical resources in this program must be revelatory and I would happily volunteer to transcribe.
