Machine Learning + Libraries: A Report on the State of the Field

[Madge Lessing, full length, on bicycle, facing left; holding musical horn to lips]. Photograph copyrighted by E. Chickering, c1898. Library of Congress Prints & Photographs Division. //www.loc.gov/resource/cph.3b10346/

Digital collections in libraries are vast and growing, as we continue to digitize cultural heritage materials and acquire new born-digital collections. At the same time, the use of machine learning and artificial intelligence has grown exponentially. At LC Labs, we explore how technology can help fulfill the Library of Congress’s vision that “all Americans are connected to the Library of Congress,” sharing this knowledge and learning from others. To this end, we are delighted to release a new report, “Machine Learning + Libraries: A Report on the State of the Field,” by Dr. Ryan Cordell.

Dr. Cordell is an Associate Professor of English at Northeastern University and an experienced digital humanities practitioner.  The report is part of the Library’s 2019 “Season of Machine Learning,” sponsored by LC Labs and the Digital Strategy Directorate.  The practical applications of machine learning have great potential to improve access, discovery, and engagement in library collections.  But, as Dr. Cordell points out, libraries are also in a position to lead broader conversations about responsible use of these technologies.

The report expertly frames machine learning, covering both its possibilities and its challenges. After defining machine learning and distinguishing it from the broader category of artificial intelligence (AI), Dr. Cordell describes the history of the use of these technologies in libraries and explores various approaches. Urging care and thoughtful implementation of machine learning, the report points out the risks and potential benefits for crowdsourcing, searchability of typed and handwritten documents and audiovisual materials, collection management, conservation and preservation, and creative and artistic works. The report concludes with recommendations for responsibly implementing machine learning, improving access to data, developing infrastructure, and cultivating expertise.

The report and the other explorations in our “Season of Machine Learning” have helped us learn about applying machine learning to the Library of Congress’s collections. We hope that these experiments and recommendations will help others looking to apply these technologies in libraries, archives, and other cultural heritage collections.

We invite you to read Dr. Cordell’s report. You may also be interested in all of the results of our “Season of Machine Learning,” including an experiment by the University of Nebraska-Lincoln’s Project Aida team, the results of our September 2019 Machine Learning + Libraries Summit, and 2020 Library Innovator in Residence Benjamin Lee’s Newspaper Navigator project. And, as always, you can get in touch and keep up with our work at labs.loc.gov, at [email protected], or by signing up for our monthly newsletter!

One Comment

  1. Mike H.
    July 23, 2020 at 9:20 pm

    Thank you for this report. A quick question: what is the preferred citation for this report?
