LC Labs is pleased to welcome Benjamin Lee and Brian Foo as the second cohort of the Innovators in Residence program, designed to attract artists, journalists, researchers, teachers, and others willing to imagine and prototype examples of creative, innovative, and novel uses of Library of Congress digital collections. Innovators’ projects may take many final forms such as an artwork, visualization, application, or other publicly available tool, service, or exhibit. The Library’s first Innovator in Residence was data artist Jer Thorp; you can find out more about the outcomes of his work here.
Both Brian and Ben are using their backgrounds in computer science to approach the Library’s digital collections in new ways: Brian by using hip hop as a discovery tool, and Ben by applying machine learning to extract images contained in the digital collections.
We sat down with Brian and Ben to get to know them and talk more about their projects.
Can you begin by introducing yourself to readers of the Signal blog?
Ben: I’m a second-year Ph.D. student in computer science at the University of Washington studying machine learning. Starting in college, I got interested in digital humanities questions related to Holocaust Studies through my grandmother, who is a survivor of the Auschwitz-Birkenau concentration camp. This led me to pursue a year-long fellowship at the United States Holocaust Memorial Museum (USHMM), where I developed a machine learning algorithm to categorize and sort the identification cards in the archives of the International Tracing Service. My goal in this project was to provide new ways for users and researchers to search the collection other than by name. This experience led me to pursue research in the field of information access and computational cultural heritage.
Brian: My name is Brian Foo and I am currently a data visualization artist at the American Museum of Natural History in New York City. I’ve worked in museums and libraries for the past eight years; I specialize in visualizing large collections of data and library materials of multiple formats including images, audio, and moving image.
Please describe your Innovator in Residence project.
Ben: The idea behind my Innovator in Residence project is to use deep learning, a subset of machine learning, to automate the extraction and tagging of images from the over 15 million newspaper scans in Chronicling America. My next goal is to make these images available to users in an interactive visualization, such as a timeline, a map, or a search by topic. The interest of this research, in my opinion, cuts three ways: first, it allows users to experience the Library’s digital collections in an engaging way; second, it enables cultural heritage practitioners to ask new research questions; and third, it allows computer scientists to better understand how people are using the systems they build.
Brian: The goal of my project is to use the library’s public domain audio as source material for hip hop music production. By embedding these materials in hip hop music, listeners can discover items in the library’s vast collections that they likely would never have known existed.
I will do this by collaborating with library staff to identify sonically interesting and culturally relevant audio and moving image collections that are free to use for sample-based hip hop production. I will then develop music-making tools that facilitate serendipitous moments and connections between users and library audiovisual materials. I will use these tools myself to create hip hop music that I will share throughout the process. As these new sounds travel to listeners’ ears, the unique materials that are referenced travel with them. That’s what I mean by “discovery through hip hop,” which I believe aligns with the mission of the Library of Congress to make culture and history accessible to the public while at the same time facilitating the use of those materials in the creation of new cultural artifacts.
What will be the benefit of your project for users of the Library of Congress?
Ben: A primary motivation behind my project is to excite the American public by demonstrating the possibilities of applying machine learning to library collections. Given the widespread enthusiasm about machine learning, this project could draw new people to the Library of Congress’s digital collections, as well as excite the Library’s regular users about emerging technological advances.
My hope is that this project could also inspire members of the public to start their own coding projects involving the Library of Congress’s digital collections. I plan to build this project primarily through open source code bases so that it will be easy to reuse and access for those who want to work with it. I hope this methodology will serve to “promote a culture of continuous improvement” as outlined in the Library’s Strategic Plan.
Brian: In hip hop, there’s a term called “crate digging,” which essentially refers to the practice of DJs digging through crates of recorded music in search of a treasure trove of obscure sounds spanning many different genres and eras. By focusing on providing access to audiovisual materials housed in the Library of Congress that are in the public domain and free to use, I want to identify what I think of as “the American Citizen’s Crate.” What are the sounds we can all draw from?
All users of the Library of Congress will be able to draw from this for exploration, inspiration, and for their own music production. Because it is not always clear to the general public what is in the public domain and what is fair use, especially with recorded sound, an added benefit of the project will be documenting that process in a great amount of detail.
Finally, as an Innovator in Residence, what does the word “innovation” mean to you?
Ben: Innovation is something that inspires others. It means taking new approaches. I see the Library of Congress and its entire mission as innovative. This institution is an immense repository of information and is dedicated to being accessible to the American public. I am fortunate to be part of this wave of innovation at the Library, which is paving the way for a digital-forward approach. The main element of innovation in my project is demonstrating to seemingly disparate groups, whether it be historians, computer scientists, cultural heritage practitioners, educators, or members of the public, that interdisciplinary approaches can be really fruitful for all involved.
Brian: For me, innovation combines three things: understanding as many viewpoints as possible, cultivating a collaborative work environment, and then working together to remove barriers to achieving the organization’s mission. Innovation happens when you have a cross-pollination of different expertise. This is why I always take a humble approach to my work: I never assume that things “should” be done in a certain way, I always maintain respect for others’ perspectives, and I see it as my goal to enable new ways of working while collaborating closely with stakeholders and content experts.
For more information about Ben and Brian’s projects and the Library of Congress Innovator in Residence program, see the following press release.