October Innovator-in-Residence Update

Library of Congress Innovator-in-Residence Jer Thorp has started diving into the Library's collections. In this post, we round up some of his October activities and the ways he is sharing his process.

Jer has created a “text-based exploration of Library of Congress @librarycongress’ MARC records, specifically of ~9M books & the names of their authors.” He started by asking what would happen if you and he were to wander the Library of Congress stacks and collect every book from a given year. After piling them up, what if you selected 40 titles at random to represent each year, then gathered the first names of their authors? What might you see in those names across time and space?
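For readers who want to try the thought experiment themselves, here is a minimal Python sketch of the sampling step. It is not Jer's code. It assumes you have already pulled (year, title, author) tuples out of the MARC records, and that author headings follow the common "Surname, Given names" pattern; the function names `first_name` and `sample_first_names` are made up for this illustration.

```python
import random
from collections import defaultdict


def first_name(author):
    """Return the first given name from a 'Surname, Given names' heading.

    Assumption: headings look like 'Twain, Mark, 1835-1910.'. Headings
    without a comma (for example corporate authors) yield None.
    """
    if "," not in author:
        return None
    given = author.split(",", 1)[1].split()
    return given[0].strip(".,") if given else None


def sample_first_names(records, per_year=40, seed=None):
    """Pick up to `per_year` random books for each year and collect first names.

    `records` is an iterable of (year, title, author) tuples, a hypothetical
    shape chosen for this sketch rather than an actual MARC structure.
    """
    rng = random.Random(seed)

    # Pile up every book by its year.
    by_year = defaultdict(list)
    for year, title, author in records:
        by_year[year].append((title, author))

    # Draw the random sample for each year, then keep usable first names.
    names = {}
    for year, books in sorted(by_year.items()):
        picked = rng.sample(books, min(per_year, len(books)))
        extracted = (first_name(author) for _, author in picked)
        names[year] = [name for name in extracted if name]
    return names
```

Calling `sample_first_names(records, per_year=40)` returns a dictionary that maps each year to the first names found in that year's random sample. Passing a `seed` makes the draw repeatable.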

Now you can explore this thought experiment with Jer as he takes authors’ first names from MARC records and remixes them on Glitch.

Library of Names – an experiment extracting author first names from Library of Congress MARC records

You can find code for the front end and processing via this tweet from Jer.

Jer is sharing his work in several ways. First, he’s documenting his research and thoughts via Open Science Framework. You can dig into the wiki, activity, and tags for his Library of Congress Residency 2017/2018.

He has also created a GitHub repository of his code, data, and miscellanea related to his residency. You can comment and share your ideas with him there and via Twitter.

Earlier this month, we shared our experience touring Library of Congress divisions with Jer. Take a look at the collections we explored via this Twitter Moment. We visited with curators, reference librarians, and archives specialists from Manuscripts, Geography & Maps, Rare Books, Prints & Photographs, American Folklife Center, and Web Archiving.

Two men exploring subject files in the American Folklife Center Reading Room, Subject File drawer for "Gulf War - History of the World"

@LC_Labs tweet from 06 October during the American Folklife Center tour

Want to create something new with Library of Congress collections data? Download a cleaned MARC records data set for yourself from the MARC Open-Access section of our LC for Robots page. And if turning data into a thought experiment is your game, you might consider showcasing your skills in the Congressional Data Challenge (details here).
