Newly-Minted Librarians and Web Archives

The following is a guest post by Nora Ohnishi, a former intern with the Web Archiving Team at the Library of Congress.

Nora Ohnishi standing next to our Alan Rath "World Wide Web 1997" sculpture. Photo credit:

My name is Nora Ohnishi, and I will graduate with my Master's in Library and Information Science from the University of North Texas in May. I began working for the Library of Congress via the HACU National Internship Program in January 2015. While here, I have worked with the Web Archiving Team, part of the Office of Strategic Initiatives.

Within my first few weeks of work in February 2015, still learning the lay of the land, I attended the Federal Government Web Archiving Working Group meeting. The parties in attendance at this meeting included:

During this meeting, Christie Moffatt from NLM discussed her team’s use of the Archive-It tool and which of its collections are publicly available, such as Global Health Events and Health and Medicine Blogs. Dory Bower and David Wallace from the Government Publishing Office distributed a new publication documenting the Federal Information Preservation Network’s planned activities.

After this meeting, I began analyzing a spreadsheet of Federal government Web sites published by 18F, a newly formed organization within the General Services Administration. For each site, I review its owners, authors and completeness and determine its size according to Google. I then create a nomination using DigiBoard (pdf), the Library’s curatorial tool, to include the site in the Public Policy Topics Web Archives collection. This work has helped build the Library of Congress collection of Federal Web sites.

It is now early May, and using this spreadsheet, I have successfully evaluated over 2,000 Federal Government sites and created nominations for review! Among my favorites are:

I like these sites for various reasons, but the main one is that they taught me about government groups and initiatives that I otherwise would not have known about. Additionally, some of the documents and reports on the sites date back to the 1980s, and the site owners may decide to take them down at some point. That risk of loss makes the sites good candidates for inclusion in the Web Archives.

The Library and the working group can now use the results of this process as a starting point for ensuring that our Web site harvesting does not overlap and, where it does, for discussing the reasons, such as different scopes, harvesting levels or other issues. This was at times a formidable task, but it all ties into the reason for our work at the Library of Congress and in the Federal Web Archiving Working Group: to provide future researchers open access to government information. As a newly minted librarian, I consider this one of the most important philosophies behind library services.
