This is a Guest Post by Abbie Grotke, the Library of Congress Web Archiving Team Lead and Co-Chair of the National Digital Stewardship Alliance Content Working Group.
We’re excited to finally announce something a team of Library staff has been involved with for over a year now: a big project to integrate the Library’s web archives into the rest of the loc.gov web presence. The result is the new “Archived Web Site” format type, which is now part of the Library’s main search function.
Part of an ongoing effort to help patrons and researchers find and use our online materials more easily, this update provides, for the first time, a way for the Library’s web archive content to be searched alongside other formats such as books, photos, maps, periodicals and more.
This “soft launch” (or “we’re putting it out there but still have improvements to make”) includes content from eight of our publicly accessible web archives. All eight collections and others remain available at the Library of Congress Web Archives (LCWA) home (we won’t take that site down until all content has been migrated over). We plan to release more content later this year, including some archived sites not already on the LCWA site.
Some of the features of this new release:
- Searching web archives directly from the Library’s homepage: When searching the larger loc.gov site, you can now select “Archived Web Site” as a facet to search archived sites, rather than going to a separate interface.
- Searching archived websites along with other Library content: You will now find records for our web archives intermixed with related content if you do a search across all formats. For instance, if you search the Library’s site from the home page for “Ann Telnaes” you see not only content from our Prints and Photographs (P&P) division, but also an archived site that was captured as part of a 2006 P&P project.
- Faceted searching: Users can narrow search results by date, collection name, contributors, subjects, locations and languages.
- Combined records: Previously, we had no elegant way to handle a URL that had changed or that belonged to multiple collections; we usually had more than one catalog record, and these were not linked. Now, these records are combined, making it easier to find all of the content collected for any given web presence over time.
- Thumbnail browsing and featured items: We now have thumbnail images for all of our archived sites (taken from the first date captured), which are integrated into the item records and the search screens. Our content curators have also selected featured items for each of the eight collections, which are available on the collection pages.
- New viewer: Staff working on the project wanted to push the boundaries a bit and try some new ways of accessing and viewing the archived content. Until now, access has gone from our catalog records to the Wayback calendar view that most users of web archives are familiar with. With this release, you’ll see a new viewer that displays content from our Wayback Machine but in a new way.
Converting our web archives to the new format has provided some new challenges. Web archives aren’t like digitized images, which appear neatly in a record and simply require a click on a “view larger” button so researchers can zoom in and inspect the content more closely. Our web archives try to replicate the look and feel of each site as we archived it over time, and we’re working hard to ensure that the content works as expected within our new framework, particularly in the new viewer.
In addition to working over the coming months on the additional LCWA content, we’ll continue to make iterative fixes and tweaks, and we would welcome any feedback that you, as researchers using web archives, digital preservation experts, or just interested folks, have.
Let us know what you think!