As librarians, we identify, evaluate, select, collect, describe, preserve and provide access to materials to facilitate use. As librarians of the 21st century, we have integrated digital collections such as ebooks, databases, datasets, and other digital objects into our traditional analog collections.
What about websites?
Do libraries collect websites?
Back in January, I presented on the topic of preserving websites at the 15th Annual Conference of Atmospheric Science Librarians International (ASLI), which was held in conjunction with the 92nd American Meteorological Society (AMS) Annual Meeting in New Orleans, Louisiana. My presentation, “If It Is Not Archived, It Maybe Lost in the Future: Collecting and Preserving Websites at the Library of Congress,” focused on web archiving from the perspective of a reference librarian. I wanted to raise awareness of web archiving among my ASLI colleagues and encourage them to get involved with web archiving projects of their own.
The Library of Congress has been preserving websites since 2000. Our first projects involved U.S. Elections and September 11. In 2003 the Library of Congress and the national libraries of Canada, Australia, Denmark, Finland, France, Iceland, Italy, Norway, Sweden, and the UK, along with the Internet Archive, formed the International Internet Preservation Consortium (IIPC). In 2004 the Library established an official web archiving team.
My introduction to web archiving came in 2005, when I participated in the Hurricane Katrina and Rita web archive collaboration. In this collaboration, the Library of Congress, along with the Internet Archive, the California Digital Library, and similar institutions, nominated news, personal, relief, and government websites to be captured and archived.
Typically, the Library of Congress collects and archives websites based on themes and events, which makes sense. However, in 2006 the Library experimented with a Single Sites project that archived websites without a unifying theme. I was excited to have another opportunity to preserve websites and work with our web archiving team. Working on the Single Sites project, I was able to select websites in my subject areas, mathematics and meteorology, and I nominated a variety of sites that would supplement the Library’s analog collection. For example, I nominated websites devoted to the search for Mersenne prime numbers and the plight of the endangered polar bear.
As of January 2012, the Library has collected about 285 terabytes of web archive data, and it keeps growing! You can view our public web archive collections here.
If you want to read more about archiving websites at the Library, our web archiving team leader Abbie Grotke has written extensively on this topic for The Signal: Digital Preservation blog:
- Ask the Recommending Officer: The September 11, 2001 Web Archive
- Ask the Recommending Officer: Indian General Elections 2009 Web Archive
- Ask the Recommending Officer: The Civil War Sesquicentennial Web Archive
- First Decade of Web Archiving
- A Tale of a Disappearing Website
- It’s Beginning to Look A Lot Like…Election Archiving Season!
- It Takes a Village…to Archive the Internet