Web Archiving Blog Roundup

The following is a guest post by Abbie Grotke, Library of Congress Web Archiving Team Lead

While organizations have been archiving the web since the mid-1990s, it’s only in the last few years that there’s been a surge in web archivists speaking out about issues they encounter, uses of archives, and innovations in tools and technologies.

Readers interested in web archiving may have noticed that my colleagues and I blog about it here whenever we get the chance: information about our Library of Congress Web Archives, collaborations we’re involved in, technical issues related to archiving, and more. Our fellow members of the International Internet Preservation Consortium are blogging more these days about their work, and so are others in the community.

For those who might be interested in learning more about what other organizations are doing, I thought I’d do a roundup of blogs we’re reading. Most are in English, though as an indication of the international reach of this activity, some are not.

Australia’s Web Archives: Curators of the National Library of Australia’s PANDORA Archive blog about the content of the archive and share the experiences of web archivists at the NLA and its partner organizations.

California Digital Library: CDL has a blog that includes posts related to their web archiving program and the Web Archiving Service (WAS).

Common Crawl: While the people behind Common Crawl are not web archivists per se, its mission is to “build and maintain an open crawl of the web that can be accessed and analyzed by everyone,” and members of our community are interested in the research use and potential of the Common Crawl dataset.

Internet Archive Blogs: Internet Archive’s blog covers a variety of topics about the IA. Posts related to web archiving can be found under the Announcements, Archive-It, and Wayback Machine categories.

Internet Memory Foundation Blogs: A non-profit institution actively supporting the preservation of the Internet, the Internet Memory Foundation offers up two blogs (Memoranda Blog and Synapse Blog), both of interest to web archivists, researchers and technologists working in this space.

National Archives (UK) blog: This blog hosts a variety of topics from the UK National Archives; posts specifically relating to web archiving are at this link.

National Library of New Zealand: The library’s blog has had a few posts about their web archive activities: Websites Come, Websites Go and This site is accessible using TELNET!

National Library of Spain (in Spanish): A mixed-content blog dealing with a variety of topics related to the institution. They recently began to promote their web archive and have a few posts explaining the nature of their web archive, its founding and contents, and expounding on the hypothetical advent of a Digital Dark Age.

Portuguese Web Archive – News: In Portuguese and English, this news feed provides the latest news from the Portuguese Web Archive.

Smithsonian Archives Bigger Picture Blog: Has an occasional post about the Smithsonian’s web archiving activities. A search of the blog for “web archiving” will get you to them.

UK Web Archive blog: Announcements and information about the UK Web Archive. Topics include research use of their archives, collection development and crowdsourcing, technical updates, and so forth.

WebArchiv: The Czech web archiving project maintains this blog (in Czech). They discuss what, how and why they web archive, among other topics.

Web Science and Digital Libraries Research Group at Old Dominion University: This blog has a variety of research and teaching updates from Old Dominion University. They often include postings about Web Archiving.

I may have missed other blogs that you may be reading or contributing to. Please post in the comments to add to this list!

Comments (4)

  1. Check for the Archive-It blog at – our first post will be in late February!

  2. The WebART research project, aimed at building tools for Web archiving research, blogs at

  3. Thanks for alerting us to these other resources!

  4. Another new one just appeared: The Web Archiving Roundtable of the Society of American Archivists. Check it out at this address:
