The following is a guest post by Abbey Potter, Program Officer, NDIIPP. She is also Communications Officer for the IIPC.
The Future of the Past of the Web was a well-named event hosted by the Digital Preservation Coalition, JISC, and the British Library on October 7th. It continued a series of DPC events on web archiving that have examined current practice and trends. This year’s focus was on using web archives and the future of researcher access. Memento and the idea of delivering and using web archives as data were the hot topics.
Herbert Van de Sompel of the Los Alamos National Laboratory, co-leader of the Memento project, presented a compelling use case for Memento: he took an older paper he had published and tried to follow the links he cited back to the original resources. Most of the links were broken, though all but one of the resources are available in a web archive somewhere. Using the Memento plug-in or another Memento tool, a user would be able to reach the archived version if the resource no longer exists on the live web.
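For readers curious about what a Memento tool does under the hood, here is a minimal sketch, not the plug-in's actual code. Memento works by sending a desired date in an `Accept-Datetime` HTTP header to a service called a TimeGate, which then points to the archived copy closest to that date. The TimeGate address and the helper name `build_timegate_request` below are made up for illustration, and the request is only built, never sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

# Hypothetical TimeGate base URL, used only for illustration.
TIMEGATE = "https://timegate.example.org/timegate/"


def build_timegate_request(original_url: str, when: datetime) -> Request:
    """Build (but do not send) a Memento TimeGate request for original_url."""
    # Accept-Datetime carries the desired date in HTTP-date format, in GMT.
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(TIMEGATE + original_url, headers={"Accept-Datetime": stamp})


# Ask for the version of a (made-up) cited page as it looked on 1 January 2010.
req = build_timegate_request(
    "http://example.com/cited-page",
    datetime(2010, 1, 1, tzinfo=timezone.utc),
)
print(req.full_url)
print(req.headers)
```

A real client would send this request and follow the TimeGate's redirect to the archived copy.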
There was a lot of excitement about the usefulness of the project and its potential to raise awareness of web archives. Users who don’t find what they are looking for on the web may not know they can redirect their search to a web archive. Giving seamless access to archived resources will raise the profile of web archives. If access is restricted for policy or copyright reasons, the user can at least learn that the archived version exists and how to pursue access.
The rise of data use, delivery and preservation as core digital library services has appeared on this blog before (“D is for data” by Martha Anderson and “Data is the new black” by Leslie Johnston). Web archives are large in total size, in number of files, and in diversity of content. And because web data is structured and machine readable, there is tremendous potential for web archives as research datasets. Both the n-gram search offered by the British Library and the Wayback Machine’s new beta release give users a better view of what’s in the archives.
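To make the n-gram idea concrete, here is a small sketch of counting word sequences per year across archived page text. It is not the British Library's implementation. The function names and the sample data are invented for illustration, and a real archive would need proper text extraction and tokenization first.

```python
from collections import Counter


def ngrams(text, n):
    """Return the n-word sequences in text, lowercased and split on whitespace."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def counts_by_year(pages, n):
    """Tally n-grams per capture year; pages is an iterable of (year, text)."""
    result = {}
    for year, text in pages:
        result.setdefault(year, Counter()).update(ngrams(text, n))
    return result


# Made-up sample standing in for text extracted from archived pages.
sample = [
    (2009, "web archives as data"),
    (2010, "web archives as research data"),
    (2010, "using web archives"),
]
counts = counts_by_year(sample, 2)
print(counts[2009]["web archives"], counts[2010]["web archives"])
```

Plotting such counts by year shows how often a phrase appears in the archive over time.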