The Library of Congress was thrilled to host the 2012 International Internet Preservation Consortium (IIPC) General Assembly from April 30 to May 4. Over 150 registrants packed meeting rooms to discuss all aspects of web archiving. From legal issues to technical challenges to research use, the entire lifecycle of web archiving was covered.
As a library and heritage practice, web archiving is a little over 10 years old. The IIPC GA is a vital meeting of the professionals who are developing the tools, standards and best practices for this new but growing field. Below is a quick overview of the week’s events. More detailed posts will follow about some of the presentations and workshops. Presentations from the week are posted, and video of the proceedings will be posted in the coming weeks. Also, check out the collection of tweets from participants.
Day One: The Broad Value of Web Archiving: Demonstrated Use
During the open conference day, presenters and participants explored in depth the current and possible uses of web archives in the research, business, legal, and public spheres.
The data contained in web archives conceivably covers virtually all contemporary (and many historical) subjects, in all languages, by an unknowable variety of authors. The detail and scale of information that web archives offer to researchers make them unique and valuable resources. However, there are challenges. Participants discussed the technical and financial difficulties of providing access in the context of a public institution, as well as the copyright and privacy laws that restrict access. Although researchers often build their own archives or use data services provided by private companies, the archives collected by IIPC members carry added authenticity because they are preserved by a trusted party. This is an especially important issue in web archiving for legal purposes.
Collecting at-risk web sites is another important role IIPC members serve for researchers. In the public sphere especially, the web is often the only avenue of communication to constituents. After reorganizations, regime changes or even policy changes, government publications on the web are often altered or disappear completely. Several IIPC members and associates are actively collecting, preserving and providing access to these resources.
Day Two: IIPC General Assembly
The membership received updates from funded projects, including JhoNAS, Twittervane, and the web archiving workshop at the National Library of France. One demonstrated project of considerable interest was the IIPC Memento Aggregator, which holds out the possibility of a unified access mechanism for IIPC web archives via a Memento TimeGate.
The number of IIPC member institutions is up to 42. The four new members this year, Columbia University, George Washington University, National Library of Estonia, and Los Alamos National Laboratory, were able to introduce themselves and their interests and activities in web archiving.
Our veteran members were also able to update the group on major developments at their own institutions, including presentations from a first-time GA attendee, the Government Printing Office, and a first-time GA presenter, the National Library of Spain, among others.
Day Three: Working Group Meetings
The Harvesting, Access and Preservation Working Groups met on day three, in addition to a Heritrix User Group. Participation in a Working Group is the major benefit and responsibility of each member organization. It is also the structure through which most IIPC-funded projects are incubated. Working Group members should contact their co-chairs for information about these meetings. The agendas are posted here: http://netpreserve.org/events/2012ga.php
Days Four & Five: Workshops
Several topics in web archiving cross working group lines. Thursday and Friday were an opportunity for members and invited guests to discuss important and emerging issues. Depicting and managing the Web Archiving Lifecycle were discussed in detail. The automated workflow tool NetarchiveSuite was demonstrated, the Unified Digital Format Registry community met, metrics and quality in web archives were discussed, and participants shared ideas about using crowdsourcing methods to engage the public and improve the selection of and access to web archives. In addition to discussions about improving current practices to capture the web as it is today, there were a workshop and panel discussions about Harvesting the Future Web.
As publishing platforms and technologies continue to shift, the tools and processes of web archivists also need to evolve. It is these kinds of challenges, and the talented people who keep trying to meet them, that will make future IIPC General Assembly meetings just as productive, thought-provoking, and enjoyable as this year’s was.