@IIPC12: A Week of Web Archiving

The Library of Congress was thrilled to host the 2012 International Internet Preservation Consortium General Assembly from April 30 to May 4. Over 150 registrants packed meeting rooms to discuss all aspects of web archiving. From legal issues to technical challenges to research use, the entire lifecycle of web archiving was covered.

As a library and heritage practice, web archiving is a little over 10 years old, and the IIPC General Assembly (GA) is a vital meeting of the professionals who are developing the tools, standards and best practices for this new but growing field. Below is a quick overview of the week’s events. More detailed posts will follow about some of the presentations and workshops. Presentations from the week are posted, and video of the proceedings will be posted in the coming weeks. Also, check out the collection of tweets from participants.

Day One: The Broad Value of Web Archiving: Demonstrated Use

During the open conference day, presenters and participants explored in depth the current and possible uses of web archives in the research, business, legal, and public spheres.

The data contained in web archives conceivably covers virtually all contemporary (and many historical) subjects, in all languages, by an unknowable variety of authors. The detail and scale of information that web archives offer to researchers make them unique and valuable resources. However, there are challenges. Participants discussed the technical and financial difficulties of providing access in the context of a public institution, as well as the copyright and privacy laws that restrict access. Although researchers often build their own archives or use data services provided by private companies, the archives collected by IIPC members carry added authenticity because they are preserved by a trusted party. This is an especially important issue in web archiving for legal purposes.

Collecting at-risk web sites is also an important role IIPC members serve for researchers. In the public sphere especially, the web is often the only avenue of communication to constituents. After reorganizations, regime changes or even policy changes, government publications on the web are often altered or disappear completely. Several IIPC members and associates are actively collecting, preserving and providing access to these resources.

Day Two: IIPC General Assembly

New IIPC logo.

The General Assembly is the annual gathering of all IIPC members to discuss IIPC business. The officers gave updates on the past year’s progress. A notable announcement was the selection of Brenda Reyes as the PhD student sponsored by the IIPC, the Internet Archive and the University of North Texas. She is currently working at the National Library of Spain and will spend 3 years studying web archiving at UNT, with a summer internship at the Internet Archive. Members should note that the 2013 call for proposals will be released in the weeks following the 2012 GA. Another officer project this year is the redesign of IIPC’s own web site, netpreserve.org. The new IIPC logo and homepage layout were debuted. The new site is expected to launch well before the end of the calendar year. It is intended to be the definitive resource for web archiving practice, to help IIPC members work together better, and to clearly explain the value of the work of the IIPC. Broad participation from members will be needed to launch and maintain the new web site.

Members also received updates on funded projects including JhoNAS, Twittervane, and the web archiving workshop at the National Library of France. One demonstrated project of considerable interest is the IIPC Memento Aggregator, which holds the possibility of a unified access mechanism for IIPC web archives via a Memento time gate.

The number of IIPC member institutions is up to 42. The four new members this year, Columbia University, George Washington University, National Library of Estonia, and Los Alamos National Laboratory, were able to introduce themselves and their interests and activities in web archiving.

Our veteran members were also able to update the group on major developments at their own institutions, including presentations from a first-time GA attendee, the Government Printing Office, and a first-time GA presenter, the National Library of Spain, among others.

Day Three: Working Group Meetings

The Harvesting, Access and Preservation Working Groups met on day three, as did a Heritrix User Group. Participation in a Working Group is the major benefit and responsibility of each member organization. It is also the structure through which most IIPC-funded projects are incubated. Working group members should contact their co-chairs for information about these meetings. The agendas are posted here: http://netpreserve.org/events/2012ga.php

Days Four & Five: Workshops

Several topics in web archiving cross working group lines. Thursday and Friday were an opportunity for members and invited guests to discuss important and emerging issues. Depicting and managing the Web Archiving Lifecycle were discussed in detail. The automated workflow tool NetarchiveSuite was demonstrated, the Unified Digital Format Registry community met, metrics and quality in web archives were discussed, and ideas were shared about using crowdsourcing methods to engage the public and improve selection of and access to web archives. In addition to discussions about improving current practices to capture the web as it is today, there were a workshop and panel discussions about Harvesting the Future Web.

As publishing platforms and technologies continue to shift, the tools and processes of web archivists also need to evolve. It is these kinds of challenges, and the talented people who keep trying to meet them, that will make future IIPC General Assembly meetings just as productive, thought-provoking, and enjoyable as this year’s was.
