For someone who thinks about web archiving almost every day, it’s sometimes hard to explain to people outside the digital library community why archiving websites is worth doing. “They archive themselves,” some say. “Why would you want to save what’s on the Internet?” they wonder. Instead of launching into explanations about cultural heritage, dynamic publishing streams and comprehensive collection policies, I can now point to recent and fun examples of why we should be archiving the web and what it looks like to archive the web.
NPR’s Weekend Edition Sunday ran a story about a project called perma.cc, which is a perfect example of why preserving websites is important. URLs are often given in citations and bibliographies to direct readers and researchers to source materials. The project addresses the problem of “link rot,” or broken links that show up in the citations of legal articles and arguments.
We’ve all come across a 404 error or a URL that doesn’t exist anymore. The people behind perma.cc studied the problem and found that over 70% of the links used in the citations of a sample of legal journals (published between 1999 and 2011) and 50% of the links cited in Supreme Court opinions are now dead or go to the wrong place. This link rot puts the basic information supporting our legal system at risk.
To address this problem, perma.cc works with law libraries and law authors to build a system where authors can create links to archived versions of their journals. There are a number of other projects and services working on this problem as well. Perma.cc is a recent addition to the scene, and it provides clear evidence that the web does not archive itself. Librarians, archivists and researchers need to take action to ensure these resources are fully available in the future.
Space Jam website
The NBA playoffs are a good excuse to bring in this next example, which originally went viral in late 2010. Space Jam is a 1996 Warner Brothers movie starring cartoon characters and 1990s basketball stars like Michael Jordan and Charles Barkley. In a conversation about web archiving, a friend mentioned that this movie’s website is still in its “original format.” And indeed, the screen capture here is from the live website for the movie and is identical to the website captured by the Internet Archive in November of 2003, when the website was first saved.
I can’t say why this website continues to exist, but it is unique. Other popular movies from 1996 such as Independence Day, Scream and Fargo have no trace of a website, and one may never have existed. The live website for Space Jam is not an example of a web archive, but it’s a good example of what web archives are filled with.
Other sites from this era are only available in web archives, and this site’s surprising existence points to how digital content created in new forms and formats is often at risk.