The September 11, 2001 Web Archive: Twenty Years Later

Today’s guest post is from Tracee Haupt, a Digital Collection Specialist in the Digital Content Management section at the Library of Congress.


On the twentieth anniversary of the September 11th terrorist attacks, I asked four individuals who were part of the creation of the September 11, 2001 Web Archive to reflect on their experience documenting the tragedy and the unique contents of the collection. In addition to the archive’s historical significance as a record of how a variety of individuals and organizations responded to September 11th, the collection is also important as an example of an early web archiving project, when both the internet and the Library of Congress’ (LC) efforts to preserve it were still relatively new. In this post, current and former Library employees describe how the collection came to be, what they learned while creating it, and why preserving this aspect of internet history was crucial to fully understanding the impact of September 11th.

What was the state of web archiving at the Library of Congress in 2001?

Diane Kresh, former Director of Collections: It was pretty rudimentary. September 11th gave us the opportunity to do much more robust, targeted collecting.

Abbie Grotke, Assistant Head, Digital Collection Management Section: We started collecting web content just one year prior, so activities at the LC were still very much in a pilot phase, and we were focused on collecting content related to U.S. elections. We had finished collecting for the 2000 election and planning was underway to document the election in 2002 with some outside partners, including Internet Archive, WebArchivist.org, and the Pew Internet & American Life Project. So we already had some external partners that could help the Library jump into action as the events of September 11th unfolded, and together we pivoted rather quickly to focusing on documenting the terrorist attacks and their aftermath.

What was your role in the creation of the September 11, 2001 Web Archive?

Diane: I was the Director of Collections at the time, and my role was greenlighting it and identifying people to do it. I started pulling together like-minded people who were flexible, easy to work with, and tech-savvy, and who also had an appreciation for how the Internet’s contribution to the historical record would be different from that of a newspaper, book, or an article. I knew I had good people, so I felt like it was my job to eliminate barriers so that they could do the work they needed to do.

Cheryl Adams, Reference Librarian: I was the Reference Specialist and Recommending Officer (RO) for Religion. I was one of many ROs who jumped in to find sites related to this event from as many angles as we could find. I believe that any RO could be involved in the project, and the Library definitely wanted ROs from as many divisions/reading rooms as possible to participate so that we captured the response – from as many countries and in as many languages as we could. The event felt unprecedented, and our response was something completely new. I’d never seen the Library ask all of its recommending staff to respond to one event before.

The website for the French newspaper Le Monde, shown here in a capture taken shortly after September 11th, is an example of the international scope of the September 11, 2001 Web Archive.

Kathy Woodrell, Reference Librarian: In 2001, I was the Decorative Arts Specialist in the Humanities and Social Sciences Division. My colleague and friend, Cassy Ammen, sat across from me – and I regularly marveled at the new “web archiving” technology. A call went out to Recommending Officers to identify topics or items within their subject area impacted by the September 11th terrorist attacks. With Fine Arts Specialist Tony Mullan (now retired), I focused on collecting material about art destroyed in the World Trade Center. Locating the lists of artwork that had been destroyed was difficult – I felt somewhat conflicted that I was compiling lists of lost artwork when there were so many people whose lives had been lost.

Abbie: Initially, my role in the web archive was limited. Staff involved in digital collections were asked to contribute nominations to the collection – I think I submitted a few but don’t recall what they were. One year later, I became more involved in web archiving and helped create a 9/11 Remembrance collection, which was later integrated into the September 11, 2001 Web Archive. We revisited some of the websites that we captured in the immediate aftermath of the attack and preserved new content that showed how the event was remembered one year later.

The Library was documenting September 11th and its aftermath while events were still unfolding. What was it like to archive a historical event while also living through it?

Cheryl & Kathy: It was helpful to have something concrete to do – to somehow capture what felt like chaos, knowing that it would help future researchers.

Diane: I was a leader at the Library, but I was also a wife and mom. My children were younger at the time, and we lived on Capitol Hill, which was under immediate lockdown. I didn’t know anyone in New York then, but I knew lots of people who were frantically trying to reach loved ones to make sure they were safe, and sadly, that wasn’t the case for everyone. I was managing my own anxiety, fears, and concern, while also recognizing that I had a responsibility as a professional to tell this story. In a moment like that, there’s not a lot of time for reflection; you just have to act. I think we were lucky because we had people at the Library who were ready to act. On the professional end of it, we could deal with the emotions and all the details about how this collection would be made available later, but in the moment, we had to just get it done and worry about fine-tuning later. It may sound corny, but I think you live for those moments when you know you’re really making a difference. If we hadn’t jumped on it and done what we did, that material would be gone, and what a loss that would have been for the historical record.

The Detroit Free Press created this guide to combat inaccurate or insensitive portrayals of Arab Americans in news coverage after September 11, 2001.

Did the team have any specific goals or considerations in mind when selecting sites? 

Cheryl & Kathy: Web archiving was so new that we weren’t sure what we were allowed to capture. We wanted to find multiple viewpoints and reactions – not just pro-US. We wanted to give an accurate sample of the world’s response to an event in the United States. We thought about what researchers might want or need to see twenty or fifty years from then – and we’re still trying to do that in web archiving twenty years later.

What is unique about the September 11, 2001 Web Archive? In other words, what can you find in there that might not be in archives of newspapers or television footage?

Diane: Twenty years ago, most of what was going up on the web was not designed or intended to be permanent. A study at the time said that the average lifespan of a website was seventy-five days. It would be tempting to view it as, ‘oh that’s just ephemeral,’ but I think the kind of content that we were capturing was really important – spontaneous memorials, people looking for loved ones, advice on how to cope with tragedy, and information about how to donate blood or make monetary donations. We recognized that it was important, and if we didn’t collect it, there would be holes or gaps in the story.

Abbie: As Cassy, an RO who worked on the project, said in an interview ten years later, “At LC, we focused on difficult to find sites that spontaneously appeared as immediate reactions to the events – sites that were not easily found by search engines (too new to be indexed) but that were being shared in emails, linked from other sites, or in news articles at that time.” Looking through the collections, an example of this kind of content is the International Association of Firefighters website, which had condolences and information about donating to the New York Firefighter’s disaster relief fund. Another is “Gramma Hugs,” which is a personal site that exemplifies the kind of tributes that came from ordinary people and often disappeared quickly from the internet. We also have reactions from religious organizations, from small churches to the Vatican, and a number of school sites that document the response in classrooms and on campuses. Adding to the variety, there are examples of how people all over the world expressed their emotions through artwork and poetry.

You Will Never Be Forgotten is an example of the spontaneous memorials where people found an outlet to express their grief.

Psychology in Daily Life was quick to post topical content related to how to cope with trauma.

What challenges did the team face in creating the web archive?

Diane: There was no playbook, no manual that said ‘for web archiving turn to page 17.’ It was all new, and to me that was the exciting part, and the material that’s there as a result is irreplaceable.

Abbie: Diane, there was a quote in an article you wrote that I think is really insightful about some of the other types of challenges we faced. You wrote, “The September 11 Web Archive project raised many of the same issues librarians have been confronting since the profession’s beginning. What are the dangers in collecting unevaluated sources? How do librarians, as keepers of the culture, ensure accuracy, balance and objectivity when disseminating information that has not been vetted through the avenues of peer review common to print media? When there are issues of national security at stake, how do we protect intellectual freedom and guard against censorship? Is it appropriate for the Library to collect Web sites that primarily seek to inflame, offend and promote hate?” I think that is a good summary of some of the issues the Library was grappling with.

When you consider the collection now, twenty years later, are there any aspects of it that strike you as being particularly noteworthy or surprising?

Diane: Well, first of all, that it exists at all. In bureaucracies, it can be a challenge responding to unexpected events. Everyone rushes around trying to figure out what the right response is. However, I think what we captured stands the test of time. It’s important material, and I’m always moved by the collection – the personal stories, photos, spontaneous memorials, poetry, and memories of friends or coworkers that were lost. If we had started the Minerva pilot just two years later, we probably wouldn’t have done a September 11th web archive, and that material might not exist anymore. It was a leap of faith to recognize that we didn’t have web archiving all figured out, but we were going to do the best we could with the amazing team that we had. I’m extremely proud of the work Abbie and others did to get the web archiving program going. It serves an important research need that complements everything else the Library is known for.

Abbie: When I see examples from the September 11, 2001 Web Archive now, I’m reminded of how far the Library has advanced in terms of our web archiving capabilities. Websites were a lot simpler back in 2001, but we were still figuring out how to fully capture them. Our tools at the time tended to focus on text rather than images, which means we sometimes lost valuable visual content – like the images on this website selling peace flags, for example. I also noticed that a small portion of the sites in the archive don’t seem to have content related to September 11th at all, and that is because we had to come up with a list of sites very quickly and nominations were coming not just from library staff but also from a number of organizations, volunteers, and the general public. Sites were sometimes added because someone thought they might post relevant content, and we started collecting them just in case. Knowing how fleeting web content could be, we erred on the side of collecting so that we would be able to preserve as much as we could before it disappeared. Web archiving was not as technologically advanced then, and the systems we had in place to create and manage web collections were not as sophisticated as they are now. Nonetheless, I’m still impressed by the number of websites we were able to capture, and we have learned a lot over the past twenty years.

The website NYCStories.com collected the memories and reactions of ordinary people to the events of September 11, 2001.

Was there anything that you didn’t collect that in hindsight you wish you had?

Diane: No, I wouldn’t have done anything differently. Collecting for the web archive and other collections was limited only by our willingness to reach out. So yes, there probably was something out there that we didn’t get, but we were so indefatigable about our collecting, saying basically, ‘we’ll figure it out, and we’ll make a home for it,’ that I can’t think of any area where we gave up and said ‘we just have to let that go.’ Of course, a researcher could analyze the collection and find gaps, but I think in terms of what we were intending to do, we did a pretty great job of being comprehensive, smart, and sensitive about what we went after.

Cheryl & Kathy: In 2001, we were trying to project what researchers twenty years in the future would want to see. We might wish for more personal accounts than we were able to collect, but this is because blogs, podcasts and social media were in their infancy.

How has the approach to event-based collecting changed for the Library since the September 11, 2001 Web Archive was created?

Abbie: So much has changed. I often describe the September 11th collection as the one that took the web archiving team firmly out of the “pilot” stage. Back then, we had a small number of LC staff involved and could start collecting immediately with our existing partnerships in place – a quick call to Internet Archive and our other partners got things going. Twenty years later, we have more formal policies and workflows, as well as more people involved in the web archiving process. We still do some event-based harvesting around elections in the United States and internationally, and we are documenting events like the COVID-19 pandemic, but primarily we leave urgent event-based collecting to colleagues working on projects such as the International Internet Preservation Consortium’s collaborative collections, or the Internet Archive’s Global Events collections. Depending on the event, Library staff sometimes contribute subject expertise to these collaborative efforts. Over the years, we’ve also built up a number of ongoing collections that continually capture current events, such as our General News on the Internet Web Archive, and our Public Policy Topics Web Archive, so if something does happen we are likely already collecting websites that will be publishing materials about it.

Some responses have been edited for clarity or excerpted when necessary.
