Results from the 2013 NDSA U.S. Web Archiving Survey

The following is a guest post from Abbie Grotke, Web Archiving Team Lead, Library of Congress and Co-Chair of the NDSA Content Working Group.

The National Digital Stewardship Alliance is pleased to release a report of a 2013 survey of web archiving institutions (PDF) in the United States.

A bit of background: from October through November of 2013, a team of National Digital Stewardship Alliance members, led by the Content Working Group, conducted a survey of institutions in the United States that are actively involved in, or planning to start, programs to archive content from the web. This survey built upon a similar survey undertaken by the NDSA in late 2011 and published online in June of 2012. Results from the 2011-2012 NDSA Web Archiving Survey were first detailed on May 2, 2012 in “Web Archiving Arrives: Results from the NDSA Web Archiving Survey” on The Signal, and the full report (PDF) was released in July 2012.

The goal of the survey was to better understand the landscape of web archiving activities in the U.S. by investigating the organizations involved, the history and scope of their web archiving programs, the types of web content being preserved, the tools and services being used, access and discovery services being provided and overall policies related to web archiving programs. While this survey documents the current state of U.S. web archiving initiatives, comparison with the results of the 2011-2012 survey enables an analysis of emerging trends. The report therefore describes the current state of the field, tracks the evolution of the field over the last few years, and forecasts future activities and developments.

The survey consisted of twenty-seven questions (PDF) organized around five distinct topic areas: background information about the respondent’s organization; details regarding the current state of their web archiving program; tools and services used by their program; access and discovery systems and approaches; and program policies involving capture, availability and types of web content. The survey was started 109 times and completed 92 times for an 84% completion rate. The 92 completed responses represented an increase of 19% in the number of respondents compared with the 77 completed responses for the 2011 survey.
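As a quick check on the response figures above, the arithmetic can be reproduced in a few lines. This is only an illustrative sketch: the variable names are ours, and the only inputs are the three counts quoted in the paragraph (109 started, 92 completed, 77 completed in 2011).

```python
# Counts quoted in the post; variable names are illustrative only.
started_2013 = 109
completed_2013 = 92
completed_2011 = 77

# Share of started surveys that were completed.
completion_rate = completed_2013 / started_2013

# Relative increase in completed responses from the 2011 survey to the 2013 survey.
growth = (completed_2013 - completed_2011) / completed_2011

print(f"Completion rate: {completion_rate:.1%}")
print(f"Growth in completed responses: {growth:.1%}")
```

Rounded to whole percentages, these values agree with the 84% and 19% reported above.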

Overall, the survey results suggest that web archiving programs nationally are both maturing and converging on common sets of practices. The results highlight challenges and opportunities that are, or could be, important areas of focus for the web archiving community, such as opportunities for more collaborative web archiving projects. We learned that respondents are highly focused on the data volume associated with their web archiving activity and its implications for cost and for the usage of their web archives.

Based on the results of the survey, cost modeling, more efficient data capture, storage de-duplication, and anything that promotes web archive usage and/or measurement would be worthwhile investments by the community. Unsurprisingly, respondents continue to be most concerned about their ability to archive social media, databases and video. The research, development and technical experimentation necessary to advance the archiving tools on these fronts will not come from the majority of web archiving organizations with their fractional staff time commitments; this seems like a key area of investment for external service providers.

We hope you find the full report interesting and useful, whether you are just starting out developing a web archiving program, have been active in this area for years, or are just interested in learning more about the state of web archiving in the United States.

Close Reading, Distant Reading: Should Archival Appraisal Adjust?

From time to time, co-chairs of the National Digital Stewardship Alliance Arts and Humanities Content Working Group will bring you guest posts addressing the future of research and development for digital cultural heritage as a follow-up to a dynamic forum held at the 2014 Digital Preservation Conference. The following is a guest post from Meg […]

What Does it Take to Be a Well-rounded Digital Archivist?

The following is a guest post from Peter Chan, a Digital Archivist at the Stanford University Libraries. I am a digital archivist at Stanford University. A couple of years ago, Stanford was involved in the AIMS project, which jump-started Stanford’s thinking about the role of a “digital archivist.” The project ended in 2011 and I […]

We Want You Just the Way You Are: The What, Why and When of Fixity

Fixity, the property of a digital file or object being fixed or unchanged, is a cornerstone of digital preservation. Fixity information, from simple file counts or file size values to more precise checksums and cryptographic hashes, is data used to verify whether an object has been altered or degraded. Many in the preservation community know […]

The Library of Congress Wants Your File Format Ideas

In June of this year, the Library of Congress announced a list of formats it would prefer for digital collections. This list of recommended formats is an ongoing work; the Library will be reviewing the list and making revisions for an updated version in June 2015. Though the team behind this work continues to put […]

Announcing the Release of the 2015 National Agenda For Digital Stewardship

The National Digital Stewardship Alliance is pleased to announce the release today of the “2015 National Agenda for Digital Stewardship.” The Agenda provides funders, decision‐makers and practitioners with insight into emerging technological trends, gaps in digital stewardship capacity and key areas for research and development to support the work needed to ensure that today’s valuable […]

QCTools: Open Source Toolset to Bring Quality Control for Video within Reach

In this interview, part of the Insights Interview series, FADGI talks with Dave Rice and Devon Landes about the QCTools project. In a previous blog post, I interviewed Hannah Frost and Jenny Brice about the AV Artifact Atlas, one of the components of Quality Control Tools for Video Preservation, an NEH-funded project which seeks to […]

Preliminary Results for the Ranking Stumbling Blocks for Video Preservation Survey

In a previous blog post, the NDSA Standards and Practices Working Group announced the opening of a survey to rank issues in preserving video collections. The survey closed on August 2, 2014 and while there’s work ahead to analyze the results and develop action plans, we can share some preliminary findings. We purposely cast a […]

Untangling the Knot of CAD Preservation

At the 2014 Society of American Archivists meeting, the CAD/BIM Taskforce held a session titled “Frameworks for the Discussion of Architectural Digital Data” to consider the daunting matter of archiving computer-aided design and Building Information Modelling files. This was the latest evidence that — despite some progress in standards and file exchange — archivists and the […]

Curating Extragalactic Distances: An interview with Karl Nilsen & Robin Dasler

While a fair amount of digital preservation focuses on objects that have clear corollaries to objects from our analog world (still and moving images and documents for example), there are a range of forms that are basically natively digital. Completely native digital forms, like database-driven web applications, introduce a variety of challenges for long-term preservation […]