Partly Cloudy: Trends in Distributed and Remote Preservation Storage – More Results from the NDSA Storage Survey

The following is a guest post by Jefferson Bailey, Fellow at the Library of Congress’s Office of Strategic Initiatives.

As discussed in a previous post, the NDSA Infrastructure Working Group recently conducted an extensive survey of NDSA member preservation storage systems. While the survey examined a wide range of preservation storage trends and activities, it originated from the working group’s interest in exploring the current use of cloud computing among NDSA members.

Defining “the cloud” in digital preservation storage

General conversation about the cloud focuses on third-party cloud storage providers. As the results below suggest, adoption of these cloud storage providers remains relatively small. However, when we consider cloud storage alongside several related ways of distributing and using storage as a service, some interesting trends emerge.

The results illuminate both the widespread acceptance of some digital preservation storage practices and the continuing uncertainty regarding others. For example, there is broad acceptance of the importance of geographic redundancy in maintaining preservation copies of content. A majority of members are currently keeping all or some of their preservation copies in multiple geographic locations.

Similarly, participation in distributed or collective preservation systems is gaining in popularity, with half the respondents participating in or planning on joining such a system. Lastly, usage of third-party and cloud-based storage systems is still an emerging idea. Many members are exploring this option but functionality challenges, issues of trustworthiness and uncertainty over sustainability are limiting widespread adoption.

Managing copies in different geographic locations

A significant majority of members are keeping their digital assets in different geographic locations, which signals a success in establishing baseline best practices for preservation storage. Among NDSA members:

  • 76% (44 of 58) reported keeping data in multiple geographic locations for all of their content
  • 14% (8 of 58) reported keeping a complete copy in multiple geographic locations for some of their content
  • 10% (6 of 58) reported that they do not keep their data in multiple geographic locations.

Cooperatives, contracting out and cloud storage

Members were each asked whether they were currently using, planning to use, exploring the possibility of using or not considering using a distributed storage cooperative, a contracted provider of storage or a third-party cloud storage provider. The chart below reports the frequency at which members responded to each of these questions.

The chart illustrates two trends among the membership: a clear movement toward participating in distributed storage cooperatives, and substantive interest in cloud storage, as shown by the 20 members currently exploring or planning on incorporating cloud storage systems. The individual frequencies for each option are reported below the chart.

Participating in distributed storage cooperatives or systems

Many NDSA members are involved in distributed and cooperative systems:

  • 43% (25 of 58) are using distributed and cooperative systems
  • 26% (15 of 58) are planning on or actively exploring using these systems
  • 31% (18 of 58) of members are not currently considering this storage option.

Of those using distributed and cooperative systems, 56% (14 of the 25) are using some type of LOCKSS system. The trustworthiness of distributed digital preservation cooperatives, an issue outlined in this paper, appears to be gaining acceptance.

Contracting out storage services

  • 27% (15 of 56) are currently contracting out some of their preservation storage to be managed by third parties
  • 4% (2 of 56) are planning to contract out some of their preservation storage
  • 18% (10 of 56) are currently exploring this option
  • 52% (29 of 56) of members are not considering contracting out storage services to be managed by a third party.

Using third-party cloud storage service providers

  • 16% (9 of 58) of members are using third-party cloud storage service providers for keeping a copy of their content
  • 7% (4 of 58) are planning on using third-party cloud storage service providers
  • 28% (16 of 58) are currently exploring this option
  • 49% (28 of 58) are not considering using cloud services for keeping a copy of their content.

Feelings of control and the cloud

One interesting result revealed by the survey is the tension between the use of third-party systems and a stated preference to host, maintain and control preservation storage. As seen in the above numbers, nearly 50% of respondents are using, planning on using, or considering contractor services or third-party cloud storage. At the same time, 74% (43 of 58) of the members agreed or strongly agreed that they had a strong preference for maintaining and controlling their preservation storage systems. The most-cited reasons for this preference were costs, trustworthiness, legal mandate and security and risk management.

One survey question offers insight into this complexity. The question asked members to rank the significance of specific preservation storage system features (with 1 being least significant and 7 being most significant). The chart below shows the results. The highlighted cells indicate 10 or more responses. The sum for each feature was calculated by multiplying the number of responses at each rank by that rank and then adding the products.
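The scoring method described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the feature names and response counts below are hypothetical placeholders, not figures from the survey, and the function name `weighted_sum` is our own.

```python
from typing import Dict


def weighted_sum(counts_by_rank: Dict[int, int]) -> int:
    """Multiply each rank (1-7) by the number of responses it received, then add."""
    return sum(rank * responses for rank, responses in counts_by_rank.items())


# Hypothetical response counts (rank -> number of responses). NOT survey data.
example_responses = {
    "feature A": {1: 2, 2: 3, 7: 10},
    "feature B": {1: 12, 2: 5, 7: 1},
}

# One score per feature; a higher score means the feature was ranked as more significant.
scores = {name: weighted_sum(counts) for name, counts in example_responses.items()}

for name, score in scores.items():
    print(name, score)
```

Under this scheme a feature that many members ranked at 7 scores well above one that most members ranked at 1, even when both received a similar number of responses.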

That features like built-in fixity checking, automated tasks and migration services rank above block-level access suggests that contractor and third-party cloud services have yet to satisfy a demand for these types of functions, a reading also supported by the higher participation in distributed cooperative systems. Vendor and cloud-based systems are playing a significant role in preservation, but a dearth of functionality and the uncertainties inherent in relinquishing control are likely limiting their widespread use.

Taking this information about desired functionality into account, these results may suggest that feelings about control are being expressed in different ways. While to some, control may mean block-level access to content, this level of control was far and away the least requested feature. In contrast, built-in functions like fixity checking and automated inventory, retrieval and management services express a different sense of control. Built-in functionality provides an organization with preservation information that gives it assurance over the integrity of its content; however, it does so by actually reducing the direct control individuals in the organization can exert on digital objects. In this sense, control may be conditionally defined according to the specific preservation activity or whether that activity is occurring locally or in “the cloud.” Here the survey results open up more questions than they answer about exactly what kinds of control member organizations want to be able to exercise. Still, the strong preference for control over storage, coupled with a desire for additional automated functionality, suggests that the desire for control is not manifesting itself as a strong desire for block-level control.

Note on survey data

The NDSA Infrastructure Survey, conducted between August 2011 and November 2011, received responses from 58 members of the 74 NDSA member organizations who are preserving digital content. This represents a 78% response rate. The goal of this survey was to get a snapshot of current storage practices within the organizations of the National Digital Stewardship Alliance.

The original survey was sent out to the 98 members of the NDSA. We confirmed that 24 members do not have content they are actively involved in preserving. These organizations include consortia groups, professional organizations, university departments, funders and vendors. There were 16 organizations that neither responded to the survey nor indicated that they were not preserving digital collections. The 16 non-respondents are distributed across the different kinds of organizations in the NDSA, including state archives, service providers, federal agencies, universities and media producers.
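For readers who want to check the figures in this note, the arithmetic can be reproduced as follows. The numbers come from the two paragraphs above; the variable names are our own.

```python
# Figures stated in the note on survey data.
total_members = 98    # organizations the survey was sent to
not_preserving = 24   # confirmed as not actively preserving content
respondents = 58      # organizations that responded

eligible = total_members - not_preserving  # members preserving digital content
non_respondents = eligible - respondents   # eligible members that did not respond
response_rate = respondents / eligible     # share of eligible members that responded

print(f"{eligible} eligible, {non_respondents} non-respondents, "
      f"{response_rate:.0%} response rate")
```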
