One of our annual highlights is The Storage Meeting, which brings together digital preservation practitioners and data storage vendors for an open discussion. We held this year’s meeting, Designing Storage Architectures for Preservation Collections, on September 26–27 in Washington, DC.
Over 100 archivists, librarians, computer scientists, IT professionals and storage vendors participated: people in the trenches who manage digital collections and run IT infrastructures, researchers and engineers experimenting with new technologies, and managers and executives.
I love this meeting.
It’s a rare opportunity for preservation professionals to have an open conversation with IT staff and vendors about what they are trying to accomplish, for vendors and researchers to present new trends, and for everyone to see just how close or far apart we are in meeting our needs.
Presentation topics included issues faced by a number of large digital collections, findings from surveys of preservation storage use, economic models for analyzing preservation storage infrastructure expenditures, the current state of solid-state and flash storage, tools for managing storage infrastructure, use of cloud storage, and reducing power usage.
The two buzzwords for this meeting were: “faster” and “migration.”
Every collecting institution in the room had some sort of migration going on, be it networking, hardware, software, media file formats, or metadata. The assumption was that this would be an ongoing effort for all of us, rather like the story about painting the Golden Gate Bridge – that every time they finish, it’s time to start again. (Note that they don’t really continuously paint it end-to-end; I looked it up).
Constantly updating our infrastructures is expensive, but the need to reduce power consumption, increase storage capacity in fixed-space data centers, and outpace the growth of our digital collections is vital.
Also vital are improving the speed at which we move files between and within our institutions and improving our ability to provide access to increasingly large collections. It was not uncommon at this meeting to hear about multi-petabyte video and audio collections, 250-terabyte web archives, collections of millions of digitized books and billions of tweets. Cultural heritage institutions have entered the realm of “big data,” and service models that once supported near-instantaneous querying and transferring of files for research use are now being revised to allow for hours for queries and days for transfers. Many of the presentations focused on denser storage for larger collections, faster write and seek times, and faster processing. The catch, many opined, is that network infrastructure may be the one area that has lagged in this era of large files and collections.
Notes and slides from the meeting will be posted on digitalpreservation.gov by the end of the first week in October. In-depth reports on some of the presentations will also appear on this blog in the very near future. Stay tuned!