The following interview is a guest post from Karen Cariani, Director of the WGBH Media Library and Archives at WGBH Educational Foundation and Co-Chair for the National Digital Stewardship Alliance Infrastructure Working Group.
Open source software is playing an important role in digital stewardship. In an effort to better understand the role open source software is playing, the NDSA infrastructure working group is reaching out to folks working on a range of open source projects. Our goal is to develop a better understanding of their work and how they are thinking about the role of open source software in digital preservation in general.
For background on discussions so far, review our interviews with Bram van der Werf on Open Source Software and Digital Preservation, Peter Van Garderen & Courtney Mumma on Archivematica and the Open Source Mindset for Digital Preservation Systems and Mark Leggott on Islandora’s Open Source Ecosystem and Digital Preservation. In this interview, we talk with Tom Cramer, Chief Technology Strategist & Associate Director, Digital Library Systems & Services at Stanford University Libraries.
Karen: Could you give us some background on the Hydra project? How did this project come about and what are its goals and objectives?
Tom: Hydra’s goals are to combine the power of a repository for enterprise-scale digital asset management and preservation, with tailored interfaces, workflows and access systems specific to different content types and streams–e.g., articles vs. images vs. time-based media vs. books vs. data. The project started in 2009 when three universities (Hull, Stanford, Virginia) plus Fedora Commons came together to see if they could jointly develop a flexible application framework to complement Fedora. The project motto quickly became “if you want to go fast, go alone; if you want to go far, go together”, and we’ve spent as much energy on building a vibrant and sustainable community as we have on the code.
Karen: Could you tell us a bit about how you and your institution got involved in this project? Further, could you tell us a bit about how your thinking on digital repository platforms has changed and developed over time?
Tom: In 2008-09, Stanford was re-evaluating the architecture and platform for the Stanford Digital Repository. When we started building the first generation of the system in 2005, we decided to write our own repository from scratch, as we didn’t think the existing platforms at the time were good starting points for us. By 2008, we had a much better sense of our needs–which included collaborative development on a shared platform. We also felt that the repository community, and Fedora in particular, had greatly matured, and that Fedora would make a serviceable component in our environment for the second-generation SDR. In discussions with colleagues at UVa and Hull, it became clear that we were not the only ones with these needs, and a joint approach could be highly leveraged.
Karen: The NDSA infrastructure working group’s exploration of open source software is focused on figuring out if there are any inherent benefits to using open source software for parts of an organization’s digital preservation strategy. Do you see any such inherent benefits, and if so what are they?
Tom: Yes, I believe there is a substantial benefit to using open source software for digital preservation activities. Using open source software isn’t necessarily any cheaper than licensing commercial software, but it does give an institution a predictable ongoing cost for maintenance, and more control over direction than with most vendor products. Since so much of digital preservation is about the sustainability of supporting a system, these are important factors. The transparency that comes with OSS is also a benefit–with more eyes and more users on a given piece of software, the chances of uncovering latent issues are arguably greater than with proprietary software (though this would only apply in OSS projects with sufficient adopters, of course). Most importantly, though, I think using OSS puts the reins of digital preservation in your institution’s hands in a way that commercial software does not. In another ten years, when digital preservation is better understood still and there are a number of well-understood, commodity functions, I’d hope there is a robust and competitive marketplace of commercial solutions. Until then, though, the flexibility and customizability of OSS can provide a more direct path to meeting your preservation needs if they happen to fall outside the current market providers’ product lines.
Karen: I am impressed by the commitment of the community to share and work collaboratively to improve. What do you attribute this to? And how does this improve the project or make it a better model?
Tom: The pattern of collaboration is baked into Hydra’s DNA; from the very beginning, the software has been a shared development effort. I think this may be due to the fact that we didn’t start the code-base by open sourcing a single institution’s system, nor did we have all the heavy-lifting initially done by grant-funded programmers. We put a lot of time into ongoing communication, including weekly calls, constant IRC, and quarterly face-to-face meetings. That said, I think the biggest draw to aggressive collaboration is the quality of the technical work on Hydra. Developers like working with good developers, and by participating in Hydra collaboratively and sharing their work for re-use, many institutions feel like they’ve produced their best code.
Karen: How does the open community work? What are the guidelines or rules for participating?
Tom: We go to great lengths to make the community open and supportive. We value working code and welcome constructive engagement on any front: participants can add code or documentation, help in communications or outreach, or simply try to use the software and ask questions when they need help. We also put a lot of energy into training, to bring new community members up to speed. The Hydra Partners (of which there are now 17) are institutions that have each committed to the success of the project overall, not just their local use. We have a lot of coordination, a lot of consensus building, and very little central planning. In short, there aren’t “rules” so much as a community process, and people get out what they put in.
Karen: What is the best way to keep up on Hydra head development?
Tom: Joining one of the community email lists, email@example.com or firstname.lastname@example.org, is the best way to stay abreast of what’s up. We also do try to keep the website (http://projecthydra.org) up to date, but in a project that is as large and fast moving as Hydra is, there is always a little latency.
Karen: How do you think these projects can become sustainable? And how might being open help that sustainability, compared with a licensable, vendor-supported system?
Tom: I think any software project–open source or commercial–is sustainable when it provides more benefit than cost, and it’s clear to participants that they get more out of participating/using it than going another route. The great thing about vendors is they provide that focusing lens to translate a community’s interest (and revenue) into ongoing development and support that benefits the community. Hydra has succeeded so far for the same reasons; it has focused the community’s efforts around a common approach, and created a framework to enhance and expand it.