The following is a guest post by Marie Gallagher, a computer scientist in the Lister Hill National Center for Biomedical Communications at the U.S. National Library of Medicine (NLM).
(This post is based on “Improving Software Sustainability: Lessons Learned from Profiles in Science,” an interactive paper (pdf) at the Society for Imaging Science and Technology’s Archiving 2013 conference, April 2-5, 2013.)
This story begins in the early 1990s at the National Library of Medicine, when our group experimented with arranging, describing and digitizing historical manuscript collections to make the collections searchable and accessible to multiple users simultaneously. Our earliest digital library experiments involved a collection containing correspondence and reports from the 1960s and 1970s. At that time we used a proprietary document management system to collect the metadata, manage the digitized images, and allow for searching across the collection. The proprietary system met our basic needs. However, without access to the source code or support from the vendor, we could neither make changes nor add basic functionality. Over time we replaced components of the system so that we could modify them to meet our evolving needs. Fortunately we were no longer dependent on the proprietary system by the time the vendor was acquired and the product was abandoned.
Today, the metadata and digital images we created using that system survive. Over the years we have exported the metadata and imported it into different systems. We scanned the papers to TIFF files, and we benefited from using a sustainable file format: the original TIFF files survive unchanged as our digital masters today. We carefully copy the TIFFs to new media and verify the copies. As a result, keeping the metadata and digital images alive through the years has been minimally burdensome.
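For readers who want to see what copying and verifying can look like in practice, here is a minimal sketch in Python. It is not a description of our actual procedure: comparing SHA-256 checksums is one common way to verify a copy and is assumed here, and the function names `sha256_of` and `copy_and_verify` are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> str:
    """Copy a master file to new media and confirm the copy is identical.

    Assumption: checksum comparison stands in for whatever verification
    a given archive actually uses. Raises ValueError on a mismatch and
    returns the shared digest otherwise.
    """
    expected = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copies the bytes and file timestamps
    actual = sha256_of(destination)
    if actual != expected:
        raise ValueError(f"Checksum mismatch for {destination}")
    return expected
```

Recording the returned digest alongside each master would also make it possible to recheck the file later without going back to the older media.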
Keeping alive the software required for our digital library to function has been a different experience. This software enables metadata creation, quality assurance, digital item management, and access to items and metadata through our current digital library’s Web site, Profiles in Science®. Our digital library’s basic software architecture has remained fairly stable, as have our metadata schema and digital items. But the effort to keep alive the various software components we depend upon has been ongoing and seems to have grown over time. We used public domain or open source software where possible, used proprietary software where the benefit outweighed the cost, and wrote our own software when necessary. A few of these software replacements are shown.
The effort needed for ongoing software upgrades and replacements will come as no surprise to software architects, developers, programmers, security experts and others who use or develop software. But the need may be less obvious to anyone more removed from these activities. The ongoing need for upgrades and replacements has sometimes prompted the question, “Why can’t you just build it and leave it alone?”
A look back at the history of some of our software upgrades and replacements provides some answers. Clearly some changes were necessary in order to add new features. But declining to add new features still would not have eliminated the need to make replacements and upgrades. Some software replacements and upgrades were necessary because of external threats to the stability of our software. These threats included hardware or operating system incompatibilities, loss of backward compatibility, loss of needed functions, new policy requirements, product abandonment, product support/licensing costs, security flaws and software bugs. Not responding to these threats could eventually have resulted in an inability to create or edit metadata and digital items and in a loss of access to our digital items and metadata, not to mention exploitation of security flaws to harm our systems or others.
The technological landscape will continue to change. And we will want to be able to make changes and add new features to better manage and provide access to digitized collections. We will want to keep software maintenance costs as low as possible.
Eliminating the threats and effects of technological obsolescence altogether seems unattainable. But we might be able to delay or diminish them. When we have a choice, we can try to make choices that encourage the sustainability of our software. Such choices might favor software that provides access to its source code, is widely used, well tested, actively developed and well documented, uses standards, has acceptable licensing terms, offers import/export capabilities, runs on multiple platforms, supports backward compatibility, and is not overly customized. More suggestions are welcome.