The following is a guest post by Zach Coble, Systems and Emerging Technologies Librarian at Gettysburg College.
When I began this job a year ago, one of the first things our director told me was that the library had just purchased two blogs on the Civil War and she wanted me to figure out how they could be used most effectively for teaching and learning. From the start I wasn’t sure exactly what I was supposed to do. “What do you mean we purchased two blogs? Why?”
A little bit of context first.
Many students are drawn to Gettysburg College for its curricular offerings and opportunities related to the Civil War, and the library aims to provide the resources necessary to support these initiatives. Musselman Library Special Collections has a strong collection of Civil War materials, but, as a small liberal arts college, our budget is limited. Moreover, most of the intriguing Civil War artifacts have already been purchased or, if they’re on the market, are quite expensive.
In this environment, how can the library continue to acquire and provide access to resources that support students and faculty in creating original scholarship? One avenue, suggested by Peter Carmichael of Gettysburg College’s Civil War Institute, is to collect Civil War blogs.
It’s not as strange as it sounds.
The study of Civil War memory, or how people at various points in time after the War perceived and interpreted it through reunions, monuments, and the like, is a growing field. Our Special Collections has a strong collection of such material, especially relating to the Battle of Gettysburg: veteran reunions, battlefield monuments, and artwork.
Furthermore, since much of the current discussion relating to Civil War memory is taking place online, Carmichael envisions the College taking an active role in collecting digital materials and focusing the collection on the study of Civil War memory.
As a pilot project, Gettysburg College has purchased two blogs, Civil War Memory and Cosmic America. It’s exciting to explore new forms of scholarship, but we’re not exactly sure what to do with the blogs. Although the blogs are currently active, they will not always be, so we must determine how we want to preserve them. Since none of us are experts in digital preservation, we are trying to understand at a conceptual level how best to approach this project.
This initiative has required us to think of larger issues concerning the library’s role in digital curation. Should libraries even try to preserve blogs and other digital content? Are we equipped, in terms of technology and staffing, to take on this kind of work? Can’t we rely on the big names in the field like the Library of Congress and the Internet Archive to take care of this?
As an employee of a cultural institution, I’m biased to believe that libraries (as well as archives, museums, and others) have a responsibility to preserve cultural content as it fits within the mission, goals, and collection development policy of the organization. I also believe that institutions need to take responsibility and work to inform themselves so they can properly care for the digital materials in their own collections.
As we began to research the different tools and methods of preservation, we gained a better understanding of our own goals for the project. Obviously, we are interested in preserving the blog posts and comments. However, we can’t predict how future users will want to use a site or even who they will be, so we’d like to try to preserve as much as possible, including the site layout and design. Our choices basically come down to:
1. Develop an in-house system. Possibly begin by migrating the blogs to campus servers, use a web crawler (e.g., Heritrix) to harvest them, and then figure out what to do with the resulting files. (How do we catalog them? Do we set up mirror sites or just make the crawls searchable?)
2. Use a hosted service. As with any hosted product, the tradeoff is control vs. convenience. There are services designed specifically for educational and cultural institutions that offer many of the features we’re interested in.
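To make the in-house questions about cataloging and searching a little more concrete, here is a minimal sketch in Python of what "figuring out what to do with the files" might involve. It is not something we have built. It assumes the harvested pages are already available as URL-to-HTML pairs; the function names `catalog_crawl` and `search` and the sample URLs are hypothetical, and a real crawl from a tool like Heritrix would need its own step to read the crawler's output first.

```python
import hashlib
import re
from collections import defaultdict


def catalog_crawl(pages):
    """Build a manifest and a keyword index from harvested pages.

    `pages` is assumed to map each URL to the HTML text captured for it.
    Returns (manifest, index): the manifest lists one record per page with
    a SHA-256 checksum and size in bytes; the index maps lowercase words
    to the set of URLs whose visible text contains them.
    """
    manifest = []
    index = defaultdict(set)
    for url, html in sorted(pages.items()):
        data = html.encode("utf-8")
        manifest.append({
            "url": url,
            # The checksum lets us verify later that the stored copy is unchanged.
            "sha256": hashlib.sha256(data).hexdigest(),
            "bytes": len(data),
        })
        # Crude tag stripping; a real system would use a proper HTML parser.
        text = re.sub(r"<[^>]+>", " ", html)
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(url)
    return manifest, index


def search(index, term):
    """Return the sorted list of URLs whose text contains `term`."""
    return sorted(index.get(term.lower(), set()))


# Made-up sample pages standing in for a real crawl.
sample_pages = {
    "https://example.org/post-1": "<h1>Reunion at Gettysburg</h1><p>Veterans gathered.</p>",
    "https://example.org/post-2": "<p>Monuments and memory</p>",
}
manifest, index = catalog_crawl(sample_pages)
```

Calling `search(index, "memory")` on the sample data returns the URLs of the pages that mention that word. Even a toy version like this shows the two separate jobs involved: recording what was captured (the manifest) and making it findable (the index).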
While our research has been productive, we have not yet decided on a solution. We are just beginning to consult with our IT department to determine how the campus’s technology infrastructure will influence our approach.
For example, the two Civil War blogs we have use WordPress, but our servers are not equipped to accommodate every content management system. Although moving the blogs to campus servers and then crawling them periodically would offer a robust solution, perhaps a hosted solution would meet our needs equally well. Or maybe our best solution is something completely different. We’re still discovering and planning, and keeping an open mind about our end goals as well as which tools will help us get there.
Are you working on a similar project or have you encountered similar issues? Share your experience!