Five Questions for Will Elsbury, Project Leader for the Election 2014 Web Archive

The following is a guest post from Michael Neubert, a Supervisory Digital Projects Specialist at the Library of Congress.

The 2008 Barack Obama presidential campaign web site a week before the election.


Since the U.S. national elections of 2000, the Library of Congress has been harvesting the web sites of candidates in elections for Congress, state governorships and the presidency. These collections require considerable manual effort, first to identify the sites correctly and then to populate the in-house tool that controls the web harvesting, which runs weekly for about six months of the election year. (The length of the crawling depends on the timing of each jurisdiction’s primaries and the availability of information about the candidates.)

Many national libraries started their web archiving activities by harvesting the web sites of political campaigns. By their very nature and function, these sites typically have a short lifespan and disappear after the election, and during the campaign their contents may change dramatically. A weekly “capture” of each site, made available through a web archive for the election, provides a snapshot of the sites and shows how they evolved during the campaign.

With Election Day in the U.S. approaching, it’s a great opportunity to talk with project leader Will Elsbury, as part of our Content Matters interview series, about the identification and nomination of the 2014 campaign sites and his other work on this effort.

Michael: Will, please describe your position at the Library of Congress and how you spend most of your time.

Will: I came to the Library in 2002. I am the military history specialist and a reference librarian for the Humanities and Social Sciences Division. I divide most of my time between Main Reading Room reference desk duty, answering researchers’ inquiries via Ask a Librarian and through email, doing collection development work in my subject area, participating in relevant Library committees, and managing a number of Web archiving projects. Currently, a good part of my time is devoted to coordinating and conducting work on the United States Election 2014 Web Archive. Several other Web archiving collections are also under way, each running for a set period to cover an important historical anniversary.

Michael: Tell us about this project and your involvement with it over the time you have been working on it.

Will: I have been involved with Web archiving in the Library for the last ten years or so. The projects have been a variety of thematic collections ranging from historical anniversaries such as the 150th commemoration of the Civil War and the centennial of World War I, to public policy topics and political elections. The majority of the projects I have worked on have been collecting the political campaign Web sites of candidates for the regular and special elections of Congress, the Presidency and state governorships. In most of these projects, I have served as the project coordinator. This involves gathering a work team, creating training documents and conducting training, assigning tasks, reviewing work, troubleshooting, corresponding with candidates and election officials and liaising with the Office of Strategic Initiatives staff who handle the very important technical processing of these projects. Their cooperation in these projects has been vital. They have shaped the tools used to build each Web archive, evolving them from a Microsoft Access-created entry form to today’s Digiboard (PDF) and its Candidates Module, which is a tool that helps the team manage campaign data and website URLs.

Michael: What are the challenges?  Have they changed over time?

Will: One of the most prominent challenges with election Web archiving is keeping abreast of the many differences found among the election practices of 50 states and the various territories. This is even more pronounced in the first election after state redistricting or possible reapportionment of Congressional seats. Our Web archive projects only archive the Web sites of those candidates who win their party’s primary and those who are listed as official candidates on the election ballot, regardless of party affiliation. Because the laws and regulations vary in each state and territory, I have to be certain that I or an assigned team member has identified a given state’s official list of candidates.

Some states are great about putting this information out. Others are more challenging, and a few don’t provide a list until Election Day. That usually causes a last-minute sprint of intense work on the part of both my team and the OSI staff. Another issue is locating contact information for candidates. We need this so that an archiving and display notification message can be sent to a candidate. Some candidates display their contact information very prominently, but others present more of a challenge, and it can take a number of search strategies and sleuthing tricks developed over the years to locate the necessary data. Sometimes we have to contact a candidate directly by telephone, and I can recall more than once having to listen to some unique and interesting political theories and opinions.

2002 web site for the campaign of then-Speaker Denny Hastert of Illinois.


Michael: You must end up looking at many archived websites of political campaigns – what changes have you seen?  Do any stand out, or do they all run together?

Will: I have looked at thousands of political campaign web sites over the years. They run the gamut from slick and professional, to functional, to extremely basic and even clunky. There is still that variety out there, but I have noticed that many more candidates now use companies dedicated to the business of creating political candidacy web sites. Some are politically affiliated and others will build a site for any candidate. The biggest challenge here has to be identifying the campaign web site and contact information of minor party and independent candidates. Often these candidates work on a shoestring budget, if any, and cannot afford the cost of a campaign site. They will usually run their online campaign using free or low-cost social media such as a blog, Facebook or Twitter.

Michael: How do you imagine users 10 or 20 years from now will make use of the results of this work?

Will: Researchers have already been accessing these Web archives for various purposes. I hope that future researchers will use these collections to enhance and expand their research into the historical aspects of U.S. elections, among other purposes. Many incidents and events influence elections. Scandals, economic ups and downs, divisive social issues, military deployments, and natural disasters figure prominently in how political campaigns are shaped and may ultimately help a candidate win or lose an election. Because so much of candidates’ campaigns is now found online, it is doubly important that these campaign Web sites are archived. Researchers will likely find many ways to use the Library of Congress Web archives that we may not anticipate now. I look forward to helping continue the Library’s effort in this important preservation work.
