The following is a guest post by David Brunton, a Supervisory Information Technology Specialist in the Library of Congress Office of Strategic Initiatives.
I have heard the National Digital Newspaper Program jokingly described as “putting breaking news online, within 200 years.” In some ways, it’s a fitting tag line: the most current newspaper pages released on Chronicling America are nearly ninety years past.
It’s a phenomenal project: a joint venture between the National Endowment for the Humanities and the Library of Congress to digitize historic American newspapers. It’s fueled by partners in state institutions who select and digitize American papers published between 1836 and 1922. Try searching for a great-grandparent, a favorite news item, or a serial novel, and the once-breaking news may surprise again.
Even though the news we’re publishing is a century old, we still try to make pages available to the public quickly. To begin with, this is the proof of one of the project’s core goals: enhancing access to historic American newspapers. But it is also true that once we get the news online, it’s only a beginning. Pages then get incorporated into all manner of photo sharing sites, online encyclopedias, search engines, special-purpose indexes, and big-data explorations. People are anxious to get at the information, so we’re trying to get it online in faster, more efficient ways.
For five years, we have been releasing content in large batches, with a several-month break between releases. We have now ended that practice in favor of streaming pages out the door as soon as we complete our internal review. This is closer, I imagine, to the way the publishers did it when the news was current.
It’s a big change for us, and we haven’t worked out all the kinks, but several hundred thousand pages have made it through the process, and the results are there to read. What I like best about this change is that I now visit the site more frequently, trolling for updates.