Working at Scale: The Firehose of Data
Posted by: Aaron D. Chaletzky
The Library of Congress is the largest library in the world, with over 173 million cataloged items. Over the past decade, digitization of this enormous collection has increased exponentially. The Preservation Services Division (PSD) is responsible for a huge portion of this effort, managing contracts for the digitization of millions of book and newspaper pages and microfilm frames each year. All of this imaging produces a lot of data: hundreds of millions of files. This is how we manage that data.
Posted in: Collections Management, Preservation, Reformatting, Tools of the Trade, Uncategorized