Collecting Digital Content at the Library of Congress

This is a guest post by Joe Puccio, the Collection Development Officer at the Library of Congress.


Joe Puccio. Photo by Beth Davis-Brown.

The Library of Congress has steadily increased its digital collecting capacity and capability over the past two decades. This growth is the product of numerous independent efforts directed toward the same goal – to acquire as much selected digital content as technically possible and to make that content as broadly accessible to users as possible. At present, over 12.5 petabytes of content – both acquired material and content produced by the Library itself through its digitization program – are under management.

In January, the Library adopted a set of strategic steps related to its future acquisition of digital content. Further expansion of the digital collecting program is seen as an essential part of the institution’s strategic goal to: Acquire, preserve, and provide access to a universal collection of knowledge and the record of America’s creativity.

The scope of the newly adopted strategy is limited to actions directly involved with acquisitions and collecting. It does not cover digitization, nor does it cover other actions that are critical to a successful digital collections program, including:

  • Further development of the Library’s technical infrastructure
  • Development of various access policies and procedures appropriate to different categories of digital content
  • Preservation of acquired digital content
  • Training and development of staff
  • Eventual realignment of resources to match an environment where a greater portion of the Library’s collection building program focuses on digital materials

It must also be emphasized that the strategy is aspirational, since not all of the resources required to accomplish it are in place yet.

Current Status of Digital Collecting and Vision for the Future

In the past few years, much progress has been made in the Library’s digital collecting effort, and an impressive amount of content has been acquired.  As the eDeposit pilot began the complex process of obtaining digital content via the Copyright Office, additional efforts made great strides toward the goal of acquiring and making accessible other content.  Digital collecting has also been integrated into a range of special collections acquisitions.

The adopted strategy is based on a vision in which the Library’s universal collection will continue to be built by selectively acquiring materials in a wide range of formats – both tangible and digital.  Policies, workflows and an agile technical infrastructure will allow for the routine and efficient acquisition of desired digital materials. This type of collection building will be partially accomplished via collaborative relationships with other entities. The total collection will allow the Library to support the Congress in fulfilling its duties and to further the progress of knowledge and creativity for the benefit of the American people.

Assumptions and Principles

The strategy is based on a number of assumptions, most significantly that the amount of available digital content will continue to grow at a rapid rate and that the Library will be selective regarding the content it acquires. An additional primary assumption is that there will continue to be much duplication in the marketplace, with the same content being available both in tangible and digital formats.

Likewise, a number of principles support the strategy, including that the Library is developing one interdependent collection that contains both its traditional physical holdings and materials in digital formats. Other major principles are that the Library will ensure that the rights of those holding intellectual property are respected and that appropriate methods will be put in place to ensure that rights-restricted digital content remains secure.

Plan for Digital Collecting

Over the next five years, the Library intends to follow a strategic framework organized into six objectives:

Strategic Objective 1 – Maximize receipt and addition to the Library’s collections of selected digital content submitted for copyright purposes

Strategic Objective 2 – Expand digital collecting via routine modes of acquisition (primarily purchase, exchange and gift)

Strategic Objective 3 – Focus on purchased and leased electronic resources

Strategic Objective 4 – Expand use of web archiving to acquire digital content

Strategic Objective 5 – Develop and implement an acquisitions program for openly available content

Strategic Objective 6 – Expand collecting of appropriate datasets and other large units of content

More Information

Much more detail is available in Collecting Digital Content at the Library of Congress.  Any questions or comments about this strategy or any aspect of the Library’s collection building program may be directed to me, [email protected].
