The Heart of the Matter: An NDSR Project and Program Update

The following is a guest post by Maureen McCormick Harlow, a National Digital Stewardship Resident at the National Library of Medicine in Bethesda, Maryland.  She is working on a project to build a thematic web collection.

Maureen McCormick Harlow

Greetings from the National Library of Medicine!  It’s hard to believe it, but I’m heading into the fourth quarter of my residency here.  I thought it was time to give an update on what I’ve been doing for my project, even though it’s not terribly Valentine’s Day-related!

The Project

My NDSR project is to build a thematic web collection at NLM that will be incorporated into the History of Medicine Division collection.  HMD has extensive digital and modern manuscript collections, and this little collection that I’m working on will be accessioned into it as a curated, intentional collection.

The Theory

Thematic collections can provide institutions with an opportunity to close known collection gaps.  If institutions can identify areas of weakness within their collections, they can intentionally collect on the topics as they exist today on the Internet.  This is an especially attractive option for topics that are in flux, or whose understanding is changing frequently.

Another benefit of thematic web collections is that they allow institutions to collect material that may be ephemeral.  Blogs come and go frequently, and once they are taken down, the information contained in them is gone as well.  Collecting websites can be akin to collecting gray literature.

The Model

My project is limited to creating one thematic collection to add to the HMD holdings, but I also wanted to establish a framework that could be used in the future for other thematic collections.  The framework that we eventually settled on is a thematic collection that represents two sides of the same coin, so to speak.  In this case, Autism Spectrum Disorders are brain disorders generally diagnosed at the beginning of life, while the brain is developing, whereas Alzheimer’s Disease is a brain disorder diagnosed at the end of life, in old age.

Although the two diseases are not related, they are diagnosed during the organ’s development and decay, respectively.  Future thematic web collections could explore diagnoses in a particular body system or region made during the system/region’s development and at the end of life, or two extremes of the same issue.  Some examples include:

  • Teen pregnancy and infertility
  • Diabetes type 1 and type 2
  • Scoliosis and osteoarthritis
  • Eating disorders and obesity

Each of these issues is one of strategic importance to NIH and, in some cases, the nation (see: the Let’s Move project by Michelle Obama and the Teen Pregnancy Prevention Resource Center in HHS’s Office of Adolescent Health).  More importantly, many of these topics represent areas of great change and understandings that are in flux, making websites a viable way for future researchers to examine change over time.

The Details

Picking a Theme
Before you can create a thematic web collection, you’ve got to have a theme.  This process took a while.  My first step was to look over the various collecting documents. In my section at NLM, there were three to consider: the NIH Research Priorities, the NLM Collection Development Policy and Manual, and an internal document that deals with known collection gaps (for example, the caregiver perspective).  Each of these helped to narrow my possibilities.  For instance, the Research Priorities at NIH report indicated several areas of interest to the larger NIH audience, alerting me to trends in research and some of the most prevalent problems in medicine.  It stood to reason that, since these were priorities for NIH, there would be scholarly work produced about the diseases, and that the understanding of the diseases was in a period of flux, making web collecting more important than ever.  Since this is a bit of a pioneer collection, I wanted it to fit squarely within each of these areas.


Screenshot of the National Library of Medicine Collection Development Manual relevant to the History of Medicine Division

After reviewing all of these documents and spending a significant amount of time looking at internet resources, I came up with three proposals:

  • Eating disorders
  • Sexual assault
  • Autism and Alzheimer’s

My last step in the process was to plug each potential topic into the NLM catalog and the HMD finding aid search to see what kind of resources we already had on each topic.  Since one of my personal goals was to help fill some of the collecting gaps, I wanted to see that the web collection would be contributing something original to the HMD collection.  In each case, I found that, while NLM collected extensively on each topic, the HMD holdings were limited.

We ended up going with the third option, and I’m calling the collection Disorders of the Developing and Aging Brain: Autism and Alzheimer’s on the Web.

The results when I searched the HMD finding aids for “autism.”

Picking the Seeds
The scope of my collection was limited to approximately 40-60 seeds (individual websites/URLs that will be added to the collection).  I decided to split the seeds roughly in half between the two topics (a total of 64 seeds) and divided the ~30 per topic into six or seven different areas:

  • Current understanding
  • Caretakers (first-person resources, primarily blogs of caretakers)
  • Patients/sufferers (also first-person, also primarily blogs)
  • Research
  • Causes
  • Treatment
  • Prevention (for Alzheimer’s only)

For the first-person categories, I tried to make sure to cover a wide variety of ages, diagnoses, and roles/perspectives to represent a range of experiences.
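
For anyone planning a similar seed list, the breakdown above could be tracked with a small script.  The following is a minimal Python sketch under stated assumptions, not the tool used for this project: the `Seed` class, the function names, and the example.org URLs are all hypothetical, and only the topic and category names come from the list above.

```python
from collections import Counter
from dataclasses import dataclass

# Category names taken from the list above.
SHARED_CATEGORIES = [
    "Current understanding",
    "Caretakers",
    "Patients/sufferers",
    "Research",
    "Causes",
    "Treatment",
]

# "Prevention" applies to the Alzheimer's half of the collection only.
CATEGORIES = {
    "Autism": SHARED_CATEGORIES,
    "Alzheimer's": SHARED_CATEGORIES + ["Prevention"],
}


@dataclass(frozen=True)
class Seed:
    """One candidate website/URL (hypothetical record layout)."""
    url: str
    topic: str
    category: str


def validate(seed):
    """Raise ValueError if the seed's topic/category pair is not in the plan."""
    if seed.topic not in CATEGORIES:
        raise ValueError(f"unknown topic: {seed.topic}")
    if seed.category not in CATEGORIES[seed.topic]:
        raise ValueError(f"{seed.category!r} is not a category for {seed.topic}")


def coverage(seeds):
    """Count seeds per (topic, category) pair, validating each one."""
    counts = Counter()
    for seed in seeds:
        validate(seed)
        counts[(seed.topic, seed.category)] += 1
    return counts


def missing_categories(seeds):
    """List the (topic, category) pairs that have no seed yet."""
    counts = coverage(seeds)
    return [
        (topic, category)
        for topic, categories in CATEGORIES.items()
        for category in categories
        if counts[(topic, category)] == 0
    ]


if __name__ == "__main__":
    # Placeholder URLs only; these are not seeds from the actual collection.
    draft = [
        Seed("https://example.org/autism-research", "Autism", "Research"),
        Seed("https://example.org/alz-prevention", "Alzheimer's", "Prevention"),
    ]
    for topic, category in missing_categories(draft):
        print(f"No seed yet for {topic} / {category}")
```

A check like this only confirms that every area has at least one seed; judging whether the first-person categories cover a wide enough range of voices still takes a human reader.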

Collecting the Material
After picking the seeds, we went about collecting permissions for the blogs.  Although we have a strong argument for use under the ARL Best Practices for Fair Use guidelines, we’re proceeding with an abundance of caution and collecting as many permissions as possible for the blogs in the collection.

Almost two weeks ago, I started crawling the seeds using Archive-It.  NLM has used Archive-It for several years for its web collections, and has two other public web collections.

Describing the Collection
This is where I am now.  My preliminary plan is to use the following methods to describe the new collection:

  • Create a catalog record so that the collection is discoverable through the NLM catalog;
  • Fully arrange and describe the collection using a finding aid and adhering to DACS principles and local implementations.

There are very few examples that I’ve found of web collections described in this manner, so it’s going to be a lot of work creating standards and best practices that will be robust and durable enough to make the collection usable to researchers, while also being flexible enough for archivists at NLM to use into the future.

That’s where my project stands now!  I’m looking forward to finishing it, and I welcome the challenge of describing the collection and getting it incorporated into the HMD collection!

Other residents in the blogosphere: Heidi Dowding discusses digital asset management at cultural institutions in Baltimore, Emily Reynolds recaps her presentation at ALA Mid-Winter with Julia Blase and shares her slides, and Lauren Work shares her ALA Mid-Winter slides.
