This is a guest post by Rachel Trent, Digital Collections and Automation Coordinator in the Geography and Map Division.
Interested in bulk downloading maps from the Library of Congress’s online collections?
Need a corpus of historical map images to build a training dataset for your machine learning model?
Looking to learn more about Python or APIs?
Curious about how to query maps on loc.gov?
Yes? Great! We have tutorials for you.
The Geography and Map Division is launching a series of Jupyter notebooks that explore how to computationally access, retrieve, and analyze cartographic materials in the Library of Congress’s online collections. The notebooks pair instructions with demonstration Python code that walks you through downloading and analyzing images and metadata in bulk from the Library’s website, with a focus on maps. They are designed to be downloaded to your computer and opened with Jupyter Notebook.
The first two notebooks are now available on the Library’s GitHub page:
- Querying and downloading cartographic material from loc.gov
- Analyzing and visualizing cartographic metadata from loc.gov
These notebooks use the Sanborn Fire Insurance Map collection to show you how to do things like download map image files in bulk or visualize data about the collection.
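To give a rough sense of what that looks like in practice, here is a minimal sketch of querying a collection and pulling image links and metadata out of the response. It is not the notebooks' own code. It assumes the collection lives at the URL shown, that loc.gov returns JSON when `fo=json` is added to a URL (with `c` setting results per page and `sp` the page number), and that each entry in `results` may carry an `image_url` list. The helper names (`build_query_url`, `fetch_page`, `first_image_urls`, `tally_field`) are made up for illustration.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Sanborn collection is served from this loc.gov path.
COLLECTION_URL = "https://www.loc.gov/collections/sanborn-maps/"


def build_query_url(base_url, page=1, per_page=25):
    """Return a URL that asks loc.gov for JSON rather than an HTML page."""
    # Assumed parameters: fo=json (format), c (results per page), sp (page number).
    params = {"fo": "json", "c": per_page, "sp": page}
    return f"{base_url}?{urlencode(params)}"


def fetch_page(url):
    """Download one page of results and parse it as JSON (needs network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def first_image_urls(payload):
    """Collect one image URL per result, skipping results that list none."""
    urls = []
    for item in payload.get("results", []):
        images = item.get("image_url") or []
        if images:
            urls.append(images[0])
    return urls


def tally_field(payload, field):
    """Count how often each value of a metadata field appears in the results."""
    counts = Counter()
    for item in payload.get("results", []):
        value = item.get(field)
        if value is None:
            continue
        # Some fields hold a single value, others a list; treat both alike.
        values = value if isinstance(value, list) else [value]
        counts.update(str(v) for v in values)
    return counts
```

With network access, `fetch_page(build_query_url(COLLECTION_URL))` would return one page of results that you could pass to `first_image_urls` or `tally_field`. The parsing is kept separate from the download so you can try it on a saved response first. When downloading in bulk, pause between requests so you stay within the Library's rate limits.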
If this is your first time using Jupyter Notebooks, there are many online tutorials to help you install the software and get started, including instructions for installing Jupyter via Anaconda. To download the two Jupyter notebooks, head to the Library’s GitHub page and download the entire repository of files as a ZIP file. Or, if you’re comfortable with git, you can clone the repository.
Inside that repository and at LC for Robots, you’ll find other notebook tutorials and example code. In particular, be sure to check out accessing images for image analysis (which the maps notebooks build on) and extracting location data from the loc.gov API for geovisualization!