Digital map image placed over the top of a Sanborn scanned map

Building Digital Worlds: Where does GIS data come from?

This is a guest post by Meagan Snow, Geospatial Data Visualization Librarian in the Geography and Map Division.

Whether we are checking traffic conditions on an online map, tracking a jogging route with a fitness app, or finding photos tagged by location on social media, many of us rely on geospatial data more and more each day. So what are the most common ways geospatial data is created and stored, and how does it differ from how we have stored geographic information in the past?

A primary method for creating geospatial data is to digitize directly from scanned analog maps. After maps are georeferenced, GIS software allows a data creator to manually digitize boundaries, place points, or define areas using the georeferenced map image as a reference layer. The goal of digitization is to capture information carefully stored in the original map and translate it into a digital format. As an example, let’s explore and then digitize a section of this 1914 Sanborn Fire Insurance Map from Eatonville, Washington.

Sanborn Fire Insurance Map from Eatonville, Pierce County, Washington. Sanborn Map Company, October 1914. Geography & Map Division, Library of Congress.

Sanborn Fire Insurance Maps were created to detail the built environment of American towns and cities through the late 19th and early 20th centuries. These information-dense maps, produced by the Sanborn Map Company, allowed fire insurance companies to underwrite insurance agreements without needing to inspect each building in person. Sanborn maps have become incredibly valuable sources of historic information because of the rich geographic detail they store on each page.

When extracting information from analog maps, the digitizer must decide which features will be digitized and how information about those features will be stored. Behind the geometric features created through the digitization process, a table stores information about each feature on the map. Using the table, we can store information gleaned from the analog map, such as the name of a road or the purpose of a building. We can also quickly calculate new data, such as the length of a road segment. The data in the table can then be put to work in the visual display of the new digital information that has been created. This is often done through symbolization and map labels.
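To make the idea of a table behind the features concrete, here is a minimal sketch in Python. It is not how any particular GIS package stores data. The road names, the coordinates (planar values in feet), and the `line_length` helper are all hypothetical and are not taken from the Eatonville sheet.

```python
import math

# Hypothetical digitized features: each record pairs a geometry ("coords")
# with a row of attributes. All names and coordinates are made up.
roads = [
    {"name": "Example Ave", "type": "road",
     "coords": [(0.0, 0.0), (300.0, 0.0), (300.0, 400.0)]},
    {"name": "Sample St", "type": "road",
     "coords": [(0.0, 0.0), (0.0, 250.0)]},
]


def line_length(coords):
    """Sum the straight-line distance between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


# Calculate a new attribute column from the geometry itself.
for road in roads:
    road["length_ft"] = line_length(road["coords"])

# Put the table to work: build a map label from the stored attributes.
labels = [f'{r["name"]} ({r["length_ft"]:.0f} ft)' for r in roads]

for label in labels:
    print(label)
```

The same pattern carries over to buildings: a polygon geometry plus attributes such as the building's purpose, which can then drive colors or labels on the finished map.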

A section of the 1914 Eatonville, Washington Sanborn Fire Insurance Map is digitized into basic building polygons, road line segments, and a railroad line. Information stored in the data’s table is used to label and symbolize the features.

Manually digitizing from georeferenced map images is tricky to do well, and producing professional-quality data this way is time-consuming. Researchers are now refining automated methods for feature extraction from analog maps and even experimenting with machine-learning methods for extracting text.

Digitized data is displayed over current aerial imagery (left, ESRI) and the original Sanborn Map Image (right).

Digitizing from georeferenced images isn’t the only way spatial data is created. A process called geocoding allows for the creation of geographic points from tables that contain either latitude and longitude coordinates or a street address. GIS software or a third-party geocoding platform is required for these data transformations. Remote sensing provides another avenue for geospatial data. Satellites, drones, and other geospatial technologies can collect measurements and images of the earth’s surface from far away. These data sources help us understand complex geographic phenomena at large scales, such as land cover change over time.
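For the simpler case, a table that already holds latitude and longitude, the conversion to points can be sketched in a few lines of Python. The table contents, site names, and the `rows_to_points` helper below are hypothetical. Turning street addresses into coordinates is a different problem that needs a geocoding service and is not shown here. The output follows the GeoJSON convention of listing longitude before latitude.

```python
import csv
import io
import json

# Hypothetical input table; names and coordinates are illustrative only.
table = io.StringIO(
    "name,latitude,longitude\n"
    "Site A,46.87,-122.27\n"
    "Site B,46.86,-122.26\n"
)


def rows_to_points(rows):
    """Turn table rows with latitude/longitude columns into GeoJSON points."""
    features = []
    for row in rows:
        lat = float(row["latitude"])
        lon = float(row["longitude"])
        # Reject values that cannot be valid geographic coordinates.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"coordinates out of range for {row['name']!r}")
        features.append({
            "type": "Feature",
            # GeoJSON stores positions as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            # Remaining columns travel with the point as attributes.
            "properties": {"name": row["name"]},
        })
    return {"type": "FeatureCollection", "features": features}


points = rows_to_points(csv.DictReader(table))
geojson_text = json.dumps(points, indent=2)
print(geojson_text)
```

Most GIS software can open the resulting GeoJSON text as a point layer, with the `name` column available for labeling.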

Finally, spatial analysis itself has the power to transform data from one format to another. One example is interpolation: ground measurements of air temperature taken at individual weather stations can be used to generate a continuous surface that estimates temperature values in between the original input points.
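There are many interpolation methods, and the post does not name one. As an illustration only, the sketch below uses inverse distance weighting (IDW), one of the simplest, in which nearby stations influence an estimate more than distant ones. The station locations, the temperatures, and the `idw` function are hypothetical.

```python
import math

# Hypothetical weather stations: (x, y, air temperature in degrees C).
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 14.0)]


def idw(x, y, samples, power=2.0):
    """Estimate a value at (x, y) by inverse distance weighting."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # Exactly on a station: return the measured value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Estimate temperatures on a coarse 3 x 3 grid covering the stations,
# turning three point measurements into a continuous (if rough) surface.
grid = [[idw(x, y, stations) for x in range(0, 11, 5)]
        for y in range(0, 11, 5)]

for row in grid:
    print(" ".join(f"{value:5.1f}" for value in row))
```

Every estimate falls between the lowest and highest station readings, which is a property of IDW rather than of interpolation in general. Other methods, such as kriging or splines, make different assumptions about how values vary across space.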

Now, as we move about in the world, our smartphones can also act as generators of geospatial data, sending out location information that helps monitor traffic in real time or crowd-source weather reports. With so many types of geospatial data out in the world, map libraries face the difficult challenge of how to curate and store collections of geospatial data for future generations. Despite the changing technology, what hasn’t altered is the impact that data collection, measurement, and representation have on our decision-making. Maps from any time period offer insight into how people measure and value the world around us and an indication of the world we try to create.


  1. Great article! We have been working on machine learning methods to automatically understand historical map content from the scanned images. We are currently working on automatically extracting text labels and tagging their meanings (e.g., “drug store” is a type of pharmacy; “bank” is a type of financial business) from historical maps. If anyone is interested, take a look at and
