The following is a guest post by Kalev Hannes Leetaru, a data scientist and Senior Fellow at George Washington University Center for Cyber & Homeland Security. In a previous post, he introduced us to the GDELT Project, a platform that monitors the news media, and presented how mass translation of the world’s information offers libraries enormous possibilities for broadening access. In this post, he writes about re-imagining information geographically.
Why might geography matter to the future of libraries?
Information occurs against a rich backdrop of geography: every document is created in a location, intended for an audience in the same or other locations, and may discuss yet other locations. The importance of geography in how humans understand and organize the world (PDF) is underscored by its prevalence in the news media: a location is mentioned every 200-300 words in the typical newspaper article of the last 60 years. Social media embraced location a decade ago through transparent geotagging, with Twitter proclaiming in 2009 that the rise of spatial search would fundamentally alter how we discovered information online. Yet the news media has steadfastly resisted this cartographic revolution, continuing to organize itself primarily through coarse editorially-assigned topical sections and eschewing the live maps that have redefined our ability to understand global reaction to major events. Using journalism as a case study, what does the future of mass-scale mapping of information look like and what might we learn of the future potential for libraries?
What would it look like to literally map the world’s information as it happens? What if we could reach across the world’s news media each day in real time and put a dot on a map for every mention in every article, in every language of any location on earth, along with the people, organizations, topics, and emotions associated with each place? For the past two years this has been the focus of the GDELT Project and through a new collaboration with online mapping platform CartoDB, we are making it possible to create rich interactive real-time maps of the world’s journalistic output across 65 languages.
Leveraging more than a decade of work on mapping the geography of text, GDELT monitors local news media from throughout the globe, live translates it, and performs “full-text geocoding” in which it identifies, disambiguates, and converts textual descriptions of location into mappable geographic coordinates. The result is a real-time multilingual geographic index over the world’s news that reflects the actual locations being talked about in the news, not just the bylines of where articles were filed. Using this platform, this geographic index is transformed into interactive animated maps that support spatial interaction with the news.
What becomes possible when the world’s news is arranged geographically? At the most basic level, it allows organizing search results on a map. The GDELT Geographic News Search allows a user to search by person, organization, theme, news outlet, or language (or any combination therein) and instantly view a map of every location discussed in context with that query, updated every hour. An animation layer shows how coverage has changed over the last 24 hours and a clickable layer displays a list of all matching coverage mentioning each location over the past hour.
Selecting a specific news outlet like bbc.co.uk or indiatimes.com as the query yields an instant geographic search interface to that outlet’s coverage, which can be embedded on any website. Imagine if every news website included a map like this on its homepage that allowed readers to browse spatially and find its latest coverage of rural Brazil, for example. The ability to filter news at the sub-national level is especially important when triaging rapidly-developing international stories. A first responder assisting in Nepal is likely more interested in the first glimmers of information emerging from its remote rural areas than the latest on the Western tourists trapped on Mount Everest.
Coupling CartoDB with Google’s BigQuery database platform, it becomes possible to visualize large-scale geographic patterns in coverage. The map below visualizes all of the locations mentioned in news monitored by GDELT from February to May 2015 relating to wildlife crime. Using the metaphor of a map, this list of 30,000 articles in 65 languages becomes an intuitive clickable map.
Exploring how the news changes over time, it becomes possible to chart the cumulative geographic focus of a news outlet, or to compare two outlets. Alternatively, looking across global coverage holistically, it becomes possible to instantly identify the world’s happiest and saddest news, or to determine the primary language of news coverage focusing on a given location. By arraying emotion on a map it becomes possible to instantly spot sudden bursts of negativity that reflect breaking news of violence or unrest. Organizing by language, it becomes possible to identify the outlets and languages most relevant to a given location, helping a reader find relevant sources about events in that area. Even the connections among locations in terms of how they are mentioned together in the news yields insights into geographic contextualization. Finally, by breaking the world into a geographic grid and computing the topics trending in each location, it becomes possible to create new ways of visualizing the world’s narratives.
Turning from global news to domestic television news, these same approaches can be applied to television closed captioning, making it possible to click on a location and view the portion of each news broadcast mentioning events at that location.
Turning back to the question that opened this post – why might geography matter to the future of libraries? As news outlets increasingly cede control over the distribution of their content, they do so not only to reach a broader audience, but to leverage more advanced delivery platforms and interfaces. Libraries are increasingly facing identical pressures as patrons turn towards services (PDF) like Google Scholar, Google Books, and Google News instead of library search portals. If libraries embraced new forms of access to their content, such as the kinds of geographic search capabilities outlined in this post, users might find those interfaces more compelling than those of non-news platforms. The ability of ordinary citizens to create their own live-updating “geographic mashups” of library holdings opens the door to engaging with patrons in ways that demonstrate the value of libraries beyond as a museum of physical artifacts and connecting individuals across national or international lines. As more and more library holdings, from academic literature to the open web itself, are geographically indexed, libraries stand poised to lead the cartographic revolution, opening the geography of their vast collections to search and visualization, and making it possible for the first time to quite literally map our world’s libraries.