How can the nature and practice of humanities research change in the face of the scale of digital cultural heritage collections and the possibilities offered by computational analysis? This was the core question in the recent Joint Conference on Digital Libraries round table discussion of the Digging into Data challenge. The session description does a good job of setting the boundaries of this challenge: “Now that we have massive databases of materials used by scholars in the humanities and social sciences — ranging from digitized books, newspapers, and music to transactional data like web searches, sensor data or cell phone records — what new, computationally-based research methods might we apply?”
The panel was timed to coincide with the release of One Culture: Computationally Intensive Research in the Humanities and Social Sciences, an extensive report from the Council on Library and Information Resources on the initial eight Digging into Data projects.
In the panel, representatives from the projects and the funding agencies, as well as the President of CLIR, discussed the challenges and opportunities evident in the results of the initial and ongoing Digging into Data grant projects.
The panel started off with presentations about three ongoing Digging into Data projects: An Epidemiology of Information: Data Mining the 1918 Influenza Pandemic; Integrating Data Mining and Data Management Technologies for Scholarly Inquiry; and Cascades, Islands, or Streams? Time, Topic, and Scholarly Activities in Humanities and Social Science Research. Each of the projects is fascinating in its own right. Collectively, they illustrate the diversity of work happening under this grant program. The eight original Digging into Data projects are similarly expansive in illustrating the possibilities for computational humanities research on a range of different types of data sources. In short, read the CLIR report. Beyond that, I think the following three issues from the panel and the report are particularly intriguing.
Requests for Significant Shake Up in the Organization of Humanities Research
On the panel, Charles Henry from CLIR spoke about several of the primary recommendations from the report. The nine suggestions in the report hang together around a central notion of redefining the structure of humanities scholarship: expanding what counts as research, thinking more broadly about what constitutes research data, and becoming more collaborative and interdisciplinary. At the heart of these suggestions is a desire to rework things like tenure and promotion and, at the same time, to change academic culture, academic structure, the infrastructure of academic research and the nature of scholarly publishing. It is a rather tall order. These recommendations would seem quite radical if it weren’t for the fact that they echo many other suggestions and calls for reform in the academy.
One of the questions at hand throughout the discussion of the Digging into Data grants was whether computational methods amount to a changing paradigm for humanities research. Despite all the suggestions for changing the nature of academic research, we can rest assured that the Digging into Data work reaffirms continuity in the questions humanists ask. This is not a paradigm shift; it is about how new tools and methods, along with larger and more comprehensive research data, can be used to address the questions of traditional humanities scholarship.
How to Mainstream Computational Research in the Humanities?
During Q&A an audience member asked if the era of the individual scholar in the humanities was over. All of these projects are significantly interdisciplinary; each involves considerable resources, and each has teams of researchers working together. The fact that this is how most of the initial Digging into Data projects were structured supports many of the recommendations in the CLIR report. To some extent, the panelists felt that the possibilities of the Digging into Data work point to a potential future for the humanities that is much more collaborative. Jennifer Serventi from NEH also noted that the idea of the lone humanities scholar was always a myth in some ways. Those scholars and their monographs have always resulted from a network of other scholars working in their field with librarians, archivists, and administrators who guide, shape, enable and structure scholarship.
This questioner went further, asking if there might be a quicker road to mainstreaming computational research in the humanities. Instead of working to change the culture and structure of the academy, might we focus on the development of tools that are easy enough for individual scholars to use and incorporate in their work? Might we be able to get around some of the structural questions and fit computational humanities research into the existing culture of the academy? Partly due to time limits, this question did not get a fuller discussion on the panel. I think it is an essential question to push further on.
I think we can already see what this kind of approach can look like in the research that scholars can do with things like Google’s Ngram Viewer. Consider, for example, Jo Guldi’s recent piece, The History of Walking and the Digital Turn: Stride and Lounge in London, 1808-1851, published in The Journal of Modern History. Here we have an individual scholar making use of an easy-to-use tool to do innovative computational humanities research that results in an engaging piece of traditional scholarship, an essay in an academic journal. In short, I think there is something to be said for building straightforward tools for individual scholars to play with and explore cultural data sets.
Hunt and Peck: The Dual Path to the Computational Humanities
We can have it both ways. We can pursue making structural changes that will support computational work. The value of the kinds of work coming out of the Digging into Data projects is just one of a series of reasons to pursue these kinds of changes. With that said, I think there are examples in the Digging into Data projects that simultaneously suggest a route that puts well-designed digital tools into the hands of anyone and everyone interested in dipping their toes into this kind of work and allows scholars with specific research questions to dig deeply into the materials. For example, the Data Mining Criminal Intent project developed and used a series of relatively easy-to-use open source tools (Zotero and Voyeur/Voyant). This project is both an exploration of new forms of research and an attempt to extend those tools to make it possible for the “ordinary working historian” to make use of these techniques in their work (PDF of their white paper).
Altogether it was an exciting session, and the report makes for a compelling read. While much of the work in the Digging into Data grants focuses on working with digitized materials, these issues of scale are only going to intensify as humanities scholars turn to work with born-digital collections, like the National Software Reference Library’s 25,000,000 unique files, or the ever-growing web archive collections that members of the International Internet Preservation Consortium are working to maintain.