The Metadata Games Crowdsourcing Toolset for Libraries & Archives: An Interview with Mary Flanagan

Mary Flanagan, Sherman Fairchild Distinguished Professor in Digital Humanities at Dartmouth College

I am excited to continue the NDSA innovation insights interview series by talking with Mary Flanagan about the Metadata Games open source software project. Mary is an artist, scholar and designer who holds the Sherman Fairchild Distinguished Professorship in Digital Humanities at Dartmouth College and serves as the director of Tiltfactor Lab. While she is broadly involved with ongoing discussions and conversations related to digital art conservation, I am particularly interested in talking to Mary about her National Endowment for the Humanities-funded Metadata Games project.

Trevor: How do you describe the idea of Metadata Games? I’m particularly interested in hearing a bit about who you see as the audience and what you see as the goals for the project.

Mary: There’s no shortage of archival material across the world, as you know. In universities, archives, libraries and museum collections, millions of photographs, audio recordings and films lie waiting to be digitized. The British Library has warned that by 2020 vast quantities of legacy content will remain undigitized and in danger of being forgotten. But digitization is only part of the problem. Once the images are digitized, someone has to tag them properly, and that takes significant staff time. There are many collections that are very well documented and just need to be brought into the digital age. There are, however, millions of artifacts in collections which have little or no informative descriptions aside from what may be written on the archival box or photo itself. Inspired by Luis von Ahn’s research on crowdsourcing, archivist Peter Carini and I thought we should make a free crowdsourcing game toolset for libraries and archives to get some help saving our artifacts from digital oblivion. We imagined a suite of games that can quickly gather valuable tags while offering fun for players. The games are an opportunity for the public to interact with cultural heritage institutions in ways they may not have otherwise.

We see three motivations among our audience of players, with overlaps among them. One motivation is to assist a particular institution, or simply to contribute to a good cause. These players first and foremost like the idea of helping. A second motivation is love of a subject area: these players like tagging buildings, or playing games about parts of boats, or dog breeds, for example. A third motivation is simply to win: to be the best, the fastest and the most accurate.

Cultural heritage institutions with digital collections that have little or no metadata will likely benefit the most from using Metadata Games. Our hope is that with Metadata Games, cultural heritage institutions will gain useful data for their collections, help scholars analyze those collections in novel and possibly unexpected ways, and increase engagement with the community at large.

Importantly, Metadata Games is Free and Open Source Software (FOSS), so no expensive licensing fees or contracts are required and anyone can install, use and customize its functionality. The games themselves are also designed as plugins, so they are FOSS data-gathering “portals” that could be adapted to other systems as well. Ultimately we are crafting the Metadata Games kit to be useful for a wide range of institutions.

Trevor: Could you walk us through a few of the specific games? It would be great if you could give readers a few concrete examples of how gameplay happens?

A screenshot of Zen Tag, a naming activity — where participants just name what they see.

Mary: Zen Tag is the simplest possible tagging model: it is a tagging window that offers points. A player sees a single image and uses the text box underneath to input as many tags as he or she would like. There is no time limit, and the player works at his or her own pace. Players receive points for each tag submitted, with higher points for tags that prior players have provided. Let me be clear: this one is barely a game at all. We can see how much time a particular player spends and what they input. Some players love Zen Tag (or its re-skinned variations): they look at images and treat each as a world of “Where’s Waldo,” trying to describe the image in super-precise detail. Other players see this one and they want to throw their computers through a window. My mother, who I thought would enjoy simply the act of tagging images, really hates Zen Tag. She wants more competitive play. Different players like different activities! For archives, though, this little point generator can really rake in the tags. We check the entries with a variety of tools and “verification” game designs, but we have tended to get very accurate tags. There are multiplayer and single-player versions of this one.
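Editor's note: the Zen Tag scoring rule described above (points for every tag, more points for tags that earlier players also supplied) can be illustrated with a minimal Python sketch. This is not the project's actual code. The point values, the `normalize` helper and the `score_tags` function are hypothetical names and numbers chosen only for illustration.

```python
# Hypothetical point values; the interview does not state the real ones.
BASE_POINTS = 1    # awarded for any new, non-empty tag
MATCH_BONUS = 4    # extra points when a prior player gave the same tag


def normalize(tag: str) -> str:
    """Lower-case a tag and collapse runs of whitespace."""
    return " ".join(tag.lower().split())


def score_tags(submitted, prior_counts):
    """Score one player's tags for a single image.

    submitted    -- list of raw tag strings typed by the player
    prior_counts -- mapping of normalized tag -> number of earlier players
                    who entered it for this image
    Returns (total_points, {normalized_tag: points}).
    """
    per_tag = {}
    for raw in submitted:
        tag = normalize(raw)
        if not tag or tag in per_tag:
            continue  # skip blanks and duplicates within one submission
        points = BASE_POINTS
        if prior_counts.get(tag, 0) > 0:
            points += MATCH_BONUS
        per_tag[tag] = points
    return sum(per_tag.values()), per_tag


# Example: "dog" matches earlier players, "ice floe" is new,
# and the repeated "dog" is ignored.
prior = {"sled": 3, "dog": 5}
total, detail = score_tags(["Dog", "  ice  floe ", "dog"], prior)
```

Rewarding agreement with earlier players is one simple way to nudge entries toward accuracy, though a real system would likely pair it with the verification steps Mary mentions.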

A screenshot of Guess What!, a two-player game where players have to choose an image from an array of images based on clues sent to them by the networked partner.

Guess What! is a synchronous, collaborative two-player game where one player is given a particular image and must describe the image to another player across the network. The other player is given 12 images and must select the correct image based on the description. We also have speedtag, which is a theme-based beat-the-clock game. CattyGory is an Edward Gorey-inspired theme-specific tagging game. We’ve paper-prototyped a whole new set of designs and are honing these for demonstration to our project advisors as we speak. We’re focusing on mobile games, for those are what we play in those in-between moments. It would be awesome if the in-between moments were also improving the digital commons.
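Editor's note: the Guess What! round structure described above (one target image for the describer, an array of 12 images for the guesser) could be set up roughly as in the Python sketch below. It is an illustration only, not the project's implementation. The function names and the dictionary layout are hypothetical, and the networking between the two players is left out entirely.

```python
import random

ARRAY_SIZE = 12  # the interview says the guessing player is shown 12 images


def build_round(image_ids, rng):
    """Pick 12 distinct images and choose one of them as the target.

    image_ids -- list of image identifiers available for the game
    rng       -- a random.Random instance (passed in so rounds are repeatable)
    """
    if len(image_ids) < ARRAY_SIZE:
        raise ValueError("need at least 12 images to build a round")
    choices = rng.sample(image_ids, ARRAY_SIZE)  # distinct, randomly ordered
    target = rng.choice(choices)                 # shown only to the describer
    return {"target": target, "choices": choices}


def check_guess(round_state, guessed_id):
    """Return True when the guesser selected the describer's image."""
    return guessed_id == round_state["target"]


# Example round drawn from a hypothetical collection of 100 image ids.
round_state = build_round(list(range(100)), random.Random(0))
```

In a setup like this, the describer's clues are themselves candidate tags for the target image, which is presumably where the metadata comes from.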

Trevor: Based on those examples, could you tell us a bit about what you see players getting out of the experience and what the Library, Archive or Museum gets out of it?

Mary: Players get recognition for their knowledge, they get to have fun while exploring rarely seen artifacts, and they get satisfaction in contributing to, and improving, the accessibility and value of an institution’s collection. The project provides a path toward a deeper experience with the collections and the institution.

Alum Tag, an example of using the Zen Tag game to have players identify alumni in photographs

The library/archive/museum receives useful tags and valuable context for its collections, which also improves their accessibility and connections to the public. Through using Metadata Games, libraries and museums can further engage patrons, which can likely improve fundraising and attendance, in particular if they offer real-world rewards or events in connection with the online games. For example, we launched a Dartmouth College-related image set under a game reskinned as “AlumTag” during Dartmouth’s homecoming weekend, when many alumni are back on campus. We had instant participation and have had solid participation since with that collection. Other institutions have been keen to use the software for school fundraisers while also improving their archives.

Trevor: Could you tell us about a few of the different organizations and kinds of collections you have experimented with using the platform with? I would be particularly interested in hearing about the different kinds of orgs and their different use cases, capabilities and needs.

Mary: We started at the Rauner Special Collections Library at Dartmouth College, and in our pilot we used images from the Stefansson Collection on Polar Exploration, one of the world’s most extensive bodies of research materials on the North and South Poles. We then created other installs for our own testing, development and data gathering with a variety of image sets, some general and some thematic in nature, like the alumni images. We are about to set up servers for Washington University, Boston Public Library, The University at Buffalo and UC-Santa Cruz right now. The system is also running in Hong Kong successfully! Some institutions want data. Others want more engagement with the public.

We’ve been overwhelmed with interest! Folks at some institutions, though, have had a difficult time getting the go-ahead to try the system because of institutional politics, conservative managers, or because the server folks are already too taxed. Once we walk folks through how simple the system is to install, that tends to address the latter concern.

Trevor: What projects have informed and inspired the development of Metadata Games? I would be particularly interested in hearing about particular aspects of other initiatives and projects that have inspired specific features and components of your design?

Mary: I’ve already mentioned Luis von Ahn’s work… The Library of Congress 2008 experiment with using Flickr as a possible crowdsourcing model is also excellent. We were thrilled to learn of the New York Public Library’s “What’s on the Menu?” project. The menu project is proof of two out of our three player motivations: given the right context, people will be very engaged in seemingly niche and esoteric topics, like transcribing and verifying text from old restaurant menus. I hope people will read our 2012 American Archivist article (vol. 75, no. 2) which goes into depth with these examples.

Trevor: I spoke with Arfon Smith of the Zooniverse and Adler Planetarium about their work on Citizen Science projects. I would be curious to hear how you see Metadata Games in relation to projects like the Zooniverse?

Mary: The citizen scientist is also a citizen archivist! The idea is very appealing to us. It requires trusting the public to contribute real data, real knowledge, to our archives and libraries much as they do to science. Games can engage players who initially do not care about the cultural heritage institution whose collections they’re interacting with, but as they play something like Metadata Games, they become more interested in what else the institution has to offer. Metadata Games is a way of building on a player’s intrinsic motivations, making connections between what’s intrinsically appealing and civic engagement. Oh, and Zooniverse is awesome. My team is trying to connect this week, in fact, to see how we can collaborate and share.

Trevor: You guys have been working on this for a bit now, in a few different phases of funding and development. I would love to hear a bit about what you think are some of the big takeaways and lessons learned in terms of

Mary: I’ve talked a little about player motivation. One key lesson learned is in regard to expert tags vs “lowest common denominator” tags. The latter is much easier to design for… It is much more challenging to design games that increase not our base knowledge, but our more expert knowledge: how do we figure out who is an expert? Who do we trust? These are really interesting research questions we’ve encountered while working on the project. Obviously we’re learning from computational linguists, but we’re also learning from humanists about these issues. A second lesson learned was about getting “too cutting edge” for institutional good. While cultural institutions have similar needs in terms of being able to quickly collect metadata for their collections, they vary widely in terms of their organizational and technical infrastructure. Finding a balance where our system is flexible and fast, but is still able to run on current systems with current levels of support, has been a key goal. The current build of Metadata Games is built using software that’s available at most web hosting services. We wanted to build the system on a NoSQL database such as MongoDB, but cultural heritage institutions are typically late technological adopters. Almost every institution we spoke with said that they would be sticking with current technologies like PHP and MySQL for at least another 5 years. I was surprised to learn how many heritage institutions don’t host their own servers. We went with a solution that is familiar for now, and can be upgraded later through a plug-in architecture.

Trevor: To what extent do you see Metadata Games as a crowdsourcing or gamification project? I realize that both terms come with a bit of baggage, but both seem to capture some parts of the essence of it. So, would you define Metadata Games as a platform for crowdsourcing metadata collection and remediation? Could one talk about it as “gamified” in the sense that you are bringing game mechanics into the tool? Or do you think there is a better vocabulary we should be using to talk about this kind of project?

Mary: One could refer to Metadata Games using both of those terms, though most game designers don’t like to go near “gamification” because it implies poor game design without meaningful choice, serving corporate interests first and player experience second. We’re not just adding games to archives mindlessly; we’re really trying to address player motivation and foster a connection between the player and the collections, so folks feel a sense of ownership of the archival materials. That’s the big vision for the project: there might be archives just down your street, or in the next town, or in Washington, but for whom? They are saved for us! And for our children, and their children’s children and so on. We have a right to see what’s in there and offer what we know. The public likely knows a thing or two: perhaps someone was married in a particular park, and finds a photo of that place, untagged? An architecture geek can name the architect on this anonymous photograph of a building that otherwise would remain lost and unidentified! A veteran may be able to identify friends in archived news footage! Perhaps your great grandfather could tag plants in a photograph that just looks like a field to someone else. Perhaps your sister can identify a poet’s voice in an audio recording, or can identify the dickens out of NASCAR models. Once we know base facts, we can begin to learn from what might be essentially lost archives. By sending in their tags, the playing citizen really can contribute new knowledge to the records.

Trevor: Could you tell us a bit about how your team is approaching the open source software development process? Along the same lines, how are you guys thinking about the sustainability of the software you are developing?

Mary: The project is entirely FOSS. To ensure open source compliance, we use openly available frameworks and programming libraries. For the first iteration of Metadata Games, we needed to convert our earlier game prototypes created in Flash to HTML5 and JavaScript. We also try to use libraries that have an active development community to encourage dialogue and upgrading. One of the great things about an open source project is that you get to see the code that makes it work. Features can be adapted to contexts, and institutions can completely customize the system as they see fit if there is the expertise to do so. Open source is about sharing and interacting.

Ideally, we would like to see a few institutions contribute a custom plug-in or two for the Metadata Games community of users. We are working to make the APIs and documentation as flexible and easy to use as we can. An important part of FOSS work is getting the word out about the project; it is a fantastic contribution to the not-for-profit space, and it raises interesting questions and puzzles. For example, how would you create a trust algorithm? What location-based game app might you build off the system? If a project is interesting, people will contribute and build on it. That’s what is meant to happen.

Regarding sustainability, we are working with the Office of Digital Humanities at the NEH on promoting crowdsourced humanities projects. We are kicking off a Humanities-specific code-sharing initiative as well, so humanists don’t have to start from scratch on developing backend databases and the like.

Trevor: How have librarians, archivists, curators and scholars reacted to the idea of Metadata Games? Are there different camps or perspectives that have emerged from different parties based on their feelings about authority and openness?

Mary: Overall we have been met with an extremely positive reception. The fact that the institution can “own” its own data is essential for most of our affiliates, who aren’t legally allowed to share some of their collections on the internet due to copyright restrictions and such. Institutions can use the system in-house if desired, or restrict Metadata Games to a particular IP address. Most of the questions have centered around implementation: how easy is it to install, set up and maintain? How accurate is the data? How can we bring the data gathered by Metadata Games back into the collections? Do we WANT to incorporate tags back into the collection, or should we make a parallel identical collection, one for “original” data and one for crowdsourced material that is searchable and constantly updated by the public? Some groups want to integrate all of the data together and some prefer to try a “separatist” approach for at least a trial period. Either way, we’re excited to help.

Trevor: There are a lot of folks interested in inviting public participation through platforms like this into libraries, archives and museums. However, in my experience you are one of the very few working on this sort of thing who has experience as both a game designer and an artist. I would be curious to hear to what extent you think your game design and artistic perspectives come into play in the development of this platform.

Mary: We are very careful to attend to the player experience: what is it like as a player to engage with the games? This is as important as finding out how the games generate very good data. I hope that an additional phase of the project will be moving beyond the screen to engage with the space of the museum or library and sinking into some of this material deeply. I’m a closet historian at heart, and I think once people find their way into this content, they may not only contribute to the project but have a richer sense of their communities, their family histories or other cultures. I see the games as much as a means of inquiry and investigation, for us as well as for the players, as they are a play experience in and of themselves. This is one of the ways this project relates to thinking as an artist. I also think people have the right to access their own cultural heritage, and it may be a right that a lot of us have forgotten all about. Hopefully, we’ll remember soon.

2 Comments

  1. Chris Chelberg
    April 7, 2013 at 8:33 am

    This looks like a wonderful project. I know I’ll be keeping this bookmarked in case I have need of it later. Thanks for sharing!

  2. Frances Hammond
    April 11, 2013 at 2:13 am

    It’s not gamified, but readers of this post may be interested to see how much crowdsourced correction is being done to the Digitized Newspapers Database on Trove in Australia – “78 million newspaper lines corrected”
    http://www.nla.gov.au/our-publications/staff-papers/trove-crowdsourcing-behaviour
