The 2011 Annual Summer Meeting of DataCite recently brought data lovers from several nations to Berkeley, CA. A celebration of access and preservation ensued, with communal sharing of case studies, best practices and ideas for future work.
DataCite is an organization with members from national libraries and other organizations from around the world that are working to develop a global citation framework for data.
In a previous post, I talked about the growing importance of making scientific research data easier to find and use. Researchers in different disciplines are pushing for enhanced data access and reuse to extend learning and make new discoveries. A key part of making this happen is ensuring that data sets are discoverable through a uniform method of citation, in a manner similar to that long in place for the published journal articles that summarize research findings.
The DataCite meeting featured sessions on community trends and practices, data and the scholarly output, the role of data archives in identifying and preserving data, and the role of publishers in managing research data. Presentations from the meeting are available here.
A highlight of the meeting was the opening keynote by John Wilbanks, Vice President for Science, Creative Commons. He stated that DataCite is tackling a hard problem: developing a citation method while people are still trying to figure out how researchers use data and how best to preserve it. The purpose of citation is itself spread across several concepts that are difficult to disentangle, including providing credit, tracking influence, and advancing science.
Wilbanks described data citation as facilitating a series of codependent activities:
- Find data
- Access data
- Understand data
- Be influenced by data
- Provide credit back for influence
In developing solutions, he urged adhering to some basic concepts.
Simple and weak. Initial efforts to develop new solutions often fail, so it is important to guard against wasted effort. A wiser course is to develop systems and services that are easy to implement and use. This enables quick development of practical experience and identification of what works and what doesn’t.
Scalable and open. Use of the least powerful solution to a problem typically enables the most rapid scaling. And as solutions grow, they draw critics. Openness allows critics to fix and change things as they see fit.
Wilbanks also stressed what he saw as the critical role of data archivists and librarians: helping researchers find important data sets and steering them away from “nonsense data.” He stated that the traditional role of archivists and librarians in keeping and locating information will grow as data accumulates in volume, diversity and importance.