EarthCube Data Capabilities: Collaborative Proposal: Reducing Time-To-Science in the Earth Sciences: Annotations to Foster Convergence, Inclusion, and Credit

Lead PI: Dr. Kerstin A. Lehnert

Unit Affiliation: Marine and Polar Geophysics, Lamont-Doherty Earth Observatory (LDEO)

September 2019 - August 2022
Project Type: Research

DESCRIPTION: The long-term sustainability of federally funded research depends on the discovery, accessibility, and reuse of data. However, data and research products are often stored in different locations, which makes it challenging to find and integrate related data. This project supports the discovery of related but distributed research products for Earth science and natural history data. Researchers will have a way to link data resources, add context, or provide additional information about data, software, and publications. Researchers will use the Throughput system to create annotations that link resources using unique identifiers. Over time, these links connect to create a network of data resources. This project will support the development of the underlying database, a user interface for access and discovery, and a number of documentation tools and workshops to help support the ongoing development and sustainability of the Throughput database. The broader impacts of this project include the engagement of early career researchers and the better sharing of data and other research products in the Earth sciences.

Improved discoverability of data, metadata, and services is a need shared across the geosciences. A barrier that increases "time-to-science" in Earth science research is the difficulty of integrating individual observations, concepts, data models, and statistical techniques across subfields. The Throughput Annotation Engine (TAE) offers a solution to the challenge of managing interdisciplinary workflows by providing multiple points of entry to access, annotate, and interact with data, and to link code, data, publications, or other elements to one another. The TAE will support adoption as part of a system of systems, linking outside users to data repositories and giving data stewards flexibility in deciding how to manage new information: whether to incorporate information from annotations into their data models, or to access and report annotations using the Throughput API. Throughput improves credit for data, software, and documentation by providing a mechanism to track use and implementation across a range of publications and online resources, and will provide a new set of citation tools based on FORCE11 recommendations. Linked annotations in the TAE support the generation of metrics for research infrastructure beyond standard publication metrics. The Cookbook will allow individuals and communities to identify programmatic workflows and link them to community data resources. The Cookbook will provide researchers with access to community best practices and, by leveraging ORCID user credentials, to individuals engaged in the development of documentation and workflows. Throughput offers a user-centered solution to deepening and densifying connections among the many nodes in the emerging linked data ecosystem of scientists, data, software services, journals, and funders.