EarthCube Building Blocks: Collaborative Proposal: A Geo-Semantic Framework for Integrating Long-Tail Data and Models

Lead PI: Dr. Colin P. Stark

Unit Affiliation: Marine and Polar Geophysics, Lamont-Doherty Earth Observatory (LDEO)

September 2014 - August 2017
Project Type: Research

DESCRIPTION: The project offers a unique and transformative approach to integrating existing and emerging long-tail model and data resources. Many challenges hinder the seamless integration of models with data and compel scientists to perform the integration manually. The primary challenges stem from the knowledge latency between model and data resources; others derive from inadequate adoption and exploitation of information technologies. Knowledge latency challenges increase exponentially when a user aims to integrate long-tail data (data collected by individual researchers or small research groups) and long-tail models (models developed by individuals or small modeling communities). The goal of this research is to develop a framework rooted in semantic techniques and approaches to support the integration of "long-tail" models and data. The vision is to develop a decentralized knowledge-based platform that can be easily adopted across geoscience communities comprising individual researchers and small research groups. The project will develop a knowledge framework to close the loop from models' queries back to data sources by first investigating the concept architecture required to integrate two leading examples of long-tail resources in geoscience: the Community Surface Dynamics Modeling System (CSDMS) and Sustainable Environment Actionable Data (SEAD). The project will also develop a context-based data model that provides an explicit interpretation of each metadata attribute. The researchers will capture the metadata concepts and semantics from various geo-informatics systems and provide tools for ensuring conceptual integration between the resources. Next, the project will develop a knowledge discovery tool that allows automated coupling of a model and data coming from different contributors.
Finally, the project will provide a prototype physical implementation of the knowledge framework in the CSDMS modeling framework to demonstrate how it can advance the seamless discovery, selection, and integration of models and data, and how to achieve dynamic reusability across multiple long-tail Earth science resources.