Collaborative Research: EarthChem & SESAR - Data Infrastructure for Geochemistry and Earth Science Samples Communities

Lead PI: Dr. Kerstin A. Lehnert

Unit Affiliation: Marine and Polar Geophysics, Lamont-Doherty Earth Observatory (LDEO)

May 2020 - April 2024
Project Type: Research

DESCRIPTION: Earth scientists gather a vast and diverse body of data by collecting specimens of geological materials, including rocks, minerals, fossils, sediments, ice, and river water or seawater, and by analyzing these samples in the laboratory to measure their chemical and physical properties. The data generated by these analyses help scientists understand natural processes of past, present, and even future Earth systems, and form the basis of new ideas, hypotheses, and discoveries. It is therefore important that these data are preserved and made easily findable and accessible online, in a useful format and accompanied by information relevant to other researchers, decision makers, teachers, and the general public. This project continues the operation of two data systems, SESAR (System for Earth Sample Registration) and EarthChem, that provide online open access to information about samples collected as part of Earth and environmental science research, and to data generated when these samples are chemically analyzed in the lab. EarthChem manages and curates hundreds of geochemical datasets contributed by researchers and maintains databases with over 30 million geochemical measurements for more than one million samples. EarthChem will ensure that these data are available to, and consumable by, anyone (educators, students, researchers, industry groups, the general public) interested in using them for essentially any purpose, especially research and education in STEM fields. SESAR makes samples discoverable and accessible online, allowing more efficient and effective access to sample collections that play a central role in a wide range of Earth, environmental, and planetary sciences and may inform critical decisions, laws and policies, and future science. EarthChem and SESAR build important national and international connections among data providers and users that facilitate the availability of information and knowledge to society.

During the two years of this project, EarthChem and SESAR will provide ongoing data stewardship services for the geochemistry, petrology, and Earth samples communities. EarthChem will also undertake essential developments to modernize and optimize its technical implementation, enhance user functions, expand accessible data holdings, and engage with a broad community of users and data facilities to establish best practices and interoperability standards for geochemical data. EarthChem will restructure and reengineer its systems to ensure that its data services scale to evolving community demands and that operations are optimized for efficiency, resilience, and sustainability. New developments will support next-generation modes of data-driven research by improving both human- and machine-readable interfaces for discovery, access, visualization, and analysis of content in EarthChem data systems. Applications of machine learning and other data science approaches will help extract new knowledge from the data collections. SESAR will continue to expand its user community beyond geochemistry and the Earth sciences. In addition, SESAR's services will move to a new, independent, multi-disciplinary infrastructure for sample registration, an emerging collaborative effort with cyberinfrastructure providers in biology, genomics, and archeology.