Discovering Physically Meaningful Structures from Climate Extreme Data

Lead PI: Dr. Marcus van Lier-Walqui

Unit Affiliation: Center for Climate Systems Research (CCSR)

September 2021 - August 2022
Project Type: Research

DESCRIPTION: The past two decades have witnessed natural disasters and extreme weather events that have affected millions of people. At the same time, the volume of data from high-resolution climate models and from satellite, in-situ and ground-based measurements has increased substantially, reaching petabyte scales. These new and readily accessible datasets create the previously missing pipeline required for scientific machine learning (ML), and with it new opportunities to improve the understanding and prediction of climate extreme events. This proposal aims to develop a deep latent variable model framework to discover physically meaningful hidden structures from high-dimensional, spatiotemporal climate extreme data. Climate extreme data are highly complex and pose foundational challenges to current ML methods. Climate extremes are by definition rare, which produces a highly skewed, long-tailed distribution. They are also high-dimensional, covering large spatial and temporal extents and containing many variables. Furthermore, climate change is affecting the frequency, intensity and spatial organization of extreme events, leading to non-stationary distributions that complicate extreme event prediction. To tackle these challenges, we propose to develop a probabilistic paradigm to extract physically meaningful hidden structures from climate extreme data. Our research is based on deep latent variable models (LVMs) and centers on three research aims:

Aim 1: Discovering hidden structures from climate extremes: design novel deep LVMs, including sequential variational autoencoders (VAEs), tractable LVMs and tensor LVMs, to extract low-dimensional hidden structures that evolve in space and time from climate simulations and sparse observations. Then use causal transportability to certify the learned hidden structures and to transport causal relationships from simulations to observations.
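
For readers less familiar with sequential VAEs, one common textbook form of the training objective is sketched below. It is given only as an illustration and is not necessarily the formulation this project adopts. The symbols are assumptions introduced here: x_{1:T} is an observed sequence of climate fields, z_{1:T} is the low-dimensional latent sequence, p_theta is the generative model with a Markov prior on the latents, and q_phi is the approximate posterior (encoder).

```latex
\log p_\theta(x_{1:T})
  \;\ge\;
  \mathbb{E}_{q_\phi(z_{1:T} \mid x_{1:T})}
    \left[ \sum_{t=1}^{T} \log p_\theta(x_t \mid z_t) \right]
  \;-\;
  \mathrm{KL}\!\left( q_\phi(z_{1:T} \mid x_{1:T}) \,\middle\|\, p_\theta(z_{1:T}) \right),
\qquad
p_\theta(z_{1:T}) = p_\theta(z_1) \prod_{t=2}^{T} p_\theta(z_t \mid z_{t-1}).
```

The first term rewards latent sequences that reconstruct the observations; the second keeps the inferred latent dynamics close to the prior dynamics.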

Aim 2: Injecting physical principles into hidden structures: examine innovative methods, including disentangled representations, weak supervision in the latent space, and equivariant neural networks, to inject physical laws, principles, and constraints into the hidden structures discovered by the deep LVMs.
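
As a minimal sketch of what equivariance means in Aim 2, the Python snippet below checks that a periodic 1-D convolution commutes with cyclic shifts: shifting the input and then applying the layer gives the same result as applying the layer and then shifting. All names (`circular_conv`, `shift`, `is_shift_equivariant`, `position_scaling`) and the toy signal are hypothetical and chosen for illustration; they are not part of the project's models.

```python
def circular_conv(signal, kernel):
    """Periodic 1-D cross-correlation of a signal with a small kernel."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, k):
    """Cyclically shift a signal k positions to the right."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def is_shift_equivariant(layer, signal, k):
    """Check layer(shift(x)) == shift(layer(x)) for one signal and one shift."""
    return layer(shift(signal, k)) == shift(layer(signal), k)


def smoothing_layer(signal):
    """A convolutional layer with a fixed smoothing kernel (equivariant)."""
    return circular_conv(signal, [0.25, 0.5, 0.25])


def position_scaling(signal):
    """A layer that depends on absolute position (not equivariant)."""
    return [i * v for i, v in enumerate(signal)]


# Toy anomaly values on a periodic grid (illustrative data only).
toy_signal = [0, 1, 4, 9, 3, 2, 0, 0]
conv_ok = is_shift_equivariant(smoothing_layer, toy_signal, 3)
scaling_ok = is_shift_equivariant(position_scaling, toy_signal, 3)
```

Here `conv_ok` is true and `scaling_ok` is false: building the symmetry into the layer guarantees the property for every input, which is the sense in which equivariant networks encode a physical constraint by construction.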

Aim 3: Quantifying uncertainty for discovered hidden structures: develop scalable Bayesian inference techniques for deep LVMs. Design pre-conditioned Monte Carlo sampling to handle long-tailed distributions. Exploit the hidden structures to accelerate variational inference (VI), obtain robust estimates, and improve statistical calibration. Investigate hybrid methods that interpolate between Monte Carlo sampling and VI for accurate posterior inference.
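
To illustrate why rare events are hard for plain Monte Carlo, the sketch below estimates the tail probability P(X > 4) for a standard normal variable in two ways. It uses basic importance sampling, a standard technique swapped in here for illustration; it is not the pre-conditioned sampler proposed in Aim 3. The function names, the threshold of 4, the shifted-normal proposal and the sample size are all assumptions made for this example.

```python
import math
import random


def true_tail_prob(threshold):
    """Exact P(X > threshold) for X ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


def naive_estimate(threshold, n, rng):
    """Plain Monte Carlo: fraction of N(0, 1) draws that exceed the threshold."""
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > threshold)
    return hits / n


def importance_estimate(threshold, n, rng):
    """Importance sampling with proposal N(threshold, 1), centered on the rare region."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Density ratio N(x; 0, 1) / N(x; threshold, 1).
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n


rng = random.Random(0)  # fixed seed so the run is repeatable
truth = true_tail_prob(4.0)
naive = naive_estimate(4.0, 10_000, rng)
weighted = importance_estimate(4.0, 10_000, rng)
relative_error = abs(weighted - truth) / truth
```

With 10,000 draws the naive estimator expects well under one exceedance, so it usually returns zero or a value that is off by a large factor, while the reweighted estimator places about half of its samples in the tail. Handling long-tailed distributions in high dimensions requires much more than this, but the same principle of steering samples toward the rare region applies.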

This project brings together a unique team of scientists with expertise in machine learning, causal inference, Bayesian statistics and climate science. It will lead to efficient and robust scientific ML methods for high-dimensional spatiotemporal data that are flexible, scalable and physically meaningful, and that also produce uncertainty estimates. For climate science, this project will uncover hidden structures from climate extremes, significantly deepening our understanding of these events and improving our ability to predict them, with methods that carry over to similar types of data in many other scientific fields.