Toward Unified Earth-Scale Knowledge Discovery: Multisensor, Multimodal Earth Observation with AI

Date: 2025/12/02

Speaker: Dr. Gencer Sumbul, postdoctoral scientist at École Polytechnique Fédérale de Lausanne (EPFL)

Time: 3:00-4:00 p.m., December 2nd, 2025 (Beijing Time)

Online: Feishu

Abstract

Earth observation (EO) has entered a regime where heterogeneous satellite constellations produce exabyte-scale data streams across multiple sensors (multispectral, SAR, hyperspectral, and very high-resolution optical). Yet most contemporary deep learning approaches—whether task-specific or foundational—remain tied to individual sensors or fixed sensor combinations. This creates a fundamental tension: while the data have truly Earth-scale coverage with complementary characteristics across sensors, our AI models are often sensor-specific, task-specific, and brittle to label noise and distribution shift. This constraint hinders the scalability, adaptability, and generalization of EO models across heterogeneous data sources, limiting our ability to realize the full potential of EO data for actionable solutions to environmental and societal challenges. In this talk, I will present my research aimed at moving from fragmented EO models toward unified, scalable representations for Earth-scale knowledge discovery. I will first discuss how my early work addressed core limitations of supervised deep learning in EO. In the second part, I will focus on foundation models for EO, presenting SMARTIES, a spectrum-aware multi-sensor autoencoder that maps diverse sensors into a shared latent space and pretrains a transformer backbone via masked modeling. This yields sensor-agnostic, transferable representations and allows a single model to support multiple sensors, including zero-shot transfer to unseen ones. Finally, I will outline my ongoing and future work toward language-powered, physics-inspired multimodal EO foundation models. Here, EO imagery is integrated with auxiliary modalities (meteorological, ecological, and socio-economic data) and natural language, enabling semantic querying, better interpretability, and broader accessibility. I will illustrate how these models can support applications such as forest-loss mapping, wildlife monitoring, species-distribution modeling, and climate-extreme understanding.
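To make the pretraining idea concrete, the sketch below illustrates the general pattern the abstract describes: sensor-specific projections map data from sensors with different band counts into one shared latent space, most tokens are masked, and the model is trained to reconstruct the masked inputs. This is a minimal NumPy illustration of masked modeling in general; the dimensions, sensor names, and the placeholder identity "encoder" are illustrative assumptions, not the SMARTIES implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: each sensor has its own number of spectral bands,
# but all are projected into one shared latent space.
SHARED_DIM = 32
SENSOR_BANDS = {"multispectral": 13, "sar": 2}  # e.g. optical vs. radar

# One projection (and reconstruction head) per sensor; the transformer
# backbone that would sit between them is shared and sensor-agnostic.
proj = {s: rng.normal(0, 0.02, (b, SHARED_DIM)) for s, b in SENSOR_BANDS.items()}
head = {s: rng.normal(0, 0.02, (SHARED_DIM, b)) for s, b in SENSOR_BANDS.items()}
mask_token = rng.normal(0, 0.02, SHARED_DIM)

def masked_modeling_step(sensor, patches, mask_ratio=0.75):
    """Project patches into the shared space, mask most of them, and
    score reconstruction of the original values (MSE on masked patches)."""
    n = patches.shape[0]
    tokens = patches @ proj[sensor]          # (n, SHARED_DIM), shared space
    masked = rng.random(n) < mask_ratio      # boolean mask over patches
    if not masked.any():
        return 0.0, masked
    tokens[masked] = mask_token              # hide masked patches
    # A shared transformer encoder would process `tokens` here; we use
    # the identity to keep the sketch minimal.
    recon = tokens @ head[sensor]            # back to sensor-specific bands
    loss = float(np.mean((recon[masked] - patches[masked]) ** 2))
    return loss, masked

# Ten patches per sensor, each represented by one value per spectral band.
for sensor, bands in SENSOR_BANDS.items():
    patches = rng.normal(size=(10, bands))
    loss, masked = masked_modeling_step(sensor, patches)
    print(sensor, round(loss, 4), int(masked.sum()))
```

The key design point this mirrors is that only the thin input/output projections are sensor-specific; the (omitted) encoder operates on the shared latent tokens, which is what makes the learned representations transferable across sensors.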

Biography

Dr. Gencer Sumbul is a postdoctoral scientist at the Environmental Computational Science and Earth Observation Laboratory at the École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. His research focuses on advancing computer vision and machine learning methodologies for multi-modal Earth observation, with a current emphasis on sensor-agnostic foundation models and cross-sensor representation learning for environmental understanding. He received his Ph.D. summa cum laude from the Faculty of Electrical Engineering and Computer Science at Technische Universität Berlin, where he worked as a Research Associate and created BigEarthNet, one of the most widely used multi-modal benchmark datasets for Earth observation studies. He is also the creator of SMARTIES, a spectrum-aware foundation model for Earth observation. Dr. Sumbul has authored over forty peer-reviewed publications with more than 1,700 citations, and has contributed to several European Space Agency and Horizon Europe research projects aimed at scalable and generalizable AI for environmental and societal understanding.