Lecture content

Multimodal Foundation Models for EO

Foundation Models (FMs) represent the latest leap forward in AI, following the era of Deep Learning. Trained on vast amounts of unlabeled data through self-supervised learning (SSL), these models capture rich patterns that can be applied to a wide array of downstream tasks—even with limited or no additional training data. This paradigm holds particular promise for Earth Observation (EO) and Earth Sciences by enabling breakthroughs in analytical, predictive, and even prescriptive capabilities.

In EO and Earth Sciences, FMs can significantly enhance applications such as weather prediction and geospatial semantic data mining. By analyzing large-scale climate and atmospheric datasets, they deliver more accurate forecasts across different time horizons and reveal complex patterns in environmental systems. Their latent space representations and embeddings also enable powerful insights while reducing the need for extensive labeled data—a critical advantage in remote sensing, where labeling is often expensive and time-consuming.

Despite these benefits, integrating FMs into EO workflows poses distinct challenges. EO data often spans multiple modalities, resolutions, and spectral bands, requiring specialized adaptation and careful model updating—especially for “digital twin” scenarios where AI must remain synchronized with real-world changes. Moreover, FMs demand significant computational resources and optimized training strategies, particularly when handling enormous, continuously growing geospatial datasets. Evaluating and benchmarking FMs for these specialized applications further complicates their deployment, as existing benchmarks may be limited in scope.

This session provides a comprehensive introduction to TerraMind, a large-scale generative multimodal foundation model for EO. We will begin by exploring the theoretical concepts behind TerraMind, including its dual-scale early fusion architecture, which integrates both token-level and pixel-level representations across nine geospatial modalities (a conceptual sketch of this fusion follows the list below). To begin, the session covers:

  • A lecture on High-Performance Computing (HPC) strategies for training and deploying large-scale EO models.
  • A dedicated lecture on TerraMind, focusing on its architecture, training data, and benchmark performance.
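
The dual-scale early fusion mentioned above can be illustrated with a minimal, self-contained PyTorch sketch. This is not TerraMind's actual implementation: the class name, vocabulary size, and 12-band patch dimensions are illustrative assumptions, chosen only to show how token-level inputs (discrete IDs from a modality tokenizer) and pixel-level inputs (flattened image patches) can be embedded separately and fused into one sequence before a shared transformer encoder.

    import torch
    import torch.nn as nn

    class DualScaleEarlyFusion(nn.Module):
        """Toy dual-scale early fusion: token-level + pixel-level inputs, one shared encoder."""
        def __init__(self, vocab_size=1024, patch_dim=16 * 16 * 12, d_model=256, n_layers=4):
            super().__init__()
            self.token_embed = nn.Embedding(vocab_size, d_model)   # token-level path
            self.patch_embed = nn.Linear(patch_dim, d_model)       # pixel-level path
            layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

        def forward(self, tokens, patches):
            # tokens: (B, N_tok) integer IDs; patches: (B, N_patch, patch_dim) floats
            fused = torch.cat([self.token_embed(tokens), self.patch_embed(patches)], dim=1)
            return self.encoder(fused)  # (B, N_tok + N_patch, d_model)

    # Random stand-ins for two modalities: 64 discrete tokens and 196 twelve-band patches.
    model = DualScaleEarlyFusion()
    out = model(torch.randint(0, 1024, (2, 64)), torch.randn(2, 196, 16 * 16 * 12))
    print(out.shape)  # torch.Size([2, 260, 256])

Concatenating the two embedding streams before the shared encoder, rather than merging features late, is what makes the fusion "early"; the lecture covers how TerraMind realizes this at scale.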

Participants will then engage in hands-on sessions including:

  • Generative capabilities of TerraMind across modalities.
  • Standard fine-tuning techniques for downstream EO tasks.
  • Thinking-in-Modalities (TiM): a novel approach introduced with TerraMind that generates artificial data in additional modalities during fine-tuning and inference to improve model performance (a conceptual sketch follows this list).
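
To give a feel for the TiM idea before the hands-on session, the following toy PyTorch sketch (not TerraMind's API; the module names and shapes are assumptions) shows the general pattern: an auxiliary modality is first generated from the input and then fed back as extra input to the downstream head.

    import torch
    import torch.nn as nn

    class TiMStyleSegmenter(nn.Module):
        """Toy 'Thinking-in-Modalities' pattern: generate an auxiliary modality, then predict."""
        def __init__(self, in_ch=12, aux_classes=10, out_classes=2):
            super().__init__()
            # Stand-in generator for an auxiliary modality (e.g., a coarse land-cover map);
            # in TerraMind this role is played by the generative foundation model itself.
            self.aux_generator = nn.Conv2d(in_ch, aux_classes, kernel_size=1)
            # The downstream head sees the original bands plus the generated modality.
            self.head = nn.Conv2d(in_ch + aux_classes, out_classes, kernel_size=1)

        def forward(self, x):
            aux = self.aux_generator(x).softmax(dim=1)      # "think" in an extra modality
            return self.head(torch.cat([x, aux], dim=1))    # predict conditioned on it

    logits = TiMStyleSegmenter()(torch.randn(2, 12, 64, 64))
    print(logits.shape)  # torch.Size([2, 2, 64, 64])

In the hands-on session, the auxiliary modality comes from TerraMind's own generative capabilities rather than the 1x1 convolution stand-in used here.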

Throughout the session, participants will gain practical experience in data preparation, model fine-tuning, and deployment workflows, equipping them with the skills to effectively utilize TerraMind in operational EO settings.
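
As a preview of the fine-tuning workflow, here is a minimal sketch of the common frozen-backbone pattern in plain PyTorch. The backbone below is a stand-in module, not TerraMind; in the hands-on sessions the pretrained TerraMind encoder takes its place, and the dataset, head, and hyperparameters shown here are placeholder assumptions.

    import torch
    import torch.nn as nn

    # Stand-in for a pretrained encoder; in practice this would be the TerraMind backbone.
    backbone = nn.Sequential(nn.Conv2d(12, 64, 3, padding=1), nn.ReLU())
    head = nn.Conv2d(64, 2, kernel_size=1)        # lightweight task head (binary segmentation)

    for p in backbone.parameters():               # freeze the pretrained weights
        p.requires_grad = False

    optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()

    # Dummy batch standing in for an EO dataloader: 12-band patches with 2-class masks.
    x = torch.randn(4, 12, 64, 64)
    y = torch.randint(0, 2, (4, 64, 64))

    for step in range(10):
        with torch.no_grad():
            features = backbone(x)                # frozen feature extraction
        loss = criterion(head(features), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Full fine-tuning (unfreezing the backbone) and TiM-style tuning follow the same loop with different parameter groups and inputs.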

Agenda

Block 1: Introduction

  • An introduction to HPC for AI
  • Leveraging HPC for EO applications: opportunities and challenges
  • TerraMind: introduction, background theory, and implementation

Blocks 2, 3 and 4:

  • Hands-on TerraMind

Instructors

Johannes Jakubik

Biography

Johannes is a Staff Research Scientist on the AI for Climate Impact team at IBM Research Europe. In this role, he leads research on pretraining and scaling multimodal AI foundation models for Earth observation, as well as developing AI foundation models for weather and climate assessments in collaboration with NASA, ESA, and the EU Horizon program. His work on large-scale deep learning for Earth observation has been recognized with the NASA Marshall Space Flight Center Honor Award and multiple IBM accomplishment awards, and has been featured in international and national media. He also supervises and mentors Ph.D. students at MIT and ETH Zurich. Johannes graduated from KIT and ETH, where his research spanned the subfields of deep learning-based systems: data-centricity, model-centricity, and human-centricity. During his Ph.D., he received a best paper award and a best paper award nomination for theoretical contributions to human-centric AI. In fall 2024, he was recommended as a top candidate for a tenure-track professorship at a German university of excellence. Together with a range of esteemed co-authors, he has published in highly recognized journals and conferences.

Alexandre Strube

Biography

Alexandre holds a PhD in High-Performance Computing from the Universitat Autònoma de Barcelona. He worked in the Performance Analysis team at the Jülich Supercomputing Centre from 2010 to 2015 and in the Application Support team from 2015 to 2019, and has since been a Consultant at Helmholtz AI. He is also one of the maintainers of the scientific software stack on Jülich's supercomputers and the official maintainer of Lmod, the module system, for the Debian and Ubuntu operating systems. Alexandre develops and maintains Blablador, the LLM inference infrastructure of the Helmholtz Association.

Rocco Sedona

Biography

Rocco Sedona (Member, IEEE) received the B.Sc. and M.Sc. degrees in information engineering from the University of Trento, Trento, Italy, in 2016 and 2019, respectively, and the Ph.D. degree in computational engineering from the University of Iceland, Reykjavik, Iceland, in 2023. He is a member of the “AI and ML for Remote Sensing” Simulation and Data Lab at the Jülich Supercomputing Centre (JSC), Germany. His research interests primarily lie in the field of deep learning and its application to remote sensing data. He has extensively used optical satellite data acquired by the Landsat (NASA) and Sentinel (ESA) missions for near real-time land-cover classification. In addition, he specializes in distributed deep learning on high-performance computing systems, an area he has been actively engaged in since 2019.

Þorsteinn Elí Gíslason

Biography

Þorsteinn Elí Gíslason received a B.Sc. degree in Physics and an M.Sc. degree in Computational Engineering from the University of Iceland. His master's research focused on foundation models for Earth observation as part of the Prithvi-EO-2.0 project, where he contributed to both pretraining on high-performance computing (HPC) systems and downstream validation. He is currently a Researcher at the Jülich Supercomputing Centre, Forschungszentrum Jülich, where he works on foundation model research for remote sensing. His contributions span both pretraining large-scale models on HPC systems and fine-tuning them for Earth observation applications. Additionally, he is involved in improving hybrid workflows that integrate HPC and cloud resources.