- 1 Alfred Wegener Institute for Polar and Marine Research, Climate Sciences | Climate Dynamics, Bremerhaven, Germany
- 2 Department of Environment, Land, and Infrastructure Engineering, Politecnico di Torino, Turin, Italy
- 3 Institute of Environmental Physics, University of Bremen, Bremen, Germany
Large Language Models (LLMs) have emerged as powerful tools for text and data processing, with potential extending far beyond conversational interfaces. We demonstrate that integrating LLMs into agentic workflows enables automated climate and oceanographic data analysis while minimizing hallucinations through strict reliance on real data sources.
ClimSight combines LLMs with climate model data to deliver localized climate insights for decision-making. Specialized agents consult external databases, extract variables from climate models, generate Python scripts for post-processing, and validate outputs through visual analysis. The workflow iteratively corrects errors until reliable results are achieved.
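The iterative correction step can be illustrated with a minimal sketch. It is not ClimSight's actual code: `run_with_correction` and `fake_generator` are hypothetical names, the generator is a stub standing in for an LLM call, the temperature values are made-up toy numbers, and the visual validation stage is reduced to a simple check that a `result` variable exists.

```python
import traceback
from typing import Callable, Optional


def run_with_correction(
    generate_script: Callable[[str, Optional[str]], str],
    task: str,
    max_attempts: int = 3,
) -> dict:
    """Generate a script, execute it, and feed any error back for a retry."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        script = generate_script(task, feedback)
        namespace: dict = {}
        try:
            # Assumption: plain exec() for brevity; a real system should sandbox this.
            exec(script, namespace)
        except Exception:
            # Pass the traceback to the next generation attempt.
            feedback = traceback.format_exc()
            continue
        if "result" not in namespace:
            # Stand-in for output validation (the abstract describes visual analysis).
            feedback = "Script ran but did not define `result`."
            continue
        return {"attempts": attempt, "result": namespace["result"]}
    raise RuntimeError(f"No valid script after {max_attempts} attempts:\n{feedback}")


def fake_generator(task: str, feedback: Optional[str]) -> str:
    """Hypothetical stub for an LLM: the first draft fails, the retry is fixed."""
    if feedback is None:
        return "result = sum(temps) / len(temps)"  # fails: `temps` is undefined
    # Toy values for illustration only, not real climate data.
    return "temps = [1.0, 2.0, 3.0]\nresult = sum(temps) / len(temps)"


outcome = run_with_correction(fake_generator, "mean temperature")
print(outcome)
```

In this sketch the first script raises a `NameError`, the traceback is handed back to the generator, and the second attempt succeeds. Capping the number of attempts keeps a persistently failing task from looping indefinitely.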
PANGAEA GPT enhances accessibility to the PANGAEA data repository through a supervisor agent that interprets queries, delegates tasks to domain-specific subagents, and coordinates data extraction, statistical analysis, and visualization of oceanographic and atmospheric datasets.
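The supervisor pattern described here can be sketched as a simple dispatcher. This is an assumption-laden illustration, not the PANGAEA GPT implementation: the subagent functions, the `SUBAGENTS` table, and the keyword matching are all hypothetical, with keyword matching standing in for the LLM-based query interpretation.

```python
from typing import Callable, Dict


def extract_data(query: str) -> str:
    """Hypothetical subagent: dataset search and extraction."""
    return f"[data-extraction] {query}"


def run_statistics(query: str) -> str:
    """Hypothetical subagent: statistical analysis."""
    return f"[statistics] {query}"


def make_plot(query: str) -> str:
    """Hypothetical subagent: visualization."""
    return f"[visualization] {query}"


# Assumption: keyword routing replaces the LLM's interpretation of the query.
SUBAGENTS: Dict[str, Callable[[str], str]] = {
    "plot": make_plot,
    "trend": run_statistics,
    "mean": run_statistics,
}


def supervisor(query: str) -> str:
    """Route a query to the first matching subagent, defaulting to extraction."""
    lowered = query.lower()
    for keyword, agent in SUBAGENTS.items():
        if keyword in lowered:
            return agent(query)
    return extract_data(query)


print(supervisor("Plot salinity profiles"))
print(supervisor("Compute the mean temperature"))
print(supervisor("Find CTD casts"))
```

Keeping the routing table separate from the subagents means a new domain-specific agent can be registered without changing the supervisor itself.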
Both systems leverage automatic Python execution and image analysis for quality control. By constraining outputs to verifiable data sources and implementing multi-agent verification, we demonstrate that LLMs can play a significant role in geoscientific data pipelines and automated research workflows.
How to cite: Kuznetsov, I., Pantiukhin, D., Grassi, J., Shapkin, B., Jung, T., and Koldunov, N.: Integrating Large Language Models into Climate and Geoscientific Data Workflows, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13612, https://doi.org/10.5194/egusphere-egu26-13612, 2026.