EGU26-18770, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-18770
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Tuesday, 05 May, 08:30–10:15 (CEST), Display time Tuesday, 05 May, 08:30–12:30
 
Hall X4, X4.88
Energy Efficiency in Cloud-Based Earth Observation Data Processing: Gap Analysis and Research Directions
Adhitya Bhawiyuga, Serkan Girgin, Rolf A. de By, and Raul Zurita-Milla
  • Faculty of Geo-Information Science and Earth Observation, University of Twente
The Earth observation (EO) community has increasingly adopted cloud platforms for processing large datasets, while EO data archives grow by approximately 100 PB annually. However, the energy costs and environmental footprint of this processing remain largely invisible. This oversight is particularly contradictory for a community focused on environmental monitoring and climate mitigation. In this study, we present a gap analysis of energy awareness and energy efficiency in cloud-based EO data processing, using Pangeo's Kubernetes-based architecture as a case study. Through a literature review and architectural analysis, we identify five interconnected problems that prevent energy-efficient cloud operations in the EO domain.

According to our analysis, the most critical gap is the absence of granular energy observability. While Pangeo deployments on self-managed Kubernetes can access resource metrics, e.g. through Prometheus, they lack energy attribution at the task level. Tools like Kepler provide pod-level power estimates on bare-metal infrastructure but face limitations in virtualized cloud environments, where hypervisors restrict access to hardware sensors. On fully managed cloud platforms, limited provider transparency worsens the problem, as providers offer only monthly, service-level carbon footprint reports. Without task-level visibility, researchers can optimize workflows only for execution time and cost, leaving energy efficiency as an invisible dimension. Furthermore, the EO community lacks standardized benchmarking frameworks for evaluating energy-performance trade-offs in realistic workflows. Researchers who report energy improvements for specific algorithms cannot provide reproducible comparisons, because studies differ in their datasets, baseline systems, and measurement methodologies.
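To illustrate what task-level energy attribution could look like, the following minimal sketch splits a node's measured energy across tasks in proportion to their CPU time. It is not how Kepler or any Pangeo component works; the function name `attribute_energy`, the `idle_fraction` parameter, and the proportional CPU-time model are assumptions made purely for illustration.

```python
from typing import Dict


def attribute_energy(
    node_energy_joules: float,
    cpu_seconds_by_task: Dict[str, float],
    idle_fraction: float = 0.0,
) -> Dict[str, float]:
    """Split node energy across tasks in proportion to CPU time.

    Hypothetical model: `idle_fraction` is the share of the node energy
    treated as idle overhead and left unattributed.
    """
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")

    total_cpu = sum(cpu_seconds_by_task.values())
    if total_cpu <= 0:
        # No measured activity: nothing can be attributed.
        return {task: 0.0 for task in cpu_seconds_by_task}

    dynamic_energy = node_energy_joules * (1.0 - idle_fraction)
    return {
        task: dynamic_energy * cpu / total_cpu
        for task, cpu in cpu_seconds_by_task.items()
    }


if __name__ == "__main__":
    # Two hypothetical tasks sharing a node that consumed 1000 J.
    print(attribute_energy(1000.0, {"reproject": 30.0, "mosaic": 10.0}))
```

A CPU-time ratio ignores memory, disk I/O, and network activity, so a sketch like this is only a starting point for the attribution problem described above.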

From a system-level perspective, current Kubernetes orchestration policies optimize for resource availability and load balancing but ignore hardware-specific energy profiles. Pangeo deployments consequently distribute workloads across multiple underutilized nodes rather than consolidating them to enable node shutdown. Similarly, Dask schedulers prioritize data locality and workload balance but do not take energy into account when assigning tasks. When processing continent-scale mosaicking operations, schedulers can mismatch task characteristics with hardware capabilities, for example by assigning compute-intensive operations to high-power nodes when energy-efficient alternatives could handle the workload.
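The difference between spreading and consolidating workloads can be sketched with a toy placement model. This is not the Kubernetes scheduler: the functions `spread`, `consolidate`, and `idle_nodes`, the single-resource (CPU cores) model, and the first-fit-decreasing heuristic are assumptions chosen for illustration.

```python
from typing import List


def spread(requests: List[int], capacity: int, node_count: int) -> List[int]:
    """Place each request on the currently least-loaded node."""
    loads = [0] * node_count
    for request in requests:
        target = min(range(node_count), key=lambda i: loads[i])
        if loads[target] + request > capacity:
            raise ValueError("request does not fit on any node")
        loads[target] += request
    return loads


def consolidate(requests: List[int], capacity: int, node_count: int) -> List[int]:
    """First-fit decreasing: pack requests onto as few nodes as possible."""
    loads = [0] * node_count
    for request in sorted(requests, reverse=True):
        for i in range(node_count):
            if loads[i] + request <= capacity:
                loads[i] += request
                break
        else:
            raise ValueError("request does not fit on any node")
    return loads


def idle_nodes(loads: List[int]) -> int:
    """Count nodes with no load, i.e. candidates for shutdown."""
    return sum(1 for load in loads if load == 0)


if __name__ == "__main__":
    cores = [2, 2, 1, 1, 1, 1]  # hypothetical per-pod CPU requests
    print("spread:     ", spread(cores, capacity=8, node_count=4))
    print("consolidate:", consolidate(cores, capacity=8, node_count=4))
```

In this toy case, spreading keeps all four nodes partially busy, while consolidation leaves three nodes empty and therefore eligible for shutdown.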

To address these interconnected gaps, we propose a multi-phase research roadmap. The first phase should focus on developing energy monitoring toolkits that combine hardware sensor readings with application profiling and modeling frameworks to account for hidden energy consumption in unmeasured components such as disk I/O and network peripherals. This phase should also establish standardized benchmarking frameworks comprising representative EO big data (EOBD) workflows to enable reproducible energy-performance evaluation across different platforms and algorithms. Building on this measurement infrastructure, subsequent phases should develop predictive models that estimate task-level energy consumption from workflow characteristics and hardware specifications before execution takes place. Such models would enable proactive decisions about algorithm selection, hardware provisioning, and resource allocation. The final phase should focus on system-level optimization by designing energy-aware Kubernetes orchestration through workload consolidation and heterogeneous hardware selection. This phase also includes developing multi-objective task schedulers for distributed frameworks like Dask that co-optimize energy consumption, execution time, and cost when assigning tasks to worker nodes. These directions aim to make energy consumption a measurable, optimizable metric in cloud-based EO processing, aligning computational practices with environmental sustainability goals.
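One simple way to co-optimize energy, execution time, and cost is a weighted sum over normalized per-worker estimates, sketched below. This is not a Dask API or the method proposed by the authors: the `Estimate` class, the `pick_worker` function, the weights, and the min-max normalization are hypothetical, and the estimates are assumed to come from a predictive model like the one outlined in the roadmap.

```python
from dataclasses import dataclass
from typing import Dict, List

OBJECTIVES = ("energy_joules", "runtime_seconds", "cost")


@dataclass(frozen=True)
class Estimate:
    """Hypothetical predicted figures for running one task on one worker."""
    worker: str
    energy_joules: float
    runtime_seconds: float
    cost: float


def pick_worker(estimates: List[Estimate], weights: Dict[str, float]) -> str:
    """Return the worker with the lowest weighted, normalized score."""
    if not estimates:
        raise ValueError("at least one estimate is required")

    def normalized(field: str, value: float) -> float:
        # Min-max normalize so objectives with different units are comparable.
        values = [getattr(e, field) for e in estimates]
        lo, hi = min(values), max(values)
        return 0.0 if hi == lo else (value - lo) / (hi - lo)

    def score(estimate: Estimate) -> float:
        return sum(
            weights.get(field, 0.0) * normalized(field, getattr(estimate, field))
            for field in OBJECTIVES
        )

    return min(estimates, key=score).worker


if __name__ == "__main__":
    candidates = [
        Estimate("high-power-node", 900.0, 60.0, 0.05),
        Estimate("efficient-node", 400.0, 100.0, 0.03),
    ]
    equal = {"energy_joules": 1.0, "runtime_seconds": 1.0, "cost": 1.0}
    print(pick_worker(candidates, equal))
```

A weighted sum is only one option; Pareto-based selection would avoid having to fix the weights in advance, at the price of leaving the final choice open.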

How to cite: Bhawiyuga, A., Girgin, S., de By, R. A., and Zurita-Milla, R.: Energy Efficiency in Cloud-Based Earth Observation Data Processing: Gap Analysis and Research Directions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18770, https://doi.org/10.5194/egusphere-egu26-18770, 2026.