EGU23-17046
https://doi.org/10.5194/egusphere-egu23-17046
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Wayang AgoraEO Plugin: The Framework for Scalable EO Workflows

Rodrigo Pardo Meza1, Jorge-Arnulfo Quiané-Ruiz3, Begüm Demir1,2, and Volker Markl1,2
  • 1TU Berlin, Berlin, Germany (ro.pardo.meza@gmail.com, demir@tu-berlin.de, volker.markl@tu-berlin.de)
  • 2BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
  • 3IT University of Copenhagen, Denmark (joqu@itu.dk)

Earth Observation (EO) platforms currently provide datasets, algorithms, and processing capabilities. However, each platform offers its own closed environment for discovering, processing, and running EO elements. We recently proposed AgoraEO [2], a decentralized, open, and unified ecosystem in which users can find EO elements, compose cross-platform EO pipelines, and execute them efficiently. To support cross-platform federated analytics, AgoraEO relies on Apache Wayang [1] as its main analytical processing platform. Within AgoraEO, we are extending Apache Wayang with EO features, exposing the internals of BigEarthNet [2] to the Earth Observation community. Here we present our Wayang AgoraEO plugin, which follows the BigEarthNet workflow and delivers its benefits in a scalable and parameterizable (reusable) way. The plugin lets users create EO workflows on any EO platform in a simple way: through operators and an intuitive API that mirrors the behavior of the EO platforms it builds on. Each sub-task runs in isolation on whichever data processing system it requires, while its execution remains coordinated with the rest of the platform. In addition, users can fetch datasets from several independent sources. By design, Apache Wayang works as a declarative framework for ML: users specify ML tasks at a high level, using the most convenient API to write a workflow (Java/Scala, Python, and Postgres are supported). Wayang then models an ML task as a mathematical optimization problem and uses its gradient descent-based optimizer to select the appropriate physical algorithms and system configurations for executing that task, thereby decoupling the user's specification of an ML task from its execution. We believe the Wayang AgoraEO plugin can be a game changer for the tedious task of implementing and deploying EO workflows within today's EO platforms, because it makes resources easy to reuse and share.
Likewise, the plugin is easily extensible: new operators can be added to support new EO platforms and tasks. As a result, this solution can be a major step toward the democratization of EO technologies, contributing to their integration, scalability, and access to high-performance computing.
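
The decoupling of workflow specification from execution described above can be illustrated with a minimal Python sketch. It is not the Apache Wayang or AgoraEO API: every name in it (LogicalOperator, Platform, choose_platform, execute, the "local" and "cluster" platforms, and the cloud_cover field) is hypothetical, and the toy linear cost model merely stands in for Wayang's actual optimizer. The user declares what each step does; the executor decides where each step runs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# All names here are hypothetical; this is NOT the Wayang/AgoraEO API.
Record = Dict[str, object]


@dataclass
class LogicalOperator:
    """A platform-independent workflow step: what to do, not where."""
    name: str
    fn: Callable[[List[Record]], List[Record]]


@dataclass
class Platform:
    """Stand-in for an execution engine with an assumed linear cost model."""
    name: str
    startup_cost: float
    cost_per_record: float

    def estimated_cost(self, n_records: int) -> float:
        return self.startup_cost + self.cost_per_record * n_records

    def run(self, op: LogicalOperator, data: List[Record]) -> List[Record]:
        # A real engine would ship the operator to its own runtime;
        # here every "platform" simply calls the function in-process.
        return op.fn(data)


def choose_platform(platforms: List[Platform], n_records: int) -> Platform:
    """Pick the platform with the lowest estimated cost for this input size."""
    return min(platforms, key=lambda p: p.estimated_cost(n_records))


def execute(plan: List[LogicalOperator], data: List[Record],
            platforms: List[Platform]) -> Tuple[List[Record], List[Tuple[str, str]]]:
    """Run each operator on the platform chosen for its current input size."""
    trace = []
    for op in plan:
        platform = choose_platform(platforms, len(data))
        data = platform.run(op, data)
        trace.append((op.name, platform.name))
    return data, trace


PLATFORMS = [
    Platform("local", startup_cost=0.0, cost_per_record=1.0),
    Platform("cluster", startup_cost=100.0, cost_per_record=0.1),
]

# Toy EO-flavoured workflow: drop cloudy patches, then label the rest.
PLAN = [
    LogicalOperator("filter_cloudy",
                    lambda ps: [p for p in ps if p["cloud_cover"] < 0.2]),
    LogicalOperator("add_label",
                    lambda ps: [{**p, "label": "clear"} for p in ps]),
]

patches = [{"id": i, "cloud_cover": (i % 10) / 10} for i in range(500)]
result, trace = execute(PLAN, patches, PLATFORMS)
print(trace)
```

In this sketch the first step sees 500 records and is assigned to the hypothetical "cluster" platform, while the second step sees only the 100 surviving records and is assigned to "local". The workflow definition in PLAN never mentions either platform, which is the property the abstract attributes to the declarative approach.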

References

[1] S. Kruse, Z. Kaoudi, J.-A. Quiané-Ruiz, S. Chawla, F. Naumann, and B. Contreras-Rojas, "Optimizing Cross-Platform Data Movement," IEEE 35th International Conference on Data Engineering (ICDE), 2019, pp. 1642-1645.

[2] A. Wall, B. Deiseroth, E. Tzirita Zacharatou, J.-A. Quiané-Ruiz, B. Demir, and V. Markl, "AGORA-EO: A Unified Ecosystem for Earth Observation - A Vision For Boosting EO Data Literacy," Big Data from Space Conference, 2021.

How to cite: Pardo Meza, R., Quiané-Ruiz, J.-A., Demir, B., and Markl, V.: Wayang AgoraEO Plugin: The Framework for Scalable EO Workflows, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-17046, https://doi.org/10.5194/egusphere-egu23-17046, 2023.