EGU22-10006
https://doi.org/10.5194/egusphere-egu22-10006
EGU General Assembly 2022
© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.

Scaling and performance assessment of TSMP under CPU-only and CPU-GPU configurations

Daniel Caviedes-Voullième1,2, Jörg Benke1, Ghazal Tashakor1, Stefan Poll1,2, and Ilya Zhukov1
  • 1Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
  • 2Institute of Bio- and Geosciences: Agrosphere, Forschungszentrum Jülich, Germany

Multiphysics Earth system models are potentially good candidates for progressively porting modules to accelerator hardware. These models typically have an inherently modular design to cope with the variety of numerical formulations and computational implementations required for the range of physical processes they represent. Progressively porting modules or submodels to accelerators such as GPUs implies that the models must run on heterogeneous hardware. Exascale systems are expected to rely on heterogeneous hardware, so exploring such heterogeneous configurations early is both important and challenging.

The Terrestrial Systems Modelling Platform (TSMP) is a scale-consistent, highly modular, massively parallel, fully integrated soil-vegetation-atmosphere modelling system. Currently, TSMP is based on the COSMO atmospheric model, the CLM land surface model, and the ParFlow hydrological model, linked together by means of the OASIS3-MCT library.

Recently, ParFlow was ported to GPUs, which makes it possible to run TSMP in a heterogeneous configuration, that is, with COSMO and CLM running on CPUs and ParFlow running on GPUs. The different computational demands of each submodel inherently result in non-trivial load balancing across the submodels. This has been addressed by studying the performance and scaling properties of the system for specific problems of interest. The new heterogeneous configuration prompts a re-assessment of load balancing, performance and scaling, in order to identify optimal computational resource configurations and to re-evaluate the bottlenecks and inefficiencies that the heterogeneous model system may have.
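As an illustration of why load balancing across submodels is non-trivial, the following minimal Python sketch estimates how much of each coupling step a component spends waiting. It assumes that the components run concurrently on disjoint resources and synchronize at every coupling step, so that the slowest component sets the pace. The function name and the timings are hypothetical and are not measurements from TSMP.

```python
def coupling_idle_fractions(step_times):
    """Return the fraction of a coupling step each component spends idle.

    step_times maps a component name to its compute time per coupling
    step (seconds). Assumption: components run concurrently and wait
    for the slowest one before exchanging data.
    """
    slowest = max(step_times.values())
    return {name: 1.0 - t / slowest for name, t in step_times.items()}


# Hypothetical per-step compute times in seconds (illustrative only).
example = {"COSMO": 4.0, "CLM": 1.0, "ParFlow": 2.0}
for name, idle in coupling_idle_fractions(example).items():
    print(f"{name}: idle {idle:.0%} of each coupling step")
```

Under these assumptions, moving one component to faster hardware only shortens the coupled step if that component was the slowest one; otherwise it merely increases that component's idle time, which is why resource allocations need to be re-tuned after porting.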

In this contribution, we present first results on the performance and scaling assessment of the heterogeneous TSMP, compared to its performance under homogeneous (CPU-only) configurations. We study strong and weak scaling for different problem sizes, and evaluate parallel efficiency and power consumption for homogeneous and heterogeneous jobs on the JUWELS supercomputer and on the experimental DEEP-Cluster, both at the Jülich Supercomputing Centre. Additionally, we explore profiles and traces of selected cases, for both homogeneous and heterogeneous runs, to identify MPI communication bottlenecks and root causes of load-balancing issues.
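For reference, the scaling metrics mentioned above are commonly defined as in the short Python sketch below. These are the standard textbook definitions, not necessarily the exact formulas used in this study, and all numbers in the usage lines are hypothetical.

```python
def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Parallel efficiency for a fixed total problem size.

    t_ref: runtime on n_ref resources (reference run)
    t_n:   runtime on n resources
    Ideal scaling gives 1.0.
    """
    return (t_ref * n_ref) / (t_n * n)


def weak_scaling_efficiency(t_ref, t_n):
    """Parallel efficiency for a fixed problem size per resource.

    Ideal scaling keeps the runtime constant, giving 1.0.
    """
    return t_ref / t_n


def energy_to_solution(mean_power_watts, runtime_seconds):
    """Energy in joules, assuming a constant mean power draw."""
    return mean_power_watts * runtime_seconds


# Hypothetical timings, for illustration only.
print(strong_scaling_efficiency(t_ref=100.0, n_ref=1, t_n=30.0, n=4))
print(weak_scaling_efficiency(t_ref=100.0, t_n=125.0))
print(energy_to_solution(mean_power_watts=500.0, runtime_seconds=30.0))
```

Comparing CPU-only and CPU-GPU runs with such metrics requires choosing a common resource unit (for example, nodes), since a GPU and a CPU core are not directly comparable.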

How to cite: Caviedes-Voullième, D., Benke, J., Tashakor, G., Poll, S., and Zhukov, I.: Scaling and performance assessment of TSMP under CPU-only and CPU-GPU configurations, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-10006, https://doi.org/10.5194/egusphere-egu22-10006, 2022.