EGU23-16817
https://doi.org/10.5194/egusphere-egu23-16817
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Comparison of CPU and GPU parallelization approaches between two programming languages in copepod model simulations

Varshani Brabaharan1, Sachithma Edirisinghe2, and Kanchana Bandara3
  • 1University of Ruhuna, Faculty of Fisheries and Marine Sciences and Technology, Department of Oceanography and Marine Geology, Matara, Sri Lanka (varshanibrabaharan@gmail.com)
  • 2University of Ruhuna, Faculty of Fisheries and Marine Sciences and Technology, Department of Oceanography and Marine Geology, Matara, Sri Lanka (sachithma99@gmail.com)
  • 3University of Tromsø, Department of Arctic and Marine Biology, Norway (info@kanchanabandara.com)

This study presents a comparative assessment of two high-performance computing languages, Java and FORTRAN, with respect to the computation vs. communication trade-off observed during a strategy-oriented copepod model simulation. We compared the computational time of (i) sequential, (ii) latency-oriented (CPU) and (iii) throughput-oriented (GPU) designs. CPU-based parallelization was carried out on a 4-core Intel i7 processor with a clock speed of 1.99 GHz. On this CPU, we implemented (i) a fork/join framework based on a work-stealing algorithm in Java and (ii) Open Multi-Processing (OpenMP), a directive-based application programming interface (API) with a shared-memory architecture, in FORTRAN 95. GPU processing power was leveraged using the CUDA framework in Java and the OpenACC API in FORTRAN on an NVIDIA GeForce MX230 with 256 unified pipelines.

The simulation time for sequential CPU execution was ca. 28% lower in FORTRAN than in Java (18 s vs. 25 s). Likewise, the FORTRAN execution time in the latency-oriented CPU design was ca. 23% lower than that of Java (10 s vs. 13 s). In the GPU approach with unified memory space accessibility, the Java computation consumed ca. 38% less time than FORTRAN (5 s vs. 8 s).

Unlike FORTRAN, Java is a purely object-oriented language, and object handling is not similarly optimized in the GNU FORTRAN compilers. Nevertheless, memory consumption in FORTRAN can be fine-tuned, thereby decreasing latency, unlike in Java. The OpenMP API is built on a self-consistent, shared-memory architecture; its "temporary view" of memory allows threads to cache variables, reducing latency by avoiding a memory access for each variable reference, unlike the fork/join framework in Java. Furthermore, OpenMP provides threadprivate memory, which allows efficient synchronization within the code.
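To illustrate the latency-oriented CPU design, the fork/join work-stealing pattern described above can be sketched as follows in Java. This is a minimal sketch, not the authors' implementation: the per-individual growth function, the threshold, and all names (`ForkJoinSketch`, `GrowthTask`, `grow`) are hypothetical placeholders standing in for the actual copepod model computation.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sketch of a fork/join divide-and-conquer over a population array.
// Idle worker threads steal forked subtasks from busy workers' deques.
public class ForkJoinSketch {

    static final int THRESHOLD = 1_000; // tune to the work per individual

    static class GrowthTask extends RecursiveTask<Double> {
        final double[] mass;
        final int lo, hi;

        GrowthTask(double[] mass, int lo, int hi) {
            this.mass = mass; this.lo = lo; this.hi = hi;
        }

        @Override
        protected Double compute() {
            if (hi - lo <= THRESHOLD) {          // small enough: compute directly
                double sum = 0.0;
                for (int i = lo; i < hi; i++) sum += grow(mass[i]);
                return sum;
            }
            int mid = (lo + hi) >>> 1;
            GrowthTask left = new GrowthTask(mass, lo, mid);
            left.fork();                         // left half becomes stealable work
            double right = new GrowthTask(mass, mid, hi).compute();
            return left.join() + right;          // join stolen-or-local result
        }
    }

    // Hypothetical placeholder for one individual's growth increment.
    static double grow(double m) {
        return m * 0.01;
    }

    static double totalGrowth(double[] mass) {
        return new ForkJoinPool().invoke(new GrowthTask(mass, 0, mass.length));
    }

    public static void main(String[] args) {
        double[] mass = new double[10_000];
        java.util.Arrays.fill(mass, 1.0);
        System.out.println(totalGrowth(mass)); // ~100.0
    }
}
```

Note that only one half is forked while the other is computed on the current thread; this avoids a scheduling round-trip per split and is the idiomatic way to keep the calling worker busy in the fork/join framework.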
OpenACC is designed as a high-level, platform-independent programming model for accelerators that offers a pragmatic route into GPU programming without much programming effort. Nevertheless, some uses of unified memory space accessibility on NVIDIA GPUs are better expressed in CUDA, despite OpenACC having a cache directive. It is therefore best to investigate the performance of different accelerator models and different programming languages in light of the simulation needs and efficiency targets of the model.

Keywords: FORTRAN, Java, OpenMP, OpenACC, high-performance computing, copepods, modelling

How to cite: Brabaharan, V., Edirisinghe, S., and Bandara, K.: Comparison of CPU and GPU parallelization approaches between two programming languages in copepod model simulations, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-16817, https://doi.org/10.5194/egusphere-egu23-16817, 2023.

Supplementary materials
