EGU22-3095, updated on 22 Sep 2023
https://doi.org/10.5194/egusphere-egu22-3095
EGU General Assembly 2022
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Fluid simulations accelerated with 16 bits: Approaching 4x speedup on A64FX by squeezing ShallowWaters.jl into Float16

Milan Klöwer1, Samuel Hatfield2, Matteo Croci3, Peter D. Düben2, and Tim Palmer1
Milan Klöwer et al.
  • 1Atmospheric, Oceanic and Planetary Physics, University of Oxford, UK (milan.kloewer@physics.ox.ac.uk)
  • 2European Centre for Medium-Range Weather Forecasts, Reading, UK
  • 3Mathematical Institute, University of Oxford, UK

Most Earth-system simulations run on conventional CPUs in 64-bit double precision floating-point numbers Float64, although the need for high-precision calculations in the presence of large uncertainties has been questioned. Fugaku, currently the world’s fastest supercomputer, is based on A64FX microprocessors, which also support the 16-bit low-precision format Float16. We investigate the Float16 performance on A64FX with ShallowWaters.jl, the first fluid circulation model that runs entirely with 16-bit arithmetic. The model implements techniques that address precision and dynamic range issues in 16 bits. The precision-critical time integration is augmented to include compensated summation to minimise rounding errors. Such a compensated time integration is as precise but faster than mixed-precision with 16 and 32-bit floats. As subnormals are inefficiently supported on A64FX the very limited range available in Float16 is 6·10-5 to 65,504. We develop the analysis-number format Sherlogs.jl to log the arithmetic results during the simulation. The equations in ShallowWaters.jl are then systematically rescaled to fit into Float16, using 97% of the available representable numbers. Consequently, we benchmark speedups of up to 3.8x on A64FX with Float16. Adding a compensated time integration, speedups reach up to 3.6x. Although ShallowWaters.jl is simplified compared to large Earth-system models, it shares essential algorithms and therefore shows that 16-bit calculations are indeed a competitive way to accelerate Earth-system simulations on available hardware.

How to cite: Klöwer, M., Hatfield, S., Croci, M., Düben, P. D., and Palmer, T.: Fluid simulations accelerated with 16 bits: Approaching 4x speedup on A64FX by squeezing ShallowWaters.jl into Float16, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-3095, https://doi.org/10.5194/egusphere-egu22-3095, 2022.

Displays

Display file