EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Finding regions of similar sea level variability with the help of a Gaussian Mixture Model

Lea Poropat, Céline Heuzé, and Heather Reese
  • University of Gothenburg, Department of Earth Sciences, Göteborg, Sweden

In climate research we often want to focus on a specific region and the most prominent processes affecting it, but how exactly do we select the borders of that region? We also often need long-term in situ observations to represent a larger area, but which area exactly are they representative of? In ocean sciences we usually treat basins as separate regions or, even simpler, just select a rectangle of ocean, but neither choice always corresponds to the real, physically relevant borders. As an alternative, we use an unsupervised classification model, the Gaussian Mixture Model (GMM), to separate the northwestern European seas into regions based on the sea level variability observed by altimetry satellites.

After performing a principal component (PC) analysis on 24 years of monthly sea level data, we use the stacked PC maps as input for the GMM. Because the GMM requires the number of classes to be selected a priori, we use the Bayesian Information Criterion to determine how many regions our area should be split into. Depending on the number of PCs used, the optimal number of classes was between 12 and 18, with more PCs typically allowing separation into more regions. Due to the complexity of the data and the dependence of the results on the randomly chosen initial weights, the classification can differ to a degree with every new run of the model, even with exactly the same data and parameters. To address this, we use an ensemble of models instead of a single one and determine which class each grid point belongs to by soft voting: each model provides a probability that the point belongs to a particular class, and the class with the largest sum of probabilities wins. As a result, we obtain both the classification and the likelihood of each grid point belonging to its assigned class.
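The workflow described above (PC analysis, selection of the number of classes by the Bayesian Information Criterion, and soft voting over an ensemble of GMMs) could be sketched roughly as follows with scikit-learn. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names, the standardization of the PC maps, the default ensemble size, and in particular the matching of class labels between ensemble members (done here with the Hungarian algorithm on the overlap of class probabilities) are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def pc_features(sea_level, n_pcs):
    """Turn a (n_months, n_points) sea level array into (n_points, n_pcs) features."""
    pca = PCA(n_components=n_pcs).fit(sea_level)
    # components_ has shape (n_pcs, n_points): each row is one spatial PC map.
    features = pca.components_.T
    # Assumption: standardize each PC map so the GMM sees comparable scales.
    return features / features.std(axis=0)


def pick_n_classes(features, candidates, seed=0):
    """Return the candidate number of classes with the lowest BIC."""
    bics = [
        GaussianMixture(n_components=k, random_state=seed).fit(features).bic(features)
        for k in candidates
    ]
    return candidates[int(np.argmin(bics))]


def soft_vote(features, n_classes, n_models=5):
    """Classify grid points by summing class probabilities over several GMMs."""
    reference = None
    summed = None
    for seed in range(n_models):
        gmm = GaussianMixture(n_components=n_classes, random_state=seed)
        proba = gmm.fit(features).predict_proba(features)  # (n_points, n_classes)
        if reference is None:
            reference = proba
            summed = proba.copy()
            continue
        # Class numbering is arbitrary in each run, so labels must be matched
        # before summing. Assumption: match to the first model by maximizing
        # the overlap between class probabilities (Hungarian algorithm).
        overlap = reference.T @ proba
        _, cols = linear_sum_assignment(-overlap)
        summed += proba[:, cols]
    mean_proba = summed / n_models
    # Winning class and its ensemble-mean probability for every grid point.
    return mean_proba.argmax(axis=1), mean_proba.max(axis=1)
```

Given a land-masked array of monthly sea level anomalies with one column per ocean grid point, `pc_features` would provide the input, `pick_n_classes` the number of classes, and `soft_vote` the class map together with a per-point confidence that can be reshaped back onto the grid.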

Despite not using the coordinates of the data points in the model at all, the obtained classes are clearly location dependent, with grid points belonging to the same class always being close to each other. While many classes are defined by bathymetry changes, e.g., the continental shelf break and slope, sometimes other factors come into play, as in the split of the Norwegian coast into two classes or the division in the Barents Sea, which is probably based on the circulation. The North Sea is also split into three distinct regions, possibly based on sea level changes caused by dominant wind patterns.

This method can be applied to almost any atmospheric or oceanic variable and used for larger or smaller areas. It is quick and practical, allowing us to delimit an area based on information that is not always clearly visible in the data, which can facilitate better selection of the regions that need further research.

How to cite: Poropat, L., Heuzé, C., and Reese, H.: Finding regions of similar sea level variability with the help of a Gaussian Mixture Model, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-753, 2023.