EGU General Assembly 2020
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.

Convective and stratiform precipitation: A PCA-based clustering algorithm for their identification

Antonio Francipane1, Gianluca Sottile2, Giada Adelfio2, and Leonardo V. Noto1
  • 1Università degli Studi di Palermo, Dipartimento di Ingegneria, Palermo, Italy
  • 2Università degli Studi di Palermo, Dipartimento di Scienze Economiche, Aziendali e Statistiche, Palermo, Italy

The increasing occurrence of flood events in some areas of the Southern Mediterranean (e.g., Sicily) over the last few years has raised the importance of characterizing such events and identifying their causes. Since most of these events can be related to high-intensity rainfall, which in turn is usually convective in origin, it is important to understand which factors drive such extreme events. However, distinguishing between convective and stratiform rainfall remains an open problem that is not easy to solve.
In this regard, starting from precipitation time series recorded at different rain gauge stations in Sicily, the largest Mediterranean island, we propose an algorithm capable of classifying precipitation into its convective and stratiform components.
To this end, a dataset from the regional agency SIAS (Servizio Informativo Agrometeorologico Siciliano, the Agro-meteorological Information Service of Sicily) has been used because of its high temporal resolution, quality, and availability of up-to-date data. Specifically, data from rain gauge stations spread over the entire island have been collected for the period 2003-2018 at a temporal resolution of 10 minutes.
To classify precipitation into convective and stratiform components, a functional PCA-based clustering approach (denoted FPCAC) has been applied; it can be considered a variant of the k-means algorithm based on the principal component rotation of the data. Finally, to evaluate the validity of the proposed algorithm, the results have been compared with some ERA5 reanalysis products.
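
The general idea of clustering curves in a principal component space can be illustrated with a minimal Python sketch. This is not the FPCAC algorithm itself: it uses ordinary PCA followed by plain k-means, and it assumes NumPy is available. The synthetic "peaked" and "flat" event profiles, the function names `pca_scores` and `kmeans`, and the farthest-first initialization are all assumptions made for illustration.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project the rows of X onto the leading principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(Z, k, n_iter=100):
    """Plain k-means with a deterministic farthest-first initialization."""
    centers = [Z[0]]
    while len(centers) < k:
        # Squared distance of every point to its nearest chosen center.
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic event profiles (hypothetical, mm per 10-minute step):
# 20 short, intense bursts followed by 20 steady, low-intensity events.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 12)
peaked = 10.0 * np.exp(-((t - 0.5) / 0.1) ** 2)
flat = np.ones_like(t)
X = np.vstack([
    peaked + rng.normal(0.0, 0.1, size=(20, t.size)),
    flat + rng.normal(0.0, 0.1, size=(20, t.size)),
])

Z = pca_scores(X, n_components=2)   # reduce each curve to 2 scores
labels, centers = kmeans(Z, k=2)    # cluster the curves in score space
print(labels)
```

In this toy setting the two groups are well separated, so the clustering recovers them; real event curves would need to be brought to a common representation first, and the functional approach used by the authors handles that differently.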

How to cite: Francipane, A., Sottile, G., Adelfio, G., and Noto, L. V.: Convective and stratiform precipitation: A PCA-based clustering algorithm for their identification, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-18518, 2020

Comments on the presentation

AC: Author Comment | CC: Community Comment

Presentation version 1 – uploaded on 30 Apr 2020
  • CC1: Number of Classes, Swinda Falkena, 04 May 2020

    Hi Antonio,

    Following up on my question during the chat session. You are considering only light and heavy rainfall events, but said you are thinking of introducing more classes. I assume that the number of classes has to be set a priori (please correct me if this is not true). Are you looking into ways to find the "best" number of classes? You mentioned 3 (or 4), but I'm wondering how you would know whether any of these best represents the data.

    I appreciate this is quite a technical question. I'll look a bit more into the method you use as it sounds interesting.

    Best wishes,


    • AC1: Reply to CC1, Antonio Francipane, 04 May 2020

      Hi Swinda,

      Thank you for the question. The statistical tool that we are going to use can be set to use a fixed or variable number of classes. That means that we could set the model to use just 2 classes (e.g., convective and stratiform), 3 classes (e.g., convective, stratiform, and mixed/unresolved), 4 classes (e.g., convective, stratiform, mixed, and unresolved), or let the tool decide the most suitable number of classes to better separate different kinds of events on the basis of some of their statistical characteristics.

      Applying the methodology in the second mode (variable number of classes) to the 6 rain gauge stations of the study (consider that we are going to apply the methodology to about 100 stations over the entire island) revealed that the most suitable number of classes to discriminate events in the dataset is 4. That makes sense to us, since we know that some rainfall events, because of their structure, can be considered a mix of stratiform and convective (e.g., stratiform structures with more convective events within them).

      By the way, we do not classify any event a priori as convective or stratiform; we just extract the events from our dataset following some simple rules (e.g., two rainfall depths separated by less than 10 minutes belong to the same event; to be considered as such, an event must have a rainfall depth of at least 1 mm).
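
As a rough illustration of the two extraction rules just mentioned (not the authors' code), a minimal Python sketch follows. It assumes each record is a `(timestamp, depth_mm)` pair at a 10-minute step with the timestamp marking the end of the interval, and it reads "separated by less than 10 minutes" as the dry gap between two wet intervals, so a single dry record already splits two events. The function name `extract_events` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta

def extract_events(records,
                   step=timedelta(minutes=10),
                   max_dry_gap=timedelta(minutes=10),
                   min_depth_mm=1.0):
    """Group wet records into events and drop events below min_depth_mm.

    records: list of (timestamp, depth_mm) pairs, sorted by time.
    Assumption: two wet records belong to the same event when the dry
    time between them is strictly less than max_dry_gap.
    """
    events, current = [], []
    for ts, depth in records:
        if depth <= 0:
            continue  # dry interval: only wet records form events
        if current:
            dry_gap = ts - current[-1][0] - step
            if dry_gap >= max_dry_gap:
                events.append(current)
                current = []
        current.append((ts, depth))
    if current:
        events.append(current)
    # Keep only events whose total depth reaches the threshold.
    return [e for e in events if sum(d for _, d in e) >= min_depth_mm]

# Hypothetical 10-minute series (mm per step).
t0 = datetime(2018, 1, 1, 0, 0)
depths = [0.4, 0.8, 0.0, 0.2, 0.0, 0.0, 0.6, 0.0, 3.0, 2.2]
records = [(t0 + i * timedelta(minutes=10), d) for i, d in enumerate(depths)]

events = extract_events(records)
totals = [round(sum(d for _, d in e), 1) for e in events]
print(totals)  # -> [1.2, 5.2]
```

The isolated 0.2 mm and 0.6 mm records form events of their own under this reading and are discarded by the 1 mm threshold; a looser definition of the gap would merge them with their neighbours.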

      Hope this answers your question.

      All the best,



      • CC2: Reply to AC1, Swinda Falkena, 04 May 2020

        Hi Antonio,

        Thanks! That explains it, so the algorithm has a way to determine the best number of classes. Do you know what criterion it uses for that?

        Best wishes,


        • CC3: Reply to CC2, Giada Adelfio, 04 May 2020

          Hi, thanks. It uses a criterion that minimizes the distances among the curves. Please see the paper:

          1. Sottile, G. and Adelfio, G. (2018). Clusters of effects curves in quantile regression models. Computational Statistics, 1-19. doi:10.1007/s00180-018-0817-8