# Comparing methods for gap filling in historical snow depth time series

¹WSL Institute for Snow and Avalanche Research SLF, Davos, Switzerland
²Federal Office of Meteorology and Climatology MeteoSwiss, Zurich, Switzerland

Switzerland has a unique dataset of long-term manual daily snow depth time series, reaching back more than 100 years for some stations. This makes the dataset well suited for climatological analysis. However, these manual snow depth series sometimes contain shorter (weeks, months) or longer (years) gaps, which hinder a sound climatological analysis and reasonable conclusions. Therefore, we examine different methods for filling data gaps in daily snow depth series. We focus on longer gaps and use different methods of spatial interpolation, temperature index models and machine learning approaches to fill the data gaps. We assess the performance of the different methods by creating synthetic data gaps, and we relate the applicability of the methods to the density of the available neighboring stations as well as the elevation and climatic setting of the target station.


## Comments on the display material

AC: Author Comment | CC: Community Comment

**Display material version 2** – uploaded on 11 May 2020

**AC1: Comment on EGU2020-17211**, Johannes Aschauer, 11 May 2020

Dear Michael,

I am not able to comment on version 1 any more, so I will answer your questions here on version 2. I hope you'll get notified anyway.

1. Indeed, the use of a vertical limit for spatial interpolations made a huge difference as you can see in the updated presentation. The selection of neighboring stations based on correlation as you proposed might even increase the skill. I can do another version where I also use the vertical limits for the regression methods and select stations for the spatial interpolation methods by correlation. I think this could add some more value.

For which periods do you want to fill the gaps in your dataset? Keep in mind that the study presented here uses a dense station network (period 2000-2018), so our results are probably biased towards the spatial interpolation/regression methods. If you go back in time, I guess things will look quite different.

2. I left the scoring the same, since I wanted the different versions to be comparable.

3. Yes, probably somewhat similar. But besides the strong elevation dependence, there are more factors that make HN more complicated than precipitation, for example wind drift effects or the exact timing of a measurement.

Best regards,

Johannes

**CC1: Reply to AC1**, Michael Matiu, 11 May 2020

Dear Johannes,

Thanks for the update! I'm glad that it helped. And it also feels reassuring for our own study ;)

I saw that you have a very dense network. Our time span extends further into the past (back to 1960 and earlier), where the network is sparser. The accuracy will then depend on how many neighboring stations are required, and on their quality. It's a point we will need to consider, thanks for the suggestion.

Also thanks for the other answers; it was nice to have this discussion.

Best,

Michael

**Display material version 1** – uploaded on 04 May 2020

**CC1: Comment on EGU2020-17211**, Michael Matiu, 05 May 2020

Dear Johannes,

Nice work, and quite useful when it comes to these gappy HS measurements...

I have some questions:

1. For the regression-based approaches you selected the best-correlated neighbours, but for the distance weighting methods you did not preselect stations? For the neighbouring stations, did you consider horizontal and vertical limits, and if so, which limits did you use?

2. I had not heard of the MAAPE before, but it seems to have some odd effects at very low HS for the regression-based approaches. It seems to inflate low errors: I guess the interpolated values are not zero (but probably close?) while the true series is zero. Of course the temperature index models can better simulate zero-snow periods.

3. Any ideas on how results might look for HN?

Best, Michael

**AC1: Reply to CC1**, Johannes Aschauer, 06 May 2020

Dear Michael,

Thanks for your comment and your interest in our study. Please find my answers to your questions below.

1. Yes, we preselect stations for the distance weighting approaches: we use the 20 closest stations within a horizontal radius of 100 km (if there are fewer than 20 stations in that radius, the number of considered stations decreases accordingly). However, we did not introduce a vertical limit. This is indeed a good point and should be implemented. I could post an update next week if you are interested.

For the regression methods, there is neither a horizontal nor a vertical limit for predictor stations. We only look at correlations and ignore location and altitude.
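The preselection described above (20 closest stations within 100 km) combined with inverse-distance weighting can be sketched as follows. This is an illustrative sketch, not the study's actual implementation: function and parameter names are assumptions, and the optional vertical limit (the ±300 m threshold raised later in the thread) is included only to show where such a filter would go.

```python
import numpy as np

def idw_fill(target_xy, station_xy, station_hs, n_max=20, radius=100e3,
             power=2, target_z=None, station_z=None, dz_max=300.0):
    """Inverse-distance-weighted snow depth estimate at a gap station.

    Preselection: the n_max closest stations within `radius` metres.
    If elevations are given, stations outside +-dz_max metres of the
    target elevation are additionally excluded (vertical limit).
    """
    station_xy = np.asarray(station_xy, float)
    station_hs = np.asarray(station_hs, float)
    dists = np.linalg.norm(station_xy - np.asarray(target_xy, float), axis=1)
    ok = dists <= radius
    if target_z is not None and station_z is not None:
        ok &= np.abs(np.asarray(station_z, float) - target_z) <= dz_max
    idx = np.where(ok)[0]
    idx = idx[np.argsort(dists[idx])][:n_max]  # keep the n_max closest
    if idx.size == 0:
        return float("nan")
    w = 1.0 / np.maximum(dists[idx], 1.0) ** power  # guard zero distance
    return float(np.sum(w * station_hs[idx]) / np.sum(w))
```

With `power=2` the estimate is a squared-inverse-distance weighted mean; other weighting schemes only change how `w` is computed.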

2. As we wanted to compare stations with different amounts of snow, we looked for a scaled error metric. The popular scaled errors (MAPE, MPE) have problems when either series contains zeros, which is why we used the MAAPE. You are right: the MAAPE inflates if the measured HS is zero but the model still predicts some snow.

Do you have any suggestions on how we could better measure model accuracy on time series that often contain zeros? Definitely, a first step is to ignore all values that are zero in both time series.
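The MAAPE replaces the unbounded percentage error with its arctangent, so days with zero observed snow but nonzero prediction contribute π/2 instead of infinity. A minimal sketch follows; treating days that are zero in both series as zero error is an assumption matching the suggestion above, not necessarily how the study computes it.

```python
import numpy as np

def maape(obs, pred):
    """Mean arctangent absolute percentage error.

    Bounded in [0, pi/2]: arctan(|error| / |obs|) stays finite even
    when the observed value is zero, since arctan(inf) = pi/2.
    """
    obs = np.asarray(obs, float)
    pred = np.asarray(pred, float)
    with np.errstate(divide="ignore", invalid="ignore"):
        ape = np.abs((obs - pred) / obs)
    # 0/0 days (both series zero) are a perfect match -> zero error
    ape = np.where((obs == 0) & (pred == 0), 0.0, ape)
    return float(np.mean(np.arctan(ape)))
```

For example, an observed series `[0, 10]` against a prediction `[2, 10]` scores π/4: the zero-snow day saturates at π/2 and the exact day contributes 0.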

As a side note: the regression models can predict negative snow values, so we set negative predictions to zero.

3. I guess HN is a different story, but we have not yet looked at it. The spatial variability is thought to be much higher for HN, which potentially makes it hard to use the spatial interpolation or regression methods. New snow density can also change within short timescales during a storm, and it will be difficult to estimate it from daily mean temperature and precipitation alone. That is why we decided to use a constant "average" new snow density in our SWE2HS density model. However, I think a constant new snow density does not make sense if you want to model HN accurately.

Best,

Johannes

**CC2: Reply to AC1**, Michael Matiu, 06 May 2020

Dear Johannes,

Thanks for your replies.

Yes, we are actually quite interested in this topic. We have collected many HS (and HN) series for the whole Alpine arc (you can ask Christoph for more info, since he is also involved), and we also wanted to do some gap filling. We planned to use a mixture of your approaches (selecting the best-correlated nearby stations and then taking some weighted mean). Your results are very promising. And since you filled complete seasons with good results, it might even make sense to fill series that only cover the ski season rather than the full snow season (and thus have a lot missing at the start and end).

1. Yes, I think this would be very valuable. I guess that using fewer stations, but at a more comparable elevation, would give better results; something like the ±300 m elevation threshold Gernot Resch used. If you test that, I would be very interested in the results and an update.

2. I do not have a good suggestion for that one. I guess you wanted a relative measure in addition to the absolute one (RMSE), and with relative measures you always get problems with small denominators. The easiest thing would be, as you suggested, to remove the zero values, or maybe better all low values below some threshold like 5 or 10 cm, because the problems occur not only at 0 but also at values close to 0. Anyway, from a practical point of view, I would not care about large relative errors if the absolute errors are in the range of a few centimetres.
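The thresholding idea amounts to masking low-snow days before evaluating any relative metric. A minimal sketch, where the function name and the 5 cm default are illustrative assumptions taken from the suggestion above:

```python
import numpy as np

def masked_score(obs, pred, metric, hs_min=5.0):
    """Evaluate `metric` only where observed snow depth >= hs_min (cm),
    dropping the near-zero days that inflate relative error measures."""
    obs = np.asarray(obs, float)
    pred = np.asarray(pred, float)
    keep = obs >= hs_min
    if not keep.any():
        return float("nan")  # no days above threshold to score
    return metric(obs[keep], pred[keep])
```

Any metric taking two arrays works, e.g. `masked_score(obs, pred, maape_or_mae_fn)`; raising `hs_min` to 10 cm trades sample size for robustness against small denominators.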

3. Makes sense. What is your experience: would you say HN could be thought of as similar to precipitation? With an even stronger elevation dependence, of course...

Best,

Michael