# An efficient Bayesian method for inverting physical models on massive planetary data

^{1}UGA - Université Grenoble Alpes, LJK - Laboratoire Jean Kuntzmann - INRIA, Grenoble, France (benoit.kugler@inria.fr)

^{2}Univ. Grenoble Alpes, CNRS, IPAG, Grenoble, France (sylvain.doute@univ-grenoble-alpes.fr)

### Introduction

When studying the Martian surface, the composition of the materials is established by spectral detection, unmixing, and physical modelling of images produced by hyperspectral cameras. Information on the microtexture of surface materials, such as grain size, shape, roughness and internal structure, can also be used as a tracer of geological processes (Fernando et al. PSS 2016). This information is accessible under certain conditions thanks to hyperspectral image sequences acquired over a site of interest from eleven different angles by the Compact Reconnaissance Imaging Spectrometer for Mars (CRISM@MRO, Murchie et al. JGR-Planets 2009). Similar multi-angular observations can also be made in the laboratory with spectro-photo-goniometers, on planetary analog materials or real extraterrestrial matter (Potin et al. Icarus 2019). In both cases, the surface Bidirectional Reflectance Distribution Factor extracted from these observations is interpreted in terms of composition and microtexture by inverting the Hapke model. This semi-empirical photometric model relates physically meaningful parameters **x** to the reflectivity **y** of a granular material for a given geometry of illumination and viewing (Schmidt and Fernando Icarus 2015).

This work presents an efficient method, based on a learning approach in a Bayesian framework, able to invert the Hapke model on large experimental or remote sensing datasets, as illustrated on a challenging CRISM multi-angular sequence of images.

### 1 Inversion characteristics

In the CRISM case, the dataset provides both spatial and spectral dimensions. More specifically, each of the N_{spatial} spatial points of the scene is observed at N_{spectral} wavelengths, yielding N_{spatial} × N_{spectral} vectors of reflectance values. This large number of observations to be inverted usually makes approaches such as Markov chain Monte Carlo (MCMC) simulations unacceptably slow. The approach we propose starts with a computationally intensive learning phase that builds a statistical surrogate of the Hapke model; this surrogate is then applied to all observables **y**_{obs}.

Each observable is a vector of D reflectances, one for each measurement geometry. For CRISM data, D = 11 is moderate, but in the laboratory D may be one order of magnitude higher. The statistical model we propose can handle such high-dimensional settings while remaining computationally tractable.

The Bayesian framework offers a natural way to propagate measurement uncertainties, which are formalized as a variance on the prior distribution, while the variance of the posterior distribution quantifies the uncertainty induced by the inversion.

The complexity of the forward radiative transfer model often makes our inverse problem ill-posed. In practice, multiple solutions might be acceptable. Rather than a single point estimate, our statistical model provides a full posterior density, whose multi-modality can be assessed and interpreted.

### 2 Bayesian inversion via a learning approach

From a statistical point of view, we formalize our model as a pair of random variables (X,Y), where Y = F(X) + ε. X has a prior distribution on the space of physical parameters, F is the forward model and ε is a centered Gaussian noise accounting for measurement uncertainties. The main idea is to use a two-step approach: first, approximate this model by a parametric surrogate model; second, use the surrogate model to invert the observations.
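The generative model Y = F(X) + ε can be sketched as a sampler that produces the (X,Y) pairs used later for learning. This is a minimal illustration under stated assumptions: `toy_forward` is a hypothetical two-parameter, three-geometry stand-in and not the Hapke model, and the uniform prior on [0, 1]^2 and the noise level are arbitrary choices made only for the example.

```python
import numpy as np

def toy_forward(x):
    """Hypothetical nonlinear forward model F: R^2 -> R^3 (NOT the Hapke model)."""
    x = np.atleast_2d(x)
    return np.column_stack([
        x[:, 0] * np.cos(x[:, 1]),
        x[:, 0] ** 2,
        np.sin(x[:, 0] + x[:, 1]),
    ])

def sample_dictionary(n_samples, noise_std, rng):
    """Draw (x_n, y_n) pairs with X ~ U([0, 1]^2) (assumed prior) and Y = F(X) + eps."""
    x = rng.uniform(0.0, 1.0, size=(n_samples, 2))           # samples from the prior
    eps = rng.normal(0.0, noise_std, size=(n_samples, 3))    # centered Gaussian noise
    y = toy_forward(x) + eps
    return x, y

rng = np.random.default_rng(0)
X_train, Y_train = sample_dictionary(10_000, noise_std=0.01, rng=rng)
print(X_train.shape, Y_train.shape)
```

Because the dictionary is simulated rather than observed, its size is a free design choice that trades learning time against surrogate accuracy.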

We chose the family of so-called Gaussian Locally-Linear Mapping (GLLiM) models (Deleforge et al. StatComp 2015). Such models are expressive enough to approximate highly non-linear, complex forward models, while remaining tractable.
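For reference, the GLLiM family is usually written with a latent component index Z taking K values. The symbols below (K, π_k, **c**_k, **Γ**_k, **A**_k, **b**_k, **Σ**_k) follow the usual presentation of GLLiM and are not defined in this abstract, so they should be read as assumed notation:

```latex
\begin{aligned}
p(Z = k) &= \pi_k, \qquad k = 1, \dots, K, \\
p(\mathbf{x} \mid Z = k) &= \mathcal{N}(\mathbf{x};\, \mathbf{c}_k, \boldsymbol{\Gamma}_k), \\
p(\mathbf{y} \mid \mathbf{x}, Z = k) &= \mathcal{N}(\mathbf{y};\, \mathbf{A}_k \mathbf{x} + \mathbf{b}_k, \boldsymbol{\Sigma}_k).
\end{aligned}
```

Within each component the map from **x** to **y** is affine; the mixture over k is what lets the surrogate follow a non-linear F.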

The learning step is performed by generating a training dictionary composed of samples from (X,Y), on which the likelihood is maximized by a standard Expectation-Maximization (EM) algorithm. Regarding the dimensionality of the problem, the number of parameters is proportional to D, making this model suitable even for high-dimensional observables. Moreover, the computational cost of the learning phase does not scale with the number of observations to be inverted.
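The linear growth in D can be checked by counting free parameters. The sketch below assumes a GLLiM with K components, L physical parameters, full prior covariances and diagonal noise covariances. These structural choices, and the values K = 50 and L = 4, are illustrative assumptions and are not taken from this abstract.

```python
def gllim_param_count(K, L, D, diagonal_noise=True):
    """Count free parameters of a K-component GLLiM (assumed parameterization)."""
    weights = K - 1                    # mixture weights sum to one
    prior_means = K * L                # one mean of the parameters per component
    prior_covs = K * L * (L + 1) // 2  # full symmetric L x L covariances
    slopes = K * D * L                 # D x L matrices of the affine maps
    intercepts = K * D                 # offsets of the affine maps
    # Diagonal noise keeps the count linear in D; full noise makes it quadratic.
    noise = K * D if diagonal_noise else K * D * (D + 1) // 2
    return weights + prior_means + prior_covs + slopes + intercepts + noise

# Illustrative values: D = 11 as for CRISM, D = 110 for a laboratory-like setting.
for D in (11, 110):
    print(D, gllim_param_count(K=50, L=4, D=D))
```

Under these assumptions each extra geometry adds K(L + 2) parameters, whereas an unconstrained noise covariance would add a term quadratic in D.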

The second step is performed by computing the posterior distribution of the random variable X given **y**_{obs}. The inverse of the surrogate GLLiM has an explicit formulation in the form of a Gaussian mixture. From this posterior density, the mean and variance are straightforward to compute. To handle a multi-solution scenario, the mixture can be further exploited, for instance by examining its individual components.
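The explicit inverse can be sketched with standard Gaussian conditioning applied to each component. The code assumes a GLLiM given by weights `pi`, prior means `c`, prior covariances `Gamma`, affine maps `A` and `b`, and noise covariances `Sigma`. These names are illustrative, and the function is a minimal sketch rather than the authors' implementation.

```python
import numpy as np

def gaussian_logpdf(y, mean, cov):
    """Log-density of N(mean, cov) evaluated at the vector y."""
    d = y - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(y) * np.log(2.0 * np.pi) + logdet + maha)

def gllim_posterior(y, pi, c, Gamma, A, b, Sigma):
    """Weights, means and covariances of the Gaussian mixture p(x | y)."""
    log_w, means, covs = [], [], []
    for k in range(len(pi)):
        # Marginal of y under component k: N(A_k c_k + b_k, Sigma_k + A_k Gamma_k A_k^T)
        y_mean = A[k] @ c[k] + b[k]
        y_cov = Sigma[k] + A[k] @ Gamma[k] @ A[k].T
        log_w.append(np.log(pi[k]) + gaussian_logpdf(y, y_mean, y_cov))
        # Conditional of x given y under component k
        Sigma_inv_A = np.linalg.solve(Sigma[k], A[k])            # Sigma_k^{-1} A_k
        S = np.linalg.inv(np.linalg.inv(Gamma[k]) + A[k].T @ Sigma_inv_A)
        m = S @ (Sigma_inv_A.T @ (y - b[k]) + np.linalg.solve(Gamma[k], c[k]))
        means.append(m)
        covs.append(S)
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())                              # stable normalization
    return w / w.sum(), np.array(means), np.array(covs)

def mixture_moments(w, means, covs):
    """Mean and covariance of a Gaussian mixture."""
    mean = w @ means
    second = np.einsum("k,kij->ij", w, covs + np.einsum("ki,kj->kij", means, means))
    return mean, second - np.outer(mean, mean)

# Toy bimodal case (L = D = 1): two components with opposite slopes explain y equally well.
pi = np.array([0.5, 0.5])
c = np.array([[1.0], [-1.0]])
Gamma = np.array([[[0.1]], [[0.1]]])
A = np.array([[[1.0]], [[-1.0]]])
b = np.array([[0.0], [0.0]])
Sigma = np.array([[[0.01]], [[0.01]]])

w, means, covs = gllim_posterior(np.array([1.0]), pi, c, Gamma, A, b, Sigma)
post_mean, post_cov = mixture_moments(w, means, covs)
print(w, means.ravel(), post_mean, post_cov)
```

In this toy case the two component means sit at opposite values of x with equal weights, so the posterior mean falls between them and matches neither solution. This is the situation in which the individual mixture components are more informative than the mean.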

### 3 Massive inversion of spectro-images

The dataset comes from a multi-angular observation of the South Pole of Mars by CRISM. The targeted scene presents spatially segregated CO2 ice, H2O frost, and mineral dust (Douté and Pilorget EPSC 2017). After fusion and atmospheric correction of the eleven hyperspectral images (Ceamanos et al. JGR-Planets 2013), the dataset provides both spatial and spectral dimensions, totaling N_{obs} = 3093 × 50 = 154650 measurement vectors **y**_{obs}, which makes an MCMC approach unacceptably slow. Maps of Hapke's parameter values are generated from the results of our inversion and superposed with transparency onto the full-resolution CRISM nadir image, which serves as a geological control background image.

The results are satisfactory from the application point of view. The color composite of Figure 1 reflects the variation of the single-scattering albedo ω at three wavelengths and corresponds well with the spatial distribution of the three materials mentioned above and their known spectral optical properties. The map of the roughness parameter θ averaged over the spectral dimension (Figure 2) is color-coded by intervals of values whose spatial variations are correlated with the composition and the structures of the terrains. In general, we find that all predicted parameters preserve some spatial regularity and show meaningful correlations with the composition and the geology.

### Conclusion

We propose an efficient statistical learning method for inverting physical models on massive remote sensing or experimental data. In the case of the Hapke model, the method produces satisfactory results for CRISM multi-angular acquisitions, as illustrated in this abstract, and for laboratory spectrophotometric measurements, as illustrated in Potin et al., this conference.

**How to cite:**
Kugler, B., Forbes, F., and Douté, S.: An efficient Bayesian method for inverting physical models on massive planetary data, Europlanet Science Congress 2020, online, 21 September–9 Oct 2020, EPSC2020-335, https://doi.org/10.5194/epsc2020-335, 2020