ITS1.5/GI1.5 | Intelligent systems for Earth, Environmental and Planetary Sciences (Methods, Models and Applications)

EDI
Co-organized by ESSI2/SM2
Convener: Silvio Gumiere (ECS) | Co-conveners: Hossein Bonakdari (ECS), Paul Celicourt (ECS)
Orals | Mon, 24 Apr, 10:45–12:20 (CEST) | Room 0.94/95
Posters on site | Attendance Mon, 24 Apr, 14:00–15:45 (CEST) | Hall X4
Posters virtual | Attendance Mon, 24 Apr, 14:00–15:45 (CEST) | vHall ESSI/GI/NP
Need for smart solutions in Earth, environmental and planetary sciences: Tackling data challenges and incorporating applied Earth and planetary sciences into artificial intelligence (AI) models has opened a new avenue for creating comprehensive methodologies and strategies to answer a wide variety of theoretical and practical questions, from detecting, modelling, interpreting and predicting changes in the Earth’s ecosystems in response to climate change to understanding interactions among the ocean, atmosphere, and land in the climate system. AI and Data Science (DS) are therefore among the fastest growing areas in Earth, environmental and planetary sciences. The performance of AI/DS models improves as they gain experience during training, and various mathematical and statistical measures need to be investigated to determine that performance. Once the learning process is complete, the model can be used to make inferences, classify data and be evaluated on test data. This session aims to make available to the world community of Earth, environmental and planetary science professionals a collection of scientific papers on the current state of the art and recent developments of AI and DS applications in the field. It brings together recent research on applying AI/DS techniques to address the engineering, social, political, economic, safety, health, and technological challenges and opportunities of Earth, environmental and planetary sciences.
The purpose of this session is to improve and facilitate the application of intelligent systems in the Earth, environmental and planetary sciences, and to highlight new insights for building comprehensive methodologies for analysis, processing, prediction and management in fundamental and applied science problems through the decision-making abilities of artificial intelligence and machine learning techniques.

Orals: Mon, 24 Apr | Room 0.94/95

Chairpersons: Silvio Gumiere, Hossein Bonakdari, Paul Celicourt
10:45–10:50
10:50–11:00 | EGU23-1183 | ITS1.5/GI1.5 | ECS | Highlight | On-site presentation
Solomiia Kurchaba, Jasper van Vliet, Fons J. Verbeek, and Cor J. Veenman

Starting in 2021, the International Maritime Organization (IMO) introduced more demanding NOx emission restrictions for ships operating in the waters of the North and Baltic Seas. All methods currently used for ship compliance monitoring are costly and time-consuming. It is therefore important to prioritize the inspection of ships that have a high chance of being non-compliant.

The TROPOMI/S5P instrument allows, for the first time, NO2 plumes from individual ships to be distinguished. Here, we present a method for the selection of potentially non-compliant ships using automated machine learning (AutoML) on TROPOMI/S5P satellite data. The study is based on the analysis of 20 months of data in the Mediterranean Sea region. We assign each ship a Region of Interest (RoI) in which we expect the ship's plume to be located. We then train a regression model to predict the amount of NO2 that is expected to be produced by a ship with specific properties operating in the given atmospheric conditions. We use a genetic algorithm-based AutoML for the automatic selection and configuration of a machine-learning pipeline that maximizes prediction accuracy. The difference between the predicted and actual amount of produced NO2 is a measure of inspection worthiness. We rank the analyzed ships accordingly.

We cross-check the obtained ranks using a previously developed method for supervised ship plume segmentation. We quantify the amount of NO2 produced by a given ship by summing up concentrations within the pixels identified as a “plume”. We rank the ships based on the difference between the obtained concentrations and the ship emission proxy.

Ships that are also ranked as highly deviating by the segmentation method need further attention, for example by checking their data for other explanations. If no other explanations are found, these ships are recommended as candidates for fuel inspection.
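
The regression-based ranking described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the ship identifiers and NO2 values are invented, the predicted values stand in for the output of whatever model the AutoML search selects, and the sign convention (observed minus predicted, so that excess emission ranks first) is an assumption.

```python
from typing import NamedTuple


class ShipObservation(NamedTuple):
    ship_id: str          # hypothetical identifier
    observed_no2: float   # NO2 attributed to the ship's RoI (arbitrary units)
    predicted_no2: float  # NO2 expected by the regression model (same units)


def rank_by_inspection_worthiness(ships):
    """Sort ships by observed-minus-predicted NO2, largest excess first."""
    return sorted(
        ships,
        key=lambda s: s.observed_no2 - s.predicted_no2,
        reverse=True,
    )


# Illustrative values only; they are not taken from the study.
ships = [
    ShipObservation("ship_A", observed_no2=5.0, predicted_no2=4.5),
    ShipObservation("ship_B", observed_no2=9.0, predicted_no2=4.0),
    ShipObservation("ship_C", observed_no2=3.0, predicted_no2=3.5),
]

ranking = [s.ship_id for s in rank_by_inspection_worthiness(ships)]
print(ranking)
```

A ship near the top of this list emits more than the model expects for its properties and conditions, which is what would make it a candidate for the cross-check against the segmentation-based ranking.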

How to cite: Kurchaba, S., van Vliet, J., Verbeek, F. J., and Veenman, C. J.: Detection of anomalous NO2 emitting ships using AutoML on TROPOMI satellite data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1183, https://doi.org/10.5194/egusphere-egu23-1183, 2023.

11:00–11:10 | EGU23-1583 | ITS1.5/GI1.5 | ECS | On-site presentation
Yasmin Mbarki and Silvio José Gumiere

Compaction of agricultural soil negatively affects its hydraulic properties, leading to water erosion and other negative effects on the quality of the environment. This study focused on the effect of compaction on soil hydrodynamic properties under unsaturated and saturated conditions using the Hydraulic Property Analyzer (HYPROP) system. We studied the impact of five levels of compaction on loamy sand soils collected in a potato field in northern Québec, Canada. Soil samples were collected and artificially compacted by increasing the bulk density by 0% (C0), 30% (C30), 40% (C40), 50% (C50), and 70% (C70). First, the saturated hydraulic conductivity of each column was measured using the constant-head method. Soil water retention curve (SWRC) dry-end data and unsaturated hydraulic conductivities were obtained with the HYPROP evaporation measurement system and the WP4-T Dew Point PotentioMeter (METER Group, Munich, Germany). Second, soil microporosity was imaged with micro-CT, and the measured pore-size distribution was used to visualize and quantify soil pore structures. The imaged soil microporosity was related to the saturated hydraulic conductivity, air permeability, porosity and tortuosity measured on the same samples. Our results supported the application of the Peters–Durner–Iden (PDI) variant of the bimodal unconstrained van Genuchten model (VGm-b-PDI) for complete SWRC estimation, based on the root mean square error (RMSE). The unsaturated hydraulic conductivity matched the PDI variant of the unconstrained van Genuchten model (VGm-PDI) well. Finally, the preliminary results indicated that soil compaction could strongly influence the hydraulic properties of soil in different ways. The saturated conductivity decreased with increasing soil compaction, and the unsaturated hydraulic conductivity changed very rapidly with the ratio of water to soil.
Overall, the HYPROP methodology performed extremely well in characterizing the hydraulic behavior of compacted soils.
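
To make the RMSE-based model comparison concrete, the sketch below evaluates a water retention curve and scores it against measurements. It deliberately uses the classic unimodal, constrained van Genuchten form (m = 1 - 1/n), which is simpler than the bimodal unconstrained PDI variants named in the abstract. All parameter values and the "measurements" are hypothetical.

```python
import math


def van_genuchten_theta(h, theta_r, theta_s, alpha, n):
    """Water content at suction head h (> 0) for the classic constrained
    van Genuchten model, with m = 1 - 1/n."""
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * h) ** n) ** m


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical parameters and suction heads (cm); not fitted to the study's data.
params = dict(theta_r=0.05, theta_s=0.40, alpha=0.03, n=1.8)
heads = [1.0, 10.0, 100.0, 1000.0]
modelled = [van_genuchten_theta(h, **params) for h in heads]

# Stand-in "measurements": the modelled curve plus a constant 0.01 offset.
measured = [theta + 0.01 for theta in modelled]
print(round(rmse(measured, modelled), 4))
```

In a comparison like the one reported, each candidate model would be fitted to the same retention data and the one with the lowest RMSE preferred.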

How to cite: Mbarki, Y. and Gumiere, S. J.: Study of the effect of compaction on the hydrodynamic properties of a loamy sand soil for precision agriculture, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1583, https://doi.org/10.5194/egusphere-egu23-1583, 2023.

11:10–11:20 | EGU23-2388 | ITS1.5/GI1.5 | ECS | On-site presentation
Harriet Dawson and Cédric John

Identification of constituent grains in carbonate rocks is primarily a qualitative skill requiring specialist experience. A carbonate sedimentologist must be able to distinguish between various grains of different ages, preserved in differing alteration stages, and cut in random orientations across core sections. Recent studies have demonstrated the effectiveness of machine learning in classifying lithofacies from thin section, core and seismic images, with faster analysis times and reduction of natural biases. In this study, we explore the application and limitations of convolutional neural network (CNN) based object detection frameworks to identify and quantify multiple types of carbonate grains within close-up core images. Nearly 400 high-resolution images of carbonate cores were compiled from three ODP and IODP expeditions. Over 9,000 individual carbonate components of 11 different classes were manually labelled from this dataset. Using transfer learning, we evaluate one-stage (YOLO v3) and two-stage (Faster R-CNN) detectors under different feature extractors (Darknet and Inception-ResNet-v2). Despite the current popularity of one-stage detectors, our results show that Faster R-CNN with an Inception-ResNet-v2 backbone provides the most robust performance, achieving nearly 0.8 mean average precision (mAP). Furthermore, we extend the approach by deploying the trained model to ODP Leg 194 Sites 1196 and 1190, developing a performance comparison with human interpretation.
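
Mean average precision rests on deciding whether each predicted box matches a labelled one, usually through intersection over union (IoU). The sketch below shows that building block only. The 0.5 threshold is a common convention assumed here (the abstract does not state one), and the class name in the example is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero so disjoint boxes give no overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_label, gt_box, gt_label, threshold=0.5):
    """A detection counts as correct if the class matches and IoU reaches the threshold."""
    return pred_label == gt_label and iou(pred_box, gt_box) >= threshold


ground_truth = (0.0, 0.0, 10.0, 10.0)
close_prediction = (2.0, 0.0, 12.0, 10.0)   # overlaps most of the labelled grain
loose_prediction = (5.0, 0.0, 15.0, 10.0)   # overlaps only half of it

print(is_true_positive(close_prediction, "coral", ground_truth, "coral"))
print(is_true_positive(loose_prediction, "coral", ground_truth, "coral"))
```

Average precision for one class is then computed from the ranked list of true and false positives, and mAP averages that value over all classes.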

How to cite: Dawson, H. and John, C.: Deep learning based identification of carbonate rock components in core images, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2388, https://doi.org/10.5194/egusphere-egu23-2388, 2023.

11:20–11:30 | EGU23-3997 | ITS1.5/GI1.5 | ECS | On-site presentation
Elham Koohi, Silvio Jose Gumiere, and Hossein Bonakdari

Water use in agricultural crops can be managed by irrigation scheduling based on plant water stress thresholds. Automated irrigation scheduling limits crop physiological damage and yield reduction. Knowledge of crop water stress monitoring approaches can help optimize the use of agricultural water. Understanding the physiological mechanisms by which crops respond and adapt to water deficit supports sustainable agricultural management and food supply. This can be achieved by analyzing stomatal conductance, growth rate, leaf water potential, and stem water potential. Calculating thresholds of soil matric potential and available water content improves the precision of irrigation management by preventing water limitations between irrigations. Crop monitoring and irrigation management draw on geospatial technologies, the internet of things, big data analysis, and artificial intelligence to support informed decisions. Remote sensing (RS) can be applied whenever in situ data are not available. High-resolution crop mapping extracts information through index-based methods fed by the multitemporal and multi-sensor data used in detection and classification. Precision Agriculture (PA) means applying farm inputs in the right amount, at the right time, and in the right place. RS in PA captures images at different spatial and spectral resolutions through in-field, satellite, and aerial sensors, including unmanned aerial vehicles (UAVs), as well as handheld or tractor-mounted sensors. RS sensors receive the electromagnetic signals of plant responses in different spectral domains. Optical satellite data, including narrow-band multispectral remote sensing techniques and thermal imagery, are used for water stress detection. For processing and analyzing RS data, cloud storage and computing platforms simplify the complex mathematics of incorporating various datasets for irrigation scheduling.
Machine learning (ML) algorithms construct models for regression and classification in multivariate, non-linear crop mapping problems. Web-based software that brings together all of these datasets offers a reliable product to strengthen farmers’ ability to make appropriate decisions in irrigating agricultural crops.

Keywords: Agricultural crops; Crop water stress detection; Irrigation scheduling; Precision agriculture; Remote Sensing.

How to cite: Koohi, E., Gumiere, S. J., and Bonakdari, H.: Artificial Intelligence Models for Detecting Spatiotemporal Crop Water Stress in schedule Irrigation: A review, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3997, https://doi.org/10.5194/egusphere-egu23-3997, 2023.

11:30–11:40 | EGU23-6696 | ITS1.5/GI1.5 | ECS | Highlight | On-site presentation
Gyula Mate Kovács, Stefan Oehmcke, Stéphanie Horion, Dimitri Gominski, Xiaoye Tong, and Rasmus Fensholt

Wetlands provide invaluable services for ecosystems and society and are a crucial instrument in our fight against climate change. Although Earth Observation satellites offer cost-effective and accurate information about wetland status at the continental scale, there is to date no universally accepted, standardized, and regularly updated inventory of European wetlands at a resolution finer than 100 m. Moreover, previous satellite-based global land cover products seldom account for wetland diversity, which often impairs their mapping performance. Here, we mapped major wetland types (i.e., peatland, marshland, and coastal wetlands) across Europe for 2018, based on high-resolution (10 m) optical and radar time series satellite data as well as field-collected land cover information (LUCAS), using an ensemble model combining traditional machine learning and deep learning approaches. Our results show with high accuracy (>85%) that a substantial extent of European peatlands was previously classified as grassland and other land cover types. In addition, our map highlights cultivated areas (e.g., river floodplains) that could potentially be rewetted. Such accurate and consistent mapping of different wetland types at a continental scale offers a baseline for future wetland monitoring and trend assessment, supports the detailed reporting of European carbon budgets, and lays the foundation for a global wetland inventory.
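
The abstract does not say how the machine learning and deep learning outputs are combined. One common option is soft voting, in which the per-class probabilities of the two models are averaged. The sketch below illustrates that option only; the class list, the probabilities and the equal weighting are all assumptions.

```python
CLASSES = ["peatland", "marshland", "coastal wetland", "other"]


def soft_vote(prob_ml, prob_dl, weight_ml=0.5):
    """Weighted average of two per-class probability vectors.

    Returns the winning class label and the averaged probabilities.
    """
    if len(prob_ml) != len(CLASSES) or len(prob_dl) != len(CLASSES):
        raise ValueError("each probability vector needs one entry per class")
    averaged = [
        weight_ml * p_ml + (1.0 - weight_ml) * p_dl
        for p_ml, p_dl in zip(prob_ml, prob_dl)
    ]
    best = max(range(len(CLASSES)), key=lambda i: averaged[i])
    return CLASSES[best], averaged


# Invented probabilities for a single pixel: the two models disagree.
prob_ml = [0.2, 0.1, 0.1, 0.6]   # traditional model leans towards "other"
prob_dl = [0.7, 0.1, 0.1, 0.1]   # deep model leans towards "peatland"

label, averaged = soft_vote(prob_ml, prob_dl)
print(label, [round(p, 2) for p in averaged])
```

Averaging lets a confident model outweigh an uncertain one, which is one reason ensembles can correct systematic confusions such as peatland being labelled as grassland.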

How to cite: Kovács, G. M., Oehmcke, S., Horion, S., Gominski, D., Tong, X., and Fensholt, R.: Satellite-based continental-scale inventory of European wetland types at 10m spatial resolution, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6696, https://doi.org/10.5194/egusphere-egu23-6696, 2023.

11:40–11:50 | EGU23-8409 | ITS1.5/GI1.5 | ECS | On-site presentation
Federica Zennaro, Elisa Furlan, Donata Melaku Canu, Leslie Aveytua Alcazar, Ginevra Rosati, Sinem Aslan, Cosimo Solidoro, and Andrea Critto

Lagoons are highly valued coastal environments providing unique ecosystem services. However, they are fragile and vulnerable to natural processes and anthropogenic activities. Concurrently, climate change pressures are likely to lead to severe ecological impacts on lagoon ecosystems. Among these, direct effects occur mainly through changes in temperature and associated physico-chemical alterations, whereas indirect ones, mediated through processes such as extreme weather events in the catchment, include the alteration of nutrient loading patterns, among others, which can in turn modify trophic states, leading to depletion or to eutrophication. This phenomenon can lead, under certain circumstances, to harmful algal bloom events, anoxia, and mortality of aquatic flora and fauna, or to a reduction in primary production, with cascading effects on the whole trophic web and dramatic consequences for aquaculture, fishery, and recreational activities. The complexity of eutrophication processes, characterized by compounding and interconnected pressures, highlights the importance of adequately sophisticated methods to estimate future ecological impacts on fragile lagoon environments. In this context, a novel framework combining Machine Learning (ML) and biogeochemical models is proposed, leveraging the potential offered by both approaches to unravel and model environmental systems characterized by compounding pressures. Multi-Layer Perceptron (MLP) and Random Forest (RF) models are used (trained, validated, and tested) within the Venice Lagoon case study to assimilate historical heterogeneous water quality (WQ) data (i.e., water temperature, salinity, and dissolved oxygen) and spatio-temporal information (i.e., monitoring station location and month), and to predict changes in chlorophyll-a (Chl-a) conditions.
Then, projections from the biogeochemical model SHYFEM-BFM for the 2049 and 2099 timeframes under RCP 8.5 are integrated to evaluate Chl-a variations under future biogeochemical conditions forced by climate change projections. Annual and seasonal Chl-a predictions were carried out by class, based on two classification modes established from descriptive statistics computed on baseline data: i) binary classification of Chl-a values under and over the median value, ii) multi-class classification defined by Chl-a quartiles. Results from the case study showed that the RF successfully classifies Chl-a under the baseline scenario, with an overall model accuracy of about 80% for the median classification mode and 61% for the quartile classification mode. Overall, a decreasing trend for the lowest Chl-a values (below the first quartile, i.e. 0.85 µg/l) can be observed, with an opposite, rising trend for the highest Chl-a values (above the fourth quartile, i.e. 2.78 µg/l). At the seasonal level, summer remains the season with the highest Chl-a values in all scenarios, although in 2099 a strong increase in Chl-a is also expected in spring. The proposed novel framework represents a valuable approach to strengthen both eutrophication modelling and scenario analysis, by placing artificial intelligence-based models alongside biogeochemical models.
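
The two classification modes can be illustrated with the Python standard library. This is a sketch under assumptions, not the authors' preprocessing: the Chl-a values are invented, ties with a cut point are assigned to the lower class, and the "inclusive" quantile method is one of several reasonable choices.

```python
import statistics


def binary_labels(values):
    """Label 0 for values at or below the baseline median, 1 for values above it."""
    med = statistics.median(values)
    return [int(v > med) for v in values]


def quartile_labels(values):
    """Label 0 to 3 according to the baseline quartile each value falls in."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    # Counting how many cut points a value exceeds gives its class index.
    return [sum(v > cut for cut in (q1, q2, q3)) for v in values]


# Invented baseline Chl-a concentrations in µg/l.
chl_a = [0.5, 0.8, 1.2, 1.6, 2.0, 2.9, 3.5, 4.1]

print(binary_labels(chl_a))
print(quartile_labels(chl_a))
```

These labels would serve as the targets for a classifier trained on the water quality and spatio-temporal predictors. With four balanced classes the task is harder than the binary one, which is consistent with the lower accuracy reported for the quartile mode.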

How to cite: Zennaro, F., Furlan, E., Melaku Canu, D., Aveytua Alcazar, L., Rosati, G., Aslan, S., Solidoro, C., and Critto, A.: Evaluation of lagoon eutrophication potential under climate change conditions: A novel water quality machine learning and biogeochemical-based framework., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8409, https://doi.org/10.5194/egusphere-egu23-8409, 2023.

11:50–12:00 | EGU23-8702 | ITS1.5/GI1.5 | ECS | Highlight | On-site presentation
Angelica Bianconi, Elisa Furlan, Christian Simeoni, Vuong Pham, Sebastiano Vascon, Andrea Critto, and Antonio Marcomini

Marine coastal ecosystems (MCEs) are of vital importance for human health and well-being. However, their ecological condition is increasingly threatened by multiple risks induced by the complex interplay between endogenic (e.g. coastal development, shipping traffic) and exogenic (e.g. changes in sea surface temperature, waves, sea level, etc.) pressures. Assessing cumulative impacts resulting from this dynamic interplay is a major challenge in achieving Sustainable Development Goals and biodiversity targets, as well as in driving ecosystem-based management in marine coastal areas. To this end, a Machine Learning model (i.e. Random Forest - RF), integrating heterogeneous data on multiple pressures and ecosystems’ health and biodiversity, was developed to support the evaluation of risk scenarios affecting seagrass condition and services capacity within the Mediterranean Sea. The RF model was trained, validated and tested using data collected from different open-source data platforms (e.g. Copernicus Services) for the 2017 baseline. Moreover, based on the designed RF model, future scenario analysis was performed by integrating projections from numerical climate models for sea surface temperature and salinity for the 2050 and 2100 timeframes. Under the baseline scenario, the model achieved an overall accuracy of about 82%. Overall, the results of the analysis showed that the ecological condition and services capacity of seagrass meadows (i.e. spatial distribution, Shannon index, carbon sequestration) are mainly threatened by human-related pressures linked to coastal development (e.g. distance from main urban centres), as well as by changes in nutrient concentration and sea surface temperature. This result also emerged from the scenario analysis, which highlighted a decrease in seagrass coverage and related services capacity in both the 2050 and 2100 timeframes.
The developed model provides useful predictive insight on possible future ecosystem conditions in response to multiple pressures, supporting marine managers and planners towards more effective ecosystem-based adaptation and management measures in MCEs.

How to cite: Bianconi, A., Furlan, E., Simeoni, C., Pham, V., Vascon, S., Critto, A., and Marcomini, A.: Evaluating the risk of cumulative impacts in the Mediterranean Sea using a Random Forest model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8702, https://doi.org/10.5194/egusphere-egu23-8702, 2023.

12:00–12:10 | EGU23-10681 | ITS1.5/GI1.5 | Virtual presentation
Dharmen Punjani, Eleni Tsalapati, and Manolis Koubarakis

The standard way for earth observation experts or users to retrieve images from image archives (e.g., ESA's Copernicus Open Access Hub) is to use a graphical user interface, where they can select the geographical area of the image they are interested in and additionally they can specify some other metadata, such as sensing period, satellite platform and cloud cover.

In this work, we are developing the question-answering engine EarthQA, which takes as input a question expressed in natural language (English) that asks for satellite images satisfying certain criteria and returns links to such datasets, which can then be downloaded from the CREODIAS cloud platform. To answer user questions, EarthQA queries two interlinked knowledge graphs: a knowledge graph encoding metadata of satellite images from the CREODIAS cloud platform (the SPARQL endpoint of CREODIAS) and the well-known knowledge graph DBpedia. Hence, the questions can refer to image metadata (e.g., satellite platform, sensing period, cloud cover), but also to more generic entities appearing in the DBpedia knowledge graph (e.g., lake, Greece). In this way, users can ask questions like “Find all Sentinel-1 GRD images taken during October 2021 that show large lakes in Greece having an area greater than 100 square kilometers”.

EarthQA follows a template-based approach to translate natural language questions into formal queries (SPARQL). Initially, it decomposes the user question by generating its dependency parse tree and then automatically disambiguates the components appearing in the question to elements of the two knowledge graphs. In particular, it automatically identifies the spatial or temporal entities (e.g., “Greece”, “October 2021”), concepts (e.g., “lake”), spatial or temporal relations (e.g., “in”, “during”), properties (e.g., “area”), product types (e.g., “Sentinel-1 GRD”) and other metadata (e.g., “cloud cover below 10%”) mentioned in the question and maps them to the respective elements appearing in the two knowledge graphs (dbr:Greece, dbo:Lake, dbp:area, etc.). After this, the SPARQL query is automatically generated.
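
The final, template-filling step can be sketched as follows. This is a hypothetical illustration rather than EarthQA's actual template or the CREODIAS vocabulary. The `eo:` prefix and its predicates are invented placeholders, the slot values are assumed to come from the earlier parsing and disambiguation stages, and unit handling for the area filter is ignored.

```python
from string import Template

# Hypothetical query template. The dbo:/dbr:/dbp: prefixes are DBpedia's;
# the eo: namespace and its predicates are placeholders for image metadata.
QUERY_TEMPLATE = Template("""\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX eo:  <http://example.org/eo#>

SELECT ?image WHERE {
  ?feature a $concept ;
           dbo:country $country ;
           $property ?value .
  FILTER(?value > $min_value)
  ?image eo:productType "$product_type" ;
         eo:depicts ?feature ;
         eo:sensingDate ?date .
  FILTER(?date >= "$start"^^xsd:date && ?date <= "$end"^^xsd:date)
}
""")

# Slots as they might be produced for the example question in the abstract.
slots = {
    "concept": "dbo:Lake",
    "country": "dbr:Greece",
    "property": "dbp:area",
    "min_value": "100",
    "product_type": "Sentinel-1 GRD",
    "start": "2021-10-01",
    "end": "2021-10-31",
}

# substitute() raises KeyError if any slot is missing, so an incomplete
# parse fails loudly instead of producing a malformed query.
query = QUERY_TEMPLATE.substitute(slots)
print(query)
```

Keeping the query shape fixed and varying only the slots is what makes a template-based approach predictable: the hard part is the disambiguation that fills the slots, not the query generation itself.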

How to cite: Punjani, D., Tsalapati, E., and Koubarakis, M.: EarthQA: A Question Answering Engine for Earth Observation Data Archives, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10681, https://doi.org/10.5194/egusphere-egu23-10681, 2023.

12:10–12:20 | EGU23-14519 | ITS1.5/GI1.5 | Highlight | Virtual presentation
Maël Plantec and Fabien Castel

As human activities continue to expand and evolve, their impact on the planet is becoming more evident. In recent years, Murmuration has been studying one of the most recent and destructive trends to have taken off: mass tourism. In Malta, tourism has been on the rise since before the Covid-19 pandemic. Now that travel restrictions are beginning to lift, this growth is likely to resume in the coming years. While Malta’s economy is mostly based on tourism, it is essential that this activity does not degrade the areas in which it takes place. To address these issues and ensure sustainable development, governments and organizations have developed a set of guidelines called the Sustainable Development Goals (SDGs). The SDGs are a set of 17 goals adopted by the United Nations in 2015 to provide a framework to help countries pursue sustainable economic, social and environmental development. They include objectives for mitigating climate change, preventing water pollution and degradation of biodiversity, as well as providing economic benefits to local communities.

To help territories like the islands of Malta cope with these environmental issues, Murmuration carries out studies on various ecological, human and economic indicators. Using the Sentinel satellites of the European Copernicus programme for Earth imagery makes it possible to collect geolocated, hourly values of air quality indicators such as NO2, CO and other pollutants, as well as of water quality and vegetation health. Other data sources give access to land cover values at metre resolution, tourism infrastructure locations and many more human activity variables. This information is processed into understandable indicators and aggregated indexes, which take international standards and the SDGs into account in their design and usage. One example of these standards is the WHO air quality guidelines, which provide thresholds quantifying the health impact of air pollution in the area of interest. The last step is to gather all the data, maps and correlations computed and to design understandable visualizations that make them usable by territorial management bodies, enabling efficient decision making and risk management. The goal here is to link satellite imagery, internationally agreed political commitments and ground-level decision-making.
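
A threshold-based indicator of the kind described could look like the sketch below. Several things here are assumptions: the guideline levels are quoted from memory of the 2021 WHO global air quality guidelines and should be verified before use, the category bins are invented, and the input is assumed to be a surface concentration in µg/m³ (satellite retrievals give column densities, so a conversion step would be needed first).

```python
# NO2 guideline levels in µg/m³. Quoted from memory of the WHO 2021
# guidelines; treat them as placeholders to be checked against the source.
WHO_NO2_GUIDELINES = {"annual_mean": 10.0, "daily_mean": 25.0}


def exceedance_ratio(concentration, guideline):
    """Ratio of a measured concentration to its guideline level (1.0 = at the limit)."""
    if guideline <= 0:
        raise ValueError("guideline must be positive")
    return concentration / guideline


def classify(ratio):
    """Map an exceedance ratio to a dashboard category (hypothetical bins)."""
    if ratio <= 1.0:
        return "within guideline"
    if ratio <= 2.0:
        return "above guideline"
    return "well above guideline"


# Invented annual mean NO2 concentrations for three areas of interest.
annual_means = {"area_1": 8.0, "area_2": 18.0, "area_3": 30.0}

indicator = {
    area: classify(exceedance_ratio(value, WHO_NO2_GUIDELINES["annual_mean"]))
    for area, value in annual_means.items()
}
print(indicator)
```

Expressing each pollutant as a ratio to its guideline puts different indicators on a common scale, which makes them easier to aggregate into a single index or colour scale on a dashboard.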

This meaningful aggregation takes the form of operational dashboards. A dashboard is an up-to-date, interactive, evolving online tool hosting temporally and geographically linked visualizations of various indicators. This kind of tool allows for a better understanding of the dynamics of a territory in terms of environmental state, human impact and ecological potential.

How to cite: Plantec, M. and Castel, F.: From satellite data and Sustainable Development Goals to interactive tools and better territorial decision making, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14519, https://doi.org/10.5194/egusphere-egu23-14519, 2023.

Posters on site: Mon, 24 Apr, 14:00–15:45 | Hall X4

Chairpersons: Silvio Gumiere, Hossein Bonakdari, Paul Celicourt
X4.157 | EGU23-244 | ITS1.5/GI1.5 | ECS
Dongyu Zheng, Zhisong Cao, Li Hou, Chao Ma, and Mingcai Hou

As deep learning (DL) gathers remarkable attention for its capacity to achieve accurate predictions in various fields, numerous applications of DL in the geosciences have also emerged. Most studies focus on achieving high accuracy through model selection and hyperparameter tuning. However, the interpretability of DL models, which can be loosely defined as comprehending what a model did, is also important but comparatively less discussed. To this end, we select thin section photomicrographs of five types of sedimentary rocks: quartz arenite, feldspathic arenite, lithic arenite, dolomite, and oolitic packstone. The distinguishing features of these rocks are their characteristic framework grains. For example, the oolitic packstone contains rounded or oval ooids. A regular classification model using ResNet-50 is trained on these photomicrographs and would be assumed to be accurate because its accuracy reaches 0.97. However, this regular DL model makes its classifications based on the cracks, cements, or even scale bars in the photomicrographs, and these features cannot distinguish sedimentary rocks in practice. To rectify the model’s focus, we propose an attention-based dual network incorporating the photomicrographs' global features (the whole photomicrographs) and local features (the distinguishing framework grains). The proposed model not only has high accuracy (0.99) but also presents interpretable feature extractions. Our study indicates that high accuracy should not be the only metric of DL models; interpretability and models incorporating geological information require more attention.

How to cite: Zheng, D., Cao, Z., Hou, L., Ma, C., and Hou, M.: High accuracy doesn’t prove that a deep learning model is accurate: a case study from automatic rock classification of thin section photomicrographs, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-244, https://doi.org/10.5194/egusphere-egu23-244, 2023.

X4.158 | EGU23-1902 | ITS1.5/GI1.5
Omiros Giannakis, Iason Demiros, Konstantinos Koutroumbas, Athanasios Rontogiannis, Vassilis Antonopoulos, Guido De Marchi, Christophe Arviset, George Balasis, Athanasios Daglis, George Vasalos, Zoe Boutsi, Jan Tauber, Marcos Lopez-Caniego, Mark Kidger, Arnaud Masson, and Philippe Escoubet

Scientific publications in space science contain valuable and extensive information regarding the links and relationships between the data interpreted by the authors and the associated observational elements (e.g., instrument or experiment names, observing times, etc.). In this reality of scientific information overload, researchers are often overwhelmed by an enormous and continuously growing number of articles to access in their daily activities. The exploration of recent advances concerning specific topics, methods and techniques, the review and evaluation of research proposals, and in general any action that requires a cautious and comprehensive assessment of scientific literature has turned into an extremely complex and time-consuming task.

The availability of Natural Language Processing (NLP) tools able to extract information from unstructured scientific text and to turn it into highly organized and interconnected knowledge is fundamental to the use of scientific information. Exploiting the knowledge that exists in scientific publications requires state-of-the-art NLP. The semantic interpretation of scientific texts can support the development of a varied set of applications such as information retrieval from the texts, linking to existing knowledge repositories, topic classification, semi-automatic assessment of publications and research proposals, tracking of scientific and technological advances, scientific intelligence-assisted reporting, review writing, and question answering.

The main objectives of TACTICIAN are to introduce Artificial Intelligence (AI) techniques to the textual analysis of the publications of all ESA Space Science missions, to monitor and evaluate the scientific productivity of the science missions, and to integrate the scientific publications’ metadata into the ESA Space Science Archive. Through TACTICIAN, we extract lexical, syntactic, and semantic information from the scientific publications by applying NLP and Machine Learning (ML) algorithms and techniques. Utilizing the wealth of publications, we have created valuable scientific language resources, such as labeled datasets and word embeddings, which were used to train Deep Learning models that assist us in most of the language understanding tasks. In the context of TACTICIAN, we have devised methodologies and developed algorithms that can assign scientific publications to the Mars Express, Herschel, and Cluster ESA science missions and identify selected named entities and observations in these scientific publications. We also introduced a new unsupervised ML technique, based on Nonnegative Matrix Factorization (NMF), for classifying the Planck mission scientific publications into categories according to their use of the Planck data products.
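
For readers unfamiliar with NMF, the sketch below shows the standard Lee-Seung multiplicative-update algorithm on a toy document-term matrix. It is not the new technique the abstract refers to, whose details are not given here. The matrix, the rank and the rule of assigning each document to its strongest factor are all illustrative assumptions.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorise non-negative V (docs x terms) as W H with multiplicative updates."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_docs)]
    return w, h


# Toy term counts: documents 0-1 use one vocabulary, documents 2-3 another.
v = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 4, 1], [0, 0, 1, 4]]
w, h = nmf(v, rank=2)

# Assign each document to the factor with the largest weight in its row of W.
categories = [max(range(2), key=lambda k: row[k]) for row in w]
print(categories, round(frobenius_error(v, w, h), 3))
```

Because both factors are constrained to be non-negative, each row of H reads as a weighted list of terms and each row of W as a document's membership in those term groups, which is what makes the factors usable as categories without labelled data.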

These methodologies can be applied to any other mission. The combination of NLP and ML constitutes a general basis that has proved able to assist in establishing links between the missions’ observations and the scientific publications, and in classifying the publications into categories, with high accuracy.

This work has received funding from the European Space Agency under the "ArTificiAl intelligenCe To lInk publiCations wIth observAtioNs (TACTICIAN)" activity under ESA Contract No 4000128429/19/ES/JD.

How to cite: Giannakis, O., Demiros, I., Koutroumbas, K., Rontogiannis, A., Antonopoulos, V., De Marchi, G., Arviset, C., Balasis, G., Daglis, A., Vasalos, G., Boutsi, Z., Tauber, J., Lopez-Caniego, M., Kidger, M., Masson, A., and Escoubet, P.: TACTICIAN: AI-based applications knowledge extraction from ESA’s mission scientific publications, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1902, https://doi.org/10.5194/egusphere-egu23-1902, 2023.

X4.159
|
EGU23-11527
|
ITS1.5/GI1.5
|
ECS
Xingchen Yang, Yang Song, Zhenhan Wu, and Chaowei Wu

In the current stage of scientific research, it is necessary to break down the barriers between traditional disciplines and promote the cross-integration of related disciplines. As one of the important carriers of research achievements across disciplines, maps can be superimposed and integrated to display the results of multidisciplinary integration more intuitively, promote the integration of disciplines, and reveal new scientific problems. Traditional geological mapping is usually carried out at a single scale and is oriented towards printed paper products, which makes it difficult to read maps of different scales at the same time. To solve this problem, an integrated platform named Global Layer is being built with the support of Deep-time Digital Earth (DDE). Global Layer embeds several core databases, such as the Geological Map of the World at a scale of 1/5M and the Global Geothermal Database. These databases are presented as electronic maps, which allows results at different scales to be displayed and browsed in one place by moving through the scale hierarchy. In addition, users can upload data in four ways (local file, database connection, cloud file, and ArcGIS data service), and data or mapping results can be shared to Facebook, Twitter and other platforms in the form of links, widgets, etc. The construction of Global Layer could provide experience and a foundation for integrating global databases related to geological maps and for constructing data platforms.

How to cite: Yang, X., Song, Y., Wu, Z., and Wu, C.: Global Layer——An integrated, fully online, cloud based platform, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11527, https://doi.org/10.5194/egusphere-egu23-11527, 2023.

X4.160
|
EGU23-12373
|
ITS1.5/GI1.5
|
ECS
|
Mariana Dos Santos Toledo Busarello, Anneli Ågren, and William Lidberg

Streams and ditches are seldom identified on current maps due to their small dimensions and sometimes intermittent nature. Estimates indicate that only 9% of all ditches are currently mapped, and the underestimation of natural streams is a global issue. Ditches have been dug in European boreal forests and some parts of North America to drain wetlands and increase forest production, which increased the availability of cultivable land and modified the landscape at a national scale. Target 6.6 of the Agenda 2030 highlights the importance of protecting and restoring water-related ecosystems. Wetlands are a substantial part of this: they have a high carbon storage capability, mitigate floods, and purify water. All things considered, the reversal of anthropogenic environmental alterations may be on the horizon, all the more so because ditches are also strong emitters of methane and other greenhouse gases due to their anoxic water and sediment accumulation. However, streams and ditches that are missing from maps and databases are difficult to manage.

The main focus of this study was to develop a method to map channels by combining deep learning and national Aerial Laser Scans (ALS). The performance of different topographical indices derived from the ALS data was evaluated, and two different Digital Elevation Model (DEM) resolutions were compared. Ditch channels and natural streams were manually digitized from ten regions across Sweden, summing up to 1923 km of ditch channels and 248 km of natural streams. The topographical indices used were high-pass median filter, slope, sky-view factor, and hillshade (with azimuths of 0°, 45°, 90° and 135°); the DEM resolutions analysed were 0.5 m and 1 m. A U-Net model was trained to segment images into ditches and stream channels: all pixels from each image were labelled in a way that those with the same class display similar attributes.
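
As a rough illustration of how two of the listed terrain indices can be derived from a gridded DEM, the sketch below computes slope and hillshade for interior cells. It assumes square cells, row 0 at the northern edge, central differences for the gradient, and simple Lambertian shading. The function name `slope_and_hillshade` and the toy DEM are hypothetical, and this is not the processing chain used in the study.

```python
import math

def slope_and_hillshade(dem, cell, azimuth_deg=315.0, altitude_deg=45.0):
    """Return (slope in degrees, hillshade in [0, 1]) grids for interior cells."""
    az = math.radians(azimuth_deg)    # clockwise from north
    alt = math.radians(altitude_deg)  # elevation of the light source
    # Unit vector pointing towards the light source in (east, north, up).
    light = (math.sin(az) * math.cos(alt),
             math.cos(az) * math.cos(alt),
             math.sin(alt))
    slope, shade = [], []
    for r in range(1, len(dem) - 1):
        slope_row, shade_row = [], []
        for c in range(1, len(dem[0]) - 1):
            # Central differences; rows increase southwards (assumption).
            dz_east = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell)
            dz_north = (dem[r - 1][c] - dem[r + 1][c]) / (2 * cell)
            grad = math.hypot(dz_east, dz_north)
            slope_row.append(math.degrees(math.atan(grad)))
            # Upward unit normal of the local surface.
            norm = math.sqrt(grad * grad + 1.0)
            nx, ny, nz = -dz_east / norm, -dz_north / norm, 1.0 / norm
            shade_row.append(
                max(0.0, nx * light[0] + ny * light[1] + nz * light[2]))
        slope.append(slope_row)
        shade.append(shade_row)
    return slope, shade

# Toy 0.5 m DEM (made up): a plane rising 0.5 m per cell towards the east.
dem = [[0.5 * c for c in range(4)] for _ in range(4)]
slope, _ = slope_and_hillshade(dem, cell=0.5)

# Hillshade of one interior cell for the four azimuths named in the abstract.
shade_by_azimuth = {
    az: slope_and_hillshade(dem, 0.5, azimuth_deg=az)[1][0][0]
    for az in (0, 45, 90, 135)
}
print(slope[0][0], shade_by_azimuth)
```

Stacking several such index grids as input channels is one common way to feed terrain information to a segmentation network. How the channels were actually combined in this study is not stated in the abstract.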

Results showed that ditches can be successfully mapped with this method and that it can generally be applied anywhere, since only local terrain indices are required. Additionally, when natural streams were present in the dataset, the model underperformed in predicting the location of ditches, while a higher resolution had the opposite effect. Streams were more challenging to map, and the model only indicated the channels, not whether or not they contained water. Further research will be required to combine hydrological modelling and deep learning.

How to cite: Dos Santos Toledo Busarello, M., Ågren, A., and Lidberg, W.: Mapping streams and ditches using Aerial Laser Scanning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12373, https://doi.org/10.5194/egusphere-egu23-12373, 2023.

X4.161
|
EGU23-16252
|
ITS1.5/GI1.5
|
ECS
Muhammad Asif Suryani, Christian Beth, Klaus Wallmann, and Matthias Renz

In Marine Geology, scientists continually perform extensive experiments to measure diverse features across the globe in order to estimate environmental changes. For example, Mass Accumulation Rate (MAR) and Sedimentation Rate (SR) are measured by marine geologists at various oceanographic locations and are widely reported in research publications, but have not been compiled in any central database. Furthermore, every MAR and SR observation normally carries i) exact locational information (longitude and latitude), ii) the method of measurement (e.g. stratigraphy, ²¹⁰Pb), iii) a numerical value and units (e.g. 2.4 g/m²/yr), and iv) a temporal feature (e.g. hundred years ago). The contextual information attached to MAR and SR observations is heterogeneous, and manual approaches for information extraction from text are infeasible. It is also worth mentioning that MAR and SR are not reported in International System (SI) units.

We propose the comprehensive end-to-end framework GEOTEK (Geological Text to Knowledge) to extract targeted information from marine geology publications. The proposed framework comprises three modules. The first module combines a document relevance model, which filters relevant sources using metadata, with a PDF extractor, which extracts text, tables, and metadata. The second module mainly comprises two information extractors, namely Geo-Quantities and Geo-Spacy, trained specifically on text from the Marine Geology domain. Geo-Quantities is capable of extracting relevant numerical information from the text and covers more than 100 unit variants for MAR and SR, while Geo-Spacy extracts a set of relevant named entities as well as locational entities, which are further processed to obtain the respective geocode boundaries. The third module, the Heterogeneous Information Linking module (HIL), processes exact spatial information from tables and captions and forms links to the previously extracted measurements. Finally, all linked information is displayed in an interactive map view.
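
To give a sense of what extracting numerical values with units from text involves, here is a minimal rule-based sketch. It is not the Geo-Quantities extractor, which is described as a trained model covering more than 100 unit variants. The pattern below handles only a handful of slash-separated, plain-text unit spellings, and the function name `extract_rates` and the sample sentence are made up for illustration.

```python
import re

# A number followed by either a mass/area/time unit (treated as MAR)
# or a length/time unit (treated as SR). Only a few variants are covered.
PATTERN = re.compile(
    r"(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?:(?P<mar>(?:mg|g|kg)/(?:cm|m)2/(?:kyr|yr|a))"
    r"|(?P<sr>(?:mm|cm|m)/(?:kyr|yr|a)))\b"
)

def extract_rates(text):
    """Return a list of (kind, value, unit) tuples found in the text."""
    found = []
    for match in PATTERN.finditer(text):
        kind = "MAR" if match.group("mar") else "SR"
        unit = match.group("mar") or match.group("sr")
        found.append((kind, float(match.group("value")), unit))
    return found

# Made-up sentence in the style of a marine geology publication.
sample = ("The MAR was 2.4 g/m2/yr at the core site, while the "
          "sedimentation rate reached 0.15 cm/kyr.")
rates = extract_rates(sample)
print(rates)
```

A pattern like this breaks down quickly on real publications (superscripts, exponent notation such as g cm-2 kyr-1, ranges, uncertainties), which is one reason a trained extractor is preferable to hand-written rules.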

How to cite: Suryani, M. A., Beth, C., Wallmann, K., and Renz, M.: GEOTEK: Extracting Marine Geological Data from Publications, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16252, https://doi.org/10.5194/egusphere-egu23-16252, 2023.

X4.162
|
EGU23-16813
|
ITS1.5/GI1.5
|
ECS
|
Highlight
Anna Jungbluth, Ed Pechorro, Clement Albergel, and Susanne Mecklenburg

Climate change is arguably the greatest environmental challenge facing humankind in the twenty-first century. The United Nations Framework Convention on Climate Change (UNFCCC) facilitates multilateral action to combat climate change and its impacts on humanity and ecosystems. To make decisions on climate change mitigation and adaptation, the UNFCCC requires systematic observations of the global climate system.

The objective of ESA’s climate programme, currently delivered via the Climate Change Initiative (CCI), is to realise the full potential of the long-term, global-scale, satellite Earth observation archive that ESA and its Member States have established over the last 35 years, as a significant and timely contribution to the climate data record required by the UNFCCC.

Since 2010, the programme has contributed to a rapidly expanding body of scientific knowledge on more than 22 Essential Climate Variables (ECVs), through the production of Climate Data Records (CDRs). Although varying across geophysical parameters, ESA CDRs follow community-driven data standards, facilitating inter- and cross-ECV research of the climate system.

In this work, we highlight the use of artificial intelligence (AI) in the context of the ESA CCI. AI has played a pivotal role in the production and analysis of these Climate Data Records. Eleven CCI projects - Greenhouse Gases (GHG), Aerosols, Clouds, Fire, Ocean Colour, Sea Level, Soil Moisture, High Resolution Landcover, Biomass, Permafrost, and Sea Surface Salinity - have applied AI in their data record production and research or have identified specific AI usage for their research roadmaps.

The use of AI in these CCI projects is varied. Examples include GHG CCI algorithms using random forest machine learning techniques, Aerosol CCI algorithms to retrieve dust aerosol optical depth from thermal infrared spectra, and Fire CCI algorithms to detect burned areas. Moreover, the ESA climate community has identified climate science gaps in the context of ECVs with the potential for meaningful advancement through AI.

We specifically focus on showcasing the use of AI for data homogenization and super-resolution of ESA CCI datasets. For instance, both the land cover and fire CCI datasets were generated globally in low resolution, while high-resolution data only exist for specific geographical regions. By adapting super-resolution algorithms to the specific science use cases, we can accelerate the generation of global, high-resolution datasets with the required temporal coverage to support long-term climate studies.

How to cite: Jungbluth, A., Pechorro, E., Albergel, C., and Mecklenburg, S.: The Use of Artificial Intelligence in ESA’s Climate Change Initiative, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16813, https://doi.org/10.5194/egusphere-egu23-16813, 2023.

Posters virtual: Mon, 24 Apr, 14:00–15:45 | vHall ESSI/GI/NP

Chairpersons: Paul Celicourt, Silvio Gumiere, Hossein Bonakdari
vEGN.16
|
EGU23-13099
|
ITS1.5/GI1.5
|
ECS
Yiqi Lin, William Lidberg, Cecilia Karlsson, and Anneli Ågren

There is a soaring demand for up-to-date and spatially explicit soil information to address various environmental challenges. One of the most basic pieces of information, essential for research and decision-making in multiple disciplines, is soil classification. Conventional soil maps are often low in spatial resolution and lack the complexity to be practical for hands-on use. Digital Soil Mapping (DSM) has emerged as an efficient alternative for its reproducibility, updatability, accuracy, and cost-effectiveness, as well as the ability to quantify uncertainties.

Despite DSM’s growing popularity and increasingly wider areas of application, soil information is still rare in forested areas and remote regions, and the integration with high-resolution data on a country scale remains limited. In Sweden, quaternary deposit maps created by the Geological Survey of Sweden (SGU) have been the main reference input for soil-related research and operations, though most parts of the country still warrant higher quality representation. This study utilizes machine learning to produce a high-resolution surficial deposits map with nationwide coverage, capable of supporting research and decision-making. More specifically, it: i) compares the performance of two tree-based ensemble machine learning models, Extreme Gradient Boosting and Random Forest, in predictive mapping of soils across the entire country of Sweden; ii) determines the best model for spatial prediction of soil classes and estimates the associated uncertainty of the inferred map; iii) discusses the advantages and limitations of this approach; and iv) outputs a map product of soil classes at 2-m resolution. Similar attempts around the globe have shown promising results, though at coarser resolutions and/or of smaller geographical extent. The main assumptions behind this study are: i) terrain indices derived from a digital elevation model (DEM) are useful predictors of soil type, though different classification algorithms differ in their effectiveness; ii) machine learning can capture major soil classes that cover most of Sweden, but expert geological and pedological knowledge is required when identifying rare soil types.

To achieve this, approximately 850,000 labeled soil points extracted from the most accurate SGU maps will be combined with a stack of 12 LiDAR DEM-derived topographic and hydrological indices and 4 environmental datasets. Uncertainty estimates of the overall model and for each soil class will be presented. An independent dataset obtained from the Swedish National Forest Soil Inventory will be used to assess the accuracy of the machine learning model. The presentation will cover the method, data handling, and some promising preliminary results.

How to cite: Lin, Y., Lidberg, W., Karlsson, C., and Ågren, A.: Mapping Swedish Soils with High-resolution DEM-derived Indices and Machine Learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13099, https://doi.org/10.5194/egusphere-egu23-13099, 2023.

vEGN.17
|
EGU23-15656
|
ITS1.5/GI1.5
Wenhua Wan and Petra Döll

Karst aquifers provide a significant portion of the global water supply. However, a proper representation of groundwater recharge in karst areas is completely absent from state-of-the-art global hydrological models. This study, based on the new version of the global hydrological model WaterGAP, (1) presents the first modeling of diffuse groundwater recharge (GWR) in all karst regions using the global map of karstifiable rocks; and (2) adjusts the current GWR algorithm with up-to-date databases of slope and soil. Model results were compared with a large number of ground-based recharge estimates on 818 half-degree cells, including 75 in karst areas. Assuming that GWR in karst landscapes equals the runoff from soil leads to unbiased estimates. The majority of simulated mean annual recharge ranges from 0.6 mm/yr (10th percentile) to 326.9 mm/yr (90th) in nonkarst regions, and from 7.5 mm/yr (10th) to 740.2 mm/yr (90th) in karst regions. According to ground-based estimates, recharge ranges from 2% to 66% of precipitation in karst regions; the simulated GWR produces global recharge fractions between 4% (10th) and 68% (90th) in karst areas, while that in nonkarst areas rarely exceeds 25%. Unlike previous studies that claimed global hydrological models consistently underestimate recharge, we observed underestimation only in very humid regions where recharge exceeds 300 mm/yr. These very high recharge estimates are likely to include preferential flow and to adopt a finer spatial and temporal scale than the global model. In karst landscapes and arid regions, we demonstrate that WaterGAP incorporating the karst algorithm performs well.

How to cite: Wan, W. and Döll, P.: Karst integration into groundwater recharge simulation in WaterGAP, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15656, https://doi.org/10.5194/egusphere-egu23-15656, 2023.