EGU23-16662, updated on 26 Feb 2023
https://doi.org/10.5194/egusphere-egu23-16662
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

NLP-based Cognitive Search Engine for the GEOSS Platform data

Yannis Kopsinis1, Zisis Flokas2, Pantelis Mitropoulos3, Christos Petrou4, Thodoris Siozos5, and Giorgos Siokas6
  • 1LIBRA AI Technologies, Athens, Greece (yannis.kopsinis@libramli.ai)
  • 2LIBRA AI Technologies, Athens, Greece (zisis.flokas@libramli.ai)
  • 3LIBRA AI Technologies, Athens, Greece (pantelis.mitropoulos@libramli.ai)
  • 4LIBRA AI Technologies, Athens, Greece (christos.petrou@libramli.ai)
  • 5LIBRA AI Technologies, Athens, Greece (thodoris.siozos@libramli.ai)
  • 6LIBRA AI Technologies, Athens, Greece (George.Siokas@libramli.ai)

Effectively querying unstructured text in large databases is a highly demanding task. Conventional approaches, such as exact-match or fuzzy search, return valid and thorough results only when the user query closely matches the wording of the text or appears in keyword-tag lists. The GEOSS portal relies on such conventional search tools for data and service exploration and retrieval, which limits its capacity. Recent advances in Artificial Intelligence (AI)-based Natural Language Processing (NLP) aim to overcome this challenge through enhanced information retrieval and cognitive search. Rather than relying on exact or fuzzy text matching, cognitive search detects documents that are semantically and conceptually close to the search query.
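The limitation of conventional matching can be illustrated with a minimal Python sketch. It is not the GEOSS portal's actual search code: the record title, the function names, and the 0.8 similarity threshold are all assumptions chosen for illustration, and the fuzzy matcher is a simple per-word comparison built on the standard-library `difflib` module.

```python
from difflib import SequenceMatcher

# Hypothetical record title, invented for this illustration.
TITLE = "Sea level rise projections for coastal cities"


def exact_match(query: str, text: str) -> bool:
    """Case-insensitive substring test: the query must appear verbatim."""
    return query.lower() in text.lower()


def fuzzy_match(query: str, text: str, threshold: float = 0.8) -> bool:
    """True if any word of the text is character-wise similar to the query.

    The threshold is an assumed value; it tolerates small typos only.
    """
    q = query.lower()
    return any(
        SequenceMatcher(None, q, word).ratio() >= threshold
        for word in text.lower().split()
    )


if __name__ == "__main__":
    for query in ("sea level", "projectons", "flooding"):
        print(query, exact_match(query, TITLE), fuzzy_match(query, TITLE))
```

In this sketch, a verbatim phrase is found by the exact matcher and a misspelled word is caught by the fuzzy matcher, but a conceptually related query such as "flooding" is missed by both because it shares no wording with the title.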

The EU-funded EIFFEL project aims to reveal the role of GEOSS as the default Digital Portal for building Climate Change (CC) adaptation and mitigation applications and to offer the Earth Observation community the ground-breaking capacity to exploit existing GEOSS datasets. To this end, LIBRA AI Technologies, as a lead technological partner of the EIFFEL consortium, designs and develops an end-to-end advanced cognitive search system that is dedicated to the GEOSS Portal and overcomes the current limitations.

The proposed system comprises an AI language model optimized for CC-related text and queries, a framework for collecting a sizeable CC-specific corpus used to specialize the language model, a back-end that adopts modern database technologies with advanced capabilities for embedding-based cognitive search matching, and an open Application Programming Interface (API). The cognitive search component is the backbone of the EIFFEL visualisation engine, which will allow any GEOSS user, as well as the EIFFEL Climate Change application development teams, to detect GEOSS data objects and services that are of interest for their research and applications but cannot be effectively accessed through the available GEOSS Portal search engine.
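The idea of embedding-based matching can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the EIFFEL implementation: the `embed` function below is a hypothetical stand-in that counts words from hand-picked concept groups, whereas the system described here uses a CC-specialized language model, and the record titles are invented. Only the ranking step, cosine similarity between a query vector and record vectors, reflects the general technique.

```python
import math

# Hypothetical concept groups standing in for a learned embedding space.
CONCEPTS = {
    "flood": {"flood", "flooding", "inundation", "overflow"},
    "heat": {"heat", "heatwave", "temperature", "warming"},
    "drought": {"drought", "aridity", "dryness"},
}

# Invented record titles, used only for this illustration.
RECORDS = [
    "river inundation extent maps",
    "land surface temperature anomalies",
    "soil dryness index",
]


def embed(text: str) -> list[int]:
    """Toy embedding: one dimension per concept group (an assumption)."""
    words = text.lower().split()
    return [sum(w in group for w in words) for group in CONCEPTS.values()]


def cosine(a: list[int], b: list[int]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, records: list[str], top_k: int = 2) -> list[str]:
    """Rank records by similarity to the query; drop zero-score records."""
    q = embed(query)
    scored = [(cosine(q, embed(r)), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for score, r in scored[:top_k] if score > 0]


if __name__ == "__main__":
    print(search("flooding risk", RECORDS))
```

The query "flooding risk" retrieves the inundation record even though the two share no words. In a production back-end, record vectors would typically be computed once and stored in a database that supports vector similarity search, rather than recomputed per query as in this sketch.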

The work described in this abstract is part of the EIFFEL European project. The EIFFEL project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 101003518. We thank all partners for their valuable contributions.

How to cite: Kopsinis, Y., Flokas, Z., Mitropoulos, P., Petrou, C., Siozos, T., and Siokas, G.: NLP-based Cognitive Search Engine for the GEOSS Platform data, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-16662, https://doi.org/10.5194/egusphere-egu23-16662, 2023.
