- 1Cyprus University of Technology, Limassol, Cyprus
- 2AGH University, Krakow, Poland
- 3CASTORC, Cyprus Institute, Nicosia, Cyprus
Artificial intelligence is often expected to revolutionise geological modelling, but in practice its performance is strongly controlled by how geological information is collected, encoded, and constrained, and by how well the AI workflow is tailored to the task. In this contribution we analyse what helps and what hurts AI-based geological modelling under data-scarce conditions, using shallow geothermal modelling in Cyprus as a testbed.
Within the WAGEs project on shallow geothermal energy, we compiled borehole profiles from across Cyprus, harmonising heterogeneous lithological descriptions into a simplified but consistent scheme and linking them to tectonic units and basic spatial information. Classical, off-the-shelf neural-network approaches performed poorly on this limited and noisy dataset, highlighting the vulnerability of generic architectures to inconsistent lithological classifications and incomplete metadata.
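The harmonisation step described above can be illustrated with a minimal sketch. It assumes a simple keyword lookup; the keyword list, the class labels, and the `harmonise` function are hypothetical and do not reproduce the scheme actually used in the project.

```python
# Hypothetical keyword-to-class lookup (an assumption for illustration only).
# Order matters: the first matching keyword wins, so more specific terms
# such as "sandstone" are listed before shorter ones such as "sand".
KEYWORDS = [
    ("marl", "marl"),
    ("chalk", "chalk"),
    ("limestone", "limestone"),
    ("sandstone", "sandstone"),
    ("gravel", "surface_deposit"),
    ("sand", "surface_deposit"),
    ("clay", "clay"),
]


def harmonise(description: str) -> str:
    """Map a free-text lithological description to a simplified class.

    Uses crude case-insensitive substring matching; descriptions with no
    known keyword are labelled "unknown" rather than guessed.
    """
    text = description.lower()
    for keyword, label in KEYWORDS:
        if keyword in text:
            return label
    return "unknown"


if __name__ == "__main__":
    for raw in ["Weathered CHALK with flint", "grey marl", "coarse sandstone"]:
        print(raw, "->", harmonise(raw))
```

Keeping an explicit "unknown" class, instead of forcing every description into a known one, makes inconsistent or underspecified log entries visible in the compiled dataset.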
We therefore developed a tailored, sequence-based machine-learning workflow in which each borehole is encoded as a one-dimensional string combining depth-ordered lithologies, tectonic context, and location. A supervised learning algorithm was trained on existing boreholes and tested on independent control sites. In phase-one experiments, the model reached about 85% accuracy when the two top-ranked predicted lithological profiles were considered for the full borehole depth. This metric was selected because some rock types are easily misclassified (e.g. marl versus chalk) or are open to differing interpretations (e.g. decayed rock at the surface, which may be logged as rock, soil, or surface deposit). The algorithm’s skill was highest where lithological contrasts were strong, while more gradational successions remained difficult to distinguish. The model showed a partial ability to infer the presence of faults from lithological patterns, although it was neither designed to localise them nor supplied with relevant information.
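The encoding and the top-two metric described above can be sketched as follows. This is an illustration under stated assumptions: the string layout, the separators, and the names `encode_borehole` and `top_k_accuracy` are hypothetical, and the actual encoding and model used in the study are not specified here.

```python
from typing import List, Sequence, Tuple


def encode_borehole(
    layers: Sequence[Tuple[float, str]],
    tectonic_unit: str,
    x: float,
    y: float,
) -> str:
    """Encode one borehole as a single string (hypothetical layout).

    layers: (top_depth_in_metres, lithology) pairs in any order.
    The output combines tectonic unit, location, and lithologies sorted
    from shallowest to deepest.
    """
    ordered = sorted(layers, key=lambda layer: layer[0])
    lithologies = ">".join(f"{name}@{top:g}" for top, name in ordered)
    return f"{tectonic_unit}|{x:.0f},{y:.0f}|{lithologies}"


def top_k_accuracy(
    ranked_predictions: Sequence[List[str]],
    truths: Sequence[str],
    k: int = 2,
) -> float:
    """Fraction of cases whose true profile is among the k top-ranked predictions."""
    if not truths:
        raise ValueError("at least one test case is required")
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, truths) if truth in ranked[:k]
    )
    return hits / len(truths)


if __name__ == "__main__":
    # Placeholder unit name and coordinates, not real project data.
    print(encode_borehole([(12.0, "chalk"), (0.0, "marl")], "UnitA", 500000, 3850000))
    # Prints: UnitA|500000,3850000|marl@0>chalk@12
```

Counting a prediction as correct when the true profile appears among the two top-ranked candidates means that near-equivalent classes, such as marl and chalk, are not penalised as heavily as under strict top-one accuracy.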
From this case study we distil key factors that help tailored AI-based geological modelling (standardised, information-rich lithological logs; task-specific encoding that reflects geological settings; explicit tectonic context) and those that hurt it (lack of an identification protocol; inconsistent rock descriptions; loss of detail during digitisation). Our results indicate that robust AI-based geological modelling does not necessarily require massive datasets, as long as the available information is consistent and well structured. However, in data-scarce settings the main ceiling for AI performance is informational rather than algorithmic: more complex models add little once the underlying geological description is noisy or underspecified. In practice, tailored workflows are most powerful as tools for scenario ranking and for identifying where additional boreholes or geophysical surveys would most effectively reduce subsurface uncertainty, rather than as engines for fully automatic geological models. We conclude that the community should treat AI primarily as a tool for rapid, big-picture or illustrative geological modelling and for stress-testing geological knowledge. Its main value lies in exposing gaps in our subsurface descriptions (including quantitative uncertainty estimates), rather than providing a shortcut that can replace careful geological thinking.
How to cite: Ciapala, B., Papaefthymiou, E., Aresti, L., Pasias, D., Graikos, D., Florides, G. A., and Christodoulides, P.: What helps and what hurts tailored AI in geological modelling: beyond the hype, evidence from data-scarce shallow geothermal modelling in Cyprus, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17722, https://doi.org/10.5194/egusphere-egu26-17722, 2026.