- 1 Department of Hydraulic Engineering, Tsinghua University, Beijing, China
- 2 Department of Computer Science and Technology, Tsinghua University, Beijing, China
- 3 Zhipu AI, Beijing, China
- 4 CHN Energy Dadu River Big Data Services Co., Ltd., Sichuan, China
Large Language Models (LLMs) have demonstrated outstanding performance across natural language processing tasks. However, when deployed in specialized domains such as hydro-science and engineering (HydroSE), these models face challenges including insufficient domain knowledge and catastrophic forgetting during domain adaptation. In this work, we construct a multi-dimensional corpus for HydroSE and train a domain-specific LLM named Hammer. We propose a comprehensive training paradigm that integrates multi-dimensional knowledge injection with a multi-model merging method, effectively balancing domain expertise with general intelligence. First, to overcome knowledge scarcity, we collect multi-disciplinary knowledge relevant to HydroSE from diverse sources, including textbooks, papers, laws, and industry standards. Second, to mitigate catastrophic forgetting, we implement a progressive training pipeline that combines continued pre-training, supervised fine-tuning, and model merging, allowing the model to master professional knowledge while retaining its general capabilities. Experimental results show that Hammer significantly improves domain-specific performance from 68.8% (baseline) to 84.9%, surpassing mainstream general LLMs. Crucially, the model merging technique restores general capabilities to near-original levels. The proposed data processing and training approach also transfers robustly when the base model is substituted.
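The abstract does not specify which merging algorithm Hammer uses; as a purely illustrative sketch, the snippet below shows the simplest common form of model merging, per-parameter linear interpolation between the original base model's weights and the domain-tuned weights. The function name merge_state_dicts and the mixing weight alpha are our own illustration, not the authors' implementation.

```python
import torch

def merge_state_dicts(base_sd, tuned_sd, alpha=0.5):
    """Per-tensor linear interpolation: merged = (1 - alpha) * base + alpha * tuned.

    alpha = 0 recovers the base model (general capabilities);
    alpha = 1 recovers the domain-tuned model (domain expertise).
    """
    merged = {}
    for name, base_param in base_sd.items():
        tuned_param = tuned_sd[name]
        if torch.is_floating_point(base_param):
            merged[name] = (1.0 - alpha) * base_param + alpha * tuned_param
        else:
            # Non-float entries (e.g., integer buffers) cannot be interpolated;
            # here we simply keep the tuned model's copy.
            merged[name] = tuned_param.clone()
    return merged

# Hypothetical usage: `base_model` and `tuned_model` share an architecture.
# merged_sd = merge_state_dicts(base_model.state_dict(),
#                               tuned_model.state_dict(), alpha=0.7)
# base_model.load_state_dict(merged_sd)
```

Interpolation of this kind is one standard way to trade off domain performance against retained general ability; the actual balance reported in the abstract (84.9% domain accuracy with near-original general capability) would depend on the authors' specific merging method and settings.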
How to cite: Yu, X., Shan, W., Li, Y., Hu, S., Liu, D., Zheng, Z., Liu, J., Luo, W., Wang, L., Xu, B., and Zhao, J.: Hammer: An Expert-Level Large Language Model for Hydro-Science and Engineering Balancing Domain Expertise and General Intelligence, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2906, https://doi.org/10.5194/egusphere-egu26-2906, 2026.