LLM-Assisted Neuro-Symbolic AI Models for Advanced Mineral Exploration

Weilin Chen

The increasing complexity, heterogeneity, and scale of geoscientific data present both unprecedented opportunities and significant challenges for mineral exploration. While machine learning (ML) and deep learning (DL) have advanced predictive modeling by uncovering hidden patterns in large datasets, their purely data-driven nature often leads to “black box” outputs that lack interpretability and may overlook critical geological knowledge. Such limitations constrain their applicability in real-world exploration, where predictions must align with established geological principles and be transparent enough to guide high-stakes decision-making. Neuro-Symbolic AI (NSAI), which integrates the statistical learning strengths of ML with the explicit reasoning capabilities of symbolic logic, offers a promising framework to address these challenges. By embedding domain-specific rules into predictive workflows, NSAI can produce results that are not only accurate but also geologically coherent. Recent advances in Large Language Models (LLMs) open new possibilities for automating the extraction and structuring of such domain knowledge from vast bodies of geological literature, enabling more scalable and consistent integration of expert reasoning into computational models. This dissertation builds on these developments by proposing an NSAI framework that uses LLMs to extract symbolic geological knowledge, formalize it into structured representations, and integrate it with ML-based mineral prediction models. The completed research demonstrates this approach in the context of mineral deposits, using a comprehensive geochemical dataset from the U.S. Geological Survey’s National Geochemical Database: Ore Deposits. 
The process involved (1) systematic data cleaning, normalization, and geological label generation, (2) automated rule extraction from mineral deposit model literature via iterative LLM prompting, (3) construction of expert-validated knowledge graphs (KGs) to represent relationships among deposit types, geochemical indicators, and alteration processes, and (4) incorporation of these symbolic rules into ML workflows as features and constraints. Experimental results using a random forest classifier show that the knowledge-guided models outperform baseline ML approaches in accuracy, precision, and robustness, while interpretability analyses using SHAP (SHapley Additive exPlanations) values confirm alignment between model reasoning and established geological understanding. These results validate the potential of LLM-assisted NSAI in mineral prediction, and the planned research in this dissertation extends the framework toward broader adaptability and field deployment. First, the symbolic integration process will be refined so that general geological knowledge is tailored to specific mineral deposit types. This will involve developing matching strategies that align each deposit type with its most relevant knowledge rules, thereby increasing both the precision and contextual relevance of model predictions. Second, the methodology will be applied to spatial prediction tasks, integrating geochemical, geological, and geophysical data to generate high-resolution maps of mineral prospectivity. This spatial modeling will be validated in data-rich regions to assess predictive performance across deposit types. Third, the framework will be tested in underexplored and data-sparse areas, where the embedded knowledge base can compensate for limited geochemical coverage, guiding early-stage exploration targeting and reducing operational risk.
By uniting automated knowledge extraction, structured geological representation, and predictive modeling, this research contributes both methodologically and practically to the integration of AI in geoscience. The completed work demonstrates that embedding symbolic geological knowledge into ML workflows can improve accuracy and interpretability in mineral prediction, while the planned research aims to extend these benefits to spatial mapping and early-stage exploration in challenging data environments. Collectively, these efforts bridge the gap between computational modeling and geological reasoning, laying the groundwork for AI systems that are not only powerful but also trusted by the geoscience community.
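
As an illustration of step (4) above, incorporating symbolic rules into an ML workflow as features, the following is a minimal sketch and not the dissertation's implementation. It assumes each sample is a mapping from element symbol to concentration and that each extracted rule can be written as a yes/no test on a sample. The rule names, the element list, and the numeric thresholds are invented placeholders, not geological criteria taken from the source.

```python
from typing import Callable, Dict, List

# Hypothetical schema: element symbol -> concentration (units assumed, e.g. ppm).
Sample = Dict[str, float]

# Hypothetical rules standing in for LLM-extracted, expert-validated knowledge.
# The thresholds are placeholders chosen only to make the example run.
RULES: Dict[str, Callable[[Sample], bool]] = {
    "rule_cu_mo_elevated": lambda s: s.get("Cu", 0.0) > 1000 and s.get("Mo", 0.0) > 50,
    "rule_pb_zn_elevated": lambda s: s.get("Pb", 0.0) > 5000 and s.get("Zn", 0.0) > 5000,
}


def augment(sample: Sample, elements: List[str]) -> List[float]:
    """Return raw concentrations followed by one 0/1 flag per symbolic rule."""
    raw = [sample.get(element, 0.0) for element in elements]
    flags = [1.0 if rule(sample) else 0.0 for rule in RULES.values()]
    return raw + flags


def feature_names(elements: List[str]) -> List[str]:
    """Column names matching the order produced by augment()."""
    return list(elements) + list(RULES.keys())


if __name__ == "__main__":
    elements = ["Cu", "Mo", "Pb", "Zn"]
    sample = {"Cu": 2500.0, "Mo": 80.0, "Pb": 10.0, "Zn": 40.0}
    print(feature_names(elements))
    print(augment(sample, elements))
```

The augmented vectors could then be passed to any classifier, such as the random forest mentioned above. Keeping the rule flags as named columns means that feature-attribution methods such as SHAP can report how much each rule contributed to a prediction.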
