Failure Event Mining With Fine-Tuned Large Language Model: Case Study of Analyzing United States Nuclear Power Plant Failure Event Reports
Journal article   Open access   Peer reviewed

Sai Zhang, Shahidur Rahoman Sohag, Min Xian, Shoukun Sun and Zhegang Ma
Risk analysis, Vol.46(3), e70191
03/2026
PMID: 41715937

Abstract

Keywords: failure event narrative; deep learning; nuclear power plant; text mining; large language model; causality extraction
Failure event narratives contain detailed and valuable information describing how failures initiate and propagate. Event causality analysis can improve the understanding of failure physics and facilitate the use of non-failure data (e.g., near-misses and degradations) to complement the limited pool of failure data that is common in high-reliability industries such as the nuclear power industry. Automatically extracting event causality from text data, however, is challenging because of complex and diverse language structures and causal patterns and the lack of large, annotated datasets for training. Existing automated mining approaches are mainly knowledge-based: they extract causality using a set of predefined keywords and rules and have difficulty achieving good performance. In this paper, we propose a novel large language model (LLM)-based approach for automated causality extraction. It leverages the strong capability of LLMs to understand intricate language patterns in long-range contexts and to accurately extract cause-and-effect pairs from texts. The proposed approach has a two-step framework: causality detection and causality extraction. The causality detection step trains a deep learning model to identify texts that contain causality. The causality extraction step develops a T5-CE LLM to identify and extract cause-and-effect pairs in each text sample. A large, annotated dataset of U.S. nuclear power plant failure event reports was used to train and evaluate the models. Model evaluation used three performance metrics: precision, recall, and F1 score. The proposed approach can effectively detect implicit and embedded causalities across multiple sentences.
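As a rough illustration of the three metrics named in the abstract, the sketch below scores a set of extracted cause-and-effect pairs against annotated pairs. The exact-match criterion, the function name `pair_prf`, and the sample pairs are assumptions made for illustration only; the abstract does not state how the authors matched predicted pairs to annotations.

```python
def pair_prf(predicted, gold):
    """Precision, recall, and F1 over (cause, effect) pairs.

    Assumption: a predicted pair counts as correct only if it matches
    an annotated pair exactly (both cause and effect strings).
    """
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical pairs, not taken from the dataset described above.
predicted = [("valve leak", "pressure drop"), ("pump trip", "low flow")]
gold = [("valve leak", "pressure drop"), ("breaker fault", "pump trip")]
precision, recall, f1 = pair_prf(predicted, gold)
```

Exact matching is the strictest choice; a token-overlap or partial-match criterion would give more credit to near-correct spans and could be substituted in the set intersection step.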
Published (Version of record) Open