Abstract
Industry-wide operating experience is a critical source of raw data for estimating the parameters of reliability and risk models for nuclear power plants. A large portion of operating experience data consists of failure events recorded in reports that contain unstructured data, such as narratives. In current practice, a failure report is usually reviewed and manually coded by analysts. The coding is based on extracting several event characteristics, such as system name, component type, sub-part type, failure mode, and failure cause. Event narratives are used mostly to help analysts understand events and extract these characteristics. In this line of research, we aim to maximize the use of event narratives by leveraging natural language processing (NLP) methods to automatically convert an event narrative into a causal graph. This research promises to improve the physical understanding of failure initiation and propagation and to facilitate the use of non-failure data (e.g., near-misses and degradations) to complement the limited pool of failure data. In our previous work, we developed an NLP tool and applied it to analyze a number of licensee event reports submitted by U.S. nuclear power plants to the Nuclear Regulatory Commission. In this paper, we report our recent progress in aggregating the results of multiple reports, developing network model(s), and drawing statistical insights.