
Development of a civil infrastructure resilience assessment framework and its application to a nuclear power plant
T. K. Singhal, O. Kwon, et al.
Explore the groundbreaking Civil Infrastructure Resilience Assessment Framework (CIRAF) developed by Tarun K. Singhal, Oh-Sung Kwon, Evan C. Bentz, and Constantin Christopoulos from the University of Toronto. This innovative framework uses a Bayesian Network approach to assess the seismic fragility and resilience of civil infrastructure systems, evaluating factors such as functionality loss and recovery time. Delve into its application through a case study of a nuclear power generation system and uncover effective upgrade strategies.
Introduction
The paper addresses the vulnerability of communities and civil infrastructure to extreme events such as hurricanes, earthquakes, and floods, noting that prevailing design codes emphasize life safety but not post-event recovery (resilience). The research question centers on how to quantify seismic fragility and resilience at both component and system (and system-of-systems) levels and to support resilience-based design and decision-making. The purpose is to develop an open-ended framework (CIRAF) that can capture interdependencies among structural and non-structural components, quantify loss of functionality, repair time, repair cost, and resilience, and compare upgrade strategies using probabilistic methods. The importance lies in enabling stakeholders to plan preventive measures and rapid recovery, extending traditional code-based design to resilience-based approaches.
Literature Review
Prior frameworks employ various techniques for vulnerability and performance analysis. Simulation-based approaches (e.g., Li et al., 2017; Kilanitis and Sextos, 2018; Tokgöz and Gheorghe, 2013) use sampling (e.g., Monte Carlo), which can be inefficient for rare, long-tail events. Some frameworks are limited to particular regions or hazards, or are system-specific (e.g., Cimellaro, Reinhorn, and Bruneau, 2010b; Kilanitis and Sextos, 2018). HAZUS-MH enables geospatial risk assessments but offers limited interdependent hazard models. Cimellaro et al. (2010b) formalized functionality, losses, and resilience after seismic events but lacked a generic methodology to correlate components. Heitzler et al. (2017) proposed a spatiotemporal risk framework mainly for networked infrastructure. Bayesian approaches (Gehl and D’Ayala, 2016; Bensi et al., 2015) effectively capture dependencies within systems and across systems-of-systems and scale better than matrix-based methods when the number of components exceeds ~25, but they often focus on risk rather than on resilience and economic loss. Economic resilience at the industry level (Pant, Barker, and Zobel, 2014) is too macroscopic for component-level insights. IN-CORE (Gardoni et al., 2018) expands capabilities for interdependent infrastructure and multiple hazards but was still under development. Loss quantification varies by system: traffic flow in transportation (Kilanitis and Sextos, 2018), health impacts in hospital networks (Cimellaro et al., 2010b), and debris-related losses (Cimellaro, Archea, and Reinhorn, 2018). An ideal framework would accommodate diverse loss types, but practical constraints necessitate focusing on tractable, quantifiable parameters.
Methodology
CIRAF proceeds through seven steps: (1) system and hazard definition; (2) infrastructure component modeling; (3) development of a fragility database; (4) definition of recovery and upgrade models; (5) definition of functional relationships via Conditional Probability Tables (CPT) and Bayesian Networks (BN); (6) loss estimation; and (7) analysis of performance indicators.
System and hazard definition: The infrastructure system can be modeled either as a system-of-systems (spatially distributed) or as a single system with structural and non-structural components. Analysts select critical components essential to functionality. Hazards are characterized per context (e.g., PSHA for seismic hazards, hydrologic/topographic analyses for floods).
Component modeling: Each component is described by (1) fragility functions, (2) recovery functions, and (3) upgrade models. Functional relationships at the system level are captured via BNs.
Demand sources: Hazard demand values can be sourced from finite element analysis (e.g., ABAQUS, SAP2000, S-Frame), IoT sensor data, or analytical procedures (e.g., Capacity Spectrum Method, PSHA). A cloud tool exports FEA results in a standard format and can ingest IoT data by mapping sensors to component demand.
Fragility analysis: Fragility functions relate hazard intensity to probability of exceeding limit states, drawing from sources such as Cimellaro et al. (2010b), Gehl and D’Ayala (2016), Cover et al. (1983), and HAZUS-MH (2003). Components have one or more sequential damage states, each assigned a repair time (days) and damage ratio (0–1).
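As a rough illustration of how sequential damage-state probabilities can be read off fragility curves, the sketch below assumes the common lognormal fragility form (median capacity and dispersion). The function names, the two-damage-state component, and all numerical values are hypothetical and are not taken from the paper.

```python
import math

def lognormal_fragility(demand, median, beta):
    """P(limit state exceeded | demand), assuming a lognormal fragility curve."""
    if demand <= 0:
        return 0.0
    z = math.log(demand / median) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def damage_state_probabilities(demand, medians, beta):
    """Probability of being in each sequential damage state.

    `medians` must be increasing; index 0 of the result is "no damage".
    """
    exceed = [lognormal_fragility(demand, m, beta) for m in medians]
    bounds = [1.0] + exceed + [0.0]
    # Probability of state k = P(exceed state k) - P(exceed state k+1)
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]

# Hypothetical component: two damage states with medians 0.4 g and 0.8 g.
probs = damage_state_probabilities(demand=0.4, medians=[0.4, 0.8], beta=0.5)
print(probs)
```

Using a shared dispersion for all damage states keeps the curves from crossing, so the differences are never negative.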
Recovery models: Component-level recovery can follow linear, exponential, or trigonometric (cosine) forms (Cimellaro, Reinhorn, and Bruneau, 2006), chosen to reflect societal response patterns. A multi-step recovery function accommodates systems that remain non-functional until fully recovered (e.g., roads). A wait time Tw allows delayed recovery initiation. Parameters include repair time Tr, event time t0E, wait time Tw, and, for the multi-step form, the number of steps N.
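The recovery shapes described above can be sketched as a single function returning the recovered fraction of the initial loss. This is a minimal illustration, not the paper's implementation: the function names, the ln(200) decay constant in the exponential form (chosen so that recovery is about 99.5% complete at the end of the repair time), and the equal-step multi-step rule are assumptions.

```python
import math

def recovered_fraction(t, t0e, tr, tw=0.0, shape="linear", n_steps=1):
    """Fraction of the initial loss recovered at time t (0 to 1).

    t0e: event time, tr: repair time (> 0), tw: wait time before recovery starts.
    """
    x = (t - t0e - tw) / tr          # normalized progress through the repair
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    if shape == "linear":
        return x
    if shape == "exponential":
        return 1.0 - math.exp(-math.log(200.0) * x)
    if shape == "cosine":
        return 0.5 * (1.0 - math.cos(math.pi * x))
    if shape == "multistep":
        # Functionality returns in n_steps equal jumps; n_steps=1 means
        # nothing is restored until the repair is complete.
        return math.floor(n_steps * x) / n_steps
    raise ValueError(f"unknown recovery shape: {shape}")

def functionality(t, initial_loss, **kwargs):
    """Q(t) = 1 - remaining loss, for a component with the given initial loss."""
    return 1.0 - initial_loss * (1.0 - recovered_fraction(t, **kwargs))

q_mid = functionality(5.0, initial_loss=0.4, t0e=0.0, tr=10.0, shape="linear")
```

With `n_steps=1` the multi-step form reproduces the road-like case where the component stays non-functional until the repair finishes.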
Upgrade models: Retrofitting is characterized by an Upgrade Cost Ratio (UCR) and an Upgrade Factor (UF). UCR represents the additional investment relative to the standard cost. UF scales the median capacity Am of the fragility (Am × UF) at constant dispersion, shifting the fragility curve to the right and reducing the failure probability. For components without a defined retrofit, set UCR and UF to 1.
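The effect of the Upgrade Factor can be illustrated with a small sketch. It assumes a lognormal fragility and treats UF as a multiplier on the median capacity, which is consistent with UF = 1 meaning "no upgrade"; the function name and the demand, median, and dispersion values are hypothetical.

```python
import math

def failure_probability(demand, median, beta, uf=1.0):
    """Exceedance probability for a lognormal fragility whose median is scaled by uf.

    uf = 1.0 leaves the fragility unchanged (no retrofit); the dispersion
    beta is held constant, so the curve only shifts to the right.
    """
    upgraded_median = median * uf
    z = math.log(demand / upgraded_median) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

as_built = failure_probability(0.6, median=0.6, beta=0.4)
upgraded = failure_probability(0.6, median=0.6, beta=0.4, uf=1.5)
print(as_built, upgraded)
```

For the same demand, the upgraded component has a lower probability of exceeding the limit state than the as-built one.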
Component fragility correlation: Dependencies are modeled with Bayesian Networks. Components (blue nodes) have damage states from fragilities; sub-systems and systems (red and green nodes) have user-defined failure modes; transfer nodes can connect subnetworks to manage complexity. CPTs map combinations of component states to system/sub-system failure modes. Example: a pumping system with a pump, a pipe, and a voltage guard has failure modes for intact, reduced, or no flow; CPT entries reflect domain logic (e.g., pipe damage yields reduced capacity; pump or guard damage yields no flow). Probabilities can be derived analytically or via Monte Carlo.
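The pumping-system example can be worked through by enumeration. The sketch below is a simplified illustration: it assumes binary component states (intact or damaged), treats component states as independent given the demand, and uses a deterministic CPT. The damage probabilities are hypothetical.

```python
from itertools import product

# Hypothetical probabilities that each component is damaged.
p_damaged = {"pump": 0.10, "pipe": 0.20, "guard": 0.05}

def cpt(pump, pipe, guard):
    """P(failure mode | component states); deterministic in this sketch."""
    if pump or guard:                 # pump or voltage guard damaged: no flow
        return {"intact": 0.0, "reduced": 0.0, "no_flow": 1.0}
    if pipe:                          # only the pipe damaged: reduced flow
        return {"intact": 0.0, "reduced": 1.0, "no_flow": 0.0}
    return {"intact": 1.0, "reduced": 0.0, "no_flow": 0.0}

mode_prob = {"intact": 0.0, "reduced": 0.0, "no_flow": 0.0}
names = list(p_damaged)               # order matches cpt(pump, pipe, guard)
for states in product([False, True], repeat=len(names)):
    # Joint probability of this combination of component states.
    joint = 1.0
    for name, damaged in zip(names, states):
        joint *= p_damaged[name] if damaged else 1.0 - p_damaged[name]
    # Distribute the joint probability over failure modes via the CPT.
    for mode, p in cpt(*states).items():
        mode_prob[mode] += joint * p

print(mode_prob)
```

Full enumeration grows exponentially with the number of components, which is one reason to decompose large networks into sub-systems.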
Loss estimation immediately after a disaster: Component-level loss factor Li = Σj Pij Dij, where Pij is probability of component i in damage state j and Dij its damage ratio. System-level losses are computed by aggregating over component state combinations using CPT weights. For a generic sub-system with n components and total damage-state combinations D, construct: (a) PM vector of joint probabilities formed by products of component damage state probabilities; (b) WRLM (weighted loss ratio matrix) combining damage ratios and component weights; (c) apply CPT to obtain loss by failure mode and combine with failure mode probabilities from BN to yield total system loss. This is applied recursively from components to sub-systems to systems.
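A minimal sketch of the first part of this calculation follows: the component loss factor Li = Σj Pij Dij, the joint-probability (PM) vector, and a weighted loss per damage-state combination. It assumes independent components and a simple weighted sum of damage ratios, and it omits the CPT and failure-mode step; the component data and weights are hypothetical.

```python
from itertools import product

# Hypothetical sub-system: P = damage-state probabilities, D = damage ratios,
# w = component weight in the sub-system loss.
components = {
    "pump": {"P": [0.7, 0.2, 0.1], "D": [0.0, 0.3, 1.0], "w": 0.5},
    "pipe": {"P": [0.8, 0.2], "D": [0.0, 0.5], "w": 0.5},
}

def component_loss(P, D):
    """Li = sum over damage states j of Pij * Dij."""
    return sum(p * d for p, d in zip(P, D))

names = list(components)
index_ranges = [range(len(components[n]["P"])) for n in names]

pm = []    # joint probability of each damage-state combination
wrl = []   # weighted loss ratio of each combination
for combo in product(*index_ranges):
    joint, loss = 1.0, 0.0
    for n, j in zip(names, combo):
        joint *= components[n]["P"][j]
        loss += components[n]["w"] * components[n]["D"][j]
    pm.append(joint)
    wrl.append(loss)

expected_loss = sum(p * l for p, l in zip(pm, wrl))
print(expected_loss)
```

Under these simplifying assumptions the expected loss equals the weighted sum of the component loss factors, which gives a quick consistency check on the enumeration.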
Loss estimation during recovery: As components recover per their recovery functions, derive an equivalent demand corresponding to recovered functionality at time Ti. Back-calculate updated failure probabilities from fragility curves using the functionality at Ti, and re-evaluate system functionality via the same BN/CPT formulation.
Performance indicators: System functionality Q(t) is numerically evaluated over time from system-level loss during recovery and integrated to obtain a resilience index. Additional indicators include:
- Repair time: For a component i, Tri = Σj Pij Trij, where Trij is the repair time assigned to damage state j; for systems/sub-systems, take the maximum of constituent repair times.
- Repair cost: Component repair cost is loss times replacement cost; system repair cost RCsys = Lsys Σi Ii (where Ii are replacement costs of components in the system).
- Upgrade Benefit Index (UBI): UBIi = (Qsys | i upgraded − Qsys) / (1 − Qsys), measuring improvement in system functionality due to upgrading component i (0–1 range).
- Damage Consequence Index (DCI): DCIi = (Qsys − Qsys | i failed) / Qsys, measuring loss of system functionality if component i fails completely (0–1 range).
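The indicators listed above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the resilience index is assumed to be the time-averaged functionality over the evaluation window (trapezoidal integration of Q(t) divided by the window length), and the numbers in the usage lines are made up.

```python
def resilience_index(times, q):
    """Trapezoidal integral of Q(t), normalized by the length of the time window."""
    area = 0.0
    for k in range(1, len(times)):
        area += 0.5 * (q[k] + q[k - 1]) * (times[k] - times[k - 1])
    return area / (times[-1] - times[0])

def expected_repair_time(P, Tr):
    """Component repair time: sum over damage states of probability * repair time."""
    return sum(p * t for p, t in zip(P, Tr))

def ubi(q_sys, q_sys_upgraded):
    """Upgrade Benefit Index: share of the lost functionality regained by an upgrade."""
    return (q_sys_upgraded - q_sys) / (1.0 - q_sys)

def dci(q_sys, q_sys_failed):
    """Damage Consequence Index: relative functionality drop if the component fails."""
    return (q_sys - q_sys_failed) / q_sys

# Hypothetical values for illustration.
r = resilience_index([0.0, 10.0, 20.0], [0.6, 0.8, 1.0])
t_rep = expected_repair_time([0.5, 0.3, 0.2], [0.0, 30.0, 90.0])
print(r, t_rep, ubi(0.6, 0.8), dci(0.6, 0.3))
```

Ranking components by UBI and DCI is one way to shortlist retrofit candidates before running full upgrade scenarios.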
Advantages and scalability: CIRAF accepts multiple demand sources without relying on extensive sampling, handles interdependencies via CPTs and BNs, and allows large networks to be decomposed with transfer nodes. In a scalability study, a single system with 2501 nodes (assuming 2 failure modes) was processed in under 4 seconds on a machine with 16 GB of memory and a 2.3 GHz dual-core Intel Core i5; a configuration of three connected systems was analyzed in under 12 seconds. A web application implements the framework for defining systems and visualizing resilience and the other indicators.
Key Findings
- The CIRAF framework quantifies seismic fragility, functionality loss, repair time, repair cost, and resilience for single and interconnected systems using Bayesian Networks and CPTs, with demands sourced from FE analysis, IoT, or analytical methods.
- Case study: A simplified nuclear power plant (NPP) on the east coast of North America subjected to 10,000-year return period earthquakes (0.5% in 50 years). FE analysis (ABAQUS) provided demand values for the reactor building (containment wall and internal structure); demands for other components were assumed. Recovery started immediately (Tw = 0). All components were assigned illustrative upgrade parameters UCR = 10 and UF = 1.5.
- Four upgrade scenarios were evaluated: (1) As-built; (2) Upgrade turbine building; (3) Upgrade turbine building and switchyard; (4) Upgrade turbine building, switchyard, generators, and high-pressure heaters. Upgrades were applied until system functionality reached 95%.
- Results indicate that increasing the number of upgraded components reduces repair costs and downtime, increasing system functionality and resilience. However, upgrade cost increases with scope, creating a trade-off for decision-makers.
- The framework demonstrated computational efficiency: a system with 2501 nodes (2 failure modes) was processed in under 4 seconds on a modest laptop, and a large system split into three connected subnetworks was analyzed in under 12 seconds.
- The framework and tool allow computation of performance indicators (resilience, functionality over time, repair time, repair cost, UBI, DCI) to identify critical components and compare retrofit strategies.
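The hazard level quoted for the case study (a 10,000-year return period, stated as 0.5% in 50 years) can be checked with the usual Poisson occurrence assumption; that assumption is standard practice but is not stated in this summary. With return period T_R and exposure time t:

```latex
P(\text{at least one exceedance in } t)
  = 1 - \exp\!\left(-\frac{t}{T_R}\right)
  = 1 - \exp\!\left(-\frac{50}{10\,000}\right)
  \approx 0.00499 \approx 0.5\%
```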
Discussion
The findings show that CIRAF effectively addresses the need to quantify resilience beyond life-safety-based code provisions by modeling interdependencies and propagating component fragility to system-level performance via Bayesian Networks. In the NPP case, targeted upgrades systematically improved functionality and reduced repair costs, illustrating the utility of UBI/DCI metrics for prioritizing retrofits. The observed trade-off between reduced repair costs and increased upgrade expenditures underscores the framework’s role in supporting cost-benefit analyses for stakeholders. Computational scalability suggests practicality for large systems or systems-of-systems. By accommodating multiple demand sources (FEA, IoT, analytical), CIRAF can be integrated into both design and operational monitoring contexts. Overall, the framework translates component-level fragility and recovery into system-level resilience trajectories, enabling quantitative comparison of upgrade strategies and recovery planning.
Conclusion
The paper presents CIRAF, a generic framework and web-based tool to assess seismic fragility and resilience of civil infrastructure systems and systems-of-systems. Using CPTs and Bayesian Networks, it models probabilistic dependencies among structural and non-structural components to compute functionality loss, repair time, repair cost, and resilience. Demands can be obtained from FE analyses, IoT sensors, or analytical methods. A simplified nuclear power plant case study demonstrated the framework’s ability to compare upgrade strategies and quantify trade-offs between upgrade cost and reduced repair cost/downtime, achieving up to 95% system functionality with progressive upgrades. The framework is computationally efficient and supports decision-making via performance indicators (UBI, DCI, etc.). The authors note that the relevance of assessments depends on expert definition of failure modes, fragility, and recovery models, and that the framework provides a generic structure requiring careful, domain-informed inputs. Future applications can extend to diverse infrastructure types and multiple hazards once appropriate input models are defined.
Limitations
- The resilience assessment emphasizes physical losses from structural and non-structural component damage; indirect socio-economic losses, functional interdependencies beyond those modeled, and other non-physical losses were not comprehensively included.
- Recovery modeling uses generic functional forms (linear, exponential, cosine, multi-step) and assumes parameter values that may not capture complex, time-varying restoration dynamics (e.g., resource constraints, policy delays beyond Tw).
- Some fragility functions and parameters (median capacities, dispersions, repair times) were assumed due to limited published data; several fragilities were fixed or simplified.
- In the case study, many demand values for non-FE-modeled components were assumed; recovery initiation wait time was set to zero for simplicity; upgrade parameters (UCR, UF) were illustrative and uniform across components.
- System/sub-system failure modes and CPTs were defined at a high level based on expert judgment, which may introduce bias and limit granularity.
- System repair time at higher levels was taken as the maximum of component repair times, not accounting for potential additive or sequencing effects.
- The case study NPP model was simplified with a limited set of components and assumed dependencies, which may limit generalizability of numerical results (though the framework is general).