Resilient Fault Detection and Recovery Mechanisms for Safety-Critical Industrial Control Systems under Cyber-Physical Threats
Abstract
Cyber-physical systems (CPS) that control critical industrial infrastructure face increasingly sophisticated threats that can compromise both security and safety functions. These systems require robust fault detection and recovery mechanisms capable of maintaining operational integrity under a range of attack vectors and environmental disturbances. This paper presents a novel framework for resilient fault detection and recovery in safety-critical industrial control systems that integrates model-based anomaly detection with adaptive reconfiguration strategies. We demonstrate that combining formal verification methods with stochastic process modeling improves detection accuracy by 27% and reduces false positives by 42% compared to conventional approaches. The proposed recovery mechanism implements a hierarchical decision-making architecture that prioritizes safety-critical functions while gracefully degrading non-essential operations, achieving a mean time to recovery of 3.8 seconds in experimental evaluations. We validate the approach using both hardware-in-the-loop simulation and testing on an operational testbed representing a chemical processing facility under multiple attack scenarios. Results indicate that the proposed methodology maintains critical safety margins even when 68% of the sensing infrastructure is compromised, significantly outperforming existing redundancy-based approaches while requiring minimal additional computational resources.
License
Copyright (c) 2025 authors

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.