Monitoring ICS networks for potential security incidents is an important element of any mature ICS cybersecurity program. However, until recently, implementing intrusion or anomaly detection on ICS networks was not very practical because commercially available intrusion detection systems (IDS), designed for enterprise IT networks, were not capable of analyzing the unique protocols used in industrial […]
Is our industry stuck in the past? The current industry trend is to consider only random hardware failures in safety integrity level (SIL) probability of failure on demand (PFD) calculations. Few, it would appear, update their assumptions as operating experience is gained. Hardware failure rates are generally fixed in time, assumed to be average point values (rather than distributions), and either generic in nature or specific to a certain set of hardware and/or conditions that vendors determine by suitable tests or failure mode analysis. But are random hardware failures the only thing that causes a safety instrumented function (SIF) to fail? What if our assumptions are wrong? What if our installations do not match vendor assumptions? What else might we be missing? How are we addressing systematic failures?
One obvious problem with incorporating systematic failures is their non-random nature. Many functional safety practitioners claim that systematic errors are addressed (i.e., minimized or eliminated) by following all the procedures in the ISA/IEC 61511 standard. Yet even if the standard were strictly adhered to, could anyone realistically claim a 0% chance of a SIF failing due to a human factor? Some will say that systematic errors cannot be predicted, much less modeled. But is that true?
This paper will examine factors that tend to be ignored when performing hardware-based reliability calculations. Traditional PFD calculations are merely a starting point. This paper will examine how to incorporate systematic errors into a SIF’s real-world model. It will cover how to use Bayes’ theorem to capture data after a SIF has been installed, whether from operating experience or industry incidents, and update the function’s predicted performance. This methodology can also be used to justify prior use of existing and non-certified equipment.
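The update step described above can be sketched with a conjugate Beta-Binomial model, a common way to combine a calculated PFD (the prior belief) with observed demands or proof tests. The prior parameters and the failure counts below are illustrative assumptions, not figures from the paper:

```python
# Hedged sketch: Beta-Binomial update of a SIF's estimated PFD.
# A Beta(alpha, beta) prior encodes the initial calculated PFD;
# observed failures and demands shift the posterior.

def update_pfd(prior_alpha: float, prior_beta: float,
               failures: int, demands: int) -> float:
    """Posterior mean PFD after observing `failures` in `demands`
    opportunities (proof tests or real process demands)."""
    post_alpha = prior_alpha + failures
    post_beta = prior_beta + (demands - failures)
    return post_alpha / (post_alpha + post_beta)

# Prior belief: calculated PFDavg of 0.01, encoded as Beta(1, 99)
prior_mean = 1 / (1 + 99)                  # 0.01
# Illustrative operating experience: 2 failures in 40 proof tests
posterior_mean = update_pfd(1, 99, 2, 40)  # prediction degrades
```

Because the Beta distribution is conjugate to the Binomial likelihood, the update is a one-line arithmetic step, which is why this style of calculation is practical even in a spreadsheet.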
The history of high-consequence incidents in industry reveals that most accidents were the result of systematic failures, not hardware failures. Yet engineering effort often focuses on the quantifiable failures of hardware. Process Safety risk gaps are often closed or reduced by several types of Independent Protective Layers (IPLs). Two common types are Safety Instrumented Functions (SIFs) and Basic Process Control System (BPCS) functions. The SIFs typically reside within a SIL-rated programmable logic controller, and their achieved quantitative performance is calculated based on random hardware failures of the SIF hardware components. Conversely, BPCS protective layers are assigned generic industry-accepted probability of failure credits. These generic probabilities are conservatively assigned and account for unquantifiable human-induced systematic failures.
Richard E. Hanner – aeSolutions Greenville, SC
Tab Vestal – Eastman Kingsport, TN
Functional safety engineers follow the ISA/IEC 61511 standard and perform calculations based on random hardware failures. These yield very low failure probabilities, which are then combined with similarly low failure probabilities for other safety layers to show that the overall probability of an accident is extremely low (e.g., 1E-5/yr). Unfortunately, such numbers rest on frequentist assumptions and cannot be proven. Actual accidents involving control and safety system failures are typically not caused by random hardware failures; they result from the steady, slow normalization of deviation (a.k.a. drift). It is up to management to control these factors. Bayes’ theorem, however, can be used to update our prior belief (the initial calculated failure probability) based on other observed evidence (e.g., the effectiveness of the facility’s process safety management process). The results can be dramatic.
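The abstract's core move, updating a prior belief on seeing qualitative evidence, reduces to a single application of Bayes' theorem. The probabilities below are illustrative assumptions chosen only to show the mechanics, not values from the paper:

```python
# Hedged sketch: discrete Bayes' theorem update.
# H = "the SIF is in a degraded state"; E = some observed evidence,
# e.g. audit findings of weak process safety management.

def bayes_update(prior: float, p_e_given_h: float,
                 p_e_given_not_h: float) -> float:
    """Posterior P(H | E) = P(E | H) P(H) / P(E)."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1 - prior)
    return numerator / evidence

# Assumed numbers: 5% prior belief of degradation; the audit finding
# is taken to be 8x more likely if the SIF is degraded than if not.
posterior = bayes_update(prior=0.05, p_e_given_h=0.8, p_e_given_not_h=0.1)
```

Even with a small prior, evidence that is several times more likely under the "degraded" hypothesis can multiply the posterior failure belief, which is the "dramatic" effect the abstract refers to.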
Bayes’ Theorem is an epistemological statement of knowledge, versus a statement of proportions and relative frequencies. It is therefore a method that can bridge qualitative knowledge with the rare-event numbers that are intended to represent that knowledge. Bayes’ Theorem is sorely missing from the toolbox of Process Safety practitioners. This paper will introduce Bayes’ Theorem to the reader and discuss the reasons and applications for using Bayes in Process Safety related to IPLs and LOPA. While intended to be introductory (to not discourage potential users), this paper will describe simple Excel™ based Bayesian calculations that the practitioner can begin to use immediately to address issues such as uncertainty, establishing confidence intervals, properly evaluating LOPA gaps, and incorporating site specific data, all related to IPLs and barriers used to meet LOPA targets.
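One of the issues the abstract lists, establishing confidence (credible) intervals, can be sketched with nothing but the standard library. The Beta(3, 137) posterior below (a Beta(1, 99) prior updated with an assumed 2 failures in 40 demands) is illustrative, not from the paper:

```python
import math

# Hedged sketch: equal-tailed Bayesian credible interval for a
# failure probability, via a simple grid sweep of the Beta posterior.

def beta_pdf(x: float, a: float, b: float) -> float:
    """Beta(a, b) probability density at x (stdlib gamma function)."""
    c = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return c * x ** (a - 1) * (1 - x) ** (b - 1)

def credible_interval(a: float, b: float, level: float = 0.90,
                      n: int = 200_000):
    """Equal-tailed interval from a Beta(a, b) posterior, found by
    accumulating the density on a fine grid (an approximate CDF)."""
    dx = 1.0 / n
    tail = (1.0 - level) / 2.0
    cdf, lo, hi = 0.0, 0.0, 1.0
    for i in range(1, n):
        x = i * dx
        cdf += beta_pdf(x, a, b) * dx
        if lo == 0.0 and cdf >= tail:
            lo = x            # lower bound: CDF crosses the left tail
        if cdf >= 1.0 - tail:
            hi = x            # upper bound: CDF crosses the right tail
            break
    return lo, hi

# Posterior Beta(3, 137): assumed prior Beta(1, 99) + 2 failures / 40 demands
low, high = credible_interval(1 + 2, 99 + 38)
```

A spreadsheet version of the same idea, as the paper proposes, would use Excel's built-in Beta distribution functions in place of the grid sweep; the point of the sketch is that the interval quantifies uncertainty rather than reporting a single point-value PFD.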
‘Evergreen’ and ‘lifecycle’ have become two common buzzwords in our industry. They are thrown around in a variety of topics, processes, and philosophies as descriptions of how management plans should be set up. But what does it really mean to have an evergreen process? How does one keep a lifecycle alive? This is especially relevant for topics such as alarm management, where it is commonly assumed that once a plant has rationalized its entire system, alarm management is complete. This paper will deconstruct the alarm management lifecycle and pinpoint key aspects that can be integrated into process safety management systems and work processes that already exist. Tying the alarm management lifecycle to what is already being done as part of process safety and good engineering practice will help ensure it remains ‘evergreen’ and delivers the intended benefits.