Understanding safety life cycles
IEC/EN 61508 is the basis for the specification, design, and operation of safety instrumented systems (SIS)
By Kristen Barbour
The international standard IEC/EN 61508 has been widely accepted as the basis for the specification, design, and operation of safety instrumented systems (SIS). In general, IEC/EN 61508 uses a formulation based on risk assessment: An assessment of the risk is undertaken and, on the basis of this assessment, the necessary safety integrity level (SIL) is determined for components and systems with safety functions. SIL-evaluated components and systems are intended to reduce the risk associated with a device to a justifiable level or “tolerable risk.”
When considering safety in the process industry, several national, industry, and company safety standards are relevant to determining and applying safety within a process plant:
- IEC/EN 61508 (product manufacturer)
- IEC/EN 61511 (user)
- ISA-84.01 (USA) (user)
These standards need to be implemented by the process owners and operators together with the relevant health, energy, waste, machinery, and other directives. They include terms and concepts that are well known to specialists in the safety industry but may be unfamiliar to the general user in the process industries.
Essentially, the standards give the framework and direction for the application of the overall safety life cycle (SLC), covering all aspects of safety, including conception, design, implementation, installation, commissioning, validation, maintenance, and decommissioning.
The standard IEC/EN 61508 deals specifically with functional safety relating to electrical, electronic, and programmable electronic safety-related systems (E/E/PES). Manufacturers of process instrumentation interface equipment develop and validate devices following the demands of IEC/EN 61508 and provide the relevant information to enable the use of these devices by others within their SIS.
To implement their strategies within these overall safety requirements, plant operators and designers of safety systems follow the directives of IEC/EN 61511, utilizing equipment developed and validated according to IEC/EN 61508 to achieve their defined SIS.
Tips to consider when analyzing a safety loop:
Within the SLC, the various phases or steps may involve different personnel, groups, or even companies to carry out the specific tasks. For example, the steps can be grouped together and the various responsibilities understood as identified below.
The first five steps can be considered as an analytical group of activities and would be carried out by the plant owner/end user, probably working together with expert consultants:
- Concept
- Overall scope definition
- Hazard and risk analysis
- Overall safety requirements
- Safety requirements allocation
The outputs of these definitions and requirements are considered the inputs to the next stages of activity.
The implementation group comprises the next eight steps and would be conducted by the end user together with chosen contractors and equipment suppliers.
- Operation and maintenance planning
- Validation planning
- Installation and commissioning planning
- Safety-related systems: E/E/PES implementation
- Safety-related systems: other technology implementation
- External risk reduction facilities implementation
- Overall installation and commissioning
- Overall safety validation
It must be noted that while each of these steps has a simple title, the work involved in carrying out the tasks can be complex and time-consuming.
The third group is essentially one of operating the process with its safeguards and involves the final three steps. These steps are normally carried out by the plant end user and contractors:
- Overall operation and maintenance
- Overall modification and retrofit
- Decommissioning or disposal
Within the overall SLC, we are particularly interested in considering Step 4 of the implementation phase in greater detail. This step deals with the aspects of any electrical/electronic/programmable electronic systems.
Two groups, or types, of subsystems are considered within the functional safety standards:
- The equipment under control (EUC) carries out the required manufacturing or process activity
- The control and protection systems implement the safety functions necessary to ensure that the EUC is suitably safe
Fundamentally, the goal here is the achievement or maintenance of a safe state for the EUC. You can think of the “control system” causing a desired EUC operation and the “protection system” responding to an undesired EUC operation.
Do not assume that all safety functions are to be performed by a separate protection system. When any possible hazard is analyzed, and the risks arising from the EUC and its control system cannot be tolerated, it is imperative to find a method of reducing the risks to tolerable levels. In some cases, the EUC or control system can be modified to achieve the required risk reduction, but in other cases protection systems must be added. These protection systems are designated safety-related systems, whose specific purpose is to mitigate the effects of a hazardous event or to prevent that event from occurring.
One phase of the SLC is the analysis of the hazards and risks arising from the EUC and its control system. In the standards, the concept of risk is defined as a combination of
- the probable rate of occurrence of a hazard (accident) causing harm
- the degree of severity of that harm
So, risk can be seen as the product of “incident frequency” and “incident severity.” Often, the consequences of an accident are implicit within the description of an accident, but, if not, they should be made explicit. A wide range of methods is applied to the analysis of hazards and risk around the world, and an overview is provided in both IEC/EN 61511 and IEC/EN 61508. The step of clearly identifying hazards and analyzing risk is one of the most difficult to carry out, particularly if the process being studied is new or innovative. With historical plant data or industry-specific methods or guidelines, the analysis may be readily structured, but it is still complex and time-consuming.
Methods used to analyze hazards and risk include techniques such as:
- HAZOP - HAZard and OPerability study
- FME(C)A - Failure Mode and Effects (and Criticality) Analysis
- FMEDA - Failure Modes, Effects, and Diagnostic Analysis
- ETA - Event Tree Analysis
- FTA - Fault Tree Analysis
These methods, and the relevant standards used in the analysis, balance the risks associated with the EUC (i.e., the consequences and probability of hazardous events) using relevant, dependable safety functions. This balance includes the tolerability of the risk. For example, the probable occurrence of a hazard with a negligible consequence could be considered tolerable, whereas a catastrophic event would be an intolerable risk. If, in order to achieve the required level of safety, the risks of the EUC cannot be tolerated according to the criteria established, then safety functions must be implemented to reduce the risk.
The goal is to ensure that the residual risk – the probability of a hazardous event occurring even with the safety functions in place – is less than or equal to the tolerable risk. In Figure 1, the risk posed by the EUC is reduced to a tolerable level using a “necessary risk reduction” strategy. The reduction of risk can be achieved by a combination of items rather than depending upon only one safety system and can comprise organizational measures as well. These risk reduction measures and systems must achieve an “actual risk reduction” that is greater than or equal to the necessary risk reduction.
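The relationship between EUC risk, tolerable risk, and necessary risk reduction can be sketched numerically. The following is a minimal illustration, assuming risk is expressed as an event frequency per year for a fixed severity and that the risk reduction measures act independently; the function names and all figures are hypothetical and are not taken from the standards.

```python
from typing import Sequence


def necessary_risk_reduction(euc_risk: float, tolerable_risk: float) -> float:
    """Factor by which the EUC risk must be reduced to reach the tolerable risk."""
    if euc_risk <= 0 or tolerable_risk <= 0:
        raise ValueError("risks must be positive")
    # A factor of 1 means the EUC risk is already tolerable.
    return max(1.0, euc_risk / tolerable_risk)


def residual_risk(euc_risk: float, reduction_factors: Sequence[float]) -> float:
    """Risk remaining after a combination of risk reduction measures.

    Assumes the measures are independent, so their factors multiply.
    """
    risk = euc_risk
    for factor in reduction_factors:
        risk /= factor
    return risk


# Hypothetical figures: one hazardous event every 10 years without protection,
# tolerable frequency of one event every 10,000 years.
euc = 0.1
tolerable = 1e-4
required = necessary_risk_reduction(euc, tolerable)

# Two hypothetical measures: a safety system (factor 200) and an
# organizational measure (factor 10).
residual = residual_risk(euc, [200.0, 10.0])
meets_target = residual <= tolerable
```

In this sketch the combined actual risk reduction (a factor of 2,000) exceeds the necessary risk reduction (a factor of about 1,000), so the residual risk falls below the tolerable risk.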
While there may be some overall methods and mechanisms described in the safety requirements of the standards, these requirements are further broken down into specific safety functions to achieve a defined task. In parallel with this allocation to specific safety functions, a measure of the dependability or integrity of those safety functions is required.
SIL measures how reliably the safety function is performing according to a defined specification. More precisely, the IEC standards state the safety integrity of a system can be defined as “the probability (likelihood) of a safety-related system performing the required safety functions under all the stated conditions within a stated period of time.”
The specification of the safety function includes both the actions to be taken in response to the existence of particular conditions and the time for that response to take place.
Probability of failure
To categorize the safety integrity of a safety function, the probability of failure is considered – in effect, the inverse of the SIL definition, looking at failure-to-perform rather than success. It is generally easier to identify and quantify possible conditions and causes leading to the failure of a safety function than it is to guarantee the desired action of a safety function when called upon.
Two classes of SIL are identified, depending on the service provided by the safety function:
- Safety functions that are activated when required (on demand mode) – the probability of failure-to-perform correctly
- Safety functions that are in place continuously (continuous mode) – the probability of a dangerous failure is expressed in terms of a given period of time (per hour)
IEC/EN 61508 requires that when safety functions are to be performed by the E/E/PES, the safety integrity is specified in terms of a SIL. The probabilities of failure are related to one of four SILs (Table 1).
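As a sketch of how a calculated failure measure maps to a SIL, the lookup below uses the target failure measure bands as they are commonly published for IEC/EN 61508 (average probability of failure on demand for demand mode, dangerous failures per hour for continuous mode). The band values should be checked against the standard itself, and the function and constant names are hypothetical.

```python
from typing import List, Optional, Tuple

# Each entry is (SIL, lower bound inclusive, upper bound exclusive).
Band = Tuple[int, float, float]

# Demand mode: average probability of failure on demand.
DEMAND_MODE_BANDS: List[Band] = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]

# Continuous mode: probability of a dangerous failure per hour.
CONTINUOUS_MODE_BANDS: List[Band] = [
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
]


def sil_for(value: float, bands: List[Band]) -> Optional[int]:
    """Return the SIL whose band contains the value, or None if it lies outside all bands."""
    for sil, lower, upper in bands:
        if lower <= value < upper:
            return sil
    return None


demand_example = sil_for(5e-3, DEMAND_MODE_BANDS)          # falls in the SIL2 band
continuous_example = sil_for(5e-7, CONTINUOUS_MODE_BANDS)  # falls in the SIL2 band
```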
The protection functions, whether performed within the control system or a separate protection system, are referred to as safety-related systems. If, after analysis of possible hazards arising from the EUC and its control system, it is decided that there is no need to designate any safety functions, IEC/EN 61508 requires that the dangerous failure rate of the EUC control system shall be below the levels given as SIL1. So, even when a process may be considered as benign, with no intolerable risks, the control system must be shown to have a rate not lower than 10⁻⁵ dangerous failures per hour.
An important consideration for any safety-related system is the level of certainty that the required safe response or action will take place when it is needed. This is normally determined as the likelihood that the safety loop will fail to act when required, and it is expressed as a probability.
The standards apply both to safety systems operating on demand, such as an emergency shut-down (ESD) system, and to systems operating “continuously” or in high demand, such as the process control system. For a safety loop operating in the demand mode of operation, the relevant factor is the PFDavg, which is the average probability of failure on demand (PFD). For a continuous or high-demand mode of operation, the probability of a dangerous failure per hour (PFH) is considered, rather than PFDavg.
With all the factors taken into consideration, the PFDavg can be calculated. Once the PFDavg for each component part of the system has been calculated, the PFDavg of the whole system is, for components connected in series and to a good approximation when the values are small, the sum of the component PFDavg values.
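The summation can be sketched as follows. The component values are hypothetical, and the comparison against an exact series calculation assumes independent component failures; it is included only to show that, for small probabilities, summing is a slightly conservative approximation.

```python
import math
from typing import Sequence


def loop_pfd_sum(component_pfds: Sequence[float]) -> float:
    """PFDavg of a series loop as the sum of the component PFDavg values."""
    return math.fsum(component_pfds)


def loop_pfd_series(component_pfds: Sequence[float]) -> float:
    """Series result assuming independent failures: 1 minus the product of (1 - p)."""
    survive = 1.0
    for p in component_pfds:
        survive *= 1.0 - p
    return 1.0 - survive


# Hypothetical PFDavg values: sensor, input isolator, logic solver,
# output isolator, actuator.
components = [2e-3, 5e-4, 1e-3, 5e-4, 3e-3]

total = loop_pfd_sum(components)            # 7e-3 for these values
series_value = loop_pfd_series(components)  # marginally below the plain sum
```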
An FMEA is a way to document the system being considered using a systematic approach to identify and evaluate the effects of component failures and to determine what could reduce or eliminate the chance of failure. An FMEDA extends the FMEA techniques to include on-line diagnostic techniques and identify failure modes relevant to SIS design. Once the possible failures and their consequences have been evaluated, the various operational states of the subsystem can be associated using Markov models. One other factor needed for the calculation is the interval between tests, which is known as the “proof time” or the “proof test interval.” This variable may depend on not only the practical implementation of testing and maintenance in the relevant system, subsystem, or component, but also the desired end result. By varying the proof time within the model, the subsystem or safety loop may be suitable for use with a different SIL. Practical and operational considerations are often the guide when establishing proof times.
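The effect of the proof test interval can be illustrated with a widely used simplified approximation for a single-channel subsystem: PFDavg ≈ λDU × TI / 2, where λDU is the dangerous undetected failure rate and TI is the proof test interval. This is a sketch only, not the Markov modeling described above; it ignores repair time and common cause failures, and the failure rate used is a hypothetical value.

```python
HOURS_PER_YEAR = 8760


def pfd_avg_single_channel(lambda_du_per_hour: float,
                           proof_test_interval_hours: float) -> float:
    """Simplified PFDavg for one channel: lambda_DU * TI / 2."""
    if lambda_du_per_hour < 0 or proof_test_interval_hours <= 0:
        raise ValueError("failure rate must be >= 0 and interval must be > 0")
    return lambda_du_per_hour * proof_test_interval_hours / 2.0


# Hypothetical dangerous undetected failure rate, in failures per hour.
LAMBDA_DU = 2e-7

# PFDavg for proof test intervals of 1, 2, and 5 years.
pfd_by_interval = {
    years: pfd_avg_single_channel(LAMBDA_DU, years * HOURS_PER_YEAR)
    for years in (1, 2, 5)
}
```

With these hypothetical numbers, a one-year interval gives a PFDavg of about 8.8 × 10⁻⁴, while a five-year interval gives about 4.4 × 10⁻³, which shows how a longer proof time can move the same device into a different band.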
The IEC/EN 61508 standard states that a SIL can be properly associated only with a specific safety function – as implemented by the related safety loop – and not with a stand-alone instrument or single piece of equipment. In this context, it is possible to state the compliance with the requirements of a specific SIL only after analyzing the entire safety loop. However, it is possible to analyze a single building block of a typical safety loop and provide evidence that this item can be used to obtain a required SIL-rated safety loop. Since all the elements of a safety loop are interdependent in achieving the goal, it is relevant to check that each piece is suitable for the purpose.
For our example, consider a single electronic, intrinsic safety isolator as our component. Within the context of this example, the safety loop is a control system intended to implement a safety function.
Figure 2 shows a typical safety loop, including the intrinsic safety isolators (signal input and signal output) for hazardous area protection. Further, we will assume the required SIL has been determined to be SIL2. Note, this example is for reference only and does not imply that a full safety loop assessment has been performed.
Considering that the typical safety loop as shown is made of many serially connected blocks, all of which are required to implement the safety function, the available PFD budget (< 10⁻² for SIL2, see Table 1) must be shared among all the relevant blocks. For example, a reasonable, rather conservative, goal is to assign to the isolator no more than 10 percent of the available PFD budget. This results in a PFD limit at the isolator level of ~10⁻³ or 0.1 percent. However, it should be understood that the selected value is only a reasonable guess and does not imply that there is no need to evaluate the PFD at the safety loop level or that the safety contribution of the intrinsic safety isolator can be neglected.
The PFD value for the complete safety device is calculated from the values of the individual components. Since sensors and actuators are installed in the field, these are exposed to chemical and physical loading (process medium, pressure, temperature, vibration, etc.). Accordingly, the risk of faults is relatively high for these components. For this reason, 25 percent of the overall PFD is assigned to the sensors, and 40 percent to the actuators. Thus, 15 percent remains for the fault tolerant control system, and 10 percent each for the interface modules. The interface modules and control system typically have no contact with the process medium since they are normally housed in the protected control room.
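The budget split described above can be written out as a short calculation. The percentages and the SIL2 limit come from the text; the block names and the function name are hypothetical labels chosen for this sketch.

```python
from typing import Dict

# Upper PFD limit for SIL2 as used in the example above.
SIL2_PFD_LIMIT = 1e-2

# Share of the overall PFD budget assigned to each block of the loop.
BUDGET_SHARES: Dict[str, float] = {
    "sensor": 0.25,
    "input interface module": 0.10,
    "control system": 0.15,
    "output interface module": 0.10,
    "actuator": 0.40,
}


def allocate_pfd_budget(limit: float, shares: Dict[str, float]) -> Dict[str, float]:
    """Split the loop PFD limit among the blocks according to their shares."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must add up to 100 percent")
    return {block: limit * share for block, share in shares.items()}


block_limits = allocate_pfd_budget(SIL2_PFD_LIMIT, BUDGET_SHARES)
# Each interface module (isolator) receives 10 percent of the budget,
# which corresponds to the ~10^-3 limit discussed above.
isolator_limit = block_limits["input interface module"]
```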
In this example, to demonstrate that the relevant isolators are suitable for use within a SIL2 safety loop, a comprehensive FMEA was carried out. The FMEA covered 100 percent of the components and took into account the different applicable failure modes, including intermittent and “de-rating” failures. According to IEC/EN 61508, this is the recommended procedure in preference to non-quantitative or semi-quantitative approaches. As a result of the FMEA, the PFDavg can be calculated for each of the relevant isolators and is shown to be less than 10⁻³, thus enabling their possible use within this specific application.
Overall, the concept of the SLC introduces a structured approach for risk analysis and for the operation of a safe process. If safety systems are used to reduce risks to a tolerable level, then these safety systems must exhibit a specified SIL. This summarizes the approach one needs to take to establish a safety instrumented system properly and provides a basis for additional research and learning.
ABOUT THE AUTHOR
Kristen Barbour (firstname.lastname@example.org), product marketing manager for Pepperl+Fuchs of Twinsburg, Ohio, has worked in the technology field for 14 years, specializing in industrial automation. She holds a B.S. degree in education from the University of Toledo.