Using alarm suppression
Effective alarm systems improve operations
By Charlie Fialkowki
One of the signs of an effective alarm system is that it presents alarms to the operator only when they are relevant and require their response (attention). This means the alarm system is able to track the state of the process in order to know when to present the alarm and when to suppress it. Transient plant conditions, use of different feedstocks, and unplanned process upsets make this a challenge for many process applications. Modern systems provide a powerful and easy-to-configure capability for suppressing alarms dynamically based on the state of the process and/or equipment (called automatic alarm hiding). It can be used, for example, to suppress alarms when equipment is out of service or in response to a compressor trip (which would otherwise lead to an alarm flood). Overloading the operator with stale (irrelevant) alarms or alarm floods can lead to increased operator stress, missed alarms, operator error, production losses, or worse. This article discusses how to implement designed alarm suppression (a form of advanced alarming) following the best practices and recommendations of the ISA standard ANSI/ISA-18.2-2009, Management of Alarm Systems for the Process Industries, as well as EEMUA 191, Alarm Systems: A Guide to Design, Management, and Procurement, and NAMUR NA 102, Alarm Management for Process Control Industries.
Characteristics of an effective alarm system
Alarm: “An audible and/or visible means of indicating to the operator an equipment malfunction, process deviation, or abnormal condition requiring a response.” (Ref ANSI/ISA-18.2-2009)
An effective alarm system delivers the right alarm to the operator at the right time with the right importance and the right information. The first step in designing one is to review some important principles of alarm management:
- Every alarm should have a defined response.
- Each alarm should alert, inform, and guide.
- Each alarm presented to the operator should be useful and relevant.
From this we gather alarms require an operator response and must be useful and relevant. Designed suppression helps address alarms that do not follow these principles during all operating states of the plant. For example, a low-flow alarm is not useful or relevant when it is generated as a result of an outlet pump shutting off unexpectedly. In this case, the operator’s response (and attention) is focused on addressing the pump trip, not on the low flow, which is a consequence.
Alarm rationalization is the process of reviewing potential or existing alarms to determine which ones are necessary and valid. It entails finding the minimum set of alarms needed to keep the process safe and in the normal operating range. Rationalization also involves defining the attributes of each alarm (such as limit, priority, classification, and type), documenting the cause, consequence, response time, and operator action, as well as identifying alarms that require special handling (like advanced alarming). The product of rationalization is a list of alarm configuration requirements recorded in a Master Alarm Database (MADB). It is recommended the alarm system undergo rationalization before implementing advanced alarming techniques, such as designed suppression.
Advanced alarming is a collection of techniques that help manage alarm rates and make sure alarms are relevant by modifying alarm behavior dynamically. Implementation involves additional layers of logic, programming, or modeling used to modify alarm attributes.
Alarm suppression is a valuable tool
Alarm suppression, defined as preventing indication of the alarm to the operator when the base alarm condition is present, is a useful function for helping to ensure that operators are not presented with alarms unless they are relevant. There are three different types of suppression defined in the alarm management standard, ANSI/ISA-18.2-2009:
- Shelving (manual suppression): A mechanism, typically initiated by the operator, to temporarily suppress an alarm. (For example, this is manual alarm hiding in SIMATIC PCS 7.)
- Designed suppression (automatic suppression): Suppresses alarms based on operating conditions or plant states, under the control of logic that determines the relevance of the alarm. (For example, this is automatic alarm hiding in SIMATIC PCS 7.)
- Out of Service: The state of an alarm during which the alarm indication is suppressed, typically manually, for reasons such as maintenance. (For example, this is locking of alarm messages in SIMATIC PCS 7.)
This article focuses on best practices for designed suppression. Two different applications will be addressed.
Implementing designed suppression
An effective designed suppression scheme includes a state (event) detection algorithm and a suppression rule set. If the conditions defining the state detection are satisfied, then the state suppression rules will be applied to determine which alarms should be suppressed from the operator. The state detection algorithm can use a mixture of permissive conditions (AND conditions) and voters (OR conditions). A set of suppression rules is often called an alarm suppression group.
State detection algorithm (state detection logic)
The key to creating a robust and effective suppression scheme is defining the conditions that can reliably detect the state or event that triggers the suppression. The last thing you want is for alarms to be suppressed when they should not be. This is the job of the state detection algorithm (SDA), which identifies operating states, state changes, and events. Inputs to SDAs typically come from analog process values (e.g., flow, temperature, pressure, amps), digital readings (e.g., equipment commands and run status), valve positions, selectors, and outputs from other logic or computational mechanisms. The output of the SDA is an indication that a particular state is active (example: reactor taken offline) or an event has been detected (example: compressor trip).
Recommendations for design and implementation of the SDA:
- Use input from multiple sensors with at least two positive indications of state (2oo2 [2 out of 2] or 2oo3 voting).
- Avoid related measurements with a high probability of common-cause failure.
- Use deadband for analog limit alarms to prevent mode cycling (enabling/removing the suppression scheme).
- Logic should incorporate methods for handling sensor failures (example: Bad PV value) so a degraded sensor is excluded from the triggering logic.
- For static suppression only, operator manual confirmation may be included as one of the confirmation steps (until the operator is used to how the system works).
- Design should ensure only one state can be active at a time for a given alarm.
- Test the SDA on the live system before enabling the logic to modify alarm attributes.
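Several of these recommendations can be sketched together in a few lines. The following is a minimal illustration, not a vendor implementation: the `Reading`, `DeadbandLimit`, and `state_detected` names are hypothetical, and a low-limit comparator is assumed for the analog inputs. It shows a deadband on each limit check, a degraded sensor being excluded from the vote, and a requirement for at least two positive indications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    """One sensor input to the state detection logic (hypothetical structure)."""
    value: float
    good: bool = True  # False models a Bad PV status


class DeadbandLimit:
    """Low-limit comparator with deadband, so the result does not cycle near the limit."""

    def __init__(self, limit: float, deadband: float):
        self.limit = limit
        self.deadband = deadband
        self.active = False

    def update(self, reading: Reading) -> Optional[bool]:
        if not reading.good:
            return None  # a degraded sensor casts no vote
        if reading.value < self.limit:
            self.active = True
        elif reading.value > self.limit + self.deadband:
            self.active = False
        # Between limit and limit + deadband the previous result is held.
        return self.active


def state_detected(votes, required=2):
    """Return True only when at least `required` healthy inputs indicate the state."""
    return sum(1 for v in votes if v is True) >= required
```

With three inputs and `required=2` this behaves as 2oo3 voting; if one input reports Bad PV, both remaining inputs must agree before the state is declared.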
State-based (static) suppression
The presence of stale alarms, alarms that remain standing (active) for greater than 24 hours, can lead to reduced operator effectiveness. Stale alarms can clog the HMI display, making it more difficult to detect new alarms, leading to operators becoming numb to the alarm system and ignoring important alarms.
State-based suppression (otherwise known as static alarm suppression) is a technique for suppressing alarms that are always active and not meaningful when a process area, unit, or piece of equipment is in a particular operating state (mode). It can be effective at preventing stale alarms.
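Before designing state-based suppression, it can help to find out which alarms are actually going stale. The sketch below applies the 24-hour definition given above to a snapshot of standing alarms; the tag names and the dictionary format are assumptions made for illustration.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=24)  # threshold taken from the definition above


def stale_alarms(standing, now):
    """Return tags of alarms that have been standing longer than STALE_AFTER.

    `standing` maps alarm tag -> time the alarm became active (hypothetical format).
    """
    return sorted(tag for tag, since in standing.items() if now - since > STALE_AFTER)
```

Alarms that repeatedly appear in this list while a unit is offline are natural candidates for a suppression group, subject to the case-by-case review described below.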
Recommendations for design and implementation of state-based suppression:
- Suppression groups (the alarms in a rule set) should not span disparate process areas, units, or pieces of equipment. This makes the implementation easier and helps the operator to understand the behavior.
- It is important to examine each individual alarm condition to determine whether it can be suppressed during a given state. For example, it may be important for safety critical alarms to remain enabled even when the equipment is offline. There are numerous examples of industrial accidents that have occurred when equipment was offline, and alarms were suppressed that should not have been.
Consider what options should be available for deactivating suppression:
- a. Automatically turned off when the trigger conditions for the state have returned to normal
- b. Automatically after a specified maximum suppression time has elapsed
- c. Manually discontinued from the HMI with appropriate security privileges
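The three deactivation options can be combined in one small piece of logic. This is a sketch under stated assumptions: the `SuppressionGroup` class and its field names are hypothetical, times are plain seconds, and the security check for a manual release is assumed to happen in the HMI before `manual_release` is set. One design choice is made explicit here: after a timeout or manual release, suppression stays off until the state clears, so it does not immediately re-engage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SuppressionGroup:
    """Tracks whether one suppression group is active (illustrative names)."""
    max_duration_s: Optional[float] = None  # option b; None means no time limit
    active: bool = False
    locked_out: bool = False
    started_at: float = 0.0

    def update(self, state_active: bool, now: float, manual_release: bool = False) -> bool:
        if not state_active:
            # Option a: trigger conditions returned to normal; this also re-arms the group.
            self.active = False
            self.locked_out = False
            return False
        if self.locked_out:
            return False
        if not self.active:
            self.active, self.started_at = True, now
        timed_out = (self.max_duration_s is not None
                     and now - self.started_at >= self.max_duration_s)
        if timed_out or manual_release:
            # Options b and c: release and stay released until the state clears.
            self.active = False
            self.locked_out = True
        return self.active
```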
Reactor offline example: During normal operation, the reactor is capable of generating various temperature, pressure, level, and flow alarms to indicate abnormal reactor conditions when making product. When the reactor is taken out of production, offline alarms may be triggered that are not relevant. It is beneficial to suppress these stale alarms. Some alarms, however, may still be required to protect the safety of personnel and equipment. For example, a high-pressure alarm may still be required to detect leaks through an isolation valve to protect workers from the hazards of working in confined spaces when they are performing maintenance.
Alarm flood (dynamic) suppression
Plant upsets can generate tens to hundreds of alarms, making them some of the most challenging times for operators. The investigation of the Milford Haven refinery explosion found operators were inundated with 275 alarms in the 11 minutes leading up to the explosion. During alarm floods, when the operator receives more than 10 alarms in a 10-minute period, the alarm system can become a nuisance, a hindrance, or a distraction, rather than a useful tool. Alarm floods diminish the operator’s effectiveness and can lead to missing critical alarms.
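The flood criterion quoted above (more than 10 alarms in a 10-minute period) is straightforward to check against an alarm log. The following is a minimal sliding-window sketch; the `FloodDetector` class is hypothetical, and timestamps are assumed to be seconds arriving in chronological order.

```python
from collections import deque


class FloodDetector:
    """Sliding-window count of annunciated alarms for one operator position."""

    def __init__(self, threshold=10, window_s=600):
        self.threshold = threshold  # a flood is more than this many alarms...
        self.window_s = window_s    # ...within this many seconds
        self._times = deque()

    def on_alarm(self, t: float) -> bool:
        """Record an alarm at time t (seconds) and return True if a flood is in progress."""
        self._times.append(t)
        # Discard alarms that have fallen out of the window.
        while self._times and t - self._times[0] >= self.window_s:
            self._times.popleft()
        return len(self._times) > self.threshold
```

Run against historical alarm data, a check like this can show which upsets produce floods and are therefore worth a dynamic suppression group.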
Alarm flood suppression (also known as dynamic alarm suppression) is the dynamic management of pre-defined groups of alarms based on detection of equipment state and triggering events. It is a technique for suppressing alarms following an event (such as the crash of a distillation column) when they are not relevant or meaningful to the operator.
Recommendations for design and implementation of alarm flood suppression:
- Provide the ability to manually enable/disable suppression through the HMI.
- Consider implementing a common (group) alarm (e.g., “Compressor Trip”) to more effectively communicate the problem to the operator.
Consider what options should be available for deactivating suppression:
- a. Automatically turned off when the trigger conditions for the state have returned to normal
- b. Automatically after a specified maximum suppression time has elapsed
- c. Manually discontinued from the HMI with appropriate security privileges
- Define the desired behavior when suppression is deactivated. Should alarms currently active and suppressed remain suppressed?
- Consider whether first-out event detection logic is required for capturing and indicating through HMI which conditions first became TRUE to satisfy the event-detection logic.
- Consider whether it is beneficial to check for alarms that are not in their expected state. Provide the ability to detect, indicate, or generate an alarm when an alarm condition that is being suppressed as part of the algorithm (and is expected to be active) is not active.
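The last two recommendations, first-out detection and checking for expected alarms that are missing, reduce to simple comparisons once the trigger times and the active alarm set are available. The helper names and data formats below are assumptions for illustration only.

```python
def first_out(trigger_times):
    """Return the trigger condition that became TRUE first, or None if none did.

    `trigger_times` maps condition name -> time it became TRUE (None if it never did).
    """
    seen = {name: t for name, t in trigger_times.items() if t is not None}
    return min(seen, key=seen.get) if seen else None


def missing_expected(expected, active):
    """Return alarms the suppression group expects to be active but that are not."""
    return sorted(set(expected) - set(active))
```

A non-empty result from `missing_expected` may indicate that the event is not what the detection logic assumed, or that a sensor has failed, which is why it can be worth bringing to the operator's attention.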
Distillation column crash example: When a distillation column “crashes,” it is best to present only those few alarms that affect the operator’s diagnosis and response, rather than all of the temperature and pressure alarms that may obscure important alarms from upstream and downstream parts of the system.
Compressor trip example: A common upset condition for many plants is a compressor trip. Compressors typically have many individual alarm conditions that are used to indicate abnormal conditions. These include low oil pressure, suction pressure, and speed, and high discharge temperature, bearing temperature, and current. Compressor trips can be caused by many conditions; for example, an electrical surge can trip the compressor, causing it to spin down and trigger numerous alarms (an alarm flood). Many of these alarms are not meaningful and can obscure alarms that might be meaningful (like those that are downstream/upstream in the process). Therefore, suppressing them can help the operator respond more quickly, get to the root cause, and address the situation before it impacts other areas of the process. In some cases, it is beneficial to raise a common alarm (indicating that the compressor has tripped) in addition to suppressing individual alarms.
How designed suppression (automatic hiding) works
When an abnormal condition is detected in the automation system (AS), an alarm is generated by a function block within a control module. The alarm message is communicated from the AS to the operator station (OS), at which point it is sent in parallel to the archive and passed on for display to the operator. All alarms are logged in the archive, whether they are suppressed or not. This facilitates post-incident investigation and sequence-of-events review.
Before displaying an alarm to the operator, the OS checks whether the alarm should be hidden (suppressed) based on the current state. The suppression rule set is defined in the Process Object view of the engineering station (ES) in the form of a matrix of alarms vs. states. Configuring an alarm for hiding is as simple as checking or unchecking the suppress box within the ES for each alarm/state combination.
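The flow described above, archive everything and then consult the alarm/state matrix before display, can be modeled generically. This sketch is not the SIMATIC PCS 7 configuration or API; the matrix contents, tag names, and the `handle_alarm` function are hypothetical and only illustrate the logic.

```python
# Suppression rule set as a matrix of alarm vs. state (True = hide in that state).
# Tags and state names are hypothetical.
HIDE_MATRIX = {
    "TI-101.HI": {"normal": False, "offline": True},
    "FI-102.LO": {"normal": False, "offline": True},
    "PI-103.HI": {"normal": False, "offline": False},  # stays visible in every state
}

archive = []        # every alarm is logged here, hidden or not
operator_list = []  # only alarms that are not hidden reach the operator


def handle_alarm(tag, state):
    """Archive the alarm, then show it unless the matrix hides it in this state."""
    hidden = HIDE_MATRIX.get(tag, {}).get(state, False)  # unknown alarms are shown
    archive.append((tag, state, hidden))
    if not hidden:
        operator_list.append(tag)
    return hidden
```

Defaulting unknown alarm/state combinations to "shown" is a deliberate choice in this sketch: an alarm should only be hidden when a rule explicitly says so.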
Implementation of alarm flood suppression
In this example, the condition of the motor (running or not running) and the status of a process interlock are used to detect the compressor trip condition. The compressor is assumed to have two states: normal and tripped. If the compressor trips, all expected alarms (e.g., high bearing temperature, low discharge flow, low discharge pressure, etc.) are to be suppressed. A common alarm is to be presented to the operator and recorded in the archive for post-trip investigation.
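The compressor example can be expressed as a short sketch. Two points here are assumptions rather than part of the description above: the trip is declared only when the motor is not running and the interlock is active (a 2oo2 combination), and the alarm names are plain strings chosen for readability.

```python
# Alarms expected as a consequence of the trip (taken from the example above).
EXPECTED_ON_TRIP = {"high bearing temperature", "low discharge flow", "low discharge pressure"}
COMMON_ALARM = "Compressor Trip"


def compressor_state(motor_running: bool, interlock_active: bool) -> str:
    """Assumed trip logic: motor stopped AND process interlock active."""
    return "tripped" if (not motor_running and interlock_active) else "normal"


def alarms_for_operator(active_alarms, motor_running, interlock_active):
    """Return the alarms to display: expected trip alarms hidden, common alarm added."""
    if compressor_state(motor_running, interlock_active) == "normal":
        return sorted(active_alarms)
    shown = set(active_alarms) - EXPECTED_ON_TRIP
    shown.add(COMMON_ALARM)
    return sorted(shown)
```

Alarms outside the expected set, such as those from upstream or downstream equipment, pass through untouched, which is the point of the technique.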
Operator situation awareness
For operators to trust the alarm system, they must be able to tell when suppression is active and what alarms are affected. The ANSI/ISA-18.2 standard requires that if alarm suppression is implemented, a list of suppressed alarms be available to the operator. Typically, this is accomplished with a hidden alarm list, which indicates alarms that are currently suppressed either automatically or manually (operator shelving). Alarms affected by designed suppression are typically displayed in a list by the DCS.
Best practices would dictate that reviewing the suppressed alarm list should become a regular task during operator shift change. In some cases, it is useful to have an HMI graphic depicting the status of the inputs to an SDA and the resulting output.
Stale alarms and alarm floods can negatively impact operator performance. Implementing designed suppression techniques can help alleviate these conditions and ensure that alarms presented to the operator are relevant and actionable. A modern DCS provides robust and easy-to-configure capabilities for implementing designed suppression, allowing users to effectively implement advanced alarming techniques within their system.
ABOUT THE AUTHOR
Charlie Fialkowki is a National Process Safety Manager for Siemens Industry Inc. and a Certified Functional Safety Expert with 20 years of safety system design and engineering experience. He is a voting member for the ISA safety standards committee (ISA84), an active ISA and NFPA member, and an ISA instructor for safety instrumented systems.