It's in the pipes
Managing alarms for pipeline operations
By Kelly Doran
The alarm problem for pipeline operators, as in other process control industries, is steadily evolving. Field information you can economically add to SCADA systems is increasingly available. Yet this extra information brings more alarms and alerts that, if left unchecked, can interfere with a controller’s ability to respond appropriately. In pipeline control centers around the world, it is not uncommon to see alarm summary screens filled with multiple pages of acknowledged and unacknowledged alarms, meaning pipeline operators are working in a constant alarm-flood condition. Alarm management is therefore best treated as a standard practice that is never really finished; it becomes an ongoing part of maintaining a good control room management program.
As a result of new regulations, operators are now focusing on the good engineering practice of sound alarm management. The days of pipeline controller consoles filled with an overwhelming number of alarms will soon be gone. In addition to reduced controller stress, improved alarm management will deliver quieter control rooms, improved situational awareness, better alarm flood control/avoidance, better use of assets, regulatory compliance, maintenance savings, and increased pipeline uptime. Avoiding unplanned outages matters because one pipeline shutdown can erase the planned gains from a year of process improvements.
Pipeline SCADA systems have additional challenges for alarm handling due to the inherent latency of the data and the alarms associated with data reliability caused by intermittent communications outages.
In addition to the rise in alarm activity because of supervising an increased volume of information, the pipeline industry faces these more common systemic issues:
Running pipelines/units harder increases the need for alarms
Years of lower profits mean reduced levels of maintenance
Industry incidents increase safety requirements
Reliance on monitoring technology increases, including security monitoring
Downsizing causes work overload with not enough time to complete all tasks
Not enough qualified people to determine and configure alarms and settings
Most of these factors are outside the domain of pipeline operations, but all play a part in the overall alarm load for pipeline controllers.
Possible leak alarms
Among the most critical alarms for pipeline operators are leak/possible leak alarms. When a leak alarm is received, the controller immediately begins an investigation, using all available tools to verify that the presented information is valid. Depending on the sophistication of the leak detection system employed and how you handle communication problems, metering, and telemetry uncertainty combined with pipeline transients, you could generate false alarms. But although they are false, they are alarms nonetheless and play a vital part in the safe operation of the pipeline.
The frequency of false alarms and the appropriate response should be part of any control room management program. Any alarm management program designed to reduce alarms should include a specific philosophy for leak alarms that could differ from other operational alarms.
The demand for improved alarm management on SCADA systems used on pipelines stems from the Pipes Act of 2006. Section 19 refers to implementing the recommendations contained in the National Transportation Safety Board’s (NTSB) report, “Supervisory Control and Data Acquisition in Liquid Pipelines.”
The Pipes Act states the Secretary of Transportation shall issue standards to implement the following NTSB recommendations:
Implementation of the American Petroleum Institute’s Recommended Practice 1165 for the use of graphics on the supervisory control and data acquisition screens.
Implementation of a standard for pipeline companies to review and audit alarms on monitoring equipment.
Implementation of standards for pipeline controller training that include simulator or non-computerized simulations for controller recognition of abnormal pipeline operating conditions, in particular, leak events.
On 12 September 2008, the Pipeline and Hazardous Materials Safety Administration (PHMSA) released its Notice of Proposed Rulemaking, which covers alarm management and other control room management issues. The proposed rule covers operators of gas pipelines, hazardous liquids pipelines, and liquefied natural gas facilities and focuses on the control room management (CRM) issues identified in the Pipes Act and the requirements for a CRM plan. Keep these issues in mind when developing your CRM plan.
Several detailed provisions relating to alarm management are intended to ensure controllers respond appropriately to alarms and notifications. Review SCADA operations at least once per week, and review SCADA configuration and alarm management operations at least once each calendar year at intervals not to exceed 15 months. Include identification of abnormal or emergency operating conditions and a review of controller response actions.
You are required to provide adequate and accurate information to the controller, including a point-to-point baseline verification between field equipment and all SCADA system displays to verify 100% of the displays. You should complete baseline verification within one year after final rule issuance for operators with fewer than 500 miles of pipeline. Operators with 500 miles or more are allowed three years to complete this baseline verification after final rule issuance.
Record critical information during each controller shift and establish sufficient overlap of controller shifts to permit the exchange of necessary information.
For change management, establish communications between controller, management, and field personnel when planning and implementing physical changes to pipeline equipment and configurations. Coordinate SCADA system modifications in advance to allow time for controller training and familiarization.
Establish a definition or threshold for close-call events to evaluate event significance. Conduct a review for significant events. Operators must share information with all controllers.
The proposed rule requires pipeline operators with SCADA systems to follow SCADA display standard API RP–1165 in its entirety or be able to demonstrate the recommended practice is inapplicable or impracticable.
A senior executive officer should validate and sign each operator’s CRM plan.
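The annual review interval described above (at least once each calendar year, at intervals not to exceed 15 months) can be checked with simple date arithmetic. The following is a minimal sketch, not compliance tooling: the function names are hypothetical, and reading "15 months" as calendar months counted from the previous review date is an assumption you should confirm against the final rule.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole calendar months, clamping to the month's last day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def review_schedule_problems(review_dates, max_months=15):
    """Return a list of human-readable problems found in a review history.

    Assumption: a gap is too long when the next review falls after the
    previous review date plus max_months calendar months.
    """
    dates = sorted(review_dates)
    problems = []
    # Check the interval between each pair of consecutive reviews.
    for prev, cur in zip(dates, dates[1:]):
        if cur > add_months(prev, max_months):
            problems.append(
                f"{prev} to {cur}: more than {max_months} months between reviews"
            )
    # Check that every calendar year in the covered span has a review.
    if dates:
        years_with_review = {d.year for d in dates}
        for year in range(dates[0].year, dates[-1].year + 1):
            if year not in years_with_review:
                problems.append(f"{year}: no review in calendar year")
    return problems
```

Passing the dates of completed reviews returns an empty list when both conditions hold, which makes the check easy to include in a periodic report.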
Comments received during the official comment period have resulted in some anticipated changes to the final rule regarding weekly SCADA operations review as well as the proposed point-to-point baseline verification between field equipment and all SCADA system displays. The current publication date for the final rule is 27 November.
Symptoms of alarm system problems
If an alarm system has one or more of these issues, it could see improvement with a comprehensive alarm management program:
Incidents or near-incidents where controllers missed key data provided (or not) by the alarm system
Large number of high-priority alarms
Alarms that persist for long periods of time
Alarms going off and on regularly or intermittently (chattering or transient)
Lost track of alarm set points and the reason they were set there in the first place
Not knowing which alarms are safety, operational, environmental, informational
Controllers do not know what to do when a particular alarm occurs
No record of when/if the alarms were tested
No procedure or policy on alarm creation; i.e., anyone can create an alarm or change the limits based on his or her own authority
Alarm documentation and policies are out of date or nonexistent
Alarm handling vs. alarm management
SCADA systems have varying degrees of sophistication in their native alarm handling capabilities. With the ultimate goal of reducing the alarm load on controllers, along with improving the situational awareness that results from less alarm noise, a comprehensive alarm management lifecycle will look for opportunities for improvements in the handling and the management of the alarms generated. By addressing alarm handling and improved alarm management, pipeline operators can expect significant improvements in their alarm system performance.
Management standards, best practices
Alarm management is recognized as good engineering practice, and as such, there are reference standards and guidelines available to assist in the implementation of a comprehensive alarm management program.
The Engineering Equipment and Materials Users Association Publication 191 offers direction on designing, managing, and procuring an effective alarm system. It is based on what leading companies are doing and promotes continuous improvement in alarm management practices.
ISA alarm management standard
ISA18 on instrument signals and alarms addresses the development, design, installation, and management of alarm systems in the process industries. It defines the terminology and models to develop an alarm system as well as the work processes recommended to maintain the alarm system throughout the lifecycle. To conform to this standard, you must show you have satisfied each of the requirements in the normative clauses.
American Petroleum Institute
An American Petroleum Institute (API) workgroup is responsible for developing the alarm management recommended practice API RP 1167. The expected outcome is a consensus document that addresses definitions, effective alarm system design, good alarm audit and review practices, and strategies to minimize overload and nuisance alarms. Members of the American Gas Association participate in the API 1167 workgroup and have written an alarm management whitepaper to address specific alarm management practices for the gas industry.
Implementation of the recommendations in these documents should satisfy the Pipes Act requirements and the alarm management requirements of the PHMSA Control Room Management/Human Factors Rule. Publication of API RP 1167 is expected in November.
Alarm management lifecycle
Alarm management lifecycle activities include alarm system specification, design, implementation, operation, monitoring, and maintenance. They can apply to new alarm systems or to better manage existing alarm systems.
Design and implement
The design and implementation phase of an alarm management program consists of identifying current alarm activity benchmarks. Follow this with a rationalization exercise to begin the process of alarm improvement. Rationalization is the process of reviewing a candidate alarm against the principles of the alarm philosophy and documenting the rationale for the alarm. Rationalization alone only provides some alarm reduction along with improved alarm documentation, and these results have proven to be temporary at best, as it does not resolve the issues that allowed the alarm system to get to its current state. Alarm design documentation does not necessarily lead to good alarm system performance. Rationalization will determine the design parameters you should implement in the first iteration of the program.
We typically design alarm performance benchmarks to measure the effect of alarm activity on controller performance. Human factors research indicates too much information is just as harmful as too little. In the context of SCADA alarms, this means we need to find a way to determine the sweet spot between too many alarms (too much information) and too few.
In addition to the alarm activity metrics used to find the appropriate amount of alarm activity, other classes of metrics exist that accurately expose shortcomings in alarm design and implementation, including identifying those alarms with little or no operational value. Using a mix of the various types of alarm performance metrics builds a better foundation from which to evaluate the performance, health, and value of an alarm system.
Controller alarm loading
The most widely accepted mechanism for determining the appropriate amount of alarm activity for a given operation is through an evaluation of controller loading. Monitoring the impact of alarms on an operating team helps determine the maximum thresholds for alarm activity in the context of all other operator responsibilities. Operators can do their job more effectively when alarm activity complies with established benchmarks.
The common key performance indicators used to evaluate the performance of an alarm management program include:
Manageable steady state: The maximum rate at which a single controller can effectively address alarms.
Flood state: The rate at which a single controller is overwhelmed by alarm activations.
Average process alarm rate: The average rate at which a single controller can be expected to perform as required.
Percent of time alarms exceed target average rate
Peak alarm hourly rate: Rate for the most active hour within the evaluated time period.
Peak alarm minute rate: Target rate for the most active minute within the evaluated time period.
Alarm activity priority distribution: The suggested approximate distribution of alarm activity by priority (e.g., 5% high priority, 15% medium priority, and 80% low priority).
Alarms within 10 minutes of a major upset: The maximum rate in the 10-minute period following a major upset.
Chattering alarms: Determine chattering alarms by specifying a minimum number of times an alarm must activate within a defined period.
Stale alarms: Alarms active for an excessive period of time.
Average alarms/controller: A configuration target encouraging design discipline.
Unauthorized changes to alarm settings: Identifies unauthorized changes and supports enforcement of a strong management of change policy.
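Several of the rate-based indicators in this list can be computed directly from an alarm activation log. The sketch below is illustrative only: the function name, the log layout (a timestamp and a priority label per activation), and the target value passed in are assumptions, not values taken from any standard or SCADA product.

```python
import math
from collections import Counter
from datetime import datetime, timedelta

ONE_HOUR = timedelta(hours=1)


def alarm_rate_kpis(activations, start, end, target_per_hour):
    """Summarize alarm activity for one controller over [start, end).

    activations: iterable of (timestamp, priority) tuples (hypothetical layout).
    target_per_hour: the site's own target average rate (an assumption here).
    """
    if end <= start:
        raise ValueError("end must be after start")
    hours = math.ceil((end - start) / ONE_HOUR)
    per_hour = [0] * hours
    by_priority = Counter()
    # Bucket each activation into the hour of the window it falls in.
    for ts, priority in activations:
        if start <= ts < end:
            per_hour[int((ts - start) / ONE_HOUR)] += 1
            by_priority[priority] += 1
    total = sum(per_hour)
    hours_over = sum(1 for count in per_hour if count > target_per_hour)
    return {
        "average_per_hour": total / hours,
        "peak_hourly_rate": max(per_hour),
        "pct_hours_over_target": 100.0 * hours_over / hours,
        "priority_distribution_pct": {
            p: 100.0 * n / total for p, n in by_priority.items()
        },
    }
```

Running this per console over the same evaluation period each month gives comparable figures for trend reporting. The window is assumed to be a whole number of hours; a partial final hour is counted as a full hour in the averages.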
A thorough understanding of the alarm key performance indicators supports good alarm system design and good alarm management practices. The richness of alarm activity data, joined and contrasted with other data resources, yields compelling insight into alarm system health and exposes opportunities for productivity gains through changes to the alarm design.
Maintaining control of controller loading is the foundation of alarm management. Complementing alarm activity metrics with additional event and configuration based analysis strengthens the alarm management process and improves the efficiency with which you can locate and address alarm-related problems.
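As one example of event-based analysis, chattering and stale alarms can be picked out of an event log with a few lines of code. This is a minimal sketch under stated assumptions: the function names, the tag names in the usage note, and the thresholds (minimum activation count, time window, maximum age) are hypothetical and would come from your own alarm philosophy.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_chattering(activations, min_count, window):
    """Return tags that activate at least min_count times within any
    time span of length window.

    activations: iterable of (timestamp, tag) tuples (hypothetical layout).
    """
    if min_count < 1:
        raise ValueError("min_count must be at least 1")
    times_by_tag = defaultdict(list)
    for ts, tag in activations:
        times_by_tag[tag].append(ts)
    chattering = set()
    for tag, times in times_by_tag.items():
        times.sort()
        # Compare the first and last of every run of min_count activations.
        for i in range(len(times) - min_count + 1):
            if times[i + min_count - 1] - times[i] <= window:
                chattering.add(tag)
                break
    return chattering


def find_stale(active_since, now, max_age):
    """Return tags whose alarm has been continuously active longer than max_age.

    active_since: dict mapping tag -> time the still-active alarm was raised.
    """
    return {tag for tag, since in active_since.items() if now - since > max_age}
```

For instance, calling find_chattering(log, min_count=3, window=timedelta(minutes=1)) flags any tag that activated three or more times within one minute, and find_stale(active, now, timedelta(hours=12)) flags alarms standing for more than 12 hours.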
An important part of any alarm management lifecycle is auditing and assessing the program. This becomes mandatory under the PHMSA Control Room Management/Human Factors proposed rule. The proposed rule requires operators to undertake a detailed review of alarm management, including the monitoring of the number of alarms, potential systemic issues related to field equipment or the SCADA system, issues resulting in excessive or unusual alarms, unnecessary alarms, changes in controller performance in response to alarms, and a review of alarm set-point values.
Pipeline operators must use this information to evaluate and mitigate controller workload with respect to the number and nature of alarms received. Alarms indicating ongoing maintenance issues or communication problems should be resolved. It is important to not ignore known problems that continually cause alarms.
For regulatory compliance, an automated alarm assessment report illustrates how an alarm management program has met the requirements of the rule by providing easy access to the key alarm performance indicators. As the reporting cycle progresses, you will be able to demonstrate improvement over time and identify additional improvements based on historical comparisons.
Management of change
As part of an alarm management program, you will identify alarms that require revised limits, priority changes to match the new alarm philosophies, and alarm suppression configured for related alarms. Whenever possible, configuration changes should be tested in an offline system. Ultimately, you will make changes on the live system that will directly affect the behavior of the alarm system. It is crucial to have a sound management of change process for the SCADA system that provides an auditable record of these configuration changes and, more importantly, includes a process for informing controllers of changes before they are put into effect on the live system, especially controllers who are not on shift at the time the changes are made.
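One way to support an auditable record is to compare the live alarm configuration against the last approved version and list every difference. The sketch below assumes both configurations can be exported as dictionaries keyed by alarm tag; the function name, tag names, and setting names are hypothetical and not taken from any particular SCADA product.

```python
def diff_alarm_config(approved, live):
    """List differences between approved and live alarm settings.

    approved, live: dict mapping tag -> dict of settings, for example
    {"PT-101": {"limit": 950.0, "priority": "high"}} (hypothetical layout).
    Returns (tag, setting, approved_value, live_value) tuples; a value of
    None means the setting is absent on that side.
    """
    changes = []
    for tag in sorted(set(approved) | set(live)):
        approved_settings = approved.get(tag, {})
        live_settings = live.get(tag, {})
        # Compare every setting that appears on either side.
        for setting in sorted(set(approved_settings) | set(live_settings)):
            before = approved_settings.get(setting)
            after = live_settings.get(setting)
            if before != after:
                changes.append((tag, setting, before, after))
    return changes
```

Any entry in the result that does not match an approved change request is a candidate unauthorized change, and the same list can be circulated to controllers who were off shift when the change was made.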
ABOUT THE AUTHOR
Kelly Doran (Kelly.firstname.lastname@example.org) is a product manager at Telvent, a business information services provider in Calgary, Alberta, Canada, that specializes in the energy, transportation, agricultural, and environmental sectors. He is an associate staff member of the Transportation Safety Institute and has been involved in the development and delivery of SCADA Fundamentals training to federal and state pipeline inspectors for the Office of Pipeline Safety.