Next frontier: Operator-automation relationship
Preclude operator error, improve safety and profitability
By Eddie Habibi
Industrial control systems are complex designs, but they do not give much consideration to the needs of the human operator. This oversight has caused serious and costly accidents. By empowering operators with the right human-automation relationship tools, companies will achieve and exceed their goals.
The world population is more prosperous today than ever before. The average global per capita gross domestic product has grown nearly tenfold during the past 140 years. By comparison, this measure was practically flat from 1600 to 1875, the period before the Industrial Revolution. The fastest growth in productivity emerged after World War II and accelerated during the past three decades. The U.S. economy exploded, growing from $2.8 trillion in 1980 to more than $15.5 trillion in 2012. Human ingenuity and innovation made possible by connectivity, information sharing, and collaboration form the basis of this relative prosperity.
World per capita GDP
1600–2003 (1990 International dollars)
Source: J. Bradford DeLong, “Estimating World GDP, One Million B.C. - Present,” http://www.j-bradford-delong.net/TCEH/1998_Draft/World_Estimating_World_GDP.html, accessed Mar 5, 2008; Angus Maddison, “Contours of the World Economy, 1 - 2030 AD: Essays in Macro-Economic History,” New York: Oxford University Press, 2007, p. 382. © 2008, Matthew W. Kruse
Specific to industrial production, much of the productivity growth in recent decades is attributed to advances in process automation technologies such as the distributed control system (DCS), the historian, model-based controls, and production optimization. Without much incremental investment in plant equipment, automation technologies have substantially reduced variability and cost, improved throughput and quality, extended asset reliability, and delivered incredible financial returns on investment.
Industrial control systems are complex. They process and generate significant information in real time, and were designed without much thought for the needs of the human operator. Serious and costly accidents have been the consequences of this oversight. Chernobyl, Bhopal, Three Mile Island, Piper Alpha, and Texas City were some of the worst industrial accidents in recent history. Human error is cited as the root cause or a major contributing factor in every one.
The historical marker in front of the Three Mile Island nuclear power plant reminds us of the role of human error in that accident.
Do not blame the operator
Unfortunately, there is a tendency to blame the human operator automatically, and often unjustly. An operator inherits a production facility as it was designed and constructed. Traditionally, the automation system configuration and the resulting alarms and interfaces are designed and implemented by engineers unfamiliar with human-factors design. It is not unusual for an operator to receive a barrage of alarms during a process upset, or to search through five displays before finding the information relevant to the situation. Operators, like most everyone else, want to succeed at their jobs.
Understanding human error
Human error may simply be described as the failure to carry out a given task (or the performance of an undesired action) that could result in disruption of scheduled operations or damage to equipment and property. In process operations, human error occurs when an operator fails to take proper action (the right action at the right time) or takes no action when action is required.
Human error can be divided into two general categories:
Intentional errors occur when an operator deliberately performs, or chooses not to perform, a task. In almost every case, intentional errors are not malicious. Typically, when committing intentional errors, operators believe their actions are correct or more appropriate than what the standard operating procedure calls for. For example, an intentional error occurs when an operator deviates from a written procedure, falsely believing that the procedure is incorrect.
Unintentional errors occur when a worker unwittingly performs, or fails to perform, a task. For example, an operator might unintentionally enter 47 percent instead of 4.7 percent when moving a control valve. Another example of an unintentional error is when an operator does not detect the emergence of an abnormal situation and fails to take action entirely. These human errors are generally referred to as accidents.
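The keystroke example above (47 percent entered instead of 4.7 percent) is the kind of slip an entry check at the console could catch. The following is a minimal sketch of one such check, not a description of any vendor's system: the function name `needs_confirmation` and the 10-percent step limit are assumptions made for illustration.

```python
def needs_confirmation(current_pct: float, requested_pct: float,
                       max_step_pct: float = 10.0) -> bool:
    """Return True when a requested valve output should be confirmed first.

    max_step_pct is a hypothetical limit on how far a single entry may move
    the valve before the operator is asked to confirm the value.
    """
    if not 0.0 <= requested_pct <= 100.0:
        raise ValueError("valve output must be between 0 and 100 percent")
    return abs(requested_pct - current_pct) > max_step_pct


# Valve currently at 5 percent: the intended entry passes, the slip is flagged.
print(needs_confirmation(5.0, 4.7))   # False
print(needs_confirmation(5.0, 47.0))  # True
```

A check of this kind does not stop the operator from making a large move on purpose; it only asks for a second look when the entry is unusually far from the current value.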
Two factors influence an operator’s decision-making process:
Internal factors that reside within the operator, such as tacit knowledge, experience, cognitive abilities, fatigue, and work ethic, affect his or her ability to process information and act promptly. Management can influence the internal factors through rigorous hiring practices, training, policies and procedures, and overall organizational culture.
External factors are outside influences that act upon the operator and impact his or her decision process. They include the physical environment and the information flow. The control room ergonomics such as lighting, temperature, and noise are components of the physical environment. Sources of information flow encompass the telephone, the radio, conversations, and the automation and information systems.
As an industry, we have to believe that all human error is preventable. Otherwise, the concept of zero accidents, which is the vision of every industrial company, is an unattainable goal.
The knowledge-worker operator
The console operator is the final human element to interact with the production process in real time. The decisions and actions of the operator directly affect process safety and company profitability. Thus, the operator is the most critical element in the success of any industrial company. When the operator has a bad day, everyone in the company has a bad day, from the chief executive officer to the last shareholder. More importantly, if the bad day involves casualties or a major environmental excursion, the surrounding community suffers too. The performance of no other individual has such a direct impact on the profitability and the reputation of a company.
It is time for the industry to recognize the role of the operator as mission critical and deserving of proper respect and investment. Peter F. Drucker, the father of modern management theory, described a knowledge worker as one who uses or creates information in the process of performing a job. Operators have to monitor, analyze, and take action based on thousands of data points in complex, fast-moving, real-time situations. Operators are the ultimate automation industry knowledge workers.
Companies such as BASF place significant value on the role and the contribution of their operators. “We recognize the important role of our knowledge worker operators in the success of our company,” says Chris Witte, senior vice president and site manager at BASF Freeport. “We are investing in technologies such as alarm management, high performance HMI, and automation asset management to help improve safety and asset reliability.”
Situation awareness: the operator’s blink
Situation awareness refers to the operator’s comprehension of the condition of a process unit at any moment based on his or her personal knowledge, cognitive abilities, and the information presented to him or her. Proper situation awareness is critical to successful decision making in complex and dynamic professions such as aviation, firefighting, and industrial operations. Operators with proper situation awareness are more successful at analyzing information, identifying critical conditions, and taking proper actions to mitigate undesired consequences.
In his groundbreaking book, Blink: The Power of Thinking Without Thinking, Malcolm Gladwell tells the story of a firefighter in Cleveland, Ohio, and how he and his team escaped a deadly fire trap. After numerous failed attempts to extinguish a fire in the kitchen of a one-story apartment, the lieutenant ordered his men out of the building. As soon as they exited, the kitchen floor where the men had been standing collapsed. Had the men not left the building when they did, they would have been swallowed by the fire. At first, the lieutenant attributed his timely judgment to “gut feel” or “ESP,” but an extensive interview revealed that his experience and cognitive abilities had helped him analyze the situation and arrive at the correct decision in a split second. The lieutenant had realized that “something was wrong” based on three distinct observations: the fire was very hot, it was stubborn and would not go out, and the kitchen was uncharacteristically quiet for a fire of that intensity. That split-second decision is what Gladwell calls “blink.” Blink is that moment of absolute clarity, when the firefighter thin-sliced the information presented to him and, based on his experience, made the correct call.
Experienced operators, like Gladwell’s firefighter, can absorb large amounts of information, filter it, connect the dots, and then make split-second decisions in response to seemingly impossible situations. Proper situation awareness in the control room leads to improved operator comprehension of complex situations and to fast, accurate decision making.
During abnormal situations, a typical control system may present hundreds or thousands of data points to the operator. This is indicative of a bad relationship between the operator and the automation system. Management must take steps to optimize the operator-automation relationship.
The human-automation relationship
The most differentiating competitive advantage in the consumer electronics market is the user experience. Apple significantly raised the bar with its design of the iPhone. In the past seven years, the Apple iPhone has gone from being the new kid on the block to selling more units in one weekend than BlackBerry did in the prior three months. The user experience, more than any other factor, made the iPhone the darling of the mobile device market.
In contrast, we can confidently claim that, at least until recently, user experience and interface design have not been viewed by automation vendors as a competitive advantage. The primary focus, instead, has been on making faster and better process controllers. Consequently, during a process disturbance when human intervention is required to rescue the plant, most control systems generate an inordinate number of alarms, creating sensory overload and making the situation worse for the operator.
The power and process industries today face a serious challenge that is the direct consequence of failures in the operator-automation relationship design.
Challenges in the operator-automation relationship arise from incongruity between the automation system’s user interface and the operator’s ability to process information and take action toward a desired outcome.
Improving the human-automation relationship for the console operator begins with the user interface. The operator interface is also the proper portal for a comprehensive and fully integrated decision-support system that enables knowledge retention and collaboration in real time.
A number of important elements must come together to create an effective human-automation relationship environment. These include:
The console graphical interface is the operator’s window to the process. One of the major user-interface challenges today is the absence of an effective “big picture” overview display. Unlike the wall-mounted instrument panel of the past that provided at-a-glance situation awareness of plant conditions, control system consoles today provide a “keyhole” view of the process through 60 to 100 individual displays. Most operator displays are cluttered, lack hierarchy, provide no pattern recognition, and use colors indiscriminately.
An effective resolution to the deficiencies of today’s operator displays is high-performance human-machine interface (HP HMI). HP HMI displays follow essential human-factor design principles that include, among other related display design best practices:
- Three levels of display hierarchy
- Grey-scale colors
- Chunking and grouping of information
- Pattern recognition objects
- Simple and intuitive navigation, including pan and zoom
HP-HMI-based displays are minimalist in detail and rich in useful information. True HP HMI displays also include the capability to easily integrate critically useful information like alarm response documentation, control logic interlocks, checklists, and operating procedures.
The purpose of an alarm is to inform the operator that action must be taken to mitigate an undesired situation. An effective alarm must be unambiguous, unique, timely, and actionable, and it must carry the proper priority to convey the correct level of urgency to the operator. Results of alarm system performance studies conducted by PAS indicate that most alarm systems perform poorly under normal conditions and, even worse, become a hindrance to the operator during process upsets. Alarm floods, disabled alarms, and long-standing stale alarms are three of the top culprits that create confusion in the control room and cause operator error.
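Two of the culprits named above, alarm floods and stale alarms, can be measured from an alarm journal. The sketch below shows one simple way to do so. The function names, the tag names, and the thresholds (more than 10 alarms in a 10-minute window, and more than 24 hours in alarm) are placeholder assumptions for illustration, not figures taken from this article or from any particular product.

```python
from datetime import datetime, timedelta


def flood_windows(alarm_times, window=timedelta(minutes=10), threshold=10):
    """Return start times of fixed windows holding more than `threshold` alarms."""
    if not alarm_times:
        return []
    times = sorted(alarm_times)
    start = times[0]
    counts = {}
    for t in times:
        index = (t - start) // window  # which fixed window this alarm falls in
        counts[index] = counts.get(index, 0) + 1
    return [start + i * window for i, n in sorted(counts.items()) if n > threshold]


def stale_alarms(active_since, now, max_age=timedelta(hours=24)):
    """Return tags of alarms that have been active longer than `max_age`."""
    return sorted(tag for tag, since in active_since.items() if now - since > max_age)


# Hypothetical journal: 15 alarms in 5 minutes, then 3 alarms spread out later.
t0 = datetime(2024, 1, 1, 8, 0)
upset = [t0 + timedelta(seconds=20 * i) for i in range(15)]
quiet = [t0 + timedelta(minutes=30 + 5 * i) for i in range(3)]
print(flood_windows(upset + quiet))  # one flood window, starting at t0

# Hypothetical active alarms and the time each one came in.
active_since = {"PI-101": t0 - timedelta(days=3), "TI-205": t0 - timedelta(hours=1)}
print(stale_alarms(active_since, t0))  # ['PI-101']
```

Fixed windows keep the sketch short; a sliding window would catch floods that straddle a window boundary.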
The issue of ineffective alarm systems has been around since the advent of the DCS. However, in recent years, alarm management optimization has become an industry best practice and an opportunity to improve plant safety and profitability.
A robust alarm management strategy must include software to automatically capture, archive, analyze, and report the performance of the alarm system. An integrated documentation and rationalization engine is needed to facilitate proper engineering of priorities and trip settings and to capture causes, consequences, and corrective actions for each alarm. A master alarm database with audit and enforcement capability ensures the integrity of the reengineered alarm system. Dynamic state-based alarming is an essential part of an integrated alarm system that automatically changes the alarm settings to properly match the state of the process.
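Dynamic state-based alarming can be pictured as a lookup from the current process state to the alarm settings that apply in that state. The sketch below is an illustration under stated assumptions: the tag `FI-100.LO`, the three states, and the limit values are all hypothetical, and a real system would also have to detect the state reliably and manage changes to this table under change control.

```python
# Hypothetical low-flow alarm settings per process state.
# None means the alarm is suppressed in that state.
ALARM_SETTINGS = {
    "FI-100.LO": {"running": 50.0, "startup": 10.0, "shutdown": None},
}


def active_limit(alarm, state):
    """Return the limit that applies in `state`, or None if suppressed."""
    return ALARM_SETTINGS[alarm][state]


def low_alarm_active(alarm, state, value):
    """Return True when `value` is below the limit for the current state."""
    limit = active_limit(alarm, state)
    return limit is not None and value < limit


# The same flow reading is an alarm while running but not during startup,
# and a zero flow is expected (no alarm) once the unit is shut down.
print(low_alarm_active("FI-100.LO", "running", 20.0))   # True
print(low_alarm_active("FI-100.LO", "startup", 20.0))   # False
print(low_alarm_active("FI-100.LO", "shutdown", 0.0))   # False
```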
Capturing, monitoring, and performing strict change control on limits of operability are crucial to safe operations. Typically, process and equipment boundaries such as alarms, the safety system trip point, pressure relief specifications, and other limits are maintained by different organizations within a plant. Operators usually have visibility to alarm limits only. There are two issues with the way boundary information is managed in today’s operations: it is nearly impossible to ensure the integrity of the information for all the boundaries associated with a given piece of equipment; and operators rarely have access to all the related boundary information in real time.
A consolidated approach is required to manage the integrity of operational and safety boundaries; this includes strict management of change and periodic audits. Furthermore, integrated operator interfaces are needed to give the operator an at-a-glance view of all the boundary limits associated with a piece of equipment.
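Once the boundaries for a piece of equipment are held in one place, a periodic audit can check that they are consistent with one another. The sketch below assumes a single high-pressure boundary for which the high alarm should sit below the safety system trip point, and the trip point below the pressure relief setting. That ordering, the record layout, and the equipment names are assumptions for illustration; actual rules depend on the equipment and the site's engineering standards.

```python
from dataclasses import dataclass


@dataclass
class Boundaries:
    """Consolidated high-pressure limits for one piece of equipment (same units)."""
    equipment: str
    high_alarm: float
    trip_point: float
    relief_setting: float


def audit(b: Boundaries) -> list:
    """Return a list of findings; an empty list means the limits are consistent."""
    findings = []
    if not b.high_alarm < b.trip_point:
        findings.append("high alarm is not below the trip point")
    if not b.trip_point < b.relief_setting:
        findings.append("trip point is not below the relief setting")
    return findings


# Hypothetical vessels: the second has an alarm set above its trip point.
print(audit(Boundaries("V-101", high_alarm=8.0, trip_point=9.0, relief_setting=10.0)))
print(audit(Boundaries("V-102", high_alarm=9.5, trip_point=9.0, relief_setting=10.0)))
```

The same record could feed an operator display that shows all limits for the selected equipment side by side.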
Standard operating procedures (SOPs) are step-by-step written instructions that guide operators in the uniform operation of a process. Many industrial incidents happen when plant personnel either deviate from or ignore the written SOP. Other times, incidents occur when personnel follow a procedure that is out-of-date.
Procedural automation is an effective method for mitigating most incidents that are caused by circumventing established SOPs. In 2010, ISA launched a new standards committee on procedural automation for continuous processes. ISA-106 reflects the combined best practices of several global manufacturing companies such as Dow Chemical Company, Aramco, Chevron, and ConocoPhillips.
“For over thirty years, Dow Chemical has used a proprietary control system [MOD] to implement state-based controls methodology to improve operator performance, enforce our operating discipline, and significantly enhance process safety throughout our company,” said Yahya Nazer, Ph.D., manufacturing and engineering consultant at Dow Chemical Company. “We see ISA-106 standards for procedural automation as an avenue for sharing our experience with other companies.”
Many automation system suppliers are integrating automated procedures into their new platform designs.
Integrated information portal
An integrated information portal provides simple access to supplemental information such as operating procedures, checklists, and instructional videos that improve the operator’s situation awareness. It is also a platform for knowledge retention and collaboration.
Below is a summary of seven key elements of an effective knowledge retention and collaboration platform that enables the integrated information portal within the operator’s work environment.
- Aggregate: capture existing information or explicit knowledge that is available in a digital form (e.g., design drawings, procedures, training videos, and control system logic)
- Author: allow operators and engineers to enter contextual information and expand the knowledge base
- Contextualize: automatically recognize relationships and give context to information from disparate sources
- Tag: allow users to categorize or classify related points of information by adding user-defined tags to multiple information objects
- Search: allow Google-like searches on the process control and plant information networks. It is imperative that search results be concise, accurate, and previously vetted by designated subject matter experts.
- Alert: allow operators to subscribe to and receive alerts that would notify them of an event, a process condition, or a task
- Recommend: similar to Amazon’s book recommendations (“people who bought this book also looked at this one”), the integrated information portal must provide useful recommendations to help the operator improve his or her decision process. For example, the system should recognize when an operator selects a pump to start. It should then automatically present to the operator a checklist to follow.
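The pump example in the last element can be sketched as a lookup keyed on what the operator has selected and what he or she is about to do. Everything in the sketch below is hypothetical: the `CHECKLISTS` table, the `recommend` function, and the checklist wording are placeholders, and a real portal would draw vetted content from the site's own procedures.

```python
# Hypothetical checklist library keyed by (equipment type, intended action).
CHECKLISTS = {
    ("pump", "start"): [
        "Confirm the suction valve is open",
        "Confirm the discharge path is lined up",
        "Confirm no open work permits on the pump",
    ],
}


def recommend(equipment_type, action):
    """Return the checklist for this selection, or an empty list if none exists."""
    return CHECKLISTS.get((equipment_type, action), [])


# The operator selects a pump and chooses "start": show the checklist.
for step in recommend("pump", "start"):
    print("[ ]", step)

# No checklist is defined for this selection, so nothing is shown.
print(recommend("compressor", "start"))  # []
```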
Death of the traditional logbook
The traditional stand-alone operator logbook, whether in paper or electronic form, is obsolete. The information communicated through the operator logbook is highly valuable and interdependent with other production systems. The operator logbook can no longer be viewed as a stand-alone system. In fact, the functionality of an operator logbook must be included within the integrated information portal.
Operators are true knowledge workers whose performance directly affects the company’s profitability and reputation. It is the responsibility of the power and process industries to recognize and empower the operator in the same way that the airline industry and the medical community recognize and empower pilots and emergency room surgeons, respectively. By empowering the operator with the right human-automation relationship tools, the company will have a solid foundation for safe production.
Fortunately, the new generation of industry executives—those who grew through the ranks during the digital revolution—understands the critical role of the operator in the success of their organizations. They keenly recognize that the role of the operator is less about turning valves and making set point adjustments and more about assessing situations and making economic decisions in real time, often under adverse conditions. They are investing in their operators and in new enabling technologies to optimize the performance of their organizations.
ABOUT THE AUTHOR
Eddie Habibi (email@example.com) is the founder and CEO of Houston-based PAS, a supplier of human reliability software and services to the power and processing industries worldwide. He is a thought leader in the areas of operator effectiveness, automation and information integrity, and web-enabled knowledge retention and collaboration technologies. Habibi has an engineering degree from the University of Houston and an MBA from the University of St. Thomas. He is the coauthor of two popular books on industrial operator effectiveness: The Alarm Management Handbook and The High Performance HMI Handbook.
Human Reliability, Error, and Human Factors in Engineering Maintenance: with Reference to Aviation and Power Generation
The Alarm Management Handbook
The High Performance HMI Handbook
Normal Accidents: Living with High-Risk Technologies
Reduce Human Error: How to analyze near misses and sentinel events, determine root causes and implement corrective actions
“Enterprise 2.0: The Dawn of Emergent Collaboration”
“Apple Sells More Phones Over the Weekend Than BlackBerry Did Last Quarter”
Automation Can Prevent the Next Fukushima