Peril in the pipeline
Cyber security could have precluded gasoline rupture at Washington pipeline
- Real-life pipeline rupture leads to questions about security.
- Investigations reveal unresponsive SCADA system could have caused disaster.
- Cyber security issues existed before and immediately after blast.
By Marshall Abrams and Joe Weiss
On 10 June 1999, a 16-inch diameter pipeline owned by Olympic Pipeline Company ruptured, and gasoline leaked into the Hanna and Whatcom Creeks in Whatcom Falls Park within Bellingham, Wash.
When the gasoline ignited, the resulting fireball traveled nearly 1 1/2 miles downstream from the pipeline failure location, killing two 10-year-old boys and an 18-year-old man. There were eight additional documented injuries, along with significant property damage to a residence and to Bellingham's water treatment plant. The release of nearly 1/4 million gallons of gasoline caused substantial environmental damage to the waterways.
Cyber security often focuses on the vulnerabilities of commercial off-the-shelf software and Internet access, with malicious activity as the primary concern. But more discussion is needed about control system cyber security and how its policies and countermeasures can potentially preclude or minimize the impacts of a control system cyber security event.
The pipe rupture at Bellingham involved a complicated scenario of physical damage to the pipeline with an eventual pressure buildup not mitigated by the pipeline SCADA or leak detection systems in a timely manner.
The pipeline system was operated remotely from a central control center, where pipeline controllers could monitor key variables, such as pressures and flow rates, and could also monitor and operate mechanical components, such as pumps and motor-operated valves. Olympic used a SCADA system to monitor, operate, and control its pipeline.
The Olympic Pipeline SCADA system consisted of SCADA vector (object-based editing) software running on two Virtual Address eXtension (VAX) computers with the Virtual Memory System (VMS) operating system, Version 7.1. [VAX is a 32-bit computing architecture that supports an orthogonal instruction set (machine language) and virtual addressing (i.e., demand-paged virtual memory).]
In addition to the two main SCADA computers, a similarly configured computer served as a host for the separate pipeline leak detection system software package. The Olympic system configuration included RTUs and PLCs for collecting field data. At the time of the accident, most of Olympic's field units also had local controllers embedded in their hardware designed to protect the station equipment and downstream piping if they lost contact with the main SCADA computer.
The VAX computers hosting the SCADA system, the control room terminals, and workstations and the leak detection computer were connected via an Ethernet backbone network. Each device was connected to one common connection, and there was only one path from any one device to another. A bridge connected the Ethernet in the SCADA control room with the company's administrative computer network, which was connected to the Internet. The bridge device offered some protection and isolation of the pipeline control from the administrative segments of the networks.
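The topology described above can be sketched as a small graph to show why the single-path design matters. This is only an illustration under stated assumptions: the node names are hypothetical labels for the devices the text mentions, and the sketch models physical connectivity only. It says nothing about what traffic the bridge actually passed or blocked.

```python
from collections import deque

# Hypothetical node names; the links follow the topology described in the text.
LINKS = [
    ("scada_vax_primary", "ethernet_backbone"),
    ("scada_vax_backup", "ethernet_backbone"),
    ("leak_detection_vax", "ethernet_backbone"),
    ("control_room_workstation", "ethernet_backbone"),
    ("ethernet_backbone", "bridge"),
    ("bridge", "admin_network"),
    ("admin_network", "internet"),
]


def build_graph(links):
    """Build an undirected adjacency map from (a, b) link pairs."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns the shortest hop path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


graph = build_graph(LINKS)
path = find_path(graph, "internet", "scada_vax_primary")
print(" -> ".join(path))
# internet -> admin_network -> bridge -> ethernet_backbone -> scada_vax_primary
```

In this simplified model, the bridge is the only device standing between the Internet-connected administrative network and every host on the control backbone, including the leak detection computer.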
The VAX-VMS was designed to be a multi-user system and was capable of keeping track of hundreds of simultaneous users. Each user was allocated his/her share of system resources, and each user was only permitted to run or view files associated with that person's user identification (login). Extensive operating system accountability and permission logs documented the resources used by any user. At Olympic, however, all operators shared a single login, which gave every one of them undifferentiated system administrator privileges, including the ability to manipulate or delete any file on the system.
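A minimal sketch can show what a shared login does to an audit trail. The login names and actions below are hypothetical and are not taken from Olympic's logs; the point is only that per-user accounting collapses into a single bucket when everyone signs in under the same name.

```python
from collections import Counter


def actions_per_login(events):
    """Count logged actions per login name."""
    return Counter(login for login, _action in events)


# Hypothetical audit events as (login, action) pairs.
shared_login_events = [
    ("operator", "edit historical record"),
    ("operator", "delete file"),
    ("operator", "acknowledge alarm"),
]
individual_login_events = [
    ("controller_a", "edit historical record"),
    ("sysadmin_b", "delete file"),
    ("controller_c", "acknowledge alarm"),
]

print(len(actions_per_login(shared_login_events)))      # 1
print(len(actions_per_login(individual_login_events)))  # 3
```

With one shared account, the log can still say what happened, but not who did it.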
At 12 a.m. each day, the system created a new set of historical record files for that day. As the system continued to operate, new data were appended to these files until midnight, when the appended records became the historical record for that day. The Olympic system was set up with the default VMS system file attributes. The file system allowed read, write, execute, and delete file access for system, owner, group, and world user categories. It was also possible to add special access control entries to files. VMS had extensive configurable auditing capabilities.
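The four user categories and four access types described above can be modeled in a few lines. This is a simplified sketch, not a reproduction of VMS internals: the mask values are hypothetical rather than Olympic's actual settings, and the letters R, W, E, and D are used here as shorthand for read, write, execute, and delete.

```python
# User categories named in the text.
CATEGORIES = ("system", "owner", "group", "world")

# Hypothetical protection mask for one historical file.
# R = read, W = write, E = execute, D = delete.
mask = {
    "system": set("RWED"),
    "owner": set("RWED"),
    "group": set("RE"),
    "world": set(),
}


def allowed(mask, category, access):
    """Return True if the given user category holds the given access type."""
    return access in mask[category]


for category in CATEGORIES:
    print(category, "may delete:", allowed(mask, category, "D"))
# system may delete: True
# owner may delete: True
# group may delete: False
# world may delete: False
```

Under this assumed mask, a tightly set "group" or "world" entry offers little protection if every operator signs in with an account that carries system administrator privileges, because each of them is then effectively treated as belonging to the most privileged category.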
Possible accident causes
Several events and conditions could have set the stage for the pipeline rupture. The first was external damage (gouges and dents) to the pipeline in the vicinity of the eventual rupture. The second was the construction and startup of the Bayview products terminal. During construction of the terminal, pressure relief valves were installed that were later found to be improperly configured or adjusted, and the actions taken by the company to test and correct the valve settings now seem to have been ineffective. Finally, on the day of the accident, the SCADA system that controllers used to operate the pipeline became unresponsive, making it difficult for controllers to analyze pipeline conditions and make timely responses to operational problems.
After the Bayview products terminal became operational in December 1998, controllers began to experience difficulties that often involved pressure increases within Bayview, causing the inlet block valve upstream of Bayview to close, thus shutting down the pipeline. Between December 1998 and June 1999, when the accident occurred, the inlet block valve closed 41 times because of high pressure at Bayview.
The National Transportation Safety Board (NTSB) concluded that if the SCADA system computers had remained responsive to the commands of the Olympic controllers, the controller operating the pipeline probably would have been able to initiate actions that would have prevented the pressure increase that ruptured the pipeline. In other words, the unresponsiveness of the SCADA system was a contributing factor in the rupture, alongside the physical damage and valve problems described above.
Cyber security issues
A thorough review of the final NTSB report and its interim documentation revealed the following cyber security issues, which were present immediately before, during, and shortly after the Bellingham pipe rupture. These issues (together or separately) could have led to the abnormal SCADA operation or prevented investigators from determining the cause of the event.
Unsecured remote access: The terminals and workstations were connected to the SCADA system either through network connections or via modems using one of the several serial communications ports located on the two SCADA computer units. Most of the day-to-day system support and development occurred using one of these remote terminals. Direct dial-in access to the VAX computer was available from the outside, provided the user knew the phone number and had an authorized dial-up account and system password.
Network separation: The SCADA host computers, the control room computers, and the leak detection computer were interconnected via a basic Ethernet backbone network. This means each device connected to one common connection point, and there was only one path from any one device to another. A bridge connected the Ethernet in the SCADA control room with the company's administrative computer network. The administrative computer network had some Internet connectivity. Several other departments used data obtained from the SCADA system.
Security technologies: The system did not create a keystroke record of the commands entered via one of the remote terminals or workstations. No virus protection or access monitoring was incorporated into the system. The VMS system was designed to log all system operations, errors, and hardware failures. VMS also contained a security log that kept a record of who was logged into the system. The security log would contain an entry if someone had attempted to break into the computer operating system. Each time a user typed an incorrect user name or password, a break-in entry would be made in the security log. The security log contained no evidence of an unauthorized attempt to access the system.
Security policies: There was no indication of an in-place cyber security program, including control system policies and procedures. The VAX-VMS system used as the platform for Olympic's SCADA system was a multi-user system, but all authorized computer operators used the same login. Thus, even though the operating system could track individual users, the system had no means of distinguishing one user from another. This single-login policy severely limited the ability of the company to audit the system or to assign individual accountability for actions performed on the VAX or SCADA system.
Furthermore, all authorized users had system administrator privileges, allowing them to manipulate or delete any and all of the files contained on the system. Because they all used the same login name, no record of exactly who performed what action was available. In addition, while the Olympic SCADA system was slowing down and then stopped, the pressure trend displays presented potentially misleading data to the operator. No highlighting feature had been programmed to alert the controllers that the graph they were looking at might contain gaps in the displayed data.
Training: The computer system support person was formerly a full-time pipeline controller. Although appearing well versed in SCADA technology, he had not received any training in the operating system or SCADA applications. There was no mention of any computer security training provided or taken by any of the staff.
Forensics: The software support group performed a review of the pipeline control software after the pipe ruptured. In an attempt to replicate the SCADA computer performance anomaly, they installed an image copy of the disk on one of the development computers. However, the development system and software differed from the operational Olympic SCADA system. They could not reproduce any performance anomalies with the Olympic system image running on that computer. Key logs were inexplicably missing, including those for the period when the SCADA system was unresponsive. The SCADA operator refused to testify.
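The missing gap highlighting noted under security policies above lends itself to a short illustration. The sketch below flags stretches where trend samples stopped arriving, so that a display could mark them rather than draw a continuous line across them. The sample times, the five-second polling interval, and the tolerance factor are all hypothetical values chosen for the example, not figures from the Olympic system.

```python
def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (start, end) pairs where consecutive samples are spaced
    further apart than tolerance * expected_interval."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > tolerance * expected_interval:
            gaps.append((prev, cur))
    return gaps


# Hypothetical sample times in seconds, with updates stalling between 15 and 60.
sample_times = [0, 5, 10, 15, 60, 65, 70]

print(find_gaps(sample_times, expected_interval=5))
# [(15, 60)]
```

A trend display that shaded or broke the line across each flagged interval would tell the controller that no data had been received there, instead of implying that pressure had been steady.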
The SCADA system was integral to maintaining safe operation of the pipeline. However, various factors contributed to the inoperability of the SCADA system and to the lack of an adequate response when it became inoperable at critical times. A comprehensive control system cyber security program was not in place, nor was appropriate SCADA operator training. The SCADA system appeared to have diagnostic capabilities, but those capabilities were not configured to address internal cyber issues. In addition, system logs that should have been automatically generated were inexplicably missing. The single backbone Ethernet network did not provide adequate separation between the real-time systems and non-critical business networks. Finally, the interconnections between the SCADA system and the plant leak detection system did not provide for adequate resources or separation.
Reviewers repeatedly input records with known errors into the historical database, but they could not replicate the computer slowdown of 10 June 1999.
The attempts to determine the cause of the SCADA system slowdown were hampered in that the system used for testing was not a clone of the operational system. The source code for the historic process in the software used by the Olympic computer system was significantly different than anticipated. Computer hardware was not available that matched the performance of the existing Olympic system, so the testing occurred on a system that was slightly smaller and slower.
Since the accident, Olympic took a number of steps to improve its SCADA system's performance, reliability, and security, including increasing computer processing speed and capacity and addressing physical security of the control center and electronic access security of the SCADA computers. The Olympic Pipeline Company has subsequently gone out of business.
ABOUT THE AUTHORS
Marshall Abrams, Ph.D., is a principal scientist at MITRE Corporation, a national resource center with expertise in systems engineering and information technology in McLean, Va. Contact him at Abrams@mitre.org. Joe Weiss, PE, CISM, is an executive consultant at Applied Control Solutions, LLC, in Cupertino, Calif. Contact him at email@example.com.
Accident timeline speaks volumes
By Marshall Abrams and Joe Weiss
At about 3:00 p.m. on 10 June 1999, the controller, using the SCADA system at the Olympic Pipeline Company in Bellingham, Wash., prepared to discontinue product delivery. The system administrator was working on a terminal in the control center computer room.
At about 3:10 p.m., the SCADA computer began to generate error messages related to the historical database. The system administrator checked the format of the new records, found no errors, and left the control room for 15 minutes.
At the time of the accident, the SCADA system was unresponsive to the commands of the controllers. Had the controller been able to start the pump at the Woodinville station, the pressure backup probably would have been alleviated and the pipeline would have operated routinely for the balance of the fuel delivery. The controller attempted to systematically slow or shut down the line. Even if the controller had been unable to prevent the pressure buildup and the subsequent closure of the inlet block valve at Bayview, had he had full SCADA system control, he may have been able to reduce the flow through the pipeline sufficiently to minimize the severity of the pressure increase when the block valve did close.
When the system administrator returned, the primary computer was unresponsive.
The controllers reported the SCADA system was not updating the control screens as it normally did.
The SCADA problems grew more pronounced over the next 20 minutes, during which the system became completely unresponsive. This period of non-responsiveness coincided with the rupture of the pipeline at about 3:28 p.m. The SCADA problems encountered by the controllers occurred shortly after the system administrator inserted new records into the system computer and were resolved after the control center supervisor deleted the new records. The system administrator noticed a typographical error in the records that had not been there when the records were checked earlier. The NTSB concluded the degraded SCADA performance on the day of the accident likely resulted from the database development work that was done on the SCADA system.