01 March 2003

Test your system five ways

Internet technology has much to offer on the plant floor. The trick is to adopt the technology but not the culture.

By Eric Byres, Joel Carter, Amr Elramly, and Dan Hoffman

During the past 10 years, industrial control systems have seen a significant increase in the use of computer networks and related Internet technologies to transfer information from the plant floor to supervisory and business computer systems.

Most industrial plants now use networked process historian servers to allow business users to access real-time data from distributed control systems (DCSs) and programmable logic controllers (PLCs).

There are also many other possible business and process interfaces, such as using remote Windows sessions to the DCS or direct file transfer from PLCs to spreadsheets.

Regardless of the method, each involves a network connection between the process and the business systems.

At the same time, there has been an explosion in the use of Ethernet and TCP/IP in industry for process control networks. For many years, the control systems used proprietary industrial networks, such as Data Highway Plus (Allen-Bradley of Rockwell) or Genius I/O (General Electric), giving them a considerable degree of protection from the outside world.

Today, many DCS and PLC systems use protocols such as Ethernet, transmission control protocol/Internet protocol (TCP/IP), and hypertext transfer protocol (HTTP, a World Wide Web protocol) as critical components of their architecture, resulting in easier interfacing at the cost of less isolation and security.

Internet and factory floor expectations and practices

BLIND ADOPTION OF INTERNET

While technologies such as Ethernet and TCP/IP allow for significant cost savings and improved interfacing for industry, it is important to understand that their origins are rooted in a culture very different from the factory floor.

Even the neophyte Internet user can spot these differences in terms of reliability: Occasional failures are common and tolerated on the Internet, while most control systems are expected to operate for months, if not years, without interruption.

Similarly, the tradition of beta testing many new Internet products in the field and recovering from problems by simply rebooting servers or switches contrasts sharply with standard plant floor practices.

This is not surprising because the impact of an outage on the Internet is typically loss of data, while outages in the process environment will certainly result in loss of production and may even cause loss of equipment or life.

Very simply, the Internet culture and the technologies it has created rise from the idea that performance is paramount and outages, while undesirable, are acceptable. This is clearly not true for the industrial system.

Nowhere is this cultural difference more pronounced than in the area of cybersecurity. Considerable media and research attention has focused on the topics of Internet viruses and hacking, but the reality is that most Internet hosts exist with only light security.

For example, the Cooperative Association for Internet Data Analysis reported that plain text passwords are still very common on the Internet—a clear violation of the most rudimentary security standards.

Similarly, until recently, most IP networks openly connected to the outside world, while the factory engineer has always demanded the control system networks remain isolated from the other company information systems.

Even where security is well defined, the primary goal in the Internet is to protect the central server and not the edge client. In process control, the edge device—the PLC or smart drive controller—is far more important than a central host, such as a data historian server.

Looking at these differences in needs and cultures, it is clear the industrial control world must not blindly accept the Internet world's solutions. The technologies may be extremely useful, but they require careful consideration before implementation on the plant floor.

SP 99 is under way—but don't wait

ISA (The Instrumentation, Systems, and Automation Society) has launched the SP99 standards committee to address the needs of cybersecurity in the manufacturing and control systems environment.

Although the committee has a sense of urgency, the consensus standards-setting process is time-consuming. Manufacturers need to be evaluating their cybersecurity now.

Here are some resources to aid that evaluation:

  • ISO/IEC 17799: Information technology, code of practice for information security management at www.iso.ch
  • NIST-automated security self-evaluation tool at csrc.nist.gov/asset
  • CERT Coordination Center operationally critical threat, asset, and vulnerability evaluation at www.cert.org/octave
  • Critical Infrastructure Assurance Office at www.ciao.gov
  • National strategy to secure cyberspace at www.securecyberspace.gov
  • Site security guidelines for the U.S. chemical industry at www.socma.com
  • Partnership for Critical Infrastructure Security at www.pcis.org
  • U.S. chemicals sector cybersecurity strategy at www.pcis.org
  • Securing oil and natural gas infrastructures in the new economy at www.pcis.org
  • Electricity sector response to the critical infrastructure protection challenge at www.pcis.org
  • Wireless network security at csrc.nist.gov

Source: ARC Advisory Group

STANDARD HACKING ATTACKS

The British Columbia Institute of Technology's Internet Engineering Lab (BCIT/IEL) maintains an industrial cybersecurity incident database that tracks incidents involving process control systems in all sectors of manufacturing.

While most companies are reluctant to report cyberattacks or even internal accidents, there are now enough events to allow some statistical analysis of the data.

Since the initiation of the tracking project, 40 incidents have been logged in the database. The first conclusion we can draw is that there is a problem, and it may be more widespread than most process engineers believe.

Employees caused more than 50% of these incidents. This correlates with data from FBI studies on the sources of cyberattacks. Indeed, the FBI and the Computer Security Institute on Cybercrime concluded, after study, that persons with high technical skill and process knowledge pose the greatest threat to an organization.

In other words, the greatest security risk to a control network may not be an Internet teenager on a joy ride but a disgruntled employee.

From these conclusions, we have come to believe it is naive to assume that control devices will never confront some sort of internal or external cyberincident. PLCs and DCSs need to be hardened so that any intrusions that do occur will have little direct impact on the industrial process.

Are today's controllers tough enough to withstand some level of network attack? To answer this question, the BCIT/IEL developed a series of procedures to test industrial controllers for their susceptibility to various network problems, including standard hacking attacks.

There are five tests for security:

1) Open ports: Unnecessarily open TCP and user datagram protocol (UDP) ports are a common security breach. While at least one port must be open for normal PLC communications, it is important to ensure that all open port(s) are well protected and unnecessary ports are closed.

2) Simple network management protocol (SNMP) robustness: SNMP provides access to network devices for monitoring and configuration control. SNMP security is weak, making it a significant security concern.

3) Malformed packets: Some TCP/IP implementations are vulnerable to attacks based on packets with purposely illegal field values.

4) Broadcast traffic storms: A broadcast message is one sent to all hosts on a network or subnet. Unusually large numbers of broadcast messages can cause host failures.

5) Resource starvation: Many TCP/IP hosts are vulnerable to attacks based on consuming system resources to the point that normal operation ceases.

The equipment required for these tests included a widely used PLC, a workstation for programming the PLC, a SmartBits 600 Ethernet load generator, and a Linux workstation with commonly available hackers' software. All of the equipment was connected through a standard Ethernet switch.

NO PROGRAMMING APTITUDE

Internet services receive their identity via 16-bit integers called ports. Each TCP and UDP packet contains a source and destination port number. In packets from a client, the source port identifies the client, and the destination port identifies the service. Each well-known Internet service has a unique port.
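As a minimal sketch of the port fields described above, the following Python reads the source and destination ports from the first four bytes of a TCP or UDP header, where both protocols place them. The function name and the sample port values are illustrative assumptions, not part of the BCIT tests.

```python
import struct


def parse_ports(transport_header: bytes) -> tuple:
    """Return (source_port, destination_port) from a TCP or UDP header.

    Both protocols begin with two 16-bit port numbers in network byte order.
    """
    if len(transport_header) < 4:
        raise ValueError("need at least 4 bytes to read both port fields")
    return struct.unpack("!HH", transport_header[:4])


# Hypothetical client packet: arbitrary source port 49152, HTTP destination port 80.
sample_header = struct.pack("!HH", 49152, 80) + b"\x00" * 16
source_port, destination_port = parse_ports(sample_header)
print(source_port, destination_port)
```

Because each field is 16 bits wide, the largest possible port number is 65,535.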

The protocol stack notifies a server when a packet arrives with its port in the destination port field. For example, the HTTP server receives notice when a TCP packet arrives with destination port 80.

A port is open if a server is waiting to respond to TCP/UDP packets with that destination port. Hackers often scan for open ports and then use an attack known to be effective on the service waiting on that port.

A port scanner is an application that takes a list of IP addresses and ports and sends packets to each address/port pair, checking whether the port is open. Port scanners are readily available on the Internet and require no programming ability.

BCIT performed this test by connecting the PLC Ethernet interface to a Linux workstation. We scanned all ports (1-65,535) for UDP and TCP services using the open source utility nmap. For TCP, nmap attempts to open a TCP connection. The port is open if the connection is successfully established.

For UDP, nmap sends a UDP packet and waits for an error message from the Internet control message protocol (ICMP). The port is open if an ICMP port unreachable response is not forthcoming.
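The TCP half of such a scan can be approximated in a few lines of Python. This is a rough sketch of the connect-style check only, not a reproduction of nmap; the function names and the timeout value are assumptions. The demonstration opens a listener on the loopback address so that nothing outside the local machine is probed.

```python
import socket


def tcp_port_is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a full TCP connection to host:port can be established."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            return sock.connect_ex((host, port)) == 0
    except OSError:
        return False


def scan(host: str, ports, timeout: float = 0.5) -> list:
    """Return the subset of ports that accepted a TCP connection."""
    return [port for port in ports if tcp_port_is_open(host, port, timeout)]


if __name__ == "__main__":
    # Local demonstration: listen on an operating-system-assigned loopback port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    demo_port = listener.getsockname()[1]
    print(scan("127.0.0.1", [demo_port]))
    listener.close()
```

Only scan equipment you are authorized to test; as the results later in this article show, even simple probes can upset a controller.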

The test results showed that a single TCP port was open on this particular model of PLC. This port must be open for normal communications from programming terminal to PLC. A single UDP port was also open. This was port 161, a port reserved for SNMP. An open SNMP port is potentially very dangerous, as discussed below.

HACKER COMMUNITY SEES DEFAULTS

An SNMP-enabled device maintains a management information base (MIB) containing many fields. Each field is read only, such as TCP connection statistics, or read write, as in an IP address. SNMP allows a network administrator to monitor and control many network devices from a single location.

SNMP provides password protection, with one password for read-only fields and another for read-write fields. Unfortunately, the password scheme suffers from two security weaknesses.

First, the passwords often remain unchanged from the factory defaults—typically, public for the read-only fields and private for the read-write fields. The hacker community knows these defaults well. It is easy to overlook the need to change them, especially if the installation is not using SNMP.

Second, changing the default passwords is still problematic because the most common version of SNMP (Version 1.0) uses no encryption. When the new passwords are transmitted over the LAN, they travel in plain text, making them available to anyone running a packet sniffer on the network.
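To make the plain-text problem concrete, the sketch below hand-encodes an SNMP Version 1 GetRequest and shows that the password (the community string) appears unaltered in the bytes that would go on the wire. The helper names are hypothetical, and the object identifier 1.3.6.1.2.1.1.1.0 (the standard system description field) is an assumption chosen for illustration. The code only builds the message; it sends nothing.

```python
def tlv(tag: int, payload: bytes) -> bytes:
    """Encode one BER tag-length-value triple (short-form lengths only)."""
    if len(payload) > 127:
        raise ValueError("this sketch only supports payloads up to 127 bytes")
    return bytes([tag, len(payload)]) + payload


def ber_int(value: int) -> bytes:
    """Encode a non-negative integer as a BER INTEGER."""
    length = max(1, (value.bit_length() + 8) // 8)  # leave room for the sign bit
    return tlv(0x02, value.to_bytes(length, "big"))


def ber_oid(dotted: str) -> bytes:
    """Encode a dotted object identifier as a BER OBJECT IDENTIFIER."""
    arcs = [int(part) for part in dotted.split(".")]
    body = bytearray([40 * arcs[0] + arcs[1]])
    for arc in arcs[2:]:
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append(0x80 | (arc & 0x7F))
            arc >>= 7
        body.extend(reversed(chunk))
    return tlv(0x06, bytes(body))


def snmp_v1_get(community: str, oid: str, request_id: int = 1) -> bytes:
    """Build an SNMP Version 1 GetRequest message for a single field."""
    varbind = tlv(0x30, ber_oid(oid) + tlv(0x05, b""))       # (name, NULL value)
    pdu = tlv(0xA0, ber_int(request_id) + ber_int(0) + ber_int(0)
              + tlv(0x30, varbind))                          # GetRequest PDU
    return tlv(0x30, ber_int(0)                              # version field 0 = SNMPv1
               + tlv(0x04, community.encode("ascii")) + pdu)


message = snmp_v1_get("public", "1.3.6.1.2.1.1.1.0")
print(message.hex())
print(b"public" in message)  # the community string is readable in the raw bytes
```

Anyone who captures this message with a packet sniffer can read the community string directly, whether it is a factory default or a carefully chosen replacement.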

In this test, the PLC Ethernet interface connected to a Linux workstation containing the open source utilities snmpwalk and snmpset. Snmpwalk traverses the entire SNMP MIB, returning all fields found. Snmpset allows the user to change the value of any read-write field. Both utilities require knowledge of the password.

The test strategy uses snmpwalk to discover the MIB fields supported by the PLC. After hand examination to determine which fields (usually read write) have the potential to compromise a PLC operation, snmpset provides the means to negatively impact the PLC.

The test uncovered MIB objects that could render the PLC inoperable. For example, interface status (up or down) could be changed. Changing TCP/IP configuration information was also possible, which can effectively disconnect a PLC from its network. In this particular PLC, the lab found no method to protect the PLC, not even disabling the SNMP services.

MALFORMED PACKETS POSE RISK

The purpose of this third test is to check the stability of the TCP/IP stack when presented with deliberately malformed packets. It is important to note that all of these packets will have correct checksums and so will appear to be free of transmission errors.

Many of the fields in IP and TCP headers are restricted in the values permitted. Some restrictions are absolute; for instance, the only legal values for the 4-bit IP version field are 4 and 6. Others are relative; for instance, the value of the 16-bit total length field must equal the actual datagram length.

Malformed packets pose two risks to a PLC. First, the response to such a packet is often specific to a particular TCP/IP implementation. Thus, a hacker can send malformed packets to identify the TCP/IP implementation in use—a first step toward compromising the implementation.

Second, the implementation may fail when given a malformed packet, causing the PLC to cease functioning. For example, if the value of the total length field is 100 but the datagram length is only 80, the IP implementation may fail by attempting to read past the 80th byte.
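The kind of defensive check that guards against such failures can be sketched as follows. This is an illustrative validator for IPv4 headers only, not the logic of any particular PLC or of the isic utility; the function names and sample addresses are assumptions, and checksum verification is deliberately omitted because the malformed packets in question carry valid checksums.

```python
import struct
from typing import List


def ipv4_header_problems(datagram: bytes) -> List[str]:
    """Return a list of field violations found in an IPv4 datagram."""
    if len(datagram) < 20:
        return ["datagram shorter than the minimum 20-byte header"]
    version_ihl, _tos, total_length = struct.unpack("!BBH", datagram[:4])
    version = version_ihl >> 4
    header_length = (version_ihl & 0x0F) * 4  # field counts 32-bit words
    problems = []
    if version != 4:
        problems.append(f"version field is {version}, expected 4")
    if header_length < 20:
        problems.append(f"header length {header_length} is below the 20-byte minimum")
    elif header_length > len(datagram):
        problems.append(f"header length {header_length} exceeds the bytes received")
    if total_length != len(datagram):
        problems.append(
            f"total length field {total_length} does not match actual length {len(datagram)}"
        )
    return problems


def build_ipv4_datagram(total_length: int, payload_size: int,
                        version: int = 4, ihl_words: int = 5) -> bytes:
    """Build a test datagram with caller-chosen (possibly illegal) field values."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        (version << 4) | ihl_words, 0, total_length,
        0, 0, 64, 17, 0,                      # id, flags, TTL, protocol (UDP), checksum
        bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]),
    )
    return header + b"\x00" * payload_size


# The example from the text: the field claims 100 bytes, but only 80 arrive.
print(ipv4_header_problems(build_ipv4_datagram(total_length=100, payload_size=60)))
```

A stack that runs checks like these before touching the payload can discard the packet instead of reading past the end of its buffer.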

In this test, the PLC Ethernet interface connected to a Linux workstation with the utility isic installed. Isic is an open source utility that generates large numbers of IP packets with randomly seeded errors in IP fragmentation, version number, and header size. The lab conducted a series of test runs. In each, we generated 6,000 packets and then ran the ping utility. Ping checks that the PLC TCP/IP stack is minimally operational.

No errors cropped up from packets with fragmentation or version number errors. However, errors in header length caused serious problems. When these packets reached the PLC, it exited run mode and ceased to respond to TCP/IP and serial communications.

To resume operation, the PLC had to be powered down and back up, and its control program had to be reloaded through the serial port.

BROADCAST STORM ACCIDENT

Broadcast packets typically go to all computers on a network rather than to a specific host or device. They may emanate from network servers advertising their services or from a host trying to locate a service.

Broadcast messages normally use a small portion of the available bandwidth and are an important part of a properly functioning network. In large quantities, broadcast packets can overload a network or host.

The key point is that all network devices must spend some resources interpreting each broadcast packet, whether or not that packet is germane. Many devices behave abnormally if they receive too many broadcast packets in a short time.

Broadcast storms can be a considerable risk to PLCs. For example, several years ago, an Ethernet-based PLC network in a pulp and paper mill lost communication to the operator consoles due to a broadcast storm.

The cause was a controller with a faulty erasable programmable read-only memory, which resulted in the controller generating broadcast packets very rapidly. While this broadcast storm was accidental, a hacker could easily generate similar traffic.

This fourth test relies on address resolution protocol (ARP) packets. Host A sends an ARP request to get the Ethernet address of another host, say B. The request contains B's IP address and is broadcast to all hosts. Every host examines the request packet. Only the host owning the IP address (host B in this case) replies.
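For readers who want to see what such a request looks like, here is a sketch that assembles the bytes of one broadcast ARP request. The addresses are made-up examples, the function name is hypothetical, and the code only builds the frame in memory; it does not transmit anything.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6  # destination address that every host on the LAN accepts


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Return an Ethernet frame (without the trailing FCS) carrying an ARP request."""
    ethernet_header = BROADCAST_MAC + sender_mac + struct.pack("!H", 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                              # hardware type: Ethernet
        0x0800,                         # protocol type: IPv4
        6, 4,                           # hardware and protocol address lengths
        1,                              # operation: request
        sender_mac, socket.inet_aton(sender_ip),
        b"\x00" * 6,                    # target Ethernet address: unknown
        socket.inet_aton(target_ip),    # the IP address being resolved (host B)
    )
    # Pad to 60 bytes; the 4-byte FCS added by the hardware brings it to 64.
    return (ethernet_header + arp_payload).ljust(60, b"\x00")


frame = build_arp_request(bytes.fromhex("020000000001"), "192.0.2.10", "192.0.2.20")
print(len(frame), frame[:6].hex())
```

Because the destination is the broadcast address, every host on the segment must receive and examine the frame, which is what makes ARP requests a convenient load for this test.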

In each test run, ARP requests are transmitted at a fixed rate, and technicians monitor the PLC's behavior. The initial rate is 500 packets/second and increases by 500 for each subsequent run.

A SmartBits 6000 load generator was used to generate the packets. This equipment is invaluable when precisely metered traffic is called for. If the PLC connection breaks off, then the ARP transmissions cease, and reconnection attempts begin.

When presented with 1,500 ARP packets per second, the PLC ceased normal communications. While this is a lot of packets, on the 10-megabit-per-second links currently in vogue, it represents only 10% utilization. Indeed, this rate is easily attainable with commonplace tools and a generic PC.
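The 10% figure can be checked with simple arithmetic. The sketch below assumes each ARP request occupies a minimum-size 64-byte Ethernet frame plus the 8-byte preamble and 12-byte inter-frame gap; the article does not state how the figure was derived, so treat these frame-size assumptions as ours.

```python
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including the FCS
PREAMBLE_BYTES = 8          # preamble and start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def link_utilization(frames_per_second: float, link_bits_per_second: float,
                     frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Return the fraction of link capacity consumed by a steady frame rate."""
    bits_per_frame = (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return frames_per_second * bits_per_frame / link_bits_per_second


utilization = link_utilization(1_500, 10_000_000)
print(f"{utilization:.1%}")  # prints 10.1%
```

Under these assumptions, each frame costs 672 bit times, so 1,500 frames per second consume about 1.008 Mbps, or roughly a tenth of a 10-Mbps link.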

STARVATION NORMAL LONGER

Resource starvation attacks are similar to normal requests for service except that they arrive in such large numbers or so quickly that the host is unable to continue normal operation. The attacks can target any communication layer: Ethernet, IP, TCP, or application.

Typical TCP/IP stacks are vulnerable to a wide variety of resource starvation attacks. Usually there is a fixed limit on the number of simultaneous TCP connections. Thus, one common approach is to open so many TCP connections that normal communication is impossible.

In the BCIT test, a Linux workstation was connected to the PLC Ethernet card, and normal operation was initiated. The hacker utility jolt was used to force closure of this connection by sending large numbers of illegal packets.

Then the netcat utility was used to create the maximum number of TCP connections the PLC could handle, making it impossible to reopen the connection required for normal operation.

It was straightforward enough to force closure of the active connection. Using jolt for approximately 10 seconds caused this connection to time out. Then, calls to netcat established 255 TCP connections—the maximum supported by the PLC.

At that point, it was no longer possible to establish the original connection and resume normal PLC communications.
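The sequence can be modeled with a toy connection table. This is a simulation under simplifying assumptions, not the PLC's actual TCP/IP stack and not a reproduction of jolt or netcat; the class and peer names are hypothetical, and the 255-slot limit is taken from the test result above.

```python
class ConnectionTable:
    """Toy model of a TCP stack with a fixed number of connection slots."""

    def __init__(self, max_connections: int = 255):
        self.max_connections = max_connections
        self.active = set()

    def open(self, peer: str) -> bool:
        """Accept the connection if a slot is free; refuse it otherwise."""
        if len(self.active) >= self.max_connections:
            return False
        self.active.add(peer)
        return True

    def close(self, peer: str) -> None:
        """Release the slot held by peer, if any."""
        self.active.discard(peer)


def terminal_can_reconnect(max_connections: int = 255) -> bool:
    """Replay the attack sequence and report whether the terminal gets back in."""
    plc = ConnectionTable(max_connections)
    plc.open("programming-terminal")        # normal operation
    plc.close("programming-terminal")       # connection forced to time out
    for n in range(max_connections):        # attacker occupies every slot
        plc.open(f"attacker-{n}")
    return plc.open("programming-terminal")


print(terminal_can_reconnect())  # prints False
```

The legitimate terminal is locked out not by any flaw in its own request but simply because no slot remains, which is why this class of attack is hard to distinguish from heavy normal use.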

STANDARD IT SECURITY WON'T DO

The outcomes from these tests, along with our security incident database results, showed that hackers have both the means and the will to disrupt DCS and PLC operations.

Ten years ago, that might have been unlikely because process networks were proprietary systems that were isolated from most corporate systems. Today, that has changed because we are building sensor-to-boardroom integrated systems that use open standards such as Ethernet, TCP/IP, and Web technologies.

Depending on the corporate firewall to protect the process isn't the answer because it ignores the fact that probably half or more of all corporate hacking is from inside the firewall.

To make matters worse, there are many reasons why standard information technology (IT) security standards can't be directly applied to the plant floor. First, the nature of process control systems, with their reliance on unusual operating systems and applications, means many of the software-based security solutions will not run, or if they do run, they will interfere with the process systems.

Secondly, traditional IT security techniques focus on threats from outside the organization. As we noted earlier, this is not the primary risk for process control security.

So know this: The process control world must face the reality that it has to create its own security standards.

Behind the byline

Eric Byres (ebyres@bcit.ca) is research manager at the British Columbia Institute of Technology, where Joel Carter (jcarter@bcit.ca) and Amr Elramly (aelramly@bcit.ca) work as an assistant and an associate, respectively. Dan Hoffman (dhoffman@csr.csc.uvic.ca) is an associate professor in the computer science department at the University of Victoria. Read the entire technical paper, complete with Byres' recommendations, at www.isa.org/intech/cybersecurity.

