1 September 2005
Industrial Ethernet takes the test
There are questions about reliability and performance in manufacturing plants.
By James Gilsinn
One buzz phrase in the controls business is industrial Ethernet.
For the last few years, industrial equipment vendors have been developing products with Ethernet interfaces. The ubiquity, speed, high data throughput, and economy of the technology are terribly attractive to industrial users and manufacturers.
This is in stark contrast to information technology (IT) equipment, which has had Ethernet built-in for decades.
Industrial equipment vendors and users were primarily concerned with Ethernet's inherently non-deterministic performance characteristics. Ethernet, like many of the upper-layer protocols based on it, offers unreliable, connectionless, best-effort delivery. In general, there is no guarantee the data sent over the wire by the source will reach the destination. In the IT world, this is unacceptable, so protocols and methods have been developed to overcome these limitations.
In order to determine just how reliable and fast these protocols and methods were, a workgroup of the Internet Engineering Task Force (IETF) developed multiple standard metrics and methodologies for testing IT equipment's performance. These standard metrics and tests are now in use in describing the performance of Ethernet devices. Now that industrial equipment has started using Ethernet and its various protocols, it is useful to look at these standards in order to develop industrial Ethernet performance metrics and tests.
The National Institute of Standards and Technology (NIST) is working with ODVA (formerly the Open DeviceNet Vendors Association) to develop standard performance metrics and tests for Ethernet/Industrial Protocol (EtherNet/IP) devices.
These standard metrics and tests will measure the performance of the real-time data Input/Output (I/O) communications of the EtherNet/IP devices. The metrics and tests are being designed to help a user select between multiple devices of the same type based on their application's requirements.
First, let's look at the metrics used to describe the performance of the real-time I/O communications of the device. Next, we'll examine a basic description of EtherNet/IP and the relevant parts of the standard that relate to the performance metrics and tests. Then, we'll discuss multiple methodologies to measure the performance of a device. Some of these are mature while others are still in their infancy. Finally, we'll look at how these performance metrics and tests worked at the EtherNet/IP Interoperability Plug-Fest.
"Standard" performance metrics
The IETF maintains many of the "standards" for networking technology. These are Request for Comment (RFC) documents. They are made public for a time as draft documents, giving people around the world a chance to comment on the technology, and are eventually published as full RFCs.
RFCs can be on any topic related to the Internet, but some of them follow the "standards" track that requires a more stringent review process. Many of the common Internet technologies like the Internet Protocol (IP), Transmission Control Protocol (TCP), and User Datagram Protocol (UDP) are RFCs maintained by the IETF.
There are multiple RFCs for testing networking equipment performance. Two in particular are RFC 1242—"Benchmarking Terminology for Network Interconnection Devices"—and RFC 2544—"Benchmarking Methodology for Network Interconnect Devices."
Most, if not all, networking infrastructure equipment (switches, routers, hubs, and bridges) is tested against these two RFCs as well as others. In order to test industrial equipment, NIST and ODVA are using these RFCs and adapting them into metrics and tests for industrial equipment.
These RFCs cannot directly apply to industrial networked equipment since they are primarily for pass-through devices. Pass-through devices take data in one port and pass them out again through another port on the same device. Hubs and switches are good examples of pass-through devices.
Most industrial equipment in the scope of these metrics and tests consists of end-point devices. End-point devices, like computers, sensors, and actuators, produce or consume network data packets. The work has primarily focused on creating metrics and tests that can use the common terms and methodologies as defined by the RFCs but are still relevant to industrial equipment.
NIST and ODVA have borrowed four definitions in particular from RFC 1242 that will provide a common set of terminology to describe both the tests and results.
Throughput: The maximum continuous traffic rate a device can send/receive without dropping a single packet (Frames/sec at a given frame size).
Latency: The time interval between a message being sent to a device and a corresponding event occurring.
Jitter/Variability: The difference between the minimum and maximum time for a particular series of events. Standard deviation has been included in this measurement.
Overload Behavior: A description of the behavior of a device in an overload state. (This is qualitative.) Overload states exist when the device's internal resources receive either too much information to process or bad information and the device goes into a state other than its normal run mode. Data recorded for overload states should 1) describe what the device does when its resources are exhausted, 2) describe what the response is to the system management in an overload state, 3) describe how well a device recovers from an overload state.
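As a rough illustration of the jitter/variability metric defined above, the sketch below computes the difference between the minimum and maximum of a series of timing samples, along with the standard deviation. The function name and the sample values are hypothetical and are not taken from the NIST/ODVA work.

```python
import statistics


def jitter_summary(samples_ms):
    """Return (max - min, sample standard deviation) for timing samples."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    spread = max(samples_ms) - min(samples_ms)
    std_dev = statistics.stdev(samples_ms)  # sample standard deviation
    return spread, std_dev


# Hypothetical latency measurements in milliseconds (illustrative only).
latencies_ms = [10.0, 10.5, 9.5, 12.0, 10.0]
spread, std_dev = jitter_summary(latencies_ms)
print(f"jitter (max - min): {spread:.2f} ms, std dev: {std_dev:.2f} ms")
```

Reporting both numbers is useful because the min-to-max spread is driven by a single worst case, while the standard deviation describes how the samples behave as a whole.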
NIST and ODVA added two modifications to the Latency term.
Response Latency: The closed-loop latency of a device to process a command and respond to it.
Action Latency: The closed-loop latency of a device to process a command and return a desired physical output (e.g. analog/digital output signal), or the closed loop latency of a device to process a physical action (e.g. analog/digital input signal) and return a response.
The response latency and action latency terms were added since the devices under test (DUTs) will be end-point devices and cannot simply be tested on their ability to forward packets.
In addition, there are these terms to consider:
Unloaded: The test occurs with no background traffic on the network.
Loaded: The test occurs with background traffic applied to the network.
Within Spec: The test occurs within the manufacturer's specifications.
Outside-Spec: The test occurs outside the manufacturer's specifications.
About EtherNet/IP standard
ODVA released the EtherNet/IP standard in June 2001. This standard took the Common Industrial Protocol (CIP) developed for DeviceNet and ControlNet and layered that over Ethernet, Transmission Control Protocol over Internet Protocol (TCP/IP), and User Datagram Protocol over Internet Protocol (UDP/IP). It allows simple I/O devices like sensors/actuators or complex control devices like robots, Programmable Logic Controllers (PLCs), welders, and process controllers to exchange time-critical application information.
EtherNet/IP uses a peer-to-peer and producer/consumer architecture for data exchange versus a master/slave or command/response architecture. This allows for greater flexibility in the network and system designs, which fits better into the Ethernet networking model. In addition, EtherNet/IP splits its communications into configuration and management traffic (explicit messaging) and real-time data I/O traffic (implicit messaging). Configuration and management traffic uses TCP/IP, and real-time I/O traffic uses UDP/IP. Since real-time I/O traffic uses UDP/IP, it must maintain information about the packet sequencing and connection at the CIP application layer. These exist as the EtherNet/IP Sequence Number and EtherNet/IP Connection ID. The performance metrics and tests described are primarily concerned with the real-time I/O traffic.
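Because UDP gives no delivery guarantee, a receiver can use a per-connection sequence number to notice missing packets. The sketch below is a minimal illustration of that idea, not the EtherNet/IP implementation. It assumes the sequence number is a 32-bit counter that increases by one per packet and wraps around, and the function name is hypothetical.

```python
SEQ_MODULUS = 2 ** 32  # assumption: 32-bit wrapping sequence counter


def count_missing(sequence_numbers):
    """Count packets missing from one connection's received sequence numbers."""
    missing = 0
    for prev, cur in zip(sequence_numbers, sequence_numbers[1:]):
        gap = (cur - prev) % SEQ_MODULUS
        # A gap of 1 is normal. Gaps of 0 or of half the counter range and
        # more are treated as duplicates or reordering and are ignored here.
        if 1 < gap < SEQ_MODULUS // 2:
            missing += gap - 1
    return missing


print(count_missing([1, 2, 3, 5, 6]))                  # one packet skipped
print(count_missing([2**32 - 2, 2**32 - 1, 0, 2]))     # skip across the wrap
```

In a capture with many connections, the sequence numbers would first be separated by Connection ID so that each stream is checked on its own.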
The producer/consumer model for EtherNet/IP allows multiple modes of communication to work for real-time data exchange. The most common mode for producing data is cyclic production. During cyclic production, the producer will send data at a particular rate called the Requested Packet Interval (RPI). The RPI and corresponding Accepted Packet Interval (API) dictate the speed of the data produced over the network regardless of the rate at which the actual data values change.
EtherNet/IP also uses an object-oriented model. Some objects, such as the Identity object, TCP/IP object, and the Ethernet link object, are necessary for EtherNet/IP devices. These map basic information about the device into the object model. Other objects are device specific, and while basic definitions of them may exist in the specification, the exact information recorded in the object is specific to the device and application.
Performance testing tactics
These tests are currently being developed for EtherNet/IP. However, they are generic enough to be applicable to all industrial Ethernet.
Cyclic/API jitter testing—Unloaded cyclic/API jitter testing: Since the RPI/API is the basis for most of the real-time I/O communications over EtherNet/IP, it is a natural place to start when looking at device performance. The ability of a device to maintain that API value under different conditions may be very important to the control system. No device will perform perfectly, so tests will show how closely a particular device is able to maintain the API value. They will not determine a pass or fail value. The user will have to determine that, and it will depend on whether the device's performance characteristics meet the application's needs.
The basic premise of the unloaded cyclic/API jitter tests will be to determine the maximum throughput of the device at a particular API value. Both producer and consumer devices will be tested. Devices receive no penalty for tests performed outside their published capabilities. The tests will result in a 2D matrix of maximum throughput versus API value.
Cyclic/API jitter testing—Loaded cyclic/API jitter testing: Since many EtherNet/IP devices are on platforms with limited resources, it is necessary to determine the effect of background traffic or other out-of-bounds conditions on the device. The procedure for the loaded cyclic/API jitter test is very similar to the unloaded test; however, the DUT will be asked to produce or consume data on a noisy network or outside its published specifications. The tests will be set up in a similar way to the unloaded test, except there may be additional hardware for background traffic generation during the test.
The device will first endure extraneous background traffic. Broadcast and multicast traffic at different network layers will be sent to the device at varying rates to determine how the device and its network stack handle the extra load. Next, the device will have to maintain its I/O connections while also being asked to support other Ethernet-based protocols, like FTP and HTTP. Since Ethernet-based devices must support other capabilities, these protocols simulate real-world conditions like program upload-download-compare and a Web page download.
Latency testing—response: NIST and ODVA have developed two modifications to the latency term for testing industrial devices. These tests relate to actual inputs and outputs to and from the device and are the most relevant real-world tests.
Response latency tests the ability of a device to respond to a request for information. The test helps to determine the efficiency of the communications stack. Since the command and response will only be reading information from the device's memory, there should be a minimum of processing overhead associated with the test results.
For EtherNet/IP, the simplest response latency test would be to have the device return its identity object information. Since the identity object is a specific EtherNet/IP and CIP concept, the command would have to process through the entire communications stack to the application layer. Other standard objects like the TCP/IP and Ethernet link object could also work, but the potential exists for devices to trick the test by responding at a lower level.
Time analysis for response latency test.
TNetwork = TN1 + TSW + TN2 (1)
TDUT_RL = 2 × TStack + TProc (2)
TRL = 2 × TNetwork + TDUT_RL + TTE (3)
TNetwork = Latency time due to network overhead.
TN1 = Latency time due to the first network physical interfaces.
TSW = Latency time due to the network switch or other infrastructure equipment.
TN2 = Latency time due to the second network physical interfaces.
TDUT_RL = Latency time due to the Device under Test for the Response Latency Test.
TStack = Latency time due to the DUT's network protocol stack.
TProc = Latency time due to the DUT's processor overhead.
TRL = Latency time for the Response Latency Test.
TTE = Latency time due to the test equipment.
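The time analysis above can be rearranged to isolate the device's own contribution from a measured round trip: TDUT_RL = TRL - 2 × TNetwork - TTE. The sketch below works through that arithmetic. The function name and all numeric values are hypothetical, and it assumes the network and test equipment latencies have been characterized separately.

```python
def dut_response_latency(t_rl, t_n1, t_sw, t_n2, t_te):
    """Estimate TDUT_RL from a measured response latency TRL.

    All arguments share one time unit (microseconds in the example).
    """
    t_network = t_n1 + t_sw + t_n2          # TNetwork = TN1 + TSW + TN2
    return t_rl - 2 * t_network - t_te      # TRL = 2*TNetwork + TDUT_RL + TTE


# Hypothetical values in microseconds (illustrative only).
dut = dut_response_latency(t_rl=500, t_n1=5, t_sw=20, t_n2=5, t_te=10)
print(f"estimated device latency: {dut} us")
```

The result still combines two passes through the protocol stack with the processing time (2 × TStack + TProc); this measurement alone cannot separate them.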
Action latency tests the ability of a device to either cause or measure a physical action and determine the time between the action and the associated network packet. If the device is being commanded to act, it is the time between the device receiving the network packet and the action happening. If the device is producing data, it is the time between the physical action and the device sending the network packet. These tests will be highly device specific and will require application-level programming on the part of the tester. Multiple error sources will affect this test, since the test equipment may consist of more than one device.
In order to eliminate the need for multiple devices to execute the test, it may be possible to construct a loop-back test. It would connect an output on the device to an input, command the device to send an output, and wait for the device to measure the input. While not all devices will have both inputs and outputs, many of the devices subject to this category of testing will have both. The loop-back test would also limit the test equipment to one device, since the test equipment would only have to measure the time delay between network packets.
The loop-back test would be subject to many different types of errors and latencies. The test will be much more valuable to users than to developers, since it will not show the effects of the individual errors or latencies. The major sources of error and latency will probably be from the physical energy conversion creating an output signal and reading the input. These numbers are usually familiar to the vendor, and the performance analysis can reconcile them. Another source of error and latency would be due to the processing overhead and network protocol stack.
Time analysis for action loop-back test
TDUT_ALLB = 2 × TStack + 2 × TProc + 2 × TBus + TConOut + TWire + TConIn (4)
TALLB = 2 × TNetwork + TDUT_ALLB + TTE (5)
TDUT_ALLB = Latency time for the device under test for the action latency loop-back test.
TBus = Latency time for the internal device bus (may be zero if device does not use a bus).
TConOut = Latency time to perform the output energy conversion.
TWire = Latency time for the signal to travel along the wire.
TConIn = Latency time to perform the input energy conversion.
TALLB = Latency time for the Action Latency Loop-Back Test.
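To show how the loop-back terms add up and which one dominates, the sketch below totals a hypothetical latency budget. It assumes the input conversion term TConIn from the definitions is part of the device sum alongside TConOut and TWire. The function name and every number are illustrative only.

```python
def loop_back_breakdown(t_stack, t_proc, t_bus, t_con_out, t_wire, t_con_in,
                        t_network, t_te):
    """Return (TALLB, per-component contributions) for the loop-back test."""
    contributions = {
        "network": 2 * t_network,
        "stack": 2 * t_stack,
        "processor": 2 * t_proc,
        "bus": 2 * t_bus,
        "output conversion": t_con_out,
        "wire": t_wire,
        "input conversion": t_con_in,
        "test equipment": t_te,
    }
    return sum(contributions.values()), contributions


# Hypothetical budget in microseconds (illustrative only).
total, parts = loop_back_breakdown(
    t_stack=100, t_proc=50, t_bus=0, t_con_out=200, t_wire=1, t_con_in=300,
    t_network=30, t_te=10,
)
largest = max(parts, key=parts.get)
print(f"TALLB = {total} us, largest contributor: {largest}")
```

With these made-up numbers the energy conversion terms account for more than half of the total, which matches the expectation stated above that they will probably be the major sources of latency.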
There have been three EtherNet/IP interoperability plug-fests where EtherNet/IP vendors developing products have come to determine how interoperable their devices are with other vendors' products. The most recent was in February.
The interoperability testing was split into two separate phases. During the first phase, every device was tested against every other device individually to determine what features worked and what features did not. The second phase incorporated all the devices into one large system, which had the devices attempt to communicate with one another in groups.
The individual device-to-device testing was meant to determine how well each device communicated with all the other devices. A series of communications tests took place for each device pair, resulting in a set of matrices that, after analysis, helped determine how well all the devices interoperated on an individual basis. The system-testing phase determined if there were additional interoperability issues communicating among multiple devices at the same time. Another series of communications tests ran for the groups of devices, resulting in a dataset for each group. We chose the groups and tests based on the results of the device-to-device testing, so devices were not marked down for features already recorded as not working.
Performance testing during plug-fests: While the interoperability testing was taking place, data was archived on the performance of the different devices. Since the performance tests were not the focus of the event, only the cyclic/API producer testing happened online during the interoperability testing. This allowed the performance test equipment to remain a passive component in the system while not disturbing the interoperability testing. The latency testing demonstration was on a stand-alone test system not connected to the interoperability system network.
The performance test equipment was hooked to the Plug-Fest infrastructure equipment and allowed to listen to all the traffic in the network. The performance test equipment recorded traffic from the network and sorted that traffic based on the Source IP address, Destination IP address, and EtherNet/IP Connection ID. This allowed the individual streams of real-time I/O traffic to undergo individual analysis.
Plug-fest performance data analysis: For the first attempt at analyzing the Plug-Fest performance data, we calculated some basic statistical values for each of the individual real-time I/O traffic streams. The statistical values calculated were the average, minimum value, maximum value, and standard deviation. These statistical values relate somewhat to the metrics discussed earlier, but they will have to be refined before a concrete connection is demonstrable. The average relates to the throughput and clock skew, while the minimum, maximum, and standard deviation relate to the jitter or variability in the throughput.
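The sorting and statistics steps described above might look something like the following sketch. It assumes the statistics were taken over the time between consecutive packets in each stream, which the text does not state explicitly. The record layout, function name, and sample data are hypothetical.

```python
from collections import defaultdict
import statistics


def stream_statistics(packets):
    """Group packets by (source IP, destination IP, connection ID) and
    summarize the intervals between consecutive packets in each stream.

    packets: iterable of (timestamp_ms, src_ip, dst_ip, connection_id).
    """
    streams = defaultdict(list)
    for timestamp, src, dst, conn_id in packets:
        streams[(src, dst, conn_id)].append(timestamp)

    results = {}
    for key, times in streams.items():
        times.sort()
        intervals = [b - a for a, b in zip(times, times[1:])]
        if len(intervals) < 2:
            continue  # too few packets for a standard deviation
        results[key] = {
            "mean": statistics.mean(intervals),
            "min": min(intervals),
            "max": max(intervals),
            "stdev": statistics.stdev(intervals),
        }
    return results


# Hypothetical capture: two interleaved streams, timestamps in milliseconds.
capture = [
    (0, "10.0.0.1", "10.0.0.9", 1), (5, "10.0.0.2", "10.0.0.9", 2),
    (10, "10.0.0.1", "10.0.0.9", 1), (21, "10.0.0.1", "10.0.0.9", 1),
    (25, "10.0.0.2", "10.0.0.9", 2), (30, "10.0.0.1", "10.0.0.9", 1),
    (40, "10.0.0.1", "10.0.0.9", 1), (45, "10.0.0.2", "10.0.0.9", 2),
    (65, "10.0.0.2", "10.0.0.9", 2),
]
stats = stream_statistics(capture)
for key, values in stats.items():
    print(key, values)
```

Comparing each stream's mean interval with its configured API gives a first look at throughput and clock skew, while the minimum, maximum, and standard deviation describe the jitter.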
While none of the devices performed the same, it was possible to see different trends in the data and determine possible causes for their occurrence. These trends usually related to different spike patterns in the data. The following describes some of the identifiable trends:
Hardware versus software clock: One of the first trends in the data is whether a device uses a hardware or software clock. The spike patterns in the data sets usually related to some overhead in the processor that caused a packet to arrive late. For devices using a hardware clock for their RPI/API timing, each late packet was followed by an early packet, resulting in a downward spike of almost the same size. The downward spike occurred because the device returned to the original RPI/API sequence time. Devices that use a software clock for their RPI/API timing did not show this downward spike, because their timing ran from the time the last packet was transmitted.
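The two spike patterns can be reproduced with a small simulation. This is a simplified model and not measured plug-fest data: it assumes a hardware clock schedules every packet on a fixed timeline, while a software clock schedules each packet one API after the previous packet was actually sent. All names and values are hypothetical.

```python
def send_times(api_ms, delays_ms, clock):
    """Return actual send times for a cyclic producer.

    delays_ms[i] is the extra processing delay suffered by packet i.
    clock="hardware": packets are scheduled on a fixed timeline (i * API).
    clock="software": each packet is scheduled one API after the previous
    packet's actual send time.
    """
    times = []
    previous = 0
    for i, delay in enumerate(delays_ms, start=1):
        if clock == "hardware":
            scheduled = i * api_ms
        elif clock == "software":
            scheduled = previous + api_ms
        else:
            raise ValueError(f"unknown clock type: {clock}")
        previous = scheduled + delay
        times.append(previous)
    return times


def intervals(times):
    """Time between consecutive packets."""
    return [b - a for a, b in zip(times, times[1:])]


delays = [0, 0, 3, 0, 0]  # the third packet is 3 ms late
hw = intervals(send_times(10, delays, "hardware"))
sw = intervals(send_times(10, delays, "software"))
print("hardware clock intervals:", hw)  # [10, 13, 7, 10]
print("software clock intervals:", sw)  # [10, 13, 10, 10]
```

In the hardware-clock case the 13 ms interval is followed by a matching 7 ms interval as the device returns to its original schedule. In the software-clock case the delay is never recovered, so the whole sequence shifts later.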
However, the metrics and tests for industrial Ethernet devices are going to be substantially different when implemented, due to the limitations of industrial devices. In addition, the methodologies developed for industrial Ethernet performance will require different implementations from device to device. Given the benefits to users, who at this point have very limited ways to compare similar products from multiple vendors, it is necessary to try to standardize as many of these metrics and tests as possible.
The tests we have reviewed are for EtherNet/IP devices. It is NIST's goal to introduce these metrics and tests to other industrial Ethernet standards groups once they have been worked out successfully for EtherNet/IP. By working through all of the development issues with one network first, the migration to other industrial Ethernet networks should prove smooth.
Behind the byline
James Gilsinn (email@example.com) is an electrical engineer with the Manufacturing Engineering Lab at the National Institute of Standards and Technology (NIST). He is the lead of the EtherNet/IP Performance Workgroup and the technical editor for the ISA SP99 (Manufacturing and Control Systems Security) Part 2 standard.
Ethernet units to increase sevenfold in five years
By Ralph Rio
Ethernet provides users an industrial network that has a lower total cost of ownership with improved adaptability for their changing business needs.
This, combined with widespread availability and market familiarity, continues to drive Ethernet's use in industrial automation applications for a broad range of industries.
The worldwide market for industrial Ethernet will grow at a Compounded Annual Growth Rate (CAGR) of 51.4% over the next five years. The market totaled 840,000 units in 2004 and is forecast to total just over 6.7 million units in 2009.
Worldwide sales of industrial Ethernet switches totaled $124.4 million in 2004 and will probably grow to $939.8 million in 2009, a CAGR of 49.9%.
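For readers who want to check the growth figures, the standard CAGR formula is (end / start) ^ (1 / years) - 1. The sketch below applies it to the rounded numbers quoted above. The function name is hypothetical, and because the inputs are rounded the results land close to, rather than exactly on, the quoted rates.

```python
def cagr(start, end, years):
    """Compound annual growth rate as a fraction (0.5 means 50% per year)."""
    return (end / start) ** (1 / years) - 1


units_growth = cagr(840_000, 6_700_000, 5)   # units, 2004 to 2009
revenue_growth = cagr(124.4, 939.8, 5)       # $ millions, 2004 to 2009
print(f"units:   {units_growth:.1%}")
print(f"revenue: {revenue_growth:.1%}")
```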
Ethernet technology is penetrating the device level of the automation hierarchy. This provides one of those rare opportunities where a new technology can upset the supplier landscape.
Complex mix of participants
The aggregate market for industrial Ethernet devices has grown substantially during the last few years in spite of a difficult market for automation equipment in most areas.
The use of commercial-off-the-shelf (COTS) technologies makes the application of Ethernet to industrial networking significantly easier for pilot projects, system extensions, application development, and new automation systems.
Devices and systems that support Ethernet and the Internet Protocol suite leverage the vast IT infrastructure. This includes switches, firewalls, network management tools, development tools, and messaging standards. While some may start with a cheap office-grade switch purchased at a local retail computer outlet, they soon learn the importance of reliability in an industrial environment. One trip to the plant manager's office to explain why a $30 home office switch brought down production becomes a hard lesson. Protection for a factory environment must cover heat, vibration, power line noise, dust, and other factors.
Currently, nearly all industrial Ethernet switches have IP20 to IP40 ratings. We are forecasting an increase in the need for IP67 to IP68 as industrial networking expands deeper into process control applications for those industries where liquids are present. Industrial Ethernet brings together a complex mix of participants. Plant Engineering, IT, and independent Systems Integrators become involved. Surprisingly, user preferences for Industrial Ethernet are not primarily dependent on price.
Behind the byline
Ralph Rio (firstname.lastname@example.org) has degrees in mechanical engineering and management science. He is the director of research at ARC Advisory Group.