Smart instruments in safety instrumented systems
The U.K.'s largest nuclear site operator implements IEC61508 and finds the quality of instrument firmware to be variable, but improving
By Tom Nobes
Instruments containing firmware (smarts) have advantages over traditional analogue instruments in terms of self-diagnosis, ease of calibration, improved accuracy, and cost.
However, the safety justification of smarts can present difficulties. International standards and the British Nuclear Installations Inspectorate’s (NII’s) licensing process expect that, even when there are only modest safety claims to be justified, we (Sellafield) assemble detailed information about the smart’s development process and the final product.
This expectation works on the premise that without full visibility of the internal structure of the smart, Sellafield cannot demonstrate it is safe for the proposed safety function.
We define a smart instrument as one that measures or directly controls a single process variable, uses a microprocessor and is a commercial ‘off the shelf’ instrument. It includes flexibility in its use due to parameters set by the user. Its life cycle includes generic fixed firmware by the manufacturer and particular configuration by the user. Smarts are not restricted to measurements but include actuators, valves, motor variable speed drives, and other control equipment.
So smarts include pressure, temperature, flow, level, density, pH, and conductivity transmitters, three-term (PID) controllers, recorders, motor starters, speed controllers, ‘soft’ starters, and control valve actuators (and probably a lot more things too).
Smart instruments can also communicate using the Highway Addressable Remote Transducer (HART) protocol, which is simple, cheap, and robust and has proven reliable.
Excluded from the definition are PLC, SCADA, and DCS systems, as are “unique” programmable systems: in essence, anything that contains a “full variability language.”
Firmware is defined as “fixed programming language,” where the user has no ability to change the program itself, but can change the configuration by selecting parameters (zero, range, damping, etc.) and the selection of various sub-routines (square-root extraction, batch totalizing, and others).
Non-smart, “dumb” instruments therefore feature neither a microprocessor nor firmware. They include simple pressure switches and older transmitters based on analogue op-amps, etc.
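The split between fixed firmware and user configuration can be pictured with a minimal sketch. It assumes a hypothetical transmitter whose only user parameters are the lower and upper range values and a square-root extraction switch. The names TransmitterConfig and output_ma are invented for illustration, damping is not modelled, and the output is simply clamped to 4-to-20mA, which is a simplification of how a real instrument saturates.

```python
import math
from dataclasses import dataclass


@dataclass
class TransmitterConfig:
    """Hypothetical user-set parameters; the scaling routine itself is fixed."""
    lower_range: float          # process value that should give 4 mA ("zero")
    upper_range: float          # process value that should give 20 mA
    square_root: bool = False   # select the square-root extraction sub-routine


def output_ma(process_value: float, cfg: TransmitterConfig) -> float:
    """Scale a process value to a 4-20 mA signal using the configuration."""
    span = cfg.upper_range - cfg.lower_range
    fraction = (process_value - cfg.lower_range) / span
    # Clamp to the configured range (simplifying assumption).
    fraction = min(max(fraction, 0.0), 1.0)
    if cfg.square_root:
        fraction = math.sqrt(fraction)
    return 4.0 + 16.0 * fraction


linear = TransmitterConfig(lower_range=0.0, upper_range=100.0)
flow = TransmitterConfig(lower_range=0.0, upper_range=100.0, square_root=True)
print(output_ma(50.0, linear))  # mid-range input, linear scaling
print(output_ma(25.0, flow))    # quarter-range input, square-root scaling
```

The user changes only the values held in the configuration; the routine that applies them stays the same, which is the distinction the definition of firmware draws.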
Advantages of smartness
Historically, it has been preferable to use discrete (on/off) type signals in safety instrumented systems (SIS). This was because these signals came from dumb discrete instruments like pressure switches, thermostats, limit switches, and the like. In the early 1980s, dumb analogue transmitters came into use in SIS because:
Some measurements can only be derived in analogue form. (Quantities such as conductivity, pH, velocity, or speed are not discrete on/off events in themselves.)
Analogue transmitters output a “live” reading, so one knows, at the very least, that they are operational.
Simple out-of-range diagnostics can be used (<3.6mA or >21mA [NAMUR NE43]).
Deviation diagnostics can work even when the hazardous condition is not present, when redundant analogue measurements are in service (i.e. automatically compare one 4-to-20mA signal with another).
A single analogue transmitter and a single trip-amplifier might replace multiple switches (replace high-high, high, low, and low-low pressure switches).
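The diagnostics listed above are simple enough to sketch in a few lines. The example below is illustrative only: the 3.6mA and 21mA limits are the out-of-range figures quoted above, while the deviation tolerance and the four trip setpoints are hypothetical values chosen for the example, and the function names are invented.

```python
from typing import List

# Out-of-range limits quoted in the text (mA).
FAULT_LOW_MA = 3.6
FAULT_HIGH_MA = 21.0


def loop_status(current_ma: float) -> str:
    """Classify a loop current as 'fault_low', 'fault_high', or 'ok'."""
    if current_ma < FAULT_LOW_MA:
        return "fault_low"
    if current_ma > FAULT_HIGH_MA:
        return "fault_high"
    return "ok"


def deviation_alarm(a_ma: float, b_ma: float, max_dev_ma: float = 0.5) -> bool:
    """True when two redundant signals disagree by more than max_dev_ma.

    max_dev_ma is a hypothetical tolerance, not a figure from the article.
    """
    return abs(a_ma - b_ma) > max_dev_ma


def active_trips(current_ma: float, low_low: float = 5.6, low: float = 7.2,
                 high: float = 16.8, high_high: float = 18.4) -> List[str]:
    """Trip-amplifier sketch: one analogue signal, four hypothetical setpoints."""
    trips = []
    if current_ma <= low_low:
        trips.append("low_low")
    if current_ma <= low:
        trips.append("low")
    if current_ma >= high:
        trips.append("high")
    if current_ma >= high_high:
        trips.append("high_high")
    return trips


print(loop_status(3.2))             # below the low limit
print(deviation_alarm(12.0, 13.0))  # redundant pair disagreeing by 1 mA
print(active_trips(19.0))           # both high setpoints exceeded
```

The deviation check runs continuously on healthy signals, which is why it can reveal a fault before any hazardous condition arises.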
Unfortunately, dumb analogue transmitters are not inherently fail-safe because:
The health of dumb instruments cannot be determined without testing. This testing must be frequent, requires the instrument to be inhibited during the test (a potential hazard in itself), and is only valid for an unknown time after the test.
Only limited diagnostics of field wiring are possible (current flow <3.6mA or >21mA).
Dumb transmitters fail in three modes: drift high (reading too high), drift low (reading too low), and stuck (jammed at one reading). One common problem occurs when extra resistance is added in the field cabling (e.g. a bad joint). Here the transmitter may not be able to drive more than, say, 16mA round the loop. The same problem occurs if the 24Vdc power supply dips to, say, 15V. Thus, “high” alarms become particularly problematic.
Dumb transmitters measuring parameters that are complex (pH, dissolved O2, etc.) can also exhibit “strange” fault responses (readings of pH that vary with temperature, etc.). These are very difficult to diagnose in operation and not easy to diagnose in maintenance either.
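The loop-resistance problem described above follows from Ohm's law, and a rough calculation shows why “high” alarms suffer first. The sketch below assumes a two-wire transmitter that needs a minimum of 12V across its terminals; that figure, the resistances, and the 18mA trip setting are hypothetical values for illustration, not data from Sellafield.

```python
def max_loop_current_ma(supply_v: float, loop_resistance_ohm: float,
                        min_transmitter_v: float = 12.0) -> float:
    """Largest current (mA) the loop can carry before the transmitter
    runs out of terminal voltage. min_transmitter_v is an assumed figure."""
    headroom_v = supply_v - min_transmitter_v
    if headroom_v <= 0.0:
        return 0.0
    return 1000.0 * headroom_v / loop_resistance_ohm


def high_alarm_reachable(trip_ma: float, supply_v: float,
                         loop_resistance_ohm: float) -> bool:
    """True if the loop can physically reach the high trip setting."""
    return max_loop_current_ma(supply_v, loop_resistance_ohm) >= trip_ma


# Healthy loop: 24 V supply, 250 ohm total loop resistance.
print(max_loop_current_ma(24.0, 250.0))
# Bad joint adds 500 ohm: the loop now tops out at 16 mA.
print(max_loop_current_ma(24.0, 750.0))
# Supply dips to 15 V on the healthy loop.
print(max_loop_current_ma(15.0, 250.0))
# A hypothetical 18 mA high trip can no longer be reached.
print(high_alarm_reachable(18.0, 24.0, 750.0))
```

In both degraded cases the transmitter still reads correctly at the low end of its range, so the fault stays hidden until the process moves toward the high trip.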
In the 1990s, the arrival of smart instruments offered cost savings in the whole life cycle. Examples from breweries, steel-mills, and chemical plants show lifetime cost reductions of between 20 and 40% (www.hartcomm.org). Smarts are flexible in their configuration, and configurations can easily change. We can order smarts without having a known measurement specification.
Because smarts are configurable in so many parameters, there is less need for spares of identical configuration, since we can just select the configuration we want. Experience in the Sellafield THORP plant showed reductions in numbers of spare pressure transmitters from 260 dumbs to just 13 smarts, saving about $5 million. Because the smart instrument can be accessed over the HART network, this reduces the need to send technicians into a hazardous environment. This includes reductions in radiation dose, irksome clothing, poor physical access, working at heights, or in hot or humid areas, and the like.
Concerns over nuclear SIS
Traditional means of demonstrating that dumb instruments are adequate for use in nuclear safety applications typically involved analysis of the hardware reliability of the components using techniques such as Failure Modes, Effects, and Diagnostic Analysis (FMEDA). With the introduction of firmware in smarts, a new class of failure mechanism is possible: systematic firmware faults.
This diagram illustrates how additional failure mechanisms come into play with smart as opposed to dumb instruments. The concern is the firmware introduces new, unrevealed failure mechanisms, which prevent the instrument performing its safety function.
Arguments exist that diagnostics within firmware increase the reliability of smarts. The paradox of this argument is that diagnostics also increase the complexity of the smart (smarts often contain tens of thousands of lines of code), and therefore make it more difficult to analyze and to demonstrate it is free from dangerous systematic failures.
Let us also keep a sense of perspective on this: Dumb instruments were not perfect either, and our experiences with them and their manufacturers were often variable.
How do we substantiate smarts?
There is a framework for smart substantiation between the NII and the U.K. nuclear industry. Some 30 different types of instruments are to undergo assessment over the next few years.
The framework is the two-legged approach with those legs being Production Excellence and Independent Confidence Building Measures.
Production Excellence: This leg seeks to show an instrument has been conceived, specified, designed, tested, offered as a solution, and is supported by the manufacturer to a standard suitable for use in safety applications. The EMPHASIS tool serves to ask a series of questions to assess production excellence. EMPHASIS is a three-part Microsoft Excel based interactive spreadsheet. It is specifically for the British nuclear industry with input from the NII during its production. EMPHASIS asks a series of questions, the answers to which determine what the next question will be. For most questions, answers supported by appended evidence are required.
Independent Confidence Building Measures: This leg is an independent and thorough “reasonably practicable” assessment of the instrument’s fitness for purpose, comprising the following:
Complete, and preferably diverse, checking of the finally validated instrument by a team that is independent of the manufacturer.
Independently assessed testing, covering the full scope of test activities (e.g. verification, commissioning, and dynamic testing) including traceability of tests to specification and confirmation that the specification is met.
Examples are statistical tests, static and dynamic analysis of firmware, and the like.
Balance of Risk Arguments: So what do I do if an existing dumb instrument on my plant fails or is becoming obsolete, and I can only replace with a smart instrument? If used in an SIS and the replacement instrument is not “like for like” (i.e., same manufacturer, type, and version [both hardware & firmware]), then the instrument needs to be substantiated.
The Framework includes guidance on “balance of risk and reasonable confidence” arguments, which may allow the use of smart instruments for a temporary period prior to substantiation.
Experience with smart instruments
Our experience proved surprising.
It was often difficult to tell if an instrument was dumb or smart. Manufacturers’ data sheets often did not specify whether microprocessors were in use, operations and maintenance manuals often did not go to this level of detail on component parts, and visual examination often proved inconclusive.
Existing models became smart overnight. An instrument purchased just two years earlier was known to be dumb; purchasing an instrument with exactly the same part number now delivered a physically identical smart instrument. Manufacturers thought this was a good thing.
It became clear the introduction of smarts was proceeding faster than we could gather reliability data on them in non-SIS.
The vacuum transmitter: This transmitter was in a non-safety application, monitoring ventilation depressions in Gloveboxes. Commissioning tests revealed that it worked correctly between atmospheric pressure and -6mB. If the transmitter was pressurized, it again behaved correctly (remained at 4mA) up to about +6mB overpressure. Above +6mB, however, the output would increase, and in effect the transmitter would work in reverse (+7mB would read -1mB, +8mB would read -2mB, +9mB would read -3mB, etc.). This was clearly an unrevealed hazardous condition, since a pressurized Glovebox would appear as a ventilated Glovebox. Discussions with the manufacturer revealed a poor approach to firmware quality. The fault was “a well-known bug,” but the manufacturer had not informed any of its customers and could not say which serial-numbered transmitters were affected. These transmitters have been removed, a search has been completed for any others at Sellafield, and steps have been taken to forbid all future purchases.
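The fold-back behaviour of the vacuum transmitter is easy to miss if tests stop at the edge of the calibrated range. The sketch below models the fault as described above and shows the kind of out-of-range sweep that reveals it. The function names are invented, the holding of the reading at -6mB for deeper vacuum is an assumption, and nothing here represents the manufacturer's actual firmware.

```python
from typing import Callable, List


def faulty_reading_mb(true_mb: float) -> float:
    """Indicated pressure (mB) for the faulty behaviour described in the text.

    Negative values are vacuum (depression); positive values are overpressure.
    """
    if true_mb <= 0.0:
        # Normal working range, 0 to -6 mB. Holding at -6 mB beyond that
        # is an assumption made for this sketch.
        return max(true_mb, -6.0)
    if true_mb <= 6.0:
        return 0.0                 # output stays at 4 mA
    return -(true_mb - 6.0)        # fold-back: overpressure shown as vacuum


def find_foldback(read: Callable[[float], float],
                  test_points_mb: List[float]) -> List[float]:
    """Return the true overpressures that the instrument reports as vacuum."""
    return [p for p in test_points_mb if p > 0.0 and read(p) < 0.0]


sweep = [-6.0, -3.0, 0.0, 3.0, 6.0, 7.0, 8.0, 9.0]
print(find_foldback(faulty_reading_mb, sweep))
```

A sweep confined to the working range of 0 to -6mB would pass, which is the sense in which the fault was unrevealed.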
The paperless chart recorder: Several installations of these recorders started to exhibit faults, mostly “going to sleep” or cyclic rebooting. We swapped recorders with spares, returned them to the manufacturer, and reconfigured them over a period of about 18 months, with no significant improvement in reliability. It certainly was a surprise to discover the chart recorders contained a game based on the movie “Hunt for Red October,” and the game could not be deleted from the firmware, only locked-out from the Operators. After “lock-out,” reliability seemed to improve, but faults were never completely eliminated, and Sellafield decided to change the recorders for another make.
Sellafield also has several hundred other smarts in non-SIS and SIS. These include approximately 600 pressure transmitters, over 100 paperless (game free) chart recorders, dozens of temperature transmitters, and about a dozen pH/conductivity transmitters. Our reliability data show they have all worked without significant faults for several years.
Move to IEC61508 compliance
Neither potential users nor manufacturers of IEC61508 equipment had any previous experience with it, and early offerings reflected this lack of experience. A number of routes to IEC61508 compliance seemed available to us.
Conformity Assessment of Safety Systems—the 61508 Association in the U.K.: The association enables an integrated end-to-end assessment of the total SIS, taking into account the contributions made by all organizations in the supply chain. Organizations that conform carry a service mark.
Third party certification: In later years, third parties have participated in the design of some instruments to assure there are no faults possible by design. Typically, they witness-tested some firmware and all hardware for performance and diagnostic capability, injected faults, and carried out an approval test of the final design. Instruments that pass a certification process (for example, by TÜV or Exida) carry some kind of marking. Sellafield had mixed experiences with third-party certification. Certain instruments did indeed possess a certificate indicating the instrument’s suitability for use at, mostly, SIL1 or SIL2. However, the coverage of the certificate was often limited.
For example, an instrument would be in an advertisement showing a “SIL Approved” logo. When the certificate arrived, it would simply say the instrument was “suitable for use in SIL2 systems in line with the Report Number #.” It was then often difficult and time consuming to obtain the report. Examination of reports usually revealed a whole series of caveats like:
“This report details the results from a hardware FMEDA on the instrument. No analysis has been made of firmware.”
“The report presumes that all revealed failures of the instrument (where the output falls below 4mA or rises above 20mA) will be detected by the user.”
“The report presumes that all instrument faults will be fixed within one hour.”
“The FMEDA figures are calculated at Ground Benign conditions.”
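To see why a caveat such as the one-hour repair time matters, consider a commonly used simplified approximation for the average probability of failure on demand of a single channel: the undetected dangerous failure rate times half the proof-test interval, plus the detected dangerous failure rate times the mean time to repair. The sketch below applies it with hypothetical failure rates, which are not taken from any certificate or from Sellafield data.

```python
def pfd_avg_1oo1(lambda_du: float, lambda_dd: float,
                 proof_test_interval_h: float, mttr_h: float) -> float:
    """Simplified average probability of failure on demand, single channel.

    lambda_du: dangerous undetected failure rate (per hour)
    lambda_dd: dangerous detected failure rate (per hour)
    """
    undetected = lambda_du * proof_test_interval_h / 2.0
    detected = lambda_dd * mttr_h
    return undetected + detected


# Hypothetical figures: 90% of dangerous failures detected by diagnostics,
# annual proof test.
LAMBDA_DU = 1e-7
LAMBDA_DD = 9e-7
ONE_YEAR_H = 8760.0

report_assumption = pfd_avg_1oo1(LAMBDA_DU, LAMBDA_DD, ONE_YEAR_H, mttr_h=1.0)
one_week_repair = pfd_avg_1oo1(LAMBDA_DU, LAMBDA_DD, ONE_YEAR_H, mttr_h=168.0)
print(f"{report_assumption:.4e}")
print(f"{one_week_repair:.4e}")
```

With these assumed rates, stretching the repair time from one hour to one week raises the result by roughly a third, so a figure calculated on a one-hour repair does not carry over to a plant where access takes days.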
At risk of being a little tough on the certifiers, the author believes many of the early certificates proved to be little more than entirely theoretical analyses of hardware reliability (from optimistic data sources), assumed to be in perfect ambient conditions (like an air-conditioned laboratory), with little or no account of the integrity of the on-board firmware (too difficult) or any kind of on-plant proven in use data (did not exist). Recently, there has been an improvement in certification quality. A few manufacturers have recently produced some commendable documentation showing good compliance with IEC61508. However, “buyer beware” is appropriate when assessing third-party approvals.
It is essential that, as a competent engineer, you satisfy yourself that the certification and evidence offered by the manufacturer are thorough, accurate, and appropriate for your application.
The Nuclear Industry Smart Instrument Working Group (NISIWG): Realizing there was a lack of joined-up thinking, Sellafield formed a nuclear industry smart instruments working group. Members include Sellafield Ltd, British-Energy, Magnox Generation, United Kingdom Atomic Energy Authority, Urenco, BAE Systems, Atomic Weapons Establishment, Devonport dockyard Management Ltd (Babcock), Rolls-Royce Naval Marine, and GE Healthcare.
NISIWG objectives included a common understanding of smart technology, identification of potential problem areas, sharing of substantiations, spreading of costs, production of standards and procedures based on a common substantiation methodology, and of course, some market advantage with manufacturers.
The manufacturers: At first, most (but not all) manufacturers cooperated enthusiastically, but as the scope of the data NISIWG required became clear, only a few were able to deliver. It is worth reviewing why many manufacturers were unable to help NISIWG, as it is a clear indication of the way the market economy is influencing the whole IEC61508/61511 issue. Some manufacturers’ support evaporated for the following reasons:
A lack of understanding of IEC61508 by the manufacturer.
The U.K. nuclear SIS market is only around 3% of the turnover of a large U.K.-based manufacturer. Typical sales of, say, pressure transmitters were 1.8 million annually, of which just 540 were sold to the whole U.K. nuclear industry.
Average purchase price of a smart instrument is around $1,500 with profit less than 20%.
Manufacturers had already paid considerable funds to third parties to issue certificates, which NISIWG was now questioning. Typical examples included Certificates stating SIL3 compliance, with NISIWG erring toward SIL1. Manufacturers were concerned NISIWG would undermine the existing certificates.
It became obvious other industries were simply not as concerned about this issue as the nuclear industry. Indeed, NISIWG was often the only source of enquiries the manufacturer ever received for copies of the reports and further details on the certification process.
Manufacturers are concerned about the apparent plethora of compliance standards. These include ATEX, M-CERTs, CE, NAMUR, CE-M, UL, CSA, and IEC61508.
Smart instruments: A done deal
Overall, Sellafield’s experience with smarts has been good. It is the nature of articles like this to dwell on problems and underplay benefits. Smarts are a done deal, and we have no option but to use them along with the rest of industry.
The capabilities of manufacturers to produce smarts suitable for SIS are mixed. Sellafield recommends users engage with ‘blue-chip’ manufacturers to provide suitable instruments.
We attempted substantiations on many instruments, with an overall success rate of only about 30%, mostly because of the lack of data and the other reasons discussed here. Safety-driven users need to work together to motivate market-driven manufacturers to meet our needs.
Smarts offer advantages in SIS. They also pose challenges in their substantiation. The application of straightforward assessment techniques may satisfy substantiations.
Where substantiations require complex techniques, the tools, methods, and facilities are still under development.
Above all, KISS and read the small print!
ABOUT THE AUTHOR
Tom S. Nobes (email@example.com) is an ISA member and a 36-year C&I veteran. He is a chartered engineer with the Institute of Measurement & Control and is Process Instruments Capability Leader at Sellafield Ltd. Sellafield Ltd. operates two nuclear sites in the U.K. and specializes in the recycling of spent nuclear fuel and in nuclear plant decommissioning.
IEC 61508 is the international standard “Functional safety of electrical/electronic/programmable electronic safety-related systems (E/E/PES).” It is a functional safety standard applicable to all kinds of industry. It defines functional safety as “part of the overall safety relating to the EUC (Equipment under Control) and the EUC control system, which depends on the correct functioning of the E/E/PE safety-related systems, other technology safety-related systems, and external risk reduction facilities.”
SIS is a safety instrumented system, and it is a form of process control usually implemented in industrial processes, such as those of a factory or an oil refinery. The SIS performs specified functions to achieve or maintain a safe state of the process when unacceptable or dangerous process conditions arise. SISs are separate and independent from regular control systems but maintain similar elements like sensors, logic solvers, actuators, and support systems.