01 October 2003

More is always better when it's critical

TMR is a good old idea that is only more valuable in the microprocessor era.

By Nicholas Sheble

When a child reports back to her parents and describes the appearance of the anteater she just saw in the next cage at the zoo as "really, really, really weird looking. Come see!" does the use of more than one "really" have significance?

Yes is the answer.

Likewise, suppose an operator is preparing for a critical discharge of product, for instance one that has to leave his reactor at between 35 and 40 degrees Centigrade. He checks his product temperature readings, and they say 36.0° from thermowell one, 36.0° from thermowell two, and 36.1° from thermowell three.

Does the operator know the temperature of the product inside that vessel with more reliability and certitude with three readings than he would with a single reading? Is he more confident that he's about to do the right thing?

The answer is yes, again. It's intuitive and, indeed, it's mathematically provable.

This concept of repetition and backup (redundancy) in the name of certainty, and therefore reliability, is not new. Computer jockeys were working on reliability forty years ago as computer systems grew into military and space control applications.


The IBM Journal said in 1962, in a piece addressing severe reliability requirements: It is interesting to specify numerically the desired reliability improvement. Reliability is quantitatively defined as the probability that a system will not fail under specified conditions.

A typical application may require 95% reliability for a period of time roughly equal to the mean time to failure of present systems, say one hundred hours. A rough calculation shows that, without the use of redundancy, this requirement implies a twentyfold improvement in the mean time to failure of all components.

Even if such large improvements in component reliability are achievable in the years ahead, complex digital systems would still not be reliable enough for those applications where maintenance during operation is impractical.

The application of redundancy, together with the improvement of component reliability and the reduction of system complexity, will be necessary to solve the problem.

Flash ahead to year 2003 and one sees that, yes, component reliability and system complexity are such that the near certainty of operations is at hand. Ninety-five percent reliability? Perhaps for a tire swing.

Stratus Technologies has for some time offered 99.999% availability with its redundant processor arrangement. Customers who want 99.9999% availability can pay more for a triple-modular-redundancy system.
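The arithmetic behind those availability figures is simple: each extra nine cuts the permitted unplanned downtime by a factor of ten. A quick Python sketch (the constant and function names are illustrative, not from any vendor's literature):

```python
# Minutes of unplanned downtime per year implied by an availability figure.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability: float) -> float:
    return (1.0 - availability) * MINUTES_PER_YEAR

print(downtime_minutes_per_year(0.99999))   # "five nines": about 5.3 minutes
print(downtime_minutes_per_year(0.999999))  # "six nines": 0.53 minutes, roughly 32 seconds
```

So the step from 99.999% to 99.9999% availability buys the customer less than five minutes a year, which is why it commands a premium.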

But wait, recently Stratus announced its perfect performance program, which guarantees 100% uptime. The company guarantees that if a customer has any unplanned downtime from its operating system or hardware in the first year of the contract, Stratus pays $100,000 in cash or gives product credit to the customer.


To explain triple modular redundancy, it is first necessary to explain the concept of triple redundancy as Johann (John) von Neumann, mathematician and computer pioneer, originally laid it out.

The IBM Journal described the concept with a figure showing three boxes labeled M: identical modules, or black boxes, each containing digital equipment and producing a single output.

A black box may be a complete computer, or it may be a much less complex unit, such as an adder or a gate.

The three Ms' outputs feed to a single circle V, which von Neumann called a majority organ, but which eventually was identified as a voting circuit because it accepts the input from the three sources and delivers the majority opinion as an output.

Triple redundancy as originally envisaged by von Neumann.

Because the outputs of the Ms are binary and the number of inputs is odd, there is bound to be an unambiguous majority opinion.
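That majority opinion is easy to sketch in code. The following Python function is a minimal illustration of a voting circuit for binary outputs, not any vendor's implementation; it returns whichever value at least two of the three module outputs agree on:

```python
def vote(a: int, b: int, c: int) -> int:
    """Majority organ: return the binary value that at least two
    of the three module outputs agree on."""
    return (a & b) | (b & c) | (a & c)

# A single faulty module is outvoted by the other two.
assert vote(1, 1, 1) == 1
assert vote(1, 0, 1) == 1  # module two has failed low
assert vote(0, 0, 1) == 0  # module three has failed high
```

The same pairwise-AND, then OR, structure is what a hardware majority gate computes, one bit at a time.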

The reliability of the redundant system is calculated as a function of the reliability of one module, RM, assuming the voting circuit does not fail. The redundant system will not fail if none of the three modules fails, or if exactly one of the three modules fails.

Assuming the failures of the three modules are independent, and because the two events are mutually exclusive, the reliability R of the redundant system is the sum of the probabilities of these two events.

R_system = R_M^3 + 3 R_M^2 (1 - R_M) = 3 R_M^2 - 2 R_M^3
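The formula is easy to check numerically. This sketch (illustrative, not from the IBM Journal piece) shows that triple redundancy improves on a single module when the module is already fairly reliable, and actually makes things worse below the crossover at R_M = 0.5:

```python
def tmr_reliability(r_module: float) -> float:
    """Reliability of triple redundancy with a perfect voter:
    all three modules survive, or exactly one of the three fails."""
    return r_module**3 + 3 * r_module**2 * (1 - r_module)

assert abs(tmr_reliability(0.9) - 0.972) < 1e-9   # better than 0.9 alone
assert tmr_reliability(0.5) == 0.5                # the crossover point
assert tmr_reliability(0.4) < 0.4                 # worse than one module
```

Intuitively, when modules are unreliable, adding more of them just adds more ways for two to fail at once.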

The triple-modular-redundant (TMR) configuration is considerably different from the triple redundancy outline because it employs three identical voting circuits instead of one voting circuit.

Triple-modular-redundant configuration.

Assuming that the voting circuits (V) do not fail, the two configurations have identical reliability. When accounting for the unreliability of the voting circuits, we see that the voting circuits themselves are redundant in the TMR configuration.

Hence, single voting-circuit (V) failure will not necessarily cause system failure either.
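One common first-order way to account for imperfect voters in the TMR configuration (an assumption on my part; the article does not spell this model out) is to treat each module together with its own voter as a series pair, then take the majority of the three pairs:

```python
def tmr_with_voters(r_module: float, r_voter: float) -> float:
    """First-order TMR model: each module feeds its own voter, so each
    module-voter pair is a series unit of reliability r_module * r_voter;
    the system survives if at least two of the three units survive."""
    r_unit = r_module * r_voter
    return 3 * r_unit**2 - 2 * r_unit**3

# With perfect voters this reduces to the triple-redundancy expression.
assert abs(tmr_with_voters(0.9, 1.0) - 0.972) < 1e-9
```

Under this model the voters no longer have to be perfect for the scheme to pay off; a single voter failure is outvoted just like a single module failure.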

TMR for process industry use in 2003 still rests on the number machinations of von Neumann and others from the 1950s and 1960s. A TMR vendor's primary competitive advantage is its engineering services; secondary are its system hardware and software offerings, which may differ in speed and presentation.

The capable dealers of turnkey TMR installations are necessarily big name companies because the projects themselves are often grand and prominent. ABB, ICS Triplex, Invensys-Triconex, and Toshiba show up often on the radar screen.

A typical application is this ICS Triplex installation: Bord Gais supplies and distributes natural gas in Ireland. The company decided to construct a second pipeline linking Beattock in southwest Scotland to Ballough, north of Dublin in Ireland.

The project includes three aboveground infrastructure developments at Beattock and Brighouse Bay in Scotland and at Gormanston in Dublin. It also includes a subsea, high-pressure pipeline.

The main component, the subsea pipeline, is 195 kilometers in length and 30 inches in diameter. It operates at a pressure of approximately 150 bar.

Gormanston AGI is the receiving terminal for the pipeline. It ensures that the pressure of the gas, which travels at 150 bar under and across the Irish Sea, is reduced to 85 bar, the maximum pressure permitted on land.

On arrival at the station, the gas is heated to counteract the cooling that results from this drop in pressure. The gas is then metered, reduced in pressure, and moved into the distribution grid.

The dangerously high pressures and volatility, together with the large range of flow and the high energy involved, have the potential for immense damage should process control fail.

It is imperative that the process never shut down.

For the station to function safely, the process control system needed to control the three flow streams used for letting the pressure down as well as provide an emergency shutdown system (ESD).

The ICS Triplex answer was an integrated station control and ESD within a single TMR safety system. This would include a TMR processor and a programmable logic controller (PLC) capable of swapping modules without interrupting the process.

Integrated station control

The system also included a valve test module for controlling the pulse-regulated, electrically actuated control valves.

There was a dual communication link with Bord Gais in Cork providing the ability to monitor and control the station remotely.

It's a massive project with massive needs and rewards. The potential for catastrophic failure is easy to see, and a single source of weakness is intolerable. This is the venue of triple modular redundancy in oil, gas, energy, and chemicals. In most cases, it may well be the law.

The company concludes that while distributed control systems will continue to dominate process control and PLC-supervisory control and data acquisition solutions offer an alternative approach, they both require a compromise in certain applications.

Where the control demands speed, flexibility, and high availability, using a safety system solution is the answer. Many safety systems aren't up to the task. Choose carefully. IT


Johann Louis von Neumann was a child prodigy in Budapest, Hungary. When only six years old he could divide eight-digit numbers in his head.

Entering the University of Budapest in 1921, he studied chemistry, moving his base of studies to both Berlin and Zurich before receiving his diploma in 1925 in chemical engineering. He received a doctoral degree in mathematics in 1928.

At a time of political unrest in central Europe, he visited Princeton University in 1930, and when scholars founded the Institute for Advanced Study there in 1933, he became one of the original six professors of mathematics, a position that he retained for the remainder of his life.

At the instigation, and with the sponsorship, of Oskar Morgenstern, von Neumann became a U.S. citizen in time for his clearance for wartime work.

Von Neumann's interest in computers differed from that of his peers by his quick perception of the application of computers to applied mathematics for specific problems, rather than their mere application to the development of tables.

During the war, von Neumann's expertise in hydrodynamics, ballistics, meteorology, game theory, and statistics contributed to several projects.

By the latter years of World War II, von Neumann was playing the part of an executive management consultant, serving on several national committees, applying his amazing ability to rapidly see through problems to their solutions.

Through this role, he was also a conduit between groups of scientists otherwise kept unknown to each other by the requirements of secrecy. He brought together the needs of the Los Alamos National Laboratory, home of the Manhattan Project, with the capabilities of the engineers building supercomputers.

Postwar, von Neumann concentrated on the development of the Institute for Advanced Study computer and its copies around the world. His work with the Los Alamos group continued, and he continued to develop the synergism between computers' capabilities and the need for computational solutions to nuclear problems related to the hydrogen bomb.

His insights into the organization of machines led to the infrastructure called the von Neumann architecture. However, von Neumann's ideas were not along those lines originally; he recognized the need for parallelism in computers but equally recognized the problems of construction, and hence settled for a sequential system of implementation.

In the 1950s von Neumann worked as a consultant for IBM to review proposed and ongoing advanced technology projects.

The Institute of Electrical and Electronics Engineers (IEEE) continues to honor Johann (John) von Neumann through the presentation of an annual award in his name. The institute presents the IEEE John von Neumann Medal annually "for outstanding achievements in computer-related science and technology."

Source: Virginia Tech, ei.cs.vt.edu

Johann (John) von Neumann