September/October 2010

Web Exclusive

Reducing process variability

A major benefit of automation

By Joseph S. Alford

Annual increases in corporate IT budgets and the placement of a PC on every employee's desk are commonly accepted practices in today's company operations. A related computing need, sometimes harder to justify economically, is a similar increase in plant automation. As a result, some plant automation groups struggle to obtain approval of major expenditures in process control; some plants are even experiencing reductions in the size of their automation support groups.

Working toward "optimal" process and plant control is often mentioned as a justification for increased automation. Progress toward an optimal plant is often realized as the result of an increased level of automation, although the vision itself is hard to quantify and its targeted "utopic" endpoint is challenging to achieve. Another justification for automation is a reduction in operations staff. Some automation projects achieve this. Dangers can exist, however, particularly when trying to automate a poorly understood process and/or when presenting large increases in automatically obtained real-time information (e.g., from smart sensors and valves) to the remaining operators, which can increase their burden and stress.

This article discusses the value of plant automation from a different perspective: that of reducing plant operation and product variability. Variability is the source of significant inefficiency in many plant operations.

Variability in society

To help understand the consequences of variability, consider some of the variability that occurs in today's society and the level of frustration and inefficiency that results.

The stock market has risen at an annual average rate of about 10% in recent decades. However, the high variability in stock market indices, including the "tech wreck" in 2000 and the most recent recession, has wreaked havoc on many stock portfolios, caused much lost sleep, increased blood pressure for individuals, driven many investors from the stock market, created equity problems for many companies, prevented some pension funds from delivering promised benefits, and driven some companies out of business. Other than day traders, wouldn't most stockholders prefer to see a consistent annual increase in their portfolios (including stock portions of 401(k)s)?

How about the hardships to the national economy from major growth swings? High growth usually causes high inflation, and low growth can lead to recessions and high unemployment. Is it clear how unwelcome the current recession "disturbance" is to government, with reduced revenues, need to cut staffs, and high welfare and jobless benefit expenditures?

How about the increasing "queue" of major national economic "disturbances" evolving, due to inaction on issues as they come up? Certainly the severity of consequences to future generations will be greatly magnified as a result of continuing inadequate political action regarding Social Security, Medicare, energy, international trade, and the federal budget deficit. It is clear to most citizens that ignoring warning signs and waiting for actual crises to occur is not the best way to manage a society.

Variability in corporate practices

Consider the variability that occurs in today's corporate operations, given the short-term "net profit" focus of many companies. While a short-term focus would seem to favorably influence some variability (e.g., profitability), it can also result in significant internal turmoil, fluctuations, and inefficiencies. Examples include surges in hiring or layoffs, gaps in operator experience levels, slowdowns in R&D programs (affecting long-term revenue potential), project cancellations (and associated contract cancellation fees), and changes to project timelines. Clearly, more resources, time, and energy are needed to deal with the variability in today's workplace than would be needed under a more consistent, steady-growth paradigm.

Consider also the upheaval that occurs in some companies when a patent expires on a key product and a new product is not available to replace it. Clearly, a company with an R&D component values having a new product pipeline that turns out new products on a consistent basis. In fact, the current downsizing in some pharmaceutical companies is due, in part, to gaps in their new product pipelines.

Variability in a manufacturing plant

Variability exists in manufacturing plants for many different reasons, including: variability of incoming sales orders, variability in the time to complete individual tasks/activities, poor plant design (e.g., sequencing of unit operations that are too tightly coupled), inadequate process control, equipment failures, and operator errors.

Regarding manual operations, variability can occur due to:

  • Insufficient staffing during nights and weekends to deal with abnormal situations.
  • Operators not performing a task the same way as other operators, despite the existence of a documented SOP. This can be the result of variability in training, differences in experience, fatigue, or the current workload.
  • Not performing time sensitive tasks on time.

Some operator errors can lead to the need to recycle or rework some of the product, or even the loss of a batch. Some errors, especially for batch processes, can result in queues or idle equipment in other parts of the plant, and, therefore, an increase in the average batch cycle time.

For example, an error in manually cleaning, sterilizing, or operating a bioreactor (i.e., fermentor) can cause the associated fermentation to become contaminated, requiring it to be dumped and discarded. This may idle downstream product recovery and purification operations. A new fermentation usually cannot be immediately started, due to personnel and equipment scheduling constraints and also because the fermentation step in a bioprocess comes after previous cell growth steps requiring many days or weeks to execute. That is, one just does not immediately start another fermentation, but, rather, must start with a sequence of various cell growth steps that produce the inoculum for the fermentor. This example is one reason why bioprocesses have been evolving to more automated clean in place and sterilization operations (i.e., to reduce the probability of a contamination incident).

Level of automation

Variability in manufacturing operations can also occur by virtue of the level and quality of process control that exists.

For example, the rate of some manufacturing operations is highly influenced by temperature (e.g., chemical reactions, growth of microorganisms). Variability in temperature can, therefore, significantly affect the time required to complete a process step. This can then cause other plant problems, especially if the step is part of a tightly coupled sequence of unit operations or if scheduling of operators is dependent on unit operations adhering to a fixed cycle time. As one consequence of variability in process step duration, queues in the flow of materials between steps in a manufacturing process can develop.

In some cases, temperature variability can cause product quality problems. When several competing parallel chemical reactions are occurring, temperature drifts may result in changes in the concentration of reaction products, including an increase in undesired compounds. If temperature drifts too high, some products (e.g., some pharmaceuticals) may degrade or become unstable, such that the product cannot meet advertised stability and shelf life requirements.

Factory physics relationships

Factory Physics is defined as a systematic description of the underlying behavior of manufacturing systems. In helping to quantify the effects of process variability, two Factory Physics equations are often useful:

Kingman's Equation

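$$\text{Queue Time} = \left[\frac{\mathrm{COV}_{\text{incoming work}}^{2} + \mathrm{COV}_{\text{process work time}}^{2}}{2}\right] \times \left[\frac{U}{1-U}\right] \times \text{Mean Task Time}$$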

where U = fractional resource effective utilization and COV is the coefficient of variation = std. dev./mean

In Kingman's Equation, variability in the time to complete a unit operation (i.e., the COV of process work time), as well as variability in the timing of incoming work orders, has a direct impact on queue time, with the impact increasing dramatically as the fractional utilization of resources (personnel and/or equipment) approaches 1.0. Queue time, in turn, influences overall process cycle time. Note: Cycle time is the summation of all unit operation task times in a sequence + turnaround time + all queues.
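As a rough illustration of how sharply the utilization term grows, the following minimal Python sketch evaluates Kingman's Equation as a ratio of queue time to mean task time. The function name `kingman_queue_ratio` and the chosen utilization values are illustrative assumptions, not part of the original analysis.

```python
def kingman_queue_ratio(cov_incoming, cov_process, utilization):
    """Return mean queue time divided by mean task time (Kingman's Equation)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    # Variability term: average of the two squared coefficients of variation
    variability = (cov_incoming ** 2 + cov_process ** 2) / 2.0
    # Utilization term: grows without bound as utilization approaches 1.0
    return variability * utilization / (1.0 - utilization)


# Tabulate the ratio for a plant with COV = 0.3 on both incoming work and task time
for u in (0.5, 0.7, 0.8, 0.9, 0.95, 0.99):
    ratio = kingman_queue_ratio(0.3, 0.3, u)
    print(f"U = {u:4.2f}  queue time / task time = {ratio:6.2f}")
```

Multiplying the returned ratio by the mean task time gives the queue time itself.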

Cycle time, in turn, influences plant throughput, as shown by Little's Law: Throughput = WIP / Cycle Time. (WIP stands for work-in-process/progress.)

Therefore, one way of maximizing plant throughput, given a fixed number of resources, is to minimize the variability in the timing of incoming work (e.g., work orders) and also minimize the variability in the time it takes to perform individual manufacturing operations. Again, the importance of this conclusion increases dramatically as resources become more fully utilized, as shown in the Kingman's Equation figure.


Example Calculation: If a plant has 30% variability (i.e., COV = 0.3) with respect to both incoming work orders and manufacturing operation task times and has 90% effective utilization (i.e., U = 0.9) of its resources, then the ratio of average queue time to manufacturing task time is 0.81. Reducing plant variability to 15% will reduce this ratio to about 0.2, which represents a 75% reduction in queue time and a 33% reduction in manufacturing cycle time (queue time + manufacturing task time). Using Little's Law with WIP held constant, this can translate into a 50% increase in plant throughput.
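The arithmetic in the example calculation can be checked with a short Python sketch. It assumes, as the example does, that cycle time is just queue time plus task time (turnaround time is neglected) and that WIP is held constant when applying Little's Law. The function names are illustrative.

```python
def queue_ratio(cov, utilization):
    # Kingman's Equation with the same COV for incoming work and task time,
    # expressed as queue time divided by mean task time
    return ((cov ** 2 + cov ** 2) / 2.0) * utilization / (1.0 - utilization)


def compare(cov_before, cov_after, utilization):
    q_before = queue_ratio(cov_before, utilization)
    q_after = queue_ratio(cov_after, utilization)
    # Cycle time in units of mean task time: task time (1.0) plus queue time
    ct_before = 1.0 + q_before
    ct_after = 1.0 + q_after
    return {
        "queue_before": q_before,
        "queue_after": q_after,
        "queue_reduction": 1.0 - q_after / q_before,
        "cycle_reduction": 1.0 - ct_after / ct_before,
        # Little's Law at constant WIP: throughput is proportional to 1 / cycle time
        "throughput_gain": ct_before / ct_after - 1.0,
    }


result = compare(cov_before=0.30, cov_after=0.15, utilization=0.90)
for name, value in result.items():
    print(f"{name:16s} {value:6.3f}")
```

The printed fractions round to the 75%, 33%, and 50% figures quoted in the example.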

Paradigms involving variability reduction

A number of paradigms have evolved in recent decades that focus on improving quality and reducing variability in manufacturing plant operations. They include:

  • Just in Time (JIT), which deals with the negative aspects of high and/or variable material inventories
  • Total Quality Management (TQM), by W. Edwards Deming, which focuses, in part, on the reduction of process variability via, e.g., reducing errors during manufacturing and streamlining supply chain management
  • Six Sigma / Lean Manufacturing, which is a set of principles focused on reducing process variability via, e.g., reducing waste in manufacturing, eliminating defects, and reducing inventory

How can automation help?

The use of automation is a common strategic tool in pursuing the JIT, TQM, and Six Sigma objectives. Automation can play a key role in performing manufacturing operations consistently, thereby minimizing the variability in individual task times, whether those tasks are direct manufacturing operations, support operations, product inspections, or the management of abnormal situations. With automation:

  • Manufacturing operation task times are less dependent on the skill level, experience, and availability of operators performing manual operations.
  • Fewer process deviations will occur, since process control systems offer tighter control than is typically possible with manual control.
  • Fewer process deviations will also result from an automation system's ability to continuously monitor a plant and generate alarms to alert operators to impending abnormal situations that require their attention.
  • Some process stop and hold times (of variable duration) can be eliminated. Historically, these holds occurred while manual samples were collected and sent off-line to a laboratory for analytical assays. Such laboratory assays were needed to determine whether a batch lot could be, e.g., forward processed to the next manufacturing step, approved for marketing, or rejected. With automation, as more information is available in real time, more decisions can be made automatically and on-line. This is a primary objective of the United States Food and Drug Administration's initiative known as "Process Analytical Technology." Reduction in process stop and hold points will not only reduce overall process variability, but will also reduce average manufacturing process cycle times and, thus, enable increases in plant throughput without increasing plant capital costs.
  • Fast and accurate inspection of individual vials, cartridges, bottles, widgets being manufactured, items being packaged, and labeling can be performed, with automatic rejection of defective items. Such systems can also alert appropriate personnel when the rejection rate exceeds a predetermined threshold. These capabilities help improve consistency and reduce variability of final products.
  • Alerts, information, and response recommendations can be made available in real time to operators regarding existing and pending abnormal situations. As is noted in leading alarm guidance documents, an effective alarm 1) alerts, 2) informs, and 3) guides. The effectiveness of an automation system in performing these three tasks can go a long way in reducing the likelihood and/or severity of abnormal situations and, therefore, variability in the process and/or quality of product.
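The rejection-rate alert mentioned in the inspection item above could be sketched as follows. This is a minimal illustration only: the class name `RejectRateMonitor`, the rolling-window approach, and the window size and threshold values are all assumptions, not a description of any particular inspection system.

```python
from collections import deque


class RejectRateMonitor:
    """Track the rejection rate over the most recent inspected items."""

    def __init__(self, window_size=100, threshold=0.05):
        # The deque discards the oldest result once the window is full
        self.results = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, rejected):
        """Record one inspection result; return True if personnel should be alerted."""
        self.results.append(bool(rejected))
        if len(self.results) < self.results.maxlen:
            return False  # wait for a full window before judging the rate
        rate = sum(self.results) / len(self.results)
        return rate > self.threshold


# Hypothetical run: ten good items followed by three rejects
monitor = RejectRateMonitor(window_size=10, threshold=0.2)
for item_rejected in [False] * 10 + [True] * 3:
    if monitor.record(item_rejected):
        print("Rejection rate above threshold: alert appropriate personnel")
```

In this run the alert fires on the third reject, when 3 of the last 10 items have been rejected.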

A case study

In the mid-1970s, a well-known pharmaceutical company began automating its bioreactors with a basic process control system and historian, realizing several yield improvement and variability reduction benefits. However, significant process variability remained due to the natural variability of biological processes, combined with the fact that the expertise to deal with many bioprocess abnormal situations lay with scientists and technical service personnel who were not normally stationed in the control room or on the plant floor. Further, the plants were plagued with a phenomenon common to many automated industrial plants using basic control systems: a high frequency of nuisance alarms.

Therefore, in about 1990, the company began investigating the potential value of interfacing third-party real-time expert systems to its bioprocess automation systems.

The objectives in considering expert systems were to:

  • Capture the key manufacturing process expertise of their experienced operators, engineers, and technical service personnel, and then make that expertise directly available to the automation system, 24/7, for process diagnostics, alarming, and data analysis purposes.
  • Generate "smarter alarms," utilizing any and all appropriate information available within the process control and historian systems, usually configured into heuristic "if-then-else" rules. The objective was to create more informative alarm messages (often including suggested root causes) and reduce the incidence of nuisance alarms. Note: One of the values of an "expert system," which is sometimes nicknamed a "rule-based system," is its high efficiency at managing large numbers of "if-then-else" rules, including interlinked rules.
  • Perform on-line diagnostics, thus providing operators with possible root cause and other relevant information regarding abnormal situations.
  • Provide recommended response actions to operators regarding abnormal situations.
  • Generate pages regarding abnormal situations to appropriate personnel, including scientists, technicians, technical support, and utility support personnel. Thus, a person did not have to be in a control room or on the plant floor to receive an alarm. They could even be at home or on the golf course.
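The "if-then-else" style of smart alarm described in these objectives can be sketched in a few lines of Python. Everything specific here is hypothetical and for illustration only: the tag names (`dissolved_oxygen_pct`, `agitator_rpm`, `air_flow_slpm`), the limits, the suspected causes, and the recommended actions are invented, and a real expert system shell would manage far more rules than this.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Snapshot = Dict[str, float]  # current process values keyed by (hypothetical) tag name


@dataclass
class Rule:
    name: str
    condition: Callable[[Snapshot], bool]
    message: str  # informs: what is wrong and the suspected root cause
    action: str   # guides: recommended operator response


def low_do(s: Snapshot) -> bool:
    return s["dissolved_oxygen_pct"] < 20.0


# Hypothetical rules and limits, for illustration only
RULES: List[Rule] = [
    Rule("low_do_agitator_stopped",
         lambda s: low_do(s) and s["agitator_rpm"] < 10.0,
         "Low dissolved oxygen; suspected cause: agitator stopped",
         "Check agitator drive and restart if safe"),
    Rule("low_do_air_supply",
         lambda s: low_do(s) and s["agitator_rpm"] >= 10.0 and s["air_flow_slpm"] < 5.0,
         "Low dissolved oxygen; suspected cause: loss of air flow",
         "Check air supply valve and header pressure"),
    Rule("low_do_unexplained",
         lambda s: low_do(s) and s["agitator_rpm"] >= 10.0 and s["air_flow_slpm"] >= 5.0,
         "Low dissolved oxygen with no equipment cause found",
         "Page technical service for review"),
]


def evaluate(snapshot: Snapshot) -> List[Rule]:
    """Return every rule whose condition holds for the current snapshot."""
    return [rule for rule in RULES if rule.condition(snapshot)]


snapshot = {"dissolved_oxygen_pct": 12.0, "agitator_rpm": 0.0, "air_flow_slpm": 40.0}
for rule in evaluate(snapshot):
    print(f"ALARM {rule.name}: {rule.message} -> {rule.action}")
```

Because the rules combine several measurements, the operator sees one alarm with a suggested cause and response rather than separate low-oxygen and low-speed parameter alarms.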

The real-time expert system functionality was replicated and installed in various bioprocess manufacturing plants during the 1990s. Individual systems contained some application differences due to the different products being manufactured. Data was collected on the system's benefits to the plants.

In one plant (making penicillin), the system was credited as a major contributor to a 10% reduction in overall process variability and a 4% yield increase. These improvements resulted from the system's ability to identify poorly performing bioreactors much earlier during the batch process than was previously achieved, and then to immediately page appropriate personnel. This enabled operators and technical service to make appropriate on-line adjustments to improve the ongoing fermentation and, e.g., save those fermentations that otherwise would have been dumped. That is, the system had no effect on good fermentations, but helped improve the performance of poor fermentations.

One of the many benefits from the 10% variability reduction was the reduced number of plant trial fermentation runs required to evaluate suggested plant improvements. That is, plants often conduct trials to evaluate ideas for improvement. If plant COV is high, then a large number of batch runs are required, consuming perhaps months of time, in order to statistically determine (with 95% confidence) whether a change being tested is a real improvement. As plant COV decreases, the number of batch lots needed to test an idea also decreases.
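The link between plant COV and trial length can be illustrated with a standard textbook approximation for comparing two means. The sketch below is not the method used at the plant: it assumes normally distributed batch results, equal numbers of control and trial batches, a two-sided test at 95% confidence, and 80% power (the power level is an added assumption). The function name and the example values are illustrative.

```python
import math
from statistics import NormalDist


def batches_per_arm(cov, relative_improvement, confidence=0.95, power=0.80):
    """Approximate batches needed in each of the control and trial groups.

    Uses n = 2 * (z_alpha + z_beta)^2 * (COV / delta)^2, where delta is the
    improvement to be detected, expressed as a fraction of the mean.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    n = 2.0 * (z_alpha + z_beta) ** 2 * (cov / relative_improvement) ** 2
    return math.ceil(n)


# Detecting a 5% improvement at two hypothetical levels of plant variability
for cov in (0.10, 0.05):
    print(f"COV = {cov:.2f}: about {batches_per_arm(cov, 0.05)} batches per group")
```

Because the required number of batches scales with the square of COV, halving the variability cuts the trial length to roughly one quarter under these assumptions.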

Another benefit achieved was a more than 50% reduction in total alarms generated, primarily through the elimination of most nuisance alarms and by generating fewer, more informative alarms (vs. many less descriptive individual process parameter alarms).

Another interesting benefit in using these systems became apparent when the company pursued a major early retirement program in the 1990s. Many manufacturing plants lost key operators, supervisors, and technical service support personnel. The only plant that did not incur a significant productivity loss as a result of the early retirement program, among 10 plants reporting results, was the one that had implemented an expert system that had captured much of the process expertise of their key employees and had made that expertise available, 24/7.

Other plant variability reduction perspectives

Statistical control, CUSUM charts: In the same way that reduced COV reduces the necessary number of batch runs in a plant trial, reduced COV also reduces the number of batch runs necessary to deduce that a fundamental change has taken place in a production plant, as might be observed in a Statistical Control or CUSUM chart. For example, unexpected minor to moderate changes to a plant's performance are hard to quickly detect when the plant's output is highly variable.
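A minimal tabular CUSUM sketch shows why a low-variability plant reveals a shift sooner. The reference value k = 0.5 sigma and decision interval h = 5 sigma are common textbook choices assumed here; the batch values are invented and noise-free so that the effect of sigma on the chart limits is easy to see.

```python
def cusum_first_signal(values, target, sigma, k_factor=0.5, h_factor=5.0):
    """Return the index of the first CUSUM signal, or None if none occurs."""
    k = k_factor * sigma  # slack: small deviations below k are ignored
    h = h_factor * sigma  # decision interval: signal when a sum exceeds h
    upper = lower = 0.0
    for i, x in enumerate(values):
        upper = max(0.0, upper + (x - target - k))  # accumulates upward drift
        lower = max(0.0, lower + (target - k - x))  # accumulates downward drift
        if upper > h or lower > h:
            return i
    return None


# Hypothetical batch results: the mean shifts from 100 to 102 at batch index 10
batches = [100.0] * 10 + [102.0] * 20

print(cusum_first_signal(batches, target=100.0, sigma=1.0))  # low-variability plant
print(cusum_first_signal(batches, target=100.0, sigma=4.0))  # high-variability plant
```

With sigma = 1 the two-unit shift is flagged four batches after it occurs; with sigma = 4 the same shift is smaller than the slack k and is not flagged within these 30 batches.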

Use of feedforward: Automation systems have available a profound capability known as feedforward control. This form of control measures "disturbances" to a process and then adds (or subtracts) some corrective action to the standard feedback control loop so the disturbance has minimal impact (possibly none) on the variable being controlled. Utilizing such a feature can significantly reduce process variability.
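The idea can be sketched with a small simulation of a first-order process under PI feedback control, with and without a feedforward correction. All of it is an idealized assumption for illustration: the process model, the tuning values, and especially the perfect disturbance measurement and model, which real feedforward designs only approximate.

```python
def simulate(use_feedforward, steps=400, dt=0.1):
    """Return the largest deviation from setpoint after a step disturbance."""
    tau = 5.0          # process time constant (hypothetical)
    kp, ki = 2.0, 0.5  # PI feedback tuning (hypothetical)
    setpoint = 0.0     # deviation variables: 0 means "on target"
    y = 0.0            # controlled variable
    integral = 0.0
    max_deviation = 0.0
    for step in range(steps):
        disturbance = 1.0 if step >= 50 else 0.0  # measured step disturbance
        error = setpoint - y
        integral += error * dt
        u = kp * error + ki * integral            # standard feedback action
        if use_feedforward:
            u -= disturbance                      # corrective action from the measurement
        # First-order process: tau * dy/dt = -y + u + disturbance (Euler step)
        y += dt * (-y + u + disturbance) / tau
        max_deviation = max(max_deviation, abs(y - setpoint))
    return max_deviation


print(f"feedback only:         {simulate(False):.3f}")
print(f"feedback + feedforward: {simulate(True):.3f}")
```

With feedback alone, the controller can only react after the disturbance has already moved the controlled variable; with the feedforward term, the disturbance is cancelled before it has any visible effect in this idealized case.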

Keeping the automation system as simple as practical: Most control loops in existing industrial plants utilize PID controllers (which are linear) and are tuned assuming a process is at least pseudo-linear in its targeted operating zone. Major perturbations to the plant (i.e., major incidents of variability) can violate some of these assumptions. Therefore, a highly variable plant with frequent major disturbances can drive the need for more sophisticated control (e.g., adaptive tuning, heuristic rules, customization with third-party products), adding to the design and support costs of the system. So, while automation can certainly help reduce the frequency and severity of perturbations to a plant, there is value in avoiding major perturbations from all sources, such that process control linearization assumptions can be utilized and the need for complex automation algorithms and system customization can be minimized.

Caveats to consider

Bill Gates, co-founder of Microsoft, has commented that "automation applied to an efficient operation will magnify the efficiency." A corollary is that "automation applied to an inefficient operation will magnify the inefficiency." This suggests automation should not be viewed as a "cure" for a poorly designed and inadequately understood process. Only after a process is well designed and understood can automation pay big dividends. Said another way, automation is best applied to repetitive operations that are well understood and for which relevant on-line (or near real-time) measurements exist.

Remember, "If you can't measure it, you can't manage it."

Another caveat is the natural inclination of engineers to configure alarms on much of the information available to automation systems from smart sensors, valves, and other sources. For some plants, thousands of alarms have been configured, many of which do not represent an actual plant abnormal situation requiring an operator response. This has contributed to information overload for operators, particularly during plant upsets. Sometimes, this has resulted in operators missing real abnormal event alerts, which can then increase plant variability. In other cases, the lack of linked on-line information regarding abnormal situation cause(s) and recommended operator response has significantly delayed alarm response time, again contributing to increased process variability. The importance of a well-designed HMI and alarm system cannot be overemphasized.

Variability may add spice to life, but typically has negative consequences when applied to the management of societies, companies, and manufacturing operations. Applying automation to well-understood processes can result in significant variability reduction and resulting cycle time, WIP, and/or throughput benefits.


Joseph S. Alford, Ph.D., P.E., CAP, is an automation consultant. Acknowledgement: Thanks to Dr. P. Kokitkar for generating the Kingman's Equation shown in this article.