The following technical discussion is part of an occasional series showcasing the ISA Mentor Program, authored by Greg McMillan, industry consultant, author of numerous process control books, 2010 ISA Life Achievement Award recipient and retired Senior Fellow from Solutia Inc. (now Eastman Chemical). Greg will be posting questions and responses from the ISA Mentor Program, with contributions from program participants.
By Gregory K. McMillan, CAP
In the ISA Mentor Program, I am providing guidance for extremely talented individuals from countries such as Argentina, Brazil, Malaysia, Mexico, Saudi Arabia, and the USA. These questions come from Flavio Briguente and Syed Misbahuddin.
Model predictive control (MPC) has a proven history of providing extensive multivariable control and optimization. The applications in refineries are extensive, forcing the PID in most cases to take a backseat. These processes tend to use very large MPC matrices and extensive optimization by linear programs (LP). The models are linear and may be switched for different product mixtures. The plants tend to have more constant production rates and greater linearity than seen in specialty chemical and biological processes.
MPC is also widely used in petrochemical plants. The applications in other parts of the process industry are increasing but tend to use much smaller MPC matrices focused on a unit operation. MPC offers dynamic decoupling, disturbance and constraint control. To do the same in PID requires dynamic compensation of decoupling and feedforward signals and override control. The software to accomplish dynamic compensation for the PID is not explained or widely used. Also, interactions and override control involving more than two process variables are more challenging than practitioners can address. MPC is easier to tune and has an integrated LP for optimization.
Flavio Briguente is an advanced process control consultant at Evonik in North America, and is one of the original protégés of the ISA Mentor Program. Flavio has expertise in model predictive control and advanced PID control. He has worked at Rohm and Haas Company and Monsanto Company. At Monsanto, he was appointed to the manufacturing technologist program, and served as the process control lead at the Sao Jose dos Campos plant in Brazil and a technical reference for the company's South American sites. During his career, Flavio focused on different manufacturing processes, and made major contributions in optimization, advanced control strategies, Six Sigma and capital projects. He earned a chemical engineering degree from the University of São Paulo, a post-graduate degree in environmental engineering from FAAP, a master's degree in automation and robotics from the University of Taubate, and a PhD in material and manufacturing processes from the Aeronautics Institute of Technology.
Syed Misbahuddin is an advanced process control engineer for a major specialty chemicals company with experience in model predictive control and advanced PID control. Before joining industry, he received a master's degree in chemical engineering with a focus on neural network-based controls. Additionally, he is trained as a Six Sigma Black Belt, which focuses on utilizing statistical process controls for variability reduction. This combination helps him implement controls utilizing physics-based as well as data-driven methods.
The considerable experience and knowledge of Flavio and Syed blur the line between protégé and resource, leading to exceptionally technical and insightful questions and answers.
Flavio Briguente's Questions
Can the existing MPC/APC techniques be applied to batch operation? Is there a nonlinear MPC application available? Is there a known case in operation in the chemical industry? What are the pros and cons of linear versus nonlinear MPC?
Mark Darby's Answers
MPC was originally developed for continuous or semi-continuous processes. It is based on a receding horizon where the prediction and control horizons are fixed and shifted forward at each execution of the controller. Most MPCs include an optimizer that optimizes the steady state at the end of the horizon, which the dynamic part of the MPC steers towards.
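To make the receding horizon idea concrete, here is a minimal Python sketch. It is not a commercial MPC algorithm. It assumes a hypothetical first-order discrete model `y[k+1] = a*y[k] + b*u[k]`, a single move held constant over the prediction horizon, and simple clipping in place of true constraint handling. The names `mpc_move` and `simulate` and all numeric values are illustrative assumptions. At each execution the controller predicts over a fixed horizon, applies only the first move, and repeats from the new measurement.

```python
def mpc_move(y, setpoint, a, b, horizon, u_min, u_max):
    """One controller execution: least-squares move held over the horizon."""
    num = 0.0
    den = 0.0
    free = y       # predicted output if u were zero (free response)
    forced = 0.0   # predicted output per unit of u (forced response)
    for _ in range(horizon):
        free = a * free
        forced = a * forced + b
        num += forced * (setpoint - free)
        den += forced * forced
    u = num / den
    # Clipping is a crude stand-in for real constraint handling.
    return min(max(u, u_min), u_max)


def simulate(steps=60, a=0.9, b=0.1, setpoint=1.0):
    """Closed loop with a perfect model: the horizon shifts every step."""
    y = 0.0
    history = []
    for _ in range(steps):
        u = mpc_move(y, setpoint, a, b, horizon=10, u_min=0.0, u_max=2.0)
        y = a * y + b * u   # plant assumed identical to the model
        history.append((u, y))
    return history


history = simulate()
```

An industrial package would instead solve for many future moves across many variables with a quadratic or linear program, but the predict, apply the first move, and shift pattern is the same.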
Batch processes are by definition non-steady-state. They typically have an end-point condition that must be met at batch end and usually have a trajectory over time that the controlled variables (CVs) should follow. As a result, the standard MPC algorithm is not appropriate for batch processes and must be modified (note: there may be exceptions to this based on the application). I am aware of MPC batch products available in the market, but I have no experience with them. Due to the nonlinear nature of batch processes, especially those involving exothermic reactions, a nonlinear MPC may be necessary.
By far, the majority of MPCs applied industrially utilize a linear model. Many of the commercial linear packages include provisions for managing nonlinearities, such as using linearizing transformations or changing the gain, dynamics, or the models themselves. A typical approach is to apply a nonlinear static transformation to a manipulated variable or a controlled variable, commonly called Hammerstein and Wiener transformations. An example is characterizing the valve-flow relationship or controlling the logarithm of a distillation composition. Transformations are performed before or after the MPC engine (optimization) so that a linear optimization problem is retained.
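As a small illustration of an output (Wiener-type) transformation, the Python sketch below converts a distillation impurity to its logarithm before it would be handed to a linear controller, and back again for display. The function names and the `floor` guard are assumptions made for this example and are not taken from any commercial package.

```python
import math


def to_log_cv(impurity_fraction, floor=1e-6):
    """Transform the measured impurity so the linear MPC sees ln(impurity)."""
    # The floor avoids log(0) on a bad or zero analyzer reading (assumed value).
    return math.log(max(impurity_fraction, floor))


def from_log_cv(log_value):
    """Inverse transform, e.g. to show a prediction in engineering units."""
    return math.exp(log_value)


# A doubling of impurity is the same size step in the transformed CV
# whether the column runs at 0.1% or 1% impurity.
step_at_high_purity = to_log_cv(0.002) - to_log_cv(0.001)
step_at_low_purity = to_log_cv(0.02) - to_log_cv(0.01)
```

The setpoint and limits would be passed through the same transformation, so the MPC engine itself still solves a linear problem.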
Given the success of modeling chemical processes, it may be surprising that linear, empirically developed models are still the norm. The reason is that it is still quicker and cheaper to develop an empirical model, and linear models most often perform well for the majority of processes, especially with the nonlinear capabilities mentioned previously.
Nonlinear MPC applications tend to be reserved for those applications where nonlinearities are present in both system gains and dynamic responses and the controller must operate at significantly different targets. Nonlinear MPC is routinely applied in polymer manufacturing. These applications typically have fewer than five manipulated variables (MVs). A range of models has been used in nonlinear MPC, including neural nets, first principles, and hybrid models that combine first principle and empirical models.
A potential disadvantage of developing a nonlinear MPC application is the time necessary to develop and validate the model. If a first principle model is used, lower level PID loops must also be modeled if the dynamics are significant (i.e., cannot be ignored). With empirical modeling, the dynamics of the PID loops are embedded in the plant responses. Compared to a linear model, a nonlinear model will also require more computation time, so one would need to ensure that the controller can meet the required execution period based on the dynamics of the process and disturbances. In addition, there may be decisions around how to update the model, i.e., which parameters or biases to adjust. For these reasons, nonlinear MPC is reserved for those applications that cannot be adequately controlled with linear MPC.
My opinion is that we'll be seeing more nonlinear applications once it becomes easier to develop nonlinear models. I see hybrid models being critical to this. Known information would be incorporated and unknown parts would be described using empirical models built with a range of techniques that might include machine learning. Such an approach might actually reduce the time of model development compared to linear approaches.
Greg McMillan's answers
MPC for batch operations can be achieved by translating the controlled variable from batch temperature or composition, which has a unidirectional response (e.g., increasing temperature or composition), to the slope of the batch profile (temperature or composition rate of change), as noted in my article Get the Most out of Your Batch. You then have a continuous type of process with a bidirectional response. There is still potentially a nonlinearity issue. For a perspective on the many challenges, see my blog Why batch processes are difficult.
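One simple way to compute a batch profile slope is to compare the current value with the value a fixed time window ago. The Python sketch below assumes a fixed sample time; the class name `ProfileSlope`, the window length, and the units are hypothetical choices for illustration, not details from the referenced article.

```python
from collections import deque


class ProfileSlope:
    """Rate of change of a batch measurement over a moving time window."""

    def __init__(self, window_samples, sample_time):
        self.sample_time = sample_time
        # Keep the newest value plus window_samples older values.
        self.buffer = deque(maxlen=window_samples + 1)

    def update(self, value):
        self.buffer.append(value)
        if len(self.buffer) < 2:
            return 0.0  # no slope available on the first sample
        elapsed = (len(self.buffer) - 1) * self.sample_time
        return (self.buffer[-1] - self.buffer[0]) / elapsed


# Example: temperature ramping at 0.5 degC/min, sampled once per minute.
slope_calc = ProfileSlope(window_samples=5, sample_time=1.0)
slopes = [slope_calc.update(20.0 + 0.5 * k) for k in range(20)]
```

The slope can rise or fall around its setpoint even though the temperature itself only increases, which is what gives the controller a two-sided response. A longer window reduces noise in the slope at the cost of added delay.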
I agree with Mark Darby that the use of hybrid systems where nonlinear models are integrated could be beneficial. My preference would be in the following order in terms of ability to understand and improve:
- first principle calculations
- simple signal characterizations
- principal component analysis (PCA) and partial least squares (PLS)
- neural networks (NN)
There is an opportunity to use principal components for neural network inputs to eliminate correlations between inputs and to reduce the number of inputs. You are much more vulnerable with black box approaches like neural networks to inadequacies in training data. More details about the use of NN and recent advances will be discussed in a subsequent question by Syed.
There is some synergy to be gained by using the best of what each of the above has to offer. In the literature and in practice, experts in a particular technology often do not see the benefit of other technologies. There are exceptions as seen in papers referenced in my answer to the next question. I personally see benefits in running a first principle model (FPM) to understand causes and effects and to identify process gains. What is not widely realized is that the FPM parameters in a virtual plant that uses a digital twin running in real time with the same setpoints as the actual plant can be adapted by use of an MPC. In the next section we will see how NN can be used to help an FPM.
Signal characterization is a valuable tool to address nonlinearities in the valve and process as detailed in my blog Unexpected benefits of signal characterizers. I tried using NN to predict pH for a mixture of weak acids and bases and found better results from the simple use of a signal characterizer. Part of the problem is that the process gain is inversely proportional to production rate as detailed in my blog Hidden factor in our most important control loops.
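A signal characterizer is essentially a piecewise linear lookup. The Python sketch below shows the idea for a titration curve, converting pH into a more linear signal. The breakpoint values are invented for illustration and do not come from any real titration curve; `characterize` is a hypothetical name.

```python
from bisect import bisect_right


def characterize(x, x_points, y_points):
    """Piecewise linear interpolation, held flat outside the breakpoints."""
    if x <= x_points[0]:
        return y_points[0]
    if x >= x_points[-1]:
        return y_points[-1]
    i = bisect_right(x_points, x) - 1
    frac = (x - x_points[i]) / (x_points[i + 1] - x_points[i])
    return y_points[i] + frac * (y_points[i + 1] - y_points[i])


# Hypothetical titration curve: pH breakpoints and percent reagent demand.
PH_POINTS = [2.0, 4.0, 6.0, 7.0, 8.0, 10.0, 12.0]
DEMAND_POINTS = [0.0, 30.0, 45.0, 50.0, 55.0, 70.0, 100.0]

linearized_pv = characterize(5.0, PH_POINTS, DEMAND_POINTS)
```

The controller then works on the characterized signal, and the setpoint is passed through the same curve. More breakpoints can be placed where the curve bends most sharply.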
Since dead time mismatch has a big effect on MPC performance as detailed in the ISA Mentor Post How to Improve Loop Performance for Dead Time Dominant Systems, an intelligent update of dead time simply based on production rate for a transportation delay can be beneficial.
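For a pure transportation delay, the dead time is the plug flow volume divided by the volumetric flow, so the update can be very simple. The Python sketch below assumes consistent units (for example, m3 and m3/min); the function name `transport_delay` and the low-flow guard value are hypothetical.

```python
def transport_delay(volume, flow, min_flow=1e-6):
    """Dead time of a plug flow volume: volume divided by flow."""
    # Guard against division by zero at shutdown or very low rates.
    return volume / max(flow, min_flow)


# Example: 2 m3 of piping. Halving the flow doubles the dead time.
delay_full_rate = transport_delay(volume=2.0, flow=1.0)
delay_half_rate = transport_delay(volume=2.0, flow=0.5)
```

In practice the result would likely be filtered and limited before being written to the model dead time, so that a noisy flow measurement does not disturb the controller.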
Syed Misbahuddin's follow-up question
Recently, there has been an increased focus on the use of deep neural networks for artificial intelligence (AI) applications. Deep signifies many hidden layers. Recurrent neural networks have also been able in some cases to ensure relationships are cause and effect rather than just correlations. They use a rather black box approach with models built from training data. How successful are deep neural networks in process control?
Greg McMillan's answers
Pavilion Technologies in Austin has integrated Neural Networks with Model Predictive Control. Successful applications in the optimization of ethanol processes were reported a decade ago. In the Pavilion 1996 white paper "The Process Perfector: The next step to Multivariable Control and Optimization" it appears that process gains, possibly from step testing of an FPM or bump testing of the actual process for an MPC, were used as the starting point. The NN was then able to provide a nonlinear model of the dynamics given the steady state gains. I am not sure what complexity of dynamics can be identified. The predictions of NN for continuous processes have the most notable successes in plug flow processes where there is no appreciable process time constant and the process dynamics simplify to a transportation delay. Examples of successes of NN for plug flow include dryer moisture, furnace CO, and kiln or catalytic reactor product composition prediction. Possible applications also exist for inline systems and sheets in pulp and paper processes and for extruders and static mixers.
While the incentive is greater for high value biologic products, there are challenges with models of biological processes due to multiplicative effects (neural networks and data analytic models assume additive effects). In almost every first principle model (FPM), the specific growth rate and product formation are the result of a multiplication of factors, each between 0 and 1, that detail the effect of temperature, pH, dissolved oxygen, glucose, amino acid (e.g., glutamine), and inhibitors (e.g., lactic acid). Thus, each factor changes the effect of every other factor. You can understand this by realizing that if the temperature is too high, cells are not going to grow and may in fact die. It does not matter if there is enough oxygen or glucose. Similarly, if there is not enough oxygen, it does not matter if all the other conditions are fine. One way to address this problem is to make all factors as close to one and as constant as possible except for the factor of greatest interest. It has been shown that data analytics can be used to identify the limitation and/or inhibition FPM parameter for one condition, such as the effect of glucose concentration via the Michaelis-Menten equation, if all other factors are constant and nearly one.
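The multiplicative structure can be sketched in a few lines of Python. The functional forms below (a Michaelis-Menten style saturation term and a bell-shaped term around an optimum) and every parameter value are assumptions chosen only for illustration; they are not taken from any particular bioreactor model.

```python
import math


def saturation(conc, half_sat):
    """Limitation factor from 0 up to nearly 1 (Michaelis-Menten form)."""
    return conc / (half_sat + conc)


def bell(value, optimum, width):
    """Hypothetical factor between 0 and 1 that peaks at the optimum."""
    return math.exp(-(((value - optimum) / width) ** 2))


def specific_growth_rate(mu_max, temp, ph, oxygen, glucose):
    """Maximum rate multiplied by one factor per operating condition."""
    factors = [
        bell(temp, optimum=37.0, width=3.0),   # degC
        bell(ph, optimum=7.0, width=0.5),
        saturation(oxygen, half_sat=5.0),      # percent saturation
        saturation(glucose, half_sat=0.5),     # g/L
    ]
    rate = mu_max
    for factor in factors:
        rate *= factor
    return rate


# The benefit of extra glucose depends on temperature.
gain_at_37 = (specific_growth_rate(0.05, 37.0, 7.0, 40.0, 2.0)
              - specific_growth_rate(0.05, 37.0, 7.0, 40.0, 1.0))
gain_at_45 = (specific_growth_rate(0.05, 45.0, 7.0, 40.0, 2.0)
              - specific_growth_rate(0.05, 45.0, 7.0, 40.0, 1.0))
```

Because the factors multiply, a single factor near zero wipes out the effect of all the others, which an additive model cannot represent without interaction terms.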
Process control is about changes in process inputs and consequential changes in process outputs. If there is no change, you cannot identify the process gain or dynamics. We know this is necessary in the identification of models for MPC and PID tuning and feedforward control. We often forget this in the data sets used to develop data models. A smart Design of Experiments (DOE) is really the best way to get data sets that show changes in process outputs for changes in process inputs and that cover the range of interest. If setpoints are changed for different production rates and products, existing historical data may be rich enough if carefully pruned. Remember that neural network models, like statistical models, are correlations and not cause and effect. Review by people knowledgeable in the process and control system is essential.
Time synchronization of process inputs with process outputs is needed for continuous but not necessarily for batch models, explaining the notable successes in predicting batch end points. Often delays are inserted on continuous process inputs. This is sufficient for plug flow volumes, such as dryers, where the dynamics are principally a transport delay. For back mixed volumes such as vessels and columns, a time lag and delay should be used that is dependent upon production rate. Neural network (NN) models are more difficult to troubleshoot than data analytic models and are vulnerable to correlated inputs (data analytics benefits from principal component analysis and drill down to contributors). NN models can introduce localized reversal of slope and bizarre extrapolation beyond training data not seen in data analytics. Data analytics' piecewise linear fit can successfully model nonlinear batch profiles. To me this is similar in principle to the use of signal characterizers to provide a piecewise fit of titration curves.
Process inputs and outputs that are coincidental are an issue for process diagnostics and predictions by MVSPC and NN models. Coincidences can come and go and never even appear again. They can be caused by unmeasured disturbances (e.g., concentrations of unrealized inhibitors and contaminants), operator actions (e.g., largely unpredictable and unrepeatable), operating states (e.g., controllers not in highest mode or at output limits), weather (e.g., blue northers), poor installations (e.g., unsecured capillary blowing in wind), and just bad luck.
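A minimal Python sketch of synchronizing an input with its output is shown below: the input is delayed by a number of samples and then passed through a first-order lag. The function name `synchronize` and the numbers are hypothetical. For a back mixed volume, the delay and lag would be recalculated from production rate; they are held fixed here to keep the example short.

```python
def synchronize(inputs, delay_samples, lag_time, sample_time):
    """Delay an input series, then apply a first-order lag filter."""
    alpha = sample_time / (lag_time + sample_time)
    filtered = inputs[0]  # start the filter at the first input value
    aligned = []
    for k in range(len(inputs)):
        # Before enough history exists, hold the first input value.
        delayed = inputs[max(k - delay_samples, 0)]
        filtered += alpha * (delayed - filtered)
        aligned.append(filtered)
    return aligned


# Example: a step in the input at sample 5, with a 3-sample delay
# and a 10-sample lag, sampled once per time unit.
step_input = [0.0] * 5 + [1.0] * 200
aligned_input = synchronize(step_input, delay_samples=3,
                            lag_time=10.0, sample_time=1.0)
```

The aligned series, rather than the raw input, would then be paired with the measured output when building the data model.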
I found a 1998 Hydrocarbon Processing article by Aspen Technology Inc., "Applying neural networks," that provides practical guidance and opportunities for hybrid models.
The dynamics can be adapted and cause and effect relationships increased by advancements associated with recurrent neural networks as discussed in Chapter 2 Neural Networks with Feedback and Self-Organization in The Fundamentals of Computational Intelligence: System Approach by Mikhail Z. Zgurovsky and Yuriy P. Zaychenko (Springer 2016).
Mark Darby's answers
The companies best known for neural net-based controllers are Pavilion (now Rockwell) and AspenTech. There have been multiple papers and presentations by these companies over the past 20 years with many successful applications in polymers. It's clear from reading these papers that their approaches have continued to evolve over time and standard approaches have been developed. Today both approaches incorporate first principles models and make extensive use of historical data. For polymer reactor applications, the FPM involves dynamic reaction heat and mass balance equations and historical data is used to develop steady-state property predictions. Process testing time is needed only to capture or confirm dynamic aspects of the models.
Enhancements to the neural networks used in control applications have been reported. AspenTech addressed the extrapolation challenges of neural nets with bounded derivatives. Pavilion makes use of constrained neural nets in their fitting of models.
Rockwell describes a different approach to the modeling and control of a fed-batch ethanol process in a presentation made at the 2009 American Control Conference, titled "Industrial Application of Nonlinear Model Predictive Control Technology for Fuel Ethanol Fermentation." The first step was the development of a kinetic model based on the structure of an FPM. Certain reaction parameters in the nonlinear state space model were modeled using a neural net. The online model is a more efficient nonlinear model, fit from the initial model, that handles nonlinear dynamics. Parameters are fit by a gain-constrained neural net. The nonlinear model is described in a Hydrocarbon Processing article titled "Model predictive control for nonlinear processes with varying dynamics."
In response to Syed's follow-up question about deep neural networks: deep neural networks require more parameters, but techniques have been developed that help deal with this. I have not seen results in process control applications, but it will be interesting to see if these enhancements, developed and used by the Google types, will be useful for our industries.
In addition to Greg's citations, I wanted to mention a few other articles that describe approaches to nonlinear control. An FPM-based nonlinear controller was developed by ExxonMobil, primarily for polymer applications. It is described in a paper presented at the Chemical Process Control VI conference (2001) titled "Evolution of a Nonlinear Model Predictive Controller," and in a subsequent paper presented at another conference, Assessment and future directions of nonlinear model predictive control (2005), entitled "NLMPC: A Platform for Optimal Control of Feed- or Product-Flexible Manufacturing." The motivation for a first principles model-based MPC for polymers included the nonlinearity associated with both gains and dynamics, constraint handling, control of new grades not previously produced, and the portability of the model/controller to other plants. In the modeling step, the estimation of model parameters in the FPM (parameter estimation) was cited as a challenge. State estimation of the CVs, in light of unmeasured disturbances, is considered essential for the model update (feedback step). Finally, the increased skill necessary to support and maintain the nonlinear controller was mentioned, in particular to diagnose and correct convergence problems.
A hybrid modeling approach to batch processes is described in a 2007 conference presentation at the 8th International IFAC Symposium on Dynamics and Control of Process Systems by IPCOS, titled "An Efficient Approach for Efficient Modeling and Advanced Control of Chemical Batch Processes." The motivation for the nonlinear controller is the nonlinear behavior of many batch processes. Here, fundamental relationships were used for mass and energy balances and an empirical model for the reaction energy (which includes the kinetics), which was fit from historical data. The controller used the MPC structure, modified for the batch process. Future predictions of the CVs in the controller were made using the hybrid model, whereas the dynamic controller incorporated linearizations of the hybrid model.
I think it is fair to say that there is a lack of nonlinear solvers tailored to hybrid modeling. Exceptions are the freely available software environments APMonitor and GEKKO, developed by John Hedengren's group at BYU. They solve dynamic optimization problems with first principle or hybrid models and have built-in functions for model building, updating, and control. Here is a link to the website that contains references and videos for a range of nonlinear applications, including a batch distillation application.
Hunter Vegas' answers
I worked with neural networks quite a bit when they first came out in the late 1990s. I have not tried working with them much since, but I will pass on my findings, which I expect are as applicable now as they were then.
Neural networks sound useful in principle. Give a neural network a pile of training data, let it 'discover' correlations between the inputs and the output data, then reverse those correlations in order to create a model which can be used for control. Unfortunately actually creating such a neural network and using it for control is much harder than it looks. Some reasons for this are:
- Finding training data is hard. Most of the time the system is running fairly normally and tends to draw flat lines. Only during upsets does it actually move around and provide the neural network useful information. Therefore you only want to feed the network upset data to train it. Then you need to find more upset data to test it. Finding that much upset data is not so easy to do. (If you train it on normal data, the neural network learns to draw straight lines, which does not do much for control.)
- Finding the correlations is not so easy. The marketing literature suggests you just feed it the data and the network "figures it out." In reality that doesn't usually happen. It may be that the correlations involve the derivative of an input, or the correlation is shifted in time, or perhaps there is correlation of a mathematical combination of inputs involving variables with different time shifts. Long story short - the system usually doesn't 'figure it out' - YOU DO! After playing with it for a while and testing and re-testing data you will start to see the correlations yourself which allows you to help the network focus on information that matters. In many cases you actually figure out the correlation and the neural network just backs you up to confirm it.
- Implementing a multivariable controller is always a challenge. The more variables you add, the lower the reliability becomes, because you have to make the controller smart enough to know how to handle input data failures gracefully. So even when you have a model, turning it into a robust controller that can manipulate the process is not always such an easy thing.
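As an example of the kind of manual digging described in the second point above, the Python sketch below scans a range of time shifts and reports the one where an input correlates best with an output. It is only an illustration on synthetic data: the names `pearson` and `best_lag`, the 7-sample shift, and the gain of 2 are all made up for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is a flat line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lag(inputs, outputs, max_lag):
    """Return (lag, correlation) for the shift with the strongest correlation."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        r = pearson(inputs[:len(inputs) - lag], outputs[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Synthetic data: the output is the input delayed by 7 samples, gain of 2.
rng = random.Random(0)
inputs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
outputs = [0.0] * 7 + [2.0 * u for u in inputs[:-7]]
lag, r = best_lag(inputs, outputs, max_lag=20)
```

Note that a flat-line input gives a correlation of zero at every shift, which is the same reason normal operating data teaches a network so little.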
I am not saying neural networks do not work - I actually had very good success with them. However, when all was said and done, I pretty much figured out the correlations myself through trial and error and was able to utilize that information to improve control. I wrote a paper on the topic and won an ISA award because neural networks were all the rage at that time, but the reality was I just used the software to reinforce what I learned during the 'network training' process.