Predictive maintenance embraces analytics
Shifting the state of the art
- Run-to-fail is not a strategy embraced by plant management or production staff, and new technologies are equipping production staff with the intelligence needed to avoid downtime.
- Data is more abundant as costs decrease, while fewer experienced staff are available to maintain profitable operations.
- Predictive analytics pushes the pursuit of asset reliability from its reactive orientation to one that is fully forward-looking and better aligned with the goals and realities of today's manufacturing environment.
By Robert Rice and Richard Bontatibus
What is the state of the art in asset reliability? Consider a typical production plant. The human-machine interface (HMI) graphics flash like a stoplight: red, yellow, and green. Performance values consistently fluctuate up and down. The data is real-time and it is highly dynamic - professionals equate it to a plant's vital signs. All the while operators monitor the HMI waiting for indications of an excursion, and maintenance staff tend to their calendar-based maintenance schedules in an effort to ward off failure. Condition-based tools monitor the few attributes that are readily correlated. With uptime synonymous with profit, this inefficient and risky approach belies the technological advances and industry's investments in improving asset reliability.
Run-to-fail is not a strategy embraced by either plant management or production staff, and changes in the plant environment present new challenges in maintaining uptime. On the one hand, an ever-increasing supply of operational data is now available, which can provide insights into the health of production assets. On the other hand, dwindling domain experience places a new form of strain on production staff, who must analyze the data in a time frame needed for improved uptime and reduced operations and maintenance (O&M) costs. Data is becoming more abundant as its cost decreases. Although the shift to predictive maintenance provides a meaningful improvement in the manner by which plants maintain normal operation, there remains a strong element of reactive behavior that manufacturers are looking to overcome. Existing predictive maintenance tools do not fully equip a downsized staff to take advantage of the volume of data that is now being generated.
Recent innovations in asset reliability and condition-based monitoring are evolving the state of the art. New technologies address both the need for higher order diagnostics and the challenge of a growing experience gap. These advances possess the ability to process abundant, complex data. Equally important, they are uniquely capable of interpolation - they draw inferences from the data much like a seasoned professional. They have been termed "predictive analytics" as much to distinguish their ability to process and correlate abundant data sources as to reflect their faculty for understanding how changes in asset behavior lead to failure.
By definition, predictive analytics encompasses a variety of techniques from statistics, modeling, machine learning, and data mining that analyze current and historical data to predict future events. The nature of predictive analytics pushes the pursuit of asset reliability from its residual reactive orientation to one that is fully forward-looking and better aligned with the goals and realities of today's manufacturing environment.
A sea change is underway
The changing needs of manufacturers have not been lost on suppliers. A "big data" frenzy has captured the attention of thought leaders from across our industry. They accurately see the deluge of data - both raw and processed - as less an advantage and more a hindrance to effective plant management unless analytics improve (Figure 1). Whether through the R&D investments of Fortune 500 behemoths or the innovations of small companies, significant advances in data analysis have been achieved specifically within the manufacturing realm. For this big data problem we now have big analytic capabilities that address manufacturing's needs.
These predictive analytic capabilities are not simply working their way through some corporate skunk works. They are a current reality. For example, during the Minds + Machines 2012 conference, General Electric's CEO stood alongside a jet engine and described how that single asset utilized over 20 sensors to collect a terabyte of data during each day of operation. Again, one asset, 20 sensors, and one terabyte each day. Scale that amount of data to plant size, and the volume of raw data quickly becomes staggering. This is today's reality. It is easy to project how tomorrow's data demands will be increasingly complex.
Moore's law for increases in computer processing power is complemented by equally impressive reductions in the cost of the data itself. Consider the global market for sensors. As one example, forecasts suggest that the average unit cost of a pressure sensor will fall 30 percent as unit sales nearly double over the next four years. As the cost to capture data continues to fall, it is conceivable that many manufacturers will drown in their data due to an inability to transform it into actionable information. Although existing asset maintenance and reliability technologies, such as CMMS, EAMS, and others, offer the start of a solution, their data processing limitations underscore the need for more powerful predictive analytic capabilities.
Considering the costs of downtime
If downtime is costly, then unplanned downtime is crippling. Contributors to unplanned downtime's higher cost seem endless. They include the obvious, such as unexpected losses in production, employee overtime, and replacement equipment. Equally important are other and more esoteric cost factors that involve safety, waste, and even brand.
It is widely reported that 5 percent of plant production is lost annually due to downtime. Putting that into a different context, roughly $647 billion is surrendered on a global basis by manufacturers across all industry segments - the corresponding portion of nearly $13 trillion in production. For an average cement mill, lost production equates to nearly $7 million. For a chemical plant, the value grows to approximately $50 million. These sums stand out all the more when considered relative to a persistently sluggish global economy - an environment where manufacturers have been forced to extend the life of existing assets and to do so with fewer human resources.
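The arithmetic behind the global figure can be checked with a short sketch. The 5 percent loss rate, the $13 trillion production total, and the $647 billion loss come from the text above; the function name `downtime_loss` is a hypothetical helper introduced only for illustration.

```python
def downtime_loss(annual_production: float, loss_rate: float = 0.05) -> float:
    """Return the production value lost to downtime at the given loss rate."""
    return annual_production * loss_rate


# Global production of "nearly $13 trillion" (figure quoted in the article).
global_loss = downtime_loss(13e12)
print(f"Loss at 5 percent of $13 trillion: ${global_loss / 1e9:.0f} billion")

# Working backward: the production base implied by the quoted $647 billion.
implied_base = 647e9 / 0.05
print(f"Production base implied by $647 billion: ${implied_base / 1e12:.2f} trillion")
```

Five percent of $13 trillion is $650 billion, which is consistent with the quoted $647 billion; that figure corresponds to a production base of about $12.94 trillion, or "nearly $13 trillion."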
For production facilities operating close to the margin, the impact of unplanned downtime can be especially troubling. In its November 2012 study on operational risks, the Aberdeen Group distinguished three categories of manufacturers: Best-in-Class, Industry Average, and Laggards. Whereas Best-in-Class manufacturers experienced unscheduled downtime of 1.5 percent, the study found that Laggards suffered from a rate of 14.8 percent. The associated impact on operating margin between these two categories was an astounding 33 percent.
Beyond the loss to production and its obvious financial impact, unplanned downtime also includes other equally significant effects, such as safety. In its 2002 fact sheet, the Occupational Safety and Health Administration (OSHA) assigned an annual cost of $125 billion to U.S. businesses alone in association with six million non-fatal workplace injuries. For 2004 the National Safety Council estimated the costs of occupational deaths and injuries at more than $142 billion. These figures do not correlate directly with documented equipment failures. Although production environments are inherently dangerous even under normal conditions, it is reasonable to assume that the rate of incidents and their significance are greater during periods of unplanned downtime.
The essence of experience
Staffing is one factor that highlights the need for further advances in asset reliability. Sustained reductions to headcount and the anticipated loss of expertise due to the industry's aging workforce undermine the industry's ability to maintain effective and efficient production levels. New tools are required to backfill those losses - ones capable of addressing all production equipment and processing the abundant data.
Employment remains depressed and near a 70-year low. According to the Bureau of Labor Statistics, employment within the manufacturing sector decreased by 12.9 percent from January 2008 through September 2012, a loss of 1.77 million jobs. April 1941 was the last time employment sank to the current level. Gone are the marginal employees - those staff who were deemed non-critical. Although their jobs were eliminated, the workload was simply redistributed.
The aging workforce has become a siren's song that adds to management's list of concerns. Sources such as Advanced Technology Services and Neilson Research suggest that nearly 40 percent of production staff will retire by the year 2015. Unlike the reduction of marginal employees, the loss of these highly experienced staff will place a disproportionate strain on plant operations. Gone will be that valuable combination of knowledge and intuition that understood the quirks of the plant environment - the visible and audible clues that are not listed in an operating manual and that only come with experience. In spite of their knowledge, newly minted graduates from trade schools and universities will lack the first-hand experience with which to provide a seamless transition. Depending on the environment, that process can take years.
The current state-of-the-art technologies in condition-based monitoring and asset reliability provide meaningful assistance in terms of monitoring asset health and enabling residual staff to function effectively. However, they fall short in their ability to fully understand the true nature of an asset's condition. With a silo-like focus on a handful of asset attributes, such as vibration, rotational speed, and load, these approaches provide only a basic, albeit meaningful, understanding of a given asset's propensity to fail. Additionally, they are incapable of applying past experience to current operations. Most are simply restricted to pairings of high and low alarm levels.
Through the application of predictive analytics, manufacturers gain access to insights that fully complement their existing predictive maintenance tools - the insights made possible by processing their growing stores of data. Predictive analytics enables manufacturers to capture the essence of both experience and intuition by cataloging varying production conditions. Due to growing complexity in the production environment, the capabilities of predictive analytics offer significant value.
Dynamic data clustering
A significant gap separates existing model-based approaches from predictive analytics in assessing asset health. Whereas the former applies a mathematical approach based on equations, the latter utilizes statistics and machine learning in a purely data-driven approach. Modeling tools produce static thresholds based on design or expected levels, whereas predictive analytics produces a data model that dynamically changes in response to evolving asset conditions. Due to these differences, they can be highly complementary tools for maintaining asset reliability.
Most modeling tools evaluate changes in the condition of a given production asset by comparing values of various attributes to their respective design levels. Information is usually presented in a trend format. As conditions degrade over time, alerts are raised indicating that attention is required. Negative trends indicative of fouling or other performance-related problems are common health and reliability issues that develop over time. More difficult, however, is identifying imminent asset failure that can result in an unplanned outage. Asset failures are usually due to a combination of factors, which cannot be predicted a priori or detected simply by viewing trend data. Over time the underlying models lose their efficacy as the asset has changed with age while the models remain static.
Predictive analytics technologies apply machine-learning algorithms to produce data-driven models of an asset. One technique is to dynamically form and evolve data clusters in sync with each asset's life cycle (Figure 2). Clusters are based on numerous data inputs that respond to the changing conditions of an individual asset, and they correspond with the various modes, operating ranges, and products to which the asset is applied. Once cataloged in a knowledge or experience database, clusters associated with asset degradation or other negative attributes trigger alerts. Similarly, the formation of new clusters prompts alerts as the predictive analytics technology identifies new conditions that have yet to be classified. Unlike static and limited-input models, the clusters fully account for the asset's condition and recognize both subtle and significant changes in behavior.
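The cluster-and-alert idea can be illustrated with a minimal sketch. It is not the algorithm of any particular product: it assumes a simple distance-based scheme in which a sample joins the nearest cluster if it lies within a fixed radius and otherwise starts a new one. The names `Cluster` and `ClusterCatalog`, the `radius` parameter, the label strings, and the sample values are all hypothetical, and real inputs would need to be scaled to comparable units first.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class Cluster:
    centroid: list[float]
    count: int = 1
    label: str = "unclassified"  # hypothetical labels, e.g. "normal" or "degraded"

    def update(self, sample) -> None:
        """Move the centroid toward the new sample (running mean)."""
        self.count += 1
        self.centroid = [c + (x - c) / self.count
                         for c, x in zip(self.centroid, sample)]


@dataclass
class ClusterCatalog:
    radius: float  # assumed maximum distance for a sample to join a cluster
    clusters: list[Cluster] = field(default_factory=list)

    def observe(self, sample) -> str | None:
        """Assign the sample to a cluster; return an alert message or None."""
        nearest = min(self.clusters,
                      key=lambda c: math.dist(c.centroid, sample),
                      default=None)
        if nearest is None or math.dist(nearest.centroid, sample) > self.radius:
            # No cataloged condition is close enough: a new cluster forms.
            self.clusters.append(Cluster(list(sample)))
            return "new condition: unclassified cluster formed"
        nearest.update(sample)
        if nearest.label == "degraded":
            return "known degraded condition"
        return None


# Illustrative (vibration, temperature) samples; the values are made up.
catalog = ClusterCatalog(radius=1.0)
alerts = [catalog.observe(s) for s in [(1.0, 50.0), (1.1, 50.2), (4.0, 58.0)]]
print(alerts)
```

In this sketch the first and third samples each form a new cluster and raise an alert, while the second is absorbed into an existing cluster without one. Once staff label a cluster as "degraded", any later sample that falls into it raises an alert as well.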
The benefits of predictive analytics are significant. This is especially true when used as a complement to existing maintenance and reliability tools.
The benefit of time
Predictive analytics provides a much-needed supplement to predictive maintenance technologies. It applies machine learning to cluster large volumes of multi-variable data. Through the cataloging of data clusters, predictive analytics establishes a comprehensive profile for each asset - unique fingerprints left behind during all phases of operation. Those fingerprints provide detailed knowledge of the asset's performance at varying operating rates, with differing products, and under other changing conditions within the production environment.
A facility that experienced the catastrophic failure of a fan system provides an example where predictive analytics' benefits were made clear. The unit had been equipped with a vibration analysis tool (Figure 3). In spite of monitoring standard vibration attributes, the facility's production staff received their first alert only minutes prior to the failure. By the time vibration levels had triggered an alarm, the damage to the fan system had already occurred and staff was unable to address the situation with an appropriate corrective action. The ensuing shutdown was costly due to lost production and premiums paid for both replacement equipment and employee overtime. Fortunately, no injuries were sustained.
PlantESP from Control Station, Inc., was subsequently applied to the same unit. Like other predictive analytics tools, it catalogs historical data and compares the associated clusters with current data. Its analytics module is operated in one of two modes: closed-book (i.e., operating) and open-book (i.e., learning). The two modes allow the application to remember past conditions as much as to single out new ones and thereby avoid catastrophic failures.
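The difference between the two modes can be sketched as follows. This is an illustration of the general learning-versus-operating idea only, not PlantESP's implementation, which the article does not describe in detail. The `Mode` enum, the `process` function, the `radius` parameter, and the sample values are all hypothetical.

```python
import math
from enum import Enum


class Mode(Enum):
    OPEN_BOOK = "learning"     # the catalog may grow
    CLOSED_BOOK = "operating"  # the catalog is frozen


def process(sample, catalog, mode, radius=1.0):
    """Return True if the sample matches a cataloged condition.

    In open-book mode an unmatched sample is added to the catalog as a new
    condition; in closed-book mode the catalog is left unchanged.
    """
    known = any(math.dist(sample, c) <= radius for c in catalog)
    if not known and mode is Mode.OPEN_BOOK:
        catalog.append(tuple(sample))
    return known


# Learn from historical data, then evaluate live data against the catalog.
history = [(1.0, 50.0), (1.2, 50.1), (3.0, 55.0)]
catalog = []
for s in history:
    process(s, catalog, Mode.OPEN_BOOK)

known_live = process((1.1, 50.0), catalog, Mode.CLOSED_BOOK)
novel_live = process((6.0, 70.0), catalog, Mode.CLOSED_BOOK)
print(len(catalog), known_live, novel_live)
```

Here the historical pass catalogs two distinct conditions. In closed-book mode the first live sample matches a known condition, while the second is recognized as novel without altering the catalog.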
Predictive maintenance technologies like vibration analysis tools see only part of the picture. In the case of the fan failure, the existing vibration analysis tool had utilized a subset of the available asset tags. Although vibration, current, and temperature were constantly monitored, collectively they failed to characterize the true physics of the fan system. With partial information, the plant's predictive maintenance solution provided only part of the answer to what was a complex asset reliability problem.
Adjusting alarms is not a solution to this type of multi-variable problem. Tightening alarm limits on vibration would have resulted in too many nuisance alarms - another problem widely faced by manufacturers. Measurements of vibration and other available data are insufficient to assess an asset's health when viewed in isolation or relative to a fixed limit. Multi-variable problems require the use of multi-dimensional solutions.
Hindsight is 20/20, and yet in the case of the fan failure the forward-looking value of predictive analytics was clearly highlighted. Using historical data from a broader sampling of asset tags, the application was initially run in open-book mode. With no previous knowledge of the fan's behavior, several data clusters were generated and quickly attributed to various normal operating conditions - distinctive correlations involving numerous data tags. Another highly distinguishable cluster formed approximately ten hours prior to the failure, signaling a dynamic change in the relationship among and between the asset's measured attributes. Even with that amount of lead time, staff would have been equipped to conduct an investigation and prescribe an appropriate remediation plan.
It is difficult for plant staff to discern subtle changes in the state of seemingly innumerable production assets. The challenge becomes increasingly difficult when the same staff are required to interpolate the relationship between multiple, dynamic data sets for each of those many assets. In the case of the fan system, the newly formed cluster highlighted a notable increase in vibration that corresponded with only slight increases to other measured variables. In spite of the distinct increase, the level of vibration was still considered within tolerance, and there was no need for concern. Only when vibration was correlated with other values, such as temperature and production flow, among other data, did the asset's troubled health become clear.
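A small sketch shows how a reading can sit inside every individual alarm limit and still be abnormal when variables are viewed together. It uses the Mahalanobis distance, one common multivariate measure chosen here purely for illustration; the article does not state which technique any product uses. The data, the alarm limits, and the function name `mahalanobis_2d` are hypothetical.

```python
from statistics import mean


def mahalanobis_2d(point, data):
    """Mahalanobis distance of a 2-D point from the distribution of data."""
    xs, ys = zip(*data)
    mx, my = mean(xs), mean(ys)
    n = len(data)
    # Sample variances and covariance of the two variables.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T for a 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5


# Made-up history in which vibration and temperature rise together.
normal = [(2.0, 60.0), (2.5, 62.0), (3.0, 64.5), (3.5, 66.0),
          (4.0, 68.5), (4.5, 70.0), (5.0, 72.0)]

VIB_HIGH, TEMP_HIGH = 6.0, 80.0   # assumed fixed high-alarm limits

suspect = (4.8, 65.0)   # vibration up, temperature barely changed
typical = (3.2, 65.0)   # follows the usual relationship

within_limits = suspect[0] < VIB_HIGH and suspect[1] < TEMP_HIGH
d_suspect = mahalanobis_2d(suspect, normal)
d_typical = mahalanobis_2d(typical, normal)
print(within_limits, round(d_suspect, 1), round(d_typical, 1))
```

Neither value of the suspect reading would trip its fixed alarm, yet its multivariate distance is far larger than that of the typical reading because it breaks the usual relationship between vibration and temperature.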
The same data was processed a second time while in closed-book mode. The second pass allowed the application to apply its knowledge of the fan's operational conditions. A complete catalog of conditions was available as a reference. The resulting warning was far more meaningful from an asset maintenance and reliability standpoint, as the technology generated an initial alert a full 15 days prior to the failure. The alert stemmed from the now-documented relationship between the numerous data sources.
Predictive analytics solutions equip manufacturers with much-needed and higher-order diagnostic capabilities. They are not prone to obsolescence, as data clusters are generated dynamically when new conditions are experienced. Additionally, their ability to rapidly process large amounts of data from a broader array of tags enables predictive analytics to account for an individual asset's true nature. In a big data environment, manufacturers can secure substantial gains from the application of these analytic innovations.
Alerts have been triggered within the manufacturing industry. Data has far exceeded its design level, and staffing has fallen below its threshold for safe and effective production. The surge in data is outpacing the processing capabilities of an undersized workforce, and existing tools cannot fully exploit that data for the benefit of manufacturers. The costs are high. New technologies are needed to restore plant operation - a big analytics solution to the big data challenge.
The industry continues to embrace technology as an enabler of safe and profitable production. Widespread adoption of predictive maintenance tools signaled a meaningful advancement in the use of technology, and it has reduced the probability of unplanned downtime due to asset failures. Due to their limitations, however, these tools fall short of the goal and perpetuate aspects of the run-to-fail behavior.
Suppliers across the industry - large and small - have responded to the growing need for higher-order diagnostics. Predictive analytic technologies have been applied successfully to solve complex asset reliability challenges. They are capable of processing the swelling influx of plant data, and they provide a solution to the industry's losses of human capital. Aligned with today's production environment, predictive analytics enables manufacturers to stop looking over their shoulders and face forward.
ABOUT THE AUTHORS
Robert Rice, Ph.D., is vice president of engineering at Control Station, Inc., and leads the company's research and development program.
Richard Bontatibus recently joined Control Station, Inc. as vice president of global sales. Mr. Bontatibus was previously with Emerson Process Management and managed asset reliability sales throughout the North American region.