Analytics next: Beyond spreadsheets
New approaches handle more data volume and perform predictive analytics
By Michael Risse
Merriam-Webster's online dictionary says the first known use of "analytics" was in 1590 when it was defined as "the method of logical analysis"; whereas "analysis" was first used in 1581 and was defined as "the separation of a whole into its component parts." Fast forward 430 years, and analytics is now defined in many ways, including data visualization, machine learning, business intelligence, dashboards, and key performance indicators (KPIs). The pressure to gain insight from data is so pervasive that analytics has become a throwaway term in marketing materials for all types of software.
But whatever analytics is called or supposed to mean, process manufacturers have too much data and not enough insights. Most process industry companies have collected years of time-series historical data but are unable to quickly surface and share critical insights leading to improvements in efficiency and innovation. Additionally, it is difficult to determine value or affect in-process batches or processes, because it takes so long to find the insights.
Further, this "data rich, information poor" (DRIP) (figure 1) situation is only getting worse with the exponential increase in data as the Industrial Internet of Things (IIoT) takes hold. IIoT forecasts correlate to the amount of data expected, and market intelligence firm IDC is expecting worldwide spending on IoT to reach $745 billion in 2019, led by the manufacturing sectors. That represents a massive amount of sensor data, and it will go to waste absent robust analytics and a flexible, cost-effective way to store and process it.
If business and production insights are going to be faster, better, and easier to achieve, then something will need to change: computer science innovation must be bridged with the expertise and experience of plant employees. The spreadsheet, the backbone of the past 30 years of analytics efforts in manufacturing, will simply not suffice for the next 30 years. There is too much data, too few engineering professionals, and too many demands for better insight for spreadsheets to remain the primary solution.
Figure 1. Many process manufacturing companies are drowning in data but thirsting for information.
The increased attention on analytics in process manufacturing has led to a taxonomy for different types of analytics. It is important to identify how they may be applied in process manufacturing (figure 2).
- Descriptive analytics are by definition backward-looking, because they describe what happened in reports, charts, and KPIs based on collected data. This is the most widely used type of analytics across all industries, and the insights are broadly useful and may be shared in near real time.
- Monitoring analytics track asset, batch, or operations performance and seek to answer the question "what is happening now?" Typically, monitoring solutions answer the question of current status in dashboards or process graphics updated in near real time, but they are strictly advisory and are thus not suited for inclusion in closed-loop control systems.
- Diagnostic analytics seek to identify why something happened based on analysis of historical data, often called root-cause analysis. What reports are to descriptive analytics, spreadsheets are to diagnostic analytics: engineers use them to combine, contextualize, and perform calculations on data to uncover cause and effect in processes and units.
- Predictive analytics help engineers identify what will likely happen based on real-time and historical data, enabling corrective action to be taken before an undesirable outcome. Benefits include avoiding unplanned downtime, optimizing maintenance schedules, and improving quality or yields.
- Prescriptive analytics aim to optimize outcomes by informing plant employees of their best actions based on existing conditions. In a closed-loop system, prescriptive analytics can automate asset or process adjustments based on a predefined set of conditions. In an open-loop system, prescriptive analytics inform engineers of desired actions.
Figure 2. This taxonomy describes five different types of analytics in process manufacturing and their benefits.
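To make the five categories concrete, here is a minimal Python sketch that applies each one to the same small dataset. The temperature and coolant-flow values, the alarm limit, and the recommended action are all invented for illustration; a real application would draw on historian data and far richer models.

```python
from statistics import mean

# Hypothetical hourly reactor temperatures in deg C (illustrative values only).
temps = [70.0, 71.0, 72.0, 73.0, 74.0, 75.0]
# Hypothetical coolant flow readings taken at the same times.
flows = [50.0, 49.0, 48.0, 47.0, 46.0, 45.0]
HIGH_LIMIT = 78.0  # assumed alarm limit, not taken from any real process


def linear_forecast(ys, steps_ahead):
    """Fit a least-squares line to ys (indexed 0..n-1) and extrapolate it."""
    n = len(ys)
    xs = list(range(n))
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return intercept + slope * (n - 1 + steps_ahead)


# Descriptive: what happened?
average_temp = mean(temps)

# Monitoring: what is happening now?
in_alarm = temps[-1] > HIGH_LIMIT

# Diagnostic: why did it happen? Compare coolant flow during the cooler
# readings with coolant flow during the hotter readings.
hot = [f for t, f in zip(temps, flows) if t > average_temp]
cool = [f for t, f in zip(temps, flows) if t <= average_temp]
flow_drop = mean(cool) - mean(hot)

# Predictive: what will likely happen four hours from the last reading?
forecast_4h = linear_forecast(temps, steps_ahead=4)

# Prescriptive: what should be done about it?
advice = "increase coolant flow" if forecast_4h > HIGH_LIMIT else "no action"

print(average_temp, in_alarm, flow_drop, forecast_4h, advice)
```

In this toy case the process is not yet in alarm, but the trend line crosses the assumed limit within four hours, so the prescriptive step recommends acting before the excursion occurs.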
The future of analytics: Three developments
Against the backdrop of DRIP, with ever more data coming soon and elevated pressure to gain faster insights of all types for improved production, there are three important trends that will define the future of analytics as experienced in process manufacturing environments.
1. Recognition of employee empowerment through self-service analytics. The reason spreadsheets have enjoyed their run of success as the primary tool for analytics is that they are accessible to the employees who know the questions to ask. The approach of information technology (IT) personnel without industrial knowledge generating or automating analytics or insights is proving short lived, and deservedly so. It simply does not work in complex and rapidly changing environments with extensive interaction among variables.
An example of the importance of a self-service approach can be found in a recent McKinsey & Company report. "Value emerges as a combination of the tool and the people who operate it. Yet we have seen too many cases where that simple truth has been forgotten in the wave of enthusiasm for a new approach. Advanced solutions often fail not because they produce erroneous results, but because the workforce does not understand, or trust, those results." Technology investments are necessary, but not sufficient to achieve productivity improvements, the authors write. To succeed, it is essential for manufacturers to invest in their people.
In process industries, such as oil and gas, chemical, refining, pharmaceutical, and food and beverage, engineers are the most important group of analytics users. They have the required experience, expertise, and history with the plant and processes. Self-service analytics let engineers work at an application level with productivity, empowerment, interaction, and ease-of-use benefits (figure 3). In the future, the universe of analytics users will expand beyond engineers to operators, executives, and accountants, all of whom will also benefit.
Figure 3. Self-service analytics enable engineers to work at the application level and gain productivity, empowerment, interaction, and ease-of-use benefits using Seeq R21 software.
2. The emergence of advanced analytics. This new class of analytics speaks to the inclusion of cognitive computing technologies into the visualization and calculation offerings that have been used for years to accelerate insights for end users. McKinsey defines advanced analytics solutions this way:
"[Advanced analytics solutions] . . . provide easier access to data from multiple data sources, along with advanced modeling algorithms and easy-to-use visualization approaches and could finally give manufacturers new ways to control and optimize all processes throughout their entire operations." Figure 4 depicts data from multiple sources accessed from a single advanced analytics application.
The introduction of machine learning and other analytic techniques accelerates an engineer's efforts when seeking correlations, clustering, or any other needle-in-the-haystack analysis of process data (figure 5). With these features built on multidimensional models and enabled by assembling data from different sources, engineers gain an order-of-magnitude improvement in analytic capabilities, akin to moving from pen and paper to the spreadsheet 30 years ago.
These innovations in advanced analytics are not a black-box replacement for the expertise of engineers. Instead, they complement and accelerate engineers' skills, and they keep the underlying algorithms transparent, which supports a first-principles approach to investigations.
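As a rough illustration of the correlation search described above, the following Python sketch ranks candidate signals by how strongly they track a target variable. The signal names and values are hypothetical, and plain Pearson correlation stands in for whatever techniques a commercial application actually uses. The calculation is fully visible, in keeping with the point about transparency.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def rank_by_correlation(target, candidates):
    """Return (name, correlation) pairs, strongest relationship first."""
    scores = {name: pearson(target, values) for name, values in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical target, e.g. a product impurity measurement per batch.
target = [1.0, 2.0, 3.0, 4.0, 5.0]

# Hypothetical candidate signals assembled from different data sources.
candidates = {
    "feed_rate": [2.0, 4.0, 6.0, 8.0, 10.0],
    "ambient_temp": [3.0, 1.0, 4.0, 1.0, 5.0],
    "reflux_ratio": [5.0, 4.0, 3.0, 2.0, 1.5],
}

ranking = rank_by_correlation(target, candidates)
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

Ranking by absolute value keeps strong negative relationships, such as the reflux ratio here, near the top of the list. The engineer still decides which correlations reflect real cause and effect.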
3. Analytics moving to the cloud. Companies of all types, including process manufacturers, are moving their IT infrastructure and data to public and hybrid clouds to increase agility, speed responsiveness, and reduce complexity. Driving this growth are the burgeoning data volumes and increased demand from compute-intensive workloads.
Analytics workloads are particularly suited for migration, because most use cases require the scalability, agility, time to market, and reduced costs provided by the cloud. Large process manufacturers will likely utilize a mix of public and private cloud offerings, as well as on-premise components, for analytics.
The trend is in its infancy, though some industries are ahead. Oil and gas companies, for example, are beginning to embrace the cloud, for analytics as well as other use cases. As a result, Microsoft, Amazon, and Google have specifically focused on the oil and gas sector as a starting point for their efforts. This is clearly a sign of market interest, and it is also a sign of the maturity of the cloud offerings: Amazon brought out AWS in 2002, and then introduced S3 (storage) and EC2 (virtual machines) in 2006. Cloud computing competition then increased with Microsoft's and Google's cloud platform introductions in 2008.
Storing large volumes of data in the cloud is increasing, and it is already a "when" and not an "if" question for most companies. Consequently, the big public cloud platforms are paying more attention to the largest sources of data, with manufacturing leading all sectors of the economy. What this means for process manufacturing customers is faster time to deployment and a lower price for analytics access.
Dominant for decades as the analytics tool of choice, spreadsheets are not up to the task of performing advanced analytics on ever-larger datasets, yet their accessibility to engineers is a requirement for any future analytics offering. Insights that take too long to discover languish because they cannot easily be published and shared with others. Advanced analytics applications connect with data from a wide array of sources and surface insights much more quickly in a format that is easy to share, enabling actions to improve business results and profitability.
Here is an example showing advanced analytics in action.
Figure 4. Advanced analytics applications can access data from multiple sources.
Figure 5. Software takes advantage of advances in a range of technologies, including machine learning, empowering engineers to create insights.
A chemical company took advantage of a browser-based advanced analytics application running in the cloud to connect back to its on-premise data via a secure HTTPS connection and a remote connection agent. The solution was deployed and accessible in a matter of hours, and the data stayed where it was, enabling insight in days rather than months.
Another option is to make the cloud the destination for datasets collected from remote or IIoT end points. This is more natural and easier than trying to reroute data from carriers and wireless systems back into IT systems and then to the cloud, which is why data "born on the cloud" is a popular option for many monitoring applications. In this case, end users can access the data either by running analytics on the cloud or by running the analytics solution on premise with a remote connection to the cloud-based data.
In either scenario, the monitoring data may be complemented or contextualized by connecting the analytics solutions to other data sources (historians, manufacturing execution systems, etc.) to get a complete view of all data. For chemical companies, this scenario can be used for new insights into supply chain and operations by complementing existing data with data from wireless or cellular networks.
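One simple form of this contextualization is tagging each sensor reading with the batch that was running when it was taken. The Python sketch below assumes hypothetical batch records from a manufacturing execution system and hypothetical vibration readings from a cloud-hosted sensor. The names, timestamps, and values are invented for illustration.

```python
from bisect import bisect_right

# Hypothetical batch start times (minutes since midnight), sorted ascending,
# as they might be exported from a manufacturing execution system.
batch_starts = [0, 120, 240]
batch_ids = ["B-101", "B-102", "B-103"]

# Hypothetical (timestamp, vibration) readings from a cloud-hosted IIoT sensor.
readings = [(15, 0.21), (130, 0.25), (200, 0.40), (250, 0.22)]


def batch_at(timestamp):
    """Return the batch running at timestamp, or None if before the first batch."""
    index = bisect_right(batch_starts, timestamp) - 1
    return batch_ids[index] if index >= 0 else None


# Attach batch context to every sensor reading.
contextualized = [
    {"time": t, "vibration": v, "batch": batch_at(t)} for t, v in readings
]

# With context attached, per-batch questions become simple to answer,
# for example the peak vibration seen during each batch.
peak_by_batch = {}
for row in contextualized:
    batch = row["batch"]
    peak_by_batch[batch] = max(peak_by_batch.get(batch, 0.0), row["vibration"])

print(peak_by_batch)
```

The lookup works the same way whether the readings live in the cloud and the batch records on premise or the reverse, as long as the analytics application can reach both sources.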
A third scenario is accessing multiple sites from a cloud deployment of analytics software. Although moving or copying the data to the cloud also could facilitate cross-plant comparisons for yields, quality, etc., a simple remote connection for occasional queries and comparisons may suffice, depending on the frequency and requirements of the end user.
Analytics is not new, and neither are the unrealized promises that have surrounded the field. But technical advancements, cloud computing and machine learning for instance, along with the massive explosion in data from sensors and other sources, have come together to create new opportunities. There is now reason to believe analytics will finally generate measurable value for process manufacturers by rapidly bringing shareable insights to light.