A formula for accelerating autonomous anomaly detection
Machine learning is becoming increasingly important in identifying anomalies and thus in improving the level of quality control autonomy in process industries. But training such systems to identify meaningful deviations from normal data is often challenging because real-world examples are scarce, writes ABB Review in its second issue of 2021.
To overcome this drawback, Corys and ABB have combined two simulation technologies to create an environment that generates data remarkably similar to that produced by specific processes in real industrial plants. This new level of simulation accuracy opens the door to tailor-made, targeted and accelerated anomaly detection capabilities.
Industrial facilities need to run as smoothly as possible. To do so, indications of potential problems, such as anomalous vibrations, temperatures, pressures, and sounds, need to be detected, identified, analyzed, and managed in their earliest stages. Anomaly detection, a key form of machine learning, can play a major role here by effectively supporting plant operators as they monitor the health of industrial systems.
Machine learning models, however, are typically trained using historical plant data. Because industrial systems are very robust, that data often contains too few examples of real failures to train reliable models. Moreover, even when failures did occur, they are often hard to find in the data because the operator did not label them as such, or because nobody noticed them at the time. As a result, a model trained on such data may mistakenly identify anomalous situations as normal.
Creating an infrastructure for machine learning research
To overcome these drawbacks, data scientists are using high-fidelity process simulators, such as the Indiss Plus Simulator from Corys, to train machine learning models on specific normal and abnormal plant situations, such as valve failures, with every such event correctly labeled. For instance, Corys and ABB have created an infrastructure for machine learning designed to explore the potential (as well as the data requirements) of different algorithms in a realistic setup. At the heart of the infrastructure are the simulation tools of the two companies: Corys’ process simulation Indiss Plus and ABB’s control system simulator 800xA Simulator.
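To illustrate why correctly labeled simulation data matters, the following minimal sketch fits a very simple detector on fault-free data and checks it against labeled samples. The z-score threshold is a stand-in chosen for brevity, not the algorithm used by Corys and ABB, and the flow readings, function names and threshold value are all hypothetical.

```python
from statistics import mean, stdev

def fit_normal_model(normal_values):
    """Estimate mean and standard deviation from fault-free simulated data."""
    return mean(normal_values), stdev(normal_values)

def is_anomalous(value, model, threshold=3.0):
    """Flag a sample that lies more than `threshold` standard deviations from the mean."""
    mu, sigma = model
    return abs(value - mu) > threshold * sigma

# Hypothetical simulated flow readings (arbitrary units) from a fault-free run.
normal_run = [50.1, 49.8, 50.3, 49.9, 50.0, 50.2, 49.7, 50.1]

# Hypothetical test samples; the True/False label comes from the simulation,
# which knows exactly when a failure (e.g. a valve leak) was active.
labeled_test = [(50.0, False), (49.9, False), (46.5, True), (45.8, True)]

model = fit_normal_model(normal_run)
predictions = [is_anomalous(value, model) for value, _ in labeled_test]
labels = [label for _, label in labeled_test]
correct = sum(p == l for p, l in zip(predictions, labels))

print(f"{correct} of {len(labels)} samples classified correctly")
```

Because the labels are known from the simulation rather than inferred from operator notes, the detector's hit rate can be measured directly, which is rarely possible with historical plant data alone.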
Individually, both tools have proven to be highly accurate in several operator training projects. Now, in a combined configuration, the tools can simulate the behavior of a process together with its associated automation system, for instance a real plant’s control logic, including alarms and safety logic.
A key advantage of Indiss Plus in this setup is that it also opens the door to simulating various plant equipment failures, e.g., a valve leakage. The resulting failure data can overcome the issue of not having enough failure cases to support machine learning.
To create simulation data sets suitable for the training and validation of a machine learning model, the execution of simulation experiments must be automated. In the present case, an experiment controller was developed. The experiment controller takes in an experiment plan describing when to perform various operator actions, such as setpoint changes, and when to trigger failures within the Indiss Plus process simulation. It performs batches of experiments, starting and stopping the process simulation from different initial process states and automatically performing operator actions. It also starts the data collection that receives data from an 800xA Simulator, making it possible to use ABB’s 800xA as a simulated control system with the same operator layout, views and control logic as in the plant. The data and a protocol of the actions performed by the experiment controller are stored in a time-series database and made available to a data scientist for the training of machine learning models.
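The workflow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `StubSimulator` is a stand-in and does not reflect the real Indiss Plus or 800xA Simulator interfaces, the tag names `FIC101.SP` and `XV201.leak` are invented, and in-memory lists take the place of the time-series database.

```python
from dataclasses import dataclass, field

@dataclass
class PlannedAction:
    time_s: float       # simulation time at which the action is due
    kind: str           # "setpoint" (operator action) or "failure"
    target: str         # hypothetical tag name
    value: float = 0.0  # new setpoint value; unused for failures

@dataclass
class StubSimulator:
    """Stand-in for the coupled process and control simulation (not a real API)."""
    time_s: float = 0.0
    setpoints: dict = field(default_factory=dict)
    failures: set = field(default_factory=set)

    def load_initial_state(self, state):
        self.time_s = 0.0
        self.setpoints = dict(state)
        self.failures = set()

    def step(self, dt):
        self.time_s += dt

    def sample(self):
        return {"time_s": self.time_s, **self.setpoints}

def run_experiment(sim, initial_state, plan, duration_s, dt=1.0):
    """Run one experiment; return the collected samples and the action protocol."""
    sim.load_initial_state(initial_state)
    pending = sorted(plan, key=lambda a: a.time_s)
    samples, protocol = [], []
    while sim.time_s < duration_s:
        # Apply every planned action that is due at the current simulation time.
        while pending and pending[0].time_s <= sim.time_s:
            action = pending.pop(0)
            if action.kind == "setpoint":
                sim.setpoints[action.target] = action.value
            elif action.kind == "failure":
                sim.failures.add(action.target)
            protocol.append((sim.time_s, action.kind, action.target, action.value))
        row = sim.sample()
        row["failure_active"] = bool(sim.failures)  # label derived from the plan
        samples.append(row)
        sim.step(dt)
    return samples, protocol

# One experiment plan, executed as a batch from two different initial states.
plan = [
    PlannedAction(1.0, "setpoint", "FIC101.SP", 55.0),
    PlannedAction(3.0, "failure", "XV201.leak"),
]
initial_states = [{"FIC101.SP": 50.0}, {"FIC101.SP": 60.0}]
results = [
    run_experiment(StubSimulator(), state, plan, duration_s=5.0)
    for state in initial_states
]

for samples, protocol in results:
    print([row["failure_active"] for row in samples], protocol)
```

Storing the protocol alongside the samples is what lets each data point be labeled afterwards: any sample recorded after a failure was triggered can be marked as abnormal without manual inspection.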