Anomaly Detection Service¶
The Anomaly Detection Service aims to automatically detect unexpected behaviour of processes and assets using time series data.
For a given asset and for a specified period, the user is notified if the asset behaves abnormally in any way. Using this information the user is able to monitor their assets (e.g. by setting notification/warning thresholds).
The service enables the user to run the following applications:
- Process & Condition Monitoring
- Early warning functionality
- Detection of fault conditions without explicit definition
To access this service, you need the respective roles listed in Analytics roles and scopes.
The Anomaly Detection Service uses a density-based clustering approach to detect anomalous behaviour. The clustering is performed during a training step on historical data and yields a model of the normal behaviour of the time series data.
This model is evaluated for a given set of data points by checking whether the data points belong to one of the clusters formed during model training. Data points belonging to a cluster are considered normal. Data points not belonging to a cluster are given a score that indicates their distance to the nearest cluster. The higher this score, the more likely it is that the data point is an anomaly.
In addition to one or multiple time series (original sensor data in IoT model format), the service uses a specific configuration as input for the signal calculation. The configuration information consists of the following parts:
Model Training (Clustering)¶
The Model Training clusters the historical training data (time series) and provides a model of normal behaviour of the process / asset. The clustering is done using the density-based algorithm DBSCAN (Ester, Martin, et al. "A density-based algorithm for discovering clusters in large spatial databases with noise.").
The algorithm uses the following parameters:
- Threshold for the distance to determine if a point belongs to the cluster
- Minimum cluster size
- Type of the distance measure algorithm (optional):
  - Euclidean (default)
- Human-friendly name of the model (not an empty string, maximum length is 255 characters, only ASCII characters, default: 'model').
If optional parameters are missing, defaults are used for executing DBSCAN.
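As an illustration of the parameter rules above, the following is a minimal sketch of a parameter container that applies the stated defaults and the naming constraints. The class name `TrainingParameters` and its field names are hypothetical and chosen for this example only; they are not the identifiers used by the service API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingParameters:
    """Hypothetical container for the training parameters described above."""

    epsilon: float                       # distance threshold for cluster membership
    min_cluster_size: int                # minimum cluster size
    distance_measure: str = "EUCLIDEAN"  # optional, Euclidean by default
    name: str = "model"                  # optional, human-friendly model name

    def __post_init__(self):
        # Name rules from the text: non-empty, at most 255 characters, ASCII only.
        if not self.name:
            raise ValueError("name must not be an empty string")
        if len(self.name) > 255:
            raise ValueError("name must not exceed 255 characters")
        if not self.name.isascii():
            raise ValueError("name must contain only ASCII characters")


# Only the required parameters are given, so the defaults apply.
params = TrainingParameters(epsilon=0.5, min_cluster_size=3)
print(params.distance_measure, params.name)
```

Rejecting an invalid name when the object is created, rather than when the request is sent, surfaces configuration mistakes before any training call is made.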
The clustering assigns each data instance a cluster ID to which it belongs, or a noise label if it does not belong to any cluster.
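To make the labelling concrete, here is a minimal, unoptimised sketch of DBSCAN with Euclidean distance. It is not the service's implementation. It assumes that the minimum cluster size counts the points inside a point's epsilon neighbourhood, including the point itself; the function name `dbscan` and the noise label `-1` are likewise choices made for this example.

```python
import math

NOISE = -1  # label for points that belong to no cluster (assumed convention)


def dbscan(points, epsilon, min_points):
    """Return one label per point: a cluster ID (0, 1, ...) or NOISE."""

    def neighbours(i):
        # Indices of all points within epsilon of point i, including i itself.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= epsilon]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_points:
            labels[i] = NOISE  # may still become a border point later
            continue
        cluster_id += 1  # point i is a core point: start a new cluster
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: reachable but not core
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            reachable = neighbours(j)
            if len(reachable) >= min_points:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


# Two dense groups and one isolated point.
training_data = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
                 (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
                 (9.0, 0.0)]
print(dbscan(training_data, epsilon=0.5, min_points=3))
# prints [0, 0, 0, 1, 1, 1, -1]
```

The neighbourhood search here compares every pair of points, which is adequate for a small illustration but scales quadratically with the number of data points.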
Models have a limited lifetime and are deleted automatically once it expires. The lifetime is at least 1 day.
Model Evaluation (Scoring)¶
The Model Evaluation determines whether a given set of data points is anomalous or not. This is done by calculating the distance of each data point to its nearest neighbour that is part of a cluster. If this distance is smaller than or equal to epsilon (the distance threshold used in model training), the data point in question is considered normal. Such data points are assigned a score of "0". In all other cases, the data point is assigned a score equal to the difference between the calculated distance and epsilon. The higher the score, the more likely it is that the data point is an anomaly.
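The scoring rule can be sketched in a few lines. This example assumes Euclidean distance and represents the model simply as the list of training points that were assigned to a cluster; the function name `anomaly_scores` and this model representation are assumptions made for illustration, not the service's internal format.

```python
import math


def anomaly_scores(clustered_points, epsilon, data_points):
    """Score each data point against the clustered training points.

    The score is 0.0 if the nearest clustered point lies within epsilon,
    otherwise the distance to that point minus epsilon.
    """
    if not clustered_points:
        raise ValueError("the model contains no clustered points")
    scores = []
    for p in data_points:
        nearest = min(math.dist(p, q) for q in clustered_points)
        scores.append(max(0.0, nearest - epsilon))
    return scores


clustered = [(1.0, 1.0), (1.0, 2.0)]
new_points = [(1.0, 1.5), (4.0, 1.0)]
print(anomaly_scores(clustered, epsilon=0.5, data_points=new_points))
# prints [0.0, 2.5]
```

The first point lies exactly epsilon away from its nearest clustered neighbour and is therefore treated as normal; the second is 3.0 away, which gives a score of 3.0 - 0.5 = 2.5.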
The Anomaly Detection Service exposes its API for realizing the following tasks:
- Train a model
- Get all existing models
- Get a model by ID
- Delete a model
- Detect anomalies of time series data
The manager of a brewery wants to detect some abnormal characteristics of the factory's production line.
The manager collects time series data of a relevant sensor of the production line using the IoT Time Series Service. They upload the data into MindSphere and use the Anomaly Detection Service to compare them with a previously trained model. The response from the Anomaly Detection Service provides candidates for anomalies.
Except where otherwise noted, content on this site is licensed under the MindSphere Development License Agreement.