Anomaly Detection Service
The Anomaly Detection Service aims to automatically detect unexpected behaviour of processes and assets using time series data.
For a given asset and a specified period of time, the user is notified whether the asset's behaviour is abnormal in any way. Using this information, the user can monitor their assets (e.g. by setting notification/warning thresholds).
The service enables the user to run the following applications:
- Process & Condition Monitoring
- Early warning functionality
- Detection of fault conditions without explicit definition
The Anomaly Detection Service uses a density-based clustering approach to detect anomalous behaviour. The clustering is performed during a training step on historical data and yields a model of the time series behaviour that is considered normal.
This model is evaluated for a given set of data points by checking whether the data points belong to one of the clusters formed during model training. Data points belonging to a cluster are considered normal. Data points not belonging to a cluster are given a score that indicates their distance to the nearest cluster. The higher this score, the more likely it is that the data point is an anomaly.
In addition to one or multiple time series (original sensor data in IoT model format) as input, the service uses a specific configuration for the signal calculation. The configuration information consists of the following parts:
Model Training (Clustering)
The model training clusters the historical training data (time series) and provides a model of the normal behaviour of the process or asset. The clustering is done using the density-based algorithm DBSCAN (Ester, Martin, et al. "A density-based algorithm for discovering clusters in large spatial databases with noise.").
The algorithm uses the following parameters:
- Threshold for the distance used to check whether a point belongs to a cluster (epsilon; optional, default: estimated)
- Minimum cluster size (points per cluster; optional, default: estimated)
- Type of the distance measure algorithm:
  - Euclidean (default)
If a parameter is missing, the defaults or estimated values are used for executing DBSCAN.
The clustering results in each data instance being assigned a cluster ID to which it belongs, or a noise label if it does not belong to any cluster.
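The clustering step described above can be illustrated with a minimal, pure-Python sketch of DBSCAN. This is not the service's implementation; the function names `region_query` and `dbscan`, the `NOISE` label value, and the sample data are assumptions made for illustration. The sketch also assumes that `eps` corresponds to the distance threshold, that `min_pts` corresponds to the minimum cluster size, and that the distance measure is Euclidean.

```python
import math

NOISE = -1  # assumed label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster ID (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


# Hypothetical two-dimensional training data: two dense groups and one outlier.
training = [
    (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (9.0, 1.0),
]
labels = dbscan(training, eps=0.5, min_pts=3)
print(labels)
```

In this sketch, the first four points form one cluster, the next three form a second cluster, and the isolated last point receives the noise label.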
Created models are not stored permanently. They are deleted automatically after a retention period of at least one day.
Model Evaluation (Scoring)
The model evaluation determines whether each point in a given set of data points is anomalous. This is done by calculating the distance of each data point to its nearest neighbour that is part of a cluster. If this distance is less than or equal to epsilon (the distance threshold used in model training), the data point is not anomalous and is assigned a score of "0". In all other cases, the data point is assigned a score equal to the difference between the calculated distance and epsilon. The higher the score, the more probable it is that the data point is an anomaly.
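The scoring rule can be written compactly as score = max(0, d - epsilon), where d is the distance to the nearest clustered point. The following is a minimal sketch of that rule, assuming Euclidean distance; the function name `anomaly_score` and the sample values are hypothetical and are not part of the service's API.

```python
import math


def anomaly_score(point, clustered_points, eps):
    """Score a point against the clustered (non-noise) training points.

    Returns 0.0 if the nearest clustered point is within eps,
    otherwise the distance to that point minus eps.
    """
    nearest = min(math.dist(point, c) for c in clustered_points)
    return max(0.0, nearest - eps)


# Hypothetical model: two clustered points and a distance threshold of 0.5.
clustered = [(0.0, 0.0), (1.0, 0.0)]
eps = 0.5

normal_score = anomaly_score((0.5, 0.0), clustered, eps)    # nearest distance 0.5
anomaly = anomaly_score((-3.0, 4.0), clustered, eps)        # nearest distance 5.0
print(normal_score, anomaly)
```

Here the first point lies exactly at the threshold and scores 0, while the second lies 5.0 away from its nearest clustered point and scores 4.5.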
The Anomaly Detection Service exposes an API for the following tasks:
- Train a model.
- Get all existing models.
- Get model by ID.
- Delete a model.
- Detect anomalies of time series data.
The manager of a brewery wants to detect abnormal behaviour on their production line.
Collect time series data of a relevant sensor of the production line using the IoT Time Series Service. Feed the Anomaly Detection Service API with this time series and evaluate the response.
- The production line is connected to MindSphere.
Except where otherwise noted, content on this site is licensed under the MindSphere Development License Agreement.