2019 Master Thesis: Anomaly Detection in Time Series Data with Neural Networks

Tags: Machine Learning, Deep Learning, Anomaly Detection, Time Series Data

1. Abstract
2. Content
3. Reference

1. Abstract

For the implementation of preventive maintenance in production environments, this thesis extends an anomaly detection approach whose performance is evaluated using time series data from processes of different complexity.

Sensor data of a wide range of process parameters are becoming increasingly available. For the processing of large amounts of data, the application of machine learning methods is state of the art. The absence of information about all potential error patterns of the process data limits the choice of compatible methods.

The anomaly detection method learns the normal process cycles from the time series data of successfully completed processes. Autoencoders, adapted structures of neural networks, learn to reconstruct these training data. The optimized reconstruction capability of normal processes is essential for the detection of deviating processes. Statistical methods, based on the reconstruction error distribution of normal and anomalous validation data, are used to further optimize the anomaly detection performance.

This extended approach is applied against different datasets. The procedure comprises data preprocessing, model design, model training and the evaluation and optimization of the anomaly detection performance.

The applied datasets confirm the practical applicability of the extended anomaly detection procedure. However, more complex process data reveal the limitations of simple autoencoder models.

2. Content

Preventive Maintenance (PvM):
Maintenance activities are only to be performed when actually needed. Unexpected breaks and unexploited equipment lifetime are minimized simultaneously
emft_predictive_maintenance

Anomaly Detection:
Anomaly Detection is the problem of finding patterns in data that do not follow the expected behavior. Anomalies are unusual data instances that should be further analyzed to identify the cause of its occurrence.
emft_anomaly_detection

Time Series Data:
Time Series Data in production environments is data from sensors sampled in a sequential order and usually recorded at regular intervals.
emft_time_series_data

Autoencoder:
Autoencoders are feedforward networks with sequential layers of neuronal units. The input data is to be reconstructed at the output layer. The information exchange is constrained by the specific network architecture. By using only unlabeled data, autoencoders learn the encoding of information independently. Hence, they are considered an unsupervised learning approach.
emft_autoencoder

Time Series Data Anomaly Detection:
an anomaly classifier is defined using statistical methods based on the reconstruction performance of the trained autoencoder model
emft_threshold
emft_threshold_confusion

Reconstruction Error: The trained autoencoder model is evaluated according to its reconstruction performance based on normal and anomalous input data
emft_reconstruction_error
emft_reconstruction_error_density

Optimum F-Beta Threshold: The optimum F-Beta threshold is determined using validation data applied to classify normal and anomalous test data
emft_optimum_threshold
emft_optimum_threshold_confusion
emft_performance_parameters

3. Reference

emft_zeugnis

back