Main Content

Acoustics-Based Machine Fault Recognition

In this example, you develop a deep learning model to detect faults in an air compressor using acoustic measurements. After developing the model, you package the system so that you can recognize faults based on streaming input data.

Data Preparation

Download and unzip the air compressor data set [1]. This data set consists of recordings from air compressors in a healthy state or one of seven faulty states.

downloadFolder = matlab.internal.examples.downloadSupportFile("audio","AirCompressorDataset/");
dataFolder = tempdir;
dataset = fullfile(dataFolder,"AirCompressorDataset");

Create an audioDatastore (Audio Toolbox) object to manage the data and split it into training and validation sets.

ads = audioDatastore(dataset,IncludeSubfolders=true);

You can reduce the training data set used in this example to speed up the runtime at the cost of performance. In general, reducing the data set is a good practice for development and debugging.

speedupExample = false;
if speedupExample
    lbls = folders2labels(ads.Files);
    idxs = splitlabels(lbls,0.2);

    ads = subset(ads,idxs{1});

The data labels are encoded in their containing folder name. To split the data into train and test sets, use folders2labels (Signal Processing Toolbox) and splitlabels (Signal Processing Toolbox).

lbls = folders2labels(ads.Files);
idxs = splitlabels(lbls,0.9);

adsTrain = subset(ads,idxs{1});
labelsTrain = lbls(idxs{1});
adsValidation = subset(ads,idxs{2});
labelsValidation = lbls(idxs{2});

Call countlabels (Signal Processing Toolbox) to inspect the distribution of labels in the train and validation sets.

ans=8×3 table
      Label      Count    Percent
    _________    _____    _______

    Bearing       203      12.5  
    Flywheel      203      12.5  
    Healthy       203      12.5  
    LIV           203      12.5  
    LOV           203      12.5  
    NRV           203      12.5  
    Piston        203      12.5  
    Riderbelt     203      12.5  

ans=8×3 table
      Label      Count    Percent
    _________    _____    _______

    Bearing       22       12.5  
    Flywheel      22       12.5  
    Healthy       22       12.5  
    LIV           22       12.5  
    LOV           22       12.5  
    NRV           22       12.5  
    Piston        22       12.5  
    Riderbelt     22       12.5  

The data consists of time-series recordings of acoustics from faulty or healthy air compressors. As such, there are strong relationships between samples in time. Listen to a recording and plot the waveform.

[sampleData,sampleDataInfo] = read(adsTrain);
fs = sampleDataInfo.SampleRate;

title("State: " + string(labelsTrain(1)))
axis tight

Because the samples are related in time, you can use a recurrent neural network (RNN) to model the data. A long short-term memory (LSTM) network is a popular choice of RNN because it is designed to avoid vanishing and exploding gradients. Before you can train the network, it's important to prepare the data adequately. Often, it is best to transform or extract features from 1-dimensional signal data in order to provide a richer set of features for the model to learn from.

Feature Engineering

The next step is to extract a set of acoustic features used as inputs to the network. Audio Toolbox™ enables you to extract spectral descriptors that are commonly used as inputs in machine learning tasks. You can extract the features using individual functions, or you can use audioFeatureExtractor (Audio Toolbox) to simplify the workflow and do it all at once. After feature extraction, orient time along rows which is the expected format for sequences in Deep Learning Toolbox™.

windowLength = 512;
overlapLength = 0;

afe = audioFeatureExtractor(SampleRate=fs, ...
    spectralCentroid=true, ...
    spectralCrest=true, ...
    spectralDecrease=true, ...
    spectralEntropy=true, ...
    spectralFlatness=true, ...
    spectralFlux=false, ...
    spectralKurtosis=true, ...
    spectralRolloffPoint=true, ...
    spectralSkewness=true, ...
    spectralSlope=true, ...

trainFeatures = extract(afe,adsTrain);
trainFeatures = cellfun(@(x)x',trainFeatures,UniformOutput=false);
disp("Feature extraction of train set took " + toc + " seconds.");
Feature extraction of train set took 13.2827 seconds.

Data Augmentation

The training set contains a relatively small number of acoustic recordings for training a deep learning model. A popular method to enlarge the data set is to use mixup. In mixup, you augment your dataset by mixing the features and labels from two different class instances. Mixup was reformulated by [2] as labels drawn from a probability distribution instead of mixed labels. The supporting function, mixup, takes the training features, associated labels, and the number of mixes per observation and then outputs the mixes and associated labels.

numMixesPerInstance = 2;
[augData,augLabels] = mixup(trainFeatures,labelsTrain,numMixesPerInstance);

trainLabels = cat(1,labelsTrain,augLabels);
trainFeatures = cat(2,trainFeatures',augData);
disp("Feature augmentation of train set took " + toc + " seconds.");
Feature augmentation of train set took 0.12886 seconds.

Generate Validation Features

Repeat the feature extraction for the validation features.

featuresValidation = extract(afe,adsValidation);
featuresValidation = cellfun(@(x)x',featuresValidation,UniformOutput=false);
disp("Feature extraction of validation set took " + toc + " seconds.");
Feature extraction of validation set took 1.4529 seconds.

Train Model

Next, you define and train a network. To skip training the network, set downloadPretrainedSystem to true, then continue to the next section.

downloadPretrainedSystem = false;
if downloadPretrainedSystem
    downloadFolder = matlab.internal.examples.downloadSupportFile("audio","AcousticsBasedMachineFaultRecognition/");
    dataFolder = tempdir;
    netFolder = fullfile(dataFolder,"AcousticsBasedMachineFaultRecognition");


Define Network

An LSTM layer learns long-term dependencies between time steps of time series or sequence data. The first lstmLayer has 100 hidden units and outputs sequence data. Then a dropout layer is used to reduce overfitting. The second lstmLayer outputs the last step of the time sequence.

numHiddenUnits = 100;
dropProb = 0.2;
layers = [ ...

Define Network Hyperparameters

To define hyperparameters for the network, use trainingOptions.

miniBatchSize = 128;
validationFrequency = floor(numel(trainFeatures)/miniBatchSize);
options = trainingOptions("adam", ...
    MiniBatchSize=miniBatchSize, ...
    MaxEpochs=35, ...
    Plots="training-progress", ...
    Verbose=false, ...
    Shuffle="every-epoch", ...
    LearnRateSchedule="piecewise", ...
    LearnRateDropPeriod=30, ...
    LearnRateDropFactor=0.1, ...
    ValidationData={featuresValidation,labelsValidation}, ...

Train Network

To train the network, use trainNetwork.

airCompNet = trainNetwork(trainFeatures,trainLabels,layers,options);

Evaluate Network

View the confusion chart for the validation data.

validationResults = classify(airCompNet,featuresValidation);
confusionchart(validationResults,labelsValidation, ...
    Title="Accuracy: " + mean(validationResults == labelsValidation)*100 + " (%)");

Model Streaming Detection

Create Functions to Process Data in a Streaming Loop

Once you have a trained network with satisfactory performance, you can apply the network to test data in a streaming fashion.

There are many additional considerations to take into account to make the system work in a real-world embedded system.

For example,

  • The rate or interval at which classification can be performed with accurate results

  • The size of the network in terms of generated code (program memory) and weights (data memory)

  • The efficiency of the network in terms of computation speed

In MATLAB, you can mimic how the network is deployed and used in hardware on a real embedded system and begin to answer these important questions.

Create MATLAB Function Compatible with C/C++ Code Generation

Once you train your deep learning model, you will deploy it to an embedded target. That means you also need to deploy the code used to perform the feature extraction. Use the generateMATLABFunction method of audioFeatureExtractor to generate a MATLAB function compatible with C/C++ code generation. Specify IsStreaming as true so that the generated function is optimized for stream processing.

filename = fullfile(pwd,"extractAudioFeatures");

Combine Streaming Feature Extraction and Classification

Save the trained network as a MAT file.


Create a function that combines the feature extraction and deep learning classification.

type recognizeAirCompressorFault.m
function scores = recognizeAirCompressorFault(audioIn,rs)
% This is a streaming classifier function 

persistent airCompNet

if isempty(airCompNet)
    airCompNet = coder.loadDeepLearningNetwork('AirCompressorFaultRecognitionModel.mat');
if rs
    airCompNet = resetState(airCompNet);

% Extract features using function
features = extractAudioFeatures(audioIn);

% Classify
[airCompNet,scores] = predictAndUpdateState(airCompNet,features);


Test Streaming Loop

Next, you test the streaming classifier in MATLAB. Stream audio one frame at a time to represent a system as it would be deployed in a real-time embedded system. This enables you to measure and visualize the timing and accuracy of the streaming implementation.

Stream in several audio files and plot the output classification results for each frame of data. At a time interval equal to the length of each file, evaluate the output of the classifier.


N = 10;
labels = categories(labelsValidation);
numLabels = numel(labels);

% Create a dsp.AsyncBuffer to read audio in a streaming fashion
audioSource = dsp.AsyncBuffer;

% Create a dsp.AsyncBuffer to accumulate scores
scoreBuffer = dsp.AsyncBuffer;

% Create a dsp.AsyncBuffer to record execution time.
timingBuffer = dsp.AsyncBuffer;

% Pre-allocate array to store results
streamingResults = categorical(zeros(N,1));

% Loop over files
for fileIdx = 1:N

    % Read one audio file and put it in the source buffer
    [data,dataInfo] = read(adsValidation);

    % Inner loop over frames
    rs = true;
    while audioSource.NumUnreadSamples >= windowLength

        % Get a frame of audio data
        x = read(audioSource,windowLength);

        % Apply streaming classifier function
        score = recognizeAirCompressorFault(x,rs);

        % Store score for analysis

        rs = false;

    % Store class result for that file
    scores = read(scoreBuffer);
    [~,result] = max(scores(end,:),[],2);
    streamingResults(fileIdx) = categorical(labels(result));

    % Plot scores to compare over time
    plot(scores) %#ok<*NASGU>
    xlabel("Time Step")
    title(["Known Label = " + string(labelsValidation(fileIdx)),"Predicted Label = " + string(streamingResults(fileIdx))])

Compare the test results for the streaming version of the classifier and the non-streaming.

testError = mean(validationResults(1:N) ~= streamingResults);
disp("Error between streaming classifier and non-streaming: " + testError*100 + " (%)")
Error between streaming classifier and non-streaming: 80 (%)

Analyze the execution time. The execution time when state is reset is often above the 32 ms budget. However, in a real, deployed system, that initialization time will only be incurred once. The execution time of the main loop is around 10 ms, which is well below the 32 ms budget for real-time performance.

executionTime = read(timingBuffer)*1000;
budget = (windowLength/afe.SampleRate)*1000;
title("Execution Time Per Frame")
xlabel("Frame Number")
ylabel("Time (ms)")

Supporting Functions

function [augData,augLabels] = mixup(data,labels,numMixesPerInstance)
augData = cell(1,numel(data)*numMixesPerInstance);
augLabels = repelem(labels,numMixesPerInstance);

kk = 1;
for ii = 1:numel(data)
    for jj = 1:numMixesPerInstance
        lambda = max(min((randn./10)+0.5,1),0);
        % Find all available data with different labels.
        availableData = find(labels~=labels(ii));

        % Randomly choose one of the available data with a different label.
        numAvailableData = numel(availableData);
        idx = randi([1,numAvailableData]);

        % Mix.
        augData{kk} = lambda*data{ii} + (1-lambda)*data{availableData(idx)};

        % Specify the label as randomly set by lambda.
        if lambda < rand
            augLabels(kk) = labels(availableData(idx));
            augLabels(kk) = labels(ii);
        kk = kk + 1;



[1] Verma, Nishchal K., et al. "Intelligent Condition Based Monitoring Using Acoustic Signals for Air Compressors." IEEE Transactions on Reliability, vol. 65, no. 1, Mar. 2016, pp. 291–309. (Crossref), doi:10.1109/TR.2015.2459684.

[2] Huszar, Ferenc. "Mixup: Data-Dependent Data Augmentation." InFERENCe. November 03, 2017. Accessed January 15, 2019.