Accelerate Signal Feature Extraction and Classification Using a GPU

This example uses signal feature extraction objects to extract multidomain features that can be used to identify faulty bearing signals in mechanical systems. Feature extraction objects enable the computation of multiple features in an efficient way by reducing the number of times that signals are transformed into a particular domain. The example compares feature extraction time while running on:

  • An Intel® Xeon® Gold 5218 CPU @ 2.30GHz CPU worker

  • A single NVIDIA® A100-PCIE-40GB graphics processing unit (GPU)

The acceleration results may vary based on the available hardware resources.

To learn the feature extraction and model training workflow, see Machine Learning and Deep Learning Classification Using Signal Feature Extraction Objects. To learn how to extract features and train models in parallel using a parallel pool of workers, see Accelerate Signal Feature Extraction and Classification Using a Parallel Pool of Workers.

Introduction

This example extends the Machine Learning and Deep Learning Classification Using Signal Feature Extraction Objects example by showing how to compute features and train models using a GPU. Visit that example to read the details of the problem and the dataset.

Download and Prepare the Data

The data set contains acceleration signals collected from rotating machines in a bearing test rig and from real-world machines such as an oil pump bearing, an intermediate speed bearing, and a planet bearing. There are 34 files in total. The signals in the files are sampled at fs = 25 Hz. The file names describe the signals they contain:

  • healthy.mat Healthy signals

  • innerfault.mat Signals with inner race faults

  • outerfault.mat Signals with outer race faults

Download the data files into your temporary directory, whose location is specified by the tempdir command in MATLAB®. If you want to place the data files in a folder different from tempdir, change the directory name in the subsequent instructions.

dataURL = 'https://www.mathworks.com/supportfiles/SPT/data/rollingBearingDataset.zip';
datasetFolder = fullfile(tempdir,'rollingBearingDataset');
zipFile = fullfile(tempdir,'rollingBearingDataset.zip');
if ~exist(datasetFolder,'dir')
    websave(zipFile,dataURL);
    unzip(zipFile,datasetFolder);
end

Create a signalDatastore object to access the data in the files and obtain the labels. Use single-precision arithmetic in the feature extraction and model training steps to reduce memory requirements and computational time.

sds = signalDatastore(datasetFolder,OutputDataType="single");

The dataset filenames contain the label name. Get a list of labels from the filenames in the datastore using the filenames2labels function.

labels = filenames2labels(sds,ExtractBefore='_');

% Shorten labels for better display
labels = getShortenedLabels(labels);

To accelerate subsequent feature extraction computations using a GPU, create a signalDatastore that returns the variables stored in the files as gpuArray objects. A gpuArray object represents an array stored in GPU memory. Create a datastore that returns gpuArray objects with single-precision data.

sdsGPU = signalDatastore(datasetFolder,OutputDataType="single",OutputEnvironment="gpu");
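As a small illustration of what working with gpuArray data looks like, the following sketch moves a synthetic single-precision signal to the GPU, computes on it, and copies the result back. The signal x and the variable names are hypothetical and are not part of the example data. The canUseGPU guard is added here so that the sketch also runs on machines without a supported GPU; the GPU branch assumes Parallel Computing Toolbox is installed.

```matlab
% Hypothetical single-precision signal in host memory.
x = single(randn(1000,1));

if canUseGPU
    xg = gpuArray(x);                 % copy the data to GPU memory
    yg = abs(fft(xg));                % computed on the GPU; the result stays there
    onGPU = isgpuarray(yg);           % true for arrays stored on the GPU
    dataType = underlyingType(yg);    % underlying data type of the gpuArray
    y = gather(yg);                   % copy the result back to host memory
else
    y = abs(fft(x));                  % CPU fallback when no supported GPU is found
end
```

Functions that accept gpuArray inputs generally return gpuArray outputs, so data stays on the GPU until you call gather.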

Set Up Feature Extraction Objects

In this section you set up the feature extractors that extract multidomain features from the signals. These features will be used to implement machine learning and deep learning solutions to classify signals as healthy, as having inner race faults, or as having outer race faults [3]. Use the signalTimeFeatureExtractor, signalFrequencyFeatureExtractor, and signalTimeFrequencyFeatureExtractor objects to extract features from all the signals.

  • For time domain, use root-mean-square value, impulse factor, standard deviation, and clearance factor as features.

  • For frequency domain, use median frequency, band power, power bandwidth, and peak amplitude of the power spectral density (PSD) as features.

  • For time-frequency domain, use these features from the signal spectrogram: spectral kurtosis [4], spectral skewness, spectral flatness, and time-frequency ridges [5]. Additionally, use the scale-averaged wavelet scalogram as a feature.
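To make the time-domain features in the list above more concrete, the following sketch computes them directly for a synthetic signal. It uses the conventional textbook definitions (impulse factor as the peak divided by the mean absolute value, clearance factor as the peak divided by the squared mean of the square-rooted absolute values). The test signal and all variable names are hypothetical, and the feature extractor objects may differ in implementation details.

```matlab
% Hypothetical test signal: a 2 Hz sine wave sampled at 25 Hz.
fs = 25;
t = (0:999)'/fs;
x = single(sin(2*pi*2*t));

peakValue = max(abs(x));                            % largest absolute amplitude
rmsValue = sqrt(mean(x.^2));                        % root-mean-square value
stdValue = std(x);                                  % standard deviation
impulseFactor = peakValue/mean(abs(x));             % peak relative to mean absolute value
clearanceFactor = peakValue/mean(sqrt(abs(x)))^2;   % peak relative to squared mean root amplitude

% Collect the features in the same order as the list above.
features = [rmsValue impulseFactor stdValue clearanceFactor]
```

Impulse factor and clearance factor both grow when a signal contains short, high-amplitude spikes, which is why they are common indicators of bearing faults.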

Create a signalTimeFeatureExtractor that extracts the time-domain features.

timeFE = signalTimeFeatureExtractor(SampleRate=25,...
    RMS=true, ...
    ImpulseFactor=true, ...
    StandardDeviation=true, ...
    ClearanceFactor=true);

Create a signalFrequencyFeatureExtractor that extracts the frequency-domain features.

freqFE = signalFrequencyFeatureExtractor(SampleRate=25, ...
    MedianFrequency=true, ...
    BandPower=true, ...
    PowerBandwidth=true, ...
    PeakAmplitude=true);

Create a signalTimeFrequencyFeatureExtractor that extracts the time-frequency domain features. To extract the time-frequency features, use a spectrogram with 90% leakage.

timeFreqFE = signalTimeFrequencyFeatureExtractor(SampleRate=25, ...
    SpectralKurtosis=true, ...
    SpectralSkewness=true, ...
    SpectralFlatness=true, ...
    TFRidges=true, ...
    ScaleSpectrum=true);

setExtractorParameters(timeFreqFE,"spectrogram",Leakage=0.9);

Train an SVM Classifier Using Multidomain Features

Extract Multidomain Features

In this section you extract multidomain features using a CPU and a GPU and compare computational time.

Extract features using the CPU and measure the computation time.

tstart = tic;
SVMCPUFeatures = cellfun(@(a,b,c) [real(a) real(b) real(c)],extract(timeFE,sds),extract(freqFE,sds),...
    extract(timeFreqFE,sds),UniformOutput=false);
tCPU = toc(tstart);

Accelerate the feature extraction process using a GPU by setting the sdsGPU datastore as input to the extract methods of the feature extractors. Recall that the OutputEnvironment property of this datastore was set to "gpu" for this purpose. Measure the computation time.

device = gpuDevice
device = 
  CUDADevice with properties:

                 Name: 'NVIDIA A100-PCIE-40GB'
                Index: 1 (of 1)
    ComputeCapability: '8.0'
          DriverModel: 'TCC'
          TotalMemory: 42635952128 (42.64 GB)
      AvailableMemory: 41021417016 (41.02 GB)
      DeviceAvailable: true
       DeviceSelected: true

  Show all properties.

tstart = tic;
SVMGPUFeatures = cellfun(@(a,b,c) [real(a) real(b) real(c)],extract(timeFE,sdsGPU),...
    extract(freqFE,sdsGPU),extract(timeFreqFE,sdsGPU),UniformOutput=false);
wait(device)
tGPU = toc(tstart);

bar(["CPU" "GPU"],[tCPU tGPU],0.8,FontSize=12,...
    Labels=["" num2str((tCPU/tGPU),2)+ "X faster"])
title("Speed-up in Feature Extraction Using CPU vs. GPU")
ylabel("Run Time (seconds)")

[Figure: bar chart titled "Speed-up in Feature Extraction Using CPU vs. GPU"; y-axis: Run Time (seconds).]
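The wait(device) call in the timing code matters because GPU operations can return control to MATLAB before the computation has finished, so stopping the timer without waiting can underestimate the GPU run time. As an alternative way to time a single GPU function call, the sketch below uses timeit and gputimeit, which handle warm-up, repetition, and synchronization. The FFT workload and variable names are hypothetical and are not part of this example; the canUseGPU guard is added so that the sketch also runs without a supported GPU.

```matlab
% Hypothetical workload: column-wise FFT of a single-precision matrix.
x = rand(2^12,2^8,"single");

% Time the CPU computation.
tCPUdemo = timeit(@() fft(x));

if canUseGPU
    xg = gpuArray(x);
    % gputimeit waits for the GPU to finish before it stops timing.
    tGPUdemo = gputimeit(@() fft(xg));
    fprintf("CPU: %.4f s, GPU: %.4f s, speedup: %.1fx\n", ...
        tCPUdemo,tGPUdemo,tCPUdemo/tGPUdemo)
else
    fprintf("CPU: %.4f s (no supported GPU found)\n",tCPUdemo)
end
```

For small workloads such as this one, data transfer and launch overhead can dominate, so the measured speedup may be modest or even below one.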

Train an SVM Classifier Model

Obtain an in-memory feature matrix and use it to train a multiclass SVM classifier. Split the feature matrix into training and testing feature data sets and obtain their corresponding labels. Reset the random number generator for reproducible results.

featureMatrixCPU = cell2mat(SVMCPUFeatures);

rng default
cvp = cvpartition(labels,Holdout=0.25);

trainingResponse = labels(cvp.training,:);
trainingPredictors = featureMatrixCPU(cvp.training,:);

testResponse = labels(cvp.test,:);
testPredictors = featureMatrixCPU(cvp.test,:);
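To see what the holdout partition does, the following sketch applies the same cvpartition call to a small set of synthetic labels and counts the classes in each subset. The labels and variable names are hypothetical. When you pass grouping labels, cvpartition stratifies the holdout split by default, so the class proportions in each subset roughly follow those of the full set.

```matlab
% Hypothetical labels: 20 healthy, 10 inner-race, and 10 outer-race signals.
demoLabels = categorical([repmat("Healthy",20,1); ...
    repmat("InnerRace",10,1); repmat("OuterRace",10,1)]);

rng default                                  % reproducible partition
demoCvp = cvpartition(demoLabels,Holdout=0.25);

trainMask = training(demoCvp);               % logical index of training observations
testMask = test(demoCvp);                    % logical index of test observations

% Class counts in each subset, in the order of categories(demoLabels).
trainCounts = countcats(demoLabels(trainMask))
testCounts = countcats(demoLabels(testMask))
```

Every observation belongs to exactly one of the two subsets, so the two logical masks are complementary.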

Train a multiclass SVM classifier model using the in-memory training feature matrix and its corresponding labels.

tStart = tic;
SVMModel = fitcecoc(trainingPredictors,trainingResponse);
tTrainingCPU = toc(tStart);

Accelerate the SVM training process by using GPU features. Compare the time it takes to train the model using the CPU and GPU.

featureMatrixGPU = vertcat(SVMGPUFeatures{:});

tStart = tic;
fitcecoc(featureMatrixGPU(cvp.training,:), trainingResponse);
wait(device)
tTrainingGPU = toc(tStart);
 
bar(["CPU" "GPU"],[tTrainingCPU tTrainingGPU],0.8,FontSize=12,...
    Labels=["" num2str((tTrainingCPU/tTrainingGPU),2)+"X faster"])
title("Speed-up in SVM Model Training Using CPU vs. GPU")
ylabel("Run Time (seconds)")

[Figure: bar chart titled "Speed-up in SVM Model Training Using CPU vs. GPU"; y-axis: Run Time (seconds).]

Use the trained SVM classifier and in-memory test features to observe the classifier accuracy.

predictedLabels = predict(SVMModel, testPredictors);

figure
confusionchart(testResponse,predictedLabels,...
    ColumnSummary="column-normalized",RowSummary="row-normalized");

[Figure: confusion chart of true versus predicted labels for the SVM classifier.]

Train an LSTM Network Using Features

Set Up Feature Extraction Objects for Training an LSTM Network

Each signal in the signalDatastore object sds has around 150,000 samples. Window each signal into 1000-sample signal frames and extract multidomain features from each frame using all three feature extractors. To window the signals, set the FrameSize for all three feature extractors to 1000.

timeFE.FrameSize = 1000;
freqFE.FrameSize = 1000;
timeFreqFE.FrameSize = 1000;
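The following sketch illustrates the idea behind framing: a 150,000-sample signal split into 1000-sample frames yields a sequence of 150 feature values per feature. It assumes non-overlapping frames and drops any incomplete trailing frame; these framing choices, the random test signal, and the variable names are assumptions for illustration and do not reproduce the internal behavior of the feature extractors.

```matlab
% Hypothetical signal with 150,000 samples.
x = randn(150000,1,"single");
frameSize = 1000;

% Assumption: non-overlapping frames; an incomplete trailing frame is dropped.
numFrames = floor(numel(x)/frameSize);
frames = reshape(x(1:numFrames*frameSize),frameSize,numFrames);

% One feature value per frame (here, the RMS value) gives a feature sequence
% that is much shorter than the original signal.
rmsSequence = sqrt(mean(frames.^2,1))';
size(rmsSequence)
```

Each column of frames holds one frame, so computing a feature along the first dimension produces one value per frame.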

Features extracted from frames correspond to a sequence of features over time that have lower dimension than the original signal. The dimension reduction helps the LSTM network to train faster. The workflow follows these steps:

  • Split the signals in the signalDatastore object into frames.

  • For each signal, extract the features from all three domains and concatenate them.

  • Split the signal datastore into training and test datastores. Get the labels for each set.

  • Train the recurrent deep learning network using the labels and feature matrices.

  • Classify the signals using the trained network.

Split the labels into training and testing sets. Use 70% of the labels for the training set and the remaining 30% for the testing set. Use splitlabels to obtain the desired partition of the labels. This guarantees that each split data set contains label proportions similar to those of the entire data set. Obtain the corresponding datastore subsets from the signalDatastore. Reset the random number generator for reproducible results.

rng default

splitIndices = splitlabels(labels,0.7,"randomized");

trainIdx = splitIndices{1};
trainLabels = labels(trainIdx);

Extract Multidomain Features

Obtain a subset of the files in sds and extract multidomain in-memory training features.

trainDsCPU = subset(sds,trainIdx); 

tStart = tic;
trainCPUFeatures = cellfun(@(a,b,c) [real(a) real(b) real(c)],extract(timeFE,trainDsCPU),...
    extract(freqFE,trainDsCPU),extract(timeFreqFE,trainDsCPU),UniformOutput=false);
tCPU = toc(tStart);

Similarly, obtain a subset of the files in the signalDatastore, sdsGPU, to extract training features using the GPU. Compare the computational time needed to extract features on the CPU and on the GPU.

trainDsGPU = subset(sdsGPU,trainIdx); 

tStart = tic;
trainGPUFeatures = cellfun(@(a,b,c) [real(a) real(b) real(c)],extract(timeFE,trainDsGPU),...
    extract(freqFE,trainDsGPU),extract(timeFreqFE,trainDsGPU),UniformOutput=false);
wait(device)
tGPU = toc(tStart);

Plot the time it takes to extract the training features using the CPU and the GPU.

bar(["CPU" "GPU"],[tCPU tGPU],0.8,FontSize = 12,...
    Labels=["" num2str((tCPU/tGPU),2)+ "X faster"])
title("Speed-up in Feature Extraction for LSTM Network Using CPU vs. GPU")
ylabel("Run Time (seconds)")

[Figure: bar chart titled "Speed-up in Feature Extraction for LSTM Network Using CPU vs. GPU"; y-axis: Run Time (seconds).]

Train an LSTM Network

Train an LSTM network using the training features and their corresponding labels.

numFeatures = size(trainCPUFeatures{1},2);
numClasses = 3;
 
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(50,OutputMode="last")
    fullyConnectedLayer(numClasses)
    softmaxLayer];

options = trainingOptions("adam", ...
    Shuffle="every-epoch", ...    
    Plots="training-progress", ...
    ExecutionEnvironment="cpu", ...
    MaxEpochs=100, ...
    Verbose=false);

tStart = tic;
netCPU = trainnet(trainCPUFeatures,trainLabels,layers,"crossentropy",options);

tTrainingCPU = toc(tStart);

Accelerate the training process by setting ExecutionEnvironment to "gpu" in the trainingOptions for the network. Compare the time it takes to train the network using the CPU and GPU.

options.ExecutionEnvironment = "gpu";

tStart = tic;
netGPU = trainnet(trainGPUFeatures,trainLabels,layers,"crossentropy",options);

wait(device)
tTrainingGPU = toc(tStart);

bar(["CPU" "GPU"],[tTrainingCPU tTrainingGPU],0.8,FontSize=12,...
    Labels=["" num2str((tTrainingCPU/tTrainingGPU),2)+"X faster"])
title("Speed-up in LSTM Network Training Using CPU vs. GPU")
ylabel("Run Time (seconds)")

[Figure: bar chart titled "Speed-up in LSTM Network Training Using CPU vs. GPU"; y-axis: Run Time (seconds).]

Obtain the testing signalDatastore subset from sds and extract the multidomain test features for the signals in it.

testIdx = splitIndices{2};
testDs = subset(sds,testIdx);
testLabels = labels(testIdx);

testFeatures = cellfun(@(a,b,c) [real(a) real(b) real(c)],extract(timeFE,testDs),...
    extract(freqFE,testDs),extract(timeFreqFE,testDs),UniformOutput=false);

Use the trained network to classify the signals in the test dataset and analyze the accuracy of the network.

scores = minibatchpredict(netGPU,testFeatures);
classNames = categories(labels);
predTest = scores2label(scores,classNames);
 
figure
cm = confusionchart(testLabels,predTest,...
    ColumnSummary="column-normalized",RowSummary="row-normalized");

[Figure: confusion chart of true versus predicted labels for the LSTM network.]
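To complement the confusion chart with numbers, the following sketch computes the overall accuracy and the per-class recall from true and predicted labels. The row-normalized summary of a confusion chart corresponds to per-class recall. The six labels below are hypothetical values chosen for illustration and are not results from this example; with the variables from this example, you would pass testLabels and predTest instead.

```matlab
% Hypothetical true and predicted labels for six test signals.
trueDemo = categorical(["Healthy";"Healthy";"InnerRace"; ...
    "InnerRace";"OuterRace";"OuterRace"]);
predDemo = categorical(["Healthy";"Healthy";"InnerRace"; ...
    "OuterRace";"OuterRace";"OuterRace"]);

% Overall accuracy: fraction of signals whose predicted label matches the true label.
accuracy = mean(predDemo == trueDemo);

% Confusion matrix counts: rows are true classes, columns are predicted classes.
[counts,classOrder] = confusionmat(trueDemo,predDemo);

% Per-class recall: correct predictions divided by the number of true members.
recall = diag(counts)./sum(counts,2);
```

In this hypothetical case, one inner-race signal is misclassified as an outer-race fault, so five of the six predictions are correct and the recall for the inner-race class is 0.5.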

Summary

This example shows how feature extraction, SVM model training, and LSTM network training can be accelerated using a GPU. Running your code on a GPU is straightforward and can provide a significant speedup for many workflows. Generally, using a GPU is more beneficial when you are performing computations on larger amounts of data, though the speedup you can achieve depends on your specific hardware and code. To observe the performance acceleration for feature extraction and model training when a parallel pool of CPU workers is used, visit the Accelerate Signal Feature Extraction and Classification Using a Parallel Pool of Workers example.

References

[1] Cheng, Cheng, Guijun Ma, Yong Zhang, Mingyang Sun, Fei Teng, Han Ding, and Ye Yuan. “A Deep Learning-Based Remaining Useful Life Prediction Approach for Bearings.” IEEE/ASME Transactions on Mechatronics 25, no. 3 (June 2020): 1243–54. https://doi.org/10.1109/TMECH.2020.2971503

[2] Riaz, Saleem, Hassan Elahi, Kashif Javaid, and Tufail Shahzad. “Vibration Feature Extraction and Analysis for Fault Diagnosis of Rotating Machinery - A Literature Survey.” Asia Pacific Journal of Multidisciplinary Research 5, no. 1 (2017): 103–110.

[3] Caesarendra, Wahyu, and Tegoeh Tjahjowidodo. “A Review of Feature Extraction Methods in Vibration-Based Condition Monitoring and Its Application for Degradation Trend Estimation of Low-Speed Slew Bearing.” Machines 5, no. 4 (December 2017): 21. https://doi.org/10.3390/machines5040021

[4] Tian, Jing, Carlos Morillo, Michael H. Azarian, and Michael Pecht. “Motor Bearing Fault Detection Using Spectral Kurtosis-Based Feature Extraction Coupled With K-Nearest Neighbor Distance Analysis.” IEEE Transactions on Industrial Electronics 63, no. 3 (March 2016): 1793–1803. https://doi.org/10.1109/TIE.2015.2509913

[5] Li, Yifan, Xin Zhang, Zaigang Chen, Yaocheng Yang, Changqing Geng, and Ming J. Zuo. “Time-Frequency Ridge Estimation: An Effective Tool for Gear and Bearing Fault Diagnosis at Time-Varying Speeds.” Mechanical Systems and Signal Processing 189 (April 2023): 110108. https://doi.org/10.1016/j.ymssp.2023.110108

Helper Function

getShortenedLabels – This function shortens the labels for better display on confusion charts.

function shortenedLabels = getShortenedLabels(labels)
%   This function is only intended to support examples in the Signal
%   Processing Toolbox. It may be changed or removed in a future release

% Preallocate an empty string array with the same size as labels
shortenedLabels = strings(size(labels));
labels = string(labels);

for idx = 1: numel(labels)
    if labels(idx) == "HealthySignal"
        str2erase = "Signal";
    else
        str2erase = "Fault";
    end
    shortenedLabels(idx) = erase(labels(idx),str2erase);
end

shortenedLabels = categorical(shortenedLabels);
shortenedLabels = shortenedLabels(:);
end
