Run Sequence-to-Sequence Regression on FPGAs

This example shows how to create, compile, and deploy a long short-term memory (LSTM) network trained on remaining useful life (RUL) of engines. Use the deployed network to predict the RUL for an engine. Use MATLAB® to retrieve the prediction results from the target device.

This example uses the turbofan engine degradation data set described in [1]. The LSTM network predicts the remaining useful life of an engine, measured in cycles, from time series data representing various sensors in the engine. The training data contains simulated time series data for 100 engines. Each sequence varies in length and corresponds to a full run-to-failure (RTF) instance. The test data contains 100 partial sequences and the corresponding values for the remaining useful life at the end of each sequence.

The data set contains 100 training observations and 100 test observations.

To learn more about how to train this network, see Sequence-to-Sequence Regression Using Deep Learning. For this example, you must have a Xilinx® Zynq® UltraScale+™ ZCU102 SoC development kit.

Download Data

Download and unzip the turbofan engine degradation simulation data set.

Each time series in the turbofan engine degradation simulation data set represents a different engine. Each engine starts with unknown degrees of initial wear and manufacturing variation. The engine operates normally at the start of each time series, and develops a fault at some point during the series. In the training set, the fault grows in magnitude until system failure.

The data contains ZIP-compressed text files with 26 columns of numbers, separated by spaces. Each row is a snapshot of data taken during a single operational cycle, and each column is a different variable. The columns correspond to the following:

  • Column 1 — Unit number

  • Column 2 — Time in cycles

  • Columns 3–5 — Operational settings

  • Columns 6–26 — Sensor measurements 1–21
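
As a rough illustration of this layout, the sketch below builds a small made-up matrix with the same 26 columns and splits it into one sequence per engine, with features in rows and time steps in columns. The variable names (raw, X) and the toy values are hypothetical, and the helper functions attached to this example may organize the data differently.

```matlab
% Toy matrix in the 26-column layout described above (values are made up).
% Engine 1 runs for 3 cycles, engine 2 for 2 cycles.
raw = [1 1; 1 2; 1 3; 2 1; 2 2];        % columns 1-2: unit number, time in cycles
raw = [raw, zeros(5,3), rand(5,21)];    % columns 3-5: settings, 6-26: sensors

unitNumber = raw(:,1);
units = unique(unitNumber);
X = cell(numel(units),1);
for k = 1:numel(units)
    rows = unitNumber == units(k);
    % Keep columns 3-26 and transpose so that rows are features and
    % columns are time steps.
    X{k} = raw(rows,3:26)';
end

disp(size(X{1}))   % feature count by number of cycles for engine 1
```

Each cell then holds a 24-row matrix (3 operational settings plus 21 sensors) whose width equals the number of cycles recorded for that engine.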

Create a directory to store the turbofan engine degradation simulation data set.

dataFolder = fullfile(tempdir,"turbofan");
if ~exist(dataFolder,'dir')
    mkdir(dataFolder);
end

Download and extract the turbofan engine degradation simulation data set.

filename = matlab.internal.examples.downloadSupportFile("nnet","data/TurbofanEngineDegradationSimulationData.zip");
unzip(filename,dataFolder)

Prepare Test Data

Load the test data using the processTurboFanDataTest function attached to this example. The processTurboFanDataTest function extracts the data from filenamePredictors and filenameResponses and returns the cell arrays XTest and YTest, which contain the test predictor and response sequences, respectively.

filenamePredictors = fullfile(dataFolder,"test_FD001.txt");
filenameResponses = fullfile(dataFolder,"RUL_FD001.txt");
[XTest,YTest] = processTurboFanDataTest(filenamePredictors,filenameResponses);

The test data must be preprocessed with parameters computed from the training data: remove features with constant values using idxConstant, normalize the test predictors using the training mean and standard deviation, and clip the test responses at the same threshold used for the training data. To compute these parameters, first load the training data by using the processTurboFanDataTrain function.

filenamePredictors = fullfile(dataFolder,"train_FD001.txt");
[XTrain,YTrain] = processTurboFanDataTrain(filenamePredictors);

Remove Features with Constant Values

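Features that remain constant for all time steps can negatively impact the training. Find the rows of data that have the same minimum and maximum values, and remove those rows from the training data. Then compute the per-feature mean and standard deviation of the training data, and apply the same row removal, normalization, and response clipping to the test data set.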

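% Find features whose minimum and maximum over all training data are equal.
m = min([XTrain{:}],[],2);
M = max([XTrain{:}],[],2);
idxConstant = M == m;

% Remove the constant features from the training predictors.
for i = 1:numel(XTrain)
    XTrain{i}(idxConstant,:) = [];
end

% Compute the normalization parameters from the training data.
numFeatures = size(XTrain{1},1);
mu = mean([XTrain{:}],2);
sig = std([XTrain{:}],0,2);

for i = 1:numel(XTrain)
    XTrain{i} = (XTrain{i} - mu) ./ sig;
end

% Apply the same feature removal and normalization to the test data,
% and clip the test responses at the threshold.
thr = 150; % threshold
for i = 1:numel(XTest)
    XTest{i}(idxConstant,:) = [];
    XTest{i} = (XTest{i} - mu) ./ sig;
    YTest{i}(YTest{i} > thr) = thr;
end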
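
To make the effect of these steps easier to see, here is a toy version on made-up data. It is only a sketch: XToy, YToy, muToy, and sigToy are hypothetical names, and the values are chosen so that the second feature is constant and some responses exceed the threshold of 150.

```matlab
% Toy data: 3 features, where the second feature is constant everywhere.
XToy = {[1 2 3; 5 5 5; 0 1 0], [4 6; 5 5; 1 1]};
YToy = {[200 160 120], [151 150]};

% Flag features whose minimum and maximum over all sequences are equal.
allData = [XToy{:}];
idxConst = max(allData,[],2) == min(allData,[],2);

% Drop the constant rows from every sequence.
for i = 1:numel(XToy)
    XToy{i}(idxConst,:) = [];
end

% Normalize each remaining feature to zero mean and unit standard
% deviation, then clip the responses at the threshold.
muToy = mean([XToy{:}],2);
sigToy = std([XToy{:}],0,2);
for i = 1:numel(XToy)
    XToy{i} = (XToy{i} - muToy) ./ sigToy;
    YToy{i}(YToy{i} > 150) = 150;
end
```

After these steps, each toy sequence has two features instead of three, and no response value is larger than 150.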

Load the Pretrained Network

Load the pretrained LSTM network, which was trained on the NASA CMAPSS data described in [1]. Enter:

load CMAPSSDataNetwork

View the layers of the network by using the analyzeNetwork function. The function displays a graphical representation of the network and the parameter settings for the layers in the network.

analyzeNetwork(net)

Define FPGA Board Interface

Define the target FPGA board programming interface by using the dlhdl.Target object. Specify that the interface is for a Xilinx board with an Ethernet interface.

To create the target object, enter:

hTarget = dlhdl.Target('Xilinx','Interface','Ethernet');

Alternatively, to use the JTAG interface, install Xilinx™ Vivado™ Design Suite 2020.2. To set the Xilinx Vivado toolpath, enter:

hdlsetuptoolpath('ToolName', 'Xilinx Vivado', 'ToolPath', 'C:\Xilinx\Vivado\2020.2\bin\vivado.bat');
hTarget = dlhdl.Target('Xilinx','Interface','JTAG');

Prepare Network for Deployment

Prepare the network for deployment by creating a dlhdl.Workflow object. Specify the network and the bitstream name. Ensure that the bitstream name matches the data type and the FPGA board that you are targeting. In this example, the target board is the Xilinx ZCU102 SoC and the bitstream uses the single data type.

hW = dlhdl.Workflow('Network', net, 'Bitstream', 'zcu102_lstm_single','Target',hTarget);

Alternatively, to run the example on the Xilinx ZC706 board, enter:

hW = dlhdl.Workflow('Network', net, 'Bitstream', 'zc706_lstm_single','Target',hTarget);

Compile Network

Run the compile method of the dlhdl.Workflow object to compile the network and generate the instructions, weights, and biases for deployment. The total number of frames exceeds the default value of 30. Set the InputFrameNumberLimit name-value argument to 500 to run predictions in chunks of 500 frames to prevent timeouts.

dn = compile(hW,'InputFrameNumberLimit',500)
### Compiling network for Deep Learning FPGA prototyping ...
### Targeting FPGA bitstream zcu102_lstm_single.
### The network includes the following layers:
     1   'sequenceinput'      Sequence Input      Sequence input with 17 dimensions            (SW Layer)
     2   'lstm'               LSTM                LSTM with 200 hidden units                   (HW Layer)
     3   'fc_1'               Fully Connected     50 fully connected layer                     (HW Layer)
     4   'fc_2'               Fully Connected     1 fully connected layer                      (HW Layer)
     5   'regressionoutput'   Regression Output   mean-squared-error with response 'Response'  (SW Layer)
                                                                                             
### Notice: The layer 'sequenceinput' with type 'nnet.cnn.layer.ImageInputLayer' is implemented in software.
### Notice: The layer 'regressionoutput' with type 'nnet.cnn.layer.RegressionOutputLayer' is implemented in software.
### Compiling layer group: lstm.wi ...
### Compiling layer group: lstm.wi ... complete.
### Compiling layer group: lstm.wo ...
### Compiling layer group: lstm.wo ... complete.
### Compiling layer group: lstm.wg ...
### Compiling layer group: lstm.wg ... complete.
### Compiling layer group: lstm.wf ...
### Compiling layer group: lstm.wf ... complete.
### Compiling layer group: fc_1>>fc_2 ...
### Compiling layer group: fc_1>>fc_2 ... complete.

### Allocating external memory buffers:

          offset_name          offset_address    allocated_space 
    _______________________    ______________    ________________

    "InputDataOffset"           "0x00000000"     "4.0 MB"        
    "OutputResultOffset"        "0x00400000"     "4.0 MB"        
    "SchedulerDataOffset"       "0x00800000"     "4.0 MB"        
    "SystemBufferOffset"        "0x00c00000"     "20.0 MB"       
    "InstructionDataOffset"     "0x02000000"     "4.0 MB"        
    "FCWeightDataOffset"        "0x02400000"     "4.0 MB"        
    "EndOffset"                 "0x02800000"     "Total: 40.0 MB"

### Network compilation complete.
dn = struct with fields:
             weights: [1×1 struct]
        instructions: [1×1 struct]
           registers: [1×1 struct]
    syncInstructions: [1×1 struct]
        constantData: {}
             ddrInfo: [1×1 struct]
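
Before choosing a frame limit, it can help to check how long the sequences are. The sketch below uses a stand-in cell array (XToy, a hypothetical name) with the first three sequence lengths reported in the profiler output later in this example. With the real data, XTest would take its place. It assumes that each time step counts as one frame, which is consistent with the FramesNum column that the profiler reports.

```matlab
% Stand-in for XTest: 17 features, with sequence lengths 31, 49, and 126.
XToy = {rand(17,31), rand(17,49), rand(17,126)};

% Number of time steps (columns) in each sequence.
seqLengths = cellfun(@(x) size(x,2), XToy);

defaultLimit = 30;    % default value stated in the text above
frameLimit = 500;     % value passed to compile in this example

exceedsDefault = max(seqLengths) > defaultLimit;
fitsLimit = max(seqLengths) <= frameLimit;
fprintf("Longest sequence: %d time steps (exceeds default: %d, fits limit: %d)\n", ...
    max(seqLengths), exceedsDefault, fitsLimit);
```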

Program Bitstream onto FPGA and Download Network Weights

To deploy the network on the Xilinx ZCU102 SoC hardware, run the deploy function of the dlhdl.Workflow object. This function uses the output of the compile function to program the FPGA board by using the programming file. The deploy function programs the FPGA device and displays progress messages and the time required to deploy the network.

hW.deploy
### FPGA bitstream programming has been skipped as the same bitstream is already loaded on the target FPGA.
### Resetting network state.
### Loading weights to FC Processor.
### FC Weights loaded. Current time is 09-May-2023 14:02:27

Predict Remaining Useful Life

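Run the predict method of the dlhdl.Workflow object to make predictions on the test data. Each call processes one test sequence and, because profiling is turned on, prints a performance table for that sequence.

YPred = cell(numel(XTest),1); % preallocate the prediction cell array
for i = 1:numel(XTest)
    YPred{i} = hW.predict(XTest{i},Profile='on');
end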
### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 31.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85597                  0.00039                      31            2660531           2563.4
    memSeparator_0             102                  0.00000 
    lstm.wi                  19406                  0.00009 
    lstm.wo                  19287                  0.00009 
    lstm.wg                  19288                  0.00009 
    lstm.wf                  19315                  0.00009 
    lstm.sigmoid_1             275                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       487                  0.00000 
    lstm.multiplication_1       417                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                281                  0.00000 
    lstm.multiplication_3       481                  0.00000 
    fc_1                      4680                  0.00002 
    fc_2                       335                  0.00000 
 * The clock frequency of the DL processor is: 220MHz
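
The columns of the profiler table can be related to one another with simple arithmetic. The sketch below copies the Network row of the first table above and assumes that the Total Latency column is expressed in clock cycles. That assumption is not stated in the output, but the numbers are consistent with it.

```matlab
% Values copied from the Network row of the first profiler table.
clockHz = 220e6;          % DL processor clock frequency
lastFrameCycles = 85597;  % LastFrameLatency(cycles)
totalCycles = 2660531;    % Total Latency (assumed to be in cycles)
framesNum = 31;           % FramesNum, equal to the sequence length

% Latency of the last frame in seconds.
lastFrameSeconds = lastFrameCycles / clockHz;

% Throughput over the whole sequence.
framesPerSecond = framesNum * clockHz / totalCycles;

fprintf("Last frame: %.5f s, throughput: %.1f frames/s\n", ...
    lastFrameSeconds, framesPerSecond);
```

The computed values round to the 0.00039 seconds and 2563.4 frames per second shown in the table.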


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 49.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85575                  0.00039                      49            4204571           2563.9
    memSeparator_0             102                  0.00000 
    lstm.wi                  19473                  0.00009 
    lstm.wo                  19317                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 431                  0.00000 
    lstm.tanh_2                271                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4719                  0.00002 
    fc_2                       286                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 126.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85581                  0.00039                     126           10810895           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19479                  0.00009 
    lstm.wo                  19317                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                271                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4720                  0.00002 
    fc_2                       285                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 106.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85577                  0.00039                     106            9093836           2564.4
    memSeparator_0             102                  0.00000 
    lstm.wi                  19406                  0.00009 
    lstm.wo                  19287                  0.00009 
    lstm.wg                  19288                  0.00009 
    lstm.wf                  19315                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             257                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       487                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                271                  0.00000 
    lstm.multiplication_3       481                  0.00000 
    fc_1                      4673                  0.00002 
    fc_2                       332                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 98.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85596                  0.00039                      98            8408321           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19462                  0.00009 
    lstm.wo                  19320                  0.00009 
    lstm.wg                  19277                  0.00009 
    lstm.wf                  19296                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4726                  0.00002 
    fc_2                       289                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 105.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85596                  0.00039                     105            9008955           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19406                  0.00009 
    lstm.wo                  19287                  0.00009 
    lstm.wg                  19290                  0.00009 
    lstm.wf                  19312                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       487                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       461                  0.00000 
    fc_1                      4675                  0.00002 
    fc_2                       340                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 160.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85586                  0.00039                     160           13727877           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19406                  0.00009 
    lstm.wo                  19297                  0.00009 
    lstm.wg                  19281                  0.00009 
    lstm.wf                  19312                  0.00009 
    lstm.sigmoid_1             284                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       497                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       461                  0.00000 
    fc_1                      4672                  0.00002 
    fc_2                       333                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 166.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85578                  0.00039                     166           14242348           2564.2
    memSeparator_0             102                  0.00000 
    lstm.wi                  19396                  0.00009 
    lstm.wo                  19287                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19317                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       477                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       471                  0.00000 
    fc_1                      4670                  0.00002 
    fc_2                       335                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 55.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85636                  0.00039                      55            4719211           2564.0
    memSeparator_0             102                  0.00000 
    lstm.wi                  19456                  0.00009 
    lstm.wo                  19297                  0.00009 
    lstm.wg                  19281                  0.00009 
    lstm.wf                  19312                  0.00009 
    lstm.sigmoid_1             284                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       487                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       461                  0.00000 
    fc_1                      4671                  0.00002 
    fc_2                       334                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 192.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85578                  0.00039                     192           16473028           2564.2
    memSeparator_0             102                  0.00000 
    lstm.wi                  19396                  0.00009 
    lstm.wo                  19287                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19317                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       477                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       471                  0.00000 
    fc_1                      4673                  0.00002 
    fc_2                       332                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 83.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85585                  0.00039                      83            7121410           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19463                  0.00009 
    lstm.wo                  19317                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                281                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4717                  0.00002 
    fc_2                       288                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 217.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85605                  0.00039                     217           18617229           2564.3
    memSeparator_0             102                  0.00000 
    lstm.wi                  19483                  0.00009 
    lstm.wo                  19317                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             275                  0.00000 
    lstm.sigmoid_3             327                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 431                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4722                  0.00002 
    fc_2                       283                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 195.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85636                  0.00039                     195           16729884           2564.3
    memSeparator_0             102                  0.00000 
    lstm.wi                  19456                  0.00009 
    lstm.wo                  19297                  0.00009 
    lstm.wg                  19281                  0.00009 
    lstm.wf                  19312                  0.00009 
    lstm.sigmoid_1             284                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       487                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       461                  0.00000 
    fc_1                      4672                  0.00002 
    fc_2                       333                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 46.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85596                  0.00039                      46            3947354           2563.7
    memSeparator_0             102                  0.00000 
    lstm.wi                  19406                  0.00009 
    lstm.wo                  19296                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19307                  0.00009 
    lstm.sigmoid_1             274                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       497                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                281                  0.00000 
    lstm.multiplication_3       461                  0.00000 
    fc_1                      4679                  0.00002 
    fc_2                       336                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 76.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85584                  0.00039                      76            6520835           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19482                  0.00009 
    lstm.wo                  19317                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             265                  0.00000 
    lstm.sigmoid_3             337                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                271                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4717                  0.00002 
    fc_2                       288                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 113.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85585                  0.00039                     113            9695264           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19406                  0.00009 
    lstm.wo                  19287                  0.00009 
    lstm.wg                  19290                  0.00009 
    lstm.wf                  19312                  0.00009 
    lstm.sigmoid_1             284                  0.00000 
    lstm.sigmoid_3             257                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       497                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       471                  0.00000 
    fc_1                      4671                  0.00002 
    fc_2                       334                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 165.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85597                  0.00039                     165           14156683           2564.2
    memSeparator_0             102                  0.00000 
    lstm.wi                  19475                  0.00009 
    lstm.wo                  19317                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4716                  0.00002 
    fc_2                       289                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 133.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85578                  0.00039                     133           11411341           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19458                  0.00009 
    lstm.wo                  19315                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                281                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4718                  0.00002 
    fc_2                       287                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 135.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85576                  0.00039                     135           11582414           2564.2
    memSeparator_0             102                  0.00000 
    lstm.wi                  19456                  0.00009 
    lstm.wo                  19307                  0.00009 
    lstm.wg                  19286                  0.00009 
    lstm.wf                  19296                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4721                  0.00002 
    fc_2                       284                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 184.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85546                  0.00039                     184           15786933           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19366                  0.00009 
    lstm.wo                  19296                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19307                  0.00009 
    lstm.sigmoid_1             284                  0.00000 
    lstm.sigmoid_3             257                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       497                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                271                  0.00000 
    lstm.multiplication_3       471                  0.00000 
    fc_1                      4669                  0.00002 
    fc_2                       336                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 148.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85587                  0.00039                     148           12698421           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19406                  0.00009 
    lstm.wo                  19287                  0.00009 
    lstm.wg                  19288                  0.00009 
    lstm.wf                  19315                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       477                  0.00000 
    lstm.multiplication_1       417                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                271                  0.00000 
    lstm.multiplication_3       481                  0.00000 
    fc_1                      4679                  0.00002 
    fc_2                       326                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 39.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85596                  0.00039                      39            3346572           2563.8
    memSeparator_0             102                  0.00000 
    lstm.wi                  19462                  0.00009 
    lstm.wo                  19320                  0.00009 
    lstm.wg                  19277                  0.00009 
    lstm.wf                  19296                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4726                  0.00002 
    fc_2                       289                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 130.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85586                  0.00039                     130           11153050           2564.3
    memSeparator_0             102                  0.00000 
    lstm.wi                  19406                  0.00009 
    lstm.wo                  19297                  0.00009 
    lstm.wg                  19281                  0.00009 
    lstm.wf                  19312                  0.00009 
    lstm.sigmoid_1             274                  0.00000 
    lstm.sigmoid_3             257                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       497                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                281                  0.00000 
    lstm.multiplication_3       471                  0.00000 
    fc_1                      4672                  0.00002 
    fc_2                       333                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 186.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85586                  0.00039                     186           15958277           2564.2
    memSeparator_0             102                  0.00000 
    lstm.wi                  19462                  0.00009 
    lstm.wo                  19320                  0.00009 
    lstm.wg                  19277                  0.00009 
    lstm.wf                  19296                  0.00009 
    lstm.sigmoid_1             275                  0.00000 
    lstm.sigmoid_3             327                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 431                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4715                  0.00002 
    fc_2                       290                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 48.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85575                  0.00039                      48            4119029           2563.7
    memSeparator_0             102                  0.00000 
    lstm.wi                  19473                  0.00009 
    lstm.wo                  19317                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       417                  0.00000 
    lstm.c_add                 431                  0.00000 
    lstm.tanh_2                271                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4719                  0.00002 
    fc_2                       286                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 76.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85595                  0.00039                      76            6521203           2563.9
    memSeparator_0             102                  0.00000 
    lstm.wi                  19473                  0.00009 
    lstm.wo                  19317                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             275                  0.00000 
    lstm.sigmoid_3             327                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                281                  0.00000 
    lstm.multiplication_3       411                  0.00000 
    fc_1                      4717                  0.00002 
    fc_2                       288                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 140.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85586                  0.00039                     140           12011869           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19406                  0.00009 
    lstm.wo                  19287                  0.00009 
    lstm.wg                  19288                  0.00009 
    lstm.wf                  19314                  0.00009 
    lstm.sigmoid_1             275                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       497                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                281                  0.00000 
    lstm.multiplication_3       471                  0.00000 
    fc_1                      4676                  0.00002 
    fc_2                       329                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 158.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85586                  0.00039                     158           13556375           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19406                  0.00009 
    lstm.wo                  19287                  0.00009 
    lstm.wg                  19288                  0.00009 
    lstm.wf                  19314                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             267                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             267                  0.00000 
    lstm.multiplication_2       497                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       461                  0.00000 
    fc_1                      4676                  0.00002 
    fc_2                       329                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 171.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85585                  0.00039                     171           14671396           2564.2
    memSeparator_0             102                  0.00000 
    lstm.wi                  19450                  0.00009 
    lstm.wo                  19312                  0.00009 
    lstm.wg                  19286                  0.00009 
    lstm.wf                  19296                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                281                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4724                  0.00002 
    fc_2                       291                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 143.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85601                  0.00039                     143           12269177           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19479                  0.00009 
    lstm.wo                  19317                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       427                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 411                  0.00000 
    lstm.tanh_2                291                  0.00000 
    lstm.multiplication_3       401                  0.00000 
    fc_1                      4722                  0.00002 
    fc_2                       283                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 196.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85577                  0.00039                     196           16815689           2564.3
    memSeparator_0             102                  0.00000 
    lstm.wi                  19396                  0.00009 
    lstm.wo                  19287                  0.00009 
    lstm.wg                  19288                  0.00009 
    lstm.wf                  19315                  0.00009 
    lstm.sigmoid_1             275                  0.00000 
    lstm.sigmoid_3             257                  0.00000 
    lstm.tanh_1                277                  0.00000 
    lstm.sigmoid_2             277                  0.00000 
    lstm.multiplication_2       497                  0.00000 
    lstm.multiplication_1       427                  0.00000 
    lstm.c_add                 421                  0.00000 
    lstm.tanh_2                281                  0.00000 
    lstm.multiplication_3       471                  0.00000 
    fc_1                      4680                  0.00002 
    fc_2                       325                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 145.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      85590                  0.00039                     145           12440830           2564.1
    memSeparator_0             102                  0.00000 
    lstm.wi                  19488                  0.00009 
    lstm.wo                  19317                  0.00009 
    lstm.wg                  19287                  0.00009 
    lstm.wf                  19287                  0.00009 
    lstm.sigmoid_1             285                  0.00000 
    lstm.sigmoid_3             317                  0.00000 
    lstm.tanh_1                287                  0.00000 
    lstm.sigm...
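
The columns of each profiler table appear to be related through the reported clock frequency. The following minimal sketch recomputes two of the printed values for the sequence of 46 frames. The relations it uses are inferred from the printed numbers and are an assumption, not a documented formula; the variable names are illustrative only.

```matlab
% Values copied from the Network row of the profiler table with FramesNum = 46.
clockHz            = 220e6;    % DL processor clock frequency reported by the profiler
lastFrameCycles    = 85596;    % LastFrameLatency(cycles)
framesNum          = 46;       % FramesNum (time steps in the sequence)
totalLatencyCycles = 3947354;  % Total Latency, assumed to be in clock cycles

% Assumed relations (inferred from the table, not taken from documentation).
lastFrameSeconds = lastFrameCycles / clockHz;
framesPerSecond  = framesNum * clockHz / totalLatencyCycles;

fprintf("LastFrameLatency(seconds): %.5f\n", lastFrameSeconds);
fprintf("Frames/s:                  %.1f\n", framesPerSecond);
```

Under these assumptions, the results round to the 0.00039 seconds and 2563.7 frames per second shown in that table. The same check can be repeated for any other sequence length in the output.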

The LSTM network makes predictions on the partial sequence one time step at a time. At each time step, the network makes a prediction using the value at that time step and the network state calculated from the previous time steps. The network updates its state between predictions. The predict function returns a sequence of these predictions. The last element of the sequence corresponds to the predicted RUL for the partial sequence.

Alternatively, you can make predictions one time step at a time by using predictAndUpdateState. This function is useful when the values of the time steps arrive in a stream. Making predictions on full sequences is usually faster than making predictions one time step at a time. For an example showing how to forecast future time steps by updating the network between single time step predictions, see Time Series Forecasting Using Deep Learning.
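
The role of the network state in step-by-step prediction can be illustrated with a toy recurrence. The sketch below is not the LSTM from this example and does not call predictAndUpdateState; it uses a hypothetical single-unit recurrent cell with made-up weights and input values, only to show that each prediction depends on the state carried over from earlier time steps.

```matlab
% Toy stand-in for a recurrent layer (NOT the LSTM from this example).
% State update: h(t) = tanh(w*x(t) + u*h(t-1)); prediction: y(t) = v*h(t).
w = 0.5; u = 0.8; v = 2.0;           % made-up weights
step = @(x,h) tanh(w*x + u*h);       % one time step: new state from input and old state

x = [0.1 0.4 -0.2 0.7 0.3];          % hypothetical streamed input values

% Streaming with the state carried from one time step to the next.
state = 0;                           % state after a reset
yCarried = zeros(size(x));
for t = 1:numel(x)
    state = step(x(t),state);        % update the state with the newest value
    yCarried(t) = v*state;           % prediction for this time step
end

% Same inputs, but the state is (incorrectly) reset before every time step.
yReset = zeros(size(x));
for t = 1:numel(x)
    yReset(t) = v*step(x(t),0);
end

lastPrediction = yCarried(end);      % analogous to the last element of a predicted sequence
disp([yCarried; yReset])
```

The two rows agree only at the first time step. From the second step on, the carried state changes the prediction, which is why the state must be kept while a sequence is processed and reset only between independent sequences.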

Visualize some of the predictions in a plot.

% Pick four test observations at random and plot true versus predicted RUL.
idx = randperm(numel(YPred),4);
figure
for i = 1:numel(idx)
    subplot(2,2,i)
    
    plot(YTest{idx(i)},'--')
    hold on
    plot(YPred{idx(i)},'.-')
    hold off
    
    ylim([0 thr + 25])
    title("Test Observation " + idx(i))
    xlabel("Time Step")
    ylabel("RUL")
end
legend(["Test Data" "Predicted"],'Location','southeast')

For a given partial sequence, the predicted current RUL is the last element of the predicted sequence. Calculate the root-mean-square error (RMSE) of the predictions and visualize the prediction error in a histogram.

% Collect the final RUL value of each true and predicted sequence.
for i = 1:numel(YTest)
    YTestLast(i) = YTest{i}(end);
    YPredLast(i) = YPred{i}(end);
end
figure
rmse = sqrt(mean((YPredLast - YTestLast).^2))
rmse = single
    20.7713
histogram(YPredLast - YTestLast)
title("RMSE = " + rmse)
ylabel("Frequency")
xlabel("Error")

References

  1. Saxena, Abhinav, Kai Goebel, Don Simon, and Neil Eklund. "Damage propagation modeling for aircraft engine run-to-failure simulation." 2008 International Conference on Prognostics and Health Management (2008): 1–9. https://doi.org/10.1109/PHM.2008.4711414.
