How is it possible to use a validation set with an LSTM?

When I try to use a validation set with an LSTM layer, training fails with the error shown below the code:
options = trainingOptions('adam', ...
'ExecutionEnvironment','gpu', ...
'GradientThreshold',1, ...
'MaxEpochs',maxEpochs, ...
'ValidationData',{XTest,YTest},...
'MiniBatchSize',miniBatchSize, ...
'LearnRateSchedule','piecewise', ...
'SequenceLength','longest', ...
'Shuffle','never', ...
'Verbose',0, ...
'Plots','training-progress');
net = trainNetwork(XTrain,categorical(YTrain),layers,options);
Error:
Training with validation data is not supported for networks with LSTM layers.
Is there another way to use the validation set during training of the network?

Accepted Answer

Joss Knight
Joss Knight on 29 Apr 2018

1 vote

It's ugly, but if you use Checkpoints, then you can use an OutputFcn to (once per epoch) load the network from a checkpoint and run it against your validation data. It isn't very efficient, but it's okay if you're only doing it once per epoch. You won't get it on the training plot of course.
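A minimal sketch of this workaround, assuming a classification LSTM and that XTrain, YTrain, XVal, YVal, layers, and maxEpochs already exist in the workspace (the folder name, the function name validateFromCheckpoint, and the once-per-epoch bookkeeping are illustrative assumptions, not part of Joss's answer):

```matlab
% Checkpoint-based validation workaround: trainNetwork writes one
% checkpoint per epoch to checkpointDir; the OutputFcn loads the newest
% one and scores it on the held-out validation set.
checkpointDir = 'checkpoints';
if ~exist(checkpointDir, 'dir'), mkdir(checkpointDir); end

options = trainingOptions('adam', ...
    'MaxEpochs', maxEpochs, ...
    'CheckpointPath', checkpointDir, ...
    'OutputFcn', @(info) validateFromCheckpoint(info, checkpointDir, XVal, YVal));

net = trainNetwork(XTrain, categorical(YTrain), layers, options);

function stop = validateFromCheckpoint(info, checkpointDir, XVal, YVal)
% Runs once per epoch: load the newest checkpoint (a MAT-file holding a
% variable 'net') and report validation accuracy. Never halts training.
persistent lastEpoch
if isempty(lastEpoch), lastEpoch = 0; end
stop = false;
if info.Epoch > lastEpoch
    files = dir(fullfile(checkpointDir, 'net_checkpoint__*.mat'));
    if isempty(files), return; end          % no checkpoint written yet
    [~, idx] = max([files.datenum]);        % pick the newest checkpoint
    s = load(fullfile(checkpointDir, files(idx).name));
    YPred = classify(s.net, XVal);          % run the validation sequences
    fprintf('Epoch %d, validation accuracy: %.3f\n', ...
        info.Epoch, mean(YPred == YVal));
    lastEpoch = info.Epoch;
end
end
```

As Joss notes, the metric printed this way will not appear on the training-progress plot.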

10 comments

Thanks Joseph! Yes, I think it's the only viable solution...
Dear all, I have a similar problem. I want to run the pretrained network in order to validate it with a validation dataset. How do I use the checkpoints and the OutputFcn to run the network on the validation data?
Best regards
I also have the same problem as Alessio Izzo.
Could anyone tell me how to use the checkpoints and the OutputFcn to run the network on the validation data?
Best regards!
Do you have R2018b? ValidationData is supported for sequence networks in R2018b.
Wow,
thank you for your information Joss!!
Sadly, I only have access to 2018a with the Neural Network Toolbox at my university.
So I am still facing that problem. I already managed to set up the OutputFcn, but somehow the data gets lost after each epoch and is not saved. Also, I can't fetch the LSTM net after one epoch with a checkpoint.
After setting optimoptions I can access the OutputFcn:
opt = optimoptions(varargin);
options = trainingOptions( [....], ...
    'OutputFcn', @(info,output) outputFCN(info,opt,net));
Since info and net are output variables of 'trainNetwork', I had hoped that this variable would contain the network data (and that I wouldn't need checkpoints). Essentially it stays empty (if anyone was wondering...).
If I define output outside of the outputFCN as an empty struct (or an empty array), trainNetwork throws an error:
Error using nnet.internal.cnn.util.UserCallbackReporter>iCallbackWrapper (line 115) Not enough input arguments.
Error in nnet.internal.cnn.util.UserCallbackReporter>@(f)iCallbackWrapper(f,this.Info) (line 85)
stop = cellfun( @(f) iCallbackWrapper(f, this.Info), this.Callbacks );
Error in nnet.internal.cnn.util.UserCallbackReporter/callCallbacks (line 85)
stop = cellfun( @(f) iCallbackWrapper(f, this.Info), this.Callbacks );
Error in nnet.internal.cnn.util.UserCallbackReporter/start (line 48)
this.callCallbacks();
Error in nnet.internal.cnn.util.VectorReporter/computeAndReport (line 56)
feval( method, this.Reporters{i}, varargin{:} );
Error in nnet.internal.cnn.util.VectorReporter/start (line 16)
computeAndReport( this, 'start' );
Error in nnet.internal.cnn.Trainer/train (line 62)
reporter.start();
Error in trainNetwork>doTrainNetwork (line 250)
trainedNet = trainer.train(trainedNet, trainingDispatcher);
The error is only avoided if 'output' is empty. Even though the data in info is not of interest to me by itself (it is the same information that info holds after a regular 'trainNetwork' run), it is still a problem: even once I can access the weights of the network and calculate the validation RMSE, that data will be lost after each step too.
I'd really appreciate it if you could help me with both problems.
Firstly --> getting the neural network data after one epoch.
Secondly ---> how to get the data out of the OutputFcn without throwing an error as soon as I try to save it.
THANK you very much!
The idea is not that your OutputFcn is passed the network, it is that inside your OutputFcn you load your checkpointed network and then use that to do prediction on your validation data to report a validation metric. If you want to preserve this metric for later (rather than just plot it) you can use a mechanism such as an up-level variable defined in the outer scope of a nested function to store the output.
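The up-level-variable pattern Joss describes can be sketched as follows. All names here (trainWithValidationLog, recordValidation, valAcc) are assumptions for illustration: the OutputFcn is a nested function, so it can write into variables of the enclosing workspace, and those per-epoch metrics are still there after training finishes.

```matlab
function [net, valAcc] = trainWithValidationLog(XTrain, YTrain, layers, XVal, YVal)
% recordValidation is nested inside this function, so assignments to
% valAcc and lastEpoch persist in this outer workspace across calls.
valAcc = [];                       % filled in, one entry per epoch
lastEpoch = 0;
checkpointDir = tempname; mkdir(checkpointDir);
options = trainingOptions('adam', ...
    'CheckpointPath', checkpointDir, ...
    'OutputFcn', @recordValidation);
net = trainNetwork(XTrain, YTrain, layers, options);

    function stop = recordValidation(info)
        stop = false;
        if info.Epoch <= lastEpoch, return; end
        files = dir(fullfile(checkpointDir, 'net_checkpoint__*.mat'));
        if isempty(files), return; end
        [~, idx] = max([files.datenum]);      % newest checkpoint
        s = load(fullfile(checkpointDir, files(idx).name));
        YPred = classify(s.net, XVal);
        valAcc(end+1) = mean(YPred == YVal);  % survives via nested scope
        lastEpoch = info.Epoch;
    end
end
```

After training, valAcc holds the per-epoch validation accuracies, which answers the commenter's second question about getting data out of the OutputFcn without errors.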
Hi, can I use "OutputFcn" with a function (I created) that takes the softmax average score of several validation images (and creates a new predicted label) at every iteration? Thank you!
Hey M J, you should probably ask a new question and provide a bit more detail and code. Thanks.
M J
M J on 8 Oct 2020
Edited: M J on 8 Oct 2020
Hi, thank you for your answer. I did ask a new question (see link below):
I do not have a code for this, as I am really not sure where to even start. Also, I am not sure if it is okay to post a link to the question here, but if not, please let me know. Thank you.


More Answers (2)

Mads Bergholt
Mads Bergholt on 17 May 2018

0 votes

Dear Joss, will this be part of MATLAB R2018b? This is an aspect of LSTMs that is very important for validating these algorithms.
Best regards, Mads

3 comments

Joss Knight
Joss Knight on 17 May 2018
Edited: Joss Knight on 17 May 2018
Yes, 18b is the plan. Get hold of the prerelease when it comes available (early June I think), if you can.
I am sorry to say that this is still not included in MATLAB R2018b. Sigh. Maybe we have to turn to TensorFlow for deep learning.
There are some restrictions on the format of the data.
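For a sequence classification network in R2018b or later, a sketch of passing validation data directly looks like the following; the variable names are assumptions, and the exact format restrictions are listed in the trainingOptions documentation:

```matlab
% XVal: N-by-1 cell array of sequences (numFeatures-by-numTimeSteps each),
% YVal: N-by-1 categorical vector of labels matching XVal.
options = trainingOptions('adam', ...
    'MaxEpochs', maxEpochs, ...
    'ValidationData', {XVal, YVal}, ...     % validated during training
    'ValidationFrequency', 30, ...          % in iterations
    'Plots', 'training-progress');
net = trainNetwork(XTrain, YTrain, layers, options);
```

With this setup the validation loss and accuracy do appear on the training-progress plot, unlike the checkpoint workaround.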

