Why does training performance change when a validation set is considered?

Ana on 2 Oct 2012
Hello!
As an example, I used the following input and output:
input = 1:1:10;
output = [1:2:15 24 24];
and then tried 3 different options:
OPTION 1
% All 10 samples used for training, no validation set
rand('twister',1)
net = feedforwardnet(4);
net.trainParam.epochs = 3;
net.divideFcn = 'divideind';
[net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(10,1:10);
[net,tr,Y1,E1] = train(net,input,output);

OPTION 2
% Samples 1:8 for training, 9:10 for validation
rand('twister',1)
net = feedforwardnet(4);
net.trainParam.epochs = 3;
net.divideFcn = 'divideind';
[net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(10,1:8,9:10);
%net.divideParam.trainRatio=1;net.divideParam.valRatio=0;net.divideParam.testRatio=0;
[net,tr,Y1,E1] = train(net,input,output);

OPTION 3
% Only the first 8 samples are passed to train, all used for training
rand('twister',1)
net = feedforwardnet(4);
net.trainParam.epochs = 3;
net.divideFcn = 'divideind';
[net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(8,1:8);
[net,tr,Y1,E1] = train(net,input(:,1:8),output(:,1:8));
The initializations are the same, and all 3 options stopped because they reached the maximum number of epochs. I checked epoch 0: the weights and biases are the same, but the (training) performance is not. From epoch 0 onward, everything differs across the 3 options. If I leave divideFcn unchanged and repeat the same experiments, using the same indices for training, I see the same problem, so divideind is not the cause. I have stepped through the functions and would like to understand why this is happening. Could anyone help me? Thank you very much. Ana
1 Comment
Greg Heath on 5 Oct 2012
I took a preliminary look. Something subtle is going on.
1. Option 1 is irrelevant.
2. I chose Nepochs = 1 and rng(0) initialization.
3. The final weights for Options 2 & 3 are different (They shouldn't be).
I'll be baahk.
Aahnold.


Accepted Answer

Greg Heath on 28 Nov 2012
The difference between the results of the last two options (2 and 3) was caused entirely by using
1) ... = train(net,input(:,1:8),output(:,1:8));
instead of
2) ... = train(net,input,output);
Verification: For each of these 2 syntaxes I ran 3 trials for one epoch with
a. divideind(10,1:8,9:10);
b. divideind(10,1:8);
c. divideind(8,1:8);
For each syntax the 3 trials yielded identical results.
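The verification described above could be scripted roughly as follows. This is a minimal sketch, not the code actually run for this answer: the variable names (splits, wbFull, wbSub) are made up for illustration, rng(0) and one epoch follow the earlier comment, and whether indices 9:10 are tolerated when only 8 samples are passed may depend on the toolbox version.

```matlab
% Sketch: compare final weights for two train syntaxes and three divideind calls.
x = 1:1:10;
t = [1:2:15 24 24];

% Each row: {Q, training indices, validation indices} for trials a, b, c
splits = {10, 1:8, 9:10; ...
          10, 1:8, []; ...
           8, 1:8, []};

wbFull = [];   % final weights for train(net, x, t)
wbSub  = [];   % final weights for train(net, x(:,1:8), t(:,1:8))

for k = 1:size(splits, 1)
    for useSubset = [false true]
        rng(0);                               % same random state for every run
        net = feedforwardnet(4);
        net.trainParam.epochs = 1;            % one epoch, as in the trials above
        net.trainParam.showWindow = false;    % suppress the training GUI
        net.divideFcn = 'divideind';
        [net.divideParam.trainInd, net.divideParam.valInd, net.divideParam.testInd] = ...
            divideind(splits{k,1}, splits{k,2}, splits{k,3});
        if useSubset
            net = train(net, x(:,1:8), t(:,1:8));
            wbSub(:, end+1) = getwb(net);     % one column per trial
        else
            net = train(net, x, t);
            wbFull(:, end+1) = getwb(net);
        end
    end
end

% Largest weight difference within each syntax, then between the two syntaxes
disp(max(max(abs(wbFull - wbFull(:,1)))));
disp(max(max(abs(wbSub  - wbSub(:,1)))));
disp(max(abs(wbFull(:,1) - wbSub(:,1))));
```

If the first two displayed values are zero and the third is not, that reproduces the observation that the train syntax, not the divideind call, makes the difference.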
The reason probably lies in the code of train, which you can inspect with:
type train
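One candidate explanation, which this thread does not confirm, is the data normalization. By default feedforwardnet lists mapminmax among its input and output processing functions, and the scaling range is taken from the data the network is configured with. Passing only the first 8 samples gives a different range (inputs 1 to 8, targets 1 to 15) than passing all 10 (inputs 1 to 10, targets 1 to 24). The sketch below calls mapminmax directly to show how the same 8 samples are scaled under each range; the variable names are illustrative only.

```matlab
% Sketch: the same 8 samples normalized with settings from 10 vs. 8 samples.
x = 1:1:10;
t = [1:2:15 24 24];

% Settings computed from all 10 samples and from the first 8 only
[~, psXFull] = mapminmax(x);
[~, psXSub]  = mapminmax(x(:,1:8));
[~, psTFull] = mapminmax(t);
[~, psTSub]  = mapminmax(t(:,1:8));

% Apply both sets of settings to the same first 8 samples
xnFull = mapminmax('apply', x(:,1:8), psXFull);
xnSub  = mapminmax('apply', x(:,1:8), psXSub);
tnFull = mapminmax('apply', t(:,1:8), psTFull);
tnSub  = mapminmax('apply', t(:,1:8), psTSub);

disp([xnFull; xnSub]);   % row 1: scaled with range 1..10, row 2: range 1..8
disp([tnFull; tnSub]);   % row 1: scaled with range 1..24, row 2: range 1..15
```

With identical weights, a network that sees differently scaled inputs and targets would report a different performance at epoch 0, which would be consistent with what the question describes.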
Hope this helps.
Thank you for officially accepting my answer.
Greg

More Answers (1)

Zeeshan on 27 Nov 2012
Hi,
I think it is because the data are divided randomly to validate the model, so one network may get trained better than another simply because it was trained on a different (randomly chosen) set of training data.
I am also working on a comparison of architectures, and I am going to fix the time points used for training and validation in each dataset so that I can compare them.
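A fixed split of that kind might look like the sketch below. The index ranges, the hidden layer sizes, and the variable names are hypothetical, chosen only to illustrate the idea on the small data set from the question.

```matlab
% Sketch: compare architectures on one fixed train/validation/test split.
x = 1:1:10;
t = [1:2:15 24 24];

trainInd = 1:6;    % hypothetical fixed split
valInd   = 7:8;
testInd  = 9:10;

hiddenSizes = [2 4 8];                 % hypothetical architectures to compare
valPerf = zeros(size(hiddenSizes));

for i = 1:numel(hiddenSizes)
    rng(0);                            % same random state for each architecture
    net = feedforwardnet(hiddenSizes(i));
    net.divideFcn = 'divideind';       % fixed indices instead of random division
    net.divideParam.trainInd = trainInd;
    net.divideParam.valInd   = valInd;
    net.divideParam.testInd  = testInd;
    net.trainParam.showWindow = false;
    net = train(net, x, t);
    y = net(x(:,valInd));              % outputs on the validation samples
    valPerf(i) = perform(net, t(:,valInd), y);
end

disp(valPerf);
```

Because every network sees the same training and validation samples, differences in valPerf reflect the architecture and initialization rather than the data division.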
Regards,
Shan
