About net.divideParam.valRatio

mike mike on 22 Sep 2018
Commented: mike mike on 26 Sep 2018
I know it's possible to use
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
to split the input data by percentage into training, validation and test sets. Now, in a classification problem, I didn't want the validation set to be too small, so I set
net.divideParam.valRatio = 0/100;
In fact, the neural network no longer seemed to apply early stopping after 6 validation failures; incidentally, I left the other parameters unchanged, so the code I wrote was
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio =0/100;
net.divideParam.testRatio = 15/100;
With these values the percentages of data assigned to training, validation and testing did not sum to 100%, yet the neural network ran just the same, without any problems and without any error messages appearing. I ran other tests, always modifying the percentages so that they did not sum to 100%, as in the following cases:
net.divideParam.trainRatio = 35/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 25/100;
or
net.divideParam.trainRatio = 35/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 65/100;
My question is how to interpret the subdivision of the dataset between training, validation and testing when the sum is not 100% and/or some ratio is set to 0%. If, for example, I set the training data to 0%, does this mean that the network is not trained at all? Or if I set the test data to 0%, does it mean that the network is not tested? If the percentages sum to less than 100%, does that mean that the remaining fraction of the dataset's inputs is simply not used? And if they sum to more than 100%, does that mean that some inputs are used both for the test set and, for example, also for the validation set?
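For reference, here is a minimal sketch of the kind of setup I am experimenting with (the dataset and hidden layer size below are just placeholders, not my real data); the training record tr returned by train reports which samples actually end up in each subset:
[x, t] = iris_dataset;                    % placeholder classification data
net = patternnet(10);                     % hypothetical hidden layer size
net.divideParam.trainRatio = 35/100;
net.divideParam.valRatio   = 15/100;
net.divideParam.testRatio  = 65/100;      % the ratios sum to 115%
[net, tr] = train(net, x, t);
numel(tr.trainInd)                        % samples actually used for training
numel(tr.valInd)                          % samples actually used for validation
numel(tr.testInd)                         % samples actually used for testing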

Answers (2)

Greg Heath on 23 Sep 2018
Edited: Greg Heath on 23 Sep 2018
1. Now, in a classification problem, I didn't want the validation set to be too small, so I set
net.divideParam.valRatio = 0/100;
% Your statement makes no sense: you have eliminated the val subset!
2. In fact, the neural network no longer seemed to apply early stopping after 6 validation failures;
% Of course! valRatio = 0 eliminates the val subset!
3. In fact, the neural network no longer seemed to apply early stopping after 6 validation failures; incidentally, I left the other parameters unchanged, so the code I wrote was
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 0/100;
net.divideParam.testRatio = 15/100;
% The program will AUTOMATICALLY CHANGE the fractions to have a unit sum
% (see the dividerand sketch after point 4). To find out what they are, use
a = net.divideParam.trainRatio
b = net.divideParam.valRatio
c = net.divideParam.testRatio
4. My question is how to interpret the subdivision of the dataset between training, validation and testing when the sum is not 100% and/or some ratio is set to 0%. If, for example, I set the training data to 0%, does this mean that the network is not trained at all? Or if I set the test data to 0%, does it mean that the network is not tested? If the percentages sum to less than 100%, does that mean that the remaining fraction of the dataset's inputs is simply not used? And if they sum to more than 100%, does that mean that some inputs are used both for the test set and, for example, also for the validation set?
See my answer to point 3.
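To see the normalization concretely, here is a minimal sketch that calls dividerand (the division function used by default) directly; Q = 100 samples is just an example value, and the exact counts depend on rounding:
Q = 100;                                  % hypothetical number of samples
[trainInd, valInd, testInd] = dividerand(Q, 35/100, 15/100, 65/100);
numel(trainInd)                           % roughly Q*0.35/1.15, about 30
numel(valInd)                             % roughly Q*0.15/1.15, about 13
numel(testInd)                            % roughly Q*0.65/1.15, about 57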
Hope this helps.
%%Thank you for formally accepting my answer%%
Greg

mike mike on 23 Sep 2018
Thank you Greg, everything is clear, but I want to explain why I want to remove the validation set. I am trying to create a neural network to predict the direction of a stock index from a set of technical-analysis indicators used as inputs. I was inspired by the following work: "Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange" [Yakup Kara, Melek Acar Boyacioglu, Ömer Kaan Baykan, 2010]; you can easily find it on the internet.
In that paper the authors talk about a training dataset and a hold-out dataset. I did some research and found that hold-out data is essentially a synonym for test data, so I thought I should drop the validation data and split the data into 50% for training and 50% for hold-out (as the authors of the article do). If I build the neural network studied in the article in Matlab, with the data included in the article and similar parameter settings, but using Matlab's default dataset split (training, validation and testing), I do not come even remotely close to the performance reported in the article for the training phase (I am not worrying about overfitting at the moment). If instead I split the data into 50% training and 50% test, at least for the training phase I get very high performance figures, comparable to the training-phase performance of the network in the article. It is obviously important that the net does not overfit and does not extrapolate, but I want to look at that in the next phase, once I have understood the meaning of hold-out; the sketch below shows roughly how I set up the 50/50 split.
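A minimal sketch of that setup, with placeholder data and an arbitrary hidden layer size rather than my actual indicators:
[x, t] = iris_dataset;                    % placeholder data, not my stock indicators
net = patternnet(10);                     % hypothetical hidden layer size
net.divideFcn = 'dividerand';
net.divideParam.trainRatio = 50/100;
net.divideParam.valRatio   = 0/100;       % no validation set, so no early stopping
net.divideParam.testRatio  = 50/100;      % the hold-out set
% net.trainParam.max_fail (default 6) has no effect without a validation set
[net, tr] = train(net, x, t);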
2 comments
Greg Heath on 26 Sep 2018
The question is:
Do you understand the purpose of the validation subset?
Greg
mike mike on 26 Sep 2018
Yes, the validation data set is intended to avoid overfitting.

