How does crossval (for k-fold CV) work in MATLAB after training a classifier?

Sanjay Yadav on 7 Mar 2016
Commented: seung ho yeom on 1 Feb 2019
To my knowledge, k-fold CV is a technique for model selection in which the data is first divided into k folds, with the folds stratified by class. Now, consider the following code:
trainedClassifier = fitcnb(X, Y);
partitionedModel = crossval(trainedClassifier, 'KFold', 10);
accuracy = 1 - kfoldLoss(partitionedModel, 'LossFun', 'ClassifError');
The above code first trains a classifier on the data in matrix X using the class labels in vector Y. The trainedClassifier is then passed to the function crossval(). My question is very simple. Does this line of code
partitionedModel = crossval(trainedClassifier, 'KFold', 10);
divide the matrix X into ten folds, train on 9 folds, and test on the remaining fold, repeating this 10 times so that each fold serves as the test set once? Or does it simply take the trainedClassifier that was trained in the previous line on the whole matrix X and test it on each fold? I ask because I can see fitcnb being called only once. Does the function crossval() retrain the model internally? If it doesn't, then the training is being done on the whole data instead of on the 9 folds in each iteration, which is not how cross-validation is defined.
Fellow members of the community, I will be highly obliged if this doubt of mine can be cleared. Thanking you in anticipation.
  3 comments
Raghav G Raghav G
Raghav G Raghav G on 22 Oct 2018
I too have the same question. Did anyone find the answer?
Fulin Wei on 28 Dec 2018
I have the same question. Do you have any answer now?


Answers (3)

Don Mathis on 30 Nov 2018
The answer is that it divides the dataset into 10 folds and trains the model 10 times, each time on 9 folds, using the remaining fold as the test set. The only information taken from 'trainedClassifier' is the set of hyperparameter values, which are reused in each of the 10 trainings. 'fitcnb' is not called 10 times; 'ClassificationNaiveBayes.fit' is.
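For anyone who wants to check this for themselves, here is a minimal sketch that runs crossval and a hand-written 10-fold loop on the same partition. It uses the fisheriris sample data as a stand-in for the X and Y in the question (an assumption, since the original data is not shown), and the variable names cvp, foldModel, builtinLoss, and manualLoss are made up for this illustration. The two error estimates should come out very close, though I would not rely on them matching to the last digit.

```matlab
% Sample data standing in for the question's X and Y (assumption).
load fisheriris
X = meas;
Y = species;

rng(1);                                  % make the random partition repeatable
cvp = cvpartition(Y, 'KFold', 10);       % stratified 10-fold partition

% Built-in route: crossval retrains the model on each training set.
trainedClassifier = fitcnb(X, Y);
partitionedModel  = crossval(trainedClassifier, 'CVPartition', cvp);
builtinLoss = kfoldLoss(partitionedModel, 'LossFun', 'classiferror');

% Manual route: train on 9 folds, test on the held-out fold, 10 times.
numWrong = 0;
for k = 1:cvp.NumTestSets
    trainIdx  = training(cvp, k);        % logical index of the 9 training folds
    testIdx   = test(cvp, k);            % logical index of the held-out fold
    foldModel = fitcnb(X(trainIdx, :), Y(trainIdx));
    predicted = predict(foldModel, X(testIdx, :));
    numWrong  = numWrong + sum(~strcmp(predicted, Y(testIdx)));
end
manualLoss = numWrong / numel(Y);        % overall misclassification rate

fprintf('crossval: %.4f, manual loop: %.4f\n', builtinLoss, manualLoss);
fprintf('models stored in partitionedModel.Trained: %d\n', ...
        numel(partitionedModel.Trained));
```

The Trained property holds one compact model per fold, which is a direct way to see that crossval did not simply reuse the classifier fitted on the whole of X.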
  11 comments
Don Mathis on 17 Jan 2019
Those are fit as part of the normal fitting process.
seung ho yeom on 1 Feb 2019
Okay, now I fully understand. Thank you.



fatemeh ghorbani on 3 Dec 2017
Did you find an answer?

James Ratti on 21 Oct 2018
Any answers??
