Why is the accuracy of a classifier trained with the function generated by Classification Learner lower than that of the model exported directly from the app?

load("savedPumpData.mat");
disp(pumpData);
Data = removevars(pumpData,"flow");
save("Data.mat","Data");
disp(Data)
trainRatio = 0.7;
% Create a random partition of the data into training and test sets
c = cvpartition(size(Data, 1), 'HoldOut', 1 - trainRatio);
% Create the training and test sets
trainingData = Data(c.training, :);
testData = Data(c.test, :);
[featureTableTrain,outputTable0] = Features(trainingData);
disp(featureTableTrain)
[trainedClassifier, validationAccuracy] = BagTrees(featureTableTrain);
[featureTableTest,outputTable] = Features(testData);
disp(featureTableTest)
[yfit,scores] = BaggedTrees.predictFcn(featureTableTest);
disp(yfit);
accuracy = sum(yfit==testData.faultCode)/numel(testData.faultCode)*100;
fprintf('Accuracy: %.2f%%\n', accuracy);
figure;
confusionchart(testData.faultCode, yfit);
title('Confusion Matrix RF');
[yfit1,scores1] = trainedClassifier.predictFcn(featureTableTest);
disp(yfit1);
accuracy = sum(yfit1==testData.faultCode)/numel(testData.faultCode)*100;
fprintf('Accuracy: %.2f%%\n', accuracy);
figure;
confusionchart(testData.faultCode, yfit1);
title('Confusion Matrix');
% Features is the function generated with Diagnostic Feature Designer.
% BaggedTrees is the model exported to the workspace from Classification Learner; it reaches 90% accuracy.
% BagTrees is the training function generated for the same model; the classifier it returns reaches 70%.
1 Comment
Vinay Maruvada on 19 Oct 2023
I have a dataset of 240 rows in total, which I split as shown in the code above.
I imported featureTableTrain into Classification Learner for training and featureTableTest for testing.


Accepted Answer

Drew on 18 Oct 2023
Based on what you sent, it looks like the short answer is that the model exported from Classification Learner was trained on all of the data (100%), while the model trained with the training function was trained with 70% of the data.
The final model Classification Learner exports is always trained using the full data set, excluding any data reserved for testing (See https://www.mathworks.com/help/stats/export-classification-model-for-use-with-new-data.html ). If you don't want Classification Learner to use the holdout validation data when training its final model for export, then do the following:
  • Start the Classification Learner session by loading only the training data (70%). Choose whichever validation scheme you would like to use within this 70% of data.
  • After the session is started, load the remaining 30% of the data as the test set.
  • Then, when the final model is exported, it will be trained on only 70% of the data.
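The effect described above can be reproduced in a script. The following is a minimal sketch on synthetic data, not the pump data from the question: the feature matrix X, the labels y, and the variable names mdl70 and mdl100 are assumptions made for illustration. It trains one bagged-tree ensemble on the 70% training rows only and another on all rows, then scores both on the same 30% held-out rows. The second model has already seen the test rows, so its test accuracy is typically optimistic.

```matlab
rng(0);                                   % reproducible partition and bagging
n = 240;                                  % same row count as in the question
X = randn(n, 4);                          % synthetic features (assumption)
y = double(X(:,1) + 0.5*randn(n,1) > 0);  % noisy synthetic labels (assumption)

% 70/30 hold-out split, as in the question
c = cvpartition(n, 'HoldOut', 0.3);
Xtrain = X(training(c), :);  ytrain = y(training(c));
Xtest  = X(test(c), :);      ytest  = y(test(c));

% Model trained on the 70% only (what the generated training function does)
mdl70  = fitcensemble(Xtrain, ytrain, 'Method', 'Bag');
% Model trained on all rows, including the rows later used for testing
mdl100 = fitcensemble(X, y, 'Method', 'Bag');

acc70  = mean(predict(mdl70,  Xtest) == ytest) * 100;
acc100 = mean(predict(mdl100, Xtest) == ytest) * 100;
fprintf('Trained on 70%%:  %.2f%%\n', acc70);
fprintf('Trained on 100%%: %.2f%%\n', acc100);
```

Only acc70 is an honest estimate of performance on new data, because the test rows played no part in training mdl70.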
When exporting the model, if you check the box to "Include training data in the exported model", then you can take a look at the size of the training data by examining the properties of the exported model. For example, if the exported trainedModel is an ensemble of trees, take a look at:
size(trainedModel.ClassificationEnsemble.X)
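As a sketch of how that check could be used, the snippet below compares the number of rows stored in the model with the size of the training partition. It is self-contained, so it builds a stand-in for the exported structure: the struct trainedModel with a ClassificationEnsemble field mimics the export only for this illustration, and the synthetic X and y are assumptions. With a real exported model, only the last few lines apply.

```matlab
rng(1);                                   % reproducible split
n = 240;
X = randn(n, 3);                          % synthetic features (assumption)
y = double(X(:,1) > 0);                   % synthetic labels (assumption)
c = cvpartition(n, 'HoldOut', 0.3);

% Stand-in for the struct exported from Classification Learner
trainedModel.ClassificationEnsemble = ...
    fitcensemble(X(training(c), :), y(training(c)), 'Method', 'Bag');

% Number of observations the stored model was actually trained on
nUsed = size(trainedModel.ClassificationEnsemble.X, 1);
fprintf('Model was trained on %d of %d rows\n', nUsed, n);

if nUsed == c.TrainSize
    disp('Model was trained on the training partition only.');
else
    disp('Model was trained on a different number of rows than the training partition.');
end
```

If the row count of the exported model equals the full data set rather than the training partition, the test rows were part of training and the reported test accuracy is inflated.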
If this answer helps you, please remember to accept the answer.

Release

R2023a
