Why is the MSE 0.00 for three different data sets?

Sanchit on 10 Jul 2023
Answered: Menika on 10 Jul 2023
Please let me know what is wrong in the MATLAB code given below, because it reports a mean squared error of 0.00 for three different data sets.
clear, clc, close all
data = readtable('c:/matlab/study_data.csv');
X = data(:, 1:end-1); % Select all columns except the last one
y = data(:, end); % Select the last column
numGroundTruth = numel(y);
numTrainingSamples = round(0.8 * numGroundTruth);
trainingIndexes = randsample(numGroundTruth, numTrainingSamples);
testIndexes = setdiff((1:numGroundTruth)', trainingIndexes);
X_train = X(trainingIndexes, :);
X_test = X(testIndexes, :);
y_train = y(trainingIndexes, :);
y_test = y(testIndexes, :);
% Create a Random Forest classifier
rf_classifier = TreeBagger(100, table2array(X_train), table2array(y_train), 'OOBPrediction', 'On');
predicted = predict(rf_classifier, table2array(X_train));
YY = categorical(predicted);
ZZ = str2double(cellstr(YY));
Z = table2array(y_train);
oob_mse = immse(Z, ZZ);
disp(sprintf('Out-of-Bag Mean Square Error: %.4f', oob_mse));
Thanks for your kind help.
Sanchit

Answers (1)

Menika on 10 Jul 2023
Hi,
A likely problem with the code above is that predicted is computed from the training data X_train rather than the test data X_test. Because the MSE is then measured on the same data the model was trained on, it comes out at or near zero regardless of the data set. Try replacing X_train with X_test when computing predicted, and compare the result against y_test instead of y_train.
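To make the difference concrete, here is a minimal sketch that compares the error on the training data, on the held-out test data, and on the out-of-bag predictions. It is an illustration under assumptions, not the original script: the synthetic X and y are a hypothetical stand-in for study_data.csv, the helper toNum is introduced only for this example, and the MSE is computed by hand rather than with immse so that only the Statistics and Machine Learning Toolbox is needed.

```matlab
% Sketch: training vs. held-out vs. out-of-bag error for a TreeBagger classifier.
% The synthetic data below is a hypothetical stand-in for study_data.csv.
rng(0);                                        % reproducible split and forest
n = 200;
X = randn(n, 3);                               % three made-up numeric predictors
y = double(X(:, 1) + 0.5 * randn(n, 1) > 0);   % made-up noisy 0/1 labels

% 80/20 split, following the approach in the question
numTrain = round(0.8 * n);
trainIdx = randsample(n, numTrain);
testIdx  = setdiff((1:n)', trainIdx);

rf = TreeBagger(100, X(trainIdx, :), y(trainIdx), ...
    'Method', 'classification', 'OOBPrediction', 'on');

% For classification, predict and oobPredict return cell arrays of label text,
% so convert them back to numbers before comparing with y.
toNum = @(labels) str2double(labels);

trainPred = toNum(predict(rf, X(trainIdx, :)));   % same data the forest was fit on
testPred  = toNum(predict(rf, X(testIdx, :)));    % data the forest has never seen
oobPred   = toNum(oobPredict(rf));                % each row scored only by trees that did not train on it

trainMse = mean((y(trainIdx) - trainPred).^2);
testMse  = mean((y(testIdx)  - testPred).^2);
oobMse   = mean((y(trainIdx) - oobPred).^2);

fprintf('Training MSE: %.4f\n', trainMse);
fprintf('Test MSE:     %.4f\n', testMse);
fprintf('OOB MSE:      %.4f\n', oobMse);
```

Since the forest in the question is already built with 'OOBPrediction' set to 'On', oobPredict may be the closer match to the "Out-of-Bag" label in the original disp call, while the test-set figure is what the suggested X_test change measures. With 0/1 labels, the MSE computed this way equals the misclassification rate.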
Hope it helps!


Release

R2023a
