SVM accuracy is different when done inside for loop than when done individually

Hello,
I'm working on classifying motions from EMG signals with SVM. I have 5 subjects, and I'm trying to get the accuracy for each subject. My data for each subject is a 34290 x 16 array. I stored all the subject data in a 3D array, "data", where data(:, :, 1) is a 34290 x 16 array for the first subject, data(:, :, 2) is the data for the second subject, and so on.
I'm running SVM on each subject with this loop:
for i = 1:5 % num subjects
    thisSub = data(:, :, i);
    % rescale the data
    colMin = min(thisSub); colMax = max(thisSub);
    scaledData = (thisSub - colMin) ./ (colMax - colMin);
    % shuffle the data
    shuffle = randperm(size(scaledData,1));
    scaledData = scaledData(shuffle,:); labels = labels(shuffle,:);
    % create the SVM model
    t = templateSVM('KernelFunction','gaussian');
    Mdl = fitcecoc(scaledData,labels,'Learners',t, 'coding', 'onevsall', 'CrossVal', 'on', 'kfold' , 5);
    % calculate accuracy
    accuracy(i, :) = (1 - kfoldLoss(Mdl)) * 100
end
If I run each subject one-by-one, it runs relatively quickly and performs pretty well for all subjects. However, if I run all subjects in this loop, the accuracy for the first subject matches what I get when I do it individually, but the accuracy for the rest of the subjects is very low. It also takes much longer for each subject than when I do it one-by-one. When I run it one-by-one, I still use this loop; I just set the index to 1, 2, 3, 4, 5 in turn, so I'm not resetting anything this way that wouldn't get reset when it all gets run at once in the loop.
Why could this be happening?

Accepted Answer

Aditya Patil on 20 Nov 2020
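You need to set the random number seed. Without it, the random choices made inside fitcecoc (for example, how observations are assigned to cross-validation folds) change on each execution. The following code reseeds before every fit and checks that a fit inside the loop matches the same fit run on its own: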
rng(123);
data = rand(250, 16, 5);
labels = randi([0,1], [250, 1]);
losses = zeros(5, 1); % one loss per subject
for i = 1:5 % num subjects
    rng(123); % reseed so every fit starts from the same random state
    thisSub = data(:, :, i);
    % create the SVM model
    t = templateSVM('KernelFunction','gaussian');
    Mdl = fitcecoc(thisSub,labels,'Learners',t, 'coding', 'onevsall', ...
        'CrossVal', 'on', 'kfold' , 5);
    % calculate the cross-validation loss
    losses(i) = kfoldLoss(Mdl);
    clear t Mdl;
end
% repeat subject 3 outside the loop with the same seed
rng(123);
thisSub = data(:, :, 3);
% create the SVM model
t = templateSVM('KernelFunction','gaussian');
Mdl = fitcecoc(thisSub,labels,'Learners',t, 'coding', 'onevsall', ...
    'CrossVal', 'on', 'kfold' , 5);
% calculate the cross-validation loss
loss2 = kfoldLoss(Mdl);
losses(3) - loss2 % should be zero
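The effect of reseeding can also be seen without any toolbox functions. The following is a minimal sketch using only rng and randperm; the variable names p1, p2, and p3 are illustrative and do not come from the thread.

```matlab
% Reseeding restores the generator state, so the same draw is repeated.
rng(123);
p1 = randperm(10);

rng(123);
p2 = randperm(10);   % same seed, same permutation as p1

p3 = randperm(10);   % no reseed: the generator has advanced past the state used for p2

sameAfterReseed = isequal(p1, p2);   % true
disp(sameAfterReseed);
```

Any function that draws from the global random stream is affected in the same way, which is why the loop above calls rng(123) before each fit.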
1 Comment
Ashley Rice on 22 Nov 2020
Thank you! I had wondered if it was something like that, but I didn't know how to go about the random number seed. I ended up adding in a line to clear out the variables defined in the loop at the end so that fitcecoc was just getting redefined every time. I'll go ahead and accept your answer though, because I think that's a better approach.
I will also add that I was forgetting to update the labels, so that was also a factor when running it all together in the loop.
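The label issue can be sketched as follows. In the loop as posted in the question, labels is overwritten with its shuffled version on every pass, while thisSub is re-read from data in its original row order, so from the second subject onward the rows and labels no longer line up. The sketch below is one possible way to keep them aligned. It assumes a single label vector shared by all subjects, and the names origLabels, shuffledData, and shuffledLabels, as well as the synthetic data, are hypothetical stand-ins rather than the asker's actual fix.

```matlab
% Synthetic stand-ins for the real EMG data (assumed shapes, much smaller than the original)
rng(123);
numSamples = 200; numFeatures = 16; numSubjects = 5;
data = rand(numSamples, numFeatures, numSubjects);
origLabels = randi([1, 4], [numSamples, 1]);   % hypothetical shared label vector

for i = 1:numSubjects
    thisSub = data(:, :, i);

    % draw a fresh permutation and apply it to copies, leaving origLabels untouched
    shuffle = randperm(numSamples);
    shuffledData = thisSub(shuffle, :);
    shuffledLabels = origLabels(shuffle, :);

    % check: undoing the permutation recovers the original rows and labels,
    % so every shuffled row still carries its own label
    [~, inverse] = sort(shuffle);
    assert(isequal(shuffledData(inverse, :), thisSub));
    assert(isequal(shuffledLabels(inverse), origLabels));
end
```

Inside the loop, shuffledData and shuffledLabels would then be passed to fitcecoc in place of scaledData and labels. If each subject has its own label vector, the same pattern applies with the per-subject labels selected at the top of the loop.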
