How to compare matrices with different dimensions?

I wrote code for classification using 5 classifiers, with voting at the end. This code does the initial definition of the train and test data:
clear all
close all
clc
load liver.mat;
data=Liver;
[n,m]=size(data);
rows=(1:n);
test_count=floor((1/6)*n);
sum_ens=0;sum_result=0;
test_rows=randsample(rows,test_count);
train_rows=setdiff(rows,test_rows);
test=data(test_rows,:);
train=data(train_rows,:);
xtest=test(:,1:m-1);
ytest=test(:,m);
xtrain=train(:,1:m-1);
ytrain=train(:,m);
I put the result of each classifier in out1 to out5, then aggregate them in output and compare output with the test labels:
out1 = majorityvote(tt(1,:));
out2 = majorityvote(tt(2,:));
out3 = majorityvote(tt(3,:));
out4 = majorityvote(tt(4,:) );
out5 = majorityvote(tt(5,:) );
output=[out1,out2,out3,out4,out5];
for i=1:test_count
if(output(i)==1 && ytest(i)==1)
tp_ens=tp_ens+1;
end
if(output(i)==0 && ytest(i)==0)
tn_ens=tn_ens+1;
end
if(output(i)==0 && ytest(i)==1)
fp_ens=fp_ens+1;
end
if(output(i)==1 && ytest(i)==0)
fn_ens=fn_ens+1;
end
end
This code doesn't have any problems with other datasets in this part, but for the liver data (attached) it shows this error:
Index exceeds matrix dimensions.
Error in pimaclassify_new (line 174)
if(output(i)==1 && ytest(i)==1)
Maybe this is because, for the other dataset, test_count came out as 5 for its number of rows, so comparing the five outputs with the test labels worked. The liver data has a different number of rows, so the same comparison raises an error.
Should I change the sizes of outputs of classifiers or the size of test data or test_count?
I'll be grateful to have your opinions about fixing the error.
Thanks

6 comments

dpb on 22 June 2019
"Should I change the sizes of outputs of classifiers or the size of test data or test_count?"
How can we possibly know?
I'd only say that naming variables with numeric postfixes and manually catenating those variables into a composite array is simply a wrong way to approach coding your problem. Use cell arrays or named struct fields or some other dynamic way to handle variably-sized input data instead.
Hardcoding values for loops and the like is bound to lead to such mismatches when the precise data formats aren't present for every case; writing generic code to handle such will save you much grief going forward.
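As a rough illustration of that advice, here is a minimal sketch that keeps each classifier's predictions in a cell array and derives every size from the data instead of hardcoding 5. The prediction values and the names `preds_cell`, `preds`, and `ensemble` are hypothetical, and the built-in `mode` stands in for the poster's `majorityvote`, whose behavior is not shown in the thread.

```matlab
% Hypothetical predictions: one row vector per classifier, one entry per test sample.
preds_cell = { [1 0 1 1 0 0 1], ...   % classifier 1
               [1 1 1 0 0 0 1], ...   % classifier 2
               [0 0 1 1 0 1 1] };     % classifier 3

% Stack into a (classifiers x test samples) matrix; sizes follow the data.
preds = vertcat(preds_cell{:});
num_classifiers = size(preds, 1);
num_test = size(preds, 2);

% Majority vote per test sample: most frequent label in each column.
ensemble = mode(preds, 1);
```

With this layout `ensemble` always has one element per test sample, so a loop over `1:num_test` cannot run past its end, whatever the dataset size.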
phdcomputer Eng on 24 June 2019
Edited: phdcomputer Eng on 24 June 2019
Thank you very much.
In fact, I compared the first row of the 5 classifiers with 5 test labels. When I used another dataset, test_count came out as 5 (because test_count=floor((1/6)*n); and n, the number of rows, was 32).
My question is: when I load different datasets, the number of rows differs, so naturally test_count differs too, and comparing that many labels with the first row of the results of the five classifiers doesn't seem right. But I don't know how to change this part using cell arrays or structs.
Generally, I want to compare the first row of multiple classifiers with ytest (the test labels), and the number of elements in ytest may be larger or smaller (depending on the dataset).
As you said, I should use variably-sized input data, but I don't have any idea how to define it.
Could you tell me the solution that you think will fix this problem?
I'd be very grateful for your opinions.
What dpb is referring to with the variable size output is to change your definition of output to be something more like this.
output(1) = majorityvote(tt(1,:));
output(2) = majorityvote(tt(2,:));
output(3) = majorityvote(tt(3,:));
output(4) = majorityvote(tt(4,:) );
output(5) = majorityvote(tt(5,:) );
As for the original indexing issue, I'm not entirely sure how to fix your code, but I can describe the problem in greater detail.
You are getting 'index exceeds matrix dimensions' for this command:
if(output(i)==1 && ytest(i)==1)
I can clearly tell that output has five elements, while 'i' runs over 1:test_count, which is in turn based on the number of rows in your data. I do not know the exact number of rows in ytest, but I can see that it is also based on the size of data.
Therefore, I would assume that output(i) is causing the problem when i>5, and you need to either reexamine how many elements you have in output, or find a different way of calculating the range of i.
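One way to make this kind of mismatch visible is to check the lengths before comparing, then count the outcomes with logical indexing rather than a loop over test_count. This is only a sketch: the label values are hypothetical, and it assumes output is meant to hold one prediction per test sample. It also uses the usual definitions (false positive = predicted 1, actual 0), which differ from the fp_ens/fn_ens naming in the posted loop.

```matlab
% Hypothetical example values; in the real script these come from the classifiers.
ytest  = [1; 0; 1; 0; 0; 1];   % true labels, one per test sample
output = [1; 0; 0; 0; 1; 1];   % ensemble predictions, one per test sample

% Fail early with a readable message if the lengths differ.
assert(numel(output) == numel(ytest), ...
    'output has %d elements but ytest has %d.', numel(output), numel(ytest));

% Vectorized confusion counts; no index can exceed either vector.
tp = sum(output == 1 & ytest == 1);   % predicted 1, actual 1
tn = sum(output == 0 & ytest == 0);   % predicted 0, actual 0
fp = sum(output == 1 & ytest == 0);   % predicted 1, actual 0
fn = sum(output == 0 & ytest == 1);   % predicted 0, actual 1
```

If output really had only five elements while ytest had more, the assert would stop the script with both sizes in the message instead of failing inside the loop.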
I strongly suspect that
output(1) = majorityvote(tt(1,:));
would result in an error. Since majorityvote, whatever that is, is indexed by a row vector, presumably with more than 1 element (otherwise why the : ?), it is very likely to return more than one element.
Now, depending on the shape of majorityvote and tt, the concatenation of [out1, out2, ...] is going to return a 2D matrix (if majorityvote is a column vector or a matrix), or a row vector (if majorityvote is a row vector).
Assuming the former, then this would probably work:
output = zeros(size(tt, 1), 5);
output(:, 1) = majorityvote(tt(1, :));
output(:, 2) = ...
which is simply the one liner
output = majorityvote(tt); %assuming tt has 5 columns, otherwise tt(1:5, :)
You bring up a good point about the size of the output of majorityvote; I should not have assumed it was a single element. With that said, though, why would the result be a single column vector when the input is a single row vector? It would make sense with the original concatenation of output, but it seems like a confusing way to write the function. Also, do we know that majorityvote(tt) will automatically consider the rows individually, or will it simply take the entire array as a single large input?
Guillaume on 24 June 2019
If majorityvote is a vector, since tt(row, :) is a vector (column or row doesn't matter), then majorityvote(tt(row, :)) will be the same shape as majorityvote. A(B) is the shape of A when both A and B are vectors.
If majorityvote is not a vector, then majorityvote(tt(row, :)) will be the same shape as tt(row, :), hence a row vector. A(B) is the shape of B when either A or B is not a vector.
An annoying or useful inconsistency depending on your point of view.
A corollary of the above is that output = majorityvote(tt) will be the same size as tt, with output(r, c) equal to majorityvote(tt(r, c))
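These shape rules can be checked on small arrays. The sketch below uses made-up variables (`A_row`, `A_mat`, and the `idx_*` indices) and applies only if majorityvote is a variable being indexed; if it is a function, the shape of its result depends entirely on how the function is written.

```matlab
A_row   = [10 20 30 40];   % row vector
A_mat   = magic(4);        % 4-by-4 matrix
idx_row = [1 3];           % row-vector index
idx_col = [1; 3];          % column-vector index
idx_mat = [1 2; 3 4];      % matrix index

r1 = A_row(idx_col);   % both vectors: orientation follows A_row
r2 = A_mat(idx_row);   % A is a matrix: shape follows the index
r3 = A_mat(idx_col);   % A is a matrix: shape follows the index
r4 = A_row(idx_mat);   % index is a matrix: shape follows the index

disp(size(r1)); disp(size(r2)); disp(size(r3)); disp(size(r4));
```

Here r1 stays a row even though it was indexed with a column, while r2, r3, and r4 each take the shape of their index.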
