MATLAB Answers

How to create a confusion matrix.

sreelekshmi ms on 9 Mar 2020
Commented: sreelekshmi ms on 10 Mar 2020
How do I create a confusion matrix for clustering? How can I obtain the predicted class and the actual class, and how do I get the TP, TN, FP and FN values from it? I am confused; please help me.
Thanks in advance.

  2 Comments

Benjamin Großmann on 9 Mar 2020
  • Create the confusion matrix with (the first output of confusion is the misclassification rate, the second is the matrix)
[~,conf_mat] = confusion(targets,outputs);
  • The predicted class (outputs) is the result of your net for a given input (after training); the "actual class" is your label (targets)
  • We can now compare target and output for each sample and for one class "c" against all others:
  • target == c & output == c --> TP (The output is positive and that is true)
  • target == c & output ~= c --> FN (The output is negative but that is false)
  • target ~= c & output == c --> FP (The output is positive but that is false)
  • target ~= c & output ~= c --> TN (The output is negative and that is true)
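The four comparisons above can be written directly with logical masks. This is only a minimal sketch: the label vectors `targets` and `outputs` below are made-up illustrative data, not taken from any data set in this thread, and the one-vs-rest class `c` is chosen arbitrarily.

```matlab
% Hypothetical label vectors (illustrative values only)
targets = [1 1 2 2 3 3 3 1];   % actual classes
outputs = [1 2 2 2 3 1 3 1];   % predicted classes
c = 2;                         % class treated as "positive", all others as "negative"

isTrueC = (targets == c);      % samples whose actual class is c
isPredC = (outputs == c);      % samples predicted as c

TP = sum( isTrueC &  isPredC); % actual c, predicted c
FN = sum( isTrueC & ~isPredC); % actual c, predicted something else
FP = sum(~isTrueC &  isPredC); % actual not c, predicted c
TN = sum(~isTrueC & ~isPredC); % actual not c, predicted not c
```

The four counts always add up to the number of samples, which is a quick sanity check.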
sreelekshmi ms on 9 Mar 2020
I also used confusionmat(). How can I get the TP, TN, FP and FN values in MATLAB?


Answers (1)

Benjamin Großmann on 9 Mar 2020
Let's use the cifar10 demo included in MATLAB for your question
clearvars
close all
clc
load('Cifar10Labels.mat','trueLabels','predictedLabels');
[m,order] = confusionmat(trueLabels,predictedLabels);
figure
cm = confusionchart(m,order);
Please look at the confusionchart and consider the following for one particular class c,
  • The diagonal element of class c is the amount of TP
  • Everything inside the predicted class column except the diagonal element is falsely predicted as class c --> FP
  • Everything inside the true class row except the diagonal element is of class c but not predicted as c --> FN
  • Every element that lies neither in the row nor in the column of class c is TN (true class is not c and not predicted as c). Note that this includes the off-diagonal elements among the other classes, not only their diagonal elements.
Now we can easily put this into code by summing up the row, the column and all elements, and subtracting as needed. Let's pick a class, e.g. c=2 "automobile", and calculate TP, FP, FN, TN for that class
c = 2;
TP = cm.NormalizedValues(c,c) % true class is c and predicted as c
FP = sum(cm.NormalizedValues(:,c))-TP % predicted as c, true class is not c
FN = sum(cm.NormalizedValues(c,:))-TP % true class is c, not predicted as c
TN = sum(cm.NormalizedValues(:))-TP-FP-FN % true class is not c, not predicted as c
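The same row and column sums can be taken for all classes at once. The sketch below assumes a small hypothetical 3-class matrix `m` (rows are true classes, columns are predicted classes, the layout `confusionmat` returns) and takes TN as the total minus TP, FP and FN, so that the four counts of each class add up to the number of samples.

```matlab
% Hypothetical confusion matrix: rows = true class, columns = predicted class
m = [5 1 0;
     2 6 1;
     0 3 7];

TP = diag(m);                   % diagonal: correctly predicted per class
FP = sum(m,1).' - TP;           % column sum minus diagonal
FN = sum(m,2)   - TP;           % row sum minus diagonal
TN = sum(m(:)) - TP - FP - FN;  % everything outside row c and column c

% Sanity check: per class, the four counts cover every sample exactly once
assert(all(TP + FP + FN + TN == sum(m(:))));
```

Each result is a column vector with one entry per class, so `TP(2)` is the value for class 2.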

  3 Comments

Benjamin Großmann on 9 Mar 2020
In addition, here is an approach to get the confusion matrices for each single class, and therefore their TP, FP, FN, TN values
clearvars
close all
clc
load('Cifar10Labels.mat','trueLabels','predictedLabels');
[m,order] = confusionmat(trueLabels,predictedLabels);
figure
cm = confusionchart(m,order);
cmv = cm.NormalizedValues;
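% 2x2 confusion matrix for a single class c, laid out as [TP, FP; FN, TN]
conf_mat_sc = @(c) [cmv(c,c), sum(cmv(:,c))-cmv(c,c); ...
sum(cmv(c,:))-cmv(c,c), sum(cmv(:))-sum(cmv(:,c))-sum(cmv(c,:))+cmv(c,c)];
% one 2x2 matrix per class, collected in a cell array
cms = arrayfun(conf_mat_sc, 1:numel(order), 'UniformOutput', false);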
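Because the one-vs-rest counts are defined per class, any metric built from them varies with c as well. One common class-independent figure is the overall accuracy, i.e. the sum of the diagonal divided by the total number of samples. A minimal sketch, again on a hypothetical 3-class matrix `m` rather than the cifar10 data:

```matlab
% Hypothetical confusion matrix: rows = true class, columns = predicted class
m = [5 1 0;
     2 6 1;
     0 3 7];

total = sum(m(:));

% Overall accuracy: correctly classified samples over all samples (one number)
overallAccuracy = trace(m) / total;

% One-vs-rest accuracy (TP+TN)/(TP+TN+FP+FN), one value per class
TP = diag(m);
FP = sum(m,1).' - TP;
FN = sum(m,2)   - TP;
TN = total - TP - FP - FN;
perClassAccuracy = (TP + TN) / total;
```

Which of the two to report depends on the question being asked; the overall value does not depend on any choice of c.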
sreelekshmi ms on 9 Mar 2020
Thank you Sir.
I have a doubt: the confusion matrix changes with the value of c. If I compute the accuracy,
A = (TP+TN)/(TP+TN+FP+FN), it also changes with c. How can I get a single, fixed accuracy from this?
sreelekshmi ms on 10 Mar 2020
Sir, I tried this on a different data set and got an error like:
" Index in position 1 exceeds array bounds (must not exceed 2).
Error in shpa (line 68)
TP = cm.NormalizedValues(c,c) ; "
clc;
clear;
data=xlsread('Ecolixl.xlsx');
minpts=6;
epsilon=2;
[idx, corepts] = dbscan(data,epsilon,minpts);
nearEnough = 0.02;
x = data(:,1);
y = data(:,2);
indexesToKeep = false(1, length(x));
for k = 1 : length(x)
    distances = sqrt((x(k) - x).^2 + (y(k) - y).^2);
    if sum(distances > nearEnough) >= 5
        indexesToKeep(k) = true;
    end
end
x = x(indexesToKeep);
y = y(indexesToKeep);
P=[x y];
dist2 = (data(:,1) - P(:,1).').^2 + (data(:,2) - P(:,2).').^2;
[~,id] = mink(distances,20,1);
clusters = data(id);
maximum_num_clusters = 7;
Z = linkage(clusters, 'average');
id= cluster(Z, 'Maxclust', maximum_num_clusters);
figure()
dendrogram(Z)
uni=length(Z);
outl=rmoutliers(Z);
I=data(:,4);
X= -ones(748,1);
B=[-ones(length(I) - length(id),1);id];
Si=[id;X];
% load('Cifar10Labels.mat','trueLabels','predictedLabels');
[m,order] = confusionmat(I,idx);
figure
cm = confusionchart(m,order);
c = 3;
TP = cm.NormalizedValues(c,c) ;
FP = sum(cm.NormalizedValues(:,c))-TP ;
FN = sum(cm.NormalizedValues(c,:))-TP ;
TN = sum(diag(cm.NormalizedValues))-TP;
A=(TP+TN)/(TP+TN+FP+FN)*100;
How can I get this? Please help me.
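The error message says that cm.NormalizedValues has only two rows in that run, so indexing it with c = 3 is out of range. Separately, cluster indices are arbitrary numbers and do not correspond to class labels by themselves. The sketch below shows one possible way to handle both points, under stated assumptions: the vectors `actual` and `cluster` are made-up data, noise points are marked -1 as dbscan does, and each cluster is mapped to the majority class of its members. This majority-vote mapping is an assumption chosen for illustration, not the method used in the code above.

```matlab
% Hypothetical data: actual class labels and cluster indices (-1 = noise)
actual  = [1 1 1 2 2  2 3 3 3 3];
cluster = [2 2 2 1 1 -1 3 3 1 3];

% Map each cluster to the most frequent actual class among its members
predicted = zeros(size(actual));          % 0 = left unassigned (noise)
for k = unique(cluster(cluster > 0))
    members = (cluster == k);
    predicted(members) = mode(actual(members));
end

% Confusion matrix over the actual classes (unassigned points are not counted)
classes = unique(actual);
n = numel(classes);
m = zeros(n);
for i = 1:n
    for j = 1:n
        m(i,j) = sum(actual == classes(i) & predicted == classes(j));
    end
end

% Guard the class index before using it
c = 3;
assert(c <= size(m,1), 'c exceeds the number of classes in the confusion matrix.');
TP = m(c,c);
FP = sum(m(:,c)) - TP;
FN = sum(m(c,:)) - TP;
TN = sum(m(:)) - TP - FP - FN;
```

Checking size(m) before picking c shows immediately how many classes the matrix actually contains.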
