MATLAB Answers


Why is accuracy so much lower when using fitcecoc() compared to trainImageCategoryClassifier()?

Asked by Thomas Bloomfield on 20 Mar 2018
Latest activity Answered by Prajith Chilummula on 4 Apr 2018
I am trying to use bag of words with fitcecoc() (multiclass SVM) to reproduce results similar to those obtained with the Image Category Classifier (as seen in the documentation: https://uk.mathworks.com/help/vision/examples/image-category-classification-using-bag-of-features.).
% Code from documentation
bag = bagOfFeatures(trainingSet); % create bag of features from trainingSet (an image datastore)
categoryClassifier = trainImageCategoryClassifier(trainingSet, bag);
confMatrix = evaluate(categoryClassifier, validationSet);
This returns an accuracy of ~98% on the validation set.
However, when I pass the histogram of visual word occurrences into the multiclass SVM classifier, the accuracy is only ~2.5%.
SVM_SURF = fitcecoc(trainFeatures,trainingSet.Labels);
bag = bagOfFeatures(validationSet);
featureMatrix = encode(bag, validationSet); % histogram of visual word occurrences
[pred score cost] = predict(SVM_SURF, featureMatrix)
accuracy = sum(validationSet.Labels == pred)/size(validationSet.Labels,1);
accuracy
Is there an obvious reason as to why the accuracy is so much lower when bag of words is passed into fitcecoc() rather than trainImageCategoryClassifier()?


1 Answer

Answer by Prajith Chilummula on 4 Apr 2018

Hi Thomas,
I tried bag of words with fitcecoc() by modifying the Image Category Classifier example code and got around 90% accuracy. The code is below:
url = 'http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz';
outputFolder = fullfile(tempdir, 'caltech101'); % define output folder
if ~exist(outputFolder, 'dir') % download only once
    disp('Downloading 126MB Caltech101 data set...');
    untar(url, outputFolder);
end
rootFolder = fullfile(outputFolder, '101_ObjectCategories');
categories = {'airplanes', 'ferry', 'laptop'};
imds = imageDatastore(fullfile(rootFolder, categories), 'LabelSource', 'foldernames');
[trainingSet, validationSet] = splitEachLabel(imds, 0.3, 'randomize'); % 30% of each label for training
bag = bagOfFeatures(trainingSet); % vocabulary built from the training set only
trainFeatures = encode(bag, trainingSet); % one histogram row per training image
SVM_SURF = fitcecoc(trainFeatures, trainingSet.Labels);
featureMatrix = encode(bag, validationSet); % same bag, so the histogram bins match the training features
[pred, score, cost] = predict(SVM_SURF, featureMatrix);
accuracy = sum(validationSet.Labels == pred)/size(validationSet.Labels,1);
accuracy
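When comparing this number with the output of evaluate(), it may help to compute the same kind of metric. The accuracy line above is the overall fraction of correct predictions, whereas the documentation example averages the diagonal of the confusion matrix returned by evaluate(), which appears to be row-normalized and would therefore give the mean per-class accuracy. The sketch below uses made-up labels (not Caltech101 results) and assumes confusionmat from Statistics and Machine Learning Toolbox and implicit expansion (R2016b or later); the variable names are illustrative only.

```matlab
% Made-up ground truth and predictions, for illustration only.
trueLabels = categorical({'airplanes'; 'airplanes'; 'airplanes'; 'airplanes'; ...
                          'ferry'; 'ferry'; 'laptop'; 'laptop'});
predLabels = categorical({'airplanes'; 'airplanes'; 'airplanes'; 'airplanes'; ...
                          'ferry'; 'ferry'; 'laptop'; 'ferry'});

% Overall accuracy: fraction of all samples predicted correctly.
overallAccuracy = mean(predLabels == trueLabels);

% Confusion counts: rows are true classes, columns are predicted classes.
counts = confusionmat(trueLabels, predLabels);

% Divide each row by its total so every class carries equal weight.
rowNormalized = counts ./ sum(counts, 2);

% Mean per-class accuracy: average of the normalized diagonal.
meanPerClassAccuracy = mean(diag(rowNormalized));

fprintf('overall: %.4f, mean per-class: %.4f\n', overallAccuracy, meanPerClassAccuracy);
```

The two numbers differ whenever the classes are imbalanced, so a gap of a few percentage points between the two approaches can come from the metric alone. It would not explain a drop from ~98% to ~2.5%.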
I could not find an error in the snippet you posted. It would help with debugging if you shared your complete code.
One more thing to note: the validation-set histogram should be built with the vocabulary learned from the training set, i.e. the bag object obtained from trainingSet should be used to encode validationSet too.
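The point about sharing the vocabulary can be illustrated without any images. The sketch below is a synthetic toy, not the Caltech101 pipeline: the histograms are made up, and a vocabulary rebuilt on the validation set is mimicked (as a simplifying assumption) by shifting the word indices, since a second bagOfFeatures call has no reason to assign the same index to the same visual word. Names such as trainX, valX and shiftedWords are hypothetical.

```matlab
rng(0);                                   % reproducible noise
numClasses  = 3;
numWords    = 30;                         % vocabulary size (toy value)
numPerClass = 50;
blockSize   = numWords / numClasses;

% Each class favours its own block of visual words.
proto = 0.1 * ones(numClasses, numWords);
for c = 1:numClasses
    proto(c, (c-1)*blockSize + (1:blockSize)) = 1;
end
proto = proto / sum(proto(1, :));         % every row has the same sum

classIdx = repelem((1:numClasses)', numPerClass);
labels   = categorical(classIdx);

% Noisy, non-negative histograms for "training" and "validation" images.
trainX = max(0, proto(classIdx, :) + 0.002 * randn(numel(classIdx), numWords));
valX   = max(0, proto(classIdx, :) + 0.002 * randn(numel(classIdx), numWords));

mdl = fitcecoc(trainX, labels);           % multiclass SVM, default settings

% Case 1: validation histograms use the same word indexing as training.
accShared = mean(predict(mdl, valX) == labels);

% Case 2: word indices no longer line up with the training features
% (assumed stand-in for a vocabulary built separately on the validation set).
shiftedWords = circshift(1:numWords, blockSize, 2);
accShifted   = mean(predict(mdl, valX(:, shiftedWords)) == labels);

fprintf('shared vocabulary: %.2f, mismatched vocabulary: %.2f\n', accShared, accShifted);
```

In this toy setup the classifier is unchanged between the two cases; only the meaning of each histogram bin changes. That alone should be enough to push the accuracy from near perfect to chance level or below.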
