Adding Cross Validation to Classification code

Question

Pooyan Mobtahej on 10 Nov 2020

https://fr.mathworks.com/matlabcentral/answers/642745-adding-cross-validation-to-classification-code

Commented: Pooyan Mobtahej on 23 Nov 2020

I want to add cross-validation (e.g. 10-fold) to my classification code. Each fold should hold out 10 percent of the data.

How can I do that?

Code:

clear all
close all
TrainRatio=0.8;
ValidationRatio=0.1;
folder='/Users/pooyan/Documents/normal/';  % change this path to your normal data folder
audio_files=dir(fullfile(folder,'*.ogg'));
nfileNum=length(audio_files);
%nfileNum=200
normal=[];
for i = 1:nfileNum 
    normal_name = [folder audio_files(i).name]; 
    normal(i,:) = audioread(normal_name);   
end 
normal=normal';
nLabels = repelem(categorical("normal"),nfileNum,1);
folder='/Users/pooyan/Documents/anomaly/'; % change this path to your anomaly data folder
audio_files=dir(fullfile(folder,'*.ogg'));
afileNum=length(audio_files);
anomaly=[];
for i = 1:afileNum 
    anomaly_name = [folder audio_files(i).name]; 
    anomaly(i,:) = audioread(anomaly_name);   
end 
anomaly=anomaly';
aLabels = repelem(categorical("anomaly"),afileNum,1);
% randomize the inputs if necessary
% normal=normal(:,randperm(nfileNum, nfileNum));
% anomaly=anomaly(:,randperm(afileNum, afileNum));
nTrainNum = round(nfileNum*TrainRatio);
aTrainNum = round(afileNum*TrainRatio);
nValidationNum = round(nfileNum*ValidationRatio);
aValidationNum = round(afileNum*ValidationRatio);
audioTrain = [normal(:,1:nTrainNum),anomaly(:,1:aTrainNum)];
labelsTrain = [nLabels(1:nTrainNum);aLabels(1:aTrainNum)];
audioValidation = [normal(:,nTrainNum+1:nTrainNum+nValidationNum),anomaly(:,aTrainNum+1:aTrainNum+aValidationNum)];
labelsValidation = [nLabels(nTrainNum+1:nTrainNum+nValidationNum);aLabels(aTrainNum+1:aTrainNum+aValidationNum)];
audioTest = [normal(:,nTrainNum+nValidationNum+1:end),anomaly(:,aTrainNum+aValidationNum+1:end)];
labelsTest = [nLabels(nTrainNum+nValidationNum+1:end); aLabels(aTrainNum+aValidationNum+1:end)];
fs=44100;
% Create an audioFeatureExtractor object to extract the centroid and
% slope of the mel spectrum over time.
aFE = audioFeatureExtractor("SampleRate",fs, ...    %Fs
    "SpectralDescriptorInput","melSpectrum", ...
    "spectralCentroid",true, ...
    "spectralSlope",true);
featuresTrain = extract(aFE,audioTrain);
[numHopsPerSequence,numFeatures,numSignals] = size(featuresTrain);
numHopsPerSequence;
numFeatures;
numSignals;
% Treat the extracted features as sequences and use a
% sequenceInputLayer as the first layer of the deep learning model.
featuresTrain = permute(featuresTrain,[2,1,3]); % reorder to features-by-hops-by-signals
featuresTrain = squeeze(num2cell(featuresTrain,[1,2])); % one features-by-hops matrix per cell
numSignals = numel(featuresTrain); %number of signals of normal and anomalies
[numFeatures,numHopsPerSequence] = size(featuresTrain{1});
%Extract the validation features.
featuresValidation = extract(aFE,audioValidation);
featuresValidation = permute(featuresValidation,[2,1,3]);
featuresValidation = squeeze(num2cell(featuresValidation,[1,2]));
%Define the network architecture.
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(50,"OutputMode","last")
    fullyConnectedLayer(numel(unique(labelsTrain))) % one output per class in labelsTrain
    softmaxLayer
    classificationLayer];
%To define the training options
options = trainingOptions("adam", ...
    "Shuffle","every-epoch", ...
    "ValidationData",{featuresValidation,labelsValidation}, ... % validation features and their labels
    "Plots","training-progress", ...
    "Verbose",false);
%To train the network
net = trainNetwork(featuresTrain,labelsTrain,layers,options);
% Test the network on the held-out 10 percent
TestFeature = extract(aFE, audioTest);
for i = 1:size(TestFeature, 3)
    TestFeatureIn = TestFeature(:,:,i)';
    classify(net, TestFeatureIn)
end


Answer 1

Aditya Patil on 16 Nov 2020

Edited: Aditya Patil on 16 Nov 2020

Currently, k-fold cross-validation is not supported directly for neural networks. I have raised the issue with the relevant team.

Note that k-fold cross-validation is not commonly used for neural networks: they are generally trained on large amounts of data, where a single held-out validation set is usually enough.

You can still do it manually. Split the dataset into 10 parts, and each time train the network on a different set of 9 parts and validate on the remaining part. You can do so using the cvpartition and training functions. For example:

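load fisheriris;   % provides meas (150x4 features) and species (labels)
folds = 10;
c = cvpartition(size(meas, 1), "KFold", folds);
for i = 1:folds
    idx = training(c, i);          % logical mask of the training rows for fold i
    trainData = meas(idx, :);      % index rows and keep all feature columns
    testData  = meas(~idx, :);     % the remaining rows form the test fold
    
    % train and test model here
end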
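
To make the behaviour of the partition concrete, the following is a minimal sketch (the seed and the variable name coverage are illustrative choices, not part of the original answer). It checks that the training and test masks of each fold never overlap and counts how often each observation is held out.

```matlab
% Sketch: inspect a 10-fold partition of 150 observations.
rng(0);                                   % fix the seed so the split is repeatable
c = cvpartition(150, "KFold", 10);

coverage = zeros(150, 1);                 % how often each row lands in a test set
for i = 1:c.NumTestSets
    trainIdx = training(c, i);            % logical mask, true for training rows
    testIdx  = test(c, i);                % logical mask, true for held-out rows
    assert(~any(trainIdx & testIdx));     % the two masks never overlap
    coverage = coverage + testIdx;        % accumulate test membership
end

disp(c.TestSize)                          % number of held-out rows per fold
```

With k-fold partitioning, every entry of coverage should end up equal to 1, meaning each observation is tested exactly once across the folds.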

However, if the amount of data is limited, I would recommend other machine learning techniques such as SVMs or decision trees, which might give you better results. You can use the Classification Learner app for this.
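
For the tree-based route, cross-validation can also be done at the command line. The sketch below is only an illustration on the fisheriris data, not on the audio features from the question; it assumes the Statistics and Machine Learning Toolbox, and the variable names are made up for the example.

```matlab
load fisheriris                              % meas: 150x4 features, species: labels
rng(0);                                      % repeatable fold assignment

% Passing 'KFold' makes fitctree return a cross-validated (partitioned) model.
cvTree = fitctree(meas, species, 'KFold', 10);

cvError = kfoldLoss(cvTree);                 % misclassification rate averaged over folds
cvPred  = kfoldPredict(cvTree);              % out-of-fold prediction for every row

fprintf('10-fold classification error: %.3f\n', cvError);
```

Because kfoldPredict returns one out-of-fold prediction per observation, its output can be compared directly against species to assess the model.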

Pooyan Mobtahej on 21 Nov 2020

I have modified the code as follows:

AllData = [normal anomaly];
Labels=[nLabels; aLabels];
% K indicates K-fold cross validation
K=10;
cv = cvpartition(Labels,'KFold',K);
% nTrainNum = round(nfileNum*TrainRatio*0.1);
% aTrainNum = round(afileNum*TrainRatio*0.1);
% nValidationNum = round(nfileNum*ValidationRatio*0.1);
% aValidationNum = round(afileNum*ValidationRatio*0.1);
for i=1:K
    
    audioTest = AllData(:, cv.test(i));
    labelsTest = Labels(cv.test(i));
    audioTrainValidation = AllData(:, ~cv.test(i));    
    labelsTrainValidation = Labels(~cv.test(i));
    % Vp: 10% from training dataset used for validation;
    Vp=0.1;
    TVL=length(labelsTrainValidation);
    ValidationIndex = randperm(TVL, floor(TVL*Vp));
    TrainIndex=1:TVL;
    TrainIndex(ValidationIndex)=[];    
    audioTrain = audioTrainValidation(:, TrainIndex);
    labelsTrain = labelsTrainValidation(:, TrainIndex);    
    audioValidation = audioTrainValidation(:, ValidationIndex);
    labelsValidation = labelsTrainValidation(:, ValidationIndex);

But I still get this error:

Index in position 2 exceeds array bounds (must not exceed 1).

Error in categorical/parenReference (line 19)

that.codes = this.codes(rowIndices,colIndices);

Error in CrossAllKfold (line 58)

labelsTrain = labelsTrainValidation(:, TrainIndex);

How should I fix that?

Aditya Patil on 23 Nov 2020

Do not use length, use size instead.
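
One way to read the error message itself: Labels is built by vertical concatenation, so labelsTrainValidation is an N-by-1 column vector, and indexing it with (:, TrainIndex) asks for columns that do not exist. The sketch below reproduces the shapes with made-up labels (the 6-element vector and the index values are hypothetical) and shows two indexing forms that select the intended elements.

```matlab
% Hypothetical stand-in for labelsTrainValidation: a 6-by-1 categorical column.
labels = [repelem(categorical("normal"), 4, 1); repelem(categorical("anomaly"), 2, 1)];
TrainIndex = [1 2 4 6];

disp(size(labels))                        % six rows, a single column

% labels(:, TrainIndex) would request columns 1, 2, 4 and 6, but only
% column 1 exists, hence "Index in position 2 exceeds array bounds".
% Either of the following selects the intended elements instead:
subsetLinear = labels(TrainIndex);        % linear indexing
subsetRows   = labels(TrainIndex, :);     % row indexing, all columns

assert(isequal(subsetLinear, subsetRows));
```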

Pooyan Mobtahej on 23 Nov 2020

Did that, thanks.

I have attached the modified code below. As a last step, I would like to show the results from all cross-validation folds in a single confusion matrix. How can I do that?

clear all
close all
TrainRatio=0.8;
ValidationRatio=0.1;
folder='/Users/pooyan/Documents/normal/';  % change this path to your normal data folder
audio_files=dir(fullfile(folder,'*.ogg'));
nfileNum=length(audio_files);
nfileNum=100
normal=[];
for i = 1:nfileNum 
    normal_name = [folder audio_files(i).name]; 
    normal(i,:) = audioread(normal_name);   
end 
normal=normal';
nLabels = repelem(categorical("normal"),nfileNum,1);
folder='/Users/pooyan/Documents/anomaly/'; % change this path to your anomaly data folder
audio_files=dir(fullfile(folder,'*.ogg'));
afileNum=length(audio_files);
anomaly=[];
for i = 1:afileNum 
    anomaly_name = [folder audio_files(i).name]; 
    anomaly(i,:) = audioread(anomaly_name);   
end 
anomaly=anomaly';
aLabels = repelem(categorical("anomaly"),afileNum,1);
% randomize the inputs if necessary
%normal=normal(:,randperm(nfileNum, nfileNum));
%anomaly=anomaly(:,randperm(afileNum, afileNum));
AllData = [normal anomaly];
Labels=[nLabels; aLabels];
% K indicates K-fold cross validation
K=10;
cv = cvpartition(Labels,'KFold',K);
% nTrainNum = round(nfileNum*TrainRatio*0.1);
% aTrainNum = round(afileNum*TrainRatio*0.1);
% nValidationNum = round(nfileNum*ValidationRatio*0.1);
% aValidationNum = round(afileNum*ValidationRatio*0.1);
for i=1:K
    
    audioTest = AllData(:, cv.test(i));
    labelsTest = Labels(cv.test(i));
    audioTrainValidation = AllData(:, ~cv.test(i));    
    labelsTrainValidation = Labels(~cv.test(i));
    % Vp: 10% from training dataset used for validation;
    Vp=0.1;
    TVL=length(labelsTrainValidation);
    ValidationIndex = randperm(TVL, floor(TVL*Vp));
    TrainIndex=1:TVL;
    TrainIndex(ValidationIndex)=[];    
    audioTrain = audioTrainValidation(:, TrainIndex);
    labelsTrain = labelsTrainValidation(TrainIndex);    
    audioValidation = audioTrainValidation(:, ValidationIndex);
    labelsValidation = labelsTrainValidation(ValidationIndex);
    
% audioTrain = [normal(:,((i-1)*nTrainNum)+1:i*nTrainNum),anomaly(:,((i-1)*aTrainNum)+1:i*aTrainNum)];
% labelsTrain = [nLabels(((i-1)*nTrainNum)+1:i*nTrainNum);aLabels(((i-1)*aTrainNum)+1:i*aTrainNum)];
% 
% audioValidation = [normal(:,i*(nTrainNum+1:nTrainNum+nValidationNum)),anomaly(:,i*(aTrainNum+1:aTrainNum+aValidationNum))];
% labelsValidation = [nLabels(i*(nTrainNum+1):i*(nTrainNum+nValidationNum));aLabels(i*(aTrainNum+1:aTrainNum+aValidationNum))];
% 
% audioTest = [normal(:,i*(nTrainNum+nValidationNum+1):end),anomaly(:,i*(aTrainNum+aValidationNum+1):end)];
% labelsTest = [nLabels(i*(nTrainNum+nValidationNum+1):end); aLabels(i*(aTrainNum+aValidationNum+1):end)];
 
fs=44100;
% Create an audioFeatureExtractor object to extract the centroid and
% slope of the mel spectrum over time.
aFE = audioFeatureExtractor("SampleRate",fs, ...    %Fs
    "SpectralDescriptorInput","melSpectrum", ...
    "spectralCentroid",true, ...
    "spectralSlope",true);
featuresTrain = extract(aFE,audioTrain);
[numHopsPerSequence,numFeatures,numSignals] = size(featuresTrain);
numHopsPerSequence;
numFeatures;
numSignals;
% Treat the extracted features as sequences and use a
% sequenceInputLayer as the first layer of the deep learning model.
featuresTrain = permute(featuresTrain,[2,1,3]); % reorder to features-by-hops-by-signals
featuresTrain = squeeze(num2cell(featuresTrain,[1,2])); % one features-by-hops matrix per cell
numSignals = numel(featuresTrain); %number of signals of normal and anomalies
[numFeatures,numHopsPerSequence] = size(featuresTrain{1});
%Extract the validation features.
featuresValidation = extract(aFE,audioValidation);
featuresValidation = permute(featuresValidation,[2,1,3]);
featuresValidation = squeeze(num2cell(featuresValidation,[1,2]));
%Define the network architecture.
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(50,"OutputMode","last")
    fullyConnectedLayer(numel(unique(labelsTrain))) % one output per class in labelsTrain
    softmaxLayer
    classificationLayer];
%To define the training options
options = trainingOptions("adam", ...
    "Shuffle","every-epoch", ...
    "ValidationData",{featuresValidation,labelsValidation}, ... % validation features and their labels
    "Plots","training-progress", ...
    "Verbose",false);
%To train the network
net = trainNetwork(featuresTrain,labelsTrain,layers,options);
% Test the network on this fold's held-out 10 percent
TestFeature = extract(aFE, audioTest);
predicted = categorical([]);     % reset per fold so its size matches labelsTest
for j = 1:size(TestFeature, 3)   % j, so the fold counter i is not reused
    TestFeatureIn = TestFeature(:,:,j)';
    predicted(j) = classify(net, TestFeatureIn);
end
% Confusion matrix for this fold
plotconfusion(labelsTest, predicted')
end
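
One possible way to get a single confusion matrix is to store each fold's predictions at the positions of that fold's test observations, and to build the matrix once after the loop. The sketch below illustrates the idea under some loudly stated assumptions: it uses the fisheriris data and a decision tree (fitctree and predict) purely as a stand-in for the audio features and the LSTM, and the names allPred and C are made up for the example. In the code above, the stand-in lines would correspond to the trainNetwork and classify calls.

```matlab
load fisheriris                               % stand-in data: meas (150x4), species (labels)
labels = categorical(species);
rng(0);                                       % repeatable folds
K  = 10;
cv = cvpartition(labels, 'KFold', K);         % stratified by class

allPred = labels;                             % preallocate; every entry is overwritten below
for i = 1:K
    trainIdx = training(cv, i);               % logical mask of training rows
    testIdx  = test(cv, i);                   % logical mask of held-out rows

    % Stand-in classifier; in the thread this would be trainNetwork + classify.
    mdl = fitctree(meas(trainIdx, :), labels(trainIdx));
    allPred(testIdx) = predict(mdl, meas(testIdx, :));
end

C = confusionmat(labels, allPred);            % one matrix pooled over all K folds
confusionchart(labels, allPred);              % the same counts as a chart
```

Since every observation is held out exactly once, the entries of C sum to the total number of observations, and the true labels need no reordering because predictions are written back to their original positions.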
