Invalid training data. Responses must be nonempty.

Martin Hájek on 1 Mar 2021
Commented: Martin Hájek on 3 Mar 2021
Hello,
I am trying to build a simple network that recognizes gender from voice. I have many recordings, each longer than 6000 samples. I read them into a datastore, but I can't get them into sequenceInputLayer, and I have tried everything. I know my neural network may not work well with these layers, but I only want to get it to start training; then I will work on making it accurate.
It gives me this error:
Error using trainNetwork (line 183)
Invalid training data. Responses must be nonempty.
Error in Program2 (line 31)
net = trainNetwork(audioTrain,layers, options)
clc;
close all;
clear all;
net = network
audio = audioDatastore(fullfile('E:\Projekt\M or F'), ...
    'IncludeSubfolders',true, ...
    'FileExtension', '.wav', ...
    'LabelSource','foldernames');
labelCount = countEachLabel(audio)
numTrainFiles = 1000;
[audioTrain,audioValidation] = splitEachLabel(audio,numTrainFiles,'randomize');
layers = [ ...
    sequenceInputLayer(6000)
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];
options = trainingOptions("adam", ...
    "MaxEpochs",4, ...
    "MiniBatchSize",256, ...
    "Plots","training-progress", ...
    "Verbose",false, ...
    "Shuffle","every-epoch", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",1, ...
    'ValidationFrequency',100);
net = trainNetwork(audioTrain,layers, options)

Accepted Answer

jibrahim on 1 Mar 2021
Hi Martin,
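You can't pass an audioDatastore directly to trainNetwork. Reading from an audioDatastore returns only the audio, not the label, so trainNetwork finds no responses, which is what the error message is telling you. Create a transformed datastore that organizes the data into (audio,label) pairs.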
The code below is a simple example where we try to recognize a speaker using an idea similar to yours. The accuracy is not good, but hopefully it is a good starting point.
If you have not done so already, I also recommend looking into the gender ID example in Audio Toolbox.
You might have better luck extracting features from the audio, rather than passing the raw audio to a network.
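As a rough illustration of the feature-extraction idea, here is a minimal sketch that computes MFCCs with the Audio Toolbox function mfcc and sizes the input layer from them. The synthetic signal, the sample rate, the 50 hidden units, and the two output classes are all assumptions chosen for illustration, not values from the question.

```matlab
% Assumption: one second of synthetic mono noise stands in for a real recording.
fs = 16000;
x = randn(fs,1);

% mfcc (Audio Toolbox) returns one row of coefficients per analysis frame.
coeffs = mfcc(x,fs);

% sequenceInputLayer expects features-by-timeSteps, so transpose.
features = coeffs.';
numFeatures = size(features,1);

% Hypothetical layer stack: 50 hidden units and 2 classes are placeholders.
layers = [ ...
    sequenceInputLayer(numFeatures)
    bilstmLayer(50,"OutputMode","last")
    fullyConnectedLayer(2)
    softmaxLayer
    classificationLayer];
```

With features like these, each recording becomes one sequence with several time steps, rather than a long vector of raw samples.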
In any case, here is some example code:
% Download the FSDD data set
url = 'https://ssd.mathworks.com/supportfiles/audio/FSDD.zip';
datasetFolder = tempdir;
unzip(url,datasetFolder)

% Create the datastore, using the speaker name in each file name as the label
ads = audioDatastore(fullfile(datasetFolder,'FSDD'), ...
    'IncludeSubfolders',true);
[~,filenames] = fileparts(ads.Files);
ads.Labels = categorical(extractBetween(filenames,'_','_'));
[adsTrain,adsValidation] = splitEachLabel(ads,.9);

inputSize = 500;
numHiddenUnits = 100;
numClasses = length(unique(ads.Labels));
layers = [ ...
    sequenceInputLayer(inputSize)
    bilstmLayer(numHiddenUnits,"OutputMode","sequence")
    bilstmLayer(numHiddenUnits,"OutputMode","last")
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

% Transformed datastores that return (audio,label) tables and can be
% passed directly to trainNetwork
tdsTrain = transform(adsTrain,@(x,info)processData(x,inputSize,info),'IncludeInfo',true);
tdsValidation = transform(adsValidation,@(x,info)processData(x,inputSize,info),'IncludeInfo',true);

options = trainingOptions("adam", ...
    "MaxEpochs",4, ...
    "MiniBatchSize",256, ...
    "Plots","training-progress", ...
    "Verbose",false, ...
    "Shuffle","every-epoch", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",1, ...
    "ValidationData",tdsValidation, ...
    "ValidationFrequency",100);
net = trainNetwork(tdsTrain,layers,options)
Here is the transform function I used:
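function [data,info] = processData(audio,inputSize,info)
    % Break the audio into frames of length inputSize, with an overlap of
    % floor(inputSize/2) samples between consecutive frames
    audio = buffer(audio,inputSize,floor(inputSize/2));
    % Put each frame (column) into its own cell: numFrames-by-1 cell array
    audio = mat2cell(audio,inputSize,ones(1,size(audio,2))).';
    % Repeat the file's label once per frame
    label = repmat(info.Label,size(audio,1),1);
    % Return a two-column table: predictors first, responses second
    data = table(audio,label);
end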
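To see what the transform hands to trainNetwork, the same segmentation steps can be run on a synthetic signal. This is a minimal sketch: the random signal, its length, and the label "speakerA" are made up for illustration, and buffer requires Signal Processing Toolbox.

```matlab
% Assumption: a synthetic mono column vector stands in for one file's audio,
% and the info struct carries a made-up label.
x = randn(8000,1);
info.Label = categorical("speakerA");
inputSize = 500;

% Same steps as in processData
frames = buffer(x,inputSize,floor(inputSize/2));               % inputSize-by-numFrames
audio = mat2cell(frames,inputSize,ones(1,size(frames,2))).';   % numFrames-by-1 cell
label = repmat(info.Label,size(audio,1),1);                    % one label per frame
data = table(audio,label);

disp(size(data))            % numFrames rows, 2 columns
disp(size(data.audio{1}))   % each predictor is inputSize-by-1
```

Each row of the table pairs one frame with its label, which is the (predictors, responses) layout that trainNetwork reads from a datastore.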
