trainNetwork not working with transformedDatastore from audioDatastore
Hi,
I'm trying to train a CNN with a database of audio files. For that purpose I'm reading my database with an audioDatastore and transforming it following this example so the net can read it.
The transformedDatastore seems to work well, but when training the net it seems to enter an infinite loop (the training window stays blank, with no accuracy or loss curve).
UPDATE: I was wrong, I just didn't have enough patience. After about 10 minutes the program starts to plot the accuracy and loss values, but very slowly (about one iteration per minute for the first 10 iterations, then it speeds up). In the end the network completed its training in less than 6 hours. When using spectrograms instead of raw audio files it took less than 1 hour, but the larger input size may explain the longer duration.
I guess the original problem is solved, but the training is still too slow for its purpose. Should I open a new question or keep updating this one?
Here is the code I'm using:
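% Read every .wav file under datafolder; labels come from the subfolder names.
ads = audioDatastore(datafolder, ...
    'IncludeSubfolders',true, ...
    'FileExtensions','.wav', ...
    'LabelSource','foldernames');
% Pair each signal with its label, then shuffle the datastore.
myds = transform(ads,@myReadFunction,'IncludeInfo',true);
myds = shuffle(myds);
net = trainNetwork(myds,layers,opts);
% In a script, define local functions at the end of the file.
function [dataOut,info] = myReadFunction(dataIn,info)
    % Return a 1-by-2 cell {predictor, response} for trainNetwork.
    dataOut = {dataIn,info.Label};
end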
The network layers and options are known to work (I tested them with a smaller dataset loaded directly into memory).
Thank you all!
2 comments
kc
on 22 Apr 2020
It's giving the following error:
Undefined function 'transform' for input arguments of type 'audioDatastore'.
How did you even get the output?
Answers (2)
jibrahim
on 24 Apr 2019
Hi Manuel,
What might be happening is that trainNetwork is taking a long time doing an initial normalization of your data. If you have an imageInputLayer, can you try setting its Normalization property to 'none'? That will help us narrow down the issue.
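As a rough sketch of that test, the input layer could be declared as below. The input size is only an assumed placeholder (one second of mono audio at 16 kHz); substitute the size of your own signals.

```matlab
% Assumed placeholder: each observation is a 16000-by-1 mono signal.
inputSize = [16000 1 1];

% 'Normalization','none' skips the initial pass over the training data
% that computes normalization statistics (the default is 'zerocenter').
inputLayer = imageInputLayer(inputSize,'Normalization','none');
```

If training starts promptly with this layer, the initial delay was most likely the normalization pass over the datastore.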
HTH,
Jihad
2 comments
jibrahim
on 26 Apr 2019
Hi Manuel,
Is this myReadFunction the actual function you are using? If so, it seems that you are sending raw audio data to the network. That can be perfectly valid, but if you are really using a CNN, your time-domain audio should first be converted to a time-frequency, image-like representation (e.g. spectrogram, mel spectrogram, etc.). The slowness might be due to the large input size (which would be equal to the length of each audio signal you are sending in).
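For illustration only, a read function along these lines could do that conversion inside the transform. This is a sketch, not your actual setup: melReadFunction is a hypothetical name, it relies on melSpectrogram from Audio Toolbox with its default settings, and it assumes every file has the same duration and sample rate so that all spectrograms come out the same size.

```matlab
function [dataOut,info] = melReadFunction(dataIn,info)
% Hypothetical replacement for myReadFunction: converts each raw signal
% into a log mel spectrogram before pairing it with its label.
    % Mix down to mono in case the file has several channels.
    x = mean(dataIn,2);
    % melSpectrogram (Audio Toolbox) returns a bands-by-frames matrix.
    S = melSpectrogram(x,info.SampleRate);
    % Log compression; the small offset (an arbitrary choice) avoids log10(0).
    S = log10(S + 1e-6);
    % Return a 1-by-2 cell {predictor, response} for trainNetwork.
    dataOut = {S,info.Label};
end
```

It would be plugged in the same way as before, with transform(ads,@melReadFunction,'IncludeInfo',true), and the input layer size would have to match the size of S.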
Would it be possible for you to give me more information on the problem you're trying to solve, and the network structure you are using? Please note that some of the featured examples in Audio Toolbox might be of help (they do not use transform, but they should give an idea of the setup). For example:
Speech command recognition: https://www.mathworks.com/help/audio/examples/Speech-Command-Recognition-Using-Deep-Learning.html
Gender classification: https://www.mathworks.com/help/audio/examples/classify-gender-using-long-short-term-memory-networks.html
HTH,
Jihad