I want to use an LSTM-based audio network to work with live audio

Question

Arslan Munim on 27 Jul 2022

https://fr.mathworks.com/matlabcentral/answers/1768630-i-want-to-use-lstm-based-audio-network-to-work-with-live-audio

Commented: Arslan Munim on 28 Sep 2022

Hello MATLAB team,

I am using this example (https://www.mathworks.com/matlabcentral/fileexchange/74611-fault-detection-using-deep-learning-classification#examples_tab) with my own audio data set. The network is trained, but I now want to run it live on a PC. For example, I have a microphone and want to build an application that uses my trained model to predict the output.

Can you guide me or help me with that?

Regards,

Arslan Munaim

Answer 1

jibrahim on 27 Jul 2022

Hi Arslan,

There is a function in that repo (streamingClassifier) that should get the job done in conjunction with an audio device reader:

% Create a microphone object
adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
% These statistics should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Pass to network
    scores = streamingClassifier(frame,M,S);
    % Use the scores any way you want
end
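
The placeholders M = 0 and S = 1 above leave the features unnormalized. A minimal sketch of how these statistics might be computed, assuming the training features are stored as a cell array with one numFeatures-by-numHops matrix per audio file (the variable name trainingFeatures and the synthetic data are illustrative, not taken from the repo):

```matlab
% Hypothetical stand-in for real training features: one
% numFeatures-by-numHops matrix per audio file.
rng(0);
trainingFeatures = {randn(11,20) + 5, randn(11,35) + 5};

% Pool the hops from all files, then take per-feature statistics.
allFeatures = cat(2, trainingFeatures{:});   % 11-by-55
M = mean(allFeatures, 2);                    % 11-by-1 mean
S = std(allFeatures, 0, 2);                  % 11-by-1 standard deviation

% Normalize one feature column the same way the streaming code
% would normalize each incoming frame.
x = allFeatures(:,1);
xNorm = (x - M) ./ S;
```

M and S need the same orientation as the features they are applied to, so transpose them if the feature vector is a row.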

Arslan Munim on 28 Jul 2022

Edited: Arslan Munim on 28 Jul 2022

Hi jibrahim,

Thanks for your reply. I tried using streamingClassifier. However, because of dependency issues, I am trying to use the extract function instead of the extractFeatures function. With extract I can only use one feature at a time, but I trained the network with 11 features.

Can you please tell me how I can use the extract function in streamingClassifier? I am attaching my code for your reference:

windowLength = 512;
overlapLength = 0;
aFE = audioFeatureExtractor('SampleRate',44100, ...
    'Window',hamming(windowLength,'periodic'),...
    'OverlapLength',overlapLength,...
    'spectralCentroid',true, ...
    'spectralCrest',true,...
    'spectralDecrease',true, ...
    'spectralEntropy',true,...
    'spectralFlatness',true,...
    'spectralFlux',true,...
    'spectralKurtosis',true,...
    'spectralRolloffPoint',true,...
    'spectralSkewness',true,...
    'spectralSlope',true,...
    'spectralSpread',true);
features = extract(aFE , audioIn)
% features = extractFeatures(audioIn);
% Normalize
features = ((features - M')./S');
[net, scores] = predictAndUpdateState(net,features);

jibrahim on 28 Jul 2022

Hi Arslan,

The extract function should also return 11 features. For example, if you replace the existing function extractFeatures with this modified function, things should work the same:

function featureVector = extractFeatures2(x)
%#codegen
persistent afe
if isempty(afe)
    windowLength = 512;
    overlapLength = 0;
    afe = audioFeatureExtractor('SampleRate',44100, ...
        'Window',hamming(windowLength,'periodic'),...
        'OverlapLength',overlapLength,...
        'spectralCentroid',true, ...
        'spectralCrest',true,...
        'spectralDecrease',true, ...
        'spectralEntropy',true,...
        'spectralFlatness',true,...
        'spectralFlux',true,...
        'spectralKurtosis',true,...
        'spectralRolloffPoint',true,...
        'spectralSkewness',true,...
        'spectralSlope',true,...
        'spectralSpread',true);
end
featureVector = extract(afe,x);
end

For a single 512-sample frame, the size of featureVector will be 1-by-11, each element in the vector representing one of your features.

Notice I declared afe as persistent. This ensures the audio feature extractor is not recreated every time you call this function in your loop. The extractor goes through some one-time setup computations when you first call it, and there is no need to waste time repeating those.

jibrahim on 2 Aug 2022

Hi Arslan,

Since you trained the network with a sample rate of 16e3, you will have to perform sample-rate conversion from 44.1 kHz to 16 kHz. This code is a possible implementation, where you essentially feed the network frames of length 512 sampled at 16 kHz, just like the original code:

% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,...
                              Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D; % get as close to desired frame size
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame);
    % Save to buffer
    write(buff,frame)
    while buff.NumUnreadSamples >= 512
        frame = read(buff,512);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
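
The frame-size arithmetic behind the comments above can be checked by hand. This is a small sketch that derives the rate-change factors from the two sample rates with gcd instead of calling getRateChangeFactors, on the assumption that the converter uses the exact ratio; the variable names are illustrative:

```matlab
inputRate  = 44100;
outputRate = 16000;

% Reduce the ratio outputRate/inputRate to lowest terms.
g = gcd(inputRate, outputRate);
interpFactor = outputRate / g;   % interpolation factor
decimFactor  = inputRate / g;    % decimation factor

% Largest frame length not exceeding 22000 that is a multiple of
% the decimation factor.
desiredFrame = 22000;
frameLength  = floor(desiredFrame/decimFactor) * decimFactor;

% Samples produced at 16 kHz per input frame, and how many complete
% 512-sample network frames that yields.
samplesOut = frameLength * interpFactor / decimFactor;
fullFrames = floor(samplesOut/512);
leftover   = samplesOut - 512*fullFrames;
```

Because each converted frame does not divide evenly into 512-sample pieces, the leftover samples stay in the buffer and are used on the next pass, which is why the loop reads from a dsp.AsyncBuffer rather than passing the converted frame on directly.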

Note that you can also potentially feed the network longer frames. That should also work, and is probably more efficient, as the network will run faster if you give it a long input (as opposed to multiple short ones):

% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame);
    % Save to buffer
    write(buff,frame)
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end

If you can't change the frame size on the microphone, then you can handle that using another buffer, for example:

% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=22000);
buffSRC = dsp.AsyncBuffer;
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    write(buffSRC,frame);
    % Drain the input buffer in chunks the converter accepts, so that
    % leftover samples from each 22000-sample read do not accumulate
    while buffSRC.NumUnreadSamples >= frameLength
        frame = read(buffSRC,frameLength);
        % Convert to 16 kHz
        frame = src(frame);
        % Save to buffer
        write(buff,frame)
    end
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end

Arslan Munim on 9 Aug 2022

Hi jibrahim,

Thank you for your support, it was very helpful.

Now I want to use multiple mics for prediction. Can you please give me some idea of how I can use the streaming classifier with 3 or 4 mics for the prediction?

Thanks and have a nice day.

Regards,

Arslan

Answer 2

jibrahim on 9 Aug 2022

Hi Arslan,

audioDeviceReader supports multi-mic devices. Use the ChannelMappingSource and ChannelMapping properties to map between device input channels and the output data.

This network was trained on mono data, so, to adapt it to multi-channel data, you either have to retrain your network for multi-channel data, or somehow combine your input channels into one channel (by a weighted sum, or selecting a particular channel, etc) and proceed like above.
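
As a rough illustration of the second option, here is a minimal sketch of collapsing a multi-channel frame into one channel. The 4-channel frame and the equal weights are made-up values; in practice the frame would come from the audioDeviceReader and the weights would depend on your setup:

```matlab
% Hypothetical 4-channel frame: 512 samples per channel, one column
% per microphone.
frame = [ones(512,1), 2*ones(512,1), 3*ones(512,1), 4*ones(512,1)];

% Option 1: weighted sum. Equal weights give a plain average.
w = [0.25; 0.25; 0.25; 0.25];   % assumed weights, numChannels-by-1
monoSum = frame * w;            % 512-by-1

% Option 2: select one particular channel.
monoPick = frame(:,2);          % 512-by-1
```

Either mono signal can then be passed to streamingClassifier in place of the single-microphone frame.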

Arslan Munim on 17 Aug 2022

Edited: Walter Roberson on 19 Aug 2022

Hi jibrahim,

I tried to read data from multiple microphones, but I get the error below every time. I am trying to read a frame from each microphone and send that data to the streaming classifier to predict the output, but the error always occurs at frame1 = adr1():

Error using audioDeviceReader/setup
A given audio device may only be opened once.
Error in audioDeviceReader/setupImpl
Error in multipleMic (line 10)
frame1 = adr1()

adr1 = audioDeviceReader(SampleRate=44.1e3,SamplesPerFrame=22000, Device="Microphone (4- USB PnP Sound Device)",BitDepth="16-bit integer");
adr2 = audioDeviceReader(SampleRate=44.1e3,SamplesPerFrame=22000, Device="Microphone (USB PnP Sound Device)",BitDepth="16-bit integer");
% These statistic value should come from your training...
% M = 0;
% S = 1;
while 1
    % Read a frame of data from microphone
    frame1 = adr1()
    frame2 = adr2()  
    % Pass to network
    [class] = streamingClassifier2(frame1,frame2,M,S)
    % Use the scores any way you want
end
function [class] = streamingClassifier2(frame1,frame2,M,S)
% This is a streaming classifier function 
persistent net; 
if isempty(net)
    net = coder.loadDeepLearningNetwork('net.mat');
end
% Extract features using function
%features = extract(aFE , audioIn)
features1 = extractFeatures2(frame1);
features2 = extractFeatures2(frame2);
% Normalize 
features1 = ((features1 - M)./S).';
features2 = ((features2 - M)./S).';
% Classify
[class] = classify(net,{features1,features2});
%[net, scores] = classify(net,feature)
end

jibrahim on 19 Aug 2022

Arslan, we support the scenario with one USB card with several mics hooked to it. You can't use audioDeviceReader to read from separate cards at the same time. Even if we did, since these different mics run on different clocks, I am not sure how you would achieve synchronization between them anyway.

One possible workaround is to use a different MATLAB session to read from the other microphone, and send the data to the main MATLAB session via UDP. So, in another MATLAB session, run some code like this:

sender = dsp.UDPSender(RemoteIPPort=25000);
src = audioDeviceReader;
while(1)
    frame = src();
    sender(frame);
end

Then, in the main MATLAB session, you can receive the audio:

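% The data type and message length must match what the sender transmits
% (assumed here: double samples, audioDeviceReader's default 1024-sample frames)
rec = dsp.UDPReceiver(LocalIPPort=25000, ...
    MessageDataType="double",MaximumMessageLength=1024);
scope = timescope;
while(1)
    frame = rec();
    % rec() returns empty when no packet has arrived yet
    if ~isempty(frame)
        scope(frame);
    end
end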

This might work if your sound is in steady state and does not change often/fast. If synchronization between mics becomes an issue, then I think one card with multiple devices associated with it is definitely the way to go.

jibrahim on 20 Aug 2022

OK, this helps. You will need other hardware (one device, multiple mics) for the system to recognize it. You could also give the UDP idea a shot, see how viable that is.

Arslan Munim on 28 Sep 2022

Hi again,

I am trying to train my network after lowering BitsPerSample to 8 (it was 16 before). Every time I start training the model, it throws the warning given below and terminates.

I tried different sample rates, but I get the same warning every time. I also tried changing my layer structure and changing 'InitialLearnRate' (currently 0.001), but I still get the same warning.

Warning: Training stopped at iteration 1 because training loss is NaN. Predictions using the output network might contain NaN values.

Model:

layers = [ ...
    sequenceInputLayer(size(trainingFeatures{1},1))
    lstmLayer(100,"OutputMode","sequence")
    dropoutLayer(0.1)
    lstmLayer(100,"OutputMode","last")
    fullyConnectedLayer(5)
    softmaxLayer
    classificationLayer];

miniBatchSize = 30;
validationFrequency = floor(numel(trainingFeatures)/miniBatchSize);
options = trainingOptions("adam", ...
    "MaxEpochs",100, ...
    "MiniBatchSize",miniBatchSize, ...
    "Plots","training-progress", ...
    "Verbose",false, ...
    "Shuffle","every-epoch", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",20,...
    'InitialLearnRate',0.001,...
    'ValidationData',{validationFeatures,adsValidation.Labels}, ...
    'ValidationFrequency',validationFrequency);

Regards,

Arslan
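
A NaN loss at the very first iteration often points at the input data rather than the layers, so one thing that may be worth checking is whether the extracted features already contain NaN or Inf (for example, several spectral features evaluate to 0/0 on an all-zero frame). This is only a sketch of such a check on made-up data, assuming trainingFeatures is a cell array of numFeatures-by-numHops matrices; it is not a confirmed diagnosis for this case:

```matlab
% Hypothetical features: the second file contains a NaN column, as a
% spectral feature computed on an all-zero frame might.
trainingFeatures = {rand(11,20), [rand(11,5), nan(11,1), rand(11,4)]};

% Flag every file whose features contain NaN or Inf.
hasBadValues = cellfun(@(f) any(~isfinite(f(:))), trainingFeatures);
badFiles = find(hasBadValues);

% A zero standard deviation would also produce NaN or Inf during
% normalization, so check the spread of each feature as well.
allFeatures = cat(2, trainingFeatures{:});
S = std(allFeatures, 0, 2, "omitnan");
zeroSpread = find(S == 0);
```

If badFiles or zeroSpread is non-empty on the real data, the affected frames or features would need to be removed or repaired before training.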
