Free-to-use speech-to-text conversion using MATLAB interface to .NET

25 views (last 30 days)
Steven Dakin
Steven Dakin on 10 Jan 2021
Commented: Nikos Korobos on 28 Mar 2022
Hi
I - like quite a few other other posters here - would like to perform realtime speech-to-text conversion within MATLAB.
One approach is to use the groovy new Deep Learning Speech Recognition example code but that would require appropriate trainging data (which is hard to acquire).
Another approach is the speech2text code posted by Gabriele Bunkheila: https://www.mathworks.com/matlabcentral/fileexchange/65266-speech2text
allowing MATLAB to access various 3rd party speech-to-text web services. Unfortunately all of these are paid subscription services.
I would like to access the free-to-use and perfectly good speech-to-text services built into Windows and accessed via the MS Speech API.
Some time ago Michelle Hirsch posted a code fragment (it's copied below ) written by a Matlab staffer (Jiro) suggesting this was possible
function rec = speechrecognition
% Add assembly
NET.addAssembly('System.Speech');
% Construct engine
rec = System.Speech.Recognition.SpeechRecognitionEngine;
rec.SetInputToDefaultAudioDevice;
rec.LoadGrammar(System.Speech.Recognition.DictationGrammar);
% Define listener callback
addlistener(rec, 'SpeechRecognized', @recognizedFcn);
% Start recognition
rec.RecognizeAsync(System.Speech.Recognition.RecognizeMode.Multiple);
% Callback
function recognizedFcn(obj, e)
% Get text
txt = char(e.Result.Text);
% Split into words
w = regexp(txt, '\s', 'split');
if length(w) > 1
% Look for the occurrence of the phrase "search for"
idx = find(strcmp(w(1:end-1), 'search') & ...
strcmp(w(2:end), 'for'), 1, 'first');
if ~isempty(idx) && length(w) >= idx+2
% The words after are the search terms
searchTerm = sprintf('%s+', w{idx+2:end});
searchTerm(end) = '';
% Search on the web
web(['http://www.google.com/search?q=', searchTerm]);
fprintf(2, 'search for "%s"\n', strrep(searchTerm, '+', ' '));
else
%disp(txt)
end
elseif length(w) == 1 && strcmpi(w{1}, 'stop')
obj.RecognizeAsyncStop;
obj.delete;
%disp(txt);
disp('Stopping Speech Recognition. Thank you for using!');
else
%disp(txt);
end
Has anyone got this to work? It may be a simple for for someone with experience of .NET programming in MATLAB to get this going...
I certainly think this would be useful to many people. Any pointers appreciated.
Cheers
Steven Dakin

Answers (2)

Nikos Korobos
Nikos Korobos on 23 Mar 2022
Hello, did you get it to work?

Brian Hemmat
Brian Hemmat on 25 Mar 2022
Hi Steven,
You can try out wav2vec 2.0. You can find a MATLAB implementation here:
  5 Comments
Nikos Korobos
Nikos Korobos on 28 Mar 2022
Thanks again managed to get it working with
writematrix(txt,'sttfile.txt' );

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by