Audio Toolbox Interface for SpeechBrain and Torchaudio Libraries

Deep Learning models supporting Audio Toolbox AI-powered functions for speech and audio signal processing
601 downloads
Updated 15 Oct 2025
The Audio Toolbox Interface for SpeechBrain and Torchaudio Libraries enables the use of a collection of AI-powered speech processing functions in Audio Toolbox™ for automatic speech recognition (ASR) and speech synthesis.
Using Audio Toolbox and the Audio Toolbox Interface for SpeechBrain and Torchaudio Libraries, MATLAB users can take advantage of state-of-the-art AI models, without requiring any familiarity with Deep Learning.
The add-on automates the installation of Python® and PyTorch®, and it downloads selected Deep Learning models from the SpeechBrain and Torchaudio libraries. Once installed, it lets users run the following functions on local AI models:
  • The speech2text function accepts a speechClient object with the model set to emformer or whisper for speech-to-text (STT) and automatic speech recognition (ASR). These complement the local wav2vec model and the cloud service options Google, IBM, Microsoft, and Amazon. Using whisper also requires downloading the model weights separately, as described in Download Whisper Speech-to-Text Model.
  • The text2speech function accepts a speechClient object with the model set to hifigan for text-to-speech (TTS) and speech synthesis. This complements the cloud service options Google, IBM, Microsoft, and Amazon.
The speech2text and text2speech functions accept and return strings and audio samples. They automate the end-to-end pipelines for automatic speech recognition and speech synthesis, hiding signal pre-processing, feature extraction, model prediction, and output post-processing from the user. speech2text can also be used interactively through the Signal Labeler App.
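As a rough illustration of the two functions described above, the following minimal sketch transcribes a recording with the emformer model and synthesizes speech with the hifigan model. It assumes Audio Toolbox and this add-on are installed. The file name speech.wav and the sample sentence are placeholders, and the call forms (passing the client through the Client name-value argument) reflect our understanding of the documented syntax, so check the speech2text, text2speech, and speechClient reference pages for your release.

```matlab
% Minimal sketch, assuming Audio Toolbox and this add-on are installed.
% "speech.wav" is a placeholder file name, not a file shipped with the add-on.
[audioIn, fs] = audioread("speech.wav");

% Speech-to-text with the local emformer model.
sttClient = speechClient("emformer");
transcript = speech2text(audioIn, fs, Client=sttClient);
disp(transcript)

% Text-to-speech with the local hifigan model.
ttsClient = speechClient("hifigan");
[speech, speechFs] = text2speech("Hello from MATLAB.", Client=ttsClient);

% Play back the synthesized audio samples.
sound(speech, speechFs)
```

Replacing "emformer" with "whisper" should select the Whisper model instead, provided its weights have been downloaded separately as noted above.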
GPU Compute Capability Support
The range of supported NVIDIA GPU compute capabilities for this package is determined by the underlying Python libraries (SpeechBrain, TorchAudio, PyTorch) included at the time each MATLAB release is packaged. This support may not always match the broader GPU support available in MATLAB itself.
Supported NVIDIA GPU Compute Capabilities by MATLAB Release:
  • R2024a: Compute 5.0 to 8.6
  • R2024b–R2026a: Compute 5.0 to 8.9
Support for additional GPU architectures will be available only in future MATLAB releases, as newer Python library versions are integrated. For MATLAB releases not explicitly listed above, GPU support is unchanged from the most recent prior release.
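One way to compare the local GPU against the ranges listed above is sketched below. It is an illustration, not part of the add-on: it assumes Parallel Computing Toolbox is installed (for gpuDeviceCount and gpuDevice), and the variable release is a hypothetical setting that you would change to match your MATLAB release.

```matlab
% Minimal sketch: compare the selected GPU with the ranges listed above.
% Assumes Parallel Computing Toolbox (gpuDeviceCount, gpuDevice).
release = "R2024b";              % hypothetical: set to your MATLAB release

% Ranges copied from the list above.
if release == "R2024a"
    ccRange = [5.0 8.6];
else
    ccRange = [5.0 8.9];         % R2024b through R2026a
end

if gpuDeviceCount > 0
    gpu = gpuDevice;                          % currently selected GPU
    cc = str2double(gpu.ComputeCapability);   % for example, '8.6' becomes 8.6
    isSupported = cc >= ccRange(1) && cc <= ccRange(2);
    fprintf("%s (compute %s): within listed range = %d\n", ...
        gpu.Name, gpu.ComputeCapability, isSupported);
else
    disp("No GPU detected by MATLAB.");
end
```

Comparing compute capabilities as decimal numbers is adequate for the ranges listed here. A more general check would compare the major and minor version numbers separately.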
MATLAB Release Compatibility
Created with R2024a
Compatible with R2024a to R2026a
Platform Compatibility
Windows, macOS (Apple Silicon), macOS (Intel), Linux