
Audio Processing Using Deep Learning

Extend deep learning workflows with audio and speech processing applications

Apply deep learning to audio and speech processing applications by using Deep Learning Toolbox™ together with Audio Toolbox™. For signal processing applications, see Signal Processing Using Deep Learning. For wireless communications applications, see Wireless Communications Using Deep Learning.

Apps

Signal Labeler

Label signal attributes, regions, and points of interest, and extract features

Functions

Data Management and Augmentation

`audioDatastore`	Datastore for collection of audio files
`audioDataAugmenter`	Augment audio data (since R2019b)
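
The two functions above are typically used together: a datastore reads labeled audio files, and an augmenter produces perturbed copies of each signal. The following is a minimal sketch, assuming Audio Toolbox is installed. The temporary folder, the `tone` label, and the `a440.wav` file name are hypothetical and exist only so that the example does not depend on any particular dataset.

```matlab
% Write a synthetic one-second 440 Hz tone into a temporary, labeled folder.
% The folder layout and file name are illustrative assumptions.
fs = 16e3;
t = (0:fs-1)'/fs;
tone = 0.5*sin(2*pi*440*t);
folder = tempname;
mkdir(fullfile(folder,'tone'));
audiowrite(fullfile(folder,'tone','a440.wav'),tone,fs);

% Create a datastore that takes labels from the subfolder names.
ads = audioDatastore(folder, ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

% Read one file; info.SampleRate holds the sample rate of that file.
[audioIn,info] = read(ads);

% Configure an augmenter that returns three variants per input signal.
% Each variant always gets added noise and sometimes a volume change;
% the other augmentation algorithms are switched off.
augmenter = audioDataAugmenter( ...
    'AugmentationMode','sequential', ...
    'NumAugmentations',3, ...
    'TimeStretchProbability',0, ...
    'PitchShiftProbability',0, ...
    'VolumeControlProbability',0.5, ...
    'AddNoiseProbability',1, ...
    'TimeShiftProbability',0);

% The result is a table with one row per augmentation.
data = augment(augmenter,audioIn,info.SampleRate);
augmentedSignal = data.Audio{1};
```

Which augmentations help depends on the task, so the probabilities above are placeholders rather than recommended settings.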

Feature Extraction

`audioFeatureExtractor`	Streamline audio feature extraction (since R2019b)
`openl3Embeddings`	Extract OpenL3 feature embeddings (since R2022a)
`pitchnn`	Estimate pitch with deep learning neural network (since R2021a)
`vggishEmbeddings`	Extract VGGish feature embeddings (since R2022a)
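
As an illustration of the first entry above, the following sketch extracts MFCCs and the spectral centroid from a synthetic noise signal with `audioFeatureExtractor`. It assumes Audio Toolbox and Signal Processing Toolbox are installed; the window length, overlap, and choice of features are arbitrary values picked for the example, not recommended settings. The embedding functions (`openl3Embeddings`, `vggishEmbeddings`) and `pitchnn` are not shown because they require pretrained models.

```matlab
% Synthetic one-second noise signal, so no audio file is needed.
fs = 16e3;
audioIn = 0.1*randn(fs,1);

% Enable two features; all other features stay disabled by default.
afe = audioFeatureExtractor( ...
    'SampleRate',fs, ...
    'Window',hamming(512,'periodic'), ...
    'OverlapLength',256, ...
    'mfcc',true, ...
    'spectralCentroid',true);

% Each row of the output corresponds to one analysis window,
% each column to one feature value.
features = extract(afe,audioIn);

% info maps each enabled feature to its column indices in the output.
idx = info(afe);
mfccs = features(:,idx.mfcc);
centroid = features(:,idx.spectralCentroid);
```

Using the index structure instead of hard-coded column numbers keeps the code valid if features are later added or removed.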

Pretrained Networks

`classifySound`	Classify sounds in audio signal (since R2020b)
`crepe`	(Not recommended) CREPE neural network (since R2021a)
`crepePreprocess`	Preprocess audio for CREPE deep learning network (since R2021a)
`crepePostprocess`	Postprocess output of CREPE deep learning network (since R2021a)
`openl3`	(Not recommended) OpenL3 neural network (since R2021a)
`openl3Embeddings`	Extract OpenL3 feature embeddings (since R2022a)
`openl3Preprocess`	Preprocess audio for OpenL3 feature extraction (since R2021a)
`pitchnn`	Estimate pitch with deep learning neural network (since R2021a)
`vggish`	(Not recommended) VGGish neural network (since R2020b)
`vggishEmbeddings`	Extract VGGish feature embeddings (since R2022a)
`vggishPreprocess`	Preprocess audio for VGGish feature extraction (since R2021a)
`yamnet`	(Not recommended) YAMNet neural network (since R2020b)
`yamnetGraph`	Graph of YAMNet AudioSet ontology (since R2020b)
`yamnetPreprocess`	Preprocess audio for YAMNet classification (since R2021a)

Blocks

VGGish	VGGish embeddings extraction network (since R2022a)
VGGish Embeddings	Extract VGGish embeddings (since R2022a)
YAMNet	YAMNet sound classification network (since R2021b)
Sound Classifier	Classify sounds in audio signal (since R2021b)
OpenL3	OpenL3 embeddings extraction network (since R2022b)
OpenL3 Embeddings	Extract OpenL3 embeddings (since R2022b)

Topics

Deep Learning for Audio Applications (Audio Toolbox)
Learn common tools and workflows to apply deep learning to audio applications.
Classify Sound Using Deep Learning (Audio Toolbox)
Train, validate, and test a simple long short-term memory (LSTM) network to classify sounds.
Adapt Pretrained Audio Network for New Data Using Deep Network Designer
This example shows how to interactively adapt a pretrained network to classify new audio signals using Deep Network Designer.
Audio Transfer Learning Using Experiment Manager
Configure an experiment that compares the performance of multiple pretrained networks applied to a speech command recognition task using transfer learning.
Speaker Identification Using Custom SincNet Layer and Deep Learning
Perform speaker identification using a custom deep learning layer that implements a mel-scale filter bank.
Dereverberate Speech Using Deep Learning Networks
Train a deep learning model that removes reverberation from speech.
Speech Command Recognition in Simulink
Detect the presence of speech commands in audio using a Simulink® model.
Sequential Feature Selection for Audio Features
This example shows a typical workflow for feature selection applied to the task of spoken digit recognition.
Train Spoken Digit Recognition Network Using Out-of-Memory Audio Data
This example trains a spoken digit recognition network on out-of-memory audio data using a transformed datastore.
Train Spoken Digit Recognition Network Using Out-of-Memory Features
This example trains a spoken digit recognition network on out-of-memory auditory spectrograms using a transformed datastore.
Investigate Audio Classifications Using Deep Learning Interpretability Techniques
This example shows how to use interpretability techniques to investigate the predictions of a deep neural network trained to classify audio data.
Accelerate Audio Deep Learning Using GPU-Based Feature Extraction
Leverage GPUs for feature extraction to decrease the time required to train an audio deep learning model.

Related Information

Featured Examples

Audio-Based Anomaly Detection for Machine Health Monitoring

Design an autoencoder neural network to perform anomaly detection for machine sounds using unsupervised learning.

Train 3-D Speech Enhancement Network Using Deep Learning

3-D Speech Enhancement Using Trained Filter and Sum Network

Train 3-D Sound Event Localization and Detection (SELD) Using Deep Learning

3-D Sound Event Localization and Detection Using Trained Recurrent Convolutional Neural Network

Speaker Recognition Using x-vectors

Speaker Diarization Using x-vectors

Train Speech Command Recognition Model Using Deep Learning

Keyword Spotting in Noise Using MFCC and LSTM Networks

Denoise Speech Using Deep Learning Networks

Train Generative Adversarial Network (GAN) for Sound Synthesis

Voice Activity Detection in Noise Using Deep Learning

Speech Emotion Recognition

Acoustic Scene Recognition Using Late Fusion

End-to-End Deep Speaker Separation

Acoustics-Based Machine Fault Recognition

Keyword Spotting in Noise Code Generation on Raspberry Pi

Speech Command Recognition Code Generation with Intel MKL-DNN

Acoustics-Based Machine Fault Recognition Code Generation

Speech Command Recognition on Raspberry Pi Using Simulink