"Matlab Forever!" (Audio Signal Processing Project)
Broad Aim: When did someone say something on a TV program?
Viewers of a TV program (e.g. game show contestants answering timed questions) believe they are watching the same content at the same time as other viewers across the country. But due to transmission delays, they are actually watching a given part of the program seconds or minutes apart. This time difference between the original source material (audio reference file) and the viewed material (the user’s audio file) is what I am trying to determine using Matlab.
The Application: A user points their phone at a TV in a normal living room, where there will be extraneous noise in addition to the program audio coming from the TV. The user records a sample of a few seconds at the beginning of the program. This captured sound sample is transmitted back to the server, where it is compared against a reference audio file of the same content. The aim is to determine exactly where the viewer is up to in the TV program timeline.
Example
Stage 1: Each user is invited to grab a sound sample of a few seconds of the program material at the very beginning of the program. This is relayed back to the server as the user's initial program time.
Stage 2: At 5 minutes, 32 seconds from the beginning of the program (an arbitrary start point) the actor raises his hand and declares "Matlab Forever!".
User 1: Views this scene at 00:05:34 and acknowledges this cue by tapping a button on his phone UI.
User 2: Views this scene at 00:05:36, ditto
User 3: Views this scene at 00:05:42, ditto
The system now knows that users 1, 2, and 3 have delays of +2, +4, and +10 seconds respectively from when the material was broadcast. Mission accomplished.
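The arithmetic in this example can be sketched in a few lines. This is only an illustration in Python, not part of any existing system; the helper name `to_seconds` and the dictionary layout are assumptions made for the sketch, and the timestamps are the ones given above.

```python
def to_seconds(hhmmss):
    """Convert an 'HH:MM:SS' string to a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# Time of the cue in the original broadcast (Stage 2 above).
broadcast_cue = to_seconds("00:05:32")

# Time at which each user tapped the button, on their own program clock.
taps = {
    "User 1": "00:05:34",
    "User 2": "00:05:36",
    "User 3": "00:05:42",
}

# Delay of each user relative to the broadcast, in seconds.
delays = {user: to_seconds(t) - broadcast_cue for user, t in taps.items()}

for user, delay in delays.items():
    print(f"{user}: +{delay} s")
```

Note that the result depends on how quickly and consistently each user taps, so the delays are only as precise as the users' reactions.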
Advice Sought
1) How difficult is this project for someone who is not an engineer and has no mathematical background or programming experience? I have never used Matlab before, but will download a trial shortly. That said, I do have some experience with audio, music, and sound processing.
2) It's hard for me to gauge the feasibility of being able to pull this off. No doubt the community can help with the occasional problem, which is great, but I imagine there will be parts of the project where I need direct help. I am happy to pay for a few hours here and there, as that would seem fair. (Not sure if the community rules allow this) Anyone interested on that basis?
3) Can this project be done with these three Matlab tools: Signal Processing Toolbox, Signal Analyzer, and Audio Toolbox? Perhaps Simulink too?
4) To kickstart this, any methodology tips would be greatly appreciated.
Thanks!
5 comments
Walter Roberson
28 Feb 2021
The example is mostly disconnected from the purpose. It requires that the actor say something known in advance and that the users click with fair precision. I would not recommend those approaches at all. I would suggest that you examine "music identification" techniques instead.
Walter Roberson
1 Mar 2021
That is, the techniques for "music identification" (such as the Shazam app) have solved how to synchronize extracting audio features without needing to have a distinct start point.
Once you have a synchronized start point then you can use cross-correlation or similar techniques to identify lags.
You would not need to have any special wake-word, just that there would have to be enough distinct audio to key on.
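As a rough illustration of the cross-correlation step only (not the fingerprinting used by music-identification apps), the sketch below slides a short clip along a longer reference and picks the offset with the highest correlation score. It is written in plain Python with synthetic noise standing in for audio; the function name `estimate_lag`, the sample rate, and the noise level are all assumptions made for the example. In MATLAB, the Signal Processing Toolbox function `xcorr` would be the natural starting point for the same idea.

```python
import random

def estimate_lag(reference, clip):
    """Return the offset (in samples) where clip best matches reference.

    Brute-force cross-correlation: for every candidate offset, sum the
    products of overlapping samples and keep the offset with the top score.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(clip)
    for lag in range(len(reference) - n + 1):
        window = reference[lag:lag + n]
        score = sum(r * c for r, c in zip(window, clip))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

rng = random.Random(0)   # fixed seed so the example is repeatable
sample_rate = 100        # hypothetical: samples per second

# Synthetic "reference audio": 10 seconds of noise-like signal.
reference = [rng.uniform(-1.0, 1.0) for _ in range(1000)]

# Synthetic "phone recording": 2 seconds taken from the reference at a
# known offset, with extra noise added to mimic a living room.
true_offset = 250
clip = [x + rng.gauss(0.0, 0.2)
        for x in reference[true_offset:true_offset + 200]]

lag = estimate_lag(reference, clip)
offset_seconds = lag / sample_rate
print(f"clip starts {offset_seconds} s into the reference")
```

Real recordings would need more care than this: the raw score favours loud passages, so a normalized correlation is usually preferable, and for long files an FFT-based correlation is far faster than this brute-force loop.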
Steven Lord
1 Mar 2021
How do you envision MATLAB being involved here? Nothing in your problem description or example mentions MATLAB at all (other than it being part of the actor's speech to be recognized.)
Pete M
2 Mar 2021
Pete M
2 Mar 2021
Answers (1)
Pete M
28 Feb 2021
0 votes