splitEachLabel
Splits datastore according to specified label proportions
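Syntax
[ADS1,ADS2] = splitEachLabel(ADS,p)
[ADS1,...,ADSM] = splitEachLabel(ADS,p1,...,pN)
___ = splitEachLabel(___,'randomized')
___ = splitEachLabel(___,Name,Value)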
Description
[ADS1,ADS2] = splitEachLabel(ADS,p) splits the audio files in ADS into two new datastores, ADS1 and ADS2. The new datastore ADS1 contains the first p files from each label, and ADS2 contains the remaining files from each label. p can be either a number between 0 and 1, exclusive, indicating the percentage of the files from each label to assign to ADS1, or an integer indicating the absolute number of files from each label to assign to ADS1.
[ADS1,...,ADSM] = splitEachLabel(ADS,p1,...,pN) splits the datastore into N+1 new datastores. The new datastore ADS1 contains the first p1 files from each label, the next new datastore ADS2 contains the next p2 files, and so on. If p1,…,pN represent numbers of files, then their sum must be no more than the number of files in the smallest label in the original datastore, ADS.
___ = splitEachLabel(___,'randomized') randomly assigns the specified proportion of files from each label to the new datastores.
___ = splitEachLabel(___,Name,Value) specifies the properties of the new datastores using one or more name-value pair arguments. For example, you can specify which labels to split with 'Include','labelname'.
Examples
Split by Fractions
Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');
Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half of the files are labeled B.
labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
    repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;
countEachLabel(ADS)
ans=2×2 table
Label Count
_____ _____
A 10
B 10
Split ADS into two datastores, ADS1 and ADS2, specifying that each new datastore contains fifty percent of each label and the corresponding files. Call countEachLabel to confirm that half of the files are labeled A and half of the files are labeled B for each of the new datastores.
[ADS1,ADS2] = splitEachLabel(ADS,0.5)
ADS1 = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
        ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
        ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
        ... and 7 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A' ... and 7 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
ADS2 = audioDatastore with properties:
    Files: {
        ' .../runnable/matlab/toolbox/audio/samples/Engine-16-44p1-stereo-20sec.wav';
        ' .../matlab/toolbox/audio/samples/FemaleSpeech-16-8-mono-3secs.wav';
        ' .../build/runnable/matlab/toolbox/audio/samples/Heli_16ch_ACN_SN3D.wav'
        ... and 7 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A' ... and 7 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
ADS1count = countEachLabel(ADS1)
ADS1count=2×2 table
Label Count
_____ _____
A 5
B 5
ADS2count = countEachLabel(ADS2)
ADS2count=2×2 table
Label Count
_____ _____
A 5
B 5
Split by Number of Files
Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');
Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half of the files are labeled B.
labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
    repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;
countEachLabel(ADS)
ans=2×2 table
Label Count
_____ _____
A 10
B 10
Split ADS into two datastores, ADS1 and ADS2. Specify that ADS1 contains four files from each label. ADS2 contains the remaining files from each label. Call countEachLabel to confirm that ADS1 contains four files labeled A and four files labeled B, and that ADS2 contains the remaining files.
[ADS1,ADS2] = splitEachLabel(ADS,4)
ADS1 = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
        ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
        ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
        ... and 5 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A' ... and 5 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
ADS2 = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Counting-16-44p1-mono-15secs.wav';
        ' .../runnable/matlab/toolbox/audio/samples/Engine-16-44p1-stereo-20sec.wav';
        ' .../matlab/toolbox/audio/samples/FemaleSpeech-16-8-mono-3secs.wav'
        ... and 9 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A' ... and 9 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
ADS1count = countEachLabel(ADS1)
ADS1count=2×2 table
Label Count
_____ _____
A 4
B 4
ADS2count = countEachLabel(ADS2)
ADS2count=2×2 table
Label Count
_____ _____
A 6
B 6
Split Several Ways by Fractions
Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');
Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half of the files are labeled B.
labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
    repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;
countEachLabel(ADS)
ans=2×2 table
Label Count
_____ _____
A 10
B 10
Split ADS into three new datastores, ADS60, ADS10, and ADS30. The first datastore, ADS60, contains the first 60% of files with the A label and the first 60% of files with the B label. ADS10 contains the next 10% of files from each label. ADS30 contains the remaining 30% of files from each label. If the percentage applied to a label does not result in a whole number of files, splitEachLabel rounds down to the nearest whole number.
[ADS60,ADS10,ADS30] = splitEachLabel(ADS,0.6,0.1)
ADS60 = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
        ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
        ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
        ... and 9 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A' ... and 9 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
ADS10 = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/FemaleSpeech-16-8-mono-3secs.wav';
        ' .../matlab/toolbox/audio/samples/TrainWhistle-16-44p1-mono-9secs.wav'
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'B'}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
ADS30 = audioDatastore with properties:
    Files: {
        ' .../build/runnable/matlab/toolbox/audio/samples/Heli_16ch_ACN_SN3D.wav';
        ' .../matlab/toolbox/audio/samples/JetAirplane-16-11p025-mono-16secs.wav';
        ' .../runnable/matlab/toolbox/audio/samples/Laughter-16-8-mono-4secs.wav'
        ... and 3 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A' ... and 3 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
Call countEachLabel to confirm the correct distribution of labels for each datastore.
countEachLabel(ADS60)
ans=2×2 table
Label Count
_____ _____
A 6
B 6
countEachLabel(ADS10)
ans=2×2 table
Label Count
_____ _____
A 1
B 1
countEachLabel(ADS30)
ans=2×2 table
Label Count
_____ _____
A 3
B 3
Split Labels Several Ways by Number of Files
Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');
Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half of the files are labeled B.
labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
    repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;
countEachLabel(ADS)
ans=2×2 table
Label Count
_____ _____
A 10
B 10
Split ADS into three new datastores, ADS1, ADS2, and ADS3. The first datastore, ADS1, contains the first file with the A label and the first file with the B label. ADS2 contains the next file from each label. ADS3 contains the remaining files from each label. If the percentage applied to a label does not result in a whole number of files, splitEachLabel rounds down to the nearest whole number.
[ADS1,ADS2,ADS3] = splitEachLabel(ADS,1,1)
ADS1 = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
        ' .../matlab/toolbox/audio/samples/MainStreetOne-16-16-mono-12secs.wav'
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'B'}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
ADS2 = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
        ' .../matlab/toolbox/audio/samples/NoisySpeech-16-22p5-mono-5secs.wav'
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'B'}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
ADS3 = audioDatastore with properties:
    Files: {
        ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav';
        ' .../runnable/matlab/toolbox/audio/samples/Click-16-44p1-mono-0.2secs.wav';
        ' .../matlab/toolbox/audio/samples/Counting-16-44p1-mono-15secs.wav'
        ... and 13 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A' ... and 13 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
Call countEachLabel to confirm the correct distribution of labels for each datastore.
countEachLabel(ADS1)
ans=2×2 table
Label Count
_____ _____
A 1
B 1
countEachLabel(ADS2)
ans=2×2 table
Label Count
_____ _____
A 1
B 1
countEachLabel(ADS3)
ans=2×2 table
Label Count
_____ _____
A 8
B 8
Split Labels in Random Order
Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav')
ADS = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
        ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
        ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
        ... and 17 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    Labels: {}
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half of the files are labeled B.
labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
    repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;
countEachLabel(ADS)
ans=2×2 table
Label Count
_____ _____
A 10
B 10
Create two new datastores from the files in ADS by randomly drawing from each label. The first datastore, ADS1, contains two random files with the A label and two random files with the B label. ADS2 contains the remaining files from each label.
[ADS1,ADS2] = splitEachLabel(ADS,2,'randomized')
ADS1 = audioDatastore with properties:
    Files: {
        ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav';
        ' .../runnable/matlab/toolbox/audio/samples/Engine-16-44p1-stereo-20sec.wav';
        ' .../matlab/toolbox/audio/samples/MainStreetOne-16-16-mono-12secs.wav'
        ... and 1 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'B' ... and 1 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
ADS2 = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
        ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
        ' .../runnable/matlab/toolbox/audio/samples/Click-16-44p1-mono-0.2secs.wav'
        ... and 13 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A' ... and 13 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
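Conceptually, a randomized split by file count draws the requested number of files at random within each label. The following plain MATLAB sketch illustrates that idea on a hypothetical label list. It is not the toolbox implementation, and labels, inFirst, firstIdx, and secondIdx are names introduced here for illustration only. The call to rng is there only to make the sketch repeatable.

```matlab
% Hypothetical stand-in for ADS.Labels: 10 files labeled A, 10 labeled B.
labels = [repmat({'A'},1,10), repmat({'B'},1,10)];
p = 2;                                   % files per label for the first split

rng(0);                                  % fix the seed so the draw is repeatable
[names,~,idx] = unique(labels);          % idx(k) is the label index of file k
inFirst = false(size(labels));
for k = 1:numel(names)
    members = find(idx == k);            % positions of the files with label k
    pick = members(randperm(numel(members),p));  % p of them, chosen at random
    inFirst(pick) = true;
end

firstIdx  = find(inFirst);               % files assigned to the first split
secondIdx = find(~inFirst);              % remaining files
```

Each label contributes exactly p files to the first split, regardless of which files are drawn.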
Include and Exclude Specified Labels
Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav')
ADS = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
        ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
        ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
        ... and 17 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    Labels: {}
    SupportedOutputFormats: ["wav" "flac" ... ] (1x7 string)
    DefaultOutputFormat: "wav"
Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half of the files are labeled B.
labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
    repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;
countEachLabel(ADS)
ans=2×2 table
Label Count
_____ _____
A 10
B 10
Create two new datastores from the files in ADS, including only the files with the A label. ADS1 contains the first 70% of files with the A label, and ADS2 contains the remaining 30% of files with the A label.
[ADS1,ADS2] = splitEachLabel(ADS,0.7,'Include','A')
ADS1 = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
        ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
        ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
        ... and 4 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A' ... and 4 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" ... ] (1x7 string)
    DefaultOutputFormat: "wav"
ADS2 = audioDatastore with properties:
    Files: {
        ' .../build/runnable/matlab/toolbox/audio/samples/Heli_16ch_ACN_SN3D.wav';
        ' .../matlab/toolbox/audio/samples/JetAirplane-16-11p025-mono-16secs.wav';
        ' .../runnable/matlab/toolbox/audio/samples/Laughter-16-8-mono-4secs.wav'
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A'}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" ... ] (1x7 string)
    DefaultOutputFormat: "wav"
Equivalently, you can split only the A label by excluding the B label.
[ADS1,ADS2] = splitEachLabel(ADS,0.7,'Exclude','B')
ADS1 = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
        ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
        ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
        ... and 4 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A' ... and 4 more}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" ... ] (1x7 string)
    DefaultOutputFormat: "wav"
ADS2 = audioDatastore with properties:
    Files: {
        ' .../build/runnable/matlab/toolbox/audio/samples/Heli_16ch_ACN_SN3D.wav';
        ' .../matlab/toolbox/audio/samples/JetAirplane-16-11p025-mono-16secs.wav';
        ' .../runnable/matlab/toolbox/audio/samples/Laughter-16-8-mono-4secs.wav'
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    Labels: {'A'; 'A'; 'A'}
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    SupportedOutputFormats: ["wav" "flac" ... ] (1x7 string)
    DefaultOutputFormat: "wav"
Split Using Fraction and Label Table
Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder)
ADS = audioDatastore with properties:
    Files: {
        ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
        ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
        ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
        ... and 36 more
    }
    Folders: {
        ' .../Bdoc24b.2725827/build/runnable/matlab/toolbox/audio/samples'
    }
    AlternateFileSystemRoots: {}
    OutputDataType: 'double'
    OutputEnvironment: 'cpu'
    Labels: {}
    SupportedOutputFormats: ["wav" "flac" "ogg" "opus" "mp3" "mp4" "m4a"]
    DefaultOutputFormat: "wav"
Create a label table with two variables:
- containsMusic -- Can be either true or false.
- instrument -- Can be Guitar, Drums, or Unknown.
containsGuitar = contains(ADS.Files,'guitar','IgnoreCase',true);
containsDrums = contains(ADS.Files,'drum','IgnoreCase',true);
containsMusic = or(containsGuitar,containsDrums);
instrument = strings(size(ADS.Files));
instrument(:) = "Unknown";
instrument(containsGuitar) = "Guitar";
instrument(containsDrums) = "Drums";
Assign the label table to the Labels property of the audio datastore to associate the rows of the label table with the files in the datastore. Call countEachLabel to determine the incidences of containsMusic and instrument.
labels = table(containsMusic,instrument);
ADS.Labels = labels;
containsMusicCount = countEachLabel(ADS,'TableVariable','containsMusic')
containsMusicCount=2×2 table
containsMusic Count
_____________ _____
false 32
true 7
instrumentCount = countEachLabel(ADS,'TableVariable','instrument')
instrumentCount=3×2 table
instrument Count
__________ _____
Drums 4
Guitar 3
Unknown 32
Split the datastore ADS into two, based on whether the audio file contains music. ADS1 contains 70% of the audio files for each value of containsMusic, and ADS2 contains the rest. Call countEachLabel to verify that the ratio of containsMusic == true to containsMusic == false is preserved for the new datastores, within rounding.
[ADS1,ADS2] = splitEachLabel(ADS,0.7,'TableVariable','containsMusic');
ADS1_containsMusicCount = countEachLabel(ADS1,'TableVariable','containsMusic')
ADS1_containsMusicCount=2×2 table
containsMusic Count
_____________ _____
false 22
true 5
ADS2_containsMusicCount = countEachLabel(ADS2,'TableVariable','containsMusic')
ADS2_containsMusicCount=2×2 table
containsMusic Count
_____________ _____
false 10
true 2
Split the datastore ADS into two, based on the type of instrument present in the audio file. ADS3 contains 25% of the audio files for each instrument label, and ADS4 contains the rest. Call countEachLabel to verify that the ratio of instrument == "Drums" to instrument == "Guitar" is preserved for the new datastores, within rounding.
[ADS3,ADS4] = splitEachLabel(ADS,0.25,'TableVariable','instrument');
ADS3_instrumentCount = countEachLabel(ADS3,'TableVariable','instrument')
ADS3_instrumentCount=3×2 table
instrument Count
__________ _____
Drums 1
Guitar 1
Unknown 8
ADS4_instrumentCount = countEachLabel(ADS4,'TableVariable','instrument')
ADS4_instrumentCount=3×2 table
instrument Count
__________ _____
Drums 3
Guitar 2
Unknown 24
Split by Number of Files and Label Table
Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder);
Create a label table with two variables:
- containsMusic -- Can be either true or false.
- instrument -- Can be Guitar, Drums, or Unknown.
containsGuitar = contains(ADS.Files,'guitar','IgnoreCase',true);
containsDrums = contains(ADS.Files,'drum','IgnoreCase',true);
containsMusic = or(containsGuitar,containsDrums);
instrument = strings(size(ADS.Files));
instrument(:) = "Unknown";
instrument(containsGuitar) = "Guitar";
instrument(containsDrums) = "Drums";
Assign the label table to the Labels property of the audio datastore to associate the rows of the label table with the files in the datastore. Call countEachLabel to determine the incidences of containsMusic and instrument.
labels = table(containsMusic,instrument);
ADS.Labels = labels;
containsMusicCount = countEachLabel(ADS,'TableVariable','containsMusic')
containsMusicCount=2×2 table
containsMusic Count
_____________ _____
false 32
true 7
instrumentCount = countEachLabel(ADS,'TableVariable','instrument');
Split the datastore ADS into two, based on whether the audio file contains music. ADS1 contains 5 files for each label under the table variable containsMusic, and ADS2 contains the rest. Call countEachLabel to verify.
[ADS1,ADS2] = splitEachLabel(ADS,5,'TableVariable','containsMusic');
ADS1_containsMusicCount = countEachLabel(ADS1,'TableVariable','containsMusic')
ADS1_containsMusicCount=2×2 table
containsMusic Count
_____________ _____
false 5
true 5
ADS2_containsMusicCount = countEachLabel(ADS2,'TableVariable','containsMusic')
ADS2_containsMusicCount=2×2 table
containsMusic Count
_____________ _____
false 27
true 2
Split the datastore ADS into two, based on the type of instrument present in the audio file. ADS3 contains 2 files for each label under the table variable instrument, and ADS4 contains the rest. Call countEachLabel to verify.
[ADS3,ADS4] = splitEachLabel(ADS,2,'TableVariable','instrument');
ADS3_instrumentCount = countEachLabel(ADS3,'TableVariable','instrument')
ADS3_instrumentCount=3×2 table
instrument Count
__________ _____
Drums 2
Guitar 2
Unknown 2
ADS4_instrumentCount = countEachLabel(ADS4,'TableVariable','instrument')
ADS4_instrumentCount=3×2 table
instrument Count
__________ _____
Drums 2
Guitar 1
Unknown 30
Input Arguments
ADS — Input audio datastore
audioDatastore object
Input audio datastore, specified as an audioDatastore object.
p — Proportion of files to split
scalar in interval (0,1) | positive integer scalar
Proportion of files to split, specified as a scalar in the interval (0,1), or a positive integer scalar.
If p is in the interval (0,1), it represents the percentage of the files from each label to assign to ADS1. If p represents a percentage, and it does not result in a whole number, then splitEachLabel rounds down to the nearest whole number.
If p is an integer, it represents the absolute number of files from each label to assign to ADS1. When p represents a number of files, there must be at least p files associated with each label.
Data Types: double
p1,...,pN — List of proportions
scalars in interval (0,1) | positive integer scalars
List of proportions, specified as scalars in the interval (0,1) or positive integer scalars.
If the proportions are in the interval (0,1), they represent the percentage of the files from each label to assign to the output datastores. When the proportions represent percentages, their sum must be no more than 1.
If the proportions are integers, they indicate the absolute number of files from each label to assign to the output datastores. When the proportions represent numbers of files, there must be enough files associated with each label to satisfy each proportion.
Data Types: double
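As a minimal sketch of the file-count constraint, the following plain MATLAB check could be run before calling splitEachLabel with integer proportions. It confirms that the sum of the requested counts does not exceed the number of files in the smallest label. The labels cell array is a hypothetical stand-in for ADS.Labels, and all variable names are illustrative only.

```matlab
% Hypothetical stand-in for ADS.Labels: 10 files labeled A, 7 labeled B.
labels = [repmat({'A'},1,10), repmat({'B'},1,7)];
p = [4 2];                           % requested files per label for ADS1 and ADS2

% Count the files associated with each label.
[names,~,idx] = unique(labels);      % names = {'A','B'}
counts = accumarray(idx(:),1).';     % counts = [10 7]

% The requested counts fit only if their sum does not exceed the smallest label.
isFeasible = sum(p) <= min(counts);
leftover = counts - sum(p);          % files per label left for the last datastore
```

Here the smallest label has 7 files and the request totals 6 per label, so the split is feasible and the last datastore would receive 4 files labeled A and 1 file labeled B.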
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: [ADS1,ADS2] = splitEachLabel(ADS,0.5,'Exclude','noisy')
Include — Labels to include
categorical, logical, or numeric vector | cell array of character vectors | string array
Labels to include, specified as the comma-separated pair consisting of 'Include' and a vector, cell array, or string array of label names with the same type as the Labels property. Each name must match one of the labels in the Labels property of the datastore.
This option cannot be used with the 'Exclude' option.
Exclude — Labels to exclude
categorical, logical, or numeric vector | cell array of character vectors | string array
Labels to exclude, specified as the comma-separated pair consisting of 'Exclude' and a vector, cell array, or string array of label names with the same type as the Labels property. Each name must match one of the labels in the Labels property of the datastore.
This option cannot be used with the 'Include' option.
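Because 'Include' and 'Exclude' cannot be combined, an exclusion list can be turned into the equivalent inclusion list with setdiff. A minimal sketch, assuming a hypothetical set of label names that is not taken from the examples on this page:

```matlab
% Hypothetical label names present in a datastore (illustrative only).
allLabels = {'A','B','C'};
excluded  = {'B'};

% Labels that remain once the excluded ones are removed.
included = setdiff(allLabels,excluded);   % {'A','C'}
```

Passing included with 'Include' would then be expected to select the same labels as passing excluded with 'Exclude'.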
TableVariable — Label table variable name
char | string
Table variable name, specified as the comma-separated pair consisting of 'TableVariable' and a character vector or string. When the Labels property of the audio datastore ADS is a table, you must use 'TableVariable' to specify which label you are using to split.
Data Types: char | string
Output Arguments
[ADS1,ADS2] — Output audio datastores
audioDatastore objects
Output audio datastores, returned as audioDatastore objects. ADS1 contains the specified proportion of files from each label in ADS, and ADS2 contains the remaining files.
[ADS1,...,ADSM] — List of output audio datastores
audioDatastore objects
List of output audio datastores, returned as audioDatastore objects. The number of elements in the list is one more than the number of listed proportions. Each of the new datastores contains the proportion of each label in ADS defined by p1,…,pN. Any files left over are assigned to the Mth datastore.
Version History
Introduced in R2018b