splitEachLabel

Split ImageDatastore labels by proportions

Description

[imds1,imds2] = splitEachLabel(imds,p) splits the image files in imds into two new datastores, imds1 and imds2. The new datastore imds1 contains the first p files from each label, and imds2 contains the remaining files from each label. p can be either a number between 0 and 1 indicating the fraction of the files from each label to assign to imds1, or an integer indicating the absolute number of files from each label to assign to imds1.

[imds1,...,imdsM] = splitEachLabel(imds,p1,...,pN) splits the datastore into N+1 new datastores. The first new datastore imds1 contains the first p1 files from each label, the next new datastore imds2 contains the next p2 files, and so on. If p1,...,pN represent numbers of files, then their sum must be no more than the number of files in the smallest label in the original datastore imds.

___ = splitEachLabel(___,'randomized') randomly assigns the specified proportion of files from each label to the new datastores.

___ = splitEachLabel(___,Name,Value) specifies the properties of the new datastores using one or more name-value pair arguments. For example, you can specify which labels to split with 'Include','labelname'.

Examples

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}),...
'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels
ans = 

     demos 
     demos 
     demos 
     demos 
     demos 
     demos 
     imagesci 
     imagesci 

Create two new datastores from the files in imds. The first datastore imds60 contains the first 60% of files with the demos label and the first 60% of files with the imagesci label. The second datastore imds40 contains the remaining 40% of files from each label. If the percentage applied to a label does not result in a whole number of files, splitEachLabel rounds to the nearest whole number. In the output below, 60% of the 6 demos files (3.6) becomes 4 files, and 60% of the 2 imagesci files (1.2) becomes 1 file.

[imds60,imds40] = splitEachLabel(imds,0.6)
imds60 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
             ' ...\matlab\toolbox\matlab\demos\example.tif';
             ' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
              ... and 2 more
             }
     Labels: [demos; demos; demos ... and 2 more categorical]
    ReadFcn: @readDatastoreImage


imds40 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\street1.jpg';
             ' ...\matlab\toolbox\matlab\demos\street2.jpg';
             ' ...\matlab\toolbox\matlab\imagesci\peppers.png'
             }
     Labels: [demos; demos; imagesci]
    ReadFcn: @readDatastoreImage

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}),...
'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels
ans = 

     demos 
     demos 
     demos 
     demos 
     demos 
     demos 
     imagesci 
     imagesci 

Create two new datastores from the files in imds. The first datastore imds1 contains the first file with the demos label and the first file with the imagesci label. The second datastore imds2 contains the remaining files from each label.

[imds1,imds2] = splitEachLabel(imds,1)
imds1 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
             ' ...\matlab\toolbox\matlab\imagesci\corn.tif'
             }
     Labels: [demos; imagesci]
    ReadFcn: @readDatastoreImage


imds2 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\example.tif';
             ' ...\matlab\toolbox\matlab\demos\landOcean.jpg';
             ' ...\matlab\toolbox\matlab\demos\ngc6543a.jpg'
              ... and 3 more
             }
     Labels: [demos; demos; demos ... and 3 more categorical]
    ReadFcn: @readDatastoreImage

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}),...
'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels
ans = 

     demos 
     demos 
     demos 
     demos 
     demos 
     demos 
     imagesci 
     imagesci 

Create three new datastores from the files in imds. The first datastore imds60 contains the first 60% of files with the demos label and the first 60% of files with the imagesci label. The second datastore imds10 contains the next 10% of files from each label. The third datastore imds30 contains the remaining 30% of files from each label. If the percentage applied to a label does not result in a whole number of files, splitEachLabel rounds to the nearest whole number. In the output below, the 6 demos files are split 4, 1, and 1, and the 2 imagesci files are split 1, 0, and 1.

[imds60, imds10, imds30] = splitEachLabel(imds,0.6,0.1)
imds60 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
             ' ...\matlab\toolbox\matlab\demos\example.tif';
             ' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
              ... and 2 more
             }
     Labels: [demos; demos; demos ... and 2 more categorical]
    ReadFcn: @readDatastoreImage


imds10 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\street1.jpg'
             }
     Labels: demos
    ReadFcn: @readDatastoreImage


imds30 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\street2.jpg';
             ' ...\matlab\toolbox\matlab\imagesci\peppers.png'
             }
     Labels: [demos; imagesci]
    ReadFcn: @readDatastoreImage
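
The sizes in the output above can be checked by hand. The following is a minimal sketch, not the function's implementation: it assumes each listed proportion is rounded to the nearest whole number per label, which is inferred from the displayed output rather than taken from the source of splitEachLabel. The variable names are illustrative only.

```matlab
% Label counts taken from the imds.Labels listing above.
counts = [6; 2];        % rows: demos, imagesci
p = [0.6 0.1];          % proportions passed to splitEachLabel

% Files per label for each listed proportion
% (assumption: rounded to the nearest whole number).
taken = round(counts .* p);          % 2-by-2: rows are labels, columns are outputs

% The last datastore receives whatever is left over for each label.
leftover = counts - sum(taken, 2);

perLabel = [taken leftover]          % columns: imds60, imds10, imds30
totals = sum(perLabel, 1)            % number of files in each output datastore
```

The column totals can be compared with the number of files listed for imds60, imds10, and imds30. Note that imds10 receives no imagesci file, because 10% of 2 files is less than half a file.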

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}),...
'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels
ans = 

     demos 
     demos 
     demos 
     demos 
     demos 
     demos 
     imagesci 
     imagesci 

Create three new datastores from the files in imds. The first datastore imds1 contains the first file with the demos label and the first file with the imagesci label. The second datastore imds2 contains the next file from each label. The third datastore imds3 contains the remaining files from each label.

[imds1, imds2, imds3] = splitEachLabel(imds,1,1)
imds1 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
             ' ...\matlab\toolbox\matlab\imagesci\corn.tif'
             }
     Labels: [demos; imagesci]
    ReadFcn: @readDatastoreImage


imds2 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\example.tif';
             ' ...\matlab\toolbox\matlab\imagesci\peppers.png'
             }
     Labels: [demos; imagesci]
    ReadFcn: @readDatastoreImage


imds3 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\landOcean.jpg';
             ' ...\matlab\toolbox\matlab\demos\ngc6543a.jpg';
             ' ...\matlab\toolbox\matlab\demos\street1.jpg'
              ... and 1 more
             }
     Labels: [demos; demos; demos ... and 1 more categorical]
    ReadFcn: @readDatastoreImage

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}),...
'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels
ans = 

     demos 
     demos 
     demos 
     demos 
     demos 
     demos 
     imagesci 
     imagesci 

Create two new datastores from the files in imds by randomly drawing from each label. The first datastore imds1 contains one random file with the demos label and one random file with the imagesci label. The second datastore imds2 contains the remaining files from each label.

[imds1, imds2] = splitEachLabel(imds,1,'randomized')
imds1 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\street2.jpg';
             ' ...\matlab\toolbox\matlab\imagesci\corn.tif'
             }
     Labels: [demos; imagesci]
    ReadFcn: @readDatastoreImage


imds2 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
             ' ...\matlab\toolbox\matlab\demos\example.tif';
             ' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
              ... and 3 more
             }
     Labels: [demos; demos; demos ... and 3 more categorical]
    ReadFcn: @readDatastoreImage
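
Because the selection is random, the files shown above can differ from run to run. If a repeatable split is needed, one option is to seed the random number generator before each call. This is a sketch under the assumption that splitEachLabel draws from MATLAB's global random stream; the seed value 0 is arbitrary.

```matlab
% Build the same datastore used in the example above.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}),...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

% Seed the global generator, then split (assumption: the random
% selection uses the global stream, so the seed controls it).
rng(0);
[a1, a2] = splitEachLabel(imds, 1, 'randomized');

% Reseed with the same value and split again.
rng(0);
[b1, b2] = splitEachLabel(imds, 1, 'randomized');

% Compare the two selections file by file.
sameSelection = isequal(a1.Files, b1.Files)
```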

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}),...
'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels
ans = 

     demos 
     demos 
     demos 
     demos 
     demos 
     demos 
     imagesci 
     imagesci 

Create two new datastores from the files in imds, including only the files with the demos label. The first datastore imds60 contains the first 60% of files with the demos label and the second datastore imds40 contains the remaining 40% of files with the demos label.

[imds60, imds40] = splitEachLabel(imds,0.6,'Include','demos')
imds60 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
             ' ...\matlab\toolbox\matlab\demos\example.tif';
             ' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
              ... and 1 more
             }
     Labels: [demos; demos; demos ... and 1 more categorical]
    ReadFcn: @readDatastoreImage


imds40 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\street1.jpg';
             ' ...\matlab\toolbox\matlab\demos\street2.jpg'
             }
     Labels: [demos; demos]
    ReadFcn: @readDatastoreImage

Equivalently, you can split only the demos label by excluding the imagesci label.

[imds60, imds40] = splitEachLabel(imds,0.6,'Exclude','imagesci')
imds60 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
             ' ...\matlab\toolbox\matlab\demos\example.tif';
             ' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
              ... and 1 more
             }
     Labels: [demos; demos; demos ... and 1 more categorical]
    ReadFcn: @readDatastoreImage


imds40 = 

  ImageDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\street1.jpg';
             ' ...\matlab\toolbox\matlab\demos\street2.jpg'
             }
     Labels: [demos; demos]
    ReadFcn: @readDatastoreImage

Input Arguments

Input datastore, specified as an ImageDatastore object. To create an ImageDatastore from your image data, use the imageDatastore function.

Proportion of files to split, specified as a scalar in the interval (0,1) or a positive integer scalar.

  • If p is in the interval (0,1), then it represents the fraction of the files from each label to assign to imds1. If p does not result in a whole number of files, then splitEachLabel rounds to the nearest whole number.

  • If p is an integer, then it represents the absolute number of files from each label to assign to imds1. There must be at least p files associated with each label.

Data Types: double

List of proportions, specified as scalars in the interval (0,1) or positive integer scalars. If the proportions are in the interval (0,1), then they represent the percentage of the files from each label to assign to the output datastores. If the proportions are integers, then they indicate the absolute number of files from each label to assign to the output datastores. When the proportions represent percentages, their sum must be no more than 1. When the proportions represent numbers of files, there must be enough files associated with each label to satisfy each proportion.

Data Types: double
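
When the proportions are file counts, it can help to check the smallest label before splitting. The sketch below uses countEachLabel, which returns a table with one row per label; the requested counts in n are hypothetical values chosen for illustration.

```matlab
% Build the same datastore used in the examples on this page.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}),...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

% Count the files per label (table variables: Label and Count).
tbl = countEachLabel(imds);
minCount = min(tbl.Count);

% Hypothetical request: 1 file per label for each of the first two outputs.
n = [1 1];

% Split only if every label has enough files to satisfy the request.
if sum(n) <= minCount
    [imds1, imds2, imds3] = splitEachLabel(imds, n(1), n(2));
else
    error('Requested %d files per label, but the smallest label has only %d.', ...
        sum(n), minCount);
end
```

Checking first gives a clearer message than relying on the error raised by the split itself.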

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: [imds1,imds2] = splitEachLabel(imds,0.5,'Exclude','demos')
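
The two ways of writing a name-value argument described above can be compared side by side. This is a minimal sketch that reuses the datastore from the examples on this page; the second call requires R2021a or later.

```matlab
% Build the same datastore used in the examples on this page.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}),...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

% Quoted, comma-separated form (works in all releases).
[a1, a2] = splitEachLabel(imds, 0.5, 'Exclude', 'demos');

% Name=Value form (R2021a and later).
[b1, b2] = splitEachLabel(imds, 0.5, Exclude='demos');

% Both calls are expected to select the same files.
sameFiles = isequal(a1.Files, b1.Files) && isequal(a2.Files, b2.Files)
```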

Labels to include, specified as the comma-separated pair consisting of 'Include' and a vector, cell array, or string array of label names with the same type as the Labels property. Each name must match one of the labels in the Labels property of the datastore.

Data Types: char | cell | string

Labels to exclude, specified as the comma-separated pair consisting of 'Exclude' and a vector, cell array, or string array of label names with the same type as the Labels property. Each name defines a label associated with the datastore and must match the names in Labels. This option cannot be used with the 'Include' option.

Data Types: char | cell | string

Output Arguments

Output datastores, returned as ImageDatastore objects. imds1 contains the specified proportion of files from each label in imds, and imds2 contains the remaining files.

List of output datastores, returned as ImageDatastore objects. The number of elements in the list is one more than the number of listed proportions. Each of the new datastores contains the proportion of each label in imds defined by p1,...,pN. Any files left over are assigned to the Mth datastore.
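
One way to confirm that the outputs partition the input is to compare file counts and look for overlap. This sketch assumes a split without the 'Include' or 'Exclude' options, so that every file in imds is assigned to exactly one output.

```matlab
% Build the same datastore used in the examples on this page.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}),...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

[imds60, imds40] = splitEachLabel(imds, 0.6);

% Every input file should land in exactly one output datastore.
nIn = numel(imds.Files);
nOut = numel(imds60.Files) + numel(imds40.Files);
overlap = intersect(imds60.Files, imds40.Files);

fprintf('Input files: %d, output files: %d, shared files: %d\n', ...
    nIn, nOut, numel(overlap));

% Per-label breakdown of the first output.
countEachLabel(imds60)
```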

Version History

Introduced in R2016a