MATLAB Answers

Set union of Datastores with TransformedDatastores

Asked by Matt J
on 26 Jul 2019
Latest activity Edited by Matt J
on 26 Jul 2019
Given an imageDatastore and some transformation of it, e.g.,
imds1 = imageDatastore({'street1.jpg','peppers.png'});
imds2 = transform(imds1,@(x) imwarp(x,tform));
I would like to form the set union of these data stores in some way so that trainNetwork processes the series of images from both imds1 and imds2 as a single combined set (and similarly with the response data). Is this possible in some way?
I am aware that this functionality is somewhat captured by augmentedImageDatastore, but the operation I describe would open up a variety of data augmentation schemes not currently available.
I am also aware of this thread,
but this does not cover what I am pursuing here, because the images in a TransformedDatastore are not physically stored anywhere (nor would I want them to be).


1 Answer

Answer by Jeremy Hughes on 26 Jul 2019
Edited by Jeremy Hughes on 26 Jul 2019
 Accepted Answer

Horizontal (i.e. associated reads)
-----------
cds = combine(imds,otherds);
Vertical (i.e. joining two sets of files into one datastore)
-----------
imds = imageDatastore({'folder1/*.jpg','folder2/*.png'});
Or leave off the extensions
imds = imageDatastore({'folder1/','folder2/'});
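
To make the horizontal case concrete, here is a minimal sketch using the two image files from the question. The grayscale transform is only a stand-in chosen for illustration. With combine, each read returns data from both datastores side by side, so the result is one observation made of paired images, not a longer sequence of images.

```matlab
% Horizontal combination: each read pairs one item from every underlying datastore.
imds1 = imageDatastore({'street1.jpg','peppers.png'});
imds2 = transform(imds1, @(x) rgb2gray(x));   % stand-in transform, for illustration only

cds = combine(imds1, imds2);

reset(cds);
pair = read(cds);          % data from imds1 and imds2 for the same file
nReads = 1;
while hasdata(cds)
    read(cds);
    nReads = nReads + 1;
end
% nReads equals the number of files, not twice that number
```

The number of reads stays at two here, which is why combine alone does not give the set union asked about in the question.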

  6 Comments

Ahh, you can specify multiple folders in imageDatastore.
Yes, I can see that will work for joining two ordinary imageDatastores, but what about an imageDatastore and a TransformedDatastore, like in my posted example,
imds1 = imageDatastore({'street1.jpg','peppers.png'});
imds2 = transform(imds1,@(x) imwarp(x,tform));
As I understand it, imds2 does not store the transformed images in a physical folder, so you cannot simply list additional folders to capture its images in the training set.
I see.
imds2 = transform(imds1,@(x) {x;imwarp(x,tform)});
This would return both images, but then they'd always be returned together, which isn't really what you want for trainNetwork (I assume).
You could write the images out to disk by calling read on imds2 and passing the results to imwrite, but you'd need to have unique file names. You could use the second output of read to get the file names. This is a bit of coding.
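
A rough sketch of that write-to-disk approach follows. The shear in tform, the output folder, and the file-naming pattern are all assumptions made for illustration. To stay on safe ground, the file names are taken from imds1.Files instead of the second output of read, and the loop assumes the default ReadSize of 1 so that reads arrive in the same order as the file list.

```matlab
% Write the transformed images to disk, then build one imageDatastore over both sets.
imds1 = imageDatastore({'street1.jpg','peppers.png'});
tform = affine2d([1 0 0; 0.3 1 0; 0 0 1]);        % hypothetical shear standing in for tform
imds2 = transform(imds1, @(x) imwarp(x,tform));

outDir = fullfile(tempdir,'warped_images');        % hypothetical output folder
if ~exist(outDir,'dir')
    mkdir(outDir);
end

outFiles = cell(numel(imds1.Files),1);
reset(imds2);
k = 0;
while hasdata(imds2)
    k = k + 1;
    img = read(imds2);                             % one transformed image per read
    [~,name] = fileparts(imds1.Files{k});          % base name of the source file
    % Append the index so that names stay unique even if base names repeat
    outFiles{k} = fullfile(outDir, sprintf('%s_warped_%d.png', name, k));
    imwrite(img, outFiles{k});
end

% Vertical join: original files plus the newly written transformed files
imdsAll = imageDatastore([imds1.Files; outFiles]);
```

The drawback is the one raised in the question: the transformed images end up physically stored on disk.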
The only other way I can think to do what you're looking for would be to create your own datastore with https://www.mathworks.com/help/matlab/import_export/develop-custom-datastore.html
Your datastore would hold onto both the original datastore, and the transform, and just return reads from one until it's done, then the next one until it's done and so on.
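
As a sketch of that idea, the class below holds a list of datastores and reads each one to exhaustion before moving to the next. The class name ChainedDatastore and its property names are hypothetical, and the method layout follows the pattern in the custom datastore documentation linked above. It only shows the chaining logic: shuffling, partitioning, and shaping the output the way trainNetwork expects are not addressed.

```matlab
classdef ChainedDatastore < matlab.io.Datastore
    % Hypothetical sketch: reads several datastores one after another.

    properties (Access = private)
        Stores    % cell array of underlying datastores
        Current   % index of the datastore currently being read
    end

    methods
        function ds = ChainedDatastore(varargin)
            % Copy the inputs so that reading here does not disturb the caller's objects
            ds.Stores = cellfun(@copy, varargin, 'UniformOutput', false);
            reset(ds);
        end

        function tf = hasdata(ds)
            % True if the current or any later datastore still has data
            tf = false;
            for k = ds.Current:numel(ds.Stores)
                if hasdata(ds.Stores{k})
                    tf = true;
                    return
                end
            end
        end

        function [data,info] = read(ds)
            % Move past exhausted datastores, then read from the active one
            while ds.Current < numel(ds.Stores) && ~hasdata(ds.Stores{ds.Current})
                ds.Current = ds.Current + 1;
            end
            [data,info] = read(ds.Stores{ds.Current});
        end

        function reset(ds)
            % Rewind every underlying datastore and start again from the first
            for k = 1:numel(ds.Stores)
                reset(ds.Stores{k});
            end
            ds.Current = 1;
        end
    end

    methods (Hidden = true)
        function frac = progress(ds)
            % Coarse estimate: fraction of underlying datastores already finished
            if hasdata(ds)
                frac = (ds.Current - 1) / numel(ds.Stores);
            else
                frac = 1;
            end
        end
    end

    methods (Access = protected)
        function dsCopy = copyElement(ds)
            % Deep-copy the underlying datastores so that copies read independently
            dsCopy = copyElement@matlab.mixin.Copyable(ds);
            dsCopy.Stores = cellfun(@copy, ds.Stores, 'UniformOutput', false);
        end
    end
end
```

With this class saved as ChainedDatastore.m on the path, something like `cds = ChainedDatastore(imds1, imds2)` would be expected to yield the two original images followed by the two transformed ones, without writing anything to disk.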
That looks like what I want!
Just out of curiosity, though, is it possible to do the customization by inheriting from imageDatastore, rather than matlab.io.Datastore, as the example at your link shows?
