How does "semanticseg" command work on images larger than what the network was trained with?

I am using semantic segmentation to identify particles in images. I have trained a VGG16 network using 200x200 pixel images. I use the "semanticseg" command to analyze 1000x1000 pixel images, and it works. Why is that? Does semanticseg split my 1000x1000 pixel images into 200x200 pixel sections and analyze them one at a time? Does it scale my large image down to 200x200 pixels and then scale the result back up? Would it make a difference if I first cropped my 1000x1000 pixel images into 200x200 pixel sections before using semanticseg?
  3 comments
TStan on 27 Nov 2018
Hi Kushagr, I have followed this MATLAB training procedure, but used my own images and labels to train the network. I am technically using the "SegNet" network (which is VGG-16 altered for semantic segmentation). The network was trained using 200x200 pixel images. I use the "semanticseg" command to segment 1000x1000 pixel images with the trained network, and it works very well! I would like to understand how MATLAB applies my network (trained on 200x200 pixel images) to a large image (1000x1000 pixels).
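For anyone trying to reproduce this, here is a minimal sketch of the workflow described above, assuming the Computer Vision Toolbox; the folder names, class names, label IDs, and file name are placeholders, not from the original post:

% Datastores of 200x200 training images and matching pixel label images
imds = imageDatastore('trainImages');                      % placeholder folder
pxds = pixelLabelDatastore('trainLabels', ...              % placeholder folder
    ["particle","background"], [1 0]);                     % placeholder classes/IDs
lgraph = segnetLayers([200 200 3], 2, 'vgg16');            % SegNet with a VGG-16 encoder
opts = trainingOptions('sgdm', 'MiniBatchSize', 4, 'MaxEpochs', 10);
net = trainNetwork(pixelLabelImageDatastore(imds, pxds), lgraph, opts);

% Inference on an image larger than the training size
I = imread('particles_1000x1000.png');                     % hypothetical file name
C = semanticseg(I, net);                                   % categorical label image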
Salma Hassan on 27 Dec 2018
How did you create the pixel label images for your own images? I need to use the same code, but I don't understand how to create the pixel labels.


Answers (1)

Kushagr Gupta on 28 Nov 2018
The segnetLayers function creates a SegNet network to perform semantic segmentation, and the beauty of this type of network is that it is made up primarily of convolution, ReLU, and batch normalization layers. As a result, you can pass arbitrarily sized inputs during the inference/prediction stage.
This is because a convolution layer contains filters of some fixed size (say 3x3) that can be applied to an input regardless of its spatial size (as mentioned, the network works well on both 200x200 and 1000x1000 images). The size needs to be fixed during the training stage because training happens in mini-batches, and every image in a batch must have the same [H W C] dimensions. During prediction, however, you are free to pass an input smaller or larger than the size used during training.
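To make that size-agnostic behaviour concrete, here is a small illustration using plain conv2 (nothing from the trained network is assumed):

k = rand(3,3);                             % one 3x3 convolution filter
size(conv2(rand(200,200),   k, 'same'))    % ans = 200 200
size(conv2(rand(1000,1000), k, 'same'))    % ans = 1000 1000

The same filter slides over both inputs; only the size of the output response map changes, which is why a fully convolutional network has no preferred input size.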
To answer your other questions: semanticseg neither scales down your image nor splits it into patches; the network processes the full image in one pass. Moreover, it would not make any difference if you manually cropped your images to the training size, segmented the crops, and then stitched the results back together.
Note: you would only do that kind of cropping and stitching if the entire image cannot fit on your GPU, in which case there really isn't any choice left other than feeding the GPU as much data as it can process in patches and then concatenating all the patches at the end.
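If you do hit that GPU memory limit, a minimal tile-and-stitch sketch might look like this; it assumes a trained network net, an image whose sides divide evenly by the tile size, and a hypothetical file name:

I = imread('particles_1000x1000.png');    % hypothetical 1000x1000 image
tile = 200;                               % patch size used during training
n = size(I,1) / tile;                     % tiles per side (1000/200 = 5)
labels = cell(n, n);
for r = 1:n
    for c = 1:n
        rowIdx = (r-1)*tile + (1:tile);
        colIdx = (c-1)*tile + (1:tile);
        labels{r,c} = semanticseg(I(rowIdx, colIdx, :), net);
    end
end
% Stitch the per-tile label images back into one full-size label image
rowStrips = cell(n, 1);
for r = 1:n
    rowStrips{r} = cat(2, labels{r,:});
end
C = cat(1, rowStrips{:});

Here cat is used for the reassembly because it concatenates the categorical label tiles directly.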
Hope this helps.
