MATLAB Answers


Large training set in Semantic Segmentation runs out of memory in trainNetwork

Asked by Lorant Szabo on 18 Nov 2019 at 9:14
Latest activity Commented on by Raunak Gupta on 26 Nov 2019 at 9:35
Dear Community!
Since the question below has not been answered yet, I am giving an update with more details.
I would like to train on a dataset containing 600 images of size 1208x1920 with 50 classes.
I used the following code just changed the classes and the paths:
However, training fails with the following error:
[Screenshot: out-of-memory error from trainNetwork]
Here 1208x1920 is the image size, 50 is the number of classes, and 200 is the number of validation pictures.
With a single validation picture, training starts.
Memory: 64 GB
GPU : Titan X Pascal 12GB
We would like to know what is the best way to overcome this problem.
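Since the error is about memory, a first diagnostic step (a minimal sketch, not part of the original post) is to query how much GPU memory is actually free before training starts:

```matlab
% Check the available memory on the current GPU (e.g. the 12 GB Titan X Pascal)
g = gpuDevice;
fprintf('GPU: %s\n', g.Name);
fprintf('Total memory:     %.1f GB\n', g.TotalMemory / 1e9);
fprintf('Available memory: %.1f GB\n', g.AvailableMemory / 1e9);
```

Running `reset(gpuDevice)` beforehand clears any arrays left on the device by earlier experiments.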



1 Answer

Answer by Raunak Gupta on 22 Nov 2019 at 6:12

As mentioned in the referenced example, you may need to resize the images to a smaller size that fits into GPU memory, or try reducing the MiniBatchSize to a smaller value such as 4 or 2. If even one image doesn't fit into memory, you need to resize the images, choose a smaller network, or increase the GPU memory on the system. Since an imageDatastore is used, the validation images won't all be read into memory at once; only 'MiniBatchSize' images are read at a time.
Since the images here are almost 3.4 times the size used in the example, I recommend first changing the MiniBatchSize to 2, compared to 8 in the example.
Beyond that, the best way to allow a larger MiniBatchSize is to increase the GPU memory on the system.
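The two suggestions above can be sketched in code. This is an illustrative snippet, not the asker's actual script: the datastore variable `imds` and the target size `[604 960]` (half the original 1208x1920) are assumptions, and label images would need the same resize with nearest-neighbor interpolation to keep class IDs intact.

```matlab
% 1) Resize images on read so each sample takes less GPU memory.
%    (Labels need imresize(..., 'nearest') to avoid interpolating class IDs.)
imds.ReadFcn = @(f) imresize(imread(f), [604 960]);

% 2) Lower MiniBatchSize (the example uses 8; start at 2 here).
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 2, ...
    'MaxEpochs', 30, ...
    'InitialLearnRate', 1e-3, ...
    'Shuffle', 'every-epoch', ...
    'Verbose', true);
```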


As you can see, this bug is not related to GPU memory, since without the validation dataset training starts with MiniBatchSize = 10.
The bug is connected to the size of the validation dataset.
It occurs because trainNetwork tries to allocate an array for the results on the validation set.
In the referenced example, the data is divided using the function partitionCamVidData, which creates separate datastores for training, validation, and testing according to the specified split. As long as the data is fed into the network through datastores, the images are not loaded into memory all at once. The result array will also not be large, because it contains only accuracies and losses for each iteration (even with 10,000 iterations its overall size is 10000 x 5, since only 5 parameters are stored per iteration). So unless the validation data is explicitly loaded in the code, the datastores will not load all the data into GPU memory; data is loaded according to the batch size. You may also explicitly set 'ExecutionEnvironment' to 'gpu'.
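A minimal sketch of the lazy-loading setup described above, assuming hypothetical folder names (`images/`, `labels/`) and label definitions (`classNames`, `labelIDs`) in place of the asker's actual paths:

```matlab
% Datastores only hold file references; pixels are read per mini-batch.
classNames = ["road", "sky", "building"];   % placeholder for the 50 classes
labelIDs   = [1 2 3];                       % placeholder label IDs
imds = imageDatastore('images/');
pxds = pixelLabelDatastore('labels/', classNames, labelIDs);

% Combine image and label datastores for trainNetwork / ValidationData.
pximds = pixelLabelImageDatastore(imds, pxds);

% Explicitly target the GPU, as suggested above.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 2, ...
    'ExecutionEnvironment', 'gpu');
```

Passing a datastore like `pximds` as 'ValidationData' keeps validation reads batched the same way as training reads.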
