Mask R-CNN maximum number of detected instances per image.

ANDREA MACI on 2 June 2024
Commented: ANDREA MACI on 27 June 2024 at 9:26
Hi, I've been working on Mask R-CNN following the documentation instructions. I've got everything to work, but I stumbled upon a potential library mistake. Let me explain my situation better: I am working on a dataset of ~250 images (split between training and validation) with just 1 category. Each image may contain anywhere from 40-50 instances up to 600-650 instances.
The problem is this: Mask R-CNN can only detect up to 100 instances per image, a limit set in the class definition. I believe this is hurting the training of the network. However, I cannot confirm this, because I have to run the training remotely from the command prompt, since I don't have a GPU powerful enough to train locally. My evidence is that the trained network performs reasonably well on images with 40-50 instances, while it performs horribly on images with many instances. In fact, when I evaluate my network on the validation set (something I can do on my own computer), the network outputs at most 100 masks per image.
My "local" fix: I edited the maskrcnn.m file of the library. I went to the directory "C:\Program\Files\MATLAB\R2023b\toolbox\vision\vision\@maskrcnn\maskrcnn.m" and at line 172 of the code, instead of
NumStrongestRegionsPrediction = 100
I put (expecting no more than 800 instances per image, given my ground truth data)
NumStrongestRegionsPrediction = 800
which fixes my issue, at least at validation time. However, my training is run without this fix, so given my results I am writing here to ask what I can do about this issue. I am basically certain my code is correct.
Again, all I can observe at training time is the training loss, which converges to a good value. However, it sometimes spikes, probably when it encounters a batch containing images with many instances. In other words, the network isn't learning enough from these images.
I can provide more information if needed, however for now I want to keep the post simple.
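The ceiling that a fixed detection cap places on recall can be sketched with simple arithmetic. This is a minimal, language-neutral illustration in Python, not MATLAB toolbox code: the function name `max_recall` is hypothetical, and it assumes the best case in which every returned detection matches a distinct ground-truth instance.

```python
def max_recall(num_ground_truth: int, detection_cap: int) -> float:
    """Best-case recall when at most `detection_cap` detections are returned.

    Assumes every returned detection matches a distinct ground-truth
    instance, so this is an upper bound, not a measured value.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    return min(detection_cap, num_ground_truth) / num_ground_truth


# Instance counts taken from the ranges mentioned in the post.
for n in (50, 100, 600, 650):
    print(
        f"{n:>3} instances: cap 100 -> {max_recall(n, 100):.2f}, "
        f"cap 800 -> {max_recall(n, 800):.2f}"
    )
```

Under this assumption, an image with 600 instances cannot exceed a recall of 100/600 (about 0.17) with a cap of 100, while images with 40-50 instances are unaffected, which is consistent with the behaviour described above.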
  2 comments
John D'Errico on 2 June 2024
It is totally amazing how often people claim they are certain their code is correct. Surely, nothing you could have done could possibly be wrong? SIGH.
I would point out it is a terribly poor idea to edit supplied code. The rule I have always understood and used is, you change it, you own it. Once you change supplied code, then any problems are now yours.
Anyway, I would suggest you post this as a tech support issue, not Answers. They may have had valid reasons to limit that parameter.
ANDREA MACI on 2 June 2024
Thank you for your answer, John.
As I've said, my code works fine. The training of the neural network works and it gives great results, except on the images with too many instances. I am aware of the dangers of editing supplied code, and I didn't encounter new problems by changing it. If anything, it worked better, given my instance segmentation task. I can post my code here if you want to have a look at it.
I am new to posting here; which section of support should I write to? Moreover, I am a student, and as written in the "Product Usage" section of MATLAB Tech Support: "Technical support from MathWorks is available for activation, installation and bug-related issues", so I don't even know if that's going to work.


Answers (1)

aditi bagora on 26 June 2024 at 9:53
Hi Andrea,
I understand that you are trying to detect objects using Mask R-CNN and that you observed the model performing well on smaller instances compared to larger instances.
The parameter "NumStrongestRegionsPrediction" controls how many of the highest-scoring regions are output. Setting it to 8x its default value will lead to more instances in the output, and it seems to be solving your issue. However, increasing the value can also increase the number of false positive instances.
I would suggest that you not change the code locally and instead use the "threshold" parameter. Changing the threshold gives you control over how many instances are selected based on their prediction score. Please note that setting the threshold to lower values may also lead to false positive detections.
Also, if the model is unable to detect more instances with high prediction scores, it is a clear indication that the model is unable to learn properly from the training data. In that case, you need to analyse your model and training data, and maybe re-train your model.
Hope this helps!
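How a score threshold and a cap on the number of regions interact can be sketched as follows. This is a language-neutral Python illustration, not the toolbox implementation: the function `select_detections`, the order of the two steps (threshold first, then keep the strongest), and the example scores are all assumptions made for illustration.

```python
def select_detections(scores, threshold, max_regions):
    """Keep scores >= threshold, then keep only the `max_regions` strongest.

    Hypothetical helper: the real detector may apply these steps
    differently; this only illustrates the interaction.
    """
    kept = sorted((s for s in scores if s >= threshold), reverse=True)
    return kept[:max_regions]


# Made-up image: 600 confident candidates (0.99) plus 200 weak ones (0.30).
scores = [0.99] * 600 + [0.30] * 200

for threshold in (0.7, 0.5, 0.1):
    n100 = len(select_detections(scores, threshold, max_regions=100))
    n800 = len(select_detections(scores, threshold, max_regions=800))
    print(f"threshold={threshold}: cap 100 -> {n100}, cap 800 -> {n800}")
```

In this sketch the cap of 100 is the binding limit at every threshold, so lowering the threshold cannot return more than 100 detections. With a cap of 800, lowering the threshold far enough also admits the 200 weak candidates, which is where extra false positives would come from.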
  1 comment
ANDREA MACI on 27 June 2024 at 9:26
Thank you for your answer,
It is true that I get more false positives; however, the results I get are still more accurate. I will explain what I think happens at training time, which is the reason the model doesn't perform well. It doesn't have to do with the threshold parameter at inference time: the model is very sure about the instances I obtain (instances easily have a score >0.99).
If the model can only detect 100 instances at most, then it will be penalized heavily when it trains on images with 600 instances, because according to the ground truth data the maskrcnn is doing a bad job. It will therefore keep "changing its mind" about what is and is not an instance, despite all of them being instances. I'm not sure I've explained myself properly. Moreover, I don't have any evidence to support my point.
All I know is that, in the trainMaskRCNN() function, the parameters "NumStrongestRegions" and "NumRegionsToSample" do not seem to change the performance of the model while training. I'm also pretty sure it doesn't inherently have to do with the size of the instances (this is taken into account by the anchor boxes).
The original paper I'm working from is "Materials swelling revealed through automated semantic segmentation of cavities in electron microscopy images". They obtain good results overall by implementing a Mask R-CNN in Python, which I'm not familiar with as far as the neural network implementation goes. I don't know anything beyond that, so you might be correct that a ResNet50 backbone isn't enough for this task.
Thank you again for taking the time. In the meantime, since I first posted, I've simply accepted the results I've obtained.

Version

R2023b