Correct weight initialization in CNN

Andres Ramirez on 29 Jul 2018
When a very deep DAG network is built from scratch, the default weight initialization in MATLAB does not work well: it leads to a vanishing gradient problem that prevents the CNN from learning.
Which function does MATLAB use to initialize CNN weights?
Why do you implement initialization functions in MATLAB such as Xavier or ReLU-aware scaled?
Thank you for your answers.
  2 comments
Greg Heath on 31 Jul 2018
I do not understand
"Why do you implement initialization functions in MATLAB such as Xavier or ReLU-aware scaled?"
Please explain.
Greg
Yuze Zou on 3 Jul 2019
I guess this issue has been solved in the latest release (R2019a) via the new default weight initialization method (i.e., Xavier/Glorot) for `fullyConnectedLayer`. You can find more details here.


Accepted Answer

Maria Duarte Rosa on 5 Jul 2019
Edited: Maria Duarte Rosa on 5 Jul 2019
In R2019a, the following weight initializers are available (including a custom initializer via a function handle):
'glorot' (default) | 'he' | 'orthogonal' | 'narrow-normal' | 'zeros' | 'ones' | function handle
Glorot is also known as the Xavier initializer.
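As a rough illustration of how these options might be passed to layers in R2019a or later, here is a minimal sketch. It assumes Deep Learning Toolbox is installed; the layer sizes, the variable names, and the 0.01 scale in the custom initializer are placeholders chosen for illustration and do not come from this thread.

```matlab
% Sketch for R2019a or later (requires Deep Learning Toolbox).
% All sizes and names below are illustrative placeholders.

% He initialization, commonly paired with ReLU activations
heConv = convolution2dLayer(3, 16, 'Padding', 'same', ...
    'WeightsInitializer', 'he');

% Custom initializer: a function handle that receives the weight size
% and returns an array of that size (0.01 is an arbitrary scale)
customInit = @(sz) 0.01 * randn(sz);
customConv = convolution2dLayer(3, 16, 'Padding', 'same', ...
    'WeightsInitializer', customInit);

% 'glorot' is the default; it is written out here only for clarity
fc = fullyConnectedLayer(10, 'WeightsInitializer', 'glorot');

layers = [
    imageInputLayer([32 32 3])
    heConv
    reluLayer
    customConv
    reluLayer
    fc
    softmaxLayer
    classificationLayer];
```

The weights themselves are only drawn when training starts, so inspecting `heConv.Weights` before calling `trainNetwork` should return an empty array.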
Here is a page comparing 3 initializers when training LSTMs:
I hope this helps,
Maria

More Answers (2)

Andres Ramirez on 31 Jul 2018
Hello Greg, thanks for answering ... I'll explain:
I have built very deep networks such as GoogLeNet, ResNet and VGG-19, and I want to train them from scratch with my databases. However, when I train any of these networks, the network does not learn and only reaches a maximum performance of 12 or 15%. I think the low performance is mainly because the default random weight initialization in MATLAB does not work for very deep networks. According to the literature, random initialization causes a vanishing gradient problem, which prevents the network from learning.
For the above, my questions are:
Why not implement in MATLAB more appropriate weight initialization functions for training deep DAG networks, for example, Xavier or ReLU-aware scaled initialization?
I hope I have explained my problem clearly ...
Thank you.
Greetings.
  1 comment
Greg Heath on 1 Aug 2018
Edited: Greg Heath on 1 Aug 2018
Do you have a reference for
"ReLU-aware scaled"?
I have no idea what this is.
Thanks
Greg



fareed jamaluddin on 4 Aug 2018
I think you can take a look at this example: https://www.mathworks.com/help/images/single-image-super-resolution-using-deep-learning.html
I am also looking for weight initialization options. As you can see in the example, it initializes the weights of every conv layer with the He method.
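For releases before R2019a, one way to get He-style initialization is to set a layer's `Weights` and `Bias` properties by hand before training. The following is a minimal sketch of that idea, not a copy of the linked example; it assumes Deep Learning Toolbox, and the filter size, channel count, and variable names are placeholders.

```matlab
% Sketch for releases before R2019a: set Weights and Bias manually.
% All sizes below are illustrative placeholders.
filterSize  = 3;    % 3-by-3 filters
numChannels = 64;   % number of input channels
numFilters  = 64;   % number of output filters

convLayer = convolution2dLayer(filterSize, numFilters, 'Padding', 1);

% He initialization: zero-mean Gaussian with std = sqrt(2 / fanIn),
% where fanIn = filterSize^2 * numChannels
fanIn = filterSize^2 * numChannels;
convLayer.Weights = sqrt(2 / fanIn) * ...
    randn(filterSize, filterSize, numChannels, numFilters);
convLayer.Bias = zeros(1, 1, numFilters);
```

The same pattern would be repeated for each convolutional layer, with `numChannels` set to the number of filters in the preceding layer.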

Version

R2018a
