Why does layerNormalizationLayer in Deep Learning Toolbox include the T dimension in the per-sample statistics?

Hello,
While implementing a ViT transformer in MATLAB, I found that layerNormalizationLayer includes the T dimension in the statistics calculated for each sample in the batch. This is problematic when implementing a transformer, since tokens correspond to the T dimension and reference implementations calculate the statistics separately for each token.
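To make the difference concrete, here is a minimal sketch (hypothetical sizes, plain MATLAB, no toolbox calls) contrasting per-token statistics with statistics pooled over C and T together, which is what the pre-R2023a layer computes:

```matlab
% Hypothetical sizes; X stands in for one 'CBT' activation (channel x batch x time).
C = 8; B = 2; T = 4;
rng(0);
X = randn(C, B, T);

% Reference ViT behavior: statistics over C only, one value per (batch, token).
muToken = mean(X, 1);            % 1 x B x T

% Pre-R2023a layerNormalizationLayer: statistics pooled over C and T,
% one value per batch element ("batch-excluded" in R2023a terms).
muPooled = mean(X, [1 3]);       % 1 x B

% The two only coincide when every token happens to share the same statistics.
```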
Thx

Accepted Answer

It seems MathWorks has listened and changed the behavior of layerNormalizationLayer in R2023a:
Starting in R2023a, by default, the layer normalizes sequence data over the channel and spatial dimensions. In previous versions, the software normalizes over all dimensions except for the batch dimension (the spatial, time, and channel dimensions). Normalization over the channel and spatial dimensions is usually better suited for this type of data. To reproduce the previous behavior, set OperationDimension to "batch-excluded".
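For reference, a short sketch of how the two behaviors are selected in R2023a and later, per the release note quoted above (layer construction only, no training):

```matlab
% R2023a+: the default normalizes over the channel and spatial dimensions
% only, which for 'CBT' sequence data gives per-token statistics.
lnNew = layerNormalizationLayer;

% To reproduce the pre-R2023a behavior (pool over everything except batch):
lnOld = layerNormalizationLayer(OperationDimension="batch-excluded");
```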

More Answers (1)

Perhaps you can fold your T dimension into the C dimension and use a groupNormalizationLayer instead, with the groups defined so that different T belong to different groups.

7 comments

I could, but this is a hack which is quite painful, as the whole model ends up changing formats back and forth (there are twelve blocks in the model that invoke layer normalization).
I don't see why that has to make it painful. Why couldn't you adopt a modular structure in your code like below? You could also make a reusable custom layer of your own, as we've discussed in earlier threads.
numTimes = 2000;
GN = groupNormalizerTimeIndep(numTimes);
layers = [layer1, layer2, GN, layer3, layer4, GN, layer5, ... ]
net = trainNetwork(sequences, layers);

function normalizerLayers = groupNormalizerTimeIndep(numTimes)
    pre = functionLayer(@reshapeForw);
    nlayer = groupNormalizationLayer(numTimes);
    post = functionLayer(@(z) reshapeBack(z, numTimes));
    normalizerLayers = [pre, nlayer, post];
end

function Xr = reshapeForw(X)
    [H, W, C, T, B] = size(X);
    Xr = reshape(X, H, W, C*T, B);
end

function X = reshapeBack(Xr, T)
    [H, W, ~, B] = size(Xr);
    X = reshape(Xr, H, W, [], T, B);
end
Thx.
As I wrote, it's doable, but a PITA. For example, what if the input is CBT? SCBT? CBTU? SCBTU? All these could be handled using finddim, but as I wrote it's a PITA.
In addition, the number of layers grows by 2 for every normalization layer. For a 12 level transformer this adds a whopping 24 layers. The performance hit is not insignificant.
PS There's a small bug in the above code: it should be
[H,W,C,B,T]=size(X);
since the canonical order is SSCBT, so the input first needs to be permuted appropriately and inverse-permuted after the reshape back.
Also, you need the order along the folded C dimension to correspond to TC and not CT; otherwise the groups would not each cover a single T. So, more permutes.
As I wrote, it's doable, but a PITA.
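Putting those two fixes together, a corrected version of the helpers might look like this (a sketch assuming 'SSCBT'-ordered input, i.e. H x W x C x B x T; the permute makes C vary fastest in the folded dimension, so each group of C channels covers exactly one time step):

```matlab
function Xr = reshapeForw(X)
    % X: H x W x C x B x T (canonical 'SSCBT' order)
    [H, W, C, B, T] = size(X);
    Xp = permute(X, [1 2 3 5 4]);       % -> H x W x C x T x B
    Xr = reshape(Xp, H, W, C*T, B);     % C varies fastest: block t holds time t
end

function X = reshapeBack(Xr, T)
    % Xr: H x W x (C*T) x B; undo the fold, then the permute.
    [H, W, ~, B] = size(Xr);
    Xp = reshape(Xr, H, W, [], T, B);   % -> H x W x C x T x B
    X = permute(Xp, [1 2 3 5 4]);       % -> H x W x C x B x T
end
```

With this ordering, groupNormalizationLayer(numTimes) partitions the C*T channels into T contiguous groups of C, one per time step, which is exactly the per-token normalization a ViT expects.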
Well, I don't think lamenting that will get you anywhere. If you think there is an alternative solution, you can wait for other posts, but if neither of us has found one, I doubt it's coming.
> In addition, the number of layers grows by 2 for every normalization layer. For a 12 level transformer this adds a whopping 24 layers. The performance hit is not insignificant.
I don't see why it would be. The functionLayers don't have any learnable parameters.
> Well, I don't think lamenting that will get you anywhere.
That said, I do agree it would be useful to have a more configurable normalization layer type, where you could explicitly specify which dimensions are to be included in the normalization.
Perhaps lamenting would cause someone from MathWorks to take notice and add the capability to the code base. Sigh ...
That happens sometimes, but usually you have to submit a formal enhancement request.
Version

R2022b