crosschannelnorm

Cross channel square-normalize using local responses

Description

The cross-channel normalization operation uses local responses in different channels to normalize each activation. Cross-channel normalization typically follows a relu operation. Cross-channel normalization is also known as local response normalization.

Note

This function applies the cross-channel normalization operation to dlarray data. If you want to apply cross-channel normalization within a layerGraph object or Layer array, use crossChannelNormalizationLayer.

dlY = crosschannelnorm(dlX,windowSize) normalizes each element of dlX with respect to local values in the same position in nearby channels. The normalized elements in dlY are calculated from the elements in dlX using the following formula.

$y=\frac{x}{{\left(K+\frac{\alpha *ss}{windowSize}\right)}^{\beta }}$

where y is an element of dlY, x is the corresponding element of dlX, ss is the sum of the squares of the elements in the channel region defined by windowSize, and α, β, and K are hyperparameters in the normalization.
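The formula can be traced by hand for a single spatial position. The following is a minimal sketch in plain MATLAB, not the toolbox implementation: the sample activations and variable names are assumptions chosen for illustration, the window extents follow the description of windowSize under Input Arguments, and the hyperparameter values are the documented defaults.

```matlab
% Illustrative sketch only: apply the formula to one spatial position.
% x and the variable names are assumptions, not part of crosschannelnorm.
x = [1 2 3 4 5 6];      % activations at one position across six channels
windowSize = 3;
alpha = 1e-4;           % documented default for 'Alpha'
beta  = 0.75;           % documented default for 'Beta'
K     = 2;              % documented default for 'K'

numChannels = numel(x);
y = zeros(size(x));
for c = 1:numChannels
    % Channel window around channel c, clipped at the first and last channel.
    first = max(1, c - floor((windowSize-1)/2));
    last  = min(numChannels, c + floor(windowSize/2));

    % ss is the sum of squares over the window, including x(c) itself.
    ss = sum(x(first:last).^2);

    % y = x / (K + alpha*ss/windowSize)^beta
    y(c) = x(c) / (K + alpha*ss/windowSize)^beta;
end

disp(y)
```

With the default hyperparameters the denominator is greater than 1, so each normalized value is smaller in magnitude than its input.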

dlY = crosschannelnorm(dlX,windowSize,'DataFormat',FMT) also specifies the dimension format FMT when dlX is an unformatted dlarray, in addition to the input arguments in the previous syntax. The output dlY is an unformatted dlarray with the same dimension order as dlX.

dlY = crosschannelnorm(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments in previous syntaxes. For example, 'Beta',0.8 sets the value of the β contrast constant to 0.8.

Examples

Use crosschannelnorm to normalize each observation of a mini-batch using values from adjacent channels.

Create the input data as ten observations of random values, each with a height and width of eight and with six channels.

height = 8;
width = 8;
channels = 6;
observations = 10;

X = rand(height,width,channels,observations);
dlX = dlarray(X,'SSCB');

Compute the cross-channel normalization using a channel window size of three.

dlY = crosschannelnorm(dlX,3);

Each value in each observation of dlX is normalized using its own value together with the elements at the same position in the previous channel and the next channel.

Values at the edges of an array are normalized using contributions from fewer channels, depending on the size of the channel window.

Create the input data as an array of ones with a height and width of two and with three channels.

height = 2;
width = 2;
channels = 3;

X = ones(height,width,channels);
dlX = dlarray(X);

Normalize the data using a channel-window size of 3, an $\alpha$ of 1, a $\beta$ of 1, and a $\mathit{K}$ of 1e-5. Specify a data format of 'SSC'.

dlY = crosschannelnorm(dlX,3,'Alpha',1,'Beta',1,'K',1e-5,'DataFormat','SSC');

Compare the values in the original and the normalized data by reshaping the three-channel arrays into 2-D matrices.

dlX = reshape(dlX,2,6)
dlX =
2x6 dlarray

1     1     1     1     1     1
1     1     1     1     1     1

dlY = reshape(dlY,2,6)
dlY =
2x6 dlarray

1.5000    1.5000    1.0000    1.0000    1.5000    1.5000
1.5000    1.5000    1.0000    1.0000    1.5000    1.5000

For the first and last channels, the sum of squares is calculated using only two values. For the middle channel, the sum of squares contains the values of all three channels.
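The displayed values can be reproduced by hand from the formula in the Description section. This short check uses only plain MATLAB; the variable names are illustrative assumptions.

```matlab
% Hand check of the edge and middle channel values shown above.
windowSize = 3;
alpha = 1;
beta  = 1;
K     = 1e-5;

ssEdge   = 2;   % first and last channel: two ones fall inside the window
ssMiddle = 3;   % middle channel: all three ones fall inside the window

yEdge   = 1 / (K + alpha*ssEdge/windowSize)^beta;
yMiddle = 1 / (K + alpha*ssMiddle/windowSize)^beta;

fprintf('%.4f %.4f\n', yEdge, yMiddle);   % prints 1.5000 1.0000
```

The edge values come out larger than the middle value because the divisor stays at windowSize even when fewer channels contribute to the sum of squares.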

Typically, the cross-channel normalization operation follows a ReLU operation. For example, the GoogLeNet architecture contains convolutional operations followed by ReLU and cross-channel normalization operations.

The function modelFunction defined at the end of this example shows how you can use cross-channel normalization in a model. Use modelFunction to compute the grouped convolution and ReLU activation of some input data, and then normalize the result using cross-channel normalization with a window size of 5.

Create the input data as a single observation of random values with a height and width of ten and with four channels.

height = 10;
width = 10;
channels = 4;
observations = 1;

X = rand(height,width,channels,observations);
dlX = dlarray(X,'SSCB');

Create the parameters for the grouped convolution operation. For the weights, use a filter height and width of three, two channels per group, three filters per group, and two groups. Use a value of zero for the bias.

filterSize = [3 3];
numChannelsPerGroup = 2;
numFiltersPerGroup = 3;
numGroups = 2;

params = struct;
params.conv.weights = rand(filterSize(1),filterSize(2),numChannelsPerGroup,numFiltersPerGroup,numGroups);
params.conv.bias = 0;
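The array sizes implied by these parameters can be worked out without running the convolution. The sketch below is plain arithmetic and rests on one assumption: dlconv is called with its default stride of 1 and no padding, which modelFunction in this example does not override. The variable names inputSize, numInputChannels, numOutputChannels, and outputSize are introduced here for illustration only.

```matlab
% Illustrative size arithmetic for the grouped convolution in this example.
% Assumes the default stride of 1 and no padding.
inputSize           = [10 10];   % height and width of the input data
filterSize          = [3 3];
numChannelsPerGroup = 2;
numFiltersPerGroup  = 3;
numGroups           = 2;

% The input must supply numChannelsPerGroup channels to every group.
numInputChannels  = numChannelsPerGroup * numGroups;   % 4, matches the data

% Every group contributes numFiltersPerGroup output channels.
numOutputChannels = numFiltersPerGroup * numGroups;    % 6

% Unpadded, unit-stride convolution shrinks each spatial dimension.
outputSize = inputSize - filterSize + 1;               % [8 8]

fprintf('%d input channels, %d output channels, %dx%d output\n', ...
    numInputChannels, numOutputChannels, outputSize(1), outputSize(2));
```

Under these assumptions, the normalization in modelFunction runs over six channels, so a channel window of 5 is clipped for the channels nearest the first and last channel.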

Apply the modelFunction to the data dlX.

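dlY = modelFunction(dlX,params);

function dlY = modelFunction(dlX,params)
% Apply grouped convolution, ReLU, and cross-channel normalization in sequence.

% Grouped convolution using the weights and bias stored in params.
dlY = dlconv(dlX,params.conv.weights,params.conv.bias);
% ReLU activation.
dlY = relu(dlY);
% Cross-channel normalization with a channel window size of 5.
dlY = crosschannelnorm(dlY,5);

end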

Input Arguments

Input data, specified as a dlarray with or without data format. When dlX is an unformatted dlarray, you must specify the data format using the 'DataFormat',FMT name-value pair.

You can specify up to two dimensions in dlX as 'S' dimensions.

Data Types: single | double

Size of the channel window, which controls the number of channels that are used for the normalization of each element, specified as a positive integer.

If windowSize is even, then the window is asymmetric. The software looks at the previous floor((windowSize-1)/2) channels and the following floor((windowSize)/2) channels. For example, if windowSize is 4, then the function normalizes each element by its neighbor in the previous channel and by its neighbors in the next two channels.

Example: 3

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32
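The window extents described for windowSize can be tabulated directly from the two floor expressions. This is a small illustrative sketch in plain MATLAB; the names numBefore and numAfter are assumptions introduced here.

```matlab
% Number of previous and following channels in the window for a few sizes.
for windowSize = 3:5
    numBefore = floor((windowSize-1)/2);   % channels before the current one
    numAfter  = floor(windowSize/2);       % channels after the current one
    fprintf('windowSize = %d: %d previous, %d following\n', ...
        windowSize, numBefore, numAfter);
end
% prints:
% windowSize = 3: 1 previous, 1 following
% windowSize = 4: 1 previous, 2 following
% windowSize = 5: 2 previous, 2 following
```

In every case numBefore + numAfter + 1 equals windowSize, so the current channel always counts toward the window.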

Name-Value Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'Alpha',2e-4,'Beta',0.8 sets the multiplicative normalization constant to 0.0002 and the contrast constant exponent to 0.8.

Dimension order of unformatted input data, specified as a character vector or string scalar FMT that provides a label for each dimension of the data.

When you specify the format of a dlarray object, each character provides a label for each dimension of the data and must be one of the following:

• "S" — Spatial

• "C" — Channel

• "B" — Batch (for example, samples and observations)

• "T" — Time (for example, time steps of sequences)

• "U" — Unspecified

You can specify multiple dimensions labeled "S" or "U". You can use the labels "C", "B", and "T" at most once.

You must specify DataFormat when the input data is not a formatted dlarray.

Data Types: char | string
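The labeling rules above, together with the limit of two 'S' dimensions stated for dlX, can be expressed as a quick pre-check on a format string. This is a hypothetical helper sketch in plain MATLAB, not the validation that crosschannelnorm itself performs, and the variable names are assumptions.

```matlab
% Hypothetical pre-check of a data format string (not the toolbox validation).
fmt = 'SSCB';

% Every character must be one of the documented labels.
labelsKnown = all(ismember(fmt, 'SCBTU'));

% "C", "B", and "T" can each be used at most once.
singleUseOnly = sum(fmt == 'C') <= 1 && ...
                sum(fmt == 'B') <= 1 && ...
                sum(fmt == 'T') <= 1;

% For this function, dlX can have up to two "S" dimensions.
spatialOk = sum(fmt == 'S') <= 2;

isValid = labelsKnown && singleUseOnly && spatialOk;
fprintf('Format %s passes the pre-check: %d\n', fmt, isValid);
```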

Normalization constant (α) that multiplies the sum of the squared values, specified as the comma-separated pair consisting of 'Alpha' and a numeric scalar. The default value is 1e-4.

Example: 'Alpha',2e-4

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32

Contrast constant (β), specified as the comma-separated pair consisting of 'Beta' and a numeric scalar greater than or equal to 0.01. The default value is 0.75.

Example: 'Beta',0.8

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32

Normalization hyperparameter (K) used to avoid singularities in the normalization, specified as the comma-separated pair consisting of 'K' and a numeric scalar greater than or equal to 1e-5. The default value is 2.

Example: 'K',2.5

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32

Output Arguments

Normalized data, returned as a dlarray. The output dlY has the same underlying data type as the input dlX.

If the input data dlX is a formatted dlarray, dlY has the same dimension labels as dlX. If the input data is an unformatted dlarray, dlY is an unformatted dlarray with the same dimension order as the input data.