Custom deep learning network - gradient function using dlfeval

Question

Iris Soa on 15 Jul 2020

Answered: Iris Soa on 27 Jul 2020

I want to create a custom deep learning training function, the output of which is an array Y. I have two inputs, the arrays X1 and X2. I want to find the gradient of Y with respect to X1 and X2.

This is my network:

layers1 = [
    sequenceInputLayer(sizeInput,"Name","XTrain1")
    fullyConnectedLayer(numHiddenDimension,"Name","fc_1")
    softplusLayer('Name','s_1')];
layers2 = [
    sequenceInputLayer(sizeInput,"Name","XTrain2")
    fullyConnectedLayer(numHiddenDimension,"Name","fc_2")
    softplusLayer('Name','s_2')];
lgraph = layerGraph(layers1); 
lgraph = addLayers(lgraph,layers2); % connect layers -> 2 in, 1 out
add = additionLayer(2,'Name','add');
lgraph = addLayers(lgraph,add); 
lgraph = connectLayers(lgraph,'s_1','add/in1');
lgraph = connectLayers(lgraph,'s_2','add/in2');
fc = fullyConnectedLayer(sizeInput,"Name","fc_3");
lgraph = addLayers(lgraph,fc);
lgraph = connectLayers(lgraph,'add','fc_3');
dlnet = dlnetwork(lgraph);

The final layer of this network should produce my output. Then, on every iteration, I do:

dlX1 = dlarray(X1,'CTB'); 
dlX2 = dlarray(X2,'CTB');% to differentiate: dlarray/dlgradient
for i = 1:sizeInput
    [gradx1(i), gradx2(i), dlY] = dlfeval(@modelGradientsX,dlnet,dlX1(i),dlX2(i)); % here is where I get my error
end

and I call my function modelGradientsX, which is supposed to get the derivative of my output with respect to my inputs.

function [gradx1, gradx2, dlY] = modelGradientsX(dlnet,dlX1,dlX2)
    dlY = forward(dlnet,dlX1,dlX2); 
    [gradx1, gradx2] = dlgradient(dlY,dlX1,dlX2);
end

And the error I get is: "Input data must be formatted dlarray objects". I have seen similar approaches in other examples (like this one: https://www.mathworks.com/matlabcentral/fileexchange/74760-image-classification-using-cnn-with-multi-input-cnn), so I don't understand: why are my inputs not the correct type of data?

Answer 1

Raunak Gupta on 18 Jul 2020

Hi,

From the code, I only see a syntax error on the following line:

[gradx1(i), gradx2(i)] = dlfeval(@modelGradientsX,dlnet,dlX1(i),dlX2(i));

Here modelGradientsX returns three outputs, but you assign only gradx1 and gradx2 when calling it. This may be one issue. Other than that, I think the loss should also be returned from the modelGradientsX function so that the weights can be updated for the next iteration.

If the error still persists, check that dlX1(i) and dlX2(i) are indeed dlarray objects, because dlgradient only accepts dlarray objects.
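One way to run that check is to print the class and the dimension labels of both the full array and an indexed element. This is only a diagnostic sketch: the array sizes are made up for illustration, and the output is deliberately not stated here because it should be read off your own MATLAB release.

```matlab
% Hypothetical stand-in data (not from the question):
% 3 channels, 4 time steps, batch of 2.
X1 = rand(3,4,2,'single');
dlX1 = dlarray(X1,'CTB');

% The kind of indexed value that the question passes to dlfeval.
x = dlX1(1);

% class() shows whether the value is still a dlarray; dims() returns the
% dimension labels, and is empty for an unformatted dlarray.
fprintf('dlX1: class %s, format "%s"\n', class(dlX1), dims(dlX1));
fprintf('x:    class %s, format "%s"\n', class(x), dims(x));
```

If the second line shows an empty format, the indexed value has lost its labels, which would match the "formatted dlarray" wording of the error message.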

2 comments

Iris Soa on 19 Jul 2020

Edited: Iris Soa on 26 Jul 2020

Sir,

Thank you very much for your answer. I will reply to each of the ideas in turn:

On the line that you have emphasised

[gradx1(i), gradx2(i)] = dlfeval(@modelGradientsX,dlnet,dlX1(i),dlX2(i));

unfortunately I had posted the wrong code. In fact I am requesting all three outputs. I have updated the question to reflect this.

I see now that I should get a loss returned from my function, thank you very much for this. I think this is my problem. Thank you.
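A minimal sketch of what "returning a loss" can look like when the gradients are wanted with respect to the inputs. dlgradient differentiates a scalar, so the array output is reduced to one number first. Everything here is an assumption for illustration: the weights W1 and W2, the tanh stand-in for the network, the name inputGradients, and the use of a plain sum as the scalar are not taken from the original code.

```matlab
% Hypothetical stand-in weights and inputs (not from the question).
rng(0);
W1 = dlarray(rand(4,3));
W2 = dlarray(rand(4,3));
dlX1 = dlarray(rand(3,5));
dlX2 = dlarray(rand(3,5));

% dlfeval traces the dlarray inputs so dlgradient can differentiate
% with respect to them.
[gradx1, gradx2, dlY] = dlfeval(@inputGradients, W1, W2, dlX1, dlX2);
disp(size(gradx1))   % each gradient has the same size as its input

function [gradx1, gradx2, dlY] = inputGradients(W1, W2, dlX1, dlX2)
    % Stand-in for forward(dlnet,dlX1,dlX2): a two-input function
    % with an array output.
    dlY = tanh(W1*dlX1 + W2*dlX2);
    % Reduce the array to a scalar before differentiating.
    % The sum is an assumed choice; a real loss would go here.
    s = sum(dlY, 'all');
    [gradx1, gradx2] = dlgradient(s, dlX1, dlX2);
end
```

With a real dlnetwork, the tanh line would be replaced by the forward call, and the scalar would usually be a loss against target values rather than a plain sum.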

Iris Soa on 23 Jul 2020

Here is the code that I am using to compare with my own, and it works for some reason...

iteration = 0;
start = tic;
% Loop over epochs.
for epoch = 1:numEpochs
    % Shuffle data.
    idx = randperm(numel(YTrain));
    XTrain1 = XTrain1(:,:,:,idx);
    XTrain2 = XTrain2(:,:,:,idx);
    YTrain = YTrain(idx);
    
    % Loop over mini-batches.
    for i = 1:numIterationsPerEpoch
        iteration = iteration + 1;
        
        % Read mini-batch of data and convert the labels to dummy
        % variables.
        idx = (i-1)*miniBatchSize+1:i*miniBatchSize;
        X1 = XTrain1(:,:,:,idx);
        X2 = XTrain2(:,:,:,idx);
        % convert the label into one-hot vector to calculate the loss
        Y = zeros(numClasses, miniBatchSize, 'single');
        for c = 1:numClasses
            Y(c,YTrain(idx)==classes(c)) = 1;
        end
        
        % Convert mini-batch of data to dlarray.
        dlX1 = dlarray(single(X1),'SSCB');
        dlX2 = dlarray(single(X2),'SSCB');
        
        % If training on a GPU, then convert data to gpuArray.
        if (executionEnvironment == "auto" && canUseGPU) || executionEnvironment == "gpu"
            dlX1 = gpuArray(dlX1);
            dlX2 = gpuArray(dlX2);
        end
        % The training loss and the gradients after backpropagation are
        % calculated using the helper function modelGradients_demo.
        
        % --------------- below: call to my dlfeval function, working -----------------
        [gradients1,gradients2,gradients3,loss] = dlfeval(@modelGradients_demo,dlnet1,dlnet2,dlnet3,dlX1,dlX2,dlarray(Y));
        % -----------------------------------------------------------------------------
        learnRate = initialLearnRate/(1 + decay*iteration);
        % Update the network parameters using the SGDM optimizer.
        % Update the parameters in dlnet1 to 3 sequentially 
        [dlnet3.Learnables, velocity3] = sgdmupdate(dlnet3.Learnables, gradients3, velocity3, learnRate, momentum);
        [dlnet2.Learnables, velocity2] = sgdmupdate(dlnet2.Learnables, gradients2, velocity2, learnRate, momentum);
        [dlnet1.Learnables, velocity1] = sgdmupdate(dlnet1.Learnables, gradients1, velocity1, learnRate, momentum);
        % Display the training progress.
        D = duration(0,0,toc(start),'Format','hh:mm:ss');
        addpoints(lineLossTrain,iteration,double(gather(extractdata(loss))))
        title("Epoch: " + epoch + ", Elapsed: " + string(D))
        drawnow
    end
end
function dlnet = createLayer(XTrain,numHiddenDimension)
    layers = [
        imageInputLayer([14 28 1],"Name","imageinput","Mean",mean(XTrain,4))
        convolution2dLayer([3 3],8,"Name","conv_1","Padding","same")
        batchNormalizationLayer("Name","batchnorm_1")
        reluLayer("Name","relu_1")
        maxPooling2dLayer([2 2],"Name","maxpool_1","Stride",[2 2])
        convolution2dLayer([3 3],16,"Name","conv_2","Padding","same")
        batchNormalizationLayer("Name","batchnorm_2")
        reluLayer("Name","relu_2")
        maxPooling2dLayer([2 2],"Name","maxpool_2","Stride",[2 2])
        convolution2dLayer([3 3],32,"Name","conv_3","Padding","same")
        batchNormalizationLayer("Name","batchnorm_3")
        reluLayer("Name","relu_3")
        fullyConnectedLayer(numHiddenDimension,"Name","fc")];
    lgraph = layerGraph(layers);
    dlnet = dlnetwork(lgraph);
end
function dlnet=createLayerFullyConnect(numHiddenDimension)
    layers = [
        imageInputLayer([1 numHiddenDimension*2 1],"Name","imageinput","Normalization","none")
        fullyConnectedLayer(20,"Name","fc_1")
        fullyConnectedLayer(10,"Name","fc_2")];
    lgraph = layerGraph(layers);
    dlnet = dlnetwork(lgraph);
end
% ----------------- below - the function called by dlfeval, working --------------------
function [gradients1,gradients2,gradients3, loss] = modelGradients_demo(dlnet1,dlnet2,dlnet3,dlX1,dlX2,Y)
    dlYPred1 = forward(dlnet1,dlX1);
    dlYPred2 = forward(dlnet2,dlX2);
    dlX_concat=[dlYPred1;dlYPred2];
    dlX_concat=reshape(dlX_concat,[1 40, 1, 128]); % 128 is the mini-batch size
    dlX_concat=dlarray(single(dlX_concat),'SSCB');
    dlY_concat=forward(dlnet3,dlX_concat);
    dlYPred_concat = softmax(dlY_concat);
    loss = crossentropy(dlYPred_concat,Y);
    [gradients1,gradients2,gradients3] = dlgradient(loss,dlnet1.Learnables,dlnet2.Learnables,dlnet3.Learnables);
end

Answer 2

Iris Soa on 27 Jul 2020

Update on this issue, see here: https://uk.mathworks.com/help/deeplearning/ug/include-automatic-differentiation.html

Derivative Trace

To evaluate a gradient numerically, a dlarray constructs a data structure for reverse mode differentiation, as described in Automatic Differentiation Background. This data structure is the trace of the derivative computation. Keep in mind these guidelines when using automatic differentiation and the derivative trace:

Do not introduce a new dlarray inside of an objective function calculation and attempt to differentiate with respect to that object. For example:

function [dy,dy1] = fun(x1)
x2 = dlarray(0);
y = x1 + x2;
dy = dlgradient(y,x2); % Error: x2 is untraced
dy1 = dlgradient(y,x1); % No error even though y has an untraced portion
end
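For contrast, a small sketch of the traced version of the same sum: both variables are created outside the function and passed in through dlfeval, so both are part of the trace. The function name tracedSum and the input values are made up for illustration.

```matlab
% Create both dlarrays outside the function and pass them in via dlfeval.
x1 = dlarray(2);
x2 = dlarray(3);
[dy1, dy2] = dlfeval(@tracedSum, x1, x2);

% y = x1 + x2, so each partial derivative is 1.
disp(extractdata(dy1))
disp(extractdata(dy2))

function [dy1, dy2] = tracedSum(x1, x2)
    y = x1 + x2;                          % scalar built from traced inputs
    [dy1, dy2] = dlgradient(y, x1, x2);   % both inputs are traced: no error
end
```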
