How to implement a Siamese network whose two subnetworks do not share weights

Question

Cloud Wind on 22 Aug 2022


https://fr.mathworks.com/matlabcentral/answers/1783330-how-to-implement-siamese-network-with-the-two-subnetworks-not-share-weights

Commented: Joss Knight on 10 Sep 2022

I am implementing a Siamese network using the MATLAB Deep Learning Toolbox. It is easy to implement such a network when the two subnetworks share weights, following the official demo. Now I want to implement a Siamese network whose two subnetworks do not share weights. Is there an easy solution? I know we can create two "dlnetwork" objects, one for input image A and the other for input image B. The problem is that both subnetworks then have to be loaded into GPU memory, which is not possible when there is not enough memory.

Any good solution is welcome, thank you!
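To put a rough number on the memory concern above, one could count the learnable parameters of a subnetwork. The following is only a sketch: it uses a small stand-in network rather than AlexNet, the variable names are illustrative, and it assumes single-precision weights (4 bytes each). It counts weights only, so activations, gradients and optimizer state come on top of this figure.

```matlab
% Small stand-in subnetwork (hypothetical; replace with the real dlnetwork).
layers = [
    imageInputLayer([28 28 1],'Normalization','none')
    convolution2dLayer(3,8,'Padding','same')
    reluLayer
    fullyConnectedLayer(16)];
subnet = dlnetwork(layerGraph(layers));

% Count all learnable parameters listed in the Learnables table.
numParams = sum(cellfun(@numel,subnet.Learnables.Value));

% Bytes for the single-precision weights of one subnetwork, and of two
% independent (non-shared) subnetworks.
bytesPerSubnet = 4*numParams;
bytesTwoSubnets = 2*bytesPerSubnet;
fprintf('Parameters per subnetwork: %d\n',numParams);
fprintf('Weights for two subnetworks: %.3f MB\n',bytesTwoSubnets/2^20);
```

Running the same count on the real subnetwork gives a lower bound on what two unshared copies need on the GPU at the same time.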


Answer 1

Joss Knight on 1 Sep 2022


https://fr.mathworks.com/matlabcentral/answers/1783330-how-to-implement-siamese-network-with-the-two-subnetworks-not-share-weights#answer_1039965

You can try gathering the weights back from each network after you've used it, as in net = dlupdate(@gather,net). This should save some memory.
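As a minimal sketch of what this looks like in isolation: dlupdate applies a function to every learnable parameter, so passing @gpuArray or @gather moves all of them to the GPU or back to host memory. The parameter struct below is a hypothetical stand-in for a network's learnables, and the GPU step is skipped when no supported GPU is available.

```matlab
% Hypothetical parameter struct standing in for a network's learnables.
params = struct( ...
    "Weights",dlarray(0.01*randn(4,3,'single')), ...
    "Bias",dlarray(zeros(4,1,'single')));

if canUseGPU
    % Move every parameter to the GPU before using it there.
    params = dlupdate(@gpuArray,params);
    onGpu = isa(extractdata(params.Weights),'gpuArray');
    fprintf('On GPU after gpuArray: %d\n',onGpu);
end

% Gather every parameter back to host memory. For data that is already on
% the host, gather leaves it unchanged.
params = dlupdate(@gather,params);
onHost = ~isa(extractdata(params.Weights),'gpuArray');
fprintf('On host after gather: %d\n',onHost);
```

The same two calls accept a dlnetwork in place of the struct, as in net = dlupdate(@gather,net).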


Cloud Wind on 2 Sep 2022


Hi Joss, I have posted the example code below. It follows the official Siamese demo, in which the two subnetworks share weights. What I want is to implement a Siamese network (without shared weights) that does not need much GPU memory.

clc; clear;
downloadFolder = 'F:\exps\siamese_scd\data';
dataFolderTrain = fullfile(downloadFolder,'train');
dataFolderTest = fullfile(downloadFolder,'test');
%***************************************************
% Build the subnetwork from pretrained AlexNet: keep layers 1-18 and 20-21,
% i.e. skip layers 19 and 22 (the dropout layers drop6 and drop7).
net = alexnet;
layers = net.Layers(1:22);
layers = [layers(1:18,:); layers(20:21,:)];
lgraph = layerGraph(layers);
dlnet = dlnetwork(lgraph);
% Parameters for a final fullyconnect operation
fcWeights = dlarray(0.01*randn(1,4352));
fcBias = dlarray(0.01*randn(1,1));
fcParams = struct(...
    "FcWeights",fcWeights,...
    "FcBias",fcBias);
clear net layers
%***************************************************
imdsTrain = imageDatastore(dataFolderTrain, ...
    'IncludeSubfolders',true, ...
    'LabelSource','none');
files = imdsTrain.Files;
parts = split(files,filesep);
labels = join(parts(:,(end-2):(end-1)),'_');
imdsTrain.Labels = categorical(labels);
imdsTest = imageDatastore(dataFolderTest, ...
    'IncludeSubfolders',true, ...
    'LabelSource','none');
files = imdsTest.Files;
parts = split(files,filesep);
labels = join(parts(:,(end-2):(end-1)),'_');
imdsTest.Labels = categorical(labels);
%***************************************************
numIterations = 1000;
train_miniBatchSize = 8;
test_minBatchSize = 4;
learningRate = 2e-5;
trailingAvgSubnet = [];
trailingAvgSqSubnet = [];
trailingAvgParams = [];
trailingAvgSqParams = [];
gradDecay = 0.9;
gradDecaySq = 0.99;
executionEnvironment = "gpu";
plots = "training-progress";
if plots == "training-progress"
    figure
    subplot(2,1,1)
    
    trainingPlotAxes = gca;
%    trainingPlotAxes.YLim = [0 1];
    lineLossTrain = animatedline(trainingPlotAxes);
    xlabel(trainingPlotAxes,"Iteration")
    ylabel(trainingPlotAxes,"Loss")
    title(trainingPlotAxes,"Loss During Training")
    subplot(2,1,2)
    
    testingPlotAxes = gca;
%    testingPlotAxes.YLim = [0 1];
    lineLosstest = animatedline(testingPlotAxes);
    xlabel(testingPlotAxes,"Iteration")
    ylabel(testingPlotAxes,"Loss")
    title(testingPlotAxes,"Loss During Testing")
end
%***************************************************
for iteration = 1:numIterations
       
    [X1,X2,pairLabels] = getAlexnetBatch(imdsTrain,train_miniBatchSize);
    [tX1,tX2,tpairLabels] = getAlexnetTest(imdsTest,test_minBatchSize);
 
    dlX1 = dlarray(single(X1),'SSCB');
    dlX2 = dlarray(single(X2),'SSCB');
    
    tdlX1 = dlarray(single(tX1),'SSCB');
    tdlX2 = dlarray(single(tX2),'SSCB');
	
   
    if executionEnvironment == "gpu"
        dlX1 = gpuArray(dlX1);
        dlX2 = gpuArray(dlX2);
        tdlX1 = gpuArray(tdlX1);
        tdlX2 = gpuArray(tdlX2);
    end
    
    [gradientsSubnet,gradientsParams,loss] = dlfeval(@modelGradients,dlnet,fcParams,dlX1,dlX2,pairLabels);
    lossValue = double(gather(extractdata(loss)));
    
    [~,~,tloss] = dlfeval(@modelGradients,dlnet,fcParams,tdlX1,tdlX2,tpairLabels);
    tlossValue = double(gather(extractdata(tloss)));
    clear dlX1 dlX2 tdlX1 tdlX2
%***************************************************
    [dlnet,trailingAvgSubnet,trailingAvgSqSubnet] = ...
        adamupdate(dlnet,gradientsSubnet, ...
        trailingAvgSubnet,trailingAvgSqSubnet,iteration,learningRate,gradDecay,gradDecaySq);
    
    [fcParams,trailingAvgParams,trailingAvgSqParams] = ...
        adamupdate(fcParams,gradientsParams, ...
        trailingAvgParams,trailingAvgSqParams,iteration,learningRate,gradDecay,gradDecaySq);
    
    if plots == "training-progress"
        addpoints(lineLossTrain,iteration,lossValue);
        addpoints(lineLosstest,iteration,tlossValue);
    end
    drawnow;
    
    fprintf('iteration: %d ----- %d\n',iteration,numIterations);
    fprintf('loss: Training: %0.4f ----- Testing: %0.4f\n',lossValue,tlossValue);
end
%******************************************************************************************
% the called functions
%******************************************************************************************
function [gradientsSubnet,gradientsParams,loss] = modelGradients(dlnet,fcParams,dlX1,dlX2,pairLabels)
    % Pass the image pair through the network 
    Y = forwardSiamese(dlnet,fcParams,dlX1,dlX2);
    
    % Calculate binary cross-entropy loss
    loss = binarycrossentropy(Y,pairLabels);
       
    % Calculate gradients of the loss with respect to the network learnable
    % parameters
    [gradientsSubnet,gradientsParams] = dlgradient(loss,dlnet.Learnables,fcParams);
end
function loss = binarycrossentropy(Y,pairLabels)
    % binarycrossentropy accepts the network's prediction Y and the true
    % labels pairLabels, and returns the binary cross-entropy loss value.
    
    % Get precision of prediction to prevent errors due to floating
    % point precision
    precision = underlyingType(Y);
    
    % Convert values less than floating point precision to eps.
    Y(Y < eps(precision)) = eps(precision);
    % Convert values between 1-eps and 1 to 1-eps.
    Y(Y > 1 - eps(precision)) = 1 - eps(precision);
    
    % Calculate binary cross-entropy loss for each pair
    loss = -pairLabels.*log(Y) - (1 - pairLabels).*log(1 - Y);
    
    % Sum over all pairs in minibatch and normalize.
    loss = sum(loss)/numel(pairLabels);
end
%***************************************************************************************
function Y = forwardSiamese(dlnet,fcParams,dlX1,dlX2)
% forwardSiamese accepts the network and a pair of training images, and
% returns the element-wise absolute difference of their sigmoid-activated
% feature vectors. fcParams is accepted but not used in this version.
% Use forwardSiamese during training.

    % Pass the first image through the twin subnetwork
    F1 = forward(dlnet,dlX1);
    F1 = sigmoid(F1);
    
    % Pass the second image through the twin subnetwork
    F2 = forward(dlnet,dlX2);
    F2 = sigmoid(F2);
    
    % Take the absolute difference of the feature vectors
    Y = abs(F1 - F2);
end
%***************************************************************************************
function [X1,X2,pairLabels] = getAlexnetTest(imds,miniBatchSize)
% getAlexnetTest returns a mini-batch of test image pairs, resized to
% 227-by-227, together with their pair labels (1 = similar, 0 = dissimilar).
    pairLabels = zeros(1,miniBatchSize);
    X1 = zeros([227 227 3 miniBatchSize]);
    X2 = zeros([227 227 3 miniBatchSize]);
    imdsaug = augmentedImageDatastore([227 227],imds);
    batch = readall(imdsaug);
    for i = 1:miniBatchSize
        choice = rand(1);
        if choice < 0.5
            [pairIdx1,pairIdx2,pairLabels(i)] = getSimilarPair(batch.response);
        else
            [pairIdx1,pairIdx2,pairLabels(i)] = getDissimilarPair(batch.response);
        end
        
        X1(:,:,:,i) = batch.input{pairIdx1};
        X2(:,:,:,i) = batch.input{pairIdx2};
    end
end
function [pairIdx1,pairIdx2,pairLabel] = getSimilarPair(classLabel)
% getSimilarPair returns a random pair of indices for images
% that are in the same class and the similar pair label = 1.
    % Find all unique classes.
    classes = unique(classLabel);
    
    % Choose a class randomly which will be used to get a similar pair.
    classChoice = randi(numel(classes));
    
    % Find the indices of all the observations from the chosen class.
    idxs = find(classLabel==classes(classChoice));
    
    % Randomly choose two different images from the chosen class.
    pairIdxChoice = randperm(numel(idxs),2);
    pairIdx1 = idxs(pairIdxChoice(1));
    pairIdx2 = idxs(pairIdxChoice(2));
    pairLabel = 1;
end
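function  [pairIdx1,pairIdx2,label] = getDissimilarPair(classLabel)
% getDissimilarPair returns a random pair of indices for images
% that are in different classes and the dissimilar pair label = 0.
    % Find all unique classes.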
    classes = unique(classLabel);
    
    % Choose two different classes randomly which will be used to get a dissimilar pair.
    classesChoice = randperm(numel(classes),2);
    
    % Find the indices of all the observations from the first and second classes.
    idxs1 = find(classLabel==classes(classesChoice(1)));
    idxs2 = find(classLabel==classes(classesChoice(2)));
    
    % Randomly choose one image from each class.
    pairIdx1Choice = randi(numel(idxs1));
    pairIdx2Choice = randi(numel(idxs2));
    pairIdx1 = idxs1(pairIdx1Choice);
    pairIdx2 = idxs2(pairIdx2Choice);
    label = 0;
end
%***************************************************************************************
function [X1,X2,pairLabels] = getAlexnetBatch(imds,miniBatchSize)
% getAlexnetBatch returns a mini-batch of randomly augmented training image
% pairs, resized to 227-by-227, together with their pair labels
% (1 = similar, 0 = dissimilar).
    pairLabels = zeros(1,miniBatchSize);
    X1 = zeros([227 227 3 miniBatchSize]);
    X2 = zeros([227 227 3 miniBatchSize]);
    imageAugmenter = imageDataAugmenter('RandRotation',[90,270], ...
        'RandXReflection',true,'RandYReflection',true);
    imdsaug = augmentedImageDatastore([227 227],imds,'DataAugmentation',imageAugmenter);
    batch = readall(imdsaug);
    for i = 1:miniBatchSize
        choice = rand(1);
        if choice < 0.5
            [pairIdx1,pairIdx2,pairLabels(i)] = getSimilarPair(batch.response);
        else
            [pairIdx1,pairIdx2,pairLabels(i)] = getDissimilarPair(batch.response);
        end
        
        X1(:,:,:,i) = batch.input{pairIdx1};
        X2(:,:,:,i) = batch.input{pairIdx2};
    end
end

Joss Knight on 10 Sep 2022


I'm imagining that you would do something like this, in your forwardSiamese function:

% Move the first subnetwork to the GPU, use it, then gather it back to the host
dlnet1 = dlupdate(@gpuArray,dlnet1);
F1 = forward(dlnet1,dlX1);
F1 = sigmoid(F1);
dlnet1 = dlupdate(@gather,dlnet1);
dlnet2 = dlupdate(@gpuArray,dlnet2);
% Pass the second image through the second subnetwork
F2 = forward(dlnet2,dlX2);
F2 = sigmoid(F2);
dlnet2 = dlupdate(@gather,dlnet2);

For this to work, you will need to ensure that you always pass your two networks into the call to dlfeval as fully host-side networks, so something like:

dlnet1 = dlupdate(@gather,dlnet1);
dlnet2 = dlupdate(@gather,dlnet2);
[gradientsSubnet,gradientsParams,loss] = dlfeval(@modelGradients,dlnet1,dlnet2,fcParams,dlX1,dlX2,pairLabels);

If you don't do this then it won't make any difference what you do inside modelGradients because MATLAB will hold onto the GPU copy from the calling code.

You should also remove the fcParams part of the code, since you seem to have deleted the fullyconnect operation and therefore it's wasting space.
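Putting the unshared-weights part together, a training step might look roughly like the sketch below. It is not the code from the thread: the two tiny networks, the random data and the function name modelGradientsTwoNets are hypothetical stand-ins, fcParams is left out as suggested above, and the loss is averaged over all elements (an assumed reduction, chosen so that the loss is a scalar). It leaves out the host/GPU transfers and only shows that each subnetwork gets its own learnables, its own gradients and its own Adam state.

```matlab
% Two small stand-in subnetworks with independent weights (hypothetical;
% in practice each would be built as in the posted code).
layers = [
    imageInputLayer([8 8 1],'Normalization','none')
    fullyConnectedLayer(4)];
dlnet1 = dlnetwork(layerGraph(layers));
dlnet2 = dlnetwork(layerGraph(layers));

% Random stand-in data: a mini-batch of 5 image pairs and 0/1 pair labels.
dlX1 = dlarray(rand(8,8,1,5,'single'),'SSCB');
dlX2 = dlarray(rand(8,8,1,5,'single'),'SSCB');
pairLabels = [1 0 1 0 1];

% Separate Adam state per subnetwork.
avg1 = []; avgSq1 = []; avg2 = []; avgSq2 = [];
learningRate = 2e-5; gradDecay = 0.9; gradDecaySq = 0.99;

for iteration = 1:3
    [grad1,grad2,loss] = dlfeval(@modelGradientsTwoNets, ...
        dlnet1,dlnet2,dlX1,dlX2,pairLabels);
    [dlnet1,avg1,avgSq1] = adamupdate(dlnet1,grad1,avg1,avgSq1, ...
        iteration,learningRate,gradDecay,gradDecaySq);
    [dlnet2,avg2,avgSq2] = adamupdate(dlnet2,grad2,avg2,avgSq2, ...
        iteration,learningRate,gradDecay,gradDecaySq);
    fprintf('iteration %d, loss %0.4f\n',iteration,double(extractdata(loss)));
end

function [grad1,grad2,loss] = modelGradientsTwoNets(dlnet1,dlnet2,dlX1,dlX2,pairLabels)
    % Each image goes through its own subnetwork.
    F1 = sigmoid(forward(dlnet1,dlX1));
    F2 = sigmoid(forward(dlnet2,dlX2));
    Y = abs(F1 - F2);

    % Clamp away from 0 and 1 so that log stays finite.
    Y = min(max(Y,eps('single')),1 - eps('single'));

    % Binary cross-entropy averaged over all elements.
    loss = -pairLabels.*log(Y) - (1 - pairLabels).*log(1 - Y);
    loss = sum(loss,'all')/numel(loss);

    % One gradient table per subnetwork.
    [grad1,grad2] = dlgradient(loss,dlnet1.Learnables,dlnet2.Learnables);
end
```

Because dlgradient is called with both Learnables tables, it returns one gradient table per subnetwork, and the two adamupdate calls then update each subnetwork independently.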
