dlarray/dlgradient Value to differentiate is non-scalar. It must be a traced real dlarray scalar.
Hello, I am working with automatic differentiation, but my code throws the error shown in the title.
data = randn (3, 5000, 100);
numChannels=size(data,1);
numObservations=size(data,3);
XTrain = data(:,:,1:floor(0.9*numObservations));
XTest = data(:,:,floor(0.9*numObservations)+1:end);
numHiddenUnits=100;
numLatentChannels=1;
layersE = [
    sequenceInputLayer(numChannels,Normalization="zscore")
    lstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(2*numLatentChannels)
    samplingLayerSeq
    ];
layersD = [
    sequenceInputLayer(numLatentChannels,Normalization="zscore")
    lstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(numChannels)
    ];
netE = dlnetwork(layersE);
netD = dlnetwork(layersD);
numEpochs = 150;
miniBatchSize = 20;
learnRate = 1e-2;
dsTrain = arrayDatastore(XTrain,IterationDimension=3);
numOutputs = 1;
mbq = minibatchqueue(dsTrain,numOutputs, ...
    MiniBatchSize=miniBatchSize, ...
    MiniBatchFcn=@preprocessMiniBatch, ...
    MiniBatchFormat="CBT", ...
    PartialMiniBatch="discard");
trailingAvgE = [];
trailingAvgSqE = [];
trailingAvgD = [];
trailingAvgSqD = [];
numObservationsTrain = size(XTrain,3);
numIterationsPerEpoch = ceil(numObservationsTrain / miniBatchSize);
numIterations = numEpochs * numIterationsPerEpoch;
monitor = trainingProgressMonitor( ...
    Metrics="Loss", ...
    Info="Epoch", ...
    XLabel="Iteration");
epoch = 0;
iteration = 0;
% Loop over epochs.
while epoch < numEpochs && ~monitor.Stop
    epoch = epoch + 1;
    % Shuffle data.
    shuffle(mbq);
    % Loop over mini-batches.
    while hasdata(mbq) && ~monitor.Stop
        iteration = iteration + 1;
        % Read mini-batch of data.
        X = next(mbq);
        % X = dlarray(X,'CBT');
        % Evaluate loss and gradients.
        [loss,gradientsE,gradientsD] = dlfeval(@modelLoss,netE,netD,X);
        % Update learnable parameters.
        [netE,trailingAvgE,trailingAvgSqE] = adamupdate(netE, ...
            gradientsE,trailingAvgE,trailingAvgSqE,iteration,learnRate);
        [netD,trailingAvgD,trailingAvgSqD] = adamupdate(netD, ...
            gradientsD,trailingAvgD,trailingAvgSqD,iteration,learnRate);
    end
end
%% model loss
function [loss,gradientsE,gradientsD] = modelLoss(netE,netD,X)
% Forward through encoder.
[Z,mu,logSigmaSq] = forward(netE,X);
% Forward through decoder.
Y = forward(netD,Z);
% Calculate loss and gradients.
loss = elboLoss(Y,X,mu,logSigmaSq);
[gradientsE,gradientsD] = dlgradient(loss,netE.Learnables,netD.Learnables);
end
%% elboloss
function loss = elboLoss(Y,T,mu,logSigmaSq)
% Reconstruction loss.
reconstructionLoss = mse(Y,T);
% KL divergence.
KL = -0.5 * sum(1 + logSigmaSq - mu.^2 - exp(logSigmaSq),1);
KL = mean(KL);
% Combined loss.
loss = reconstructionLoss + KL;
end
%% preprocess minibatch
function X = preprocessMiniBatch(dataX)
% Concatenate.
X = cat(3,dataX{:});
end
%% class
classdef samplingLayerSeq < nnet.layer.Layer
    methods
        function layer = samplingLayerSeq(args)
            % layer = samplingLayerSeq creates a sampling layer for VAEs.
            %
            % layer = samplingLayerSeq(Name=name) also specifies the layer
            % name.
            % Parse input arguments.
            arguments
                args.Name = "";
            end
            % Layer properties.
            layer.Name = args.Name;
            layer.Type = "Sampling";
            layer.Description = "Mean and log-variance sampling";
            layer.OutputNames = ["out" "mean" "log-variance"];
        end
        function [Z,mu,logSigmaSq] = predict(~,X)
            % [Z,mu,logSigmaSq] = predict(~,X) forwards input data through
            % the layer at prediction and training time and outputs the
            % result.
            %
            % Inputs:
            %     X - Concatenated input data where X(1:K,:,:) and
            %         X(K+1:end,:,:) correspond to the mean and
            %         log-variances, respectively, and K is the number
            %         of latent channels.
            % Outputs:
            %     Z          - Sampled output
            %     mu         - Mean vector
            %     logSigmaSq - Log-variance vector
            % Data dimensions.
            numLatentChannels = size(X,1)/2;
            miniBatchSize = size(X,2);
            % Split statistics.
            mu = X(1:numLatentChannels,:,:);
            logSigmaSq = X(numLatentChannels+1:end,:,:);
            sz = size(mu);
            epsilon = randn(sz);
            % Sample output.
            % epsilon = randn(numLatentChannels,miniBatchSize,"like",X);
            sigma = exp(.5 * logSigmaSq);
            Z = epsilon .* sigma + mu;
            % Z = dlarray(Z,'CBT');
        end
    end
end
Answers (1)
Ben
on 5 Jan 2024
The loss computed in modelLoss is not a scalar: because the model outputs sequences, the loss still has a non-singleton T (time) dimension. dlgradient requires a scalar value to differentiate, so you need to reduce the loss to a scalar first. Standard approaches are to take a sum or mean over the T dimension, but more intricate losses are common too.
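To illustrate where the extra dimension comes from, here is a minimal sketch using plain numeric arrays as hypothetical stand-ins for the encoder outputs (1 latent channel, 20 observations, 5000 time steps, matching the C-by-B-by-T layout in the question). It is not the poster's actual data, only a shape check of the KL term from elboLoss.

```matlab
% Hypothetical stand-ins for mu and logSigmaSq (assumption: C-by-B-by-T
% layout with 1 latent channel, 20 observations, 5000 time steps).
mu = randn(1,20,5000);
logSigmaSq = randn(1,20,5000);

% KL term summed over the channel dimension, as in the question's elboLoss.
KL = -0.5 * sum(1 + logSigmaSq - mu.^2 - exp(logSigmaSq),1);

% mean(KL) reduces only the first non-singleton dimension (B here),
% so one value per time step is left over.
klPerStep = mean(KL);
disp(size(klPerStep))    % 1-by-1-by-5000, not a scalar

% Reducing over the batch and time dimensions leaves a single value.
klScalar = mean(mean(KL,2),3);
disp(size(klScalar))     % 1-by-1
```

Assuming dlarray follows the same reduction rule, replacing `KL = mean(KL);` in elboLoss with `KL = mean(mean(KL,2),3);` (or `mean(KL,"all")`, if your release supports that option for dlarray) should give dlgradient a scalar loss. Whether to average or sum over T is a modeling choice: a sum weights the KL term more heavily relative to the reconstruction loss for long sequences.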