Multilayer Perceptron with More Than One Output and Data Interpretation?

Matthew on 6 Aug 2024
Edited: Joss Knight on 7 Aug 2024
I haven't used MATLAB's ML or deep learning toolbox since maybe 2022. In the last year, it seems like MathWorks intentionally made the ML portion of MATLAB unusable. I'm hoping someone here can help me out, because rolling back my MATLAB install two years will make a lot of my recent work unstable. I've done my best to document my process here. Error messages are given in italics for ease of reading.
I have some data that is represented by four doubles that are normalized to between 0 and 1. I would like to use these four doubles to predict two doubles that are also between 0 and 1. I would like to make a multilayer perceptron that takes in the four values and outputs the two values. However, createMLPNetwork doesn't appear to support multiple outputs; at least, the documentation for it doesn't explain how to do so. So I have tried to make an MLP from scratch using the following code:
%% Prep NN architecture
layers = [
inputLayer([4, 1], "CU", name="in")
flattenLayer
fullyConnectedLayer(20, name="fc1")
reluLayer
fullyConnectedLayer(16, name="fc2")
reluLayer
fullyConnectedLayer(12, name="fc3")
reluLayer
fullyConnectedLayer(numel(data{1,1}), name="output")
softmaxLayer
];
net = dlnetwork(layers);
[trainedNet, info] = trainnet(train, layers, "mse", opts);
Attempting to run this gives me the following results:
Error using trainnet (line 46)
Error forming mini-batch for network input "in". Data interpreted with format "CU". To specify a different format, use the InputDataFormats option.
Error in trainRutileModel (line 107)
[trainedNet, info] = trainnet(train, layers, "mse", opts);
Caused by:
The data format must have a batch dimension.
I tried to get around using the generic inputLayer function by using the more specific functions like featureInputLayer but replacing the inputLayer with a featureInputLayer throws the following:
Error using trainnet (line 46)
Number of observations in predictors (4) and targets (2) must match. Check that the data and network are consistent.
Error in trainRutileModel (line 107)
[trainedNet, info] = trainnet(train, layers, "mse", opts);
So that won't work because I only have two output data points. The same happened when I tried imageInputLayer. Then I tried replacing "CU" with other values - the following error is for "BU" but the other errors are the same - but got the following:
Error using dlnetwork/initialize (line 558)
Invalid network.
Error in dlnetwork (line 167)
net = initialize(net, dlX{:});
Error in trainRutileModel (line 104)
net = dlnetwork(layers);
Caused by:
Layer 'fc1': Invalid input data for fully connected layer. The input data must have exactly one channel dimension.
So then I tried flattening the output and got the following error:
Error using dlnetwork/initialize (line 558)
Invalid network.
Error in dlnetwork (line 167)
net = initialize(net, dlX{:});
Error in trainRutileModel (line 105)
net = dlnetwork(layers);
Caused by:
Layer 'flatten': Invalid input data. Layer expects data with a channel dimension, but received input data with format "BU".
I'm really not sure what to do here. I have no idea why MathWorks would make this so much more difficult to use and give so much less, and more opaque, documentation. If anyone has any ideas on how to make this work, I'd be happy to hear them. In the meantime, I'm going to take another crack at the createMLPNetwork function and hope that when MathWorks says "a black-box continuous-time or discrete-time neural state-space model with identifiable (estimable) network weights and bias," by "states" they mean "outputs."
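For reference, a minimal sketch of one way the featureInputLayer attempt above might be made to train. The post does not show how `train` is built, so this assumes (hypothetically) that the predictors are stored as a 4-by-N array and the targets as a 2-by-N array. With numeric arrays, featureInputLayer treats each row as one observation, which would explain the "predictors (4) and targets (2)" message; transposing both arrays and passing them to trainnet as separate arguments avoids it. The variable names and the random stand-in data are illustrative only.

```matlab
% Hypothetical stand-in data: features in rows, one observation per column.
numObs = 200;
X = rand(4, numObs);   % 4 predictors per observation
T = rand(2, numObs);   % 2 targets per observation

% featureInputLayer expects one observation per ROW, so transpose both arrays.
XTrain = X';           % numObs-by-4
TTrain = T';           % numObs-by-2

layers = [
    featureInputLayer(4)
    fullyConnectedLayer(20)
    reluLayer
    fullyConnectedLayer(2)   % two regression outputs
];
net = dlnetwork(layers);

opts = trainingOptions("adam", MaxEpochs=5, MiniBatchSize=32, ...
    Verbose=false, Plots="none");

% trainnet takes predictors and targets separately, then the network and loss.
trainedNet = trainnet(XTrain, TTrain, net, "mse", opts);
```

Note that trainnet is given the dlnetwork `net` here rather than the layer array, and no output layer is needed because the loss is named directly.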

Accepted Answer

LeoAiE on 6 Aug 2024
Hi there!
I have a few code examples that may help you get started!
% Define network architecture
layers = [
featureInputLayer(4, "Name", "input")
fullyConnectedLayer(20, "Name", "fc1")
reluLayer("Name", "relu1")
fullyConnectedLayer(16, "Name", "fc2")
reluLayer("Name", "relu2")
fullyConnectedLayer(12, "Name", "fc3")
reluLayer("Name", "relu3")
fullyConnectedLayer(2, "Name", "output") % 2 outputs for regression
];
% Convert to dlnetwork
net = dlnetwork(layers);
options = trainingOptions('sgdm', ...
'MiniBatchSize', 32, ...
'MaxEpochs', 100, ...
'InitialLearnRate', 1e-3, ...
'Shuffle', 'every-epoch', ...
'Verbose', true, ...
'Plots', 'training-progress');
% Assuming trainData is your input data matrix of size [numObservations, 4]
% and trainTargets is your target data matrix of size [numObservations, 2]
% Split data into training and validation sets if needed
% [trainInd, valInd] = dividerand(size(trainData, 1), 0.8, 0.2, 0);
% Prepare datastore if data is large
% ds = arrayDatastore({trainData, trainTargets}, 'IterationDimension', 1);
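% Train the dlnetwork with trainnet and a mean-squared-error loss.
% (trainNetwork would instead need the layer array with a regressionLayer appended.)
[trainedNet, info] = trainnet(trainData, trainTargets, net, "mse", options);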
% Data preparation
trainData = rand(1000, 4); % Example data, replace with your actual data
trainTargets = rand(1000, 2); % Example data, replace with your actual data
% Define network architecture
layers = [
featureInputLayer(4, "Name", "input")
fullyConnectedLayer(20, "Name", "fc1")
reluLayer("Name", "relu1")
fullyConnectedLayer(16, "Name", "fc2")
reluLayer("Name", "relu2")
fullyConnectedLayer(12, "Name", "fc3")
reluLayer("Name", "relu3")
fullyConnectedLayer(2, "Name", "output") % 2 outputs for regression
];
% Convert to dlnetwork
net = dlnetwork(layers);
% Training options
options = trainingOptions('sgdm', ...
'MiniBatchSize', 32, ...
'MaxEpochs', 100, ...
'InitialLearnRate', 1e-3, ...
'Shuffle', 'every-epoch', ...
'Verbose', true, ...
'Plots', 'training-progress');
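% Train the dlnetwork with trainnet and a mean-squared-error loss.
% (trainNetwork would instead need the layer array with a regressionLayer appended.)
[trainedNet, info] = trainnet(trainData, trainTargets, net, "mse", options);
% Inspect layer sizes and activations
analyzeNetwork(net);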
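To interpret the two outputs after training, predictions can be requested for new observations. The following is a minimal sketch, assuming R2024a or later (where minibatchpredict is available) and using a freshly initialized network as a stand-in for the trained one; `XNew` is hypothetical example data, not from the original post.

```matlab
% Stand-in for the trained network: same input and output sizes, untrained weights.
layers = [
    featureInputLayer(4)
    fullyConnectedLayer(20)
    reluLayer
    fullyConnectedLayer(2)
];
net = dlnetwork(layers);

% Five hypothetical new observations, one per row, four features each.
XNew = rand(5, 4);

% Two predicted values are returned for each observation.
YPred = minibatchpredict(net, XNew);
disp(size(YPred))
```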
1 Comment
Matthew on 7 Aug 2024
Thanks for the response! I had to work with what you sent a little bit. I ended up with this before I finally got it running:
opts = trainingOptions('sgdm', ...
'Plots', 'training-progress', ...
'MiniBatchSize', 100, ...
'MaxEpochs', 60, ...
'Shuffle', 'every-epoch', ...
'ValidationData', {valData', valLabels'}, ...
'ExecutionEnvironment', 'gpu');
layers = [
featureInputLayer(4, "Name", "input")
fullyConnectedLayer(20, "Name", "fc1")
reluLayer("Name", "relu1")
fullyConnectedLayer(16, "Name", "fc2")
reluLayer("Name", "relu2")
fullyConnectedLayer(12, "Name", "fc3")
reluLayer("Name", "relu3")
fullyConnectedLayer(2, "Name", "fc4")
softmaxLayer
regressionLayer('name', 'output') % 2 outputs for regression
];
[trainedNet, info] = trainNetwork(xTrain', yTrain', layers, opts);
So I had to add the regression layer on to the end of the layer map and rearrange the data a little bit, but it finally started training.
Thanks so much for spending some time and effort answering this!
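One point worth checking in the layer list above: softmaxLayer forces the two outputs of each observation to sum to 1, which only suits targets that also sum to 1. If the two targets are independent values in [0, 1] (an assumption, since the thread does not say), a sigmoidLayer bounds each output separately. The sketch below compares the two on untrained networks with hypothetical random inputs; it illustrates the constraint only and is not the code used in the thread.

```matlab
% Eight hypothetical observations, labeled as channel-by-batch ("CB").
X = dlarray(rand(4, 8, "single"), "CB");

% Two tiny untrained networks that differ only in the final activation.
softmaxNet = dlnetwork([featureInputLayer(4); fullyConnectedLayer(2); softmaxLayer]);
sigmoidNet = dlnetwork([featureInputLayer(4); fullyConnectedLayer(2); sigmoidLayer]);

YSoft = extractdata(predict(softmaxNet, X));   % 2-by-8
YSig  = extractdata(predict(sigmoidNet, X));   % 2-by-8

disp(sum(YSoft, 1))   % softmax: the two outputs of each observation sum to 1
disp(sum(YSig, 1))    % sigmoid: each output lies in (0,1), sums are unconstrained
```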


More Answers (1)

Joss Knight on 7 Aug 2024
You can continue to use trainNetwork if you don't want to use dlnetwork. dlnetwork obviously provides much more flexibility as well as the ability to format your data however you like (which was tripping you up), but you don't have to use it.
2 Comments
Matthew on 7 Aug 2024
Good morning, Joss! Thanks for looking into this!
I looked into dlnetwork's documentation, but it's not terribly clear on how data formatting, etc., works. As you can see in the code included in the original post, I tried using dlnetwork and trainnet, but ran into problems formatting the data. Moreover, in the original post, I made several attempts at getting dlnetwork to function, but eventually ran into a circular set of errors regarding data type and flattening.
Unfortunately, I did not see anything in the dlnetwork documentation that explains how to set up dlnetwork with anything except an imageInput layer. And there doesn't seem to be a data type for the inputLayer function that just takes in plain old doubles. I haven't yet been able to solve those issues for dlnetwork and inputLayer, though I'm still trying at it.
Joss Knight on 7 Aug 2024
Edited: Joss Knight on 7 Aug 2024
That's totally fair. A future version will remove the requirement for you to describe your data format, and make it more like other frameworks where certain input layouts are expected for certain types of network.
dlnetwork has the same requirement as other frameworks that you define the correct layout for your data, but instead of permuting your data into a required format, you label which dimension is which. To use an inputLayer with a fullyConnectedLayer, you need at least to say which dimension holds the channels of your data and which is the batch dimension, in the same way you might be required in PyTorch or TensorFlow to provide your data laid out with each row a different observation in the batch and each column a different channel. The documentation tries to explain as completely as it can what each label means.
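A minimal sketch of that labeling for the generic inputLayer, assuming R2023b or later and data stored as 4-by-N predictors and 2-by-N targets (hypothetical random data here). The first dimension is labeled "C" (channel) and the second "B" (batch), with NaN leaving the batch size open; the InputDataFormats and TargetDataFormats training options tell trainnet how to read the plain numeric arrays.

```matlab
layers = [
    inputLayer([4 NaN], "CB", Name="in")   % 4 channels, variable batch size
    fullyConnectedLayer(20)
    reluLayer
    fullyConnectedLayer(2)
];
net = dlnetwork(layers);

% Hypothetical data: one observation per column.
X = rand(4, 100);   % channel-by-batch
T = rand(2, 100);   % channel-by-batch

opts = trainingOptions("adam", MaxEpochs=5, Verbose=false, Plots="none", ...
    InputDataFormats="CB", TargetDataFormats="CB");

trainedNet = trainnet(X, T, net, "mse", opts);
```

With "CB" the fully connected layer sees exactly one channel dimension and one batch dimension, so no flattenLayer is needed.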

Release

R2024a
