It seems like your observations (the batch) are the days, time runs along the rows, and you perhaps have two channels in the first two columns. Your data may well already be in the format that trainnet expects. But it doesn't sound like you want a fully connected layer and a softmax layer. You are not trying to classify each day; you are performing regression, trying to match your predictions to targets. You just want one or more LSTM layers separated by activation layers, with the last one outputting a single channel.
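A rough sketch of what this could look like, not a tested recipe. It assumes trainnet accepts a cell array of T-by-C numeric matrices with time along the rows, which is worth checking against the trainnet documentation for your release. The random processed_data is a hypothetical stand-in for the real variable, and reluLayer, the 128 hidden units, and the "mse" loss are illustrative choices that the answer itself does not specify.

```matlab
% Hypothetical stand-in for processed_data: 3 "days" of different lengths.
rng(0);
num_days = 3;
processed_data = cell(num_days, 1);
for day = 1:num_days
    T = 100 + 10*day;   % number of observations on this day
    processed_data{day} = [rand(T,1), randn(T,1), double(rand(T,1) > 0.5)];
end

% Split each day into predictors (T-by-2) and targets (T-by-1), keeping
% time along the rows. No permute, dlarray, or one-hot encoding is used.
X = cellfun(@(d) single(d(:, 1:2)), processed_data, 'UniformOutput', false);
Y = cellfun(@(d) single(d(:, 3)),   processed_data, 'UniformOutput', false);

% LSTM layers separated by an activation layer. The last LSTM has one
% hidden unit, so the network outputs a single channel per time step.
layers = [
    sequenceInputLayer(2)
    lstmLayer(128, 'OutputMode', 'sequence')
    reluLayer
    lstmLayer(1, 'OutputMode', 'sequence')
];
net = dlnetwork(layers);

options = trainingOptions('adam', ...
    'MaxEpochs', 30, ...
    'MiniBatchSize', 32, ...
    'Shuffle', 'every-epoch', ...
    'Verbose', false);

% Training call (assumed layout, see above), left commented out so the
% sketch itself runs quickly:
% net = trainnet(X, Y, net, 'mse', options);
```

Each day stays a separate sequence in the cell array, so the gap of several hours between days never appears inside a sequence.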
How to prepare irregularly spaced time-series data for classification using LSTM
Eugen Fekete on 25 Feb 2025
Commented: Eugen Fekete on 2 Mar 2025
(First 50 days' worth of data included)
I have a variable holding 215 days' worth of data, structured like this: processed_data is a 215×1 cell array in which each cell contains the data for one day. Each cell (day) has a varying number of observations (approximately 12,000 rows on average). Each row represents an observation: the first column contains the seconds elapsed since the previous row (not normalized), the second column contains the price of a specified security (normalized using z-score), and the third column is the target variable, signaling whether the price will be 0.01% higher 60 seconds after that moment (represented as 1) or not (represented as 0). I'm using the first two columns as the predictors. I imagined this network making a prediction for every observation in the data. I keep the days separate because hours pass between the last row of day i and the first row of day i+1. Below is a sample of data from an arbitrary day:
2.57500000000437	0.502515050312692	0
1.03600000000006	0.469361050915526	1
1.05899999999383	0.386501335237771	1
0.838000000003376	0.436219680495852	0
1.12999999999738	0.469361050915526	0
0.824000000000524	0.369924327252462	1
I'm just a beginner in ML, and I'm having a really hard time imagining how the data for the LSTM layer should be formatted. If I'm correct, it needs 3-dimensional data, where one dimension represents the channel, another the time step, and another the batch. I'm now sure that I completely misunderstood these concepts when I wrote the code below:
%% Partitioning data.
train_data_length = round(length(processed_data) * 0.9);
train_data = processed_data(1: train_data_length);
test_data = processed_data(train_data_length+1:end);
%% Training setup
% Convert data to cell arrays of dlarray.
train_X = cell(size(train_data));
train_Y = cell(size(train_data));
for day = 1:length(train_data)
    % Add batch dimension (C×B×T where B=1).
    data = permute(train_data{day}(:, 1:2)', [1 3 2]); % [2×1×T]
    train_X{day} = dlarray(data, "CBT");
    % Convert labels to one-hot encoded CBT format [2×1×T].
    labels = train_data{day}(:, 3)'; %  [1×T]
    one_hot_labels = onehotencode(labels, 1, 'ClassNames', [0 1]); % [2×T]
    one_hot_labels = reshape(one_hot_labels, 2, 1, []); % [2×1×T]
    train_Y{day} = dlarray(single(one_hot_labels), "CBT");
end
ds = combine(...
    arrayDatastore(train_X, 'OutputType', 'same'), ...
    arrayDatastore(train_Y, 'OutputType', 'same')...
);
%clearvars -except ds test_data ml_method
num_features = 2;
num_hidden_units = 128;
num_classes = 2;
mini_batch_size = 32;
layers = [
    sequenceInputLayer(num_features, 'Name', 'input')
    lstmLayer(num_hidden_units, 'OutputMode', 'sequence')
    fullyConnectedLayer(num_classes)
    softmaxLayer
];
net = dlnetwork(layers);
options = trainingOptions('adam', ...
    'MaxEpochs', 30, ...
    'MiniBatchSize', mini_batch_size, ...
    'SequenceLength', 'longest', ...
    'Shuffle', 'every-epoch', ...
    'Plots', 'training-progress', ...
    'InputDataFormats', 'CBT', ...
    'Verbose', false, ...
    'ExecutionEnvironment', 'gpu');
net = trainnet(ds, net, 'crossentropy', options);
In the code above, I tried to define the channel as the number of predictors (2 in my case, and most likely the only dimension I defined correctly). I set the batch to 1 because I thought it meant the network would use one observation to make predictions. I set the time step as the first column of a day's worth of data (the seconds passed since the last observation) because I thought it literally meant steps in time. Now I know that I was completely wrong. I also had to reduce mini_batch_size from 128 to 32, which I find too low, but otherwise I would run out of memory. I guess this is because of my incorrectly formatted data (I'm not sure if this is an important detail, but my GPU is an RTX 2070 Super with 8 GB of memory).
My question is: how should I format my data for the LSTM layer based on my goals? Or are my goals unrealistic?
Accepted Answer
Joss Knight on 1 Mar 2025
5 Comments
Joss Knight on 2 Mar 2025
Is there really no example in the MATLAB documentation that fits your use case? This isn't exactly my area of expertise, and I'm just trying to avoid literally doing a search for you.
Classification and regression are not fundamentally different. The softmax operation is what lets us convert a regression loss (match the values) into a classification loss (match the highest number), but the underlying algorithms are the same.
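A small numeric sketch of that point, using made-up scores and plain MATLAB arithmetic rather than any toolbox function. The variable names and values are hypothetical and chosen only for illustration.

```matlab
% Hypothetical raw network outputs: one row per class, one column per time step.
scores = [ 2.0  -1.0  0.1;    % class 0
           0.5   1.5  0.9];   % class 1

% Softmax over the class dimension turns the scores into probabilities.
shifted = scores - max(scores, [], 1);            % subtract max for numerical stability
probs   = exp(shifted) ./ sum(exp(shifted), 1);   % each column sums to 1

% One-hot targets: classes 0, 1, 1 for the three time steps.
targets = [1 0 0;
           0 1 1];

% Cross-entropy on the probabilities: still a "match the values" loss.
ce_loss = -mean(sum(targets .* log(probs), 1));

% For comparison, a squared-error regression loss on the same outputs.
mse_loss = mean(sum((probs - targets).^2, 1));

% The predicted class is the row holding the highest value in each column.
[~, predicted] = max(probs, [], 1);               % 1 means class 0, 2 means class 1
```

Both losses compare the same outputs with the same targets; only the final readout (taking the largest value) makes the result a class label.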