Datastore with overlapping read function

I have some data stored in 20 spreadsheets. Each spreadsheet has a size of about 4000x200. I want to store the data in a datastore and feed it into a temporal CNN in chunks of 100 rows, with the chunks overlapping. For example, the first value the datastore returns should be a 100x200 array, corresponding to rows 1:100 of the spreadsheet. The second value should be rows 2:101, then 3:102, etc.
The only way I can seem to do this right now is to read all the spreadsheets into an 80,000x200 array in MATLAB, create a 3D array of size 79,900x100x200, and then use a for loop to iterate through the array, copying 100x200 chunks from the 2D array into the 3D array. Finally, I put the 3D array into an arrayDatastore. However, this seems really inefficient, and I have to keep the CNN's batch size pretty small to avoid memory errors.
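A minimal sketch of the loop described above (variable names like `data` and `chunks` are illustrative, not the poster's actual code):

```matlab
% data: 80,000x200 array obtained by vertically concatenating all spreadsheets
win    = 100;                               % rows per overlapping chunk
nWin   = size(data,1) - win + 1;            % number of overlapping windows
chunks = zeros(nWin, win, size(data,2));    % ~12 GB in double precision
for k = 1:nWin
    chunks(k,:,:) = data(k : k+win-1, :);   % copy one 100x200 window
end
ds = arrayDatastore(chunks, 'IterationDimension', 1);  % one window per read
```

Every row of `data` gets duplicated into roughly 100 windows, which is where the memory pressure comes from.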
I also tried saving each of the 100x200 arrays into a grayscale png, and then creating an imageDatastore with the 79,900 images. This lets me have a larger batch size, but A) it takes about 8 hours to convert all the data into images, and B) training the CNN takes about 8-10 times longer (4-5 hours instead of 30 mins).
Is there a better way to do this?

1 Answer

Matt J on 23 Apr 2022
Edited: Matt J on 23 Apr 2022

0 votes

For example, the first value the datastore returns should be a 100x200 array, corresponding to rows 1:100 of the spreadsheet. The second value should be rows 2:101, then 3:102, etc.
An equivalent to that would be to make the CNN fully convolutional (if it isn't already) with input size 4000x200. Then, you could feed an entire spreadsheet as input at once.

11 comments

I'm not sure I quite follow this.
To give a little more background, the goal of this TCN is to classify an activity. The columns represent the spatial dimension of the signal, and the rows represent time. Each row is labeled with what activity the person was doing at that time. I'm hoping to create an algorithm that can classify the user's current activity using the current timestep plus the previous 99 timesteps.
Each spreadsheet represents 1 data collection session, where the user did 8 different activities.
Do you know of any similar examples?
The difference with that example is that each gesture is a discrete, 9000-timestep movement, whereas I am trying to make a continuous classifier.
Matt J on 23 Apr 2022
Edited: Matt J on 23 Apr 2022
It is a convolutional network, so separating the inputs into overlapping blocks is redundant. Consider the simplified example below. You can see that convolving the hypothetical weights with the whole input produces the same data as convolving the weights with separate overlapping row-blocks. Therefore, your training would just be doing unnecessary repeated computations if you break your spreadsheets up into overlapping blocks, not to mention the extra memory requirements.
input = reshape(1:12,4,3);
block1 = input(1:3,:);
block2 = input(2:4,:);
w = rand(2);   % random convolution weights
conv2(block1,w,'valid')
ans = 2×2
    6.3718   14.2150
    8.3326   16.1758
conv2(block2,w,'valid')
ans = 2×2
    8.3326   16.1758
   10.2934   18.1366
conv2(input,w,'valid')   % total input
ans = 3×2
    6.3718   14.2150
    8.3326   16.1758
   10.2934   18.1366
Yeah, I understand that my way involves doing redundant convolutions, but I don't understand how I can input the activity labels in the format that you suggest.
When I separate the data into chunks of 100, the label for each chunk corresponds to the label for the last row in that chunk. The input size of the first layer is a 100x200 array and the output of the last layer is a single categorical. The datastore containing the training data is a combinedDatastore, where the first underlying datastore is the 100x200 arrays, and the second underlying datastore is the labels.
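The combinedDatastore setup described here might look roughly like the sketch below (assuming `chunks` is the 79,900x100x200 window array and `labels` holds one categorical per window; the names are illustrative):

```matlab
dsX = arrayDatastore(chunks, 'IterationDimension', 1);  % 100x200 feature windows
dsY = arrayDatastore(labels);         % label of each window's last row
dsTrain = combine(dsX, dsY);          % each read returns {features, label}
```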
If I give an entire spreadsheet to the algorithm at once, can I only give one label to that entire spreadsheet, even though it consists of many different activities? I want to be able to make a classification for each timestep; I just think the algorithm needs more than the current timestep to make a good classification.
In your example, what is the input size of the first layer of the CNN? And what format is the datastore containing the features and labels?
Sorry if I'm missing something very basic. Your way sounds a lot better than mine, I just don't quite get how to implement it. Thanks for your help.
Matt J on 23 Apr 2022
In your example, what is the input size of the first layer of the CNN?
Well, that's just it: convolutional layers don't have a defined input size, because they are convolutional (though they do have edge-padding rules). However, in my example, you can, if you wish, think of the input size as 3x3 when I pass in block1 and block2 separately, and 4x3 when I pass in the whole input.
The input size of the first layer is a 100x200 array and the output of the last layer is a single categorical
If so, and if your network is fully convolutional, then you should be able to pass a 4000x200 array (containing 3901 overlapping sequences) to this same network. The output of the last layer should be a 3901x1 vector of categoricals. The cost function calculation needs to be adjusted to sum the costs over all 3901 classification results.
Thanks for all your help. I think I am on the same page with you conceptually now, but I have been working on it and still can't seem to implement it.
1) Is sequenceInputLayer with input size [200 1] and a MinLength of 100 the correct type of input layer? Initially, I was using an imageInputLayer, but it doesn't seem like that will allow me to make a vector of predictions.
sequenceInputLayer([numFeat 1], "Normalization","none", "MinLength",100, "Name","input")
2) If the sequenceInputLayer is correct, I get an error, because what was previously considered a spatial dimension is now a temporal dimension:
3) Next I tried to fix that error by eliminating the temporal dimension through convolution, but I get the same error even when the size of the temporal dimension is only 1.
4) So I tried to eliminate the spatial dimension by using a flatten layer. This eliminates the errors in the network itself, but when I train I get the error:
Error using trainNetwork
The training sequences are of feature dimension 4283 200 but the input layer expects sequences of feature dimension 200 1.
I don't really understand why it expects sequences of 200x1. I would have thought you could input sequences of 200x100 or even longer. Do you have any suggestions about the implementation?
Matt J on 25 Apr 2022
Edited: Matt J on 25 Apr 2022
The way I'm imagining it, you would go back to the imageInputLayer. The flatten layer and fully connected layer, though, would be removed and replaced with a single convolution2dLayer with padding 0 and stride 1. The number of output channels Nc should be the number of classes, and the spatial dimensions of the weights should be 100x200. If you give a 4000x200 input image, and all goes well, the output of this should be a 3901(S)x1(S) image with Nc channels.
For the output layers, instead of having softmax and classification layers, I think you want a pixelClassificationLayer.
We're viewing the 3901x1 conv layer output as an array of pixels and we want to classify each one.
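Putting the pieces of this suggestion together, the layer array might look something like the sketch below. The value of Nc and the softmaxLayer before the pixelClassificationLayer are my assumptions (segmentation networks conventionally keep a softmax ahead of the pixel classification layer), not something stated in the thread:

```matlab
Nc = 8;    % number of activity classes (assumed)
layers = [
    imageInputLayer([4000 200 1], 'Normalization','none')
    convolution2dLayer([100 200], Nc, 'Stride',1, 'Padding',0)  % 100x200 weights, Nc channels
    softmaxLayer
    pixelClassificationLayer];   % classifies each "pixel" of the 3901x1 output
```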
Okay that makes sense. The pixel classification layer is really helpful.
Can the image input layer accept images with varying heights? Although the trials are about 4000 timesteps long, sometimes they are as short as 3800 or as long as 4300. The spatial dimension stays the same (200).
I can "crop" the trials to all be the same length and still have enough data, but it would be nice to avoid that. Also, I eventually want to be able to use this algorithm to make real-time predictions using only 100 timesteps.
Matt J on 26 Apr 2022
Edited: Matt J on 26 Apr 2022
imageInputLayers have to be a preset size, though I am somewhat curious to see what would happen if you omitted it. I speculate that, because you have that batch normalization layer, you might not need the normalization that the imageInputLayer usually applies.
One thing that concerns me a bit is that you only have 1 hidden convolutional layer and no pooling is done. Normally, in CNN classification, you have a series of convolution and pooling layers, so your feature map's spatial dimensions get smaller with successive layers while the number of channels increases. That way, you don't have so many weights in the output layers to train. Currently, you have 2e4*Nc output weights. Is this specific network architecture something you got from the literature?
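For reference, the conventional pattern described here, alternating convolution and pooling so spatial size shrinks while channel count grows, looks roughly like this (layer sizes purely illustrative, not the poster's network):

```matlab
layers = [
    convolution2dLayer([5 5], 16, 'Padding', 0)   % spatial dims shrink by 4
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)             % spatial dims halve
    convolution2dLayer([5 5], 32, 'Padding', 0)   % channels grow 16 -> 32
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)];
```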
Oh, I just took out most of the layers so that it would be easy to see the input layer and the fully-connected layer in one screenshot. I am using an architecture very similar to this example: https://www.mathworks.com/help/deeplearning/ug/hand-gesture-classification-using-radar-signals-and-deep-learning.html
It seems that it is not possible to create a network without an input layer. I can make the network with an imageinputlayer of size 4000x200, train it, and then remove the input layer and replace it with an imageInputLayer with size 100x200.
I finally got to successful training with the simplified layer structure from my previous screenshots, but when I added back in more layers of convolution and pooling, I realized I need to be really careful setting the filter and pooling sizes, padding, and stride lengths to ensure that the time dimension shrinks from 4000 to exactly 3901. And I think that means that when I change the input size, it might not necessarily shrink the time dimension from 100 to 1, so I have to be really careful with the settings to achieve both goals.
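The bookkeeping being described follows the usual "valid" convolution size arithmetic. A quick sanity check (the formula only, not the poster's actual layers): with stride 1 and no padding, each layer shrinks the time dimension by filterLength-1, so if the total shrinkage across all layers is exactly 99, then 4000 -> 3901 and 100 -> 1 both hold automatically; pooling layers with stride > 1 are what break this equivalence.

```matlab
% Output length along one dimension of a conv/pool layer
outLen = @(in, filt, pad, stride) floor((in + 2*pad - filt)/stride) + 1;

outLen(4000, 100, 0, 1)   % 3901: full-trial output length
outLen(100,  100, 0, 1)   % 1:    single-window output length
```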
To be honest, I think I will just go back to my old method of copying the data into a 3d array, making an array datastore, and just deal with a small batch size to avoid memory errors. I think having more flexibility with the filter sizes and other parameters is worth it. Thank you for all your help, though.
Matt J on 27 Apr 2022
It seems that it is not possible to create a network without an input layer. I can make the network with an imageinputlayer of size 4000x200
You would have to turn off the normalization it is doing, in that case. The imageInputLayer's normalization is not a convolutional operation, so you can't get shift-invariant output if normalization is happening.



Version: R2022a
