How to process many structured arrays at once

I want to plot/process some test data, but it was stored in structured arrays with unhelpful / sequential names (Signal_000, Signal_001 ...Signal_157). I would like to sequentially extract the character array, make it a string, turn it into a variable name, and then assign the data in that structured array.
All the posts I've seen say don't do this (was considering eval). How should I go about doing this?

3 comments

Stephen23
Stephen23 on 30 Aug 2018
Edited: Stephen23 on 30 Aug 2018
"All the posts I've seen say don't do this (was considering eval)."
That is good advice. Magically accessing variable names is how beginners force themselves into writing slow, complex, buggy code that is hard to debug. Read this to know more:
"How should I go about doing this?"
Simple: don't get into the situation where you need to access variable names dynamically. Presumably you did not sit and write out all the names by hand, from 000 to 157, so they must have been load-ed or generated somehow. And that is where to fix your code! Simply load into an output variable (which itself is a structure):
S = load(...)
or if creating the variables in a loop then use indexing to allocate them directly into one cell array or structure or ND array or table or ...
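A minimal sketch of the load-into-a-structure idea, assuming one .mat file that holds sequentially named structures. The temporary file and the toy Signal_00x contents are made up for illustration and are not the real test data:

```matlab
% Toy data: build a small .mat file with sequentially named structures
% (file name and contents are illustrative only).
tmpfile = [tempname, '.mat'];
Signal_000 = struct('name', 'accel X', 'values', [1 2 3]);
Signal_001 = struct('name', 'accel Z', 'values', [4 5 6]);
save(tmpfile, 'Signal_000', 'Signal_001');
clear Signal_000 Signal_001

% Load into ONE output structure instead of straight into the workspace
S = load(tmpfile);
F = fieldnames(S);              % names of the variables stored in the file

% Loop over the fields using dynamic field names, no eval needed
names = cell(size(F));
for k = 1:numel(F)
    names{k} = S.(F{k}).name;
end
delete(tmpfile);
```

Every variable in the file becomes a field of S, so the sequential names turn into data that a loop can step through.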
RichardB
RichardB on 30 Aug 2018
Really, I'm not looking to write very efficient code. I just need a way to take a 100 odd struct files and turn them into 100 simple arrays (row) with a useful name I can recognize.
jonas
jonas on 30 Aug 2018
Edited: jonas on 30 Aug 2018
And you received several methods to solve your problem. If you are unable to solve it by now, then you should consider uploading a sample data set and be thankful that people are willing to help.
I'm guessing most people here (at least myself) get some satisfaction from helping people improve their coding. Solving a trivial problem by writing buggy code is not something I will spend my evening doing.


Accepted Answer

Stephen23
Stephen23 on 31 Aug 2018
Edited: Stephen23 on 31 Aug 2018
"I want to plot/process some test data, but it was stored in structured arrays with unhelpful / sequential names (Signal_000, Signal_001 ...Signal_157). I would like to sequentially extract the character array, make it a string, turn it into a variable name, and then assign the data in that structured array."
So far the actual data files have not been described very well: I am going to assume that the names of the .mat files correspond exactly to the names of the structures, that each file/structure has a different name, and that each file contains one structure only. This can be processed very easily, something like this:
D = 'path of the folder where the data files are';
S = dir(fullfile(D,'*.mat'));
C = cell(1,numel(S));
for k = 1:numel(S)
    T = load(fullfile(D,S(k).name));
    C(k) = struct2cell(T); % works only if the file holds exactly one variable
    C{k}.filename = S(k).name;
end
You now have all of the data in one cell array C, which is trivial to process in a loop. For example, your example data for file 0045 would then be
C{46}.y_values.quantity.g
C{46}.function_record.name
and the filename is always available:
C{46}.filename
If all of the structures contain the same fields, then you can easily convert this cell array into one non-scalar structure like this:
Z = [C{:}]
Your example data for file 0045 would then be
Z(46).y_values.quantity.g
Z(46).function_record.name
etc
and you can do neat things, like get all of the filenames in one cell array:
{Z.filename}
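A small sketch of that conversion on toy data, assuming every structure has the same fields. The peak field and its values are invented for illustration:

```matlab
% Toy cell array of scalar structures with identical fields
C = {struct('filename', 'Signal_000.mat', 'peak', 1.5), ...
     struct('filename', 'Signal_001.mat', 'peak', 0.7), ...
     struct('filename', 'Signal_002.mat', 'peak', 2.1)};

Z = [C{:}];                 % 1x3 structure array
names = {Z.filename};       % comma-separated list collected into a cell array
peaks = [Z.peak];           % numeric field collected into an ordinary vector

[~, idx] = max(peaks);      % index of the largest peak
worst = Z(idx).filename;    % file that the largest peak belongs to
```

Once the data sits in one structure array, whole-array tools such as max or sort work across all files at once.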

7 comments

Your code gives me a lot to look through. Thanks. I tried the first set, but it did not seem to work. Below is the result.
I tried to show what the structure of my files looks like in the text file I attached in this thread. I have 5 folders. Within each folder are 10-14 *.mat files. Each *.mat file has 100+ channels of data. Each channel is a struct, with three sub-struct, and further sub-structs/arrays. Unzipped about 22 Gb.
D = 'D:\MNR_Vibration\5_ODS_NewPipe\';
S = dir(fullfile('*.mat'));
C = cell(1,numel(S));
for k = 1:numel(S)
T = load(fullfile(D,S(k).name));
C(k) = struct2cell(T);
C(k).filename = S(k).name;
end
Unable to perform assignment because the left and right sides have a different number of elements.
If I can figure out how to trim one of the files I will attach it.
Stephen23
Stephen23 on 31 Aug 2018
Edited: Stephen23 on 31 Aug 2018
" I have 5 folders. Within each folder are 10-14 *.mat files."
Fine, no problem. Easy to deal with using dir or sprintf:
"Each *.mat file has 100+ channels of data. Each channel is a struct..."
Important question: Does each .mat file contain exactly the same channels or data? I.e. are the variable names within each .mat file the same? Or are they all sequential, or have some other non-repeating pattern?
"...with three sub-struct, and further sub-structs/arrays"
Let's walk before we start running: the first thing is to load the data into MATLAB. It would help most if you could upload some sample files: you can simply replace the field data with empty arrays, and save them in some new .mat files. That will save some space!
"I tried the first set, but it did not seem to work"
I would not expect it to work: I wrote clear assumptions in my answer, that your files do not fulfill. I had to make those assumptions, because earlier you had not given adequate information about the files.
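The error reported above most likely comes from the single-variable assumption: struct2cell returns one cell per variable, and several cells do not fit into one element of C. A short sketch with a toy structure standing in for the output of load:

```matlab
% A structure with several fields, as load returns for a multi-variable file
T = struct('Signal_000', 1, 'Signal_001', 2, 'Signal_002', 3);
C = cell(1, 1);

vals = struct2cell(T);      % 3x1 cell array, one cell per variable
try
    C(1) = vals;            % three cells into one element: this errors
    failed = false;
catch err
    failed = true;
    msg = err.message;      % text of the size-mismatch error
end

% Storing the whole 3x1 cell inside one cell works for any number of variables
C{1} = vals;
```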
RichardB
RichardB on 31 Aug 2018
I pulled some of the signal structs and put them in a sample *.mat file. Hopefully this helps. There are a lot of files and data. I do not want to process all of them. Just a few so I don't go crazy.
Stephen23
Stephen23 on 31 Aug 2018
Edited: Stephen23 on 31 Aug 2018
@RichardB: in my last comment I asked a question. In case you missed it, here it is again: Does each .mat file contain exactly the same channels or data?
Is the variable name sequence complete, or are there gaps? If so, how large?
Do the structure names all have leading zeros?
What I would like to know is how the .mat files relate to each other: do they contain the same variable names, or are they sequential, or what is their relationship?
RichardB
RichardB on 31 Aug 2018
The *.mat files vary in the number of channels, but the structure of each struct (Signal_000 to Signal_114 for example without gaps and no leading zeros past 100) is the same as far as I can tell. I am primarily interested in one file, so could modify the code to adjust to the others as needed, but their format is the same. I imagine they were all exported from LMS the same way.
Some of the links you provided suggest converting with struct2cell. That seems to work well as I can now sequence through the Signal structures where I could not figure it out before.
Stephen23
Stephen23 on 31 Aug 2018
Edited: Stephen23 on 31 Aug 2018
@RichardB: using struct2cell is a very good idea, together with fieldnames. Something like this works on your sample file (and a simple copy I made):
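D = '.'; % directory where the files are
S = dir(fullfile(D,'*.mat'));
C = cell(1,numel(S));
for k = 1:numel(S)
    T = load(fullfile(D,S(k).name)); % scalar structure, one field per variable
    F = fieldnames(T);               % variable names, e.g. Signal_005
    T = struct2cell(T);
    T = vertcat(T{:});               % Nx1 structure array (needs identical fields)
    [T.fieldname] = F{:};            % remember the original variable names
    [T.filedata] = deal(S(k));       % remember which file each element came from
    C{k} = T;
end
Z = vertcat(C{:});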
It returns one structure array Z which contains all of the file data, the filenames, and the fieldnames. So you can process these however you want. The output is easy to access, e.g. for the second element of the structure:
>> Z(2).filedata.name
ans =
SampleRun1.mat
>> Z(2).fieldname
ans =
Signal_005
>> Z(2).x_values
ans =
start_value: 7.66712283871882e-05
increment: 9.765625e-05
number_of_values: 4122210
quantity: [1x1 struct]
>> Z(2).x_values.increment
ans =
9.765625e-05
>> Z(2).x_values.number_of_values
ans =
4122210
You can easily extend this to work over multiple directories. You can add bells and whistles yourself, such as sorting based on the original fieldnames, etc. The test files are attached.
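One possible way to extend the loop over several folders is sketched below. It assumes R2016b or later, where dir accepts the '**' wildcard and returns a folder field; the throwaway folder tree and its contents are invented for illustration:

```matlab
% Build a throwaway folder tree with two sub-folders (illustrative only)
root = tempname;
mkdir(fullfile(root, 'run1'));
mkdir(fullfile(root, 'run2'));
Signal_000 = struct('x_values', struct('increment', 0.1));
Signal_001 = struct('x_values', struct('increment', 0.2));
save(fullfile(root, 'run1', 'A.mat'), 'Signal_000', 'Signal_001');
save(fullfile(root, 'run2', 'B.mat'), 'Signal_000');

% '**' makes dir search all sub-folders (needs R2016b or later)
S = dir(fullfile(root, '**', '*.mat'));
C = cell(1, numel(S));
for k = 1:numel(S)
    T = load(fullfile(S(k).folder, S(k).name)); % folder differs per file
    F = fieldnames(T);
    T = struct2cell(T);
    T = vertcat(T{:});
    [T.fieldname] = F{:};
    [T.filedata] = deal(S(k));
    C{k} = T;
end
Z = vertcat(C{:});          % one structure array across both folders
rmdir(root, 's');           % remove the throwaway tree again
```

Because filedata keeps the whole dir entry, Z(k).filedata.folder tells you which folder each signal came from.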
RichardB
RichardB on 31 Aug 2018
Thanks much. I think that will get me where I want to go.


More Answers (3)

jonas
jonas on 30 Aug 2018
Edited: jonas on 30 Aug 2018
As you've already read, do not. You already have your data in a struct, which is much more convenient than indexed variables. You can use dynamic field names to call your data from the struct. It's one of the common options for avoiding the infamous eval function.
Basic syntax is:
F='Signal_000'
out=MyStruct.(F)
It may, however, be even more convenient to put your data in a cell array, in which case you can use struct2cell.
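Both options can be sketched on a toy structure. MyStruct stands in for the loaded data, and its field contents are made up:

```matlab
% Toy stand-in for the loaded data (field contents are made up)
MyStruct = struct('Signal_000', [1 2 3], 'Signal_001', [4 5 6 7]);

% Dynamic field name built from a number, without eval
k = 1;
F = sprintf('Signal_%03d', k);      % zero-padded name: 'Signal_001'
out = MyStruct.(F);

% Alternative: drop the names and index by position instead
C = struct2cell(MyStruct);          % 2x1 cell array, one cell per field
lens = cellfun(@numel, C);          % number of samples per signal
```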
RichardB
RichardB on 30 Aug 2018


I got this data from a 3rd party, so I do not have any control over that. I need to figure out which Signal_??? has acceleration data for turbo 1 in the Z axis (found somewhere in the struct Signal_???). Maybe I could simply find a way to sequentially rename the struct arrays (so I can determine which one I want to investigate), so I don't have to interrogate each one individually.

1 comment

jonas
jonas on 30 Aug 2018
Edited: jonas on 30 Aug 2018
I mean, you already received several solutions for how to process your data. However, the actual structure of the data remains a bit ambiguous so it's difficult to give you more specific, detailed, solutions. If you tell us exactly what you want to do and upload the data, then we can give you some actual useable code.
It's difficult for us, who have no knowledge of your background, to understand what "acceleration data for turbo 1 in the Z axis" actually means and how to help you find it.


RichardB
RichardB on 30 Aug 2018


Thanks for the information from everyone...internet is amazing. I read some of the stuff about not using eval, but it remains the only way I see to do what I want. No doubt I am not understanding something here...and I need to read a lot of stuff.
I attached what the data structure looks like. For example, I want to plot time vs a particular acceleration, but I have no idea which struct it is in. Maybe there was a different way to bring it in from the *.mat file.
I am not processing the data, per se. I am trying to interrogate it.
I want to determine which signal holds the data I want. The name of the data (which is everything to me) is found in Signal_???.function_record.name. The data is in Signal_???.y_values.values.

3 comments

jonas
jonas on 30 Aug 2018
Edited: jonas on 30 Aug 2018
I mean, no one is going to stop you from using eval. You can read about other options, but also about how to name variables dynamically, here. Eval or not, I think no one can give you more details unless you describe precisely and explicitly what it is that you are trying to achieve. To me, it still remains unclear.
Looking at your data, the structure is a little bit more complicated than it seemed before. It seems you have two layers of structs and you want to extract the name of the series (stored in the field function_record.name) and the corresponding data (stored in y_values.values). So far it is clear.
What do you want to do with this data? Are you only searching for a specific name and the respective data? Do you want to plot each series of data? Do you want to structure the data in a more convenient way, so that you can access it more easily by the name of the series and the corresponding data?
Finally, as SC pointed out earlier, the way to read the .mat file is:
data=load('yourmatfile.mat');
This way you can start from a single struct and search it for whatever data you want without typing out every variable name by hand. Will happily help you with that if you explain in detail what to do. In my experience from this forum, well-described problems with data are solved within hours, while other questions remain unanswered for days because the problem is unclear and it takes 10 iterations of answers before nailing the desired output.
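A hedged sketch of such a search, using the two field paths mentioned in this thread (function_record.name and y_values.values). The toy structure stands in for data = load('yourmatfile.mat'), and the channel names and values are invented:

```matlab
% Toy stand-in for data = load('yourmatfile.mat'); names/values invented
data.Signal_000.function_record.name = 'Turbo1 accel X';
data.Signal_000.y_values.values      = [0.1 0.2];
data.Signal_001.function_record.name = 'Turbo1 accel Z';
data.Signal_001.y_values.values      = [0.3 0.4];

F = fieldnames(data);
% Collect every channel name into one cell array
names = cellfun(@(f) data.(f).function_record.name, F, 'UniformOutput', false);

% Find the channel(s) whose name mentions the wanted quantity
hit = find(~cellfun(@isempty, strfind(names, 'accel Z')));
y = data.(F{hit(1)}).y_values.values;   % data of the first matching channel
```

Printing names once gives an overview of all channels, which may already be enough to pick the few signals worth plotting.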
RichardB
RichardB on 30 Aug 2018
Sorry this is dragging on and I am not clear.
"Do you want to structure the data in a more convenient way, so that you can access it more easily by name of the series the corresponding data?" Yes.
I may plot all the data, but generally that is unhelpful. It depends on what I find. Maybe figuring out how to load the *.mat file and search for the data I want would help, but I don't understand how that would work given how it is structured.
How will I use the data? I will plot acceleration vs time (find the worst ones), acceleration vs rpm, create a Campbell plot, FFT, compare 4 vertical accelerations to see if it is pitching, rolling, translating, or twisting, check transmissibility to see which structure is exciting the one that has a problem. I can double integrate and AC couple the acceleration data to get "stationary" displacement. But I don't want to do this to everything.
Stephen23
Stephen23 on 31 Aug 2018
Edited: Stephen23 on 31 Aug 2018
"Really, I'm not looking to write very efficient code. I just need a way to take a 100 odd struct files and turn them into 100 simple arrays (row) with a useful name I can recognize."
Whether you are just investigating your data or trying to write an efficient tool for processing lots of files, it does not change the fact that magically accessing variable names is a bad idea. Data imported using magically named variables cannot be accessed easily in a loop, which is why it forces you to continue to use slow and complex methods to access your data. In contrast, when you import your data using better methods, accessing it is simpler, and there are lots of neat tools that help you to process your data, which only work on a whole array and not on lots of separate variables.
The upshot is, this is not just about making more efficient code, as you simply dismissed earlier, but also about making accessing your data easier, and means you can use lots of tools that will help you to analyze your data.
" Maybe figuring out how to load the *.mat file and search for the data I want would help, but I don't understand how that would work given how it is structured."
It would help. To start with, you need to load the files into output variables:
S = load(...)
What to do next depends on exactly what the .mat files contain. It would help us a lot, if you accurately described how you get the data, in particular:
  • is the data stored in one .mat file, or lots of them?
  • how many variables are stored in each .mat file?
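Both questions can be answered without loading any data, which may matter with files this large. A minimal sketch using whos with the '-file' option; the throwaway file below stands in for one of the real .mat files:

```matlab
% Throwaway file standing in for one of the real .mat files
tmpfile = [tempname, '.mat'];
Signal_000 = struct('a', 1);
Signal_001 = struct('a', 2);
save(tmpfile, 'Signal_000', 'Signal_001');

info = whos('-file', tmpfile);   % one element per stored variable
nvars = numel(info);             % how many variables the file holds
varnames = {info.name};          % their names
classes = {info.class};          % their classes, e.g. 'struct'
delete(tmpfile);
```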

