Uploading .mat files that contain nested structs into a datastore

I would like to upload a .mat file that contains nested structs into a datastore, like so:
TestData.Unit1.Parameter 1
TestData.Unit1.Parameter 2
TestData.Unit2.Parameter 1
TestData.Unit2.Parameter 2
When I try to upload this into a datastore, I cannot read the contents of the fields inside the nested structs. The .mat files are enormous, and loading even one parameter takes a really long time. How do I accomplish this while still using a datastore or something similar? I know we are supposed to use fileDatastore with a custom read function, but the read function I created only shows the first level of the structs.

2 Comments

Matt J on 20 Feb 2024
Edited: Matt J on 20 Feb 2024
When you say "upload" do you truly mean that the datastore reads from files stored on a remote computer? If so, maybe the reading time is predominantly due to the remote connection, rather than the file size.
The files are stored on a network of computers. So is there no way to "point" to the data and call specific parameters as needed?


Accepted Answer

Catalytic on 20 Feb 2024
Edited: Catalytic on 20 Feb 2024
I think the wisest course would be to reorganize your data and assign each "Unit" to its own separate .mat file.
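A minimal sketch of that reorganization, assuming the file holds one top-level struct named TestData whose fields are the Units, as in the question. The file names, the output folder, and the demo values are hypothetical, and the demo file stands in for the real (much larger) one.

```matlab
% Build a small stand-in for the real file (hypothetical names and values).
srcFile = fullfile(tempdir, "datafile_demo.mat");
TestData.Unit1.Parameter1 = rand(3);
TestData.Unit1.Parameter2 = rand(4);
TestData.Unit2.Parameter1 = rand(5);
TestData.Unit2.Parameter2 = rand(6);
save(srcFile, "TestData");
clear TestData

% Writable destination folder (assumption: somewhere you control).
outDir = fullfile(tempdir, "units_demo");
if ~isfolder(outDir)
    mkdir(outDir);
end

% One full read of the source variable, done once.
S = load(srcFile, "TestData");
unitNames = fieldnames(S.TestData);

for k = 1:numel(unitNames)
    unit = S.TestData.(unitNames{k});   % scalar struct of parameters
    % -struct saves each field of 'unit' as its own top-level variable,
    % so a later load can request a single parameter by name.
    save(fullfile(outDir, unitNames{k} + ".mat"), "-struct", "unit", "-v7.3");
end

% Later: load one parameter from one small file.
P = load(fullfile(outDir, "Unit1.mat"), "Parameter2");
```

The source file is only read, never modified, so this also works when the original data is write-protected.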

3 Comments

Normally I would agree, but I was given specific instructions not to create any more .mat files because I am loading from a specific store of data.
The data is write-protected.
It wouldn't matter if the data is write-protected. You could read the data, reorganize it, and copy the reorganized data somewhere controlled by you, where you would have full read/write access.
If for some reason you can't do that, so be it. However, one of the upsides of @Catalytic's suggestion is that if you split the data this way, you could shuffle the Parameter instances. In the form you have things now, the shuffle() command can only vary the read-out order of the files, not the order of the Parameters within the files.
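To illustrate the shuffling point, here is a small sketch that assumes the data has already been split into one file per Parameter instance and that your release's fileDatastore supports shuffle(). The folder name, file names, and the variable name value are hypothetical.

```matlab
% Create three tiny one-variable files as stand-ins for per-Parameter files.
demoDir = fullfile(tempdir, "shuffle_demo");
if ~isfolder(demoDir)
    mkdir(demoDir);
end
for k = 1:3
    value = k;
    save(fullfile(demoDir, "Parameter" + k + ".mat"), "value");
end

% One file per read, so shuffling the files shuffles the Parameter instances.
fds = fileDatastore(fullfile(demoDir, "*.mat"), "ReadFcn", @(f) load(f).value);
fdsShuffled = shuffle(fds);

% Read everything back; the order is random, the set of values is not.
values = zeros(1, 3);
for k = 1:3
    values(k) = read(fdsShuffled);
end
```

With several Parameters packed into one file, the same call would only reorder the files, which is the limitation described above.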


More Answers (1)

Matt J on 20 Feb 2024
This example shows how to use a fileDatastore to read partial data from a .mat file.
I might modify their example and use a matfile object to read the data without actually loading the entire file.

7 Comments

Yes I tried that specific example. I tried using the matfile but it doesn't see any variables.
fds = fileDatastore("C:\Users\******\Desktop\datafile.mat","ReadMode","partialfile","ReadFcn",@load_variable)
function [data,variables,done] = load_variable(filename,variables)
    % If variable list is empty,
    % create list of variables from the file
    if isempty(variables)
        variables = who('-file', filename);
    end
    % Load a variable from the list of variables
    data = matfile(filename, variables{1});
    % Remove the newly-read variable from the list
    variables(1) = [];
    % Move on to the next file if this file is done reading.
    done = isempty(variables);
end
read(fds)
This is the error back:
Error using matlab.io.datastore.FileDatastore/read (line 29)
Error using ReadFcn @load_variable for file:
C:\Users\********\Desktop\datafile.mat
Error using matlab.io.MatFile (line 422)
No value was given for 'Data'. Name-value pair arguments require a name followed by a value.
Error in matfile (line 75)
mf = matlab.io.MatFile(varargin{:});
Error in load_variable (line 10)
data = matfile(filename, variables{1});
data = matfile(filename, variables{1});
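That is the wrong way to use matfile. It takes the file name (optionally followed by name-value pairs such as 'Writable'), not a variable name, which is why it complained about a name with no value. The proper way to use matfile is: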
matObj = matfile(filename);
data = matObj.(variables{1});
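Putting that correction into the read function from the previous comment gives something like the sketch below. The demo file and its variables A and B are hypothetical stand-ins so the example runs on its own.

```matlab
% Demo file with two top-level variables (hypothetical names and values).
demoFile = fullfile(tempdir, "matfile_demo.mat");
A = magic(3);
B = 1:5;
save(demoFile, "A", "B");

fds = fileDatastore(demoFile, "ReadMode", "partialfile", ...
    "ReadFcn", @load_variable);

first  = read(fds);   % value of the first variable listed by who
second = read(fds);   % value of the next variable

function [data, variables, done] = load_variable(filename, variables)
    % On the first call for a file, list its top-level variables.
    if isempty(variables)
        variables = who('-file', filename);
    end
    % matfile takes the file name only; index the returned object
    % with a dynamic field name to read one variable.
    matObj = matfile(filename);
    data = matObj.(variables{1});
    % Drop the variable just read; the file is done when none remain.
    variables(1) = [];
    done = isempty(variables);
end
```

Note that this still returns whole top-level variables, so a nested struct comes back as a struct.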
Matt J on 20 Feb 2024
Edited: Matt J on 20 Feb 2024
If the instances to be read are the nested Parameterx fields, then I think you would need something more elaborate than the literal example from the doc:
function [data,tasks,done] = load_variable(filename,tasks)
    % Initialize tasks
    if isempty(tasks)
        tasks.unitList = who('-file', filename);
    end
    % Initialize current Unit
    if ~isfield(tasks,'currentUnit')
        tasks.currentUnit = load(filename, tasks.unitList{1}).(tasks.unitList{1});
        tasks.paramList = fieldnames(tasks.currentUnit);
    end
    % Load a piece of data
    data = tasks.currentUnit.(tasks.paramList{1});
    % Remove the read-out parameter from paramList
    tasks.paramList(1) = [];
    if isempty(tasks.paramList) % Check if we are finished processing a Unit
        tasks = rmfield(tasks,{'currentUnit','paramList'});
        tasks.unitList(1) = [];
    end
    done = isempty(tasks.unitList); % Check if all Units in the file have been processed
    if done
        tasks = [];
    end
end
>> read(fds)
ans =
struct with fields:
Para1: [1×1 struct]
Para2: [1×1 struct]
That worked. How do you get to the actual data within the struct?
See above.
"How do you get to the actual data within the struct?"
Based on what you show in your comment, perhaps something like this:
S = read(fds);
S.Para1
S.Para2
"Based on what you show in your comment, perhaps something like this:"
Yes, but the OP would like the read() operation to return this directly.
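One way to have read() return each nested value directly is sketched below. It assumes the layout from the question: a single top-level struct named TestData, whose fields are Units, each holding Parameter fields. The function name read_parameter, the demo file, and the demo values are hypothetical.

```matlab
% Demo file mirroring the layout in the question (hypothetical values).
demoFile = fullfile(tempdir, "nested_demo.mat");
TestData.Unit1.Parameter1 = 11;
TestData.Unit1.Parameter2 = 12;
TestData.Unit2.Parameter1 = 21;
TestData.Unit2.Parameter2 = 22;
save(demoFile, "TestData");
clear TestData

fds = fileDatastore(demoFile, "ReadMode", "partialfile", ...
    "ReadFcn", @read_parameter);

values = [];
while hasdata(fds)
    values(end+1) = read(fds); %#ok<SAGROW> each read yields one parameter
end

function [data, state, done] = read_parameter(filename, state)
    % First call for this file: load the top-level struct once and
    % build a list of (unit, parameter) name pairs to walk through.
    if isempty(state)
        S = load(filename, "TestData");   % variable name is an assumption
        state.root  = S.TestData;
        state.pairs = cell(0, 2);
        units = fieldnames(state.root);
        for u = 1:numel(units)
            params = fieldnames(state.root.(units{u}));
            for p = 1:numel(params)
                state.pairs(end+1, :) = {units{u}, params{p}};
            end
        end
    end
    % Return the next parameter's value itself, not the enclosing struct.
    data = state.root.(state.pairs{1, 1}).(state.pairs{1, 2});
    state.pairs(1, :) = [];
    done = isempty(state.pairs);
    if done
        state = [];   % reset so the next file starts fresh
    end
end
```

This keeps the whole TestData struct in the datastore's user data while a file is being read, so each file is loaded once rather than once per parameter, at the cost of holding it in memory until the file is finished.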


Release: R2022a
