Is there a way to identify group names in an H5 file programmatically without using h5info?

I am working with large (0.5-2 GB) and complex h5 data files and trying to identify the high level group names in the files. The names of the groups change for each file, so I need to be able to programmatically identify them. Below the high level of groups, the file structure is consistent, so I can efficiently use h5read once I have these 5-10 group names. Using hdf5info works, but is very slow because it is scanning the entire file and giving me much more metadata than I really care about (each high level group has thousands of nested lower level groups/datasets/attributes). The MATLAB recommended h5info is much slower for some reason. In fact, I have never actually let it run to completion, usually giving up after half an hour.
I have also tried setting the "ReadAttributes" bool to FALSE which for some reason made hdf5info take even longer to run. Is there a more efficient way to identify only the top level of group names in the h5 file?
Thanks,

Accepted Answer

Jacob on 23 Nov 2022
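I finally figured it out by using the low-level H5 functions built into MATLAB (H5G.get_objname_by_idx). This is dramatically faster than running the full hdf5info, because it only asks for the names of the objects directly under the root instead of scanning the whole file.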
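fid = H5F.open('test.H5');   % opens the file read-only by default
group_names = {};            % initialize so an empty file does not error below
idx = 0;                     % HDF5 object indices are zero-based
while true
    this_name = H5G.get_objname_by_idx(fid,idx);
    if isempty(this_name)    % no more objects at this level
        break
    end
    group_names{idx+1,1} = this_name;
    idx = idx+1;
end
num_grps = length(group_names);
H5F.close(fid);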
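A possible variant of the loop above, as a minimal sketch: instead of relying on an empty name to signal the end, it asks the root group how many objects it holds with H5G.get_num_objs and loops over exactly that range. The demo file and the names run_A and run_B are hypothetical and exist only so the snippet runs on its own. Both H5G calls belong to the older HDF5 1.6-style interface, so their availability in every MATLAB release is an assumption here.

```matlab
% Build a tiny demo file so the sketch is self-contained (hypothetical names).
demo_file = [tempname '.h5'];
h5create(demo_file, '/run_A/values', [1 3]);   % also creates group /run_A
h5write(demo_file, '/run_A/values', [1 2 3]);
h5create(demo_file, '/run_B/values', [1 3]);   % also creates group /run_B
h5write(demo_file, '/run_B/values', [4 5 6]);

fid = H5F.open(demo_file);                 % read-only by default
gid = H5G.open(fid, '/');                  % root group
n   = double(H5G.get_num_objs(gid));       % objects directly under '/'

top_names = cell(n, 1);                    % preallocate the result
for idx = 0:n-1                            % HDF5 indices are zero-based
    top_names{idx+1} = H5G.get_objname_by_idx(gid, idx);
end

H5G.close(gid);
H5F.close(fid);
delete(demo_file);                         % clean up the demo file
```

Only the root group's own entries are read, so the cost should depend on the number of top-level objects rather than on the thousands of nested objects below them.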
1 Comment
John Wolter on 3 Jul 2024
Edited: John Wolter on 3 Jul 2024
Note that Jacob's routine does not test the object names found to see if they are actually groups. I wrote a version of this to traverse the full groups tree and discovered that H5G.get_objname_by_idx found groups, datasets, and datatypes in my file. There were no links in my file, so I don't know if it would find those as well.
I used the routine below to distinguish between datasets and groups, but I wouldn't be surprised if there is a more elegant solution.
try
    did = H5D.open(fid, full_name);
    % dataset open succeeded; do action appropriate for a dataset
    ...
catch
    try
        gid = H5G.open(fid, full_name);
        % group open succeeded; do action appropriate for a group
        ...
    catch
        ...
    end
end
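One alternative to the nested try/catch, sketched here under assumptions rather than offered as a verified solution: H5G.get_objtype_by_idx reports the type of each entry, which can be compared with the H5G_GROUP constant obtained from H5ML.get_constant_value. The demo file and the names grp_1 and top_level_data are hypothetical. That the returned type compares equal to that constant is assumed, and how links are reported has not been checked.

```matlab
% Demo file with one group and one dataset under the root (hypothetical names).
demo_file = [tempname '.h5'];
h5create(demo_file, '/grp_1/values', [1 2]);     % creates group /grp_1
h5create(demo_file, '/top_level_data', [1 2]);   % dataset directly under '/'

fid = H5F.open(demo_file);                       % read-only by default
gid = H5G.open(fid, '/');                        % root group
group_type = H5ML.get_constant_value('H5G_GROUP');

n = double(H5G.get_num_objs(gid));               % objects directly under '/'
group_names = {};
for idx = 0:n-1                                  % HDF5 indices are zero-based
    % Keep the entry only if its reported type is "group".
    if H5G.get_objtype_by_idx(gid, idx) == group_type
        group_names{end+1,1} = H5G.get_objname_by_idx(gid, idx);
    end
end

H5G.close(gid);
H5F.close(fid);
delete(demo_file);                               % clean up the demo file
```

This avoids opening each object and using an exception for control flow. The same comparison against H5G_DATASET would pick out datasets instead.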

Version

R2019a
