Import part of dataset in a HDF5 file, by 'member' and/or 'logical array'
한범 on 21 Jun 2022
Answered: Walter Roberson on 22 Jun 2022
I am trying to open fairly large (~3 GB) HDF5 files in MATLAB and process them in parallel. However, the files are so big that loading takes a very long time and fills the workspace until MATLAB runs out of RAM, so I am hoping there is a way to open just a small part of the matrix.
For example, if I do h5disp('Data.h5'), I get:
Member 'A': H5T_STD_U32LE (uint32)
Member 'B': H5T_STD_U32LE (uint32)
Member 'C': H5T_STD_U64LE (uint64)
Member 'D': H5T_ARRAY
Base Type: H5T_STD_U16LE (uint16)
Member 'E': H5T_ARRAY
Base Type: H5T_STD_U32LE (uint32)
It seems that with the high-level function 'h5read()' I can import the data chunk by chunk. However, each chunk contains all of the members A, B, C, D, and E. Member E accounts for most of the size and is the reason for the long import time. Is there a way to import only A, B, or C without loading D and E?
I also have one more problem. I know that 'h5read()' lets me import only some of the chunks in the file, in the form h5read(filename,ds,start,count,stride). However, it seems that 'stride' can only be a single integer. Can I import the portion of the data defined by an index array such as [1,100,121,400,3254,...] or a logical array such as [1 0 0 1 0 1 0 ...]?
I tried to work this out myself and even looked into the low-level functions, but it is beyond my ability. Many people seem to have asked similar questions in this community already, but I found no satisfying answer to this problem. If anyone can help, please answer.
MJFcoNaN on 21 Jun 2022
The "start, count, stride" arguments are well suited to slicing a huge matrix. For example, the following reads only a single "vector" from a 2-D matrix, so much less RAM is needed.
% fix 2nd dimension
data=h5read('yourfile','needed dataset',[1 1],[inf 1]);
% or fix 1st dim
data=h5read('yourfile','needed dataset',[1 1],[1 inf]);
Then you can work with it in MATLAB.
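For selections that start/count/stride cannot express, such as an index list or a logical mask, one possible route is a point selection through the low-level interface. The following is only a sketch under assumptions: the dataset is taken to be one-dimensional, '/dataset' is a placeholder path (the real path is not given in the question), and 'idx' is a made-up index list. 'H5ML_DEFAULT' asks MATLAB to choose the memory type; another memory datatype identifier could be passed in its place.

```matlab
% Sketch: read arbitrary elements of a 1-D dataset by index list.
% Assumptions: '/dataset' is a placeholder path; idx is illustrative.
filename = 'Data.h5';
dsetname = '/dataset';

idx = [1 100 121 400 3254];      % 1-based MATLAB indices
% For a logical mask, convert it first: idx = find(mask);

fid  = H5F.open(filename, 'H5F_ACC_RDONLY', 'H5P_DEFAULT');
dset = H5D.open(fid, dsetname);

% Select the wanted points in the file dataspace.
% HDF5 coordinates are zero-based; coord is rank-by-numPoints (here 1-by-N).
filespace = H5D.get_space(dset);
coord = reshape(idx - 1, 1, []);
H5S.select_elements(filespace, 'H5S_SELECT_SET', coord);

% Memory dataspace that holds exactly the selected points.
memspace = H5S.create_simple(1, numel(idx), []);

data = H5D.read(dset, 'H5ML_DEFAULT', memspace, filespace, 'H5P_DEFAULT');

% Release the identifiers.
H5S.close(memspace);
H5S.close(filespace);
H5D.close(dset);
H5F.close(fid);
```

The selection only limits what is returned to MATLAB. If the dataset is stored in chunks, the HDF5 library may still read every chunk that contains a selected element, so widely scattered indices can remain slow even though memory use stays small.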
Walter Roberson on 22 Jun 2022
The approach seems to be to use the H5T functions to create a prototype compound datatype containing only the members that you want to read, and then pass that prototype to the low-level read routine, H5D.read.
This is not convenient, but it does appear to be possible.
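As a rough illustration of that idea, the sketch below builds a compound memory type holding only members A, B, and C and passes it to H5D.read. It has not been run against the file in question. The member names and integer widths are taken from the h5disp output in the question, while '/dataset' is a placeholder for the real dataset path, which the question does not give.

```matlab
% Sketch: read only members A, B and C of a compound dataset.
% Assumption: '/dataset' is a placeholder for the real dataset path.
filename = 'Data.h5';
dsetname = '/dataset';

fid  = H5F.open(filename, 'H5F_ACC_RDONLY', 'H5P_DEFAULT');
dset = H5D.open(fid, dsetname);

% Native in-memory types matching the wanted members.
tA = H5T.copy('H5T_NATIVE_UINT32');
tB = H5T.copy('H5T_NATIVE_UINT32');
tC = H5T.copy('H5T_NATIVE_UINT64');
szA = H5T.get_size(tA);
szB = H5T.get_size(tB);
szC = H5T.get_size(tC);

% Compound memory type with just these three members.
% Member names must match the names stored in the file.
memtype = H5T.create('H5T_COMPOUND', szA + szB + szC);
H5T.insert(memtype, 'A', 0,         tA);
H5T.insert(memtype, 'B', szA,       tB);
H5T.insert(memtype, 'C', szA + szB, tC);

% Members missing from memtype (D and E) are not returned to MATLAB.
data = H5D.read(dset, memtype, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT');

% Release the identifiers.
H5T.close(memtype);
H5T.close(tA);
H5T.close(tB);
H5T.close(tC);
H5D.close(dset);
H5F.close(fid);
```

If this works as expected, 'data' should be a struct with fields A, B, and C. The two 'H5S_ALL' arguments read every element of the dataset; dataspace identifiers could be passed there instead to restrict the read to part of it.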