Speed up loading a single variable from a large mat file

Wonsang You on 18 Jul 2017
Commented: Walter Roberson on 25 Sep 2018
I was trying to load a single variable 'data' from a large .mat file (>2GB) using the following command in MATLAB R2016b on Mac OS.
data = load('db.mat','-mat','data');
However, loading this single variable was very slow. Note that the variable 'data' is not a matrix but a structure. I have already read the previous discussion on this topic, but it did not offer a solution. Could anyone please suggest an alternative way to speed up loading a single variable from a large MAT file? Thank you in advance for your help.
  1 comment
per isakson on 18 Jul 2017
Edited: per isakson on 18 Jul 2017
  • ">2GB" means the file was saved as '-v7.3', I assume.
  • '-nocompression' improves speed, but not by much.
  • AFAIK, '-v7.3' is slow. (I haven't tested that since R2013a.)
  • Try HDF5 files with the 'ChunkSize' option of h5create (default: not chunked).
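
A minimal sketch of the chunked HDF5 suggestion, for illustration only. It assumes the data of interest can be stored as a plain numeric array (the struct in the question would have to be split into such arrays first); the file name, the dataset name '/A', and all sizes are made up.

```matlab
% Hypothetical file and dataset names; adjust to your own data.
fname = fullfile(tempdir, 'demo_chunked.h5');
if exist(fname, 'file'), delete(fname); end

A = rand(1000, 200);   % stand-in for one numeric field of 'data'

% Create a chunked dataset. Without 'ChunkSize' the dataset is not chunked.
h5create(fname, '/A', size(A), 'ChunkSize', [100 200]);
h5write(fname, '/A', A);

% Read only rows 101..200: start index and count are given per dimension.
part = h5read(fname, '/A', [101 1], [100 200]);
```

Whether chunking actually helps depends on how well the chunk shape matches the access pattern, so it is worth timing a few shapes on real data.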


Accepted Answer

Jan on 18 Jul 2017
Extracting a single variable from a large MAT file takes time. There is no magic acceleration. Because the data in v7.3 files is compressed, the complete file, or at least a large part of it, has to be decompressed to extract your variable.
The clean solution would be not to use MAT files for this job.
  2 comments
Rahimeh Rouhi on 25 Sep 2018
I have the same problem. You are right, using MAT files is time-consuming. How can I fix the problem?
Thank you for your guidance.
Walter Roberson on 25 Sep 2018
For -v7.3 MAT files there is now a -nocompression option.
It applies only to -v7.3.
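
A small sketch of saving without compression, assuming a release that supports the flag (R2017a or later, as far as I know, so not the R2016b mentioned in the question). The struct contents and the file name are invented for the example.

```matlab
% Hypothetical stand-in for the large struct from the question.
data = struct('signal', rand(500, 500), 'label', 'demo');
fname = fullfile(tempdir, 'db_nocomp.mat');

% -nocompression is only accepted together with -v7.3.
save(fname, 'data', '-v7.3', '-nocompression');

% load with an output argument returns a struct with one field per
% variable, so the variable itself is S.data.
S = load(fname, 'data');
data2 = S.data;
```

The uncompressed file is larger on disk, so this trades disk space for load time.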
A -v7 (not -v7.3) MAT file can be larger than 2 gigabytes as long as no single variable is over 2 gigabytes. However, compression happens automatically for -v7 MAT files as well.
Sometimes it helps to use matfile with -v7.3 files. This can help in accessing members of a struct array. Reading between the lines of the documentation, it might be able to load a single field of a struct array, but anything below that level would be loaded as a whole.
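
A hedged sketch of the matfile approach. The variable names 'data' and 'big' and the file name are made up; the example only shows a partial read of a numeric array and a whole read of a struct variable, and does not try to index into struct fields.

```matlab
% Build a small -v7.3 file with hypothetical contents.
fname = fullfile(tempdir, 'db_matfile.mat');
data = struct('signal', rand(500, 500), 'label', 'demo');
big  = rand(2000, 50);
save(fname, 'data', 'big', '-v7.3');

m = matfile(fname);        % opens the file without loading any variable
info = whos(m);            % variable names and sizes, still nothing loaded

rows = m.big(1:10, :);     % partial read: only the first 10 rows of 'big'
d    = m.data;             % the struct variable is read as a whole here
```

For a struct like the one in the question, the gain mostly comes from skipping the other variables in the file, not from reading part of the struct.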
For pure numeric arrays it sometimes makes sense to write them as raw binary files and access them with memmapfile(). This can also be done for struct arrays, but in each case the data written per entry must have a fixed size.
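
A minimal sketch of the raw binary plus memmapfile() idea for a single numeric array. The file name, the array size, and the field name 'A' are assumptions made for the example.

```matlab
% Write a numeric array as raw doubles (column-major, no header).
fname = fullfile(tempdir, 'demo_raw.bin');
A = rand(1000, 20);
fid = fopen(fname, 'w');
fwrite(fid, A, 'double');
fclose(fid);

% Map the file. The format describes one fixed-size record:
% a 1000-by-20 double array exposed as the field 'A'.
mm = memmapfile(fname, 'Format', {'double', size(A), 'A'});

% Only the pages that are touched get read from disk.
col = mm.Data.A(:, 3);
```

The size and type are not stored in the file, so they have to be known or recorded elsewhere when the file is mapped again.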
