How to efficiently integrate big data without using memory / (How to create big data)
Mehmet OZC
on 18 Aug 2015
Commented: Mehmet OZC
on 19 Aug 2015
- In a study I will produce large arrays.
- Each array will be at least 500 MB.
- Each array will have the same number of rows.
- The total size of the dataset will be approximately 20 GB or more.
- Somehow I have to create a single variable/array of about 20 GB that contains all of the data.
matfile seems like a good solution. However, it gets slower as the file grows. How can I handle this problem?
9 comments
Walter Roberson
on 18 Aug 2015
I wonder if compression is causing the slowdown? I do not know whether -v7.3 with matfile uses compression; see the discussions at http://www.mathworks.com/matlabcentral/answers/15521-matlab-function-save-and-v7-3 and http://www.mathworks.com/matlabcentral/answers/137592-compress-only-selected-variables-when-saving-to-mat
Accepted Answer
JMP Phillips
on 19 Aug 2015
Edited: Walter Roberson
on 19 Aug 2015
Here are some things you could try:
Use the matfile function, which allows you to access and change variables directly in MAT-files, without loading into memory: http://au.mathworks.com/help/matlab/large-mat-files.html http://au.mathworks.com/help/matlab/ref/matfile.html
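As a minimal sketch of the matfile approach (not a tested solution to the slowdown in the question), the code below preallocates one variable on disk and fills it block by block, so only one block is in memory at a time. The file name, the variable name data, and the small sizes are hypothetical stand-ins for the 500 MB arrays.

```matlab
% Hypothetical file name and sizes, kept small so the sketch runs quickly.
fname   = fullfile(tempdir, 'bigdata_sketch.mat');
nRows   = 1000;   % every block has the same number of rows
nCols   = 50;     % columns per block (hypothetical)
nBlocks = 4;      % number of blocks to combine (hypothetical)

if exist(fname, 'file'), delete(fname); end

% matfile creates a version 7.3 MAT-file when the file does not exist yet.
m = matfile(fname, 'Writable', true);

% Preallocate the full variable on disk by assigning its last element,
% so the variable does not have to grow with every block.
m.data(nRows, nCols*nBlocks) = 0;

for k = 1:nBlocks
    block = rand(nRows, nCols);        % stand-in for one produced array
    cols  = (k-1)*nCols + (1:nCols);   % contiguous range of columns
    m.data(:, cols) = block;           % write this block straight to disk
end

% Read back one part without loading the whole variable.
lastBlock = m.data(:, (nBlocks-1)*nCols + (1:nCols));
sz = size(m, 'data');                  % size query without loading data
```

Since every array has the same number of rows, appending along columns keeps each write contiguous in MATLAB's column-major layout. Whether preallocation removes the slowdown for a 20 GB file would need to be measured.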
Structure your data differently:
- If you are representing the data as doubles, maybe you can afford less precision, e.g. use int32. For example, you can use a scale factor of 1e4 to represent a double value such as 100.3425 as the integer 1003425.
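The scaling idea can be sketched as follows, assuming four decimal places are enough. The sample values are hypothetical. Note that int32 saturates silently, so with a scale of 1e4 the values must stay within roughly ±214748.

```matlab
scale = 1e4;                          % keep 4 decimal places
x     = [100.3425; -0.0001; 2500.5];  % hypothetical double values

% Check the representable range before converting, because int32 saturates.
maxAbs = double(intmax('int32')) / scale;
assert(all(abs(x) <= maxAbs), 'Values exceed the int32 range at this scale.');

xi = int32(round(x * scale));         % 4 bytes per element instead of 8
xr = double(xi) / scale;              % approximate doubles recovered for use

maxErr = max(abs(xr - x));            % rounding error introduced by scaling
```

This halves the storage per element at the cost of a rounding error of at most about half of 1/scale per value.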
With MATLAB:
- Use a 64-bit version of MATLAB.
- Try disabling compression when saving the files, with the -v6 option.
Optimize your PC for your task:
- In Task Manager, close any unnecessary processes running at the same time, including taskbar junk (Adobe update, Java update, etc.).
- Disable your antivirus, which might be trying to scan the file and slowing it down.
- In Task Manager, give higher priority to the MATLAB process (see http://www.sevenforums.com/tutorials/83361-priority-level-set-applications-processes.html).
- Increase your virtual memory or page file size: http://windows.microsoft.com/en-au/windows/change-virtual-memory-size#1TC=windows-7
- Defragment your hard drive.
- Run MATLAB from your local hard drive and not from a network drive or external hard drive.
- Save the .mat file to your local hard drive where it has plenty of space, not to a network drive or external hard drive.
- For faster disk access, use a solid-state drive (SSD).
2 comments
Walter Roberson
on 19 Aug 2015
The -v6 option is incompatible with matfile and with objects over 2 GB.
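For anyone who still wants to try uncompressed storage together with matfile, one possible sketch is to create the file with -v7.3 and the '-nocompression' flag of save, then open it with matfile. This assumes a release that supports '-nocompression' (R2017a or later, so newer than this thread). The file name is hypothetical, and whether matfile keeps later writes uncompressed is not something verified here.

```matlab
fname = fullfile(tempdir, 'v73_sketch.mat');   % hypothetical file name
data  = zeros(1000, 10);                        % small stand-in array

% -v7.3 is needed for variables over 2 GB and for partial access via matfile.
% '-nocompression' is assumed to be available (R2017a or later).
save(fname, 'data', '-v7.3', '-nocompression');

m = matfile(fname, 'Writable', true);
m.data(:, 1:5) = ones(1000, 5);   % partial write into the existing variable
firstCol = m.data(:, 1);          % partial read of a single column
```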