I always have problems with matlab (R2019b) using too much memory (way more than the variables I have saved). Currently I'm running a function to extract data from a number of structures. I paused the function because the level of RAM being used just doesn't make any sense. Task manager says that Matlab is using 4.7gb of memory, even though I'm not running anything right now. The total size of all the variables in my workspace is ~0.055gb and I have no figure windows open. The only two programs I have running on my computer are Matlab and Task Manager. Is there any reason that Matlab would be using so much memory and is there a way for me to reduce it?

1 comment

Xin Niu
Xin Niu on 18 Apr 2024
It is not only the variables saved on your disk that take memory. I had code that used more than 300 GB of memory, and I finally found it was due to a bug. The story is that we used to save a 0-based timestamps variable. In an update, we switched to unix time. A variable was created based on the max value of the timestamps, like: var = 1:step:max(timestamps). So the variable became huge when we switched to unix time.
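A small sketch of the effect described above, with hypothetical numbers (the step size and timestamp values are illustrative only, not from the original code):

```matlab
% With 0-based timestamps the derived vector is harmless:
step = 0.001;
t0 = 0:0.03:600;        % 0-based timestamps, max = 600
v0 = 1:step:max(t0);    % ~6e5 doubles, about 4.8 MB
% After switching to unix time, max(timestamps) is on the order of 1.7e9,
% so the same expression would need ~1.7e12 doubles (~13.6 TB). Do not run:
% tu = t0 + 1.7e9;
% vu = 1:step:max(tu);
```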

Sign in to comment.

Answers (3)

Jan
Jan on 22 Nov 2019
Edited: Jan on 29 Dec 2021

2 votes

How do you observe the memory consumption? The Task Manager displays the memory reserved for MATLAB. If MATLAB allocates memory and releases it afterwards, it is not necessarily freed immediately. As long as no other application asks for the memory, it is efficient to keep this status.
Does it cause any trouble that the OS reserves 4.7 GB of RAM for MATLAB? Why do you say that this is "too much" memory?
Although the current memory consumption is small, growing arrays can need much more memory. Example:
x = [];
for k = 1:1e6
x(k) = k;
end
Although the final array x occupies only 8 MB of RAM (plus about 100 bytes for the header), the intermediate need for RAM is much higher: sum(1:1e6)*8 bytes = 4 terabytes. Explanation: if x = [1], the next step x(2) = 2 duplicates the former array and appends a new element. Although the intermediately used memory is released, there is no guaranteed time limit for the freeing.
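For comparison, the pre-allocated version of the same loop requests the 8 MB once and writes in place, so no intermediate copies are made:

```matlab
x = zeros(1, 1e6);   % allocate the full array up front
for k = 1:1e6
    x(k) = k;        % writes in place, no reallocation
end
```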
Can you post some code which reproduces the problem?

12 comments

Julia Rhyins
Julia Rhyins on 22 Nov 2019
Edited: Julia Rhyins on 22 Nov 2019
This is an example of my code:
idx = dir(sprintf('%s\\*.mat',path));
widths =[];
snrs = [];
APD_mpz = [];
for i = 1:length(idx)
%% load datafile
clear data SNR SNRdB info outStruct
filename = sprintf('%s\\%s',idx(i).folder,idx(i).name)
data = load(filename);
Fstim = 1/(data.stimPeriod/1000);
%% call functions
[SNR, SNRdB, info] = SNRV4_1(data.V_out,Fstim,data.Fs,12);
%% output structure
outStruct = data;
outStruct.SNR_V4_1 = SNR;
outStruct.SNRdB_V4_1 = SNRdB;
outStruct.SNRinfo_V4_1 = info;
%% save new data
save(filename,'-struct','outStruct');
%% arrays
widths =[widths;data.width_V5.width];
snrs = [snrs;SNRdB];
APD_mpz = [APD_mpz;nanmean(data.APD_o)];
end
I'm basically looping through a directory and loading 220 MATLAB structures (each ~4000 KB). I call a function, add the new values to the structure and save it, then pull a couple of values from the structure to save in arrays. If I run the for loop from 1:length(idx), it will almost definitely crash my computer. So what I end up doing is running the loop in groups of 20 or 30 files. Even then, by the time I get to the 'end', Task Manager says MATLAB is 'using' 7 or 8 GB of RAM. It never seems like the variables in my workspace use very much memory.
Is there a way I can rewrite this so that it doesn't crash my computer?
**Another note: it's possible that the problem I'm having is caused by my computer, rather than by MATLAB. I've had some performance problems with it in the past. So if it seems like my code shouldn't be problematic, then the source of my problems may be at a higher level. Even if I close and reopen MATLAB, it will immediately use about 800 MB of memory.
I tried deleting everything from my loop other than the following:
idx = dir(sprintf('%s\\*.mat',path));
for i =1:217
clear data
filename = sprintf('%s\\%s',idx(i).folder,idx(i).name)
data = load(filename);
end
The memory usage still got up to 5900 MB over the 217 iterations; then, after the loop ended, it continued to increase to ~7600 MB and stayed there indefinitely.
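One way to reduce the peak usage of the loop posted above might be to load only the variables the computation needs (so any figure handles stored in the files are never deserialized) and append the new results with save -append instead of re-saving the whole struct. A sketch, using the field names from the code above (SNRV4_1 is the asker's own function):

```matlab
for i = 1:numel(idx)
    fname = fullfile(idx(i).folder, idx(i).name);
    % Load only the needed fields; stored figure handles stay on disk:
    data  = load(fname, 'stimPeriod', 'V_out', 'Fs');
    Fstim = 1/(data.stimPeriod/1000);
    [SNR_V4_1, SNRdB_V4_1, SNRinfo_V4_1] = SNRV4_1(data.V_out, Fstim, data.Fs, 12);
    % Append the results to the existing MAT file under their new names:
    save(fname, 'SNR_V4_1', 'SNRdB_V4_1', 'SNRinfo_V4_1', '-append');
end
```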
Daniel M
Daniel M on 22 Nov 2019
Hmm, I can't seem to replicate this issue. This may sound silly, but try changing the code to this and see if it clears it up:
idx = dir(sprintf('%s\\*.mat',path));
for i =1:217
clear data
pause(0.02)
filename = sprintf('%s\\%s',idx(i).folder,idx(i).name)
data = load(filename);
end
If that doesn't work, replace pause() with drawnow
Jan
Jan on 23 Nov 2019
@Julia: The Task Manager tells you how much memory is reserved for MATLAB. Loading a file might reserve space for disk caching; this detail is not documented, so I'm only guessing. This would mean that the consumption shown in the Task Manager has no "real" meaning. As soon as another application needs the memory, it can be redistributed.
What does "it will almost definitely crash my computer" mean exactly? What happens? Do you get error messages? If so, which ones?
In MATLAB, the clear commands are usually only a waste of time.
By the way, path is an important MATLAB command. Do not use this name for a variable, to avoid serious trouble during debugging.
Use fullfile instead of creating the file name manually. Replace
sprintf('%s\\%s',idx(i).folder,idx(i).name)
by
fullfile(idx(i).folder, idx(i).name)
Loading MAT files can create figures and other hidden data, e.g. data stored persistently in functions or user-defined classes. Checking the sizes of the variables in the workspace is not enough to exclude such side effects.
Your original code shows exactly what I mentioned about the need for pre-allocation. Replace
snrs = [];
for i = 1:length(idx)
snrs = [snrs;SNRdB];
end
by
snrs = zeros(1, length(idx));
for k = 1:numel(idx)
snrs(k) = SNRdB;
end
numel is more robust than length, because the latter chooses the longest dimension, while numel does exactly what is wanted. For vectors the result is the same, but in real code length is applied to matrices too often. Using numel is clear and direct.
Using "i" as a loop counter is discouraged by MathWorks to avoid confusion with the imaginary unit 1i. Well, this might be a question of taste.
Julia Rhyins
Julia Rhyins on 25 Nov 2019
Someone pointed out to me that .mat files are compressed, so when a file is loaded it may be larger than what I see in the file explorer. I think this may be my problem, because each file that I load contains a couple of figure handles. In terms of the error, I think there is some initial memory warning, but it is quickly buried in the command window by continuous printing of 'Warning: Error updating line. Update failed for unknown reason'.
Walter Roberson
Walter Roberson on 25 Nov 2019
The total size of all the variables in my workspace is ~0.055gb and I have no figure windows open.
each file that I load contains a couple of figure handles.
There is a contradiction there. The only way to load figure handles is to create figure windows from them. Those figures might not be visible, but they are open. And if you do not close those figures after you are finished with them, then the memory for the (possibly invisible) figures will add up.
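A short sketch of one way to find and close such figures after each load, including invisible ones (findall, unlike findobj, also returns figures with hidden handles):

```matlab
% Close any figures that were recreated from handles stored in the MAT file:
h = findall(groot, 'Type', 'figure');   % includes invisible/hidden figures
close(h, 'force')                       % 'force' skips CloseRequestFcn callbacks
```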
tom3w
tom3w on 24 Jun 2020
Edited: tom3w on 24 Jun 2020
Hi,
Jan mentions that Task Manager reports the memory reserved for Matlab.
Is there a way to know the effective amount of memory used by MATLAB, from Task Manager, Resource Monitor, or any other tool (preferably not in MATLAB)?
For now, I can't find any memory metric showing an effective memory use much lower than 3.7 GB (with an empty MATLAB session)...
Thanks
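Although the question above asks for a non-MATLAB metric, for comparison MATLAB's own memory function (Windows only) reports what the process actually uses, as opposed to what the OS keeps reserved for it. A minimal sketch:

```matlab
% memory returns a struct; MemUsedMATLAB is the memory in use by the
% MATLAB process, in bytes (Windows only):
m = memory;
fprintf('Memory used by MATLAB: %.2f GB\n', m.MemUsedMATLAB / 2^30);
```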
Daigo
Daigo on 28 Dec 2021
@Jan, "In MATLAB, the clear commands are usually only a waste of time." > I know this is not the point of this discussion, but I'm wondering what you mean by that. Do you mean the clear commands make the code slower? If so, do you know a better alternative for reducing memory usage?
Bruno Luong
Bruno Luong on 29 Dec 2021
Usually, if you factor your code correctly into functions, you don't need to call clear, since the intermediate variables are used inside the function and automatically cleared when the function exits.
If you feel you need clear, then your code is not clean, and the local workspace is populated with a bunch of variables that sit there for nothing, might break your code, and occupy the PC's available memory.
IMO one can somewhat tolerate CLEAR at the beginning of the main script.
Jan
Jan on 29 Dec 2021
@Daigo: Bruno hits the point: if the relevant parts of the code are split into functions, the intermediately created large variables are cleared automatically. If a large variable is created repeatedly in a loop, reusing the same variable also clears the formerly used data automatically.
function myTest
data = rand(1000, 1000);
for k = 1:1000
y = data + 1; % Any large array
... % Some arbitrary code
clear('y'); % No benefit
end
end
An example in which several large arrays are created sequentially and not used later:
function myOtherTest
data1 = rand(10000, 1000);
... % Some code 1
clear('data1'); % Might be useful
data2 = rand(10000, 1000);
... % Some code 2
clear('data2'); % Might be useful if more code follows
end
Such a code structure is bad programming style, because different jobs are done inside the same function. Here the clear might be useful, because it allows MATLAB to release the memory of data1 early. But whenever your code produces the need to clear variables, take this as a signal to check whether refactoring the code is a better strategy.
The MATLAB toolboxes I have installed contain about a dozen functions which call clear. In splitapply it is used to check whether the function to be applied creates the output "ans". This is an example of meta-programming, which is required when a function cannot know details about other code. This should not happen in real applications.
Walter Roberson
Walter Roberson on 30 Dec 2021
"Do you mean the clear commands make the code slower?"
If the variable is reused, then yes, "clear" makes the code slower. MATLAB does flow-control analysis to track potential type changes in variables. When "clear" is used, then at all points after that in the code, MATLAB has to mark the variable as being of unknown type and re-look-up its methods each time (because the flow-control analysis is static, and at run time MATLAB has to face the possibility that the variable might not be re-defined in some paths).
Bruno Luong
Bruno Luong on 30 Dec 2021
"When "clear" is used, then in all points after that in the code, MATLAB has to mark the variable as being of unknown type and re-look-up methods for it each time"
It does not make sense to me. The analyser should be able to know the type of the variable when it is created again after the clear.
And maybe the analyzer won't always be able to track the type even without clear, for instance when a branch decision depends on a run-time value, or is buried deep inside a class method or function.

Sign in to comment.

Jose Sanchez
Jose Sanchez on 28 Jan 2020

1 vote

I am having a similar issue while running on an HPC cluster.
My university cluster allows me to use up to 520 workers, where each HPC node (4 workers) has 8 GB RAM. I checked that the RAM consumed inside my parfor loop was no higher than 500 MB. However, when I run on the cluster using 100 parallel processes, the cluster crashes with an "Out of Memory" error.
Then I did a test running locally on my PC (32 GB RAM), and I can see clearly that every worker is consuming over 2 GB of RAM, which is more than 5 times the amount of RAM consumed within each parfor.
In my opinion, MATLAB is clearly doing something that is not working as expected! I didn't notice this issue with the HPC MATLAB version 2017a, despite using our cluster very often.

4 comments

Mohammad Sami
Mohammad Sami on 28 Jan 2020
Edited: Mohammad Sami on 28 Jan 2020
In my experience with the current version of MATLAB, every worker started by the parallel pool takes up ~500 MB (on Windows) with no data loaded. So 100 processes will take up 50 GB of RAM.
mustafa mete
mustafa mete on 19 May 2020
Hey, were you able to solve this problem? I now have the same problem and don't know how to deal with it.
Mohammad Sami
Mohammad Sami on 20 May 2020
R2020a introduced a new thread-based parallel pool. It has some limitations compared with the process-based pool. If your code is compatible with the thread-based pool, you can use it instead to reduce the memory overhead of process-based pools.
You can read more details here
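A minimal sketch of the thread-based pool mentioned above (R2020a or later; the loop body is a placeholder):

```matlab
% Thread-based workers share the host process memory, avoiding the
% ~500 MB per-process overhead of a process-based pool:
pool = parpool('threads');
r = zeros(1, 100);
parfor k = 1:100
    r(k) = k^2;   % placeholder for thread-compatible work
end
delete(pool)
```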
tianyuan wang
tianyuan wang on 30 Jul 2020
I have the same problem. If I call MATLAB on the control node of the HPC (one control node, 17 computing nodes, distributed memory) to process a very large matrix, can the new version of MATLAB solve the problem of insufficient memory on a single node?

Sign in to comment.

Christian Schwermer
Christian Schwermer on 16 Aug 2020

0 votes

Hello,
MATLAB doesn't release memory if you didn't declare the variable as an output of the function.
Best regards

4 comments

Walter Roberson
Walter Roberson on 16 Aug 2020
Perhaps you can provide example code for this? It does not sound correct to me.
Christian Schwermer
Christian Schwermer on 19 Aug 2020
Edited: Walter Roberson on 30 Dec 2021
I think every variable has to be preallocated exactly. Changing size while a function is executing accumulates memory usage. I think that when the size changes, the complete array is copied and saved without changing its name; the original array stays in memory, and there is no possibility to delete it, because its name doesn't exist anymore.
Perhaps the best way is to preallocate all arrays with enough free space, even if it won't be used. Larger arrays need less memory than arrays that change their size.
I came to that because of my GUI, where I used a cell array as a FIFO buffer to acquire images. Memory usage increases every session, and only closing and restarting MATLAB releases it:
bufferSize = 450;
frame_buffer = cell(1, bufferSize);
....
flushdata(VideoInputObj)
delete(VideoInputObj)
frame_buffer(:) = {[]};
clear('frame_buffer')
imaqreset
When I preallocate the buffer for each cell, memory usage stays at a constant, acceptable level. Nevertheless, it wasn't possible to release the memory without restarting:
ROI = VideoInputObj.ROIPosition;
bufferSize = 450;
frame_buffer = cell(1, bufferSize);
frame_buffer(:) = {zeros(ROI(4), ROI(3) ,'uint8')} ;
Bruno Luong
Bruno Luong on 19 Aug 2020
"I think every variable has to be preallocated exactly. Changing size while a function is executing accumulates memory usage. I think that when the size changes, the complete array is copied and saved without changing its name; the original array stays in memory, and there is no possibility to delete it, because its name doesn't exist anymore."
Hmmm, no, sorry, but this is totally incorrect; MATLAB is not that stupid. It has cross-links within its internal structures to keep track of data sharing; it releases memory and updates the cross-links when a variable is cleared or its content is overwritten; and there is also a garbage collector that destroys local variables when a function ends, etc. The user's variable name has nothing to do with the memory management.
The latest MATLAB version might not work exactly like that, but there is no memory leak as you state.
Walter Roberson
Walter Roberson on 19 Aug 2020
Or, if there is a leak, it is in the image acquisition software.

Sign in to comment.
