
Economical use of memory

5 views (last 30 days)
Chienmin on 19 Apr 2012
Dear guys/professionals,
First of all, I am not a professional programmer in terms of computer science. I have used MATLAB only for mathematical computation, so please bear with me if my following questions look stupid to you.
I am currently computing some VERY-LARGE-SCALE problems. I create a class with its own properties and methods. Basically, some properties of this class are in the form of a three-dimensional array. Let us say we have 100 points in each axis; then I have 100^3 = 1,000,000 entries in such an array. And I would like to find the values of that array. In my algorithms, all the entry values will be columnized; and in a mathematical system, there will be a coefficient matrix ahead of this column vector. And yes, that coefficient matrix is 1000000-by-1000000. And I will have to solve similar systems at each time step, say 100 time steps. That is why the problem becomes big.
Now, since the system is big, I want to find a way to use my memory capacity more efficiently. I am currently using an "exogenous" algorithm to solve such a system. That means the algorithm is not one of the methods of my class, and so I have to INPUT data (the coefficient matrix, for example) into that algorithm. My feeling is that when I do so, I am actually transmitting my data via "PASS BY VALUE". So I GUESS if I include the algorithm as part of my class, then I MAY be able to do "PASS BY REFERENCE" and MAY save half of the memory capacity. Is my guess correct? Or does anyone have a good suggestion for this?
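For readers following along, the setup described above might look roughly like this (all names and sizes are illustrative, not Leon's actual code):

```matlab
n = 100;                 % 100 points per axis
U = rand(n, n, n);       % a 3-D array property of the class
u = U(:);                % columnized: n^3 = 1,000,000-by-1 vector
% At each of ~100 time steps, a system A*u = b is solved,
% where A is a (hopefully sparse) 1e6-by-1e6 coefficient matrix.
```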
Many thanks for your help in advance, guys.
Best regards,
Leon

Answers (3)

Richard Brown on 19 Apr 2012
MATLAB's semantics are pass-by-value, but it only actually copies the matrix if you modify it within your function ("lazy copy" or "copy-on-write"). So passing it as an input should not be a problem.
Question: How sparse is your coefficient matrix? Presumably it's not dense, as that would require around a TB of memory even if you were using int8s ...
  1 comment
James Tursa on 20 Apr 2012
To be precise, MATLAB passes shared data copies of the arguments to m-file functions, and the actual reference to the original variable to mex functions. If you modify an argument in an m-file, it simply unshares it first, following the regular MATLAB variable rules, before modifying it. There are some exceptions to this, e.g. see Loren's blog on in-place operations:
http://blogs.mathworks.com/loren/2007/03/22/in-place-operations-on-data/
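A small sketch of the copy-on-write behaviour described above (function names and sizes are hypothetical):

```matlab
function cow_demo
    A = rand(5000);        % ~200 MB of doubles
    s = readOnly(A);       % no copy: A is only read inside the function
    B = addOne(A);         % a copy of A is made at the first write
end

function s = readOnly(A)
    s = sum(A(:));         % reading does not trigger a copy
end

function A = addOne(A)
    A(1,1) = A(1,1) + 1;   % first write: MATLAB duplicates A here
end
```

Calling the modifying function as `A = addOne(A)` (same variable in and out) is the pattern that can allow MATLAB to operate in place, as discussed in Loren's post.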



Geoff on 19 Apr 2012
As long as you don't CHANGE the variables that you pass into your function, they are effectively passed by reference.
Consider this test where I allocate all my remaining physical memory to a matrix, and then pass that into a function. Notice that it is NOT copied:
function [y] = byref( x )
y = x(42,42);
disp(['Hidden meaning of Life, the Universe and Everything: ' num2str(y)]);
memory
end
>> clear
>> memory
Maximum possible array: 12794 MB (1.342e+10 bytes) *
Memory available for all arrays: 12794 MB (1.342e+10 bytes) *
Memory used by MATLAB: 679 MB (7.119e+08 bytes)
Physical Memory (RAM): 8167 MB (8.564e+09 bytes)
* Limited by System Memory (physical + swap file) available.
>> x = rand((8167-679)/8,1048576);
>> memory
Maximum possible array: 5282 MB (5.539e+09 bytes) *
Memory available for all arrays: 5282 MB (5.539e+09 bytes) *
Memory used by MATLAB: 8167 MB (8.564e+09 bytes)
Physical Memory (RAM): 8167 MB (8.564e+09 bytes)
* Limited by System Memory (physical + swap file) available.
>> byref(x)
Hidden meaning of Life, the Universe and Everything: 0.47449
Maximum possible array: 5289 MB (5.545e+09 bytes) *
Memory available for all arrays: 5289 MB (5.545e+09 bytes) *
Memory used by MATLAB: 8168 MB (8.564e+09 bytes)
Physical Memory (RAM): 8167 MB (8.564e+09 bytes)
* Limited by System Memory (physical + swap file) available.
ans =
0.4745
You still need to be careful with intermediate large data sets... Delete those values as soon as they're no longer required, especially if you're going to overwrite them with new values (otherwise you WILL, for a time, have both the old AND the new in memory). For example, if you're in a loop:
huge = important;
while crazy
    result = something(huge);
    % do stuff
    clear result;  % I hate MATLAB's relaxed scoping rules.
end
As for your coefficient matrix, are most values in there going to be zero? In that case you can use a sparse matrix.
doc sparse
Because a million-by-million matrix of doubles will otherwise require about 7450 gigabytes of RAM, which I'm reasonably confident you don't have.
You may need to consider ways to split up your processing, if that is possible.
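As a minimal illustration of the memory difference (using a simple 1-D second-difference stencil rather than a true 3-D operator):

```matlab
N = 1e6;                                 % one unknown per grid point
e = ones(N, 1);
A = spdiags([e -2*e e], [-1 0 1], N, N); % tridiagonal sparse matrix
b = rand(N, 1);
x = A \ b;                               % sparse direct solve
whos A                                   % tens of MB, not terabytes
```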
  3 comments
Richard Brown on 19 Apr 2012
@Walter: nice - that's a pretty serious machine. Our one-rack BlueGene/P installation has 4TB
Geoff on 19 Apr 2012
That's a lot of gigawiggles.
void *fun = malloc( (size_t)1 << 40 );



Chienmin on 20 Apr 2012
Dear guys,
Many thanks for your kind replies clarifying my confusion. But unfortunately the coefficient matrices will be modified again and again in my algorithms. Actually, I have already used the sparse format, deleted unnecessary variables, and even kept saving and loading those very big data sets. Probably my problems will run better on a PC/server in 5-10 years. lol
Just one more question: for those people working in dynamic three-dimensional modelling or simultaneous game programming, how do they manage to use their memory efficiently? I suppose they have faced the same problem as me before.
  1 comment
Geoff on 22 Apr 2012
When you are modifying the matrix again and again, are you working on a small part of it? You could return only the bits that have changed (as a sparse matrix) and combine them back into the main one using subscripted assignments. The other option is to make your coefficient matrix global.
People generally handle enormous datasets by exploiting the principle of locality. That is, we don't tend to require random access to the entire data set at once. And if we do, then it often points to a poor choice of data representation and/or processing algorithm.
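A sketch of the subscripted-assignment idea (here `computeDelta` is a hypothetical function returning a sparse matrix that is nonzero only where coefficients changed):

```matlab
D = computeDelta(A);     % sparse, nonzero only at modified entries
idx = find(D);           % linear indices of the changed coefficients
A(idx) = nonzeros(D);    % write back just those entries, in place
```

Note that `find` and `nonzeros` both traverse the sparse matrix in column-major order, so the indices and values line up.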

