Compute mean and diff faster

Xin on 28 Dec 2017
Hello everyone. I am working on an FD code and need to do a lot of averaging over very large 2D/3D matrices. Here is the task, taking 1D as an example.
Given a vector A = [a, b, c, d, e, f], I want the average of each pair of neighboring values, so I compute A = (A(2:end) + A(1:end-1))/2. I also call diff a lot, e.g. diff(A,1). The 2D and 3D cases work the same way. But it becomes very slow when I am dealing with a very large matrix, say 1000*1000*1000. Is there a faster way to do this?
Thank you very much.

Accepted Answer

John D'Errico on 28 Dec 2017
Edited: John D'Errico on 28 Dec 2017
Get more memory. Huge problems require more memory, or a cup of coffee. Sit down, relax, take out that old copy of War and Peace, and read away.
If your matrix is 1000x1000x1000, then it has 1e9 elements, each of which will require 8 bytes of RAM to store. So that matrix uses roughly 8 gigabytes of RAM to store.
Now, when you compute some operation on your array, like diff or the local average between consecutive elements, this creates a NEW array that is almost the same size. So a new array is formed that also requires 8 gigabytes of RAM.
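The arithmetic above can be checked without allocating anything large. This is only a rough sketch: the variable names are made up for illustration, and the last few lines use a tiny array to confirm that diff and the neighbor average really return a new array with one fewer element along the working dimension.

```matlab
% Memory estimate for a 1000x1000x1000 double array (no allocation needed).
dims = [1000 1000 1000];
bytesPerDouble = 8;
inputGB = prod(dims) * bytesPerDouble / 1e9;      % 8 GB for the input

% diff (or the neighbor average) along dimension 1 drops one row,
% so the result is a second array of almost the same size.
outDims = dims - [1 0 0];
outputGB = prod(outDims) * bytesPerDouble / 1e9;  % 7.992 GB for the result

fprintf('input %.3f GB + result %.3f GB = %.3f GB\n', ...
    inputGB, outputGB, inputGB + outputGB);

% Small-scale check of the output shapes.
A = rand(5, 4, 3);
D = diff(A, 1, 1);                                % first difference along dim 1
M = (A(2:end,:,:) + A(1:end-1,:,:)) / 2;          % neighbor average along dim 1
disp(size(D))
disp(size(M))
```

So a single diff on the full array already needs about 16 GB for input plus result, before counting any temporaries.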
Every copy of that array forces MATLAB to allocate 8 more gigabytes of RAM. How many gigs of RAM does your computer have? For example, mine is now just a bit old, so it has only 8 gigs in total.
What does MATLAB do when it runs out of RAM? It starts swapping things around, using virtual memory. That gets SLOW, real fast, even if it can find the disk space to do so.
So if you want your computations to be faster, you need more memory.
A poor alternative might be to use singles, instead of doubles. Create your matrix as a single array, and it will now require 4 gigabytes of RAM. It is still gonna be a memory hog, but a slightly leaner one. The cost of course is a loss of precision in your computations.
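To get a feel for that tradeoff, here is a small sketch on a 200x200x200 test array (the size and variable names are arbitrary choices for illustration). It compares the storage of the two types and measures how far the single-precision neighbor average drifts from the double-precision one.

```matlab
% Compare storage and accuracy of double vs single on a small test array.
A_double = rand(200, 200, 200);
A_single = single(A_double);

infoD = whos('A_double');
infoS = whos('A_single');
fprintf('double: %.0f MB, single: %.0f MB\n', infoD.bytes/1e6, infoS.bytes/1e6);

% Relative spacing of the two types near 1.
fprintf('eps(1) = %g, eps(single(1)) = %g\n', eps(1), eps(single(1)));

% Neighbor average along dimension 1 in both precisions.
M_double = (A_double(2:end,:,:) + A_double(1:end-1,:,:)) / 2;
M_single = (A_single(2:end,:,:) + A_single(1:end-1,:,:)) / 2;

% Largest absolute difference between the two results.
maxErr = max(abs(M_double(:) - double(M_single(:))));
fprintf('max abs error from using single: %g\n', maxErr);
```

Whether that error is acceptable depends on the scheme. Taking diff of nearly equal neighbors amplifies the relative error, so it is worth testing on a small version of your real problem first.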
2 Comments
Xin on 28 Dec 2017
Hi John. Thanks for your reply, it may be a good idea to do some reading while the computer is suffering :)
Yes, my computer has 16 GB and it seems it can handle this calculation, but only barely. That is not a major issue, because later I can run it on a supercomputer with 256 GB, so I can even push the resolution much further. What worries me now is the run time. MATLAB already handles matrix calculations very fast, and my code consists mostly of matrix addition and subtraction. I am just wondering whether there is a way to do this faster in MATLAB. Maybe there isn't, but it would be nice to explore a bit how MATLAB deals with this type of simple (but time-consuming) task.
John D'Errico on 28 Dec 2017
Two copies of an 8 GB array require 16 GB. Don't forget that MATLAB itself consumes some RAM. So it does not matter how the computation is done, you will start having problems as soon as you start to push that limit. If your computations can tolerate the use of single, go in that direction.
If you have an actual disk drive that uses a spinning platter, the best thing you can do is replace it with an SSD. That can hugely increase your VM access speed, which is then limited by the bus speeds between memory and your drive. SSDs are not that expensive. Mine was well worth the money spent to keep my computer hopping along happily for a few more years.


More Answers (2)

Benjamin Kraus on 28 Dec 2017
It may be time to start looking into the Parallel Computing Toolbox or some of the new Big Data capabilities in MATLAB (such as tall arrays). Some links to check out:

David Santos on 20 Aug 2019
I recommend putting all your data in one big .mat file and accessing it with matfile (which loads only the parts you request, not the whole variable), then processing it in chunks, preferably by columns.
This way you control the amount of data held in memory and can process very large matrices (> 1 TB).
Tall arrays are OK if you don't need to access all the data, because as the number of accesses increases they become slower than matfiles.
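A minimal sketch of the chunked matfile approach, under some assumptions: the file name, the variable names (A, D, Avg), the sizes and the chunk width are all hypothetical and kept small so it runs quickly. The difference and the neighbor average are taken along dimension 1, so each block of columns can be processed independently.

```matlab
% Hypothetical file name and sizes, chosen only for this demo.
fname = fullfile(tempdir, 'fd_chunk_demo.mat');
if exist(fname, 'file'), delete(fname); end
nRows = 1000; nCols = 600; chunk = 100;   % chunk = columns held in memory at once

m = matfile(fname, 'Writable', true);     % a new file is created in v7.3 format

% Preallocate on disk by writing the last element of each variable.
m.A(nRows, nCols)     = 0;
m.D(nRows-1, nCols)   = 0;
m.Avg(nRows-1, nCols) = 0;

% Fill the input block by block (random data stands in for the real field).
for c = 1:chunk:nCols
    cols = c:min(c + chunk - 1, nCols);
    m.A(:, cols) = rand(nRows, numel(cols));
end

% Process block by block: only one block of columns is in memory at a time.
for c = 1:chunk:nCols
    cols = c:min(c + chunk - 1, nCols);
    block = m.A(:, cols);                               % partial load from disk
    m.D(:, cols)   = diff(block, 1, 1);                 % difference along dim 1
    m.Avg(:, cols) = (block(2:end,:) + block(1:end-1,:)) / 2;
end

% Small-scale verification against the all-in-memory computation.
A = m.A; D = m.D; Avg = m.Avg;
fprintf('diff matches: %d, average matches: %d\n', ...
    isequal(D, diff(A, 1, 1)), ...
    isequal(Avg, (A(2:end,:) + A(1:end-1,:)) / 2));
delete(fname);
```

If you also need differences along the chunked dimension, each block has to overlap its neighbor by one column, which this sketch does not handle.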
All the best
