Processing timer for Sparse matrix is too long?
2 vues (au cours des 30 derniers jours)
Afficher commentaires plus anciens
Nghia Nguyen Thanh
le 22 Août 2017
Modifié(e) : Cedric
le 19 Sep 2017
Dear All,
I am a new user with Matlab, and I meet a problem with Sparse matrix.
Time for processing with Sparse matrix is very long.
Here is my code:
lab=randi([1 7],1,100000);
M = sparse(100000,100000);
for i = 1:100000
for j = (i+1):100000
if lab(i,1) == lab(j,1)
M(j,i) = 1;
else
M(j,i) = 0;
end
end
end
Do I have any method for reduce timing process? Thanks and Best regard!
1 commentaire
Tim Berk
le 19 Sep 2017
Dear Nghia,
Looping through large arrays is very slow, especially in what is called 'nested loops' as you are using. Your nested loop requires 100000*100000/2 calculations (the 1/2 because of the (i+1) as starting value for j).
Lets assume that evaluating
if lab(1,i) == lab(1,j)
M(i,j) = 1;
else M(j,i) = 0;
end
Takes 10 micro seconds (which I think is quite a low estimate), than your script would take 1E5*1E5*0.5*1E-5 = 50000 seconds (14 hours).
Hope this clarifies why it is slow.
Often this can be solved by vectorization as explained here: https://uk.mathworks.com/help/matlab/matlab_prog/vectorization.html
Hope this helps.
Cheers,
Tim
Réponse acceptée
Cedric
le 19 Sep 2017
Modifié(e) : Cedric
le 19 Sep 2017
You won't be able to do it, vectorizing or not.
If random integers are really in 1:7, the probability, when you pick two elements, that one is equal to the other is 1/7. On average, with your lab vector of n=1e6 elements and assuming that you are interested in the lower triangular part of M, this means n^2/2/7 non-zero elements, and hence
>> (16 * (1e6)^2/2/7 + 8 * 1e6 + 8) / 1e12
ans =
1.1429
more than 1.14 TB of RAM to store (just a single version of it).
Are random integers really in 1:7 or is the pool of integers larger, e.g. 1:n ? In the second case, you will gain a lot by vectorizing the approach. Often in these cases that involve sparse matrices, the best approach consists in building indices of non-zero elements ( I and J ), and performing a unique call to SPARSE at the end:
M = sparse( I, J, 1, n, n ) ; % Values are all 1 in your case.
0 commentaires
Plus de réponses (1)
Walter Roberson
le 19 Sep 2017
Modifié(e) : Walter Roberson
le 19 Sep 2017
Doing M(j,i) = 0 is a waste of time on a sparse array unless M(j,i) might already have a non-zero value. sparse() initializes to 0.
Your line,
M = sparse(100000,100000);
is equivalent to
M = sparse([], [], [], 100000, 100000, 0)
which generates a sparse array that is expected to have 0 occupied cells in it. Every time a cell is written, the sparse array needs to be recreated to allocate room for the new entry. It is a lot more efficient to use
M = spalloc(100000, 100000, N)
where N is an estimate of the number of locations that will eventually be occupied with non-zero values. In your situation you can estimate N as 100000^2/7 which is roughly 1428571429. One element in seven occupancy is not very sparse.
Your inner loop could be vectorized, which could improve performance a lot.
0 commentaires
Voir également
Catégories
En savoir plus sur Matrix Indexing dans Help Center et File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!