
Fast multiplication of rows of one matrix and columns of the second matrix

I would like to compute v(k) = A(k,:)*B(:,k) as fast as possible, without loops. Currently I am using diag(A*B), but that computes and stores the full product A*B when only its diagonal is needed.
  2 comments
Daniel Shub on 5 Oct 2011
Do you want to do it as fast as possible or without loops? The JIT accelerator means that those two things are not necessarily the same.
Daniel Shub on 5 Oct 2011
How big is A? Is it sparse or distributed or anything funky like that?


Accepted Answer

Teja Muppirala on 5 Oct 2011
The fastest way to do something generally depends on the size and structure of your data.
Don't assume loops are slower. For simple linear algebra, loops are generally very fast. In fact, for large matrices (e.g. 1000x1000), I think a loop is probably the fastest way:
v = zeros(1,size(A,1));        % preallocate the result
for k = 1:size(A,1)
    v(k) = A(k,:)*B(:,k);      % row k of A times column k of B
end
For smaller matrices, you are probably better off doing this:
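v = sum(A.'.*B, 1);   % .' is the plain (non-conjugate) transpose; the explicit 1 sums down columns even for a single-row input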
The best thing to do is just to try things out and see what works best for your data.
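As a rough way of "trying things out", here is a minimal timing sketch. The size n and the random test matrices are assumptions for illustration only, so substitute your own data. tic/toc gives a single noisy measurement; on releases that have it, timeit is usually a more reliable choice.

```matlab
n = 500;                 % hypothetical size; adjust to match your data
A = rand(n);
B = rand(n);

% 1) Full product, then keep only the diagonal
%    (O(n^3) work plus an n-by-n temporary).
tic;
v_diag = diag(A*B).';    % transpose to a row so all three results share a shape
t_diag = toc;

% 2) Explicit loop over row/column pairs (O(n^2) work).
tic;
v_loop = zeros(1,n);
for k = 1:n
    v_loop(k) = A(k,:)*B(:,k);
end
t_loop = toc;

% 3) Elementwise product with the transpose, summed down each column
%    (O(n^2) work, but still builds n-by-n temporaries).
tic;
v_sum = sum(A.'.*B, 1);
t_sum = toc;

fprintf('diag(A*B):      %.4f s\n', t_diag);
fprintf('for loop:       %.4f s\n', t_loop);
fprintf('sum(A.''.*B,1): %.4f s\n', t_sum);
```

Which one wins depends on the matrix size, the MATLAB release and the machine, so it is worth rerunning this with sizes close to your real problem.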
  3 comments
Teja Muppirala on 5 Oct 2011
Ah. Yeah I forgot the dot. Thanks James
Dr. Seis on 5 Oct 2011
I created two random 10000x10000 matrices: the for loop took 2 seconds to compute what diag(A*B) took over 20 seconds to compute. However (and this is a correction to the above), the following took only 1 second to execute:
v = sum(A.*B',2);
Note: the "dot" in .* makes the multiplication elementwise, so each element of A is multiplied by the corresponding element of B-transpose before each row is summed. This should give the same result as v = diag(A*B);
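A small worked check of that equivalence, using made-up 2x2 matrices (the values are purely illustrative). It also shows one caveat: for complex data, ' is the conjugate transpose, so the plain transpose .' is the safer choice.

```matlab
% Real example: both expressions give the diagonal of A*B.
A = [1 2; 3 4];
B = [5 6; 7 8];

v_ref = diag(A*B);          % [19; 50]
v_sum = sum(A.*B.', 2);     % rows of A times columns of B, no full product

% Complex example: B' conjugates the entries, B.' does not.
Ac = [1i 0; 0 1];
Bc = [1i 0; 0 1];

vc_ref   = diag(Ac*Bc);         % [-1; 1]
vc_plain = sum(Ac.*Bc.', 2);    % matches diag(Ac*Bc)
vc_conj  = sum(Ac.*Bc',  2);    % [1; 1], differs because of the conjugation
```

For real matrices the two transposes are identical, so the distinction only matters once complex values are involved.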


More Answers (1)

Daniel Shub on 5 Oct 2011
Yair's blog has a nice post on memory issues and array operations:
It is not always obvious what the best solution is.
If your matrices are really big, you might be better off moving them to a graphics card (GPU). If your machine has a lot of cores, the for loop could be replaced by a parfor loop, or even distributed to a cluster. It is silly to worry about slight inefficiencies if you can access 1000+ cores.
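A minimal sketch of those two options, assuming the Parallel Computing Toolbox is available (gpuArray and gather come from that toolbox; without it, parfor runs as an ordinary serial loop). The size n and the random matrices are hypothetical, and whether either variant is actually faster depends on the hardware and on data-transfer costs.

```matlab
n = 200;                      % hypothetical size for illustration
A = rand(n);
B = rand(n);

v_ref = sum(A.*B.', 2);       % single-threaded reference result

% parfor version: each iteration only needs row k of A and column k of B,
% so A, B and v_par can all be sliced across workers.
v_par = zeros(n,1);
parfor k = 1:n
    v_par(k) = A(k,:)*B(:,k);
end

% GPU version: copy the data to the device, compute there, copy back.
% The try/catch is only here so the sketch still runs without a
% supported GPU or without the toolbox.
try
    Ag = gpuArray(A);
    Bg = gpuArray(B);
    v_gpu = gather(sum(Ag.*Bg.', 2));
catch
    v_gpu = v_ref;            % fallback: reuse the CPU result
end
```

For a computation this cheap per iteration, the overhead of starting workers or copying to the GPU can easily outweigh the gain, so it is worth timing against the plain versions first.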
