How to make this code faster?

Question

0 votes

I have coded a formulation using for loops but it takes forever when my matrices are very large. Can anyone help me make this code more efficient without for loops?

I'm coding the formulations in page two (in the lower half page) of this document: http://www-stat.wharton.upenn.edu/~tcai/paper/Testing-2-Covariance-Matrices.pdf

And here are my codes:

if true
% Input: X, Y
p=size(X,2);
n1=size(X,1); n2=size(Y,1);
sigma1=cov(X); sigma2=cov(Y);
for i=1:p
  for j=1:p
      for k=1:n1
          A(k,:)=((X(k,i)-mean(X(:,i))).*(X(k,j)-mean(X(:,j)))-sigma1(i,j)).^2;
      end
      Teta_bar1(i,j)=1./n1 .* sum(A);
        for k=1:n2
            B(k,:)=((Y(k,i)-mean(Y(:,i))).*(Y(k,j)-mean(Y(:,j)))-sigma2(i,j)).^2;
        end
        Teta_bar2(i,j)=1./n2 .* sum(B);
        M(i,j)=((sigma1(i,j)-sigma2(i,j)).^2)./(Teta_bar1(i,j)/n1+Teta_bar2(i,j)/n2);
    end
end  
end

0 commentaires
Afficher -2 commentaires plus anciens Masquer -2 commentaires plus anciens

Connectez-vous pour commenter.

Connectez-vous pour répondre à cette question.

Follow Question

Answer 1

Geoff Hayes le 15 Oct 2014

Ouvrir dans MATLAB Online

1 vote

MJ - what does it takes forever when my matrices are very large mean? Seconds, minutes, hours? Please provide an example of a matrix (perhaps just the dimensions) along with the time it takes to evaluate the above.

Before looking at vectorizing your code, you may want to profile your code to see where the bottlenecks are.

A couple things that you may want to do (for your optimization) is to allocate or pre-size matrices before you start to use them. For example, M appears to be a pxp matrix, so why not size it as such before you start to use it

 p=size(X,2);
 n1=size(X,1); 
 n2=size(Y,1);
 sigma1=cov(X); 
 sigma2=cov(Y);
 % pre-size matrices
 M = zeros(p,p);
 Teta_bar1 = zeros(p,p);
 Teta_bar2 = zeros(p,p);
 A = ??
 B = ??

Look also at your inner loops that iterate over k

 for k=1:n1
     A(k,:)=((X(k,i)-mean(X(:,i))).*(X(k,j)-mean(X(:,j)))-sigma1(i,j)).^2;
 end

Note that you calculate

mean(X(:,i))

on each iteration of k even though this average does not depend on k at all. It doesn't even depend upon j, so the code is calculating the same average p*n1 times when it only needs to do this once. Similarly, the

mean(X(:,j)

is calculated n1 times when it is only needed once. Try changing this to

 for i=1:p
    meanXi = mean(X(:,i));     % <--- calculate the mean
    for j=1:p
        meanXj = mean(X(:,j)); % <--- calculate the mean
        for k=1:n1
            A(k,:)=((X(k,i)-meanXi).*(X(k,j)-meanXj)-sigma1(i,j)).^2;
        end

You can apply the same logic to the second inner loop that iterates over k to avoid the duplicate mean calculations.

4 commentaires
Afficher 2 commentaires plus anciens Masquer 2 commentaires plus anciens

MJ le 15 Oct 2014

Wonderful!! Just applied your instructions and got the results in ONLY 3 seconds! You made my day, sir! :) Thanks a lot!

Geoff Hayes le 15 Oct 2014

Glad it worked out!

Connectez-vous pour commenter.

How to make this code faster?

0 commentaires
Afficher -2 commentaires plus anciens Masquer -2 commentaires plus anciens

Réponse acceptée

4 commentaires
Afficher 2 commentaires plus anciens Masquer 2 commentaires plus anciens

Plus de réponses (0)

Catégories

Tags

Community Treasure Hunt

How to make this code faster?

0 commentaires Afficher -2 commentaires plus anciens Masquer -2 commentaires plus anciens

Réponse acceptée

4 commentaires Afficher 2 commentaires plus anciens Masquer 2 commentaires plus anciens

Plus de réponses (0)

Catégories

Tags

Voir également

Community Treasure Hunt

0 commentaires
Afficher -2 commentaires plus anciens Masquer -2 commentaires plus anciens

4 commentaires
Afficher 2 commentaires plus anciens Masquer 2 commentaires plus anciens