How can I improve the parfor performance in my code?

Question

Philip Muscarella on 3 May 2021

https://fr.mathworks.com/matlabcentral/answers/820790-how-can-i-improve-the-parfor-performance-in-my-code

Commented: Jan on 5 May 2021

I have this nested parfor and for loop that takes ~8 minutes to run on 16 workers. I also have a version of this code that runs in about the same amount of time on a single processor. If anyone has suggestions on how to improve performance, that would be great.

parfor jj = 1:ny
    y = Y(jj);
    for ii = 1:nx
        x = X(ii);

        CosTerm = cos(wnkcosDir*x+wnksinDir*y+gamma);
        SinTerm = sin(wnkcosDir*x+wnksinDir*y+gamma);

        eta(ii,jj)     = sum(sum((spec_dfdt.*CosTerm),1),2);
        u(ii,jj)       = grav*sum(sum((wnkcosDir.*spec_dfdt.*CosTerm.*romega),1),2);
        v(ii,jj)       = grav*sum(sum((wnksinDir.*spec_dfdt.*CosTerm.*romega),1),2);
        w(ii,jj)       = grav*sum(sum((wnk.*spec_dfdt.*SinTerm.*romega),1),2);
        deta_dx(ii,jj) = -sum(sum((spec_dfdt.*wnkcosDir.*SinTerm),1),2);
        deta_dy(ii,jj) = -sum(sum((spec_dfdt.*wnksinDir.*SinTerm),1),2);
    end
end

Here are all the sizes/types of variables:

  Name              Size                 Bytes  Class     Attributes
  X              3001x3001            72048008  double              
  Y              3001x3001            72048008  double              
  deta_dx        3001x3001            72048008  double              
  deta_dy        3001x3001            72048008  double              
  eta            3001x3001            72048008  double              
  gamma           123x93                 91512  double              
  grav              1x1                      8  double              
  romega          123x93                 91512  double              
  spec_dfdt       123x93                 91512  double              
  u              3001x3001            72048008  double              
  v              3001x3001            72048008  double              
  w              3001x3001            72048008  double              
  wnk             123x93                 91512  double              
  wnkcosDir       123x93                 91512  double              
  wnksinDir       123x93                 91512  double    

I also have been using ticBytes/tocBytes to track the communications to the workers.

             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________
                 9.9847e+07               2.7339e+07
                 9.9703e+07               2.7195e+07
                 1.0057e+08               2.8062e+07
                  9.913e+07               2.6621e+07
                 9.9706e+07               2.7197e+07
                  9.913e+07               2.6621e+07
                 9.9565e+07               2.7057e+07
                 9.9565e+07               2.7057e+07
                 9.8986e+07               2.6475e+07
                 9.9274e+07               2.6764e+07
                 9.8983e+07               2.6472e+07
                 9.9127e+07               2.6617e+07
                 9.9559e+07                2.705e+07
                 9.9988e+07               2.7481e+07
                   1.01e+08               2.8492e+07
                 1.0013e+08               2.7624e+07
    Total        1.5943e+09               4.3412e+08

Thanks in advance.

1 comment

Edric Ellis on 4 May 2021

I would check your CPU usage (with top, taskmgr, or similar) while running the for-loop version of your code. You may well find that MATLAB's intrinsic multithreading is already doing a good job of parallelising it. If so, parfor can never win, because there are no spare CPUs for it to take advantage of. parfor wins when MATLAB cannot multithread your for-loop code, or when you can offload the computations onto more CPUs by using a cluster.
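One way to run this check from inside MATLAB, instead of watching top or taskmgr, is to time the same workload with the default thread count and again with a single computational thread. The sketch below uses maxNumCompThreads for that. The random 123x93 arrays and the repetition count nRep are made-up stand-ins, not the data from the question.

```matlab
% Sketch: compare timings with the default thread count and with one thread.
rng(0);                                   % reproducible stand-in data
A = rand(123, 93);  B = rand(123, 93);  G = rand(123, 93);
nRep = 2000;                              % hypothetical repetition count

nDefault = maxNumCompThreads;             % current number of computational threads

tic;
for k = 1:nRep
    s = sum(sum(B .* cos(A*k + G), 1), 2);   % same shape of work as one grid point
end
tMulti = toc;

nPrev = maxNumCompThreads(1);             % restrict to one thread, keep the old setting
tic;
for k = 1:nRep
    s = sum(sum(B .* cos(A*k + G), 1), 2);
end
tSingle = toc;
maxNumCompThreads(nPrev);                 % restore the previous setting

fprintf('%d thread(s): %.3f s, 1 thread: %.3f s\n', nDefault, tMulti, tSingle);
```

If the two timings come out close, implicit multithreading is probably not what limits the parfor version, since arrays this small may not be split across threads at all.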

One final note: in recent releases of MATLAB (R2018b and later), you can pass the 'all' flag to sum to perform the summation along all dimensions at once:

sum(magic(4), 'all')
ans = 136


Answer 1

Jan on 4 May 2021

https://fr.mathworks.com/matlabcentral/answers/820790-how-can-i-improve-the-parfor-performance-in-my-code#answer_691475

Edited: Jan on 4 May 2021

Move all repeated calculations out of the loop:

C1 = wnkcosDir.*spec_dfdt.*romega;
C2 = wnksinDir.*spec_dfdt.*romega;
C3 = wnk.*spec_dfdt.*romega;
C4 = -spec_dfdt.*wnkcosDir;
C5 = -spec_dfdt.*wnksinDir;
parfor jj = 1:ny
    y = Y(jj);
    C6 = wnksinDir*y+gamma;
    C7 = wnksinDir*y+gamma;
    for ii = 1:nx
        x = X(ii);

        CosTerm = cos(wnkcosDir*x + C6);
        SinTerm = sin(wnkcosDir*x + C7);

        eta(ii,jj)     = sum(spec_dfdt .* CosTerm, 'all');
        u(ii,jj)       = grav*sum(C1 .* CosTerm, 'all');
        v(ii,jj)       = grav*sum(C2 .* CosTerm, 'all');
        w(ii,jj)       = grav*sum(C3 .* SinTerm, 'all');
        deta_dx(ii,jj) = sum(C4 .* SinTerm, 'all');
        deta_dy(ii,jj) = sum(C5 .* SinTerm, 'all');
    end
end

Avoiding repeated calculations is cheaper than distributing them across multiple workers.

Calling sum once with the 'all' option is more efficient than calling it twice.
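The same idea can be pushed further. This is a different technique from the loop above, not something proposed in the thread: since cos(p+q) = cos(p)cos(q) - sin(p)sin(q) and sin(p+q) = sin(p)cos(q) + cos(p)sin(q), the x-dependent and y-dependent trig terms can each be computed once, and the sum over the spectrum becomes a matrix product. The sketch below checks this for eta and deta_dx on small made-up sizes. The vectors xv and yv stand in for the coordinates the post reads as X(ii) and Y(jj), and all data are random placeholders.

```matlab
% Small stand-in sizes (hypothetical); the thread uses 123x93 spectra, nx = ny = 3001.
rng(1);
nf = 5;  nd = 4;  nx = 7;  ny = 6;
wnkcosDir = rand(nf, nd);  wnksinDir = rand(nf, nd);
gamma     = 2*pi*rand(nf, nd);
spec_dfdt = rand(nf, nd);
xv = linspace(0, 10, nx).';               % column of x positions
yv = linspace(0, 8,  ny).';               % column of y positions

% Flatten the spectral arrays to nk-by-1 columns, nk = nf*nd
a = wnkcosDir(:);  b = wnksinDir(:);  g = gamma(:);  S = spec_dfdt(:);

% Trig terms depending on x only (nx-by-nk) and on y only (nk-by-ny)
Cx = cos(xv * a.');       Sx = sin(xv * a.');
Cy = cos(b * yv.' + g);   Sy = sin(b * yv.' + g);   % implicit expansion (R2016b+)

% eta = sum_k S_k cos(p+q), with p = a_k*x and q = b_k*y + g_k
eta_fast = Cx * (S .* Cy) - Sx * (S .* Sy);

% deta_dx = -sum_k S_k a_k sin(p+q)
Sa = -(S .* a);
deta_dx_fast = Sx * (Sa .* Cy) + Cx * (Sa .* Sy);

% Reference: the double loop as posted in the question
eta_ref = zeros(nx, ny);  deta_dx_ref = zeros(nx, ny);
for jj = 1:ny
    for ii = 1:nx
        phase = wnkcosDir*xv(ii) + wnksinDir*yv(jj) + gamma;
        eta_ref(ii,jj)     = sum(sum(spec_dfdt .* cos(phase), 1), 2);
        deta_dx_ref(ii,jj) = -sum(sum(spec_dfdt .* wnkcosDir .* sin(phase), 1), 2);
    end
end

err_eta = max(abs(eta_fast(:) - eta_ref(:)));
err_dx  = max(abs(deta_dx_fast(:) - deta_dx_ref(:)));
fprintf('max abs difference: eta %.2e, deta_dx %.2e\n', err_eta, err_dx);
```

The other four fields follow the same pattern with different weight vectors. At the sizes in the question, each of Cx, Sx, Cy and Sy would be 3001x11439 doubles, roughly 275 MB, so memory is the trade-off; processing the grid in column blocks would reduce that.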

2 comments

Philip Muscarella on 4 May 2021

I am on R2018a, so the 'all' option is not available. I will look into updating.

Why do you need C6 and C7 if they are the same?

Jan on 5 May 2021

Oh, C6 and C7 are identical. Good point, I overlooked this. You know, all the small letters look like tiny flies when viewed from a certain distance. So please take my code only as a demonstration of how to move repeated computations out of the loop.
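Putting the two points from these comments together, here is a minimal sketch of the loop with a single shared phase term that also runs on R2018a. It flattens the spectral arrays to columns once, so one plain sum call is enough without the 'all' option. The sizes and random data are made-up stand-ins. The vectors xv and yv are an assumption: they correspond to X(1:nx) and Y(1:ny), which is what the posted X(ii) and Y(jj) read, and extracting them also means the full 3001x3001 X no longer has to be broadcast to every worker. Whether those entries are the intended coordinates depends on how X and Y were built, which the thread does not show.

```matlab
% Small stand-in data (hypothetical sizes) so the sketch runs on its own
rng(2);
nf = 5;  nd = 4;  nx = 6;  ny = 5;  grav = 9.81;
wnk = rand(nf, nd);  wnkcosDir = rand(nf, nd);  wnksinDir = rand(nf, nd);
gamma = 2*pi*rand(nf, nd);  spec_dfdt = rand(nf, nd);  romega = rand(nf, nd);
xv = linspace(0, 10, nx);   % in the thread: xv = X(1:nx)
yv = linspace(0, 8,  ny);   % in the thread: yv = Y(1:ny)

% Flatten once; loop-invariant weights are computed a single time
kx = wnkcosDir(:);  ky = wnksinDir(:);  g = gamma(:);  S = spec_dfdt(:);
C1 = kx .* S .* romega(:);
C2 = ky .* S .* romega(:);
C3 = wnk(:) .* S .* romega(:);
C4 = -S .* kx;
C5 = -S .* ky;

eta = zeros(nx, ny);  u = eta;  v = eta;  w = eta;
deta_dx = eta;  deta_dy = eta;

parfor jj = 1:ny
    yPhase = ky * yv(jj) + g;             % y-dependent part, once per jj
    for ii = 1:nx
        phase   = kx * xv(ii) + yPhase;   % shared argument of cos and sin
        CosTerm = cos(phase);
        SinTerm = sin(phase);

        eta(ii,jj)     = sum(S  .* CosTerm);
        u(ii,jj)       = grav * sum(C1 .* CosTerm);
        v(ii,jj)       = grav * sum(C2 .* CosTerm);
        w(ii,jj)       = grav * sum(C3 .* SinTerm);
        deta_dx(ii,jj) = sum(C4 .* SinTerm);
        deta_dy(ii,jj) = sum(C5 .* SinTerm);
    end
end
```

Without Parallel Computing Toolbox the parfor simply runs as an ordinary loop, so the sketch can be tried on any installation.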
