Optimization of loops with deep learning functions
Hello,
I am trying to optimize the following code, which performs deep learning convolutions on input arrays:
    parfor k = 1:500 % number of images
        for j = 1:8 % number of channels of the relevant filters
            dldf_O_dlconv3_temp_1(:,:,j,k) = dlconv(dlarray(reshape(O_maxpool2(:,:,j,k),[8 8 1]),'SSC'), dlarray(reshape(DLDO_O_dlconv3(:,:,1,k),[8 8 1]),'SSC'), 0, 'Padding', padding); % 1st filter
            dldf_O_dlconv3_temp_2(:,:,j,k) = dlconv(dlarray(reshape(O_maxpool2(:,:,j,k),[8 8 1]),'SSC'), dlarray(reshape(DLDO_O_dlconv3(:,:,2,k),[8 8 1]),'SSC'), 0, 'Padding', padding); % 2nd filter
            dldf_O_dlconv3_temp_3(:,:,j,k) = dlconv(dlarray(reshape(O_maxpool2(:,:,j,k),[8 8 1]),'SSC'), dlarray(reshape(DLDO_O_dlconv3(:,:,3,k),[8 8 1]),'SSC'), 0, 'Padding', padding); % 3rd filter
            dldf_O_dlconv3_temp_4(:,:,j,k) = dlconv(dlarray(reshape(O_maxpool2(:,:,j,k),[8 8 1]),'SSC'), dlarray(reshape(DLDO_O_dlconv3(:,:,4,k),[8 8 1]),'SSC'), 0, 'Padding', padding); % 4th filter
            dldf_O_dlconv3_temp_5(:,:,j,k) = dlconv(dlarray(reshape(O_maxpool2(:,:,j,k),[8 8 1]),'SSC'), dlarray(reshape(DLDO_O_dlconv3(:,:,5,k),[8 8 1]),'SSC'), 0, 'Padding', padding); % 5th filter
            dldf_O_dlconv3_temp_6(:,:,j,k) = dlconv(dlarray(reshape(O_maxpool2(:,:,j,k),[8 8 1]),'SSC'), dlarray(reshape(DLDO_O_dlconv3(:,:,6,k),[8 8 1]),'SSC'), 0, 'Padding', padding); % 6th filter
            dldf_O_dlconv3_temp_7(:,:,j,k) = dlconv(dlarray(reshape(O_maxpool2(:,:,j,k),[8 8 1]),'SSC'), dlarray(reshape(DLDO_O_dlconv3(:,:,7,k),[8 8 1]),'SSC'), 0, 'Padding', padding); % 7th filter
            dldf_O_dlconv3_temp_8(:,:,j,k) = dlconv(dlarray(reshape(O_maxpool2(:,:,j,k),[8 8 1]),'SSC'), dlarray(reshape(DLDO_O_dlconv3(:,:,8,k),[8 8 1]),'SSC'), 0, 'Padding', padding); % 8th filter
        end
    end
As you can see, I am already using a parfor loop. I also tried gpuArrays for O_maxpool2 and DLDO_O_dlconv3, but if anything the code became slightly slower rather than faster. My GPU device details are as follows:
    CUDADevice with properties:

                          Name: 'GeForce RTX 2080 Ti'
                         Index: 1
             ComputeCapability: '7.5'
                SupportsDouble: 1
                 DriverVersion: 11.2000
                ToolkitVersion: 11
            MaxThreadsPerBlock: 1024
              MaxShmemPerBlock: 49152
            MaxThreadBlockSize: [1024 1024 64]
                   MaxGridSize: [2.1475e+09 65535 65535]
                     SIMDWidth: 32
                   TotalMemory: 1.1811e+10
               AvailableMemory: 9.4411e+09
           MultiprocessorCount: 68
                  ClockRateKHz: 1545000
                   ComputeMode: 'Default'
          GPUOverlapsTransfers: 1
        KernelExecutionTimeout: 1
              CanMapHostMemory: 1
               DeviceSupported: 1
               DeviceAvailable: 1
                DeviceSelected: 1
Please let me know if there is anything else I could do to speed up this code, and also why using gpuArrays has not sped it up.
Many thanks.
Accepted Answer
Joss Knight on 24 June 2021 (edited 24 June 2021)
dlconv is designed to work in batch with multiple input channels, multiple filters, and multiple input observations in a single call. Read the documentation for dlconv.
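For illustration, a minimal sketch of the kind of single batched call the answer describes is shown below. The sizes match the question (8-by-8 spatial, 8 channels, 500 observations), but the weight array W is a hypothetical placeholder: dlconv applies one shared filter bank across the whole batch, so the per-image DLDO_O_dlconv3 slices from the question cannot be passed in directly without first rearranging dimensions (for example, via grouped convolution).

    % Minimal sketch, assuming O_maxpool2 is a numeric 8x8x8x500 array
    % and W stands in for a shared filter bank (hypothetical values).
    X = dlarray(O_maxpool2, 'SSCB');          % spatial, spatial, channel, batch
    W = rand(8, 8, 8, 8);                     % h x w x numChannels x numFilters
    Y = dlconv(X, W, 0, 'Padding', padding);  % one call covers every channel and image

This also suggests why gpuArray did not help in the original loop: each call convolves a single 8-by-8, single-channel image, which is far too little work to amortize kernel-launch and transfer overhead. One large batched call gives the GPU enough work to run efficiently.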