PTX kernel time to run
Afficher commentaires plus anciens
Hello, i am using R2010b, CUDA toolkit 3.1 with a geforce gt425m. While is was optimalizing my cuda code i observed that calling the kernel with feval in matlab has a ~2ms constant time measured with
tic feval(k,...) toc
the kernel code:
#define C_WIDTH 1024
#define C_HEIGHT 768
__global__ void timetest1(float* holo) {
int mindex=blockIdx.x*blockDim.x+threadIdx.x;
int size=C_WIDTH*C_HEIGHT;
if (mindex>=size)
return;
holo[mindex]=mindex*mindex;
}
Even if i take out the write to global memory //holo[mindex]=mindex*mindex; there is a ~2ms time
Does anybody know the origin of this lag? It would be great to somehow eliminate it.
Thanks,
Gaszton
PS: my matlab code for the kernel:
clear
import parallel.gpu.GPUArray
xsize=1024; ysize=768;
vectorsize=xsize*ysize; threadpblock=1024; k=parallel.gpu.CUDAKernel('TimeTest.ptx', 'TimeTest.cu'); k.ThreadBlockSize=[threadpblock,1,1]; k.GridSize=[ceil(vectorsize/threadpblock),1];
dholo=parallel.gpu.GPUArray.zeros(vectorsize,1,'single');
tic [dholo]=feval(k,dholo); time=toc;
['ms time= ' num2str(time*1000)]
clear
Réponse acceptée
Plus de réponses (0)
Catégories
En savoir plus sur GPU Computing dans Centre d'aide et File Exchange
Produits
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!