Using a templated CUDA kernel via MATLAB

Question

0 votes

Hello,

Is it possible to use a C++-style templated CUDA kernel via MATLAB's GPU Computing interface?

For example, consider the following (useless) toy code:

template<typename T>
__global__ void get_nans(T*, const int*);
template<>
__global__ void get_nans<double>(double* out, const int* dims)
{
  const int tx = blockIdx.x*blockDim.x + threadIdx.x;
  const int ty = blockIdx.y*blockDim.y + threadIdx.y;
  if ((tx < dims[1]) && (ty < dims[0]))
    out[tx*dims[0] + ty] = nan(0);
}
template<>
__global__ void get_nans<float>(float* out, const int* dims)
{
  const int tx = blockIdx.x*blockDim.x + threadIdx.x;
  const int ty = blockIdx.y*blockDim.y + threadIdx.y;
  if ((tx < dims[1]) && (ty < dims[0]))
    out[tx*dims[0] + ty] = nanf(0);
}

I then compile this into PTX code, but when I try to instantiate the kernel object in MATLAB I get the following error:

>> k = parallel.gpu.CUDAKernel( 'get_nans.ptx', 'get_nans.cu' );
Error using handleKernelArgs (line 61)
Found multiple matching entries in the PTX code. Matches found:
_Z16get_nansIdEvPT_PKS0_S3_S3_PKiS5_
_Z16get_nansIfEvPT_PKS0_S3_S3_PKiS5_

Thank you,

Alex

Answer 1

Narfi on 29 Jan 2013

0 votes

Alex,

Allow me to quote from the help for parallel.gpu.CUDAKernel and try to parse it for you. If you have suggestions for how to improve the wording, please let me know!

"If specified, FUNC must be a string that unambiguously defines the appropriate kernel entry name in the PTX file. If FUNC is omitted, the PTX file must contain only a single entry point."

In your case, the get_nans.cu defines two global functions:

get_nans<double>
get_nans<float>

and the get_nans.ptx defines the corresponding two entry points:

_Z16get_nansIdEvPT_PKS0_S3_S3_PKiS5_ (for the double function)
_Z16get_nansIfEvPT_PKS0_S3_S3_PKiS5_ (for the float function)

When you create the parallel.gpu.CUDAKernel, it is ambiguous whether you want to invoke the double or the float function. Therefore, you must provide the name of the entry point you want to use and construct either the double or the float version:

kDouble = parallel.gpu.CUDAKernel( 'get_nans.ptx', 'get_nans.cu', '_Z16get_nansIdEvPT_PKS0_S3_S3_PKiS5_');

kFloat = parallel.gpu.CUDAKernel( 'get_nans.ptx', 'get_nans.cu', '_Z16get_nansIfEvPT_PKS0_S3_S3_PKiS5_');

Now, this almost works, but not quite because the parser in parallel.gpu.CUDAKernel cannot parse the template function definition. Therefore, we stop using this way of constructing the CUDAKernel:

KERN = parallel.gpu.CUDAKernel(PTXFILE, CUFILE, FUNC)

and use this one instead:

KERN = parallel.gpu.CUDAKernel(PTXFILE, CPROTO, FUNC)

We then end up with:

kDouble = parallel.gpu.CUDAKernel( 'get_nans.ptx', 'double* out, const int* dims', '_Z16get_nansIdEvPT_PKS0_S3_S3_PKiS5_');

kFloat = parallel.gpu.CUDAKernel( 'get_nans.ptx', 'float* out, const int* dims', '_Z16get_nansIfEvPT_PKS0_S3_S3_PKiS5_');
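
As an aside, if you would rather not copy mangled names around, one common workaround is to keep the template as a __device__ helper and expose each instantiation through a small extern "C" wrapper, so that the PTX entry points carry plain names. The sketch below is only an illustration under that assumption: the names quiet_nan, get_nans_impl, get_nans_double and get_nans_float are made up for this example and are not part of the original code, and I have not verified how the CU-file parser in parallel.gpu.CUDAKernel reacts to the template helper.

```cuda
// Hypothetical helper (not in the original post): returns a quiet NaN of type T.
template <typename T>
__device__ T quiet_nan();

template <>
__device__ double quiet_nan<double>() { return nan(""); }

template <>
__device__ float quiet_nan<float>() { return nanf(""); }

// Shared body, written once as a template instead of two specializations.
// Indexing matches the original kernels: dims[0] rows, dims[1] columns,
// column-major storage.
template <typename T>
__device__ void get_nans_impl(T* out, const int* dims)
{
  const int tx = blockIdx.x * blockDim.x + threadIdx.x;
  const int ty = blockIdx.y * blockDim.y + threadIdx.y;
  if ((tx < dims[1]) && (ty < dims[0]))
    out[tx * dims[0] + ty] = quiet_nan<T>();
}

// extern "C" turns off C++ name mangling for these two entry points, so the
// PTX should list them as get_nans_double and get_nans_float.
extern "C" __global__ void get_nans_double(double* out, const int* dims)
{
  get_nans_impl<double>(out, dims);
}

extern "C" __global__ void get_nans_float(float* out, const int* dims)
{
  get_nans_impl<float>(out, dims);
}
```

With wrappers like these, FUNC would simply be 'get_nans_double' or 'get_nans_float'. To stay on the safe side with the parser, the (PTXFILE, CPROTO, FUNC) form can still be used, for example with 'double* out, const int* dims' as the prototype string.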

Does this make sense?

Best,

Narfi
