
Maximum blocks per kernel

Description

Specify the maximum number of CUDA® blocks created during a kernel launch.

Because GPU devices have limited streaming multiprocessor (SM) resources, limiting the number of blocks for each kernel can avoid performance losses from scheduling, loading, and unloading blocks.

If the number of iterations in a loop is greater than the maximum number of blocks per kernel, the code generator creates CUDA kernels with striding, so that each thread processes more than one loop iteration.
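The striding idea can be illustrated with a small host-side C++ emulation of a 1-D grid-stride loop. This is only a sketch of the general pattern, not the code that the code generator emits. The function names, the `threadsPerBlock` argument, and the way the block count is capped are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of blocks to launch: enough to cover n iterations, capped at maxBlocks.
// Both threadsPerBlock and maxBlocks are hypothetical inputs for this sketch.
std::size_t blocksToLaunch(std::size_t n, std::size_t threadsPerBlock,
                           std::size_t maxBlocks) {
    assert(threadsPerBlock > 0 && maxBlocks > 0);
    // Ceiling division: blocks needed for one iteration per thread.
    std::size_t needed = (n + threadsPerBlock - 1) / threadsPerBlock;
    return std::min(needed, maxBlocks);
}

// Host-side emulation of a 1-D grid-stride kernel computing y[i] = a * x[i] + y[i].
void saxpyGridStride(float a, const std::vector<float>& x, std::vector<float>& y,
                     std::size_t threadsPerBlock, std::size_t maxBlocks) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const std::size_t gridDim = blocksToLaunch(n, threadsPerBlock, maxBlocks);
    const std::size_t stride = gridDim * threadsPerBlock;  // total threads in the grid

    // The two outer loops stand in for threads that a GPU would run in parallel.
    for (std::size_t blockIdx = 0; blockIdx < gridDim; ++blockIdx) {
        for (std::size_t threadIdx = 0; threadIdx < threadsPerBlock; ++threadIdx) {
            // Each thread starts at its global index and advances by the grid size,
            // so every index in [0, n) is visited exactly once.
            for (std::size_t i = blockIdx * threadsPerBlock + threadIdx; i < n;
                 i += stride) {
                y[i] = a * x[i] + y[i];
            }
        }
    }
}
```

For example, with 1000 iterations, 128 threads per block, and a cap of 4 blocks, the emulated grid holds 512 threads, so each thread handles at most two elements.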

When you specify the maximum number of blocks for each kernel, the code generator creates 1-D kernels. To force the code generator to create 2-D or 3-D kernels, use the coder.gpu.kernel (GPU Coder) pragma. The coder.gpu.kernel pragma takes precedence over the maximum number of blocks for each CUDA kernel.

Category: Code Generation > GPU Code

Settings

Default: 0

Specify the maximum number of CUDA blocks created during a kernel launch.

Dependencies

  • This parameter requires a GPU Coder™ license.

  • To enable this parameter, select Generate GPU code on the Code Generation pane.

Command-Line Information

Parameter: GPUMaximumBlocksPerKernel
Type: integer
Value: any valid value
Default: 0
