Data type (TensorRT)
Inference computation precision
Description
App Configuration Pane: Deep Learning
Configuration Objects: coder.TensorRTConfig
Specify the precision of the inference computations in supported layers. To
perform inference in 32-bit floating point, use 'fp32'. For
16-bit half precision, use 'fp16'. For 8-bit integer, use
'int8'. The default value is 'fp32'.
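For example, a minimal sketch that creates a TensorRT deep learning configuration and selects half-precision inference; the MEX build target is illustrative.

cfg = coder.gpuConfig('mex');
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
% Perform inference computations in 16-bit floats
cfg.DeepLearningConfig.DataType = 'fp16';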
INT8 precision requires a CUDA® GPU with a compute capability of 6.1, or of 7.0 and
higher. Compute capability 6.2 does not support INT8 precision.
FP16 precision requires a CUDA GPU with a compute capability of 5.3, 6.0, or of 6.2 and higher. Use the
ComputeCapability property of the GpuConfig
object to set the appropriate compute capability value.
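For example, a minimal sketch of selecting a matching compute capability before enabling INT8 inference; the value '6.1' is illustrative and must match your GPU hardware.

cfg = coder.gpuConfig('mex');
% Minimum compute capability that supports INT8 precision
cfg.GpuConfig.ComputeCapability = '6.1';
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
cfg.DeepLearningConfig.DataType = 'int8';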
For an example that shows 8-bit integer prediction for a logo classification network by using TensorRT, see the Deep Learning Prediction with NVIDIA TensorRT Library example.
Dependencies
To enable this parameter, you must set Deep learning
library to TensorRT.
Settings
fp32
This setting is the default setting. Inference computation is performed in 32-bit floats.
fp16
Inference computation is performed in 16-bit floats.
int8
Inference computation is performed in 8-bit integers. INT8 inference requires calibration data; see the sketch that follows this list.
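A hedged sketch of an INT8 setup, assuming a release in which the coder.TensorRTConfig object exposes the calibration-related DataPath and NumCalibrationBatches properties; the folder name and batch count are placeholders.

cfg = coder.gpuConfig('mex');
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
cfg.DeepLearningConfig.DataType = 'int8';
% Folder containing representative calibration images (placeholder name)
cfg.DeepLearningConfig.DataPath = 'calibration_images';
% Number of batches TensorRT uses to compute quantization ranges (illustrative)
cfg.DeepLearningConfig.NumCalibrationBatches = 50;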
Programmatic Use
Property: DataType
Values: 'fp32' | 'fp16' | 'int8'
Default: 'fp32'
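An end-to-end sketch that passes the configuration to codegen; the entry-point function myPredict and its input size are hypothetical placeholders.

cfg = coder.gpuConfig('mex');
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
cfg.DeepLearningConfig.DataType = 'fp32';  % the default precision
% myPredict is a hypothetical entry-point function that takes a
% 224-by-224-by-3 single-precision image
codegen -config cfg myPredict -args {ones(224,224,3,'single')}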
Version History
Introduced in R2018b