trainFasterRCNNObjectDetector
Train Faster R-CNN deep learning object detector
Syntax
Description
Train a Detector
trainedDetector = trainFasterRCNNObjectDetector(trainingData,network,options) trains a Faster R-CNN (regions with convolutional neural networks) object detector using deep learning. You can train a Faster R-CNN detector to detect multiple object classes.
This function requires that you have Deep Learning Toolbox™. It is recommended that you also have Parallel Computing Toolbox™ to use with a CUDA®-enabled NVIDIA® GPU. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox).
[trainedDetector,info] = trainFasterRCNNObjectDetector(___) also returns information on the training progress, such as training loss and accuracy, for each iteration.
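The two syntaxes above can be sketched as follows. This is a minimal illustration, not a recommended configuration: the function name trainSketch, the input trainingData, and every option value are assumptions chosen for the example. Because the layout of info can differ between releases, the sketch only lists its fields instead of assuming specific field names.

```matlab
function [detector, info] = trainSketch(trainingData)
% trainingData is a hypothetical input: labeled training data in whatever
% form your release of trainFasterRCNNObjectDetector accepts.

% Illustrative solver settings (assumed values); tune them for your data.
options = trainingOptions('sgdm', ...
    'MaxEpochs', 5, ...
    'MiniBatchSize', 1, ...
    'InitialLearnRate', 1e-3);

% Train from a pretrained network specified by name and also request the
% training progress information.
[detector, info] = trainFasterRCNNObjectDetector(trainingData, 'resnet50', options);

% Show which progress fields were recorded for each iteration.
disp(fieldnames(info))
end
```

Calling [detector, info] = trainSketch(myData) with your own labeled data returns both outputs; request only the first output if you do not need the progress information.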
Resume Training a Detector
trainedDetector = trainFasterRCNNObjectDetector(trainingData,checkpoint,options) resumes training from a detector checkpoint.
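Resuming from a checkpoint might look like the sketch below. It assumes that an earlier run saved checkpoints by setting 'CheckpointPath' in trainingOptions. The function name resumeSketch, the checkpointFile path, and the variable name detector inside the MAT-file are assumptions; inspect your own checkpoint file before relying on them.

```matlab
function detector = resumeSketch(trainingData, checkpointFile)
% checkpointFile is a hypothetical path to a MAT-file written by an earlier
% training run that had 'CheckpointPath' set in trainingOptions.

% Load the saved detector. The variable name 'detector' is an assumption;
% check the file contents with whos('-file', checkpointFile).
data = load(checkpointFile);
checkpoint = data.detector;

% Illustrative options (assumed values). Checkpoints continue to be saved
% to the temporary folder.
options = trainingOptions('sgdm', ...
    'MaxEpochs', 5, ...
    'MiniBatchSize', 1, ...
    'InitialLearnRate', 1e-3, ...
    'CheckpointPath', tempdir);

% Resume training from the checkpoint instead of from a network.
detector = trainFasterRCNNObjectDetector(trainingData, checkpoint, options);
end
```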
Fine-Tune a Detector
trainedDetector = trainFasterRCNNObjectDetector(trainingData,detector,options) continues training a Faster R-CNN object detector with additional fine-tuning options. Use this syntax with additional training data or to perform more training iterations to improve detector accuracy.
Additional Properties
trainedDetector = trainFasterRCNNObjectDetector(___,Name,Value) uses additional options specified by one or more Name,Value pair arguments and any of the previous inputs.
Examples
Input Arguments
Output Arguments
Tips
To accelerate data preprocessing for training, trainFasterRCNNObjectDetector automatically creates and uses a parallel pool based on your parallel preference settings. For more details about setting these preferences, see parallel preference settings. Using parallel computing preferences requires Parallel Computing Toolbox.
VGG-16, VGG-19, ResNet-101, and Inception-ResNet-v2 are large models. Training with large images can produce "out-of-memory" errors. To mitigate these errors, try one or more of these options:
Reduce the size of your images by using the 'SmallestImageDimension' argument.
Decrease the value of the 'NumRegionsToSample' name-value argument.
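As a rough illustration of the two memory-saving options, the sketch below passes both name-value arguments in a single call. The function name trainLowMemorySketch and the values 400 and 64 are assumptions for the example, not recommended settings; choose values that suit your images and hardware.

```matlab
function detector = trainLowMemorySketch(trainingData, options)
% trainingData and options are hypothetical inputs: your labeled training
% data and a trainingOptions object.

detector = trainFasterRCNNObjectDetector(trainingData, 'vgg16', options, ...
    'SmallestImageDimension', 400, ...  % resize so the shorter image side is 400 pixels
    'NumRegionsToSample', 64);          % sample fewer regions from each image
end
```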
This function supports transfer learning. When you input a network by name, such as 'resnet50', the function automatically transforms the network into a valid Faster R-CNN network model based on the pretrained resnet50 (Deep Learning Toolbox) model. Alternatively, manually specify a custom Faster R-CNN network by using the LayerGraph (Deep Learning Toolbox) extracted from a pretrained DAG network. For more details, see Create Faster R-CNN Object Detection Network.
This table describes how to transform each named network into a Faster R-CNN network. The feature extraction layer name specifies the layer for processing by the ROI pooling layer. The ROI output size specifies the size of the feature maps output by the ROI pooling layer.
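All network names in this table are Deep Learning Toolbox functions.
Network Name        Feature Extraction Layer Name   ROI Pooling Layer OutputSize   Description
alexnet             'relu5'                         [6 6]                          Last max pooling layer is replaced by ROI max pooling layer.
vgg16               'relu5_3'                       [7 7]                          Last max pooling layer is replaced by ROI max pooling layer.
vgg19               'relu5_4'                       [7 7]                          Last max pooling layer is replaced by ROI max pooling layer.
squeezenet          'fire5-concat'                  [14 14]                        Last max pooling layer is replaced by ROI max pooling layer.
resnet18            'res4b_relu'                    [14 14]                        ROI pooling layer is inserted after the feature extraction layer.
resnet50            'activation_40_relu'            [14 14]                        ROI pooling layer is inserted after the feature extraction layer.
resnet101           'res4b22_relu'                  [14 14]                        ROI pooling layer is inserted after the feature extraction layer.
googlenet           'inception_4d-output'           [14 14]                        ROI pooling layer is inserted after the feature extraction layer.
mobilenetv2         'block_13_expand_relu'          [14 14]                        ROI pooling layer is inserted after the feature extraction layer.
inceptionv3         'mixed7'                        [17 17]                        ROI pooling layer is inserted after the feature extraction layer.
inceptionresnetv2   'block17_20_ac'                 [17 17]                        ROI pooling layer is inserted after the feature extraction layer.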
For information on modifying how a network is transformed into a Faster R-CNN network, see Design an R-CNN, Fast R-CNN, and a Faster R-CNN Model.
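The manual route mentioned above can be sketched with fasterRCNNLayers, which builds a Faster R-CNN layer graph from a pretrained network and a chosen feature extraction layer. The function name customNetworkSketch, the input size, the class count, and the anchor boxes are illustrative assumptions. The example also assumes that the resnet50 pretrained model is installed. In practice, estimate anchor boxes from your own data, for example with estimateAnchorBoxes.

```matlab
function detector = customNetworkSketch(trainingData, options)
% trainingData and options are hypothetical inputs: your labeled training
% data and a trainingOptions object.

inputImageSize = [224 224 3];               % assumed network input size
numClasses = 1;                             % assumed number of object classes
anchorBoxes = [64 64; 128 128; 192 192];    % assumed [height width] anchors

% Pretrained backbone and the feature extraction layer listed for resnet50.
network = resnet50;
featureLayer = 'activation_40_relu';

% Build the Faster R-CNN layer graph and train with it instead of a name.
lgraph = fasterRCNNLayers(inputImageSize, numClasses, anchorBoxes, ...
    network, featureLayer);
detector = trainFasterRCNNObjectDetector(trainingData, lgraph, options);
end
```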
During training, multiple image regions are processed from the training images. The number of image regions per image is controlled by the NumRegionsToSample property. The PositiveOverlapRange and NegativeOverlapRange properties control which image regions are used for training. Positive training samples are those that overlap with the ground truth boxes by 0.6 to 1.0, as measured by the bounding box intersection-over-union (IoU) metric. Negative training samples are those that overlap by 0 to 0.3. Choose values for these properties by testing the trained detector on a validation set.
Overlap Values                         Description
PositiveOverlapRange set to [0.6 1]    Positive training samples are set equal to the samples that overlap with the ground truth boxes by 0.6 to 1.0, measured by the bounding box IoU metric.
NegativeOverlapRange set to [0 0.3]    Negative training samples are set equal to the samples that overlap with the ground truth boxes by 0 to 0.3.
Use the trainingOptions (Deep Learning Toolbox) function to enable or disable verbose printing.
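To make the overlap ranges in the tip above concrete, the sketch below computes the IoU between a ground truth box and a few candidate regions with bboxOverlapRatio, then labels each region against the [0.6 1] and [0 0.3] ranges. The box coordinates are made up for illustration, and treating both range endpoints as inclusive is an assumption. This shows the overlap criterion only, not the sampling the training function performs internally.

```matlab
% Ground truth box and candidate regions, each as [x y width height].
groundTruth = [10 10 100 100];
regions = [ 15  15 100 100;
            40  40 100 100;
            80  80 100 100;
           300 300  50  50];

% Overlap ranges quoted in the tip above.
positiveRange = [0.6 1];
negativeRange = [0 0.3];

% Intersection over union of every region with the ground truth box.
iou = bboxOverlapRatio(regions, groundTruth);

% Label regions by range (inclusive endpoints assumed).
isPositive = iou >= positiveRange(1) & iou <= positiveRange(2);
isNegative = iou >= negativeRange(1) & iou <= negativeRange(2);

disp(table(iou, isPositive, isNegative))
```

Regions whose IoU falls between the two ranges, such as the second region here, are neither positive nor negative under these settings.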
References
[1] Ren, S., K. He, R. Girshick, and J. Sun. "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks." Advances in Neural Information Processing Systems. Vol. 28, 2015.
[2] Girshick, R. "Fast R-CNN." Proceedings of the IEEE International Conference on Computer Vision, 1440-1448. Santiago, Chile: IEEE, 2015.
[3] Girshick, R., J. Donahue, T. Darrell, and J. Malik. "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation." Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 580-587. Columbus, OH: IEEE, 2014.
[4] Zitnick, C. L., and P. Dollar. "Edge Boxes: Locating Object Proposals from Edges." Computer Vision-ECCV 2014, 391-405. Zurich, Switzerland: ECCV, 2014.
Extended Capabilities
Version History
Introduced in R2017a
See Also
Apps
Functions
trainRCNNObjectDetector | trainFastRCNNObjectDetector | trainingOptions (Deep Learning Toolbox) | objectDetectorTrainingData | estimateAnchorBoxes | fasterRCNNLayers
Objects
maxPooling2dLayer (Deep Learning Toolbox) | Layer (Deep Learning Toolbox) | layerGraph (Deep Learning Toolbox) | averagePooling2dLayer (Deep Learning Toolbox) | SeriesNetwork (Deep Learning Toolbox) | fasterRCNNObjectDetector | boxLabelDatastore