Main Content

Generate C++ Code for Object Detection Using YOLO v2 and Intel MKL-DNN

This example shows how to generate C++ code for the YOLO v2 Object detection network on an Intel® processor. The generated code uses the Intel Math Kernel Library for Deep Neural Networks (MKL-DNN).

For more information, see Object Detection Using YOLO v2 Deep Learning (Computer Vision Toolbox).

Prerequisites

  • Intel Math Kernel Library for Deep Neural Networks (MKL-DNN)

  • Refer MKLDNN CPU Support to know the list of processors that supports MKL-DNN library

  • MATLAB® Coder™ for C++ code generation

  • MATLAB Coder Interface for Deep Learning support package

  • Deep Learning Toolbox™ for using the DAGNetwork object

  • Computer Vision Toolbox™ for video I/O operations

For more information on the supported versions of the compilers and libraries, see Generate Code That Uses Third-Party Libraries.

This example is supported on Linux®, Windows®, and macOS platforms and not supported for MATLAB Online.

Get the Pretrained DAGNetwork Object

The DAG network contains 150 layers including convolution, ReLU, and batch normalization layers and the YOLO v2 transform and YOLO v2 output layers.

net = getYOLOv2();
Downloading pretrained detector (98 MB)...

Use the command net.Layers to see all the layers of the network.

net.Layers

Code Generation for yolov2_detection Function

The yolov2_detection function attached with the example takes an image input and runs the detector on the image using the network saved in yolov2ResNet50VehicleExample.mat. The function loads the network object from yolov2ResNet50VehicleExample.mat into a persistent variable yolov2Obj. Subsequent calls to the function reuse the persistent object for detection.

type('yolov2_detection.m')
function outImg = yolov2_detection(in)

%   Copyright 2018-2019 The MathWorks, Inc.

% A persistent object yolov2Obj is used to load the YOLOv2ObjectDetector object.
% At the first call to this function, the persistent object is constructed and
% set up. Subsequent calls to the function reuse the same object to call detection 
% on inputs, thus avoiding having to reconstruct and reload the
% network object.
persistent yolov2Obj;

if isempty(yolov2Obj)
    yolov2Obj = coder.loadDeepLearningNetwork('yolov2ResNet50VehicleExample.mat');
end

% pass in input
[bboxes,~,labels] = yolov2Obj.detect(in,'Threshold',0.5);
outImg = in;

% convert categorical labels to cell array of character vectors 
labels = cellstr(labels);


if ~(isempty(bboxes) && isempty(labels))
% Annotate detections in the image.
    outImg = insertObjectAnnotation(in,'rectangle',bboxes,labels);
end

To generate code, create a code configuration object for a MEX target and set the target language to C++. Use the coder.DeepLearningConfig function to create a MKL-DNN deep learning configuration object. Assign this object to the DeepLearningConfig property of the code configuration object. Specify the input size as an argument to the codegen command. In this example, the input layer size of the YOLO v2 network is [224,224,3].

cfg = coder.config('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('mkldnn');
codegen -config cfg yolov2_detection -args {ones(224,224,3,'uint8')} -report
Code generation successful: To view the report, open('codegen\mex\yolov2_detection\html\report.mldatx')

Run the Generated MEX Function on Example Input

Set up a video file reader and read the example input video highway_lanechange.mp4. Create a video player to display the video and the output detections.

videoFile = 'highway_lanechange.mp4';
videoFreader = VideoReader(videoFile);
depVideoPlayer = vision.DeployableVideoPlayer('Size','Custom','CustomSize',[640 480]);

Read the video input frame by frame and detect the vehicles in the video by using the detector.

while hasFrame(videoFreader)
    I = readFrame(videoFreader);
    in = imresize(I,[224,224]);
    out = yolov2_detection_mex(in);
    depVideoPlayer(out);
    cont = hasFrame(videoFreader) && isOpen(depVideoPlayer); % Exit the loop if the video player figure window is closed
end
clear videoFreader

References

[1] Redmon, Joseph, and Ali Farhadi. "YOLO9000: Better, Faster, Stronger." In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 6517–25. Honolulu, HI: IEEE, 2017.

See Also

|

Related Examples

More About