Semantic Segmentation on NVIDIA DRIVE

This example shows how to generate and deploy a CUDA® executable for an image segmentation application that uses deep learning. It uses the GPU Coder™ Support Package for NVIDIA® GPUs to deploy the executable on the NVIDIA DRIVE™ platform. Code generation runs on the host computer, and the generated code is built on the target platform by using the remote build capability of the support package. For more information, see Code Generation for Semantic Segmentation Network.

Prerequisites

Target Board Requirements

  • NVIDIA DRIVE PX2 embedded platform.

  • Ethernet crossover cable to connect the target board and host PC (if you cannot connect the target board to a local network).

  • NVIDIA CUDA toolkit installed on the board.

  • NVIDIA cuDNN library (v5 or later) on the target.

  • OpenCV 3.0 or higher library on the target for reading and displaying images.

  • Environment variables on the target for the compilers and libraries. For more information, see Install and Setup Prerequisites for NVIDIA Boards.

Development Host Requirements

  • GPU Coder for CUDA code generation. For a tutorial, see Get Started with GPU Coder.

  • Deep Learning Toolbox™ to use a DAG network object.

  • NVIDIA CUDA toolkit on the host.

  • GPU Coder Interface for Deep Learning Libraries support package. To install this support package, use the MATLAB® Add-On Explorer.

  • Environment variables for the compilers and libraries. For more information, see Third-Party Hardware and Setting Up the Prerequisite Products.

Create a Folder and Copy Relevant Files

The following line of code creates a folder in your current working folder on the host and copies all the relevant files into this folder. If you cannot generate files in this folder, before running this command, change your current working folder.

gpucoderdemo_setup('segnet_deploy');

Connect to the NVIDIA Hardware

The GPU Coder Support Package for NVIDIA GPUs uses an SSH connection over TCP/IP to execute commands while building and running the generated CUDA code on the DRIVE platforms. Connect the target platform to the same network as the host computer or use an Ethernet crossover cable to connect the board directly to the host computer. For information on how to set up and configure your board, see NVIDIA documentation.

To communicate with the NVIDIA hardware, create a live hardware connection object by using the drive function. To create this object, you must know the host name or IP address, user name, and password of the target board. For example, when connecting to the target board for the first time, create a live object for the DRIVE hardware by using the command:

hwobj = drive('drive-px2-name','ubuntu','ubuntu');

While creating the live hardware object, the support package performs hardware and software checks, installs the IO server, and gathers peripheral information from the target. This information is displayed in the Command Window.

If the connection fails, a diagnostic error message is reported at the MATLAB command line. The most likely cause of a failed connection is an incorrect IP address or host name.
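The failure case described above can be handled with a standard try/catch block. This is a minimal sketch, not part of the shipped example: the host name and credentials are the same placeholders used earlier, and the message text is illustrative.

```matlab
% Placeholder host name and credentials; replace with your board's values.
try
    hwobj = drive('drive-px2-name','ubuntu','ubuntu');
catch err
    % Report the diagnostic message instead of stopping the script.
    fprintf('Connection to the target failed: %s\n', err.message);
    fprintf('Check the IP address or host name, user name, and password.\n');
end
```

Catching the error keeps the rest of a script from running against a missing hwobj only if later steps check that the variable exists before using it.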

Verify GPU Environment on Target Board

To verify that the compilers and libraries necessary for running this example are set up correctly, use the coder.checkGpuInstall function.

envCfg = coder.gpuEnvConfig('drive');
envCfg.BasicCodegen = 1;
envCfg.Quiet = 1;
envCfg.HardwareObject = hwobj;
coder.checkGpuInstall(envCfg);

Get Pretrained SegNet DAG Network Object

net = getSegNet();

The DAG network contains 91 layers including convolution, batch normalization, pooling, unpooling, and the pixel classification output layers. To see all the layers of the network, use the analyzeNetwork function.

Generate CUDA Code for the Target Board Using GPU Coder

This example uses the segnet_predict.m file as the entry-point function for code generation. To generate a CUDA executable that you can deploy to an NVIDIA target, create a GPU code configuration object for generating an executable.

cfg = coder.gpuConfig('exe');

% When there are multiple live connection objects for different targets,
% the code generator performs the remote build on the target board for
% which the most recent live object was created. To choose a different
% board for the remote build, call the setupCodegenContext() method of the
% corresponding live hardware object. If only one live connection object
% was created, you do not need to call this method.
%
%   hwobj.setupCodegenContext;

To create a configuration object for the DRIVE platform and assign it to the Hardware property of the code configuration object cfg, use the coder.hardware function.

cfg.Hardware = coder.hardware('NVIDIA Drive');

To specify the folder for performing the remote build process on the target board, use the BuildDir property. If the specified build folder does not exist on the target board, the software creates a folder with the given name. If no value is assigned to cfg.Hardware.BuildDir, the remote build process occurs in the last specified build folder. If there is no stored build folder value, the build process takes place in the home folder.

cfg.Hardware.BuildDir = '~/remoteBuildDir';

On NVIDIA platforms such as DRIVE PX2 that contain multiple GPUs, use the SelectCudaDevice property in the GPU configuration object to select a specific GPU.

cfg.GpuConfig.SelectCudaDevice = 0;

The custom main.cu file is a wrapper that calls the predict function in the generated code. The main file also performs postprocessing by using OpenCV interfaces. The output of the SegNet prediction is an 11-channel image, where each channel holds the prediction scores for one of 11 classes. In postprocessing, each pixel is assigned the class label with the maximum score across the 11 channels. Each class is associated with a unique color for visualization. The final output is shown by using the OpenCV imshow function.
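The per-pixel class assignment can be illustrated in MATLAB. This is a sketch of the same logic only, not the code in main.cu, which uses OpenCV. The random scores stand in for the network output, and the color table is arbitrary.

```matlab
% Hypothetical stand-in for the network output: random scores for a
% 360-by-480 image with 11 classes (height x width x channels).
numClasses = 11;
scores = rand(360, 480, numClasses, 'single');

% For each pixel, pick the channel (class) with the maximum score.
[~, labels] = max(scores, [], 3);    % labels is 360-by-480, values 1..11

% Illustrative color table: one RGB row per class (colors are arbitrary).
cmap = uint8(255 * hsv(numClasses)); % 11-by-3

% Map each label to its color to build an RGB visualization.
segmented = reshape(cmap(labels(:), :), [size(labels), 3]);
```

Taking the maximum along the third dimension returns the index of the winning channel, which serves directly as the class label and as the row index into the color table.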

cfg.CustomSource  = fullfile('main.cu');

In this example, code generation uses an image as the input to the network. However, the custom main file is coded to take video as input and perform a SegNet prediction for each frame in the video sequence. The compiler and linker flags required to build the executable with the OpenCV library are set in the buildinfo section of the segnet_predict.m file.
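The entry-point file is not reproduced here, so the following is only a sketch of what such a function might look like. The MAT-file name SegNet.mat and the pkg-config flag string are assumptions, and the file shipped with the example may differ.

```matlab
function out = segnet_predict(in)
%#codegen
% Sketch of an entry-point function; the shipped file may differ.

% Load the network once and reuse it across calls.
% 'SegNet.mat' is an assumed file name for the saved network.
persistent mynet;
if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('SegNet.mat');
end

% Assumed build flags: ask pkg-config on the target for the OpenCV
% compile and link options.
opencvFlags = '`pkg-config --cflags --libs opencv`';
coder.updateBuildInfo('addCompileFlags', opencvFlags);
coder.updateBuildInfo('addLinkFlags', opencvFlags);

% Run inference; the output holds the per-class scores for each pixel.
out = predict(mynet, in);
end
```

Declaring the network as a persistent variable means it is loaded on the first call only, which matters when the main file calls the function once per video frame.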

Generate sample image input for code generation.

img = imread('peppers.png');
img = imresize(img,[360 480]);

To generate CUDA code, use the codegen function and pass the GPU code configuration and the size of the inputs for the segnet_predict.m entry-point function. After code generation completes on the host, the generated files are copied to the target board and built there.

codegen('-config', cfg, 'segnet_predict', '-args', {img}, '-report');

Run Executable on Target Board

Copy the input test video to the target workspace folder, using the workspaceDir property of the hardware object. This property contains the path to the codegen folder on the target board.

hwobj.putFile('CamVid.avi', hwobj.workspaceDir);

To launch the executable on the target hardware, use the runApplication() method of the hardware object.

hwobj.runApplication('segnet_predict','CamVid.avi');

The segmented image output is displayed in a window on the monitor that is connected to the target board.

You can stop the running executable on the target board from the MATLAB environment on the host by using the killApplication() method of the hardware object. This method uses the name of the application and not the name of the executable.

hwobj.killApplication('segnet_predict');

Cleanup

To remove the example files and return to the original folder, call the cleanup function.

cleanup