Code Generation for Deep Learning Networks with ARM Compute Library
With MATLAB® Coder™, you can generate code for prediction from an already trained neural network, targeting an embedded platform that uses an ARM® processor that supports the NEON extension. The code generator takes advantage of the ARM Compute Library for computer vision and machine learning. The generated code implements a neural network that has the architecture, layers, and parameters specified in the input SeriesNetwork (Deep Learning Toolbox) or DAGNetwork (Deep Learning Toolbox) network object.
Generate code by using one of these methods: the codegen command or the MATLAB Coder app.
Requirements
MATLAB Coder Interface for Deep Learning. To install the support package, select it from the MATLAB Add-Ons menu.
ARM Compute Library for computer vision and machine learning must be installed on the target hardware.
Deep Learning Toolbox™.
Environment variables for the compilers and libraries.
Note
The ARM Compute Library version that the examples in this topic use might not be the latest version that code generation supports. For supported versions of libraries and for information about setting up environment variables, see Prerequisites for Deep Learning with MATLAB Coder.
Code Generation by Using codegen
To generate code for deep learning on an ARM target by using codegen:

Write an entry-point function that loads the pretrained convolutional neural network (CNN) and calls predict. For example:

function out = squeezenet_predict(in) %#codegen

persistent net;

opencv_linkflags = '`pkg-config --cflags --libs opencv`';
coder.updateBuildInfo('addLinkFlags',opencv_linkflags);

if isempty(net)
    net = coder.loadDeepLearningNetwork('squeezenet', 'squeezenet');
end

out = net.predict(in);

end
If your target hardware is Raspberry Pi®, you can take advantage of the MATLAB Support Package for Raspberry Pi Hardware. With the support package, codegen moves the generated code to the Raspberry Pi and builds the executable program on the Raspberry Pi. When you generate code for a target that does not have a hardware support package, you must run commands to move the generated files and build the executable program.

MEX generation is not supported for code generation for deep learning on ARM targets.
For ARM, for inputs to predict (Deep Learning Toolbox) with multiple images or observations (N > 1), a MiniBatchSize of greater than 1 is not supported. Specify a MiniBatchSize of 1.
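For example, a hypothetical entry-point function for batched input could pass the batch size explicitly (the function name and the one-argument coder.loadDeepLearningNetwork call here are illustrative, not from the original example):

```matlab
function out = squeezenet_batch_predict(in) %#codegen
% Sketch of an entry-point function for inputs with multiple
% observations (N > 1) on an ARM target.
persistent net;
if isempty(net)
    net = coder.loadDeepLearningNetwork('squeezenet');
end
% On ARM, a MiniBatchSize greater than 1 is not supported, so the
% batch size is pinned to 1 even for multi-observation inputs.
out = net.predict(in, 'MiniBatchSize', 1);
end
```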
Code Generation for Deep Learning on a Raspberry Pi
When you have the MATLAB Support Package for Raspberry Pi Hardware, to generate code for deep learning on a Raspberry Pi:
To connect to the Raspberry Pi, use raspi. For example:

r = raspi('raspiname','username','password');
Create a code generation configuration object for a library or executable by using coder.config. Set the TargetLang property to 'C++'.

cfg = coder.config('exe');
cfg.TargetLang = 'C++';
Create a deep learning configuration object by using coder.DeepLearningConfig. Set the ArmComputeVersion and ArmArchitecture properties. Set the DeepLearningConfig property of the code generation configuration object to the coder.ARMNEONConfig object. For example:

dlcfg = coder.DeepLearningConfig('arm-compute');
dlcfg.ArmArchitecture = 'armv7';
dlcfg.ArmComputeVersion = '20.02.1';
cfg.DeepLearningConfig = dlcfg;
To configure code generation hardware settings for the Raspberry Pi, create a coder.Hardware object by using coder.hardware. Set the Hardware property of the code generation configuration object to the coder.Hardware object.

hw = coder.hardware('Raspberry Pi');
cfg.Hardware = hw;
If you are generating an executable program, provide a C++ main program. For example:
cfg.CustomSource = 'main.cpp';
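The main program is not generated for you. A minimal sketch of such a main.cpp, assuming the generated header name and the squeezenet_predict entry-point signature shown in the Generated Code section below (the placeholder input and the class-selection loop are illustrative):

```cpp
// Illustrative main.cpp sketch. The header name and entry-point
// signature come from the generated code for squeezenet_predict;
// this file does not compile without those generated sources.
#include "squeezenet_predict.h"

int main() {
    // 227 x 227 x 3 single-precision input, zero-initialized as a
    // placeholder; a real application would load image data here.
    static real32_T in[154587] = {0};
    static real32_T out[1000];

    squeezenet_predict(in, out);

    // Use the 1000-element output, for example to find the index of
    // the highest-scoring class.
    int best = 0;
    for (int i = 1; i < 1000; ++i) {
        if (out[i] > out[best]) {
            best = i;
        }
    }
    return best == 0 ? 0 : 0;
}
```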
To generate code, use codegen. Specify the code generation configuration object by using the -config option. For example:

codegen -config cfg squeezenet_predict -args {ones(227, 227, 3,'single')} -report
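Taken together, the steps above can be collected into a single script. This sketch reuses the names and values from the examples; the device credentials are placeholders:

```matlab
% Connect to the Raspberry Pi (placeholder credentials).
r = raspi('raspiname','username','password');

% Configure C++ code generation for an executable.
cfg = coder.config('exe');
cfg.TargetLang = 'C++';

% Configure deep learning code generation with the ARM Compute Library.
dlcfg = coder.DeepLearningConfig('arm-compute');
dlcfg.ArmArchitecture = 'armv7';
dlcfg.ArmComputeVersion = '20.02.1';
cfg.DeepLearningConfig = dlcfg;

% Target the Raspberry Pi hardware and supply the C++ main program.
hw = coder.hardware('Raspberry Pi');
cfg.Hardware = hw;
cfg.CustomSource = 'main.cpp';

% Generate code, build on the Raspberry Pi, and produce a report.
codegen -config cfg squeezenet_predict -args {ones(227, 227, 3,'single')} -report
```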
Note
You can specify half-precision inputs for code generation. However, the code generator casts these inputs to single precision. Deep Learning Toolbox uses single-precision, floating-point arithmetic for all computations in MATLAB.
Code Generation When You Do Not Have a Hardware Support Package
To generate code for deep learning when you do not have a hardware support package for the target:
Generate code on a Linux® host only.
Create a configuration object for a library. For example:
cfg = coder.config('lib');
Do not use a configuration object for an executable program.
Configure code generation to generate C++ code and to generate source code only.

cfg.GenCodeOnly = true;
cfg.TargetLang = 'C++';
To specify code generation with the ARM Compute Library, create a coder.ARMNEONConfig object by using coder.DeepLearningConfig. Set the ArmComputeVersion and ArmArchitecture properties. Set the DeepLearningConfig property of the code generation configuration object to the coder.ARMNEONConfig object.

dlcfg = coder.DeepLearningConfig('arm-compute');
dlcfg.ArmArchitecture = 'armv7';
dlcfg.ArmComputeVersion = '20.02.1';
cfg.DeepLearningConfig = dlcfg;
To configure code generation parameters that are specific to the target hardware, set the ProdHWDeviceType property of the HardwareImplementation object.

For the ARMv7 architecture, use 'ARM Compatible->ARM Cortex'.

For the ARMv8 architecture, use 'ARM Compatible->ARM 64-bit (LP64)'.

For example:

cfg.HardwareImplementation.ProdHWDeviceType = 'ARM Compatible->ARM 64-bit (LP64)';
To generate code, use codegen. Specify the code generation configuration object by using the -config option. For example:

codegen -config cfg squeezenet_predict -args {ones(227, 227, 3, 'single')} -d arm_compute
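As with the Raspberry Pi workflow, these steps can be collected into one script. This sketch reuses the values from the examples, targeting the ARMv7 architecture:

```matlab
% Library configuration; generate C++ source code only, on a Linux host.
cfg = coder.config('lib');
cfg.GenCodeOnly = true;
cfg.TargetLang = 'C++';

% ARM Compute Library settings.
dlcfg = coder.DeepLearningConfig('arm-compute');
dlcfg.ArmArchitecture = 'armv7';
dlcfg.ArmComputeVersion = '20.02.1';
cfg.DeepLearningConfig = dlcfg;

% Device type for the ARMv7 architecture.
cfg.HardwareImplementation.ProdHWDeviceType = 'ARM Compatible->ARM Cortex';

% Generate source code into the arm_compute folder.
codegen -config cfg squeezenet_predict -args {ones(227, 227, 3, 'single')} -d arm_compute
```

You must then run commands to move the generated files to the target hardware and build the executable program there.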
For an example, see Generate Code and Deploy SqueezeNet Network to Raspberry Pi.
Generated Code
The series network is generated as a C++ class containing an array of layer classes.

class b_squeezenet_0
{
  public:
    int32_T batchSize;
    int32_T numLayers;
    real32_T *inputData;
    real32_T *outputData;
    MWCNNLayer *layers[68];
  private:
    MWTargetNetworkImpl *targetImpl;
  public:
    b_squeezenet_0();
    void presetup();
    void postsetup();
    void setup();
    void predict();
    void cleanup();
    real32_T *getLayerOutput(int32_T layerIndex, int32_T portIndex);
    ~b_squeezenet_0();
};
The setup() method of the class sets up handles and allocates memory for each layer of the network object. The predict() method invokes prediction for each of the layers in the network. Suppose that you generate code for an entry-point function, squeezenet_predict. In the generated "for you" file, squeezenet_predict.cpp, the entry-point function squeezenet_predict() constructs a static object of b_squeezenet_0 class type and invokes setup and predict on the network object.
static b_squeezenet_0 net;
static boolean_T net_not_empty;

// Function Definitions
//
// A persistent object net is used to load the DAGNetwork object.
// At the first call to this function, the persistent object is constructed and
// set up. When the function is called subsequent times, the same object is reused
// to call predict on inputs, avoiding reconstructing and reloading the
// network object.
// Arguments    : const real32_T in[154587]
//                real32_T out[1000]
// Return Type  : void
//
void squeezenet_predict(const real32_T in[154587], real32_T out[1000])
{
  // Copyright 2018 The MathWorks, Inc.
  if (!net_not_empty) {
    DeepLearningNetwork_setup(&net);
    net_not_empty = true;
  }

  DeepLearningNetwork_predict(&net, in, out);
}
Binary files are exported for layers that have parameters, such as fully connected and
convolution layers in the network. For example, the files with names having the pattern
cnn_squeezenet_*_w
and cnn_squeezenet_*_b
correspond to weights and bias parameters for the convolution layers in the
network.
cnn_squeezenet_conv10_b
cnn_squeezenet_conv10_w
cnn_squeezenet_conv1_b
cnn_squeezenet_conv1_w
cnn_squeezenet_fire2-expand1x1_b
cnn_squeezenet_fire2-expand1x1_w
cnn_squeezenet_fire2-expand3x3_b
cnn_squeezenet_fire2-expand3x3_w
cnn_squeezenet_fire2-squeeze1x1_b
cnn_squeezenet_fire2-squeeze1x1_w
...
int8 Code Generation

For information about generating code that performs int8 quantized inference, see Generate int8 Code for Deep Learning Networks.
Code Generation by Using the MATLAB Coder App
Complete the Select Source Files and Define Input Types steps.
Go to the Generate Code step. (Skip the Check for Run-Time Issues step because MEX generation is not supported for code generation with the ARM Compute Library.)
Set Language to C++.
Specify the target ARM hardware.
If your target hardware is Raspberry Pi and you installed the MATLAB Support Package for Raspberry Pi Hardware:
For Hardware Board, select Raspberry Pi.

To access the Raspberry Pi settings, click More Settings. Then, click Hardware. Specify the Device Address, Username, Password, and Build directory.
When you do not have a support package for your ARM target:
Make sure that Build type is Static Library or Dynamic Library, and select the Generate code only check box.

For Hardware Board, select None - Select device below.

For Device vendor, select ARM Compatible.

For the Device type:

For the ARMv7 architecture, select ARM Cortex.

For the ARMv8 architecture, select ARM 64-bit (LP64).
Note
If you generate code for deep learning on an ARM target, and do not use a hardware support package, generate code on a Linux host only.
In the Deep Learning pane, set Target library to ARM Compute. Specify ARM Compute Library version and ARM Compute Architecture.

Generate the code.
See Also
coder.loadDeepLearningNetwork | coder.DeepLearningConfig | coder.ARMNEONConfig
Related Topics
- Deep Learning Prediction with ARM Compute Using codegen
- Generate Code and Deploy SqueezeNet Network to Raspberry Pi
- Generate int8 Code for Deep Learning Networks
- Workflow for Deep Learning Code Generation with MATLAB Coder
- Code Generation for Deep Learning Networks with MKL-DNN
- Code Generation for Deep Learning Networks by Using cuDNN (GPU Coder)
- Code Generation for Deep Learning Networks by Using TensorRT (GPU Coder)