ONNX Model Import and Inference Performance in MATLAB Is Significantly Slower Than Python (CPU & GPU)

73 views (in the last 30 days)
Ilan on 18 Dec 2025 at 13:12
Commented: Ilan on 23 Dec 2025 at 12:09
I am experiencing very slow ONNX model import and inference performance in MATLAB compared to Python when using the same ONNX file and hardware.
The ONNX model was exported from Python using the RF-DETR repository (nano model).
When running inference in MATLAB, both model loading time and prediction time (especially on GPU) are orders of magnitude slower than in Python.
MATLAB Reproduction Code
hTime = tic();
oNet = importNetworkFromONNX("rfdetr.onnx");
loadTime = toc(hTime);
fprintf('ONNX Model Load Time: %.4f seconds\n', loadTime);
sizeImg = 384;
% Load an image
tImg = imread("img.png");
tImgSmall = imresize(tImg,[sizeImg sizeImg]);
mImg = single(tImgSmall);
% Convert to dlarray
dlImg = dlarray(mImg,"SSC");
% Measure time for prediction on CPU
hTime = tic();
cpuPrediction = oNet.predict(dlImg);
cpuTime = toc(hTime);
fprintf('CPU Prediction Time: %.4f seconds\n', cpuTime);
%% Measure time for prediction on GPU
hTime = tic();
mGPUImg = gpuArray(dlImg);
[dlBoxes, dlLabels] = oNet.predict(mGPUImg);
gpuTime = toc(hTime);
fprintf('GPU Prediction Time: %.4f seconds\n', gpuTime);
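Note that the GPU number comes from a single first call, which in MATLAB typically includes the host-to-device transfer and one-time setup work. For comparison, a warm-up variant of the timing (a sketch, reusing the same `oNet` and `dlImg` as above) would look like:

```matlab
% Warm-up: the first GPU predict call pays one-time setup costs
mGPUImg = gpuArray(dlImg);
oNet.predict(mGPUImg);          % discard the first-call result
% Steady-state timing: average several runs, sync before stopping the clock
nRuns = 10;
hTime = tic();
for k = 1:nRuns
    [dlBoxes, dlLabels] = oNet.predict(mGPUImg);
end
wait(gpuDevice);                % ensure all queued GPU work has finished
gpuTimeAvg = toc(hTime) / nRuns;
fprintf('GPU Prediction Time (steady state): %.4f seconds\n', gpuTimeAvg);
```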
MATLAB Performance Results
ONNX Model Load Time: 30.5475 seconds
CPU Prediction Time: 10.3682 seconds
GPU Prediction Time: 512.8456 seconds
Python Performance Results (Same ONNX, Same Hardware)
ONNX Load: 0.56 seconds
CPU Inference: 57.57 ms
GPU Inference: 5.35 ms
Questions
  • Why is importNetworkFromONNX taking ~30 seconds in MATLAB while loading the same ONNX file in Python takes less than 1 second?
  • Why is inference significantly slower in MATLAB compared to Python (~10 seconds vs ~57 ms on CPU, and ~500 seconds vs ~5 ms on GPU)?
  • What is the recommended way to run ONNX object detection models efficiently in MATLAB?
Thanks!
  3 comments
Jiangyin
Jiangyin on 19 Dec 2025 at 15:18
Hi Ilan,
Re: "Why is importNetworkFromONNX taking ~30 seconds in MATLAB while loading the same ONNX file in Python takes less than 1 second":
In "importNetworkFromONNX", we translate the operations in the ONNX model into Deep Learning Toolbox layers or auto-generated custom layers in MATLAB, and this translation takes time. Importing the model is a one-time cost: once the model has been translated into a MATLAB network, you can save it and load it back whenever you need it.
% If the imported network is in a variable named 'net'
filename = 'trained_network.mat';
save(filename, 'net');    % save once, after import
load(filename)            % in later sessions, load instead of re-importing
Saving and reloading this way should give you much faster load times.
Re: performance comparison between MATLAB and ONNX:
Which MATLAB version are you using? We are actively enhancing Deep Learning Toolbox as well as the ONNX model importer. I can try the version you are using and investigate the reason for the performance differences.
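One way to narrow down where the time goes is to inspect the imported network. Transformer-style ONNX models such as RF-DETR are often imported with many auto-generated custom layers, which can run much slower than built-in layers, especially on GPU. A quick sketch:

```matlab
net = importNetworkFromONNX("rfdetr.onnx");
% Count built-in Deep Learning Toolbox layers vs auto-generated custom layers
layerClasses = arrayfun(@(l) string(class(l)), net.Layers);
isBuiltin = startsWith(layerClasses, "nnet.");
fprintf('%d built-in layers, %d auto-generated/custom layers\n', ...
    nnz(isBuiltin), nnz(~isBuiltin));
analyzeNetwork(net);   % interactive view of layer types and sizes
```

A high proportion of auto-generated custom layers would be consistent with the large gap you are seeing versus ONNX Runtime.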
Ilan
Ilan on 23 Dec 2025 at 12:09
Hi Jiangyin,
Thank you for the explanation.
I followed your suggestion, saved the imported network to a MAT-file, and reloaded it. This did improve the import time, but loading is still relatively slow: around 15 seconds on my setup.
I have uploaded the RF-DETR ONNX model that I am using. The link will be available for the next 3 days:
If possible, I would really appreciate it if you could take a look at this specific model and help identify the reason for the large performance gap I’m seeing between MATLAB and Python, especially for GPU inference.
My MATLAB version is R2024b Update 6 (24.2.0.2923080).
Thanks a lot for your help,
Ilan


Answers (0)
