MATLAB Answers

How can I improve code efficiency?

5 views (last 30 days)
Hello Experts,
I am a novice in this and would like to get help / advice.
I have written the code to successfully load and unpack the CAN message data through through for-loop. Once the messages are unpacked I am then writing it into a table for further processing / plotting work. Currently the code does the job, but it takes around 15mins (depending on CAN data) to unpack message from ~200k lines (yes it is ~200k). This can take even longer depending on the size of CAN data file I am receiving, so I would like to improve the code to reduce the time it takes to unpack the CAN message. I am seeking experts advice on improving the code (if it is possible) to reduce the processing time.
To make it clear, I will briefly explain how I am unpacking the message with the code below which is done in two steps.
CAN file has data recorded for every milli second with repetative messages over time. Every message in the CAN file have multiple signals / channels logged in at the respective time step. First, I am reading the message name through outer for-loop for first time step (say 0.1 second). Then inner for-loop receives the message name and extracts the signals stored under the message name for that time step and writes the signal information in to a table.
for i = 1:length(unique(canData.Name))
Signals = Msgs{idx(i)};
MsgName = Msgs{idx(i),2}{1};
for j = 1:size(Signals,1)
db = Signals(j,:);
value = double(unpack(Messages(i), db.StartBit, db.SignalSize, char(db.ByteOrder), char(db.Class)))* db.Factor + db.Offset;
Name(j+offset) = db.Name(1);
Time(j+offset) = Messages(i).Timestamp;
Value(j+offset) = value;
end
offset = offset + size(Signals,1);
end
This process repeats for ~200k lines which currently takes long time to unpack the message. I have attached a screenshot of canMsgs variable with red box to show an example message name and signal info and their repeat over time.
Please help me in improving the code efficiency, let me know if you need any additional information.
Thanks in Advance. :)

  7 Comments

Show 4 older comments
Logesh Velusamy
Logesh Velusamy on 21 Jan 2020
Hi Guillaume,
Apologies for not being very clear.
'canData' is the data only part extracted from 'canMsgs' which does not include any Name or Signals.
'idx' is the matlab index number of table data which has the unique CAN Message names.
'Name' - Name of signal extracted from CAN message/Data file
'Time' - Time at which the above signal (Name) was logged in a CAN file
'Value' - The data recorded for the above signal (Name) at the respective Time
'Messages' - HEX formatted CAN message read and converted into normal data format using 'canMessage' (a Matlab function)
I hope it is understandable now.
Thanks.
Guillaume
Guillaume on 21 Jan 2020
Still very confused
"'canData' is the data only part extracted from 'canMsgs' which does not include any Name or Signals"
On the first line of the code you've posted, you extract Name from canData!
What are the classes of Name, Time and Value? You said you're decoding the messages into a table, but these are clearly not tables.
Anyway, from the profile results, in particular file8, which is the script profile, you can see that it spends 50% of the time on the unpack line and 25% of the time on the dbInfo = SignalInfoTable(j,:); On the unpack line, I'm not sure if the time spent is by unpack itself or the table indexing. The two indexings could be combined into one which would probably perform better.
Can you attach the whole Can_Test.m? it would be useful if we saw the actual code you're using, not a slightly edited version.
Logesh Velusamy
Logesh Velusamy on 21 Jan 2020
Sorry, I might be confusing you a lot. I will attach my code below for you to look.

Sign in to comment.

Accepted Answer

Logesh Velusamy
Logesh Velusamy on 6 Feb 2020
Edited: Logesh Velusamy on 6 Feb 2020
Thanks to everyone who helped me with this issue.
I have re-written the program with the help of the MathWorks support team who suggested the use of in-build functions rather looping and this has brought down the runtime by ~98%.
I have shown my modified code below if in-case anyone needs it.
%% Import CAN Message and Database File for Message table creation
msgTable = canMessageImport('Msg.asc','vector',canDatabase(canDBC),'OutputFormat','timetable');
%% Time in the Message table is rounded to "1" decimal place to avoid odd timestamp in final table
msgTable.Time = seconds(round(seconds(msgTable.Time),1));
%% Signal Extraction from Message Table
Sig = canSignalTimetable(msgTable);
%% Assigning field names to a variable for looping through to synchronize or concatenate the signals
FieldNames = fields(Sig);
Data = Sig.(FieldNames{1});
%% Individual Signal tables are "synchronized" based on timestamp in loop
for i=2:length(FieldNames)
Data = synchronize(Data, Sig.(FieldNames{i}));
end
%% Replacing missing or NAN values to "-999"
Data = fillmissing(Data, 'Constant', -999);
Thanks.

  0 Comments

Sign in to comment.

More Answers (1)

Bjorn Gustavsson
Bjorn Gustavsson on 21 Jan 2020
First step of improving code is to run the scripts and functions with the profiler on (seel documentation and help for profile) so that you can see what parts of your code uses most of the time. Once that is determined you can start to try to pick off the parts where most of the time is spent.
Even before that you should have a look at the code-advices you get in the matlab-editor. It often points out where things looks a bit dodgy (though the editor code analysis doesn't know exactly the sizes of your variables or number of iterations and such), so you can start to clean out those warnings too.
In your case it seems that you will dynamically grow the sizes of your variables Name, Time and Value. That will start to become expensive when you increase the variable-sizes in increments of one. It is much prefered to preallocate the variables. Something like this:
maxSizeSignal = 12; % you might know this number, you might have to determine it before the loop
Value = zeros(1,length(unique(canData.Name))*maxSizeSignal); % this might be a little bit large
% then proceed as before.
Then after the loop you might prefer to cut the excess:
Value = Value(1:offset);
HTH

  3 Comments

Logesh Velusamy
Logesh Velusamy on 21 Jan 2020
Hi Bjorn,
Thanks for your effort. Apologies, I forgot to add to my question that I am pre-defining the variables with an empty table before starting the loop.
Bjorn Gustavsson
Bjorn Gustavsson on 21 Jan 2020
Hello Logesh,
OK, I'll wait for the profile-report. Then we can continue. Your description of "pre-defining variables with an empty table" sounds confusing to me, it is the actual variables Value, Name and Time that should individually be initialized to their "final full size" to avoid time-wasting on memmory reallocation. But that will now wait for more information...
Logesh Velusamy
Logesh Velusamy on 21 Jan 2020
Hello Bjorn,
I have the profiler report updated in the link below.
"Pre-defining" variables is with the actual variable names.

Sign in to comment.

Sign in to answer this question.


Translated by