Comparing the performance of colon, loop, and cell indexing

14 vues (au cours des 30 derniers jours)
Benjamin D'Anjou
Benjamin D'Anjou le 21 Mai 2020
I will first state the problem and then ask my question.
PROBLEM
I have this code where repeatedly multiply a vector by a square matrix and store the resulting vector. The vector has a dimension dim and it is multiplied by the matrix numSteps times.
I wrote three variations of this code. First, I store the vector using colon indexing. Second, I store the vector using loop indexing. Third, I store the vector in a cell array using cell indexing.
The three functions are the following. They are identical except for the way the data is stored.
Colon indexing:
function output = funcColon(dim,numSteps)
%FUNCCOLON Function that stores vectors using colon indexing
% Initialize the output and the vector
output = zeros(numSteps,dim);
vec0 = ones(dim,1);
% Choose a matrix
mat = rand(dim); mat = mat./sum(mat,1);
% Do repeated matrix multiplications
for k=1:numSteps
vec = mat*vec0;
vec0 = vec;
output(k,:) = vec; % Use colon indexing to store the results
end
end
Loop indexing:
function output = funcLoop(dim,numSteps)
%FUNCLOOP Function that stores vectors using a loop
% Initialize the output and the vector
output = zeros(numSteps,dim);
vec0 = ones(dim,1);
% Choose a matrix
mat = rand(dim); mat = mat./sum(mat,1);
% Do repeated matrix multiplications
for k=1:numSteps
vec = mat*vec0;
vec0 = vec;
for j=1:dim
output(k,j) = vec(j); % Use a loop to store the results
end
end
end
Cell indexing:
function output = funcCell(dim,numSteps)
%FUNCCELL Function that stores vectors using a cell array
% Initialize the output and the vector
output = cell(numSteps,1);
vec0 = ones(dim,1);
% Choose a matrix
mat = rand(dim); mat = mat./sum(mat,1);
% Do repeated matrix multiplications
for k=1:numSteps
vec = mat*vec0;
vec0 = vec;
output{k} = vec; % Store the vector in a cell array
end
end
I first set numSteps=100. I then ran each function for various values of dim and I timed them. I did this multiple times to accumulate statistics on the average runtime.
The results are shown in the attached figure. I take the average runtime of the colon indexing as a point of reference, i.e., the two curves show the ratio of the average runtimes of funcLoop and funcCell to the average runtime of funcColon.
There are basically two regimes.
First, there is the regime dim > 1. There, we have the hierarchy:
runtime loop < runtime cell < runtime colon
Second, there is the regime dim < 1. There, we have the hierarchy:
runtime loop < runtime colon < runtime cell
Note that in the limit of large dim, all functions tend to the same runtime because the runtime becomes limited by the matrix multiplication. But for small dim, the runtime is limited by the indexing. I checked that it is indeed the case with the Run & Time feature.
QUESTION
I'm quite confused by this result. I would have assumed that colon indexing would always be faster because it is a syntax designed specifically for matlab arrays. What is colon indexing doing that is taking so much additional runtime? Even cell array indexing is more efficient than colon indexing for dim > 1.
This seems to imply that I could rewrite an overloaded custom colon indexing function using a for loop and have it be more performant than MATLAB's native colon indexing.
Am I missing something? What are your suggestions for best practices with a code like that?

Réponse acceptée

Stephen23
Stephen23 le 21 Mai 2020
Modifié(e) : Stephen23 le 21 Mai 2020
"Am I missing something?"
1- accessing data in non-contiguous locations is slow. Consider your indexing output(k,:): because MATLAB stores arrays in column-major order, your indexing refers to lots of separate locations spread out through the entire array. Carefully going through the array, finding those locations, splitting up the input array and copying the parts to each separate location... is not efficient, and gets slower for larger arrays. You data design forces MATLAB to be slow.
If you access only contiguous blocks of memory you would find the code would be much faster: output(:,k)
2- MATLAB (probably) does not actually copy those arrays, so although you write that it "stores" the input array in a cell:
output{k} = vec; % Store the vector in a cell array
actually MATLAB (probably) just stores a pointer to the already existing input array, and creating one small pointer does not take much time at all. The original array (probably) does not move or get copied anywhere, until you write to any location in that array, at which point it will finally get copied... and in your code this never happens, so you are comparing apples with oranges here.
Read about copy-on-write:
Note that even if/when the input array is "copied" to the cell it would still be one contiguous array in memory!
"What are your suggestions for best practices with a code like that?"
  • Consider column-major in your design: if efficiency is important store data in contiguous locations.
  • Consider copy-on-write in your design.
  2 commentaires
Benjamin D'Anjou
Benjamin D'Anjou le 21 Mai 2020
I think I get it. I read the copy-on-write doc before but I don't think I had grasped everything you say.
Just one clarification:
"Note that even if/when the input array is "copied" to the cell it would still be one contiguous array in memory!"
Are you saying that cell arrays store data contiguously by design, i.e., each element of a cell array occupies a contiguous memory location? That would make sense.
Stephen23
Stephen23 le 21 Mai 2020
Modifié(e) : Stephen23 le 21 Mai 2020
"each element of a cell array occupies a contiguous memory location?"
Every cell contains a separate array, which are each stored in separate memory locations:

Connectez-vous pour commenter.

Plus de réponses (1)

Matt J
Matt J le 21 Mai 2020
Modifié(e) : Matt J le 21 Mai 2020
I checked that it is indeed the case with the Run & Time feature.
I don't know what that is, but the better test in any case is to rewrite the functions so that they contain only the indexing operations.
function output = funcColon(dim,numSteps,output,vec)
%FUNCCOLON Function that stores vectors using colon indexing
for k=1:numSteps
output(k,:) = vec(k,:); % Use colon indexing to store the results
end
end
function output = funcLoop(dim,numSteps,output,vec)
%FUNCLOOP Function that stores vectors using a loop
for k=1:numSteps
for j=1:dim
output(k,j) = vec(k,j); % Use a loop to store the results
end
end
end
function output = funcCell(dim,numSteps,output,vec)
%FUNCCELL Function that stores vectors using a cell array
for k=1:numSteps
output{k} = vec(k,:); % Store the vector in a cell array
end
end
The test code then gives me the following, less suprising results:
dim=20;
numSteps=1e5;
vec = rand(numSteps,dim);
output = zeros(numSteps,dim);
tColon = timeit(@()funcColon(dim,numSteps,output,vec)) %0.0173 seconds
output = zeros(numSteps,dim);
tLoop = timeit(@()funcLoop(dim,numSteps,output,vec)) %0.0207 seconds
output = cell(numSteps,1);
tCell = timeit(@()funcCell(dim,numSteps,output,vec)) %0.1120 seconds
  3 commentaires
Matt J
Matt J le 21 Mai 2020
Modifié(e) : Matt J le 21 Mai 2020
To really know what's going on you should probably post your latest test code so that we can run it as well. Below are the results from my own test code which I have improved still further to exclude the memory allocation inherent in vec0(k,:). In almost all cases tLoop is slightly poorer than tColon. It is to be expected that tCell gets better and better because, as Stephen pointed out, it requires essentially no data copying. Be mindful however that if you make the variable output is held as a cell array, it may be relatively quick to build, but the memory access time to pull the data out later can be a lot slower, depending on numSteps.
Test code:
clear all
Dims=1:5:400;
for i=1:numel(Dims)
dim=Dims(i);
numSteps=3e3;
vec0 = num2cell(rand(numSteps,dim),2);
output = zeros(numSteps,dim);
tColon(i) = timeit(@()funcColon(dim,numSteps,output,vec0));
output = zeros(numSteps,dim);
tLoop(i) = timeit(@()funcLoop(dim,numSteps,output,vec0));
output = cell(numSteps,1);
tCell(i) = timeit(@()funcCell(dim,numSteps,output,vec0));
end
semilogy(Dims,tLoop./tColon,'r', Dims,tCell./tColon,'b');
legend('tLoop / tColon','tCell / tColon')
function output = funcColon(dim,numSteps,output,vec0)
%FUNCCOLON Function that stores vectors using colon indexing
for k=1:numSteps
output(k,:) = vec0{k}; % Use colon indexing to store the results
end
end
function output = funcLoop(dim,numSteps,output,vec0)
%FUNCLOOP Function that stores vectors using a loop
for k=1:numSteps
vec=vec0{k};
for j=1:dim
output(k,j) = vec(j); % Use a loop to store the results
end
end
end
function output = funcCell(dim,numSteps,output,vec0)
%FUNCCELL Function that stores vectors using a cell array
for k=1:numSteps
output{k} = vec0{k}; % Store the vector in a cell array
end
end
Benjamin D'Anjou
Benjamin D'Anjou le 21 Mai 2020
Thanks. The behavior you observe is similar to the one I observe. I think now I understand better what to prioritize.
I will structure all my data in a 2D matrix and always loop through it column-wise. Then I'll reshape the array in whatever form I need at the end. I think that's probably the safest thing to do since the core of the program is basically a series of loops over the elements of a big array.

Connectez-vous pour commenter.

Catégories

En savoir plus sur Loops and Conditional Statements dans Help Center et File Exchange

Produits


Version

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by