Questions regarding preallocation and much more

Consider the following function:
function A = plodding(N,d)
for ii = 1:N
jj = 1;
A(ii,jj) = randn;
while abs(A(ii,jj)) < d
jj = jj + 1;
A(ii,jj) = randn;
end
end
Rewrite this function to eliminate the allocation problem that is slowing it down. Call the new function, cruising. On a Dell Latitude E6410, with 7.8 gigabytes of usable memory, eliminating the allocation problem produces a speed-up factor of 7.”
Here's my work:
The original version with rng(0)
function A = plodding(N,d)
rng(0); % To compare between the original and the modified version
for ii = 1:N
jj = 1;
A(ii,jj) = randn;
while abs(A(ii,jj)) < d
jj = jj + 1;
A(ii,jj) = randn;
end
end
end
The modified version
function A = cruising(N,d)
rng(0);
for jj = 1:N % Reorganize, so elems are added column-wise
ii = 1;
A(ii,jj) = randn;
while abs(A(ii,jj)) < d
ii = ii + 1;
A(ii,jj) = randn;
end
end
A = A'; % To get the matrix desired
end
But when I test:
tic;A = plodding(5,4.5);toc
Elapsed time is 0.176091 seconds.
>> tic;A1 = cruising(5,4.5);toc;
Elapsed time is 39.285447 seconds.
>>>B = A - A1; sum(B(:))
ans =
0
So certainly A = A1.
Based on what I learned from the lesson, my logic should be right, because MATLAB stores elements column-wise. Could someone please help me????

5 commentaires

As far as pre-allocation is concerned, for these kind of problems it is often best to create an initial array that you are very confident will be large enough to hold all the values required and trim it at the end if required.
e.g.
A = zeros( N, 10 );
or in this case possibly
A = randn( N, 10 );
would be most efficient, but I don't have time to check whether it is or not - in that case you would have to replace values with 0s later where your condition has already been satisfied so that may well result in more replacements than starting off with zeros.
Certainly you can do the first column upfront after pre-allocation though as e.g.
A(:,1) - randn(N,1);
Obviously 10 may not be an appropriate number to preallocate, it depends on what the input d is and the probability of successive random numbers all being below d to give a good estimate for preallocation.
Stephen23
Stephen23 le 31 Juil 2015
Modifié(e) : Stephen23 le 31 Juil 2015
@Huy Truong: I just edited your question and removed all of the empty lines from your code. Every second line was empty which makes it a complete pain to read, and it probably makes it a complete pain to write as well. In future please do not put an empty line between every line of your code.
@Adam: I thought about it again and this problem is not asking for preallocation, since it would be very hard to determine the final size of the output, even though your way of handling preallocation is very good.
This problem requires reorganizing, i.e. reorganize the loops to make the code becomes more efficient. Since MATLAB stores the values in column-major order, it'll be best to reorganize if possible.
Here's an example of reorganizing, in case I don't make my explanation clear enough:
function A = so_fast(N) % explicit function instead of using rand(N)
for jj = 1:N
for ii = 1:N
A(ii,jj) = rand;
end
end
end
My question still remains. I still need some help...
Adam
Adam le 31 Juil 2015
Well, you didn't give the arguments with which the function is to be called to produce a factor of 7 speedup. With a function like that the speedup will certainly not be linear in the inputs.
When I run your code with 10000, 2 as the two input arguments I get 7.36s for plodding and ~0.114s for cruising which is a speedup of the order of 60 times. That is certainly surprising to me since I wasn't aware just allocating in columns vs rows could make such a difference, but then I rarely use arrays that are resizing dynamically in that manner.
Huy Truong
Huy Truong le 31 Juil 2015
Yeah, rearrange loop index can help greatly, especially when the size of the final output is hard to determine. For problems that you can use both preallocating and rearranging index, the run-time is reduced even more than you can imagine. This is what I learned from my course. We can discuss more about this if you are more interested in. But you know, I'm even more surprised than you can imagine... So For the arguments (10000,2), plodding gives 7.36s, while cruising gives about 0.114s. BUT for (5,5), plodding gives 1.16s, while cruising is like forever (try it yourself if you want to check). Can you explain why? This suggests something about MATLAB that I still don't understand.

Connectez-vous pour commenter.

Réponses (1)

Philip Borghesani
Philip Borghesani le 31 Juil 2015
Modifié(e) : Philip Borghesani le 31 Juil 2015
The problem here is that the shape of the output and the algorithm determine the number of times the array needs to be reallocated. Plodding is faster when N is small and d is large, cruiseing is faster when N is large and d is small. Take a look at this code (which I do not claim is optimal but runs ok in both cases)
function A = warping(N,d)
rng(0);
A=zeros(1,N);
for jj = 1:N % Reorganize, so elems are added column-wise
ii = 1;
%A(ii,jj) = randn;
t(ii) = randn;
while abs(t(ii)) < d
ii = ii + 1;
t(ii) = randn;
end
A(1:ii,jj)=t(1:ii);
end
A = A'; % To get the matrix desired
end
Try all three with the inputs of (1000,3.5) (both values fairly large) your two do poorly and warping does much better.
I think the fastest solution would be to collect the rows of the final output into different cell arrays and then build the final matrix by concatenation or indexing after all the data has been calculated.

2 commentaires

You may want to have a look at this result, which I and the other one have discussed above.
>> tic;A = plodding(10000,2);toc
Elapsed time is 9.289355 seconds.
>> tic;A1 = cruising(10000,2);toc;
Elapsed time is 0.078783 seconds.
>> tic;A = plodding(5,5);toc
Elapsed time is 1.168961 seconds.
>> tic;A1 = cruising(5,5);toc;
% When I posted this thread, it's already more than 10 mins and MATLAB's still "busy"!
Can you explain this? It looks like you are on the right tract of understanding this behavior of MATLAB, but I still don't get your explanation. Based on what I learned, cruising should perform faster than plodding in ANY case . This behavior of MATLAB suggests that there's something about it that I still don't quite grasp.
The number of reallocations of A determines how long the function will take. This overwhelms any advantage cruising has due to access order. Given a return value of A(m,n) plodding will reallocate A a minimum of m times and cruising will reallocate A a minimum of n times.

Connectez-vous pour commenter.

Catégories

Question posée :

le 31 Juil 2015

Commenté :

le 10 Août 2015

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by