Can I vectorize my loop?
I have an array of numbers that I wish to thin out: each value should be kept only if it exceeds the previously kept value by more than a defined tolerance. I don't believe that the diff() function does what I need, so I think I will need something more bespoke. Currently I perform this task using the following for loop, but it is very slow when processing large amounts of data.
MinInterval=0.1;
KeptData(1,1)=OriginalData(1,1);
n=1;
for p=1:length(OriginalData)-1
    difference=OriginalData(p+1,1)-KeptData(n,1);
    if difference<=MinInterval
        %do nothing
    else
        n=n+1;
        KeptData(n,1)=OriginalData(p+1,1);
    end
end
For example,
OriginalData=[1; 2; 3; 3.1; 3.2; 3.3; 3.4; 3.5; 3.6; 4; 5];
assuming a tolerance of 0.1 would become
KeptData=[1; 2; 3; 3.2; 3.4; 3.6; 4; 5]
Answers (1)
Jeremy Wurbs
on 27 Nov 2013
Very interesting question.
First off, running your solution as-is, I get:
[ 1.0000 2.0000 3.0000 3.1000 3.2000 3.4000 3.5000 3.6000 4.0000 5.0000]
It seems there are precision issues because MinInterval is 0.1 and your data are spaced "0.1" apart: 0.1 cannot be represented exactly in binary floating point, so the computed differences land slightly above or slightly below MinInterval. You can see the effect by evaluating (3*0.1==0.3) in the command window, which returns false.
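To make the precision point concrete, here is a small check you could paste into the command window. It is only an illustration; the variable names d, isExact and isClose are my own, and the 1e-12 tolerance is an arbitrary choice.

```matlab
% Illustrative check only; d, isExact and isClose are hypothetical names.
OriginalData = [1; 2; 3; 3.1; 3.2; 3.3; 3.4; 3.5; 3.6; 4; 5];

% Differences across the "0.1-spaced" stretch, printed with extra digits
% so the rounding error around 0.1 becomes visible
d = diff(OriginalData(3:9));
fprintf('%.17f\n', d);

% An exact comparison fails, a comparison with a tolerance succeeds
isExact = (3*0.1 == 0.3);
isClose = abs(3*0.1 - 0.3) < 1e-12;   % 1e-12 is an arbitrary tolerance
disp([isExact isClose])
```

Some of the printed differences come out a hair above 0.1 and some a hair below, which is why the unmodified loop keeps 3.1 and 3.5 but drops 3.3.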
Anyway, adding a small wiggle factor to your MinInterval, I get the code:
% Method 1
tic
MinInterval=0.1+eps; % We will have issues with precision without eps
KeptData(1,1)=OriginalData(1,1);
n=1;
for p=1:length(OriginalData)-1
    difference=OriginalData(p+1,1)-KeptData(n,1);
    if difference<=MinInterval
        %do nothing
    else
        n=n+1;
        KeptData(n,1)=OriginalData(p+1,1);
    end
end
t1 = toc;
A simple attempt to vectorize the above using only diff could look like:
% Method 2
tic
KeptData2 = OriginalData([true; diff(OriginalData)>MinInterval]);
t2 = toc;
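But the above behaves differently on "stair-case" sections in your data (the [... 3.0; 3.1; 3.2; 3.3; ...] part), because diff compares each value with its immediate neighbour, whereas your loop compares it with the last kept value. We can get behavior closer to the original code with the following: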
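% Method 3
MinInterval = 0.1; % Shouldn't have any precision issues
tic
NewMinInterval = 2*MinInterval;
% Bin the data on a grid of width 2*MinInterval
edges = min(OriginalData):NewMinInterval:max(OriginalData);
hc = histc(OriginalData,edges);
% Keep the edge of every non-empty bin. Note that these are grid edges,
% not the original samples; here the two happen to match.
KeptData3 = edges(logical(hc));
t3 = toc;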
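One caveat on Method 3: it returns grid edges, so on data that do not sit on a regular grid it will not reproduce the loop's output. Since each decision in the loop depends on the last kept value, the loop is inherently sequential, and a cheaper loop may be a more practical goal than a fully vectorized one. The following is a sketch of that idea and not part of the methods above; keep, lastKept and KeptData4 are names I made up. It marks kept entries in a preallocated logical mask instead of growing KeptData inside the loop.

```matlab
% Sketch only: keep, lastKept and KeptData4 are hypothetical names.
OriginalData = [1; 2; 3; 3.1; 3.2; 3.3; 3.4; 3.5; 3.6; 4; 5];
MinInterval = 0.1 + eps;            % same wiggle factor as Method 1

keep = false(size(OriginalData));   % preallocated logical mask
keep(1) = true;                     % the first value is always kept
lastKept = OriginalData(1);

for p = 2:numel(OriginalData)
    % Compare against the last kept value, exactly as the original loop does
    if OriginalData(p) - lastKept > MinInterval
        keep(p) = true;
        lastKept = OriginalData(p);
    end
end

KeptData4 = OriginalData(keep);     % original samples, not grid edges
```

On the example vector this keeps the same eight values as Method 1. Whether it is fast enough for your data is something you would have to measure.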
Using the following:
KeptData'
KeptData2'
KeptData3
disp(['Using for loop: ' num2str(t1)])
disp(['Using diff alone: ' num2str(t2)])
disp(['Using histc: ' num2str(t3)])
I get the results:
ans =
1.0000 2.0000 3.0000 3.2000 3.4000 3.6000 4.0000 5.0000
ans =
1 2 3 4 5
KeptData3 =
1.0000 2.0000 3.0000 3.2000 3.4000 3.6000 4.0000 5.0000
Using for loop: 0.017651
Using diff alone: 0.0047814
Using histc: 0.0071429
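Keep in mind that single tic/toc readings on an 11-element vector are likely dominated by overhead, so the numbers above are only a rough guide. If you want steadier figures, one option is MATLAB's timeit function on a larger input. The snippet below is a sketch under assumptions: the random test vector and the names fDiff and tDiff are mine, not taken from your data.

```matlab
% Sketch only: the test vector, fDiff and tDiff are hypothetical.
OriginalData = cumsum(rand(1e5, 1));   % made-up increasing test data
MinInterval = 0.1;

% Wrap the diff-based reduction in a function handle so timeit can call it
fDiff = @() OriginalData([true; diff(OriginalData) > MinInterval]);

% timeit runs the handle several times and returns a typical time in seconds
tDiff = timeit(fDiff);
disp(['Using diff alone (timeit): ' num2str(tDiff)])
```

The same pattern works for the other methods once each one is wrapped in a function that takes no inputs.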
Hopefully that helps. Let me know if that doesn't give the behavior you're looking for.
Cheers