Improving Efficiency of Find Algorithm

7 vues (au cours des 30 derniers jours)
Matt C
Matt C le 19 Août 2021
Hello,
I am aware that logical indexing is much faster than the usage of the find function, in specific instances. I'm wondering if there is a way to improve the following algorithm - I'm not quite sure how to use indexing (if possible) in this situation.
What I have is a matrix of ascending values, though some of those values may be repeated (specifically, I have millions of ascending timestamps with many repeated). I am then seeking the start and end indices of a window that is between time X and Y.
Here is an example of the algorithm that I currently have implemented:
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 30;
window_end_time = 80;
start_index = find(myDataTimestamps >= window_start_time,1,'first');
end_index = find(myDataTimestamps <= window_end_time,1,'last');
Is there a way to improve the speed of this code and still return the same start_index of 3 and end_index of 13?
Much appreciated!

Réponses (2)

Cris LaPierre
Cris LaPierre le 19 Août 2021
This approach may only work for this simple case, but here's a way to do it using max/min.
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 30;
window_end_time = 60;
% find start/end index
ind = 1:length(myDataTimestamps);
wind = myDataTimestamps==window_start_time | myDataTimestamps==window_end_time;
start_index = min(ind(wind))
start_index = 3
end_index = max(ind(wind))
end_index = 10
  1 commentaire
Matt C
Matt C le 19 Août 2021
Modifié(e) : Matt C le 19 Août 2021
I haven't been able to do a comparison, but that wasn't the massive improvement that I was hoping for. I had cancelled the script early without grabbing a total runtimes for comparison, but it looked like the proposed algorithm was going to take just as long (if not longer) than the find function. I implemented the recommendation as:
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 30;
window_end_time = 60;
% find start/end index
ind = 1:length(myDataTimestamps);
start_index = min(ind(myDataTimestamps>=window_start_time));
end_index = max(ind(myDataTimestamps<=window_end_time));
Have I blown anything in my above implementation? Note that my processor loading was ~50%, and only ~1 GB of my 24 GB of RAM was being used.
Edit: I can confirm that it took much longer using the min/max method. My code took ~45 minutes to fully execute using 'find', whereas it had only completed ~25% after about 2 hours using the min/max method.

Connectez-vous pour commenter.


Cris LaPierre
Cris LaPierre le 19 Août 2021
I wonder if this is a scenario where using tall data may help. See this page.

Catégories

En savoir plus sur Loops and Conditional Statements dans Help Center et File Exchange

Produits


Version

R2015b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by