Efficient moving average of scattered data

I have some scattered data and I'd like to compute something similar to a moving average, where I average all values within some radius of each point. I can do this with a loop, but I'd like a more efficient approach. Any ideas?
Here's a working example I'd like to make more efficient:
x = randi(100,45,1) + 20+3*randn(45,1) ;
y = 15*sind(x) + randn(size(x)) + 3;
figure
plot(x,y,'bo')
radius = 10;
ymean = NaN(size(x));
for k = 1:length(x)
    % Indices of all points within the specified radius:
    ind = abs(x-x(k))<radius;
    % Mean of y values within radius:
    ymean(k) = mean(y(ind));
end
hold on
plot(x,ymean,'ks')
legend('scattered data','radial average','location','southeast')

1 comment

Walter Roberson on 28 Jun 2016
When I read the title I thought you might mean "sparse", and was thinking about how I might do an efficient moving average on sparse data.


Accepted Answer

Chad Greene on 30 Jun 2016

1 vote

I turned this into a generalized function called scatstat1, which is available on the File Exchange.

More Answers (2)

Chris Turnes on 9 Mar 2017

1 vote

If you can upgrade to R2017a, this functionality is now available through the 'SamplePoints' name-value pair of the moving statistics functions. For your example, you would do something like movmean(y, 2*radius, 'SamplePoints', x); (though you'd need to sort your x values first, and reorder y to match).
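
As a rough sketch of the suggestion above, assuming R2017a or later: the data and radius come from the question, while the fixed seed `rng(0)` and the names `xs`, `order`, and `ymean_sorted` are only illustrative additions. It also assumes the x values are unique, since the sample points have to be sorted and distinct. Points lying exactly one radius away may be treated differently than the strict `<` in the original loop.

```matlab
% Same synthetic data as in the question (seed added for repeatability)
rng(0)
x = randi(100,45,1) + 20 + 3*randn(45,1);
y = 15*sind(x) + randn(size(x)) + 3;
radius = 10;

% movmean needs sorted sample points, so sort x and reorder y to match
[xs, order] = sort(x);

% A window of total length 2*radius, centered on each sample point
ymean_sorted = movmean(y(order), 2*radius, 'SamplePoints', xs);

% Put the averages back in the original (unsorted) order of x
ymean = NaN(size(x));
ymean(order) = ymean_sorted;
```

After this, plot(x,ymean,'ks') should overlay the same radial averages as the loop in the question.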
Walter Roberson on 28 Jun 2016

0 votes

Use pdist() to get all of the pairwise distances at once, then compare them to the radius and store the resulting logical mask. Multiply the mask by a repmat() of the y values and sum along one dimension. Sum the mask along the same dimension to get the number of neighbors, and divide the value sum by that count. The result should be the moving average.
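
The steps above might look roughly like the sketch below. It assumes the Statistics Toolbox is available; squareform() is not mentioned in the answer and is used here only to expand the condensed pdist() output into a full N-by-N matrix. A matrix product stands in for the repmat-and-sum step, which should give the same sums. The seed and variable names `D` and `mask` are illustrative.

```matlab
% Same synthetic data as in the question (seed added for repeatability)
rng(0)
x = randi(100,45,1) + 20 + 3*randn(45,1);
y = 15*sind(x) + randn(size(x)) + 3;
radius = 10;

% Full N-by-N matrix of pairwise distances (Statistics Toolbox).
% A toolbox-free alternative for 1-D data: D = abs(bsxfun(@minus, x, x.'));
D = squareform(pdist(x));

% mask(i,j) is true when point j lies within radius of point i.
% The diagonal is zero, so each point counts as its own neighbor.
mask = D < radius;

% Sum of y over each neighborhood, divided by the number of neighbors
ymean = (double(mask) * y) ./ sum(mask, 2);
```

Note that D and mask each hold N^2 elements, so memory use grows quickly with the number of points.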

3 comments

Chad Greene on 30 Jun 2016
Interesting idea! I got your solution working, but for N of 20,000 points the pdist function takes a bit of time. As it turns out, looping is faster.
Walter Roberson on 30 Jun 2016
Edited: Walter Roberson on 30 Jun 2016
I wonder if looping pdist2() would be efficient? Eh, it probably just adds unnecessary overhead to a simple Euclidean calculation.
Chad Greene on 1 Jul 2016
It also adds a Stats Toolbox dependency. I'll have to keep pdist in mind for future applications though. Thanks for the suggestion!
