Interpolating missing data
Hi all,
I'm trying to estimate model parameters in MATLAB using data I collected in the lab, but I didn't measure all of the variables every day (so for some days I only have data for one variable). The data look like this (time; variable 1; variable 2; variable 3):
1 2330000 5.92275000000000e-06 36.2000000000000
2 52900000 2.79773000000000e-07 35.2000000000000
3 357000000 6.69468000000000e-08 26.1000000000000
4 389000000 1.19846000000000e-07 3.38000000000000
5 668000000 7.43263000000000e-08 0.350000000000000
6 1100000000.00000 4.52455000000000e-08 0.230000000000000
7 1530000000.00000 3.24575000000000e-08 0.340000000000000
8 1250000000.00000 3.96000000000000e-08 0.500000000000000
9 1490000000.00000 3.33154000000000e-08 0.360000000000000
10 1850000000.00000 NaN NaN
12 2050000000.00000 2.42585000000000e-08 0.270000000000000
14 2290000000.00000 NaN NaN
17 2120000000.00000 NaN NaN
19 5090000000.00000 9.79568000000000e-09 0.140000000000000
I've found a way to deal with this by replacing the NaN's with 0s, but I really don't want to do that in this case since it would screw up the estimation. I read something about interpolating the missing data using interp1 but I haven't been able to get that to work. Any help would be much appreciated. Thank you!
Accepted Answer
Sven
1 Dec 2011
Let's start with your data.
data = [1 2330000 5.92275000000000e-06 36.2000000000000
2 52900000 2.79773000000000e-07 35.2000000000000
3 357000000 6.69468000000000e-08 26.1000000000000
4 389000000 1.19846000000000e-07 3.38000000000000
5 668000000 7.43263000000000e-08 0.350000000000000
6 1100000000.00000 4.52455000000000e-08 0.230000000000000
7 1530000000.00000 3.24575000000000e-08 0.340000000000000
8 1250000000.00000 3.96000000000000e-08 0.500000000000000
9 1490000000.00000 3.33154000000000e-08 0.360000000000000
10 1850000000.00000 NaN NaN
12 2050000000.00000 2.42585000000000e-08 0.270000000000000
14 2290000000.00000 NaN NaN
17 2120000000.00000 NaN NaN
19 5090000000.00000 NaN 0.140000000000000]
Now here's how you can use interp1, looped over each column. I've updated it to handle NaN values on the end that can't be addressed with pure interpolation:
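fullData = data;
for c = 2:size(data,2)
    % Step 1: linearly interpolate NaNs that have known values on both sides
    nanRows = isnan(data(:,c));
    fullData(nanRows,c) = interp1(data(~nanRows,1), data(~nanRows,c), data(nanRows,1));
    % Step 2: fill NaNs still left at the ends with the nearest known value
    nanRows = isnan(fullData(:,c));
    fullData(nanRows,c) = interp1(data(~nanRows,1), fullData(~nanRows,c), data(nanRows,1), 'nearest','extrap');
end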
2 Comments
Sven
2 Dec 2011
Yes, this is a small annoyance I have with interp1. Note the difference between _interpolation_ and _extrapolation_. For the former, you need a known value above *and* below your query point. I assume that what you really want to do is:
1. Interpolate *linearly* for any _internal_ NaNs.
2. Set those NaN values on the outside to their nearest non-NaN neighbour's value.
My two most-used modes for *interp1* are 'linear' or 'nearest'. There's also an 'extrap' option to extrapolate. But since the above points one and two use different _forms_ of interpolation/extrapolation, you can't do this in one line.
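To make the distinction concrete, here is a minimal sketch using made-up sample points (x, y and the query points are illustrative assumptions, not values from the lab data). With its default linear method, interp1 returns NaN for any query point outside the range of x, while 'nearest' combined with 'extrap' gives outside points the value of the closest known sample:

```matlab
% Illustrative values only (not taken from the data above)
x = [1 2 3];
y = [10 20 30];

% Default (linear) interpolation: queries outside [1, 3] come back as NaN
yLinear = interp1(x, y, [0 2.5 4]);

% 'nearest' with 'extrap': outside queries take the closest known value
yNearest = interp1(x, y, [0 4], 'nearest', 'extrap');
```

Here the interior query at 2.5 is interpolated between its two neighbours, and only the two outside queries differ between the calls.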
What I do is run two interp1 commands: one to linearly interpolate, and one to 'nearestly' extrapolate. I've updated the answer accordingly.
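As a side note, newer MATLAB releases (R2016b and later) include fillmissing, which can combine a linear fill for interior gaps with a nearest-value fill at the ends in a single call. The sketch below uses a small made-up vector (t and v are illustrative assumptions, not the lab data) and should be read as one possible alternative to the two-step interp1 approach, not a replacement for it:

```matlab
% Small made-up example (illustrative values): t is time, v has gaps
t = [1 2 4 7 9]';
v = [NaN 10 NaN 40 NaN]';

% Linear fill for interior gaps (using t as the sample points),
% nearest known value for gaps at either end
vFilled = fillmissing(v, 'linear', 'SamplePoints', t, 'EndValues', 'nearest');
```

Assuming the same layout as the data matrix in the answer (time in column 1), the whole matrix could likely be filled column by column with fillmissing(data, 'linear', 'SamplePoints', data(:,1), 'EndValues', 'nearest').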