interpolating missing data

LS on 1 Dec 2011
Hi all,
I'm trying to estimate model parameters in MATLAB using data I collected in the lab, but I didn't measure all of the variables every day (so for some days I only have data for one variable). The data look like this (time; variable 1; variable 2; variable 3):
1 2330000 5.92275000000000e-06 36.2000000000000
2 52900000 2.79773000000000e-07 35.2000000000000
3 357000000 6.69468000000000e-08 26.1000000000000
4 389000000 1.19846000000000e-07 3.38000000000000
5 668000000 7.43263000000000e-08 0.350000000000000
6 1100000000.00000 4.52455000000000e-08 0.230000000000000
7 1530000000.00000 3.24575000000000e-08 0.340000000000000
8 1250000000.00000 3.96000000000000e-08 0.500000000000000
9 1490000000.00000 3.33154000000000e-08 0.360000000000000
10 1850000000.00000 NaN NaN
12 2050000000.00000 2.42585000000000e-08 0.270000000000000
14 2290000000.00000 NaN NaN
17 2120000000.00000 NaN NaN
19 5090000000.00000 9.79568000000000e-09 0.140000000000000
I've found a way to deal with this by replacing the NaNs with 0s, but I really don't want to do that in this case since it would screw up the estimation. I read something about interpolating the missing data using interp1, but I haven't been able to get that to work. Any help would be much appreciated. Thank you!

Accepted Answer

Sven on 1 Dec 2011
Let's start with your data.
data = [1 2330000 5.92275000000000e-06 36.2000000000000
2 52900000 2.79773000000000e-07 35.2000000000000
3 357000000 6.69468000000000e-08 26.1000000000000
4 389000000 1.19846000000000e-07 3.38000000000000
5 668000000 7.43263000000000e-08 0.350000000000000
6 1100000000.00000 4.52455000000000e-08 0.230000000000000
7 1530000000.00000 3.24575000000000e-08 0.340000000000000
8 1250000000.00000 3.96000000000000e-08 0.500000000000000
9 1490000000.00000 3.33154000000000e-08 0.360000000000000
10 1850000000.00000 NaN NaN
12 2050000000.00000 2.42585000000000e-08 0.270000000000000
14 2290000000.00000 NaN NaN
17 2120000000.00000 NaN NaN
19 5090000000.00000 NaN 0.140000000000000]
Now here's how you can use interp1, looped over each column. I've updated it to handle NaN values on the end that can't be addressed with pure interpolation:
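fullData = data;
for c = 2:size(data,2)
    % Pass 1: linearly interpolate NaNs that have known values on both sides
    nanRows = isnan(data(:,c));
    fullData(nanRows,c) = interp1(data(~nanRows,1), data(~nanRows,c), data(nanRows,1));
    % Pass 2: fill NaNs left at the ends with the nearest known value
    nanRows = isnan(fullData(:,c));
    fullData(nanRows,c) = interp1(data(~nanRows,1), fullData(~nanRows,c), data(nanRows,1), 'nearest','extrap');
end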
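As a quick sanity check, the same two-pass idea can be run on a tiny made-up series. The vectors `t` and `y` below are illustrative values, not the lab data, and this is only a minimal sketch that assumes the time vector contains no NaNs and no repeated values:

```matlab
% Hypothetical toy series: one internal NaN (t = 2) and one trailing NaN (t = 6)
t = [1 2 3 4 6]';
y = [10 NaN 30 40 NaN]';

filled = y;

% Pass 1: linear interpolation fills NaNs that have known neighbours on both sides
missing = isnan(filled);
filled(missing) = interp1(t(~missing), filled(~missing), t(missing));

% Pass 2: anything still NaN lies outside the known range; copy the nearest known value
missing = isnan(filled);
filled(missing) = interp1(t(~missing), filled(~missing), t(missing), 'nearest', 'extrap');

disp(filled')   % expected: 10 20 30 40 40
```

The internal gap at t = 2 is filled linearly (20), while the trailing gap at t = 6 takes the last known value (40) rather than being extended along a line.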
  2 Comments
LS on 1 Dec 2011
This is great - thank you! I have one more question though - I deleted the last value in the first column (so now there's a NaN there) and tried using this code to fill in that value as well but the NaN wasn't replaced (but the code does replace all the other NaNs). Is there a problem with interpolating for the final value in a series?
Sven on 2 Dec 2011
Yes, this is a small annoyance I have with interp1. Note the difference between _interpolation_ and _extrapolation_. For the former, you need a known value above *and* below your query point; the final point in a series has nothing above it, so filling it is extrapolation. I assume that what you really want to do is:
1. Interpolate *linearly* for any _internal_ NaNs.
2. Set those NaN values on the outside to their nearest non-NaN neighbour's value.
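The distinction is easy to see on three made-up sample points (`x` and `v` are illustrative, not taken from the data above). As a rough sketch: with the default linear method, `interp1` returns NaN for a query outside the range of `x`, whereas asking for 'nearest' with 'extrap' returns the closest known value:

```matlab
x = [1 2 3];      % hypothetical sample locations
v = [10 20 30];   % hypothetical sample values

inside  = interp1(x, v, 2.5);                      % between two known points -> 25
outside = interp1(x, v, 4);                        % beyond the last known point -> NaN
edge    = interp1(x, v, 4, 'nearest', 'extrap');   % nearest known value -> 30
```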
My two most-used modes for *interp1* are 'linear' or 'nearest'. There's also an 'extrap' option to extrapolate. But since the above points one and two use different _forms_ of interpolation/extrapolation, you can't do this in one line.
What I do is run two interp1 commands: one to interpolate linearly, and one to extrapolate 'nearestly'. I've updated the answer accordingly.
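For readers on newer releases (R2016b onward), `fillmissing` appears to cover both steps in a single call through its 'SamplePoints' and 'EndValues' options. This is a different function from the `interp1` approach in the answer, offered only as a possible alternative, and the toy vectors `t` and `y` are made up for illustration:

```matlab
% Hypothetical toy series with one internal and one trailing NaN
t = [1 2 3 4 6]';
y = [10 NaN 30 40 NaN]';

% Linear fill for internal gaps, nearest known value for gaps at the ends;
% 'SamplePoints' makes the fill respect the uneven time spacing in t
filled = fillmissing(y, 'linear', 'SamplePoints', t, 'EndValues', 'nearest');
```

On the full matrix, the same call on `data(:,2:end)` with `data(:,1)` as the sample points should fill each column independently.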
