Least squares fit to different data sets at the same time
I am trying to fit an ODE system to two different data sets of varying time scales. As a consequence, the data sets are different sizes. I found the question/answer here was not sufficient for my purposes.
As a minimal example, consider the following made-up dataset, where each portion is distinguished by the resolution of the data with respect to time.
% data to be fitted to model
monthly_dat = rand(1,24)*100; % data to fit variable 1
yearly_dat = rand(1,2)*100; % data to fit variable 2
This represents hypothetical data for two output variables over the course of two years. I want to fit one output variable of the system to the monthly data and a second output variable to the yearly data. To illustrate, suppose the model produces two outputs for each month i. I have monthly-resolution data for the first output variable but only yearly-resolution data for the second, so I want to fit the second output for the first year to the first data point in yearly_dat, the second output for the second year to the second data point, etc.
I tried building one large matrix with the monthly-resolution data in the first column and the yearly-resolution data in the second column, with all other empty cells filled with 0. However, this produces some strange results, so I suspect I have done something wrong. Does anyone know how I can resolve this problem and produce an adequate fit to both sets of data?
A bit of elaboration as to why the previously linked answer does not suffice
The system I am trying to fit is an ODE system that takes as 'xdata' the times over which the ODE system is to be solved. This places some restrictions on what kind of 'xdata' I can feed into the least-squares solver. For example, I tried the following (assuming I have 96 monthly data points for the first variable and 7 yearly data points for the second variable):
% fit to monthly data, then yearly data
timeVec = {[1:96]' ([1:7]*12)'};
dataVec = {data_to_fit_1' data_to_fit_2'};
dataVec = vertcat(dataVec{:});
% other stuff goes here for the solver...
[param,resnorm] = lsqcurvefit(@my_func,z,timeVec,dataVec,...);
This returns the following error:
Undefined operator '-' for input arguments of type 'cell'.
This happens because I'm passing a cell array as the time span argument of an ODE solver. Converting the data to a double array does not resolve the problem, because the time values at which I want to evaluate the system are then not monotonically increasing (as the solver requires), so I suspect there is an alternative way of getting this to work.
Answers (1)
Shubham on 19 Apr 2024
Hi Cole,
Fitting an ODE system to two datasets of varying time scales, especially when the datasets are of different sizes and resolutions, requires a tailored approach. The key challenge here is to correctly align the model outputs with the corresponding data points, given their different time resolutions. Here's a strategy to address this issue, leveraging MATLAB's capabilities for handling such problems.
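To see why alignment matters, the sketch below compares the zero-padded matrix described in the question with a stacked vector that holds only real observations. All numbers are made up, the "model" is a hypothetical one that reproduces both data sets exactly, and the yearly values are assumed to sit at months 12 and 24. Even with a perfect model, the padded zeros are treated as observations and produce a large residual.

```matlab
% Hypothetical illustration: zero padding adds spurious residuals.
monthly_dat = 50 * ones(1, 24);      % made-up monthly observations
yearly_dat  = [40, 60];              % made-up yearly observations

% Pretend the model reproduces both data sets exactly.
model_var1 = 50 * ones(24, 1);       % model output for variable 1, every month
model_var2 = 40 * ones(24, 1);       % model output for variable 2, every month
model_var2(13:24) = 60;

% Zero-padded target: yearly values in rows 12 and 24, zeros elsewhere.
padded = zeros(24, 2);
padded(:, 1) = monthly_dat(:);
padded([12, 24], 2) = yearly_dat(:);

resid_padded   = [model_var1, model_var2] - padded;   % 24-by-2 residual matrix
resnorm_padded = sum(resid_padded(:).^2);

% Stacked target: only real observations enter the residual vector.
resid_stacked   = [model_var1 - monthly_dat(:); model_var2([12, 24]) - yearly_dat(:)];
resnorm_stacked = sum(resid_stacked.^2);

fprintf('padded: %g, stacked: %g\n', resnorm_padded, resnorm_stacked);
```

The stacked form is the one used in the steps that follow: the residual vector simply has as many entries as there are real data points.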
Step 1: Define Your ODE System
First, ensure you have a clear definition of your ODE system. For this example, let's assume a simple system of equations without specifying the exact form, as the focus is on the fitting procedure.
function dydt = odesystem(t, y, params)
    % Example ODE system
    % y(1) and y(2) are the state variables
    % params contains parameters of the ODE system
    % Placeholder for the actual system of equations
    dydt = [f1(t, y, params); f2(t, y, params)];
end
Step 2: Interpolation of Yearly Data
Given the mismatch in data resolutions, one approach is to interpolate the yearly data to match the monthly resolution. Interpolation, however, introduces values that were never measured. The steps that follow avoid it: the model is solved on the monthly time grid, and the second variable is compared with the yearly data only at the months where a yearly observation exists.
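A minimal sketch of that alignment, assuming the yearly observations fall on months 12 and 24 (an assumption; use whatever months apply to your data). It finds the rows of the monthly grid that carry a yearly observation, and also shows one possible fallback when some observation times are not on the monthly grid: solve on the sorted union of all times, which keeps the time vector monotonically increasing as the ODE solver requires.

```matlab
% Sketch: locate the yearly observation times on the monthly solver grid.
monthly_time = (1:24)';            % months at which the ODE solution is returned
yearly_time  = [12; 24];           % assumed months of the yearly observations

[found, yearly_indices] = ismember(yearly_time, monthly_time);
assert(all(found), 'Every yearly time must lie on the monthly grid.');

% Fallback if some times are off the grid: solve on the sorted union of
% both time vectors and index into that solution instead.
all_time = union(monthly_time, yearly_time);          % sorted, no duplicates
[~, monthly_rows] = ismember(monthly_time, all_time); % rows for variable 1
[~, yearly_rows]  = ismember(yearly_time, all_time);  % rows for variable 2
```

Note that ismember tests for exact equality, so with non-integer times it is safer to build both vectors from the same source values.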
Step 3: Custom Objective Function
Create a custom objective function that evaluates the ODE system over the monthly time points and extracts the yearly points for comparison where needed.
function residuals = objectiveFunction(params, monthly_time, yearly_time, monthly_dat, yearly_dat, monthly_indices, yearly_indices)
    % yearly_time is kept for reference only; yearly_indices selects the rows used.
    % initial_conditions is a placeholder: replace it with the state at monthly_time(1).
    % Solve the ODE system with the current parameters on the monthly grid
    [~, Y] = ode45(@(t, y) odesystem(t, y, params), monthly_time, initial_conditions);
    % Extract model predictions at the monthly and yearly data points
    monthly_predictions = Y(monthly_indices, 1); % variable 1 is compared with the monthly data
    yearly_predictions = Y(yearly_indices, 2);   % variable 2 is compared with the yearly data
    % Residuals as column vectors, whatever the orientation of the data
    monthly_error = monthly_predictions - monthly_dat(:);
    yearly_error = yearly_predictions - yearly_dat(:);
    % Stack both blocks; lsqnonlin minimizes the sum of squares of this vector
    residuals = [monthly_error; yearly_error];
end
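Because the stacked vector holds many more monthly entries than yearly ones, an unweighted fit may be dominated by the monthly data. One common remedy, offered here as a suggestion that goes beyond the steps above, is to scale each block by 1/sqrt(N) so that each data set contributes its mean squared error. The residual values below are made up purely for illustration.

```matlab
% Sketch: scale each residual block so both data sets contribute comparably.
monthly_error = [1; -2; 0.5];      % made-up residuals for the monthly block
yearly_error  = 3;                 % made-up residual for the yearly block

% Dividing by sqrt(N) turns each block's sum of squares into a mean.
w_monthly = 1 / sqrt(numel(monthly_error));
w_yearly  = 1 / sqrt(numel(yearly_error));

residuals = [w_monthly * monthly_error; w_yearly * yearly_error];
cost = sum(residuals.^2);          % mean square of block 1 + mean square of block 2
```

Other weights, for example the inverse of each data set's measurement uncertainty, can be substituted in the same place.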
Step 4: Optimization Procedure
Use MATLAB's lsqnonlin (or similar) to find the best-fitting parameters by minimizing the error defined in your custom objective function.
% Initial guess for parameters
params_initial_guess = [guess1, guess2, ...];
% Time vectors
monthly_time = (1:24)'; % 24 months
yearly_time = [1, 13]; % Start of each year in monthly indices
% Rows of the monthly solution that are compared with each data set
monthly_indices = 1:24; % All monthly points are used for variable 1
yearly_indices = [1, 13]; % Rows of the monthly solution used for variable 2
% Optimization
options = optimoptions('lsqnonlin', 'Display', 'iter');
objective = @(params) objectiveFunction(params, monthly_time, yearly_time, ...
    monthly_dat, yearly_dat, monthly_indices, yearly_indices);
[param_est, resnorm] = lsqnonlin(objective, params_initial_guess, [], [], options);
Notes:
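- This approach assumes that you can map the yearly data points to specific indices in the monthly data resolution. Adjust yearly_indices accordingly: the example uses the first month of each year, [1, 13], whereas the time vector in your question, (1:7)*12, corresponds to the last month of each year.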
- The objective function calculates the error between the model predictions and the actual data for both the monthly and yearly datasets. It combines these errors into a single vector that lsqnonlin attempts to minimize.
- Ensure that your initial conditions and parameter guesses are reasonable to aid convergence.
- Depending on your specific ODE system, you might need to adjust the solver settings or the objective function to better capture the dynamics of your system.
This method allows you to fit an ODE system to datasets with different resolutions by carefully aligning the model outputs with the respective data points and using a numerical optimization technique to find the best-fitting parameters.
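The whole procedure can be sketched end to end on synthetic data. Everything specific here is an assumption made for illustration: the two-state decay model, the parameter values, the initial state, and the choice of months 12 and 24 for the yearly observations. It requires the Optimization Toolbox and uses anonymous functions only, so it runs as a plain script. The ODE tolerances are tightened because lsqnonlin estimates derivatives by finite differences, which can be disturbed by solver error at the default tolerances.

```matlab
% End-to-end sketch with synthetic data (hypothetical model and numbers).
monthly_time   = (1:24)';
yearly_indices = [12; 24];          % assumed: yearly values observed at year end
y0             = [100; 0];          % assumed initial state at month 1

% Tight ODE tolerances keep solver error small relative to finite-difference steps.
ode_opts = odeset('RelTol', 1e-8, 'AbsTol', 1e-10);

% Right-hand side: y1 decays at rate p(1) and feeds y2, which decays at rate p(2).
rhs = @(p) @(t, y) [-p(1) * y(1); p(1) * y(1) - p(2) * y(2)];

% Solve over the monthly span; deval returns states-by-times, so transpose.
solve_monthly = @(p) deval(ode45(rhs(p), [monthly_time(1), monthly_time(end)], ...
    y0, ode_opts), monthly_time)';

% Stack variable 1 at every month and variable 2 at the yearly rows only.
stack   = @(Y) [Y(:, 1); Y(yearly_indices, 2)];
predict = @(p) stack(solve_monthly(p));

% Synthetic "observations" generated from known parameters.
true_params = [0.10, 0.05];
observed = predict(true_params);    % 24 monthly values followed by 2 yearly values

% Residual vector for lsqnonlin, with simple bounds on both rates.
residual_fun = @(p) predict(p) - observed;
fit_opts = optimoptions('lsqnonlin', 'Display', 'off');
[param_est, resnorm] = lsqnonlin(residual_fun, [0.2, 0.1], [0, 0], [1, 1], fit_opts);
```

The same predict function can serve lsqcurvefit if you prefer it: pass the monthly time vector as xdata and the stacked observations as ydata, so that xdata stays a plain, monotonically increasing double vector.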