Why does it take forever to solve my optimization problem (using ode45 and fmincon)?

To estimate kinetic parameters in a kinetic model, I use experimental data to fit the rate constants in the differential equations. For this reason,I use fmincon to minimize an objective function (sum of squared errors between calculated and measured values). The calculation worked pretty good until I changed the definition of the objective function. Now calculations run for hours without results. I tried to reduce MaxFunctionEvaluations (before it was 1e10, now 100) but the error "Solver stopped prematurely.fmincon stopped because it exceeded the function evaluation limit, options.MaxFunctionEvaluations = 200 (the selected value)." appears.
Are there some tools or tricks to find out, where the calculation got stuck?

 Réponse acceptée

Torsten
Torsten le 30 Nov 2016
Modifié(e) : Torsten le 30 Nov 2016
The main problem with your fitting is that the system is already in steady-state at t=8,8... . A second problem is that you only use one data point for fitting.
This means that for the data point
(c1mean(tendmean),c2mean(tendmean),c3mean(tendmean),c4mean(tendmean)),
you try to solve the system
-k1*c1mean(tendmean)-k4*c1mean(tendmean)=0
k1*c1mean(tendmean)-k2*c2mean(tendmean)=0
k2*c2mean(tendmean)-k3*c3mean(tendmean)=0
k3*c3mean(tendmean)+k4*c1mean(tendmean)=0
which gives k2=k3=0 and k1,k4 arbitrary as solution.
You will have to supply data points for c1,c2,c3 and c4 at much more than only one time instant and at times when dci/dt is not 0. Otherwise, you won't get reasonable values for your reaction constants.
Best wishes
Torsten.

6 commentaires

Thanks for your response. You are totally right, that further data points are required to obtain reasonable values. But at the moment, the experimental data acquisition is delayed, so I thought it would be useful to develop the calculations in the meantime to be prepared when the data is available. Now I see that dc1/dt=0 at t=8.8 because c1=0, but why have the other functions/c2-c4 reached steady state?
But can these logic (chemical) problems cause these MATLAB problems?
Kind regards Teresa
Ok, it's not clear whether c2-c4 have already reached steady state.
But at least you can say that one of the ODEs (namely the first) can not be used in the identification of k1,k2,k3 and k4 because c1=0. Thus you have 3 equations in 4 unknowns. This can not result in a reasonable simulation.
Best wishes
Torsten.
Thank you, seems like in my original calculations which worked on a false premise of the objective function, it was a happy incident that it worked. Does it mean it is not possible to execute the calculation without more data points?! For this purpose, I have set up an artificial data set at three times (0, 4, 8) and adjusted the other functions where required (replacing [0 td] with td).
global Td Cd ksolve1 kst SSE1;
clear all
close all
D=[0 1 0 0 0;4 0.2 0.55 0.235 0.015;8 0.01 0.72 0.25 0.02];
Td=D(:,1); %residence time
Cd=D(:,2:end); %concentration
c0=Cd(1,:); %starting concentrations
S1=cell(4,2); %create cell to save calculated concentration distribution
for n=1:4 %variation of k0-values
m=n-5;
k0=[10^m 10^m 10^m 10^m];
kst(:,n)=k0;%save initial values k0 in kst
klabel=k0';
[ksolve1(n,:), SSE1(n)]=fminconFolgeParallel1(Td,Cd,c0,k0); %parameter fitting
a1=@(t,c)odeFolgeParallel1(ksolve1(n,:),t,c); %calculated values
[S1{n,1},S1{n,2}]=ode45(a1, Td,c0);
end
ksolve1=ksolve1';
a1=length(ksolve1(:,1));
ksolve1(a1+1,:)=SSE1; %adding SSE to ksolve
for n=1:4
figure
subplot(2,1,1); plot(S1{1,1},S1{1,2},Td,Cd,'o');title('FolgeParallel1');
end
And, alas, I can calculate values for k1 to k4. But the resulting concentration profiles do not look natural. Any idea where this linear behavior comes from?
Kind regards and thank you!
Ok, sorry I already found my mistake and replaced
[S1{n,1},S1{n,2}]=ode45(a1, Td,c0);
with
[S1{n,1},S1{n,2}]=ode45(a1, [0 Td(length(Td))],c0);
Torsten
Torsten le 1 Déc 2016
Modifié(e) : Torsten le 1 Déc 2016
The best preparation within MATLAB before you get experimental data is to produce artificial data points and make an attempt to recover the parameters for k1,...,k4 used in the generation process:
1. Choose (realistic) parameters for k1,...,k4
2. Use ode15s to generate corresponding concentrations at a couple of output times with these parameter values
3. Change the concentrations obtained by small quantities to incorporate measurement and model errors
4. Start your code from above to recover the parameters used in the creation of the artificial data.
One further hint:
If your system of ODEs will not become more complicated, you should use "dsolve" in advance to calculate the analytical solutions for c1,...,c4. The fitting process will be much more stable if you can use analytical expressions instead of numerical integration with the help of an ODE solver.
Best wishes
Torsten.
Dear Torsten, thank you very much for your patience and your ideas.
I will definitely try the approach with artificial data. To test it this way is a very good idea.
Unfortunately, the reaction network behind is only the start, it will become more complicated and analytical solutions impossible.
Kind regards
Teresa

Connectez-vous pour commenter.

Plus de réponses (1)

John D'Errico
John D'Errico le 29 Nov 2016
Modifié(e) : John D'Errico le 29 Nov 2016
How can we know what you did that now causes the problem? Since we do not see your code, we cannot read your mind. You changed something, and it no longer works. Surprise! What, EXACTLY did you change?
Changing the optimizer parameters as you have is generally not (almost never) the solution.
You might use the debugger to put a break point inside your objective function. That will stop it so you can check the current parameters at each call. Or just dump out the values of the parameters, as well as the current objective function produced.
Note that doing so will generate lots of calls at essentially the same location when it computes a gradient. So it will help to understand how an optimizer works, so you understand what it is doing at any point in time.
Finally, there are always problems when you try to use an optimizer on top of another tool like an ODE solver or an integration, or another optimization. The internal computation is only accurate to within some tolerance. So when you change the parameters slightly to compute a gradient, it uses a finite difference approximation to compute the gradient. But if there is error in the result, that gradient computation will HIGHLY magnify any error. Remember that a derivative is essentially delta_y/delta_x, as delta_x approaches zero. But if y becomes essentially a random variable, with a significant noise variance, then the gradient as computed becomes worthless.
The point is that these computations will often be nasty. Essentially your objective function is not differentiable at a fine scale. But fmincon (and other optimizers) assume that it is differentiable.

3 commentaires

Thank you for your fast answer, John!
I try to explain my example: First I defined my differential equations by
function dc=odeFolgeParallel1(k, t, c)
dc=zeros(4,1);
dc(1)=-k(1)*c(1)-k(4)*c(1);
dc(2)=k(1)*c(1)-k(2)*c(2);
dc(3)=k(2)*c(2)-k(3)*c(3);
dc(4)=k(3)*c(3)+k(4)*c(1);
end
and my objective function:
function SSE=objectiveFolgeParallel2(td, cd, k, c0)
f=@(t,c)odeFolgeParallel2(k, t, c);
[ts, cs]=ode45(f,[0 td],c0); %Solving ODE system
a=cs(length(cs),:);
err=cd-a; %error
SSE=sum(err.^2);%sum of square errors
end
and the optimization function with initial values as input as well:
function [ksolve,SSE]=fminconFolgeParallel1(TD,CD,c0,k0)
%Concentration CD, residence time TD, initial values k0
options=optimset('MaxIter',1e10, 'MaxFunEvals', 200); %Setting options for fminsearch
A=[]; b=[]; Aeq=[]; beq=[]; c=[];
lb=[0 0 0 0];
ub=[Inf Inf Inf Inf];
[ksolve,SSE]=fmincon(@(k)objectiveFolgeParallel1(TD,CD,k,c0),k0,A,b,Aeq,beq,lb,ub,c,options);
%searching minimum of objective function to obtain k
format short e
end
A final script uses all these functions to optimize the parameters according to experimental data D (normally, it loads an excel file, but here I adjusted it so the data is directly in the definition of D).
Before, I used to have nearly the same objective function:
function SSE=objectiveFolgeParallel1old(td, cd, k, c0)
global ts cs
f=@(t,c)odeFolgeParallel1(k, t, c);
[ts, cs]=ode45(f,[0 td],c0); %Solving ODE system
err=cd-cs; %error
SSE=sum(sum(err.^2));%sum of square errors
end
But there was a mistake due to the fact that my calculated and experimental data are only vectors and the double sum of errors is false as sum of squared errors.
I tried to include breakpoints (thanks for the hin, I am a total beginner) at the end of the objective function and found out, that the SSE value is decreasing in the beginning and then starts to rise and fall again after some twenty cycles.
I am aware that I do not fully understand the way fmincon finds local minima. But how can I know if my objective function is differentiable?
I am sorry for maybe some obvious beginner questions and thank you in advance for your response.
The vector "a" you create with the command
a=cs(length(cs),:);
only contains the simulated values of cs1, cs2, cs3 and cs4 for the final time of the integration.
So. if your code were correct, your vector of experimental data "cd" should only contains 4 elements, namely cd1(tend),cd2(tend),cd3(tend) and cd4(tend).
Is this really the case ?
Another point:
Usually problems from chemical engineering are stiff. You should change the ODE solver from ODE45 to ODE15S.
Best wishes
Torsten.
Thank you for your answer and help.
Oh, sorry, the final script mentioned above got lost somewhere.
global TD Td CD Cd ksolve1 kst SSE1;
clear all
close all
D=[0 1 0 0 0;8.80052311991583 0 0.700874384730850 0.276768973823475 0.0223566414456759;8.83157648307715 0 0.703049857064805 0.278036568828600 0.0189135741065945;8.83157648307716 0 0.702021770277732 0.277770287539202 0.0202079421830663;8.72383679088206 0 0.704610421364065 0.278164671184585 0.0172249074513494];
TD=D(:,1); %residence time
CD=D(:,2:end); %concentration
Cd=mean(CD(2:end,:));
Td=mean(TD(2:end));
c0=CD(1,:); %starting concentrations
S1=cell(6,2); %create cell to save calculated concentration distribution
for n=1:6 %variation of k0-values
m=n-3;
k0=[10^m 10^m 10^m 10^m];
kst(:,n)=k0;%save initial values k0 in kst
klabel=k0';
[ksolve1(n,:), SSE1(n)]=fminconFolgeParallel1(Td,Cd,c0,k0); %parameter fitting
a1=@(t,c)odeFolgeParallel1(ksolve1(n,:),t,c); %calculated values
[S1{n,1},S1{n,2}]=ode45(a1,[0 Td],c0);
end
ksolve1=ksolve1';
a1=length(ksolve1(:,1));
ksolve1(a1+1,:)=SSE1; %adding SSE to ksolve
for n=1:6
figure
subplot(2,1,1); plot(S1{1,1},S1{1,2},Td,Cd,'o');title('FolgeParallel1');
end
Setting a breakpoint after S1=... shows that Cd contain only 4 values, which are used later as input for fminconFolgeParallel1.
Thank you for your advice regarding ODE solver. I will try it out. What consequences can it have to use the wrong ODE function? Can it cause that the calculations get stuck?

Connectez-vous pour commenter.

Catégories

En savoir plus sur Get Started with Curve Fitting Toolbox dans Centre d'aide et File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by