Why does it take forever to solve my optimization problem (using ode45 and fmincon)?

Question

0 votes

To estimate kinetic parameters in a kinetic model, I use experimental data to fit the rate constants in the differential equations. For this reason,I use fmincon to minimize an objective function (sum of squared errors between calculated and measured values). The calculation worked pretty good until I changed the definition of the objective function. Now calculations run for hours without results. I tried to reduce MaxFunctionEvaluations (before it was 1e10, now 100) but the error "Solver stopped prematurely.fmincon stopped because it exceeded the function evaluation limit, options.MaxFunctionEvaluations = 200 (the selected value)." appears.

Are there some tools or tricks to find out, where the calculation got stuck?

0 commentaires
Afficher -2 commentaires plus anciens Masquer -2 commentaires plus anciens

Connectez-vous pour commenter.

Connectez-vous pour répondre à cette question.

Follow Question

Answer 1

Torsten le 30 Nov 2016

Modifié(e) : Torsten le 30 Nov 2016

Ouvrir dans MATLAB Online

0 votes

The main problem with your fitting is that the system is already in steady-state at t=8,8... . A second problem is that you only use one data point for fitting.

This means that for the data point

(c1mean(tendmean),c2mean(tendmean),c3mean(tendmean),c4mean(tendmean)),

you try to solve the system

-k1*c1mean(tendmean)-k4*c1mean(tendmean)=0
k1*c1mean(tendmean)-k2*c2mean(tendmean)=0
k2*c2mean(tendmean)-k3*c3mean(tendmean)=0
k3*c3mean(tendmean)+k4*c1mean(tendmean)=0

which gives k2=k3=0 and k1,k4 arbitrary as solution.

You will have to supply data points for c1,c2,c3 and c4 at much more than only one time instant and at times when dci/dt is not 0. Otherwise, you won't get reasonable values for your reaction constants.

Best wishes

Torsten.

6 commentaires
Afficher 4 commentaires plus anciens Masquer 4 commentaires plus anciens

Teresa Schubert le 30 Nov 2016

Modifié(e) : Teresa Schubert le 30 Nov 2016

Ouvrir dans MATLAB Online

Thank you, seems like in my original calculations which worked on a false premise of the objective function, it was a happy incident that it worked. Does it mean it is not possible to execute the calculation without more data points?! For this purpose, I have set up an artificial data set at three times (0, 4, 8) and adjusted the other functions where required (replacing [0 td] with td).

global Td Cd ksolve1 kst SSE1;
clear all
close all
D=[0 1 0 0 0;4 0.2 0.55 0.235 0.015;8 0.01 0.72 0.25 0.02];
Td=D(:,1); %residence time
Cd=D(:,2:end); %concentration
c0=Cd(1,:); %starting concentrations
S1=cell(4,2); %create cell to save calculated concentration distribution
for n=1:4 %variation of k0-values
    m=n-5;
    k0=[10^m 10^m 10^m 10^m];
    kst(:,n)=k0;%save initial values k0 in kst
    klabel=k0';
      [ksolve1(n,:), SSE1(n)]=fminconFolgeParallel1(Td,Cd,c0,k0); %parameter fitting
      a1=@(t,c)odeFolgeParallel1(ksolve1(n,:),t,c); %calculated values
      [S1{n,1},S1{n,2}]=ode45(a1, Td,c0);
end
ksolve1=ksolve1';
a1=length(ksolve1(:,1));
ksolve1(a1+1,:)=SSE1; %adding SSE to ksolve
for n=1:4
    figure
    subplot(2,1,1); plot(S1{1,1},S1{1,2},Td,Cd,'o');title('FolgeParallel1');
end

And, alas, I can calculate values for k1 to k4. But the resulting concentration profiles do not look natural. Any idea where this linear behavior comes from?

Kind regards and thank you!

Torsten le 1 Déc 2016

Modifié(e) : Torsten le 1 Déc 2016

The best preparation within MATLAB before you get experimental data is to produce artificial data points and make an attempt to recover the parameters for k1,...,k4 used in the generation process:

1. Choose (realistic) parameters for k1,...,k4

2. Use ode15s to generate corresponding concentrations at a couple of output times with these parameter values

3. Change the concentrations obtained by small quantities to incorporate measurement and model errors

4. Start your code from above to recover the parameters used in the creation of the artificial data.

One further hint:

If your system of ODEs will not become more complicated, you should use "dsolve" in advance to calculate the analytical solutions for c1,...,c4. The fitting process will be much more stable if you can use analytical expressions instead of numerical integration with the help of an ODE solver.

Best wishes

Torsten.

Teresa Schubert le 6 Déc 2016

Dear Torsten, thank you very much for your patience and your ideas.

I will definitely try the approach with artificial data. To test it this way is a very good idea.

Unfortunately, the reaction network behind is only the start, it will become more complicated and analytical solutions impossible.

Kind regards

Teresa

Connectez-vous pour commenter.

Answer 2

John D'Errico le 29 Nov 2016

Modifié(e) : John D'Errico le 29 Nov 2016

0 votes

How can we know what you did that now causes the problem? Since we do not see your code, we cannot read your mind. You changed something, and it no longer works. Surprise! What, EXACTLY did you change?

Changing the optimizer parameters as you have is generally not (almost never) the solution.

You might use the debugger to put a break point inside your objective function. That will stop it so you can check the current parameters at each call. Or just dump out the values of the parameters, as well as the current objective function produced.

Note that doing so will generate lots of calls at essentially the same location when it computes a gradient. So it will help to understand how an optimizer works, so you understand what it is doing at any point in time.

Finally, there are always problems when you try to use an optimizer on top of another tool like an ODE solver or an integration, or another optimization. The internal computation is only accurate to within some tolerance. So when you change the parameters slightly to compute a gradient, it uses a finite difference approximation to compute the gradient. But if there is error in the result, that gradient computation will HIGHLY magnify any error. Remember that a derivative is essentially delta_y/delta_x, as delta_x approaches zero. But if y becomes essentially a random variable, with a significant noise variance, then the gradient as computed becomes worthless.

The point is that these computations will often be nasty. Essentially your objective function is not differentiable at a fine scale. But fmincon (and other optimizers) assume that it is differentiable.

3 commentaires
Afficher 1 commentaire plus ancien Masquer 1 commentaire plus ancien

Teresa Schubert le 30 Nov 2016

Ouvrir dans MATLAB Online

Thank you for your fast answer, John!

I try to explain my example: First I defined my differential equations by

   function dc=odeFolgeParallel1(k, t, c)
  dc=zeros(4,1);
  dc(1)=-k(1)*c(1)-k(4)*c(1);
  dc(2)=k(1)*c(1)-k(2)*c(2);
  dc(3)=k(2)*c(2)-k(3)*c(3);
  dc(4)=k(3)*c(3)+k(4)*c(1);
  end

and my objective function:

function SSE=objectiveFolgeParallel2(td, cd, k, c0)
f=@(t,c)odeFolgeParallel2(k, t, c);
[ts, cs]=ode45(f,[0 td],c0); %Solving ODE system
a=cs(length(cs),:);
err=cd-a; %error
SSE=sum(err.^2);%sum of square errors
end

and the optimization function with initial values as input as well:

    function [ksolve,SSE]=fminconFolgeParallel1(TD,CD,c0,k0)
  %Concentration CD, residence time TD, initial values k0
    options=optimset('MaxIter',1e10, 'MaxFunEvals', 200); %Setting options for fminsearch
    A=[]; b=[]; Aeq=[]; beq=[]; c=[];
    lb=[0 0 0 0];
    ub=[Inf Inf Inf Inf];
    [ksolve,SSE]=fmincon(@(k)objectiveFolgeParallel1(TD,CD,k,c0),k0,A,b,Aeq,beq,lb,ub,c,options); 
    %searching minimum of objective function to obtain k
     format short e
    end

A final script uses all these functions to optimize the parameters according to experimental data D (normally, it loads an excel file, but here I adjusted it so the data is directly in the definition of D).

Before, I used to have nearly the same objective function:

function SSE=objectiveFolgeParallel1old(td, cd, k, c0)
global ts cs
f=@(t,c)odeFolgeParallel1(k, t, c);
[ts, cs]=ode45(f,[0 td],c0);  %Solving ODE system
err=cd-cs; %error
SSE=sum(sum(err.^2));%sum of square errors
end

But there was a mistake due to the fact that my calculated and experimental data are only vectors and the double sum of errors is false as sum of squared errors.

I tried to include breakpoints (thanks for the hin, I am a total beginner) at the end of the objective function and found out, that the SSE value is decreasing in the beginning and then starts to rise and fall again after some twenty cycles.

I am aware that I do not fully understand the way fmincon finds local minima. But how can I know if my objective function is differentiable?

I am sorry for maybe some obvious beginner questions and thank you in advance for your response.

Torsten le 30 Nov 2016

Modifié(e) : Torsten le 30 Nov 2016

Ouvrir dans MATLAB Online

The vector "a" you create with the command

a=cs(length(cs),:);

only contains the simulated values of cs1, cs2, cs3 and cs4 for the final time of the integration.

So. if your code were correct, your vector of experimental data "cd" should only contains 4 elements, namely cd1(tend),cd2(tend),cd3(tend) and cd4(tend).

Is this really the case ?

Another point:

Usually problems from chemical engineering are stiff. You should change the ODE solver from ODE45 to ODE15S.

Best wishes

Torsten.

Teresa Schubert le 30 Nov 2016

Ouvrir dans MATLAB Online

Thank you for your answer and help.

Oh, sorry, the final script mentioned above got lost somewhere.

global TD Td CD Cd ksolve1 kst SSE1;
clear all
close all
D=[0 1 0 0 0;8.80052311991583 0 0.700874384730850 0.276768973823475 0.0223566414456759;8.83157648307715 0 0.703049857064805 0.278036568828600 0.0189135741065945;8.83157648307716 0 0.702021770277732 0.277770287539202 0.0202079421830663;8.72383679088206 0 0.704610421364065 0.278164671184585 0.0172249074513494];
TD=D(:,1); %residence time
CD=D(:,2:end); %concentration
Cd=mean(CD(2:end,:));
Td=mean(TD(2:end));
c0=CD(1,:); %starting concentrations
S1=cell(6,2); %create cell to save calculated concentration distribution
for n=1:6 %variation of k0-values
    m=n-3;
    k0=[10^m 10^m 10^m 10^m];
    kst(:,n)=k0;%save initial values k0 in kst
    klabel=k0';
      [ksolve1(n,:), SSE1(n)]=fminconFolgeParallel1(Td,Cd,c0,k0); %parameter fitting
      a1=@(t,c)odeFolgeParallel1(ksolve1(n,:),t,c); %calculated values
      [S1{n,1},S1{n,2}]=ode45(a1,[0 Td],c0);
end
ksolve1=ksolve1';
a1=length(ksolve1(:,1));
ksolve1(a1+1,:)=SSE1; %adding SSE to ksolve
for n=1:6
    figure
    subplot(2,1,1); plot(S1{1,1},S1{1,2},Td,Cd,'o');title('FolgeParallel1');
end

Setting a breakpoint after S1=... shows that Cd contain only 4 values, which are used later as input for fminconFolgeParallel1.

Thank you for your advice regarding ODE solver. I will try it out. What consequences can it have to use the wrong ODE function? Can it cause that the calculations get stuck?

Connectez-vous pour commenter.

Why does it take forever to solve my optimization problem (using ode45 and fmincon)?

0 commentaires
Afficher -2 commentaires plus anciens Masquer -2 commentaires plus anciens

Réponse acceptée

6 commentaires
Afficher 4 commentaires plus anciens Masquer 4 commentaires plus anciens

Plus de réponses (1)

3 commentaires
Afficher 1 commentaire plus ancien Masquer 1 commentaire plus ancien

Catégories

Tags

Community Treasure Hunt

Why does it take forever to solve my optimization problem (using ode45 and fmincon)?

0 commentaires Afficher -2 commentaires plus anciens Masquer -2 commentaires plus anciens

Réponse acceptée

6 commentaires Afficher 4 commentaires plus anciens Masquer 4 commentaires plus anciens

Plus de réponses (1)

3 commentaires Afficher 1 commentaire plus ancien Masquer 1 commentaire plus ancien

Catégories

Tags

Voir également

Community Treasure Hunt

0 commentaires
Afficher -2 commentaires plus anciens Masquer -2 commentaires plus anciens

6 commentaires
Afficher 4 commentaires plus anciens Masquer 4 commentaires plus anciens

3 commentaires
Afficher 1 commentaire plus ancien Masquer 1 commentaire plus ancien