How do I choose the initial values for non-linear curve/function fitting?

Question

I have data from an optics experiment consisting of applied voltage (x-axis) and intensity (y-axis). The mathematical relation between them is quite complicated; here it is in MATLAB format:

F= cos((a(1).^2.*(a(2) - (a(2).*a(3))./(a(2).^2.*cos(a(4) - 2.*atan(exp(-(V- a(5))./a(6)))).^2 + a(3).^2.*sin(a(4) - 2.*atan(exp(-(V - a(5))./a(6)))).^2).^(1/2)).^2 + a(7).^2).^(1/2)).^2

Here F is the theoretical expression for the intensity, V is the voltage, and the elements of a are the fit parameters that I need to find. Since the model is highly nonlinear, it is hard to find fit parameters for which the curve fits the data well.

My previous attempts and observations:

I noticed that the fit is extremely sensitive both to the initial starting points and to the lower and upper bounds.
After many attempts I got a 'somewhat' decent fit, as shown below, but I cannot seem to improve on it:

I also tried to constrain some of the parameters using the physics, but still can't seem to do any better. For example, a(2) and a(3) should lie around 1.5-1.7 in most cases, a(1) should be around 38750, and a(5) around 80. I'm not very sure about a(7) and a(4), but they should likely be between -5 and 5.

Is there some way to choose the initial values so that I get a good fit? I feel like the fit gets stuck at some local optimum and never reaches the global optimum.

My raw data is attached here. Let me know in case I need to attach more information.

1 comment

Mathieu NOE on 17 Jul 2023

It would definitely help if you shared your data and code as well.

As usual, the more we know about the model and the underlying physics, the better we can make the fit converge.

I wonder how the theoretical equation was constructed, and what ensures that the experiment actually follows this model.

In other words, we need to use as much a priori information as possible to bound the search interval (or, even better, to fix some parameters).

Answer 1

John D'Errico on 17 Jul 2023

Edited: John D'Errico on 17 Jul 2023

Yes, there is an easy way to do it automatically, but it's a SECRET! They would have built it into the code, but it is such a big secret that it could not be given to just anybody, and you might figure it out from the code.

Yeah. Right. I'm sorry, but there is no magic way to automatically determine good starting values.

Optimizations with many parameters often have locally optimal but globally sub-optimal "solutions", where the optimizer gets stuck IF you start in a bad place. (Your model qualifies as having many parameters.) Call those points stationary points: places from which the optimizer cannot find a better direction to move.

a = sym('a',[1,7]);
syms V
F= cos((a(1).^2.*(a(2) - (a(2).*a(3))./(a(2).^2.*cos(a(4) - 2.*atan(exp(-(V- a(5))./a(6)))).^2 + a(3).^2.*sin(a(4) - 2.*atan(exp(-(V - a(5))./a(6)))).^2).^(1/2)).^2 + a(7).^2).^(1/2)).^2

Models with trig components ALWAYS seem to have problems with multiple solutions. This is just a natural consequence of periodic functions. Powers and roots of parameters also cause problems because, again, they introduce multiple solutions.
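To make the point about multiple solutions concrete, here is a minimal sketch using the model from the question. The helper names theta and neff, the voltage grid, and the parameter values in a0 are assumptions chosen purely for illustration; they are not fitted values. It checks three changes that should leave the curve untouched: adding pi to a(4), flipping the sign of a(7), and flipping the sign of a(1).

```matlab
% Model from the question, split into helper handles for readability.
% theta and neff are hypothetical helper names, not from the original post.
theta = @(a,V) a(4) - 2.*atan(exp(-(V - a(5))./a(6)));
neff  = @(a,V) (a(2).*a(3))./sqrt(a(2).^2.*cos(theta(a,V)).^2 + ...
                                  a(3).^2.*sin(theta(a,V)).^2);
F     = @(a,V) cos(sqrt(a(1).^2.*(a(2) - neff(a,V)).^2 + a(7).^2)).^2;

V  = linspace(0, 160, 400);              % illustrative voltage grid (assumed)
a0 = [38750 1.60 1.5998 0.3 80 10 1];    % illustrative parameters (assumed)

aShift = a0;  aShift(4) = a0(4) + pi;    % cos^2 and sin^2 have period pi
aSign7 = a0;  aSign7(7) = -a0(7);        % a(7) enters only as a(7)^2
aSign1 = a0;  aSign1(1) = -a0(1);        % a(1) enters only as a(1)^2

% Largest difference between each modified curve and the reference curve.
dShift = max(abs(F(aShift,V) - F(a0,V)));
dSign7 = max(abs(F(aSign7,V) - F(a0,V)));
dSign1 = max(abs(F(aSign1,V) - F(a0,V)));

fprintf('max difference: a4+pi %.2e, -a7 %.2e, -a1 %.2e\n', ...
        dShift, dSign7, dSign1);
```

All three differences should be at the level of floating-point noise, so fits started in different places can return different parameter vectors with the same residual. Bounds such as a(1) > 0 and a(7) >= 0 are one way to remove the sign ambiguity.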

Again, I'm sorry, but this is just a bullet you need to bite. You need to understand the model you have proposed. After all, if you chose to build that model, you should have done so for a reason. So spend the time to learn about how a parameter impacts the model. In this model, at least a5 and a6 are trivial to understand, as shift and scale parameters. The others can probably have some interpretations, if you spend some time, but I won't go into that rabbit hole.
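The claim that a(5) and a(6) act as shift and scale parameters can be checked numerically. The sketch below is only an illustration: the helper names, the voltage grid, and the values in a0 are assumptions. Shifting a(5) by d moves the curve by d along the voltage axis, and multiplying a(6) by c stretches the curve by c about V = a(5).

```matlab
% Model from the question; theta and neff are hypothetical helper names.
theta = @(a,V) a(4) - 2.*atan(exp(-(V - a(5))./a(6)));
neff  = @(a,V) (a(2).*a(3))./sqrt(a(2).^2.*cos(theta(a,V)).^2 + ...
                                  a(3).^2.*sin(theta(a,V)).^2);
F     = @(a,V) cos(sqrt(a(1).^2.*(a(2) - neff(a,V)).^2 + a(7).^2)).^2;

V  = linspace(0, 160, 400);              % illustrative voltage grid (assumed)
a0 = [38750 1.60 1.5998 0.3 80 10 1];    % illustrative parameters (assumed)

% Shift: increasing a(5) by d gives the same curve evaluated at V + d.
d = 7;
aS = a0;  aS(5) = a0(5) + d;
shiftErr = max(abs(F(aS, V + d) - F(a0, V)));

% Scale: multiplying a(6) by c stretches the voltage axis by c about a(5).
c = 2.5;
aW = a0;  aW(6) = c*a0(6);
Vw = a0(5) + c*(V - a0(5));
scaleErr = max(abs(F(aW, Vw) - F(a0, V)));

fprintf('shift error %.2e, scale error %.2e\n', shiftErr, scaleErr);
```

In practice this suggests reading the starting values for a(5) and a(6) off a plot of the data: roughly, the voltage where the curve has its main transition and the voltage width of that transition.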

Your best solution is probably to use a multi-start method, or perhaps a tool like GA. In either case, they are designed to be LESS sensitive to problems of this sort, but expect it to be a difficult problem.
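A hand-rolled multi-start loop could look like the sketch below. It assumes lsqcurvefit from the Optimization Toolbox is available. The data are synthetic (generated from assumed parameters aTrue plus noise), and the bounds lb and ub are illustrative guesses loosely based on the ranges in the question; the range for a(6) in particular is invented for the example. This is not the asker's data or a tuned recipe.

```matlab
rng(0);                                   % reproducible random starts

% Model from the question; theta and neff are hypothetical helper names.
theta = @(a,V) a(4) - 2.*atan(exp(-(V - a(5))./a(6)));
neff  = @(a,V) (a(2).*a(3))./sqrt(a(2).^2.*cos(theta(a,V)).^2 + ...
                                  a(3).^2.*sin(theta(a,V)).^2);
F     = @(a,V) cos(sqrt(a(1).^2.*(a(2) - neff(a,V)).^2 + a(7).^2)).^2;

% Synthetic data standing in for the measurements (assumed values).
aTrue = [38750 1.60 1.5998 0.3 80 10 1];
V = linspace(0, 160, 200);
I = F(aTrue, V) + 0.01*randn(size(V));

% Illustrative finite bounds; a(6) is kept positive to avoid division by zero.
lb = [38000 1.5 1.5 -5  60  1 -5];
ub = [39500 1.7 1.7  5 100 30  5];

opts    = optimoptions('lsqcurvefit', 'Display', 'off');
nStarts = 30;
allRes  = zeros(nStarts, 1);
bestRes = Inf;
bestA   = [];

for k = 1:nStarts
    x0 = lb + rand(size(lb)).*(ub - lb);  % uniform random start inside the box
    [aFit, resnorm] = lsqcurvefit(F, x0, V, I, lb, ub, opts);
    allRes(k) = resnorm;
    if resnorm < bestRes                  % keep the best local solution so far
        bestRes = resnorm;
        bestA   = aFit;
    end
end

fprintf('best SSE %.4g, median SSE over starts %.4g\n', bestRes, median(allRes));
```

The spread in allRes gives a rough picture of how many distinct local solutions there are. If the Global Optimization Toolbox is installed, its MultiStart object automates the same idea; the loop above just keeps the dependencies smaller.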

6 comments (4 older comments not shown)

Akaash Srikanth on 24 Jul 2023

Edited: Akaash Srikanth on 25 Jul 2023

@Alex Sha,

Thank you for the help. I realized that there was a slight error in my mathematical model, which is probably why the fit did not land within the values I expected. I have corrected it, and we are now trying to fit another function to a slightly different set of data. (The new files have been posted here: the x-axis is volt and the y-axis is phase.) Would it be possible to predict some good values for the fit in this case? My fit results are slightly better, but I would still prefer a much better fit. Here is the best I could get.

a = sym('a',[1,8]);
syms V
F= @(a,V) a(1).*(a(2) + a(3) - a(4) - (a(2).*a(5))./(a(2).^2.*cos(a(6) - 2.*atan(exp(-(V - a(7))./a(8)))).^2 + a(5).^2.*sin(a(6) - 2.*atan(exp(-(V - a(7))./a(8)))).^2).^(1/2))
F = function_handle with value:
    @(a,V)a(1).*(a(2)+a(3)-a(4)-(a(2).*a(5))./(a(2).^2.*cos(a(6)-2.*atan(exp(-(V-a(7))./a(8)))).^2+a(5).^2.*sin(a(6)-2.*atan(exp(-(V-a(7))./a(8)))).^2).^(1/2))

Again, some constraints on the parameters:

a(1): about 20000 to 80000. This is the ratio of the crystal thickness to the wavelength of the laser we are using.
a(2) and a(5): about 1.4 to 1.8. Both are refractive indices.
a(3) and a(4): no strict range.
a(6): no strict range.
a(7): around 0.3 to 0.9 (can go slightly above 0.9). This is a voltage in V.
a(8): around 0.3 to 2. Again a voltage in V.
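For reference, the ranges above can be written as bound vectors in the form that bounded solvers such as lsqcurvefit accept, with -Inf and Inf standing for "no strict range". The starting point x0 below is an assumption made for illustration: the midpoint of each finite range, and 0 for the unbounded parameters.

```matlab
% Bounds in the order a(1)..a(8); -Inf/Inf marks "no strict range".
lb = [20000 1.4 -Inf -Inf 1.4 -Inf 0.3 0.3];
ub = [80000 1.8  Inf  Inf 1.8  Inf 0.9 2.0];

% Starting point: midpoint where both bounds are finite,
% otherwise an assumed placeholder value of 0.
x0 = zeros(size(lb));
bounded = isfinite(lb) & isfinite(ub);
x0(bounded) = (lb(bounded) + ub(bounded))/2;

disp(x0)
```

A single midpoint start is only a first guess; given how sensitive this fit is, it would normally be one of many starting points rather than the only one.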

If needed, I could also post this as a new question, as I am not very familiar with the MATLAB community guidelines.

Alex Sha on 26 Jul 2023

Taking the parameter ranges as a1 = [20000, 80000], a2 = [1.4, 1.8], a5 = [1.4, 1.8], a7 = [0.3, 0.9], a8 = [0.3, 2], with a3, a4 and a6 left unbounded:

Sum Squared Error (SSE): 0.913188311071197
Root of Mean Square Error (RMSE): 0.0825520329387615
Correlation Coef. (R): 0.999569920445517
R-Square: 0.999140025859457
Parameter	Best Estimate    
---------	-------------    
a1       	20000.0000113322 
a2       	1.40000693995549 
a3       	125.071659909581 
a4       	125.071307157303 
a5       	1.40050427357894 
a6       	1.24015985413439 
a7       	0.300000000000583
a8       	0.796932506770127

It is easy to see that a1, a2, a5 and a7 all sit at (or very near) the lower bounds of their ranges. If the ranges are changed to a1 = [10000, 80000], a2 = [0, 1.8], a5 = [0, 1.8], a7 = [0.0, 0.9], a8 = [0.0, 2], with a3, a4 and a6 still unbounded,

the result will be a little better:

Sum Squared Error (SSE): 0.392117528292817
Root of Mean Square Error (RMSE): 0.054094826103246
Correlation Coef. (R): 0.999815349109032
R-Square: 0.999630732314016
Parameter	Best Estimate       
---------	-------------       
a1       	10000.0637696401    
a2       	0.00172350448796937 
a3       	-279.422101867729   
a4       	-279.421813550937   
a5       	0.000794456392713509
a6       	-0.260580870966713  
a7       	0.256682315830075   
a8       	1.33579711975512
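One possible reading of these results, offered as an observation on the formula rather than something stated in the thread: in the corrected model, a(3) and a(4) appear only through the difference a(3) - a(4), and scaling a(2), a(5) and that difference by a factor k while dividing a(1) by k leaves the curve unchanged. That would explain why a3 and a4 come out large and nearly equal (in the first table they differ by only about 3.5e-4) and why parameters drift to their bounds. The sketch below checks both properties numerically with the first set of estimates. The voltage grid and the factor k are assumptions.

```matlab
% Corrected model from the comment above, parameters a(1)..a(8).
% theta2 is a hypothetical helper name.
theta2 = @(a,V) a(6) - 2.*atan(exp(-(V - a(7))./a(8)));
F2 = @(a,V) a(1).*(a(2) + a(3) - a(4) - (a(2).*a(5))./ ...
     sqrt(a(2).^2.*cos(theta2(a,V)).^2 + a(5).^2.*sin(theta2(a,V)).^2));

V = linspace(0, 3, 200);                 % illustrative voltage grid (assumed)

% First set of estimates reported above.
aHat = [20000.0000113322 1.40000693995549 125.071659909581 ...
        125.071307157303 1.40050427357894 1.24015985413439 ...
        0.300000000000583 0.796932506770127];

% Same offset added to a(3) and a(4): their difference is unchanged.
aOffset = aHat;
aOffset([3 4]) = aHat([3 4]) + 1000;

% Scale a(2), a(3), a(4), a(5) by k and divide a(1) by k.
k = 1.1;                                 % arbitrary factor (assumed)
aScaled = aHat;
aScaled(1) = aHat(1)/k;
aScaled([2 3 4 5]) = k*aHat([2 3 4 5]);

offsetErr = max(abs(F2(aOffset,V) - F2(aHat,V)));
scaleErr  = max(abs(F2(aScaled,V) - F2(aHat,V)));

fprintf('a3 - a4 = %.3e\n', aHat(3) - aHat(4));
fprintf('offset error %.2e, scale error %.2e\n', offsetErr, scaleErr);
```

If this reading is right, the data cannot determine all eight parameters separately. Fixing a(4) = 0 and pinning one of a(1), a(2) or a(5) to a physically known value would remove the redundancy and should make the fit much less sensitive to the starting point.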
