How do I choose the initial values for non-linear curve/function fitting?
I have some data taken from an optics experiment, consisting of applied voltage (X-axis) and intensity (Y-axis). There is a mathematical relation between them which is quite complicated, but here it is in MATLAB syntax:
F= cos((a(1).^2.*(a(2) - (a(2).*a(3))./(a(2).^2.*cos(a(4) - 2.*atan(exp(-(V- a(5))./a(6)))).^2 + a(3).^2.*sin(a(4) - 2.*atan(exp(-(V - a(5))./a(6)))).^2).^(1/2)).^2 + a(7).^2).^(1/2)).^2
Here F is the theoretical expression for the intensity and V is the voltage and 'a' are fit parameters that I need to find. Since this is highly nonlinear, it seems hard to find the fit parameters such that the curve fits to the data well.
My Previous Attempts and Observations:
- I noticed that the fit is extremely sensitive to both the initial start points and the lower and upper bounds on the parameters.
- I got a 'somewhat' decent fit after many attempts, as shown below, but cannot seem to improve on it:

- I also tried to constrain some of the above parameters using physical reasoning, but still can't seem to do any better. For example, a(2) and a(3) should lie around 1.5-1.7 in most cases, a(1) should be around 38750, and a(5) around 80. I'm not very sure about a(7) and a(4), but they should likely be between -5 and 5.
Is there some way to choose the initial values so that I get a good fit? I feel like the solver gets stuck at some local optimum and doesn't reach the global optimum.
My raw data is attached here. Let me know in case I need to attach more information.
1 comment
Mathieu NOE
17 Jul 2023
It would definitely help if you shared your data and code as well.
As usual, the more we know about the model and the underlying physics, the better we can make the fit process converge.
I wonder how the theoretical equation was constructed and what ensures that the experiment actually follows this model.
In other words, we need to use as much a priori information as possible to bound the search interval (or, even better, to fix some parameters).
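One way to act on the suggestion to fix some parameters is to wrap the model so that the solver only sees the free ones. The sketch below assumes, purely for illustration, that a(5) is pinned at 80 (the approximate value mentioned in the question). The names `theta`, `template`, `freeIdx`, `S`, `expand` and `reducedModel` are hypothetical helpers, not part of the original code, and the numbers in `p` are placeholders rather than fitted values.

```matlab
% Model from the question as a function of the parameter vector a and voltage V.
theta = @(a,V) a(4) - 2.*atan(exp(-(V - a(5))./a(6)));
model = @(a,V) cos(sqrt(a(1).^2.*(a(2) - (a(2).*a(3))./ ...
    sqrt(a(2).^2.*cos(theta(a,V)).^2 + a(3).^2.*sin(theta(a,V)).^2)).^2 + a(7).^2)).^2;

% Assumption for illustration: pin a(5) at 80 and fit the remaining six.
template = zeros(1,7);
template(5) = 80;
freeIdx = [1 2 3 4 6 7];

% S scatters the six free values into their slots of the full 7-vector.
I7 = eye(7);
S = I7(:, freeIdx);                      % 7-by-6 selection matrix
expand = @(p) template + (S*p(:)).';     % 1-by-7 full parameter vector
reducedModel = @(p,V) model(expand(p), V);

% Quick consistency check with placeholder values.
p = [5 1.5 1.7 0.5 10 1];
V = linspace(0, 160, 9);
aFull = expand(p);
maxDiff = max(abs(reducedModel(p,V) - model(aFull,V)));
disp(aFull)
disp(maxDiff)
```

`reducedModel` can then be handed to the fitting routine in place of the full model, with start values and bounds of length six instead of seven.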
Answers (1)
John D'Errico
17 Jul 2023
Edited: John D'Errico
17 Jul 2023
Yes, there is an easy way to do it automatically, but it's a SECRET! They would have built it into the code, but it is such a big secret that it could not be given to just anybody, and you might figure it out from the code.
Yeah. Right. I'm sorry, but there is no magic way to automatically determine good starting values.
Optimizations with many parameters often have locally optimal but globally sub-optimal "solutions", where the optimizer gets stuck IF you start in a bad place. (Your model qualifies as having many parameters.) Call those points stationary points, where the optimizer cannot find a better place to look from there.
a = sym('a',[1,7]);
syms V
F= cos((a(1).^2.*(a(2) - (a(2).*a(3))./(a(2).^2.*cos(a(4) - 2.*atan(exp(-(V- a(5))./a(6)))).^2 + a(3).^2.*sin(a(4) - 2.*atan(exp(-(V - a(5))./a(6)))).^2).^(1/2)).^2 + a(7).^2).^(1/2)).^2
Models with trig components ALWAYS seem to have problems with multiple solutions. This is just a natural consequence of periodic functions. Powers and roots of parameters also cause problems because, again, they introduce multiple solutions.
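For this particular model, both effects can be checked directly. a(1) and a(7) appear only as squares, so flipping either sign leaves F unchanged, and a(4) appears only inside cos^2 and sin^2, so shifting it by pi also leaves F unchanged up to round-off. A minimal sketch, using placeholder parameter values and a placeholder voltage grid (both assumed, not taken from the data):

```matlab
% Model from the question as a function of the parameter vector a and voltage V.
theta = @(a,V) a(4) - 2.*atan(exp(-(V - a(5))./a(6)));
model = @(a,V) cos(sqrt(a(1).^2.*(a(2) - (a(2).*a(3))./ ...
    sqrt(a(2).^2.*cos(theta(a,V)).^2 + a(3).^2.*sin(theta(a,V)).^2)).^2 + a(7).^2)).^2;

V = linspace(0, 160, 50);            % placeholder voltage grid
a = [5 1.5 1.7 0.5 80 10 1];         % placeholder parameter values

% Sign flips of a(1) and a(7): these enter the model only as squares.
aFlip = a;
aFlip([1 7]) = -aFlip([1 7]);
signDiff = max(abs(model(a,V) - model(aFlip,V)));

% Shift of a(4) by pi: cos^2 and sin^2 both have period pi.
aShift = a;
aShift(4) = a(4) + pi;
shiftDiff = max(abs(model(a,V) - model(aShift,V)));

fprintf('max change after sign flips: %g\n', signDiff);
fprintf('max change after pi shift:   %g\n', shiftDiff);
```

Restricting a(1) and a(7) to non-negative values and a(4) to an interval of length pi removes these exact duplicates from the search space. It does not remove other local minima.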
Again, I'm sorry, but this is just a bullet you need to bite. You need to understand the model you have proposed. After all, if you chose to build that model, you should have done so for a reason. So spend the time to learn about how a parameter impacts the model. In this model, at least a5 and a6 are trivial to understand, as shift and scale parameters. The others can probably have some interpretations, if you spend some time, but I won't go into that rabbit hole.
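To make the roles of a5 and a6 visible, the MATLAB expression can be read term by term as follows. The symbols \theta and \Delta are shorthand introduced here, not names from the original post.

```latex
\theta(V) = a_4 - 2\arctan\!\left(e^{-(V-a_5)/a_6}\right), \qquad
\Delta(V) = a_2 - \frac{a_2 a_3}{\sqrt{a_2^2\cos^2\theta(V) + a_3^2\sin^2\theta(V)}}, \qquad
F(V) = \cos^2\!\left(\sqrt{a_1^2\,\Delta(V)^2 + a_7^2}\right)
```

For a6 > 0, the term 2 arctan(exp(-x)) falls monotonically from pi to 0 as x runs from -infinity to +infinity and equals pi/2 at x = 0. So a5 is the voltage at the midpoint of that transition and a6 sets its width, which is why a plot of the raw data can suggest rough starting values for both.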
Your best solution is probably to use a multi-start method, or perhaps a tool like GA. In either case, they are designed to be LESS sensitive to problems of this sort, but expect it to be a difficult problem.
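A hand-rolled multi-start loop might look like the sketch below. It assumes the Optimization Toolbox (`lsqcurvefit`, `optimoptions`) is available. The data are synthetic placeholders generated from hypothetical parameters `aTrue`, and the bounds `lb` and `ub` are also hypothetical; replace all of them with the measured data and physically motivated limits.

```matlab
% Model from the question as a function of the parameter vector a and voltage V.
theta = @(a,V) a(4) - 2.*atan(exp(-(V - a(5))./a(6)));
model = @(a,V) cos(sqrt(a(1).^2.*(a(2) - (a(2).*a(3))./ ...
    sqrt(a(2).^2.*cos(theta(a,V)).^2 + a(3).^2.*sin(theta(a,V)).^2)).^2 + a(7).^2)).^2;

% Placeholder data: replace V and I with the measured voltage and intensity.
rng(0);                                  % reproducible random starts
V = linspace(0, 160, 120);
aTrue = [5 1.5 1.7 0.5 80 10 1];         % hypothetical, only used to synthesize data
I = model(aTrue, V) + 0.01*randn(size(V));

% Hypothetical bounds. a(1) and a(7) enter only squared, so keep them >= 0.
lb = [ 0 1.4 1.4 -5  50  1 0];
ub = [10 1.8 1.8  5 110 30 5];

opts = optimoptions('lsqcurvefit', 'Display', 'off');
nStarts = 50;
sseAll = inf(nStarts, 1);
bestSSE = inf;
bestA = [];
for k = 1:nStarts
    a0 = lb + rand(1,7).*(ub - lb);      % uniform random start inside the box
    [aHat, sse] = lsqcurvefit(model, a0, V, I, lb, ub, opts);
    sseAll(k) = sse;
    if sse < bestSSE                     % keep the best local solution so far
        bestSSE = sse;
        bestA = aHat;
    end
end

fprintf('best SSE over %d starts: %g\n', nStarts, bestSSE);
disp(bestA)
disp(sort(sseAll).')                     % clusters of equal SSE indicate shared basins
```

If the Global Optimization Toolbox is installed, `MultiStart` and `ga` automate the same idea. Either way, looking at the sorted list of SSE values shows how many distinct local solutions the starts fell into.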
6 comments
Alex Sha
18 Jul 2023
How about the fitting results below?
Sum Squared Error (SSE): 0.0123074231165673
Root of Mean Square Error (RMSE): 0.00879802118852625
Correlation Coef. (R): 0.999788515515635
R-Square: 0.999577075756958
Parameter Best Estimate
--------- -------------
a1 -4.82972314218071
a2 1.4527434701678
a3 0.00133382444541842
a4 1.57200300023204
a5 960.215290629223
a6 -122.525680565375
a7 3.28633718110081
or:
Sum Squared Error (SSE): 0.0123080008287356
Root of Mean Square Error (RMSE): 0.00879822767627776
Correlation Coef. (R): 0.999788446689249
R-Square: 0.999576938133302
Parameter Best Estimate
--------- -------------
a1 -16.2089136197006
a2 -0.432859328507279
a3 0.000378668065253761
a4 -1.56964679801966
a5 966.029672568356
a6 -122.507995077863
a7 -3.28637077838881

Akaash Srikanth
18 Jul 2023
Edited: Akaash Srikanth
18 Jul 2023
Alex Sha
18 Jul 2023
@Akaash Srikanth, only you know the background of your fitting problem, so please give the range limits of each parameter in advance, like "0<a2,a3<1". How about the others?
Akaash Srikanth
18 Jul 2023
Edited: Akaash Srikanth
18 Jul 2023
Akaash Srikanth
24 Jul 2023
Edited: Akaash Srikanth
25 Jul 2023
Alex Sha
26 Jul 2023
Taking the parameter ranges as: a1=[20000, 80000],a2=[1.4,1.8],a3,a4,a5=[1.4,1.8],a6,a7=[0.3,0.9],a8=[0.3,2]; gives:
Sum Squared Error (SSE): 0.913188311071197
Root of Mean Square Error (RMSE): 0.0825520329387615
Correlation Coef. (R): 0.999569920445517
R-Square: 0.999140025859457
Parameter Best Estimate
--------- -------------
a1 20000.0000113322
a2 1.40000693995549
a3 125.071659909581
a4 125.071307157303
a5 1.40050427357894
a6 1.24015985413439
a7 0.300000000000583
a8 0.796932506770127

It is easy to see that parameters a1, a2, a5 and a7 all sit at the lower bounds of their ranges, so if the ranges are changed to: a1=[10000, 80000],a2=[0,1.8],a3,a4,a5=[0,1.8],a6,a7=[0.0,0.9],a8=[0.0,2];
the result is a little better:
Sum Squared Error (SSE): 0.392117528292817
Root of Mean Square Error (RMSE): 0.054094826103246
Correlation Coef. (R): 0.999815349109032
R-Square: 0.999630732314016
Parameter Best Estimate
--------- -------------
a1 10000.0637696401
a2 0.00172350448796937
a3 -279.422101867729
a4 -279.421813550937
a5 0.000794456392713509
a6 -0.260580870966713
a7 0.256682315830075
a8 1.33579711975512
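The check for estimates sitting on a bound, done by eye above, can be automated. The sketch below uses the ranges and estimates of the first 8-parameter run. It assumes that a3, a4 and a6, which are listed without a range of their own, were left unbounded. That reading is a guess based on their fitted values lying outside every listed range. The tolerance `tol` is an arbitrary choice.

```matlab
% Ranges and estimates copied from the first 8-parameter run.
% Assumption: a3, a4 and a6 were unbounded (see the note above).
lb = [20000 1.4 -Inf -Inf 1.4 -Inf 0.3 0.3];
ub = [80000 1.8  Inf  Inf 1.8  Inf 0.9 2  ];
aHat = [20000.0000113322 1.40000693995549 125.071659909581 125.071307157303 ...
        1.40050427357894 1.24015985413439 0.300000000000583 0.796932506770127];

tol = 1e-3;   % relative closeness that counts as "at the bound" (arbitrary)

% A parameter is flagged only if its bound is finite and the estimate is within tol of it.
atLower = isfinite(lb) & ((aHat - lb) <= tol.*max(abs(lb), 1));
atUpper = isfinite(ub) & ((ub - aHat) <= tol.*max(abs(ub), 1));

fprintf('at lower bound: %s\n', mat2str(find(atLower)));
fprintf('at upper bound: %s\n', mat2str(find(atUpper)));
```

With these inputs the flagged indices are 1, 2, 5 and 7, matching the parameters named above. An estimate pinned to a bound usually means the bound, not the data, is determining that value.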


