Problem with polyfit command (R2015a)

T Hafid on 30 May 2024
Below is a small script that uses the polyfit command. Surprisingly, the last command gives me a completely wrong polynomial p, and I don't understand why. Thanks in advance for your help.
Below the script is the output from my version of MATLAB (R2015a).
%--------------------------------------------------------------------------------------------------------------
%butta_sto_test
%
clear all
clc
x=[1:9]
x = 1x9
     1     2     3     4     5     6     7     8     9
y=[5,6,10,20,28,33,34,36,42]
y = 1x9
     5     6    10    20    28    33    34    36    42
p=polyfit(x,y,1)
p = 1x2
    4.9833   -1.1389
[p,S]=polyfit(x,y,1)
p = 1x2
    4.9833   -1.1389
S = struct with fields:
           R: [2x2 double]
          df: 7
       normr: 8.4581
    rsquared: 0.9542
[p,S,mu]=polyfit(x,y,1)
p = 1x2
   13.6474   23.7778
S = struct with fields:
           R: [2x2 double]
          df: 7
       normr: 8.4581
    rsquared: 0.9542
mu = 2x1
    5.0000
    2.7386
%----------------------------------------------------------------------------------------------------------------------
x =
1 2 3 4 5 6 7 8 9
y =
5 6 10 20 28 33 34 36 42
p =
4.9833 -1.1389
p =
4.9833 -1.1389
S =
R: [2x2 double]
df: 7
normr: 8.4581
p =
13.6474 23.7778
S =
R: [2x2 double]
df: 7
normr: 8.4581
mu =
5.0000
2.7386
  1 comment
Dyuman Joshi on 30 May 2024
You are, most likely, not using the appropriate syntax of polyval() to evaluate the polyfit output obtained, as Matt has shown below.

Accepted Answer

Matt J on 30 May 2024
Edited: Matt J on 30 May 2024
The result from polyfit is correct. I suspect you are simply not using polyval properly to evaluate the fit:
x=[1:9] ;
y=[5,6,10,20,28,33,34,36,42];
[p,S,mu]=polyfit(x,y,1);
xx=linspace(1,9);
plot(x,y,'o',xx,polyval(p,xx,S,mu)); legend Data Line-Fit
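To make the difference concrete, here is a minimal sketch that evaluates the three-output result both ways. The variable names yWrong, yRight and yManual are only illustrative and do not come from the original script. It assumes nothing beyond polyfit and polyval, so it should also run on R2015a.

```matlab
x = 1:9;
y = [5,6,10,20,28,33,34,36,42];
[p,S,mu] = polyfit(x,y,1);              % p holds coefficients in (x-mu(1))/mu(2)

yWrong  = polyval(p,x);                 % treats p as a polynomial in raw x: not the fit
yRight  = polyval(p,x,[],mu);           % polyval applies the centering and scaling itself
yManual = polyval(p,(x-mu(1))/mu(2));   % the same centering and scaling done by hand

disp(max(abs(yRight - yManual)))        % the two correct evaluations agree up to rounding
disp([yWrong(1) yRight(1)])             % the naive evaluation is far from the fit
```

If p from the three-output call is passed to polyval without mu, the values look "completely wrong" even though the fit itself is fine.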
  8 comments
Torsten on 30 May 2024
Edited: Torsten on 30 May 2024
The "two" polynomials p1 and p3 returned by polyfit are one and the same polynomial p. Only their coefficients a_i and b_i differ, because they are expanded in different basis functions.
p1 is written in the standard basis {1, X, X^2, ..., X^n} as
p1(X) = sum_{i=0}^{n} a_i * X^i
while p3 is written in the basis {1, Xhat, Xhat^2, ..., Xhat^n}, where Xhat = (X-mean(xdata))/std(xdata), as
p3(X) = sum_{i=0}^{n} b_i * ((X-mean(xdata))/std(xdata))^i
The b_i's are not back-transformed to the a_i's because of @Matt J's argument given below.
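For illustration only, the b_i can be expanded back into the a_i by hand. The sketch below does this with a Horner-style loop built on conv. The names lin and a are mine, and this is not something polyfit does internally. For high degrees or large x values the expansion can amplify rounding error, which is the concern Matt J raises in this thread.

```matlab
x = 1:9;
y = [5,6,10,20,28,33,34,36,42];
p1 = polyfit(x,y,1);              % coefficients a_i in powers of X
[p3,~,mu] = polyfit(x,y,1);       % coefficients b_i in powers of (X-mu(1))/mu(2)

% lin is the linear map X -> (X-mu(1))/mu(2) written as a polynomial in X
lin = [1/mu(2), -mu(1)/mu(2)];

% Horner-style expansion of p3((X-mu(1))/mu(2)) in powers of X
a = p3(1);
for k = 2:numel(p3)
    a = conv(a, lin);             % multiply the running polynomial by the linear map
    a(end) = a(end) + p3(k);      % add the next coefficient of p3
end

disp([p1; a])                     % the two rows should agree up to rounding
```

For the degree-1 case this reduces to a_1 = b_1/mu(2) and a_0 = b_0 - b_1*mu(1)/mu(2).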
Matt J on 30 May 2024
Edited: Matt J on 30 May 2024
"It would be less confusing if the third form of the polyfit command also provided the same polynomial p as the first two forms."
The math operations required to undo the normalization would introduce floating-point errors into the unnormalized coefficients. Those errors could then degrade numerical accuracy when the polynomial is evaluated later.

More Answers (1)

Steven Lord on 30 May 2024
Edited: Steven Lord on 30 May 2024
From the polyfit documentation page: "[p,S,mu] = polyfit(x,y,n) performs centering and scaling to improve the numerical properties of both the polynomial and the fitting algorithm. This syntax additionally returns mu, which is a two-element vector with centering and scaling values. mu(1) is mean(x), and mu(2) is std(x). Using these values, polyfit centers x at zero and scales it to have unit standard deviation, xhat = (x - mean(x))/std(x)."
If you call polyfit with three outputs, p is not a polynomial in x. It is a polynomial in the centered and scaled xhat = (x - mu(1))/mu(2).
xdata=[1:9];
y=[5,6,10,20,28,33,34,36,42];
p = polyfit(xdata, y, 1);
Let's look at p symbolically.
psym = poly2sym(p);
polynomialInX = vpa(psym, 5)
polynomialInX = 4.9833*x - 1.1389
Now let's look at the polynomial in the centered and scaled xhat.
[p, ~, mu] = polyfit(xdata, y, 1);
syms xhat
polynomialInXhat = vpa(poly2sym(p, xhat), 5)
polynomialInXhat = 13.647*xhat + 23.778
These look different. But what happens if we substitute the expression for xhat into polynomialInXhat?
syms x
vpa(subs(polynomialInXhat, xhat, (x-mu(1))/mu(2)), 5)
ans = 4.9833*x - 1.1389
That looks the same as polynomialInX. What if we evaluate both polynomials, polynomialInX at the unscaled X data and polynomialInXhat at the scaled X data?
valueUnscaled = vpa(subs(polynomialInX, x, xdata), 5)
valueUnscaled = 
valueScaled = vpa(subs(polynomialInXhat, xhat, (xdata-mu(1))./mu(2)), 5)
valueScaled = 
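The same comparison can be made numerically without the Symbolic Math Toolbox. This is a minimal sketch. The names pRaw, pHat and maxDifference are mine, chosen so the two fits do not overwrite each other.

```matlab
xdata = 1:9;
y = [5,6,10,20,28,33,34,36,42];
pRaw = polyfit(xdata, y, 1);             % polynomial in x
[pHat, ~, mu] = polyfit(xdata, y, 1);    % polynomial in xhat = (x-mu(1))/mu(2)

valueUnscaled = polyval(pRaw, xdata);                    % evaluate at the raw x data
valueScaled   = polyval(pHat, (xdata - mu(1))./mu(2));   % evaluate at the scaled x data

maxDifference = max(abs(valueUnscaled - valueScaled))    % expected to be at rounding level
```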
The difference doesn't really matter that much for the 1st degree polynomial and the small magnitude x data you're using. But suppose you were doing something that required you to take the fourth power of a year, like if you were trying to fit the census data to the population?
load census
format longg
pUnscaled = polyfit(cdate, pop, 4)
Warning: Polynomial is badly conditioned. Add points with distinct X values, reduce the degree of the polynomial, or try centering and scaling as described in HELP POLYFIT.
pUnscaled = 1x5
4.75430030603743e-08 -0.000355569614612858 1.00320581128586 -1264.43935834017 600203.36463964
That leading coefficient is tiny because you're working with large numbers when you raise a year like 2020 to the fourth power. That's why you receive a warning.
2020^4
ans =
16649664160000
But if you'd centered and scaled the years (cdate runs from 1790 to 1990):
[pScaled, ~, mu] = polyfit(cdate, pop, 4)
pScaled = 1x5
0.704706162785502 0.92102307075127 23.4706157176829 73.8597813280959 62.2285498913524
mu = 2x1
1890
62.0483682299543
Now you're taking powers of numbers on the order of:
normalizedYears = normalize(cdate, 'center', mu(1), 'scale', mu(2))
normalizedYears = 21x1
-1.61164592805076 -1.45048133524568 -1.28931674244061 -1.12815214963553 -0.966987556830456 -0.80582296402538 -0.644658371220304 -0.483493778415228 -0.322329185610152 -0.161164592805076
and those numbers aren't nearly as large.
normalizedYears(end)^4
ans =
6.74650025299376
I'd much rather work with 6.7 than 16649664160000 and a leading coefficient near 0.7 rather than 4e-8.
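As far as I know, normalize is not available in R2015a (I believe it was added in R2018a), so on the release from the question the same numbers can be computed directly. The sketch below also compares the condition numbers of the degree-4 Vandermonde matrices for the raw and the scaled years, since polyfit solves a least-squares problem on such a matrix. The years are typed in rather than loaded from census.mat so the snippet is self-contained. That cdate is 1790:10:1990 is my assumption, and it is consistent with the mu and the 21x1 size shown above. The names Vraw, Vscaled, condRaw and condScaled are mine.

```matlab
cdate = (1790:10:1990)';                   % assumed to match cdate in census.mat
mu = [mean(cdate); std(cdate)];            % the centering and scaling values polyfit returns
normalizedYears = (cdate - mu(1))/mu(2);   % manual centering and scaling, no normalize needed

% Degree-4 Vandermonde matrices for the raw and the scaled years.
% bsxfun is used because implicit expansion is not available in R2015a.
Vraw    = bsxfun(@power, cdate, 4:-1:0);
Vscaled = bsxfun(@power, normalizedYears, 4:-1:0);

condRaw    = cond(Vraw)                    % very large: the badly conditioned case
condScaled = cond(Vscaled)                 % many orders of magnitude smaller
```

A large condition number means small rounding errors in the data can be strongly amplified in the coefficients, which is what the warning from the unscaled fit is about.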
  1 comment
T Hafid on 30 May 2024
Thanks Sir for the answer
