Why can't I fit a function with more than five variables?

huazai2020 on 5 June 2020
Commented: Walter Roberson on 14 June 2020
Why can't I fit a function with more than five variables? It gives the errors below. Can anyone help me?
20 comments
huazai2020 on 14 June 2020
What does this line mean: B = [ones(nx,1), log(x)] \ log(y(:));
Walter Roberson on 14 June 2020
Consider the model
log(y) = log(b1) + b2*log(x1) + b3*log(x2) + b4*log(x3) + b5*log(x4) + b6*log(x5)
but suppose you have several x* and y values. Then you can create a series of equations
log(y(1)) = log(b1)*1 + b2*log(x1(1)) + b3*log(x2(1)) + b4*log(x3(1)) + b5*log(x4(1)) + b6*log(x5(1))
log(y(2)) = log(b1)*1 + b2*log(x1(2)) + b3*log(x2(2)) + b4*log(x3(2)) + b5*log(x4(2)) + b6*log(x5(2))
log(y(3)) = log(b1)*1 + b2*log(x1(3)) + b3*log(x2(3)) + b4*log(x3(3)) + b5*log(x4(3)) + b6*log(x5(3))
...
Now arrange those in matrix form:
log(y(:)) = [1, log(x1(1)), log(x2(1)), log(x3(1)), log(x4(1)), log(x5(1));
1, log(x1(2)), log(x2(2)), log(x3(2)), log(x4(2)), log(x5(2));
1, log(x1(3)), log(x2(3)), log(x3(3)), log(x4(3)), log(x5(3));
...
] * [log(b1); b2; b3; b4; b5; b6]
where * is algebraic matrix multiplication.
This can then be written more compactly as
log(y(:)) = [Column of 1s, log(x1(:)), log(x2(:)), log(x3(:)), log(x4(:)), log(x5(:))] ...
* [log(b1); b2; b3; b4; b5; b6]
and since your x1 = x(:,1) and x2 = x(:,2) and so on, the log(x1(:)), log(x2(:)) and so on can be written more compactly as log(x), so
log(y(:)) = [Column of 1s, log(x)] * [log(b1); b2; b3; b4; b5; b6]
This is a system of linear equations. If we say
b = log(y(:))
A = [Column of 1s, log(x)]
X = [log(b1); b2; b3; b4; b5; b6]
then we get the familiar A*X = b.
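As a rough sketch of how A and b can be built in MATLAB, here is an example on made-up data. The sample count nx = 17, the random x values and the coefficients in btrue are assumptions for illustration only, not your data.

```matlab
% Hypothetical data: 17 samples, 5 strictly positive predictors
% (log() needs x > 0 and y > 0 to stay real)
rng(0);                                   % reproducible random numbers
nx = 17;
x  = 1 + 9*rand(nx, 5);                   % nx-by-5, values between 1 and 10
btrue = [2; 0.5; -0.3; 0.1; 0.8; -0.6];   % made-up [b1; b2; b3; b4; b5; b6]

% y = b1 * x1^b2 * x2^b3 * x3^b4 * x4^b5 * x5^b6, evaluated row by row
y = btrue(1) * prod(x .^ btrue(2:end).', 2);

A = [ones(nx,1), log(x)];                 % nx-by-6: column of 1s, then log of each x column
b = log(y(:));                            % nx-by-1 right-hand side
disp(size(A))
```

Each row of A is one of the equations written out above, and b stacks the corresponding log(y) values.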
If A were square (that is, you had 6 samples, one per unknown) then mathematically you would multiply both sides on the left by inv(A), getting
inv(A) * A * X = inv(A) * b
and inv(A) * A would be the identity matrix, and inv(A) * b could be calculated as all of those values are known, so the vector of unknowns is X = inv(A) * b.
You have more than 6 samples, so you do not have a square system, so you cannot use inv(), but you can do the equivalent of
pinv(A) * A * X = pinv(A) * b
to get X = pinv(A) * b as a solution that attempts to minimize error.
The MATLAB operator A\b calculates roughly the same thing as pinv(A) * b, but by a different method: for a non-square A, A\b solves the system directly in the least-squares sense rather than forming a pseudoinverse first.
ones(nx,1) is the "column of 1s" mentioned earlier.
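To check that A\b and pinv(A)*b give essentially the same answer on a well-conditioned problem, here is a small sketch. The data, the noise level and the exponents are made up for illustration.

```matlab
% Hypothetical data: 17 samples, 5 positive predictors, power-law response
rng(0);
nx = 17;
x  = 1 + 9*rand(nx, 5);
y  = 2 * x(:,1).^0.5 .* x(:,2).^(-0.3) .* x(:,3).^0.1 ...
       .* x(:,4).^0.8 .* x(:,5).^(-0.6);
y  = y .* exp(0.05*randn(nx,1));      % multiplicative noise keeps y positive

A  = [ones(nx,1), log(x)];            % design matrix in log space
B1 = A \ log(y(:));                   % least-squares solution via backslash
B2 = pinv(A) * log(y(:));             % same problem via the pseudoinverse

maxdiff = max(abs(B1 - B2));          % largest disagreement between the two
b1 = exp(B1(1));                      % first element is log(b1), so undo the log
fprintf('largest difference between the two solutions: %g\n', maxdiff);
```

When A has full column rank the two results should differ only by rounding error. They can differ noticeably when columns of A are nearly dependent.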
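Now, this all would be a least-squares calculation in log space, and would be the best fit you could get in log space for that model. But you probably want a least-squares fit in linear space, so take the result as initial values to feed into a fitting routine that works in linear space (a nonlinear least-squares fit, because the model is not linear in b2 through b6 there). Remember that the first element of the result is log(b1), so use exp() of it as the initial value for b1.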
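A minimal sketch of that two-stage idea, assuming the Statistics and Machine Learning Toolbox is available for fitnlm (the routine used elsewhere in this thread). The data and noise level are made up, and any other nonlinear least-squares routine could stand in for fitnlm.

```matlab
% Hypothetical data: 17 samples, 5 positive predictors
rng(0);
nx = 17;
x  = 1 + 9*rand(nx, 5);
y  = 2 * x(:,1).^0.5 .* x(:,2).^(-0.3) .* x(:,3).^0.1 ...
       .* x(:,4).^0.8 .* x(:,5).^(-0.6);
y  = y .* (1 + 0.05*randn(nx,1));     % noisy, but still positive

% Stage 1: least-squares fit in log space
B = [ones(nx,1), log(x)] \ log(y(:));
beta0 = [exp(B(1)); B(2:end)];        % convert log(b1) back to b1

% Stage 2: least-squares fit in linear space, started from beta0
modelfun = @(b,x) b(1) .* x(:,1).^b(2) .* x(:,2).^b(3) .* x(:,3).^b(4) ...
                       .* x(:,4).^b(5) .* x(:,5).^b(6);
mdl  = fitnlm(x, y, modelfun, beta0);
bfit = mdl.Coefficients.Estimate;

% Compare the sum of squared errors in linear space before and after refinement
sse0 = sum((y - modelfun(beta0, x)).^2);
sse1 = sum((y - modelfun(bfit,  x)).^2);
fprintf('SSE from log-space start: %g, after refinement: %g\n', sse0, sse1);
```

Starting from the log-space solution usually puts the nonlinear fit close to a good minimum, which matters because routines like this can stall from a poor initial guess.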


Accepted Answer

the cyclist on 7 June 2020
I think the primary problem is that some of your variables are highly correlated with each other, and therefore (a) add very little information to the model, and (b) will contribute to over-fitting because of the extra parameters. The following works:
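% Read the numeric data from the spreadsheet
data = xlsread('data_youhua.xlsx','Sheet1');
% Keep only two of the predictors (columns 1 and 3); the response is column 6
X = data(:,[1 3]);
y = data(:,6);
% Power-law model: y = b1 * x1^b2 * x2^b3
modelfun = @(b,x) b(1).*x(:,1).^b(2).*x(:,2).^b(3);
beta0 = [-1 0.1 0.2]; % initial guess for [b1 b2 b3]
mdl = fitnlm(X,y,modelfun,beta0);
predicted_y = predict(mdl,X);
% Plot data and prediction against the data point index
figure
hold on
hd = plot(y,'.');
hp = plot(predicted_y,'.');
set([hd hp],'MarkerSize',24)
% Join each data point to its prediction with a black line
for ny = 1:numel(y)
    hc = line([ny ny],[y(ny) predicted_y(ny)]);
    set(hc,'Color','black')
end
legend([hd hp],{'data','prediction'})
% Save the figure as a 600 dpi PNG
print('-dpng','-r600','test.png')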
and results in the following fit
Note that I did not draw the fit as a continuous line. The reason is that the x-axis here is not a continuous variable. It is just the ordinal count of your data points. There are almost certainly better ways to plot the comparison of the data and the fit, but this is at least not incorrect.
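One possible alternative, sketched here with made-up numbers standing in for y and predicted_y, is to plot predicted against observed values together with a 1:1 reference line. Points near the line are well predicted.

```matlab
% Hypothetical stand-ins for the observed data and the model predictions
y           = [1.2; 2.5; 3.1; 4.8; 6.0];
predicted_y = [1.0; 2.9; 3.0; 4.5; 6.4];

% Common axis range covering both observed and predicted values
lims = [min([y; predicted_y]), max([y; predicted_y])];

figure
hold on
plot(y, predicted_y, '.', 'MarkerSize', 24)   % one point per observation
plot(lims, lims, 'k-')                        % 1:1 line: perfect prediction
xlabel('observed y')
ylabel('predicted y')
axis equal
```

This avoids putting the ordinal index on the x-axis, at the cost of no longer showing which data point is which.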
15 comments
huazai2020 on 11 June 2020
What do you mean by "Five independent variables for seventeen data points is just noise."? I do not understand.
Rik on 11 June 2020
You have way too little data to fit that many variables. Only if your data fits the function perfectly does that make any sense.
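To illustrate with an extreme case, here is a sketch using made-up random numbers only. With six samples and the six-parameter power-law model, the log-space fit reproduces y exactly even though y has nothing to do with x.

```matlab
% Hypothetical example: y is pure noise, unrelated to the predictors
rng(0);
x = 1 + 9*rand(6, 5);          % 6 samples, 5 positive predictors
y = 1 + 9*rand(6, 1);          % random positive "response"

A = [ones(6,1), log(x)];       % 6-by-6: as many unknowns as samples
B = A \ log(y);                % solves the square system

resid = norm(A*B - log(y));    % residual of the fit in log space
fprintf('residual norm with 6 samples and 6 parameters: %g\n', resid);
```

A residual near zero here says nothing about the model being right. With only a few more samples than parameters, much of an apparently good fit can come from the parameters absorbing noise in the same way.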


More Answers (0)
