What is the difference between different ways to do least squares?

35 views (last 30 days)
Zeyuan
Zeyuan on 13 Oct 2025 at 2:34
Edited: Matt J on 13 Oct 2025 at 15:47
I ran into a problem when using different ways to do least squares: I get different results (some quite different), and I want to know why. Basically, I tried three methods to minimize ||Aθ - y||:
theta_train_5k = ((A_train_5k'*A_train_5k)^-1)*A_train_5k'*y_train_5k;
% least squares via the normal equations
theta_train_5k_3 = A_train_5k\y_train_5k;
% least squares via backslash (mldivide)
theta_train_5k_2 = lsqr(A_train_5k,y_train_5k);
% least squares via the iterative solver lsqr
And I found different results.
theta_train_100 = ((A_train_100'*A_train_100)^-1)*A_train_100'*y_train_100;
% normal equations for the 100-point data
theta_train_100_3 = A_train_100\y_train_100;
% backslash (mldivide) for the 100-point data
theta_train_100_2 = lsqr(A_train_100,y_train_100);
% lsqr for the 100-point data
For this one the result is even stranger: theta_train_100 is 1,000 to 100,000 times larger than theta_train_100_3 and theta_train_100_2. So I was wondering: when should I use which method? Does it have something to do with the condition number or the singular values of the matrix?
Please help. Thank you in advance.
The variables are in the attachment.

Accepted Answer

Matt J
Matt J on 13 Oct 2025 at 2:54
Edited: Matt J on 13 Oct 2025 at 14:56
The train_100 system is underdetermined, so of course you aren't going to get a unique solution.
For the 5k data, the only reason you see a significant disagreement with lsqr is that you ran lsqr with too few iterations and too loose a tolerance. You can see below that adjusting these reduces the disagreement. In any case, mldivide() is considered the efficient and stable method for small, dense systems (which yours is), so there is no reason to use lsqr here.
load myvariable
theta_train_5k = ((A_train_5k'*A_train_5k)^-1)*A_train_5k'*y_train_5k;
% This is the result of least square
theta_train_5k_3 = A_train_5k\y_train_5k;
% This is also the result of least square
theta_train_5k_2 = lsqr(A_train_5k,y_train_5k,1e-8,300);
lsqr converged at iteration 183 to a solution with relative residual 0.36.
pdiff=@(a,b) norm(a-b)/norm(a)*100; % percent disagreement function
pdiff(theta_train_5k_3, theta_train_5k )
ans = 1.1997e-11
pdiff(theta_train_5k_3, theta_train_5k_2 )
ans = 7.7810e-04
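As a side note on the normal-equations variant: forming A'*A squares the condition number of A, so that approach loses accuracy first as A becomes ill-conditioned. A standalone sketch with synthetic data (not the attached variables) illustrates this:

```matlab
% Standalone sketch: cond(A'*A) is roughly cond(A)^2, so the normal
% equations amplify rounding error much sooner than backslash does.
x = linspace(0,1,50)';
A = x.^(0:7);                 % Vandermonde-style matrix, poorly conditioned
theta_true = ones(8,1);
y = A*theta_true;
cond(A)                       % already large
cond(A'*A)                    % roughly cond(A)^2 -- far worse
theta_ne = (A'*A)\(A'*y);     % normal equations
theta_bs = A\y;               % QR-based mldivide (backward stable)
norm(theta_ne - theta_true)   % noticeably less accurate
norm(theta_bs - theta_true)
```

Note also that (A'*A)\(A'*y) is preferable to ((A'*A)^-1)*A'*y even within the normal-equations approach, since it avoids forming an explicit inverse.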
3 Comments
Zeyuan
Zeyuan on 13 Oct 2025 at 15:23
Also, I found that if we add the 1e-8, 300 arguments to the lsqr call, it kind of overfits, so the testing accuracy goes down by 0.2%.
Matt J
Matt J on 13 Oct 2025 at 15:31
Edited: Matt J on 13 Oct 2025 at 15:47
I am a bit confused. If we do not get a unique solution for the train_100 data, how can we still get results in theta_train_100, theta_train_100_2, and theta_train_100_3?
Least squares solutions still exist even when they are non-unique (there will be infinitely many). But you cannot expect different methods to give you the same one.
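For example, in a small synthetic underdetermined system (standalone sketch, not the attached data), backslash and pinv() return different solutions that fit the data equally well:

```matlab
% Standalone sketch: 3 equations, 5 unknowns -> infinitely many
% least-squares solutions, and different solvers pick different ones.
rng(1);
A = randn(3,5);
y = randn(3,1);
x1 = A\y;             % a "basic" solution (at most 3 nonzero entries)
x2 = pinv(A)*y;       % the minimum-norm solution
norm(A*x1 - y)        % both residuals are (numerically) zero...
norm(A*x2 - y)
norm(x1 - x2)         % ...yet the solutions themselves differ
```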
Also, I found that if we add the 1e-8, 300 arguments to the lsqr call, it kind of overfits, so the testing accuracy goes down by 0.2%.
That doesn't mean the least-squares solver made a mistake. The equations you provided were still solved correctly, as we can see from the three-way agreement between the solver results.
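A quick sanity check, using the variable names from the question (assuming the attached variables are loaded), is to compare residual norms rather than the coefficient vectors: non-unique solutions can differ wildly in theta while fitting the data equally well.

```matlab
% Residual norms should (nearly) agree even when the thetas do not.
res = @(theta) norm(A_train_100*theta - y_train_100);
[res(theta_train_100), res(theta_train_100_2), res(theta_train_100_3)]
```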


More Answers (0)

Version

R2025b
