Create least squares fit for linear combinations
Hi, I'm trying to create an unmixing algorithm and would like some help.
To start, I have two values (let's say x1 and x2) that make up an unknown (let's say y1). I'm trying to figure out what combination of these values makes up the unknown, so the equation would be:
y1 = (a1 * x1) + (a2 * x2)
where a1 + a2 = 1, since they're the proportions of each x value.
I'm trying to understand how to write a function that finds values of a1 and a2 that minimize the difference between the unknown and the combination. In essence, I'm trying to minimize the least squares residual:
y1 - ((a1 * x1) + (a2 * x2))
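For concreteness, a minimal synthetic setup of this problem might look like the following (all values and noise levels are hypothetical, just to give the later answers something to run against):

```matlab
% Hypothetical synthetic data for the mixing model y1 = a1*x1 + a2*x2
rng(0);                     % reproducible
n  = 100;
x1 = randn(n,1);            % first component signal
x2 = randn(n,1) + 2;        % second component signal
a1_true = 0.3;              % true mixing proportion (unknown in practice)
a2_true = 1 - a1_true;      % proportions sum to 1
y1 = a1_true*x1 + a2_true*x2 + 0.01*randn(n,1);  % observed mixture + noise
```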
Any help would be appreciated, thanks!
1 comment
Matt J
on 15 Apr 2022
Isn't this question the same as your previous one?
Answers (3)
Bruno Luong
on 13 Apr 2022
First write
a2 = 1 - a1
substitute into the first equation for y, then solve for a1 and recover a2.
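Following this substitution, the model becomes y1 = a1*x1 + (1-a1)*x2, i.e. y1 - x2 = a1*(x1 - x2), which is a one-unknown least squares problem. A sketch (assuming x1, x2, y1 are equal-sized column vectors):

```matlab
% Substitute a2 = 1 - a1:  y1 - x2 = a1*(x1 - x2)
dx = x1(:) - x2(:);
dy = y1(:) - x2(:);
a1 = (dx' * dy) / (dx' * dx);   % least-squares solution; equivalently dx \ dy
a2 = 1 - a1;
```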
Matt J
on 15 Apr 2022
I'm assuming x1,x2, and y1 are all equal-sized column vectors.
LHS=x1(:)-x2(:);           % coefficient of a1 after substituting a2 = 1-a1
RHS=y1(:)-x2(:);
a1=min(max(LHS\RHS,0),1);  % least-squares solve, clipped to [0,1]
a2=1-a1;
a=[a1,a2];
9 comments
Bruno Luong
on 16 Apr 2022
Edited: Bruno Luong on 16 Apr 2022
If x1 and x2 have noise, it is more appropriate to use total least squares regression:
dx = x1-x2;
dxy = y-x2;
%% least-squares
% a1 = dot(dxy,dx) / dot(dx,dx)
%% total least-squares
[~,~,V] = svd([dx(:) dxy(:)], 0);
a1 = -V(1,2) / V(2,2)
Image Analyst
on 16 Apr 2022
Does that guarantee that the proportions (coefficients) sum to exactly 1 as required? Or do we need an additional step to make sure, like
a = a / sum(a)
Bruno Luong
on 16 Apr 2022
Edited: Bruno Luong on 16 Apr 2022
If you want to be pedantic, Image Analyst, there is no such thing as "sum to exactly 1" in finite-precision arithmetic:
a=[1 eps(1)/2 eps(1)/2]
sum(a)==1
sum(flip(a))==1
Image Analyst
on 16 Apr 2022
@Bruno Luong well of course. But, to the precision of the computer, there may be a physical reason why the weights may need to sum to 1, like if they had 100 ml of fluid and applied certain, varying proportions to two beakers and did lots of experiments to get measured x1, x2, and y values. Now you wouldn't want a1 to be 40% and a2 to be 50% when you know for a fact, from the physics of the situation, that they need to add to 100%.
Perhaps I'm reading too much into the original poster's statement "a1 + a2 = 1, as they're percentages of each x value." Perhaps @Michelle Kwan could clarify how important that summing to 1 requirement is.
Bruno Luong
on 16 Apr 2022
Edited: Bruno Luong on 16 Apr 2022
Stop being pedantic: even if you normalize (for an arbitrary array a)
a = a/sum(a)
you rely on MATLAB's sum function performing the summation in a particular order, which is not a universal result (sum changed its behavior quite recently, in case you weren't aware). You think you correctly normalize using MATLAB's sum followed by a division? Think again.
a=randn(1,1e7);
a=a/sum(a);
sum(a)==1
Even with only two numbers, I bet you can find an example of 0 <= a1 <= 1 such that
a2 = 1-a1
a1+a2 == 1 % returns false
Matt J
on 16 Apr 2022
It's not clear to me that total least squares extends so easily when you have constraints on the variables. I think you would have to fall back to a 3-variable optimization with lsqlin,
C=[x1,x2,-y1];
C=C-mean(C);
d=[0;0;0];
a=lsqlin(C,d,[],[], [1,1,-1],0,d);
a=a(1:2)/a(3);
Bruno Luong
on 16 Apr 2022
Edited: Bruno Luong on 16 Apr 2022
Your lsqlin code doesn't seem to work.
x1=zeros(1,1000);
x2=ones(1,1000);
y=0.1*x1+0.9*x2;
x1=x1+0.1*randn(size(x1));
x2=x2+0.1*randn(size(x2));
y=y+0.01*randn(size(y));
dx = x1-x2;
dxy = y-x2;
% leastsquare
a1=dot(dxy,dx)./dot(dx,dx)
% total least-square
[~,~,V] = svd([dx(:) dxy(:)], 0); % find the SVD of Z.
a1 = -V(1,2) / V(2,2)
% Matt's code, with crash fix (d must match the number of rows of C)
C=[x1(:),x2(:),-y(:)];
C=C-mean(C);
d=zeros(size(C,1),1);
a=lsqlin(C,d,[],[], [1,1,-1],0,[0;0;0]);
a=a(1:2)/a(3);
a(1)
x1=zeros(1,1000);
x2=ones(1,1000);
y=0.1*x1+0.9*x2;
x1=x1+0.2*randn(size(x1));
x2=x2+0.2*randn(size(x2));
y=y+0.02*randn(size(y));
dx = x1-x2;
dxy = y-x2;
% leastsquare
a1=dot(dxy,dx)./dot(dx,dx)
% total least-square
[~,~,V] = svd([dx(:) dxy(:)], 0); % find the SVD of Z.
a1 = -V(1,2) / V(2,2)
%total least square constrained
C=[x1(:),x2(:),-y(:)];
d=zeros(size(C,1),1);
a=lsqlin(C,d,[],[], [1,1,-1],0,[0;0;0]);
a=a(1:2)/a(3);
a(1)
Better, but I don't think what you call lsqlin is total least squares according to Wikipedia, where it is the Frobenius norm of the augmented error matrix
% norm([E F],'fro')
that is minimized, not
% norm(C*a,2)
Your solution and mine clearly differ if you run a smaller numerical example.
x1=zeros(1,10);
x2=ones(1,10);
y=0.1*x1+0.9*x2;
x1=x1+0.1*randn(size(x1));
x2=x2+0.1*randn(size(x2));
y=y+0.01*randn(size(y));
dx = x1-x2;
dxy = y-x2;
% leastsquare
a1=dot(dxy,dx)./dot(dx,dx)
% total least-square
[~,~,V] = svd([dx(:) dxy(:)], 0); % find the SVD of Z.
a1 = -V(1,2) / V(2,2)
%total least square constrained
C=[x1(:),x2(:),-y(:)];
d=zeros(size(C,1),1);
a=lsqlin(C,d,[],[], [1,1,-1],0,[0;0;0]);
a=a(1:2)/a(3)
Bruno Luong
on 17 Apr 2022
Edited: Bruno Luong on 17 Apr 2022
I also made a flawed total least squares implementation earlier (in the comments under Matt's answer): dx and dxy are correlated, so applying TLS to those vectors is not appropriate. A better implementation is:
n=1000;
x1=zeros(1,n);
x2=ones(1,n);
y=0.1*x1+0.9*x2;
x1=x1+0.1*randn(size(x1));
x2=x2+0.1*randn(size(x2));
y=y+0.1*randn(size(y));
% total least-square
% assuming cov([x1(:) x2(:) y(:)]) is sigma*eye(3)
v = (y+x1-2*x2)/sqrt(6);
w = (y-x1)/sqrt(2);
[~,~,V] = svd([v(:) w(:)], 0);
V22 = V(2,2)*sqrt(6);
V12 = V(1,2)*sqrt(2);
a1 = (V22-V12)/(V22+V12);
a1 = min(max(a1,0),1);
a2 = 1-a1;
a = [a1; a2]
1 comment
Bruno Luong
on 17 Apr 2022
Doing some Monte Carlo to evaluate the precision of each method (script attached). LSQ and LSQLIN seem to have some systematic error (bias).
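The attached script is not shown in the thread; a minimal Monte Carlo sketch along these lines (using the same synthetic setup and the two estimators from the comments above; the trial count and noise levels are assumptions) could be:

```matlab
% Hypothetical Monte Carlo comparison: plain LSQ vs decorrelated TLS
a1_true = 0.1;  n = 1000;  ntrial = 200;
a1_lsq = zeros(ntrial,1);  a1_tls = zeros(ntrial,1);
for k = 1:ntrial
    x1 = 0*ones(1,n) + 0.1*randn(1,n);   % noisy measurements of 0
    x2 = 1*ones(1,n) + 0.1*randn(1,n);   % noisy measurements of 1
    y  = a1_true*0 + (1-a1_true)*1 + 0.1*randn(1,n);
    % plain least squares on the substituted model
    dx = x1 - x2;  dxy = y - x2;
    a1_lsq(k) = dot(dxy,dx) / dot(dx,dx);
    % decorrelated total least squares (Bruno's better implementation)
    v = (y + x1 - 2*x2)/sqrt(6);
    w = (y - x1)/sqrt(2);
    [~,~,V] = svd([v(:) w(:)], 0);
    V22 = V(2,2)*sqrt(6);  V12 = V(1,2)*sqrt(2);
    a1_tls(k) = (V22 - V12)/(V22 + V12);
end
bias_lsq = mean(a1_lsq) - a1_true
bias_tls = mean(a1_tls) - a1_true
```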
