How to remove dependent rows in a matrix?

Let A be an m by n matrix whose rows are linearly dependent. I want to remove rows from A such that the rank does not decrease. How can I find such rows of A?

 Réponse acceptée

Matt J
Matt J le 5 Oct 2012
function [Xsub,idx]=licols(X,tol)
%Extract a linearly independent set of columns of a given matrix X
%
% [Xsub,idx]=licols(X)
%
%in:
%
% X: The given input matrix
% tol: A rank estimation tolerance. Default=1e-10
%
%out:
%
% Xsub: The extracted columns of X
% idx: The indices (into X) of the extracted columns
if ~nnz(X) %X has no non-zeros and hence no independent columns
Xsub=[]; idx=[];
return
end
if nargin<2, tol=1e-10; end
[Q, R, E] = qr(X,0);
if ~isvector(R)
diagr = abs(diag(R));
else
diagr = R(1);
end
%Rank estimation
r = find(diagr >= tol*diagr(1), 1, 'last'); %rank estimation
idx=sort(E(1:r));
Xsub=X(:,idx);

11 commentaires

Kees Roos
Kees Roos le 6 Oct 2012
Thanks a lot. It works fine.
Harry Coules
Harry Coules le 23 Jan 2014
Matt, I'm a little unclear as to how the rank estimation tolerance is working there. Could you explain it please? It looks like it works differently to the estimation tolerance in the rank function.
Matt J
Matt J le 23 Jan 2014
Modifié(e) : Matt J le 24 Jan 2014
Hi Harry,
When you call QR with the syntax shown, diagr = abs( diag( R ) ) will be sorted with descending elements and the number of trailing zeros it shows gives you the nullity of X. To decide whether an element is small enough to be considered a trailing zero, we do
r = find(diagr >= tol*diagr(1), 1, 'last');
which compares all of diagr's elements to its maximum element diagr(1). Note that this threshold selection method is adaptive to the magnitude of elements in X.
The rank() function is a bit naive, in that it doesn't scale its threshold tolerance relative to the magnitude of the X(i,j). That's the reason for not using it here.
Now that I look at it, though, there is a small bug
diagr = R(1);
should really be
diagr = abs(R(1));
Harry Coules
Harry Coules le 24 Jan 2014
Hi Matt, Thanks a lot for that. Sure, the rank function doesn't scale. But also it applies the tolerance to a vector of singular values calculated using svd rather than to the leading diagonal of the R-matrix. Can you explain the relationship between the two?
Matt J
Matt J le 24 Jan 2014
Modifié(e) : Matt J le 24 Jan 2014
I don't claim there is a relationship, other than that both the SVD and QR decompositions of a matrix give information about its rank, and that both work by rewriting the columns of the matrix as a linear combination of orthogonal basis vectors.
You could do something similar to what I've done via QR with SVD instead, but with a more robust thresholding rule than what rank() currently does, e.g.,
s=svd(X);
rank=nnz(s>=max(s)*tol);
SVD seems to be more computationally demanding than QR, however,
>> X=rand(1000); tic;[q,r,e]=qr(X,0);toc; tic;s=svd(X);toc
Elapsed time is 0.275871 seconds.
Elapsed time is 0.380738 seconds.
mohsen
mohsen le 27 Juin 2014
Hi Matt J
I have a 398*225 matrix and it has rank 225. I used upper function to remove some raw without decreasing rank . but lincols function returns a 398*160 matrix that has rank 160.
How can I fix it ?
thanks
Matt J
Matt J le 27 Juin 2014
Modifié(e) : Matt J le 27 Juin 2014
Hi mohsen,
I recommend that you attach your matrix in a .mat file, so that it can be studied.
mohsen
mohsen le 28 Juin 2014
I am really grateful for your help
mohsen,
When I do plot(svd(docoeff)), I get the plot below, which to my eye says that the rank of your matrix is well below 200.
Nevertheless, if you do consider such tiny singular values significant, you can set the rank threshold lower
>> X=licols(docoeff,1e-15); whos X
Name Size Bytes Class Attributes
X 398x225 716400 double
mohsen
mohsen le 28 Juin 2014
I calculate rank with Matlab rank() function. it says the rank is 225
I must decrease raws from 398 to 261 without decreasing rnak,and you said the licols function removes raws ,but it removes columns.
Matt J
Matt J le 28 Juin 2014
Modifié(e) : Matt J le 28 Juin 2014
Matlab's rank() function is not to be trusted blindly (as you can see from my previous plot). If nothing else, rank is subjectively dependent on the tolerance parameter that you use, just like I showed you that licols is. You chose to use the default tolerance, but a different choice would give you a different result, e.g.,
>> rank(docoeff,2)
ans =
203
I did not say that licols() removes rows. The help text clearly says that it removes columns. However, you can certainly use it to remove rows by transposing:
>> X=licols(docoeff.').'; whos X
Name Size Bytes Class Attributes
X 160x225 288000 double

Connectez-vous pour commenter.

Plus de réponses (2)

Jos (10584)
Jos (10584) le 24 Jan 2014
Another, very straightforward, approach is to include them one by one and observe the changes in rank … (I agree that this is not so elegant!).
N = size(A,1) ; % number of rows
IncludeTF = false(N,1) ; % by default, exclude all rows, except ...
IncludeTF(1) = true ; % first row which can always be included
R0 = rank(A) ; % the original rank
for k = 2:N, % loop over all rows
B = A(IncludeTF,:) ; % select the currently included rows of A
IncludeTF(k) = rank(B) < R0 ; % include in B when the rank is less
end
isequal(rank(B), R0) % check!

1 commentaire

Jeel Bhavsar
Jeel Bhavsar le 24 Nov 2018
I have the same question with gf matrix.Does this code work for gf(galois field) matrix?

Connectez-vous pour commenter.

Arash Rabbani
Arash Rabbani le 24 Août 2019
This is a shorter version of Jos solution if you needed:
R1=1;
for I=1:size(A,1)
R2=rank(A(1:I,:));
if R2~=R1; disp(I); end
R1=R2+1;
end

1 commentaire

Arash Rabbani
Arash Rabbani le 24 Août 2019
It displays the rows with linear dependany to other rows.

Connectez-vous pour commenter.

Catégories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by