Algorithm to extract linearly dependent columns in a matrix

Question

HN on 3 August 2020

https://fr.mathworks.com/matlabcentral/answers/574543-algorithm-to-extract-linearly-dependent-columns-in-a-matrix

Edited: Matt J on 8 June 2023

Accepted answer: John D'Errico

Is there a general or standard approach for extracting the linearly dependent columns of a given matrix?

Thanks, any help is appreciated!

Answer 1

John D'Errico on 3 August 2020

https://fr.mathworks.com/matlabcentral/answers/574543-algorithm-to-extract-linearly-dependent-columns-in-a-matrix#answer_474610

It is difficult to answer this. Consider a simple matrix:

format short g
A = rand(4,5)
A =
      0.73796      0.36725     0.042904    0.0056783      0.45642
      0.13478      0.54974      0.60439      0.39645      0.36555
       0.4903      0.25921      0.63407      0.77345      0.95769
      0.43373      0.27658       0.6462      0.25778      0.23421

The matrix (since it is random) will almost surely be of full rank, thus 4 in this case.

EVERY column is linearly dependent on the others. That is, we can write every column as a linear combination of the other 4 columns. I can argue problems exist with other matrices too.

A = magic(6)
A =
    35     1     6    26    19    24
     3    32     7    21    23    25
    31     9     2    22    27    20
     8    28    33    17    10    15
    30     5    34    12    14    16
     4    36    29    13    18    11
     
rank(A)
ans =
     5

So A here has rank 5. But again, we can write any column of A as a linear combination of the other 5. So which column is linearly dependent? They all are!
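The claim that every column of magic(6) depends on the others can be checked numerically. This is only a sketch: the variable names and the 1e-10 threshold are my own choices, not part of the answer above. For each column k it solves a least-squares problem against the remaining five columns and looks at the relative residual.

```matlab
A = magic(6);
n = size(A,2);
relres = zeros(1,n);
for k = 1:n
    others = setdiff(1:n, k);           % indices of the remaining columns
    c = A(:,others) \ A(:,k);           % least-squares coefficients for column k
    relres(k) = norm(A(:,others)*c - A(:,k)) / norm(A(:,k));
end
% true if every column is reproduced by the others up to round-off
% (1e-10 is an assumed threshold)
allDependent = all(relres < 1e-10)
```

If allDependent comes back true, no single column can be singled out as "the" dependent one.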

Perhaps you might get something out of the null space vector(s).

Anull = null(A);
Anull = Anull/Anull(1)
Anull =
            1
            1
         -0.5
           -1
           -1
          0.5

This gives us the linear combination of importance as:

A(:,1) + A(:,2) - 0.5*A(:,3) - A(:,4) - A(:,5) + 0.5*A(:,6) = 0

We can now solve for ANY of those columns, in terms of the others.
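As a small sketch of "solving for any column": dividing the relation A*z = 0 through by z(k) gives column k in terms of the others. The choice k = 3 and the variable names are arbitrary assumptions for illustration.

```matlab
A = magic(6);
z = null(A);                         % single null-space basis vector, since rank(A) is 5
k = 3;                               % column to solve for (arbitrary choice)
others = [1:k-1, k+1:size(A,2)];
coeffs = -z(others) / z(k);          % from A*z = 0, divided through by z(k)
reconstructed = A(:,others) * coeffs;
err = norm(reconstructed - A(:,k))   % should be at round-off level
```

This only works for a column whose entry in the null vector is nonzero; here all six entries are nonzero, so any k can be used.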

How it helps you, I don't really know, because I have no idea what you really want to do.

If I had to guess, what you really need is to learn enough about linear algebra, and perhaps what a pivoted QR decomposition might provide. Because once you have that pivoted QR, you also have enough to do almost anything you want to do.

[Q,R,E] = qr(A,0)
Q =
    -0.017647      0.64381     -0.20699      0.22818      0.49018          0.5
     -0.56472     -0.11589     -0.45002      0.59421     -0.33476  -3.7638e-16
     -0.15883      0.52674       -0.446     -0.47355     -0.15542         -0.5
     -0.49413   -0.0017098      0.37629      0.14508      0.58583         -0.5
    -0.088237      0.52963      0.62872      0.19923     -0.52605  -4.2484e-16
     -0.63531     -0.11878      0.13728     -0.55665    -0.059779          0.5
R =
      -56.666      -16.377      -42.107       -33.53       -33.53      -35.224
            0       53.915       18.611       30.231       30.676       29.048
            0            0       32.491      -7.9245      -8.9182      -11.289
            0            0            0       10.101       5.6138     -0.56312
            0            0            0            0       5.1649      -5.1649
            0            0            0            0            0  -2.0136e-15
E =
     2     1     3     6     4     5

We can use the QR to tell us much about the problem, as well as efficiently and stably provide all you need.

First, notice that R(6,6) is essentially zero, compared to the other diagonal elements of R.

As well, the QR tells us that it decided column 5 (that is the last element of E) is the one it wants to drop out.
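Here is a hedged sketch of how the pivoted QR output could be turned into an explicit split of kept and dropped columns. The tolerance formula is an assumption on my part (modeled on the default used by rank), and the names independentCols, dependentCols and C are made up for this example.

```matlab
A = magic(6);
[Q,R,E] = qr(A,0);                  % economy-size pivoted QR; E is a permutation vector
d = abs(diag(R));
tol = max(size(A)) * eps(d(1));     % assumed threshold, modeled on rank()'s default
r = sum(d > tol);                   % estimated numerical rank
independentCols = E(1:r)            % columns QR chose to keep
dependentCols = E(r+1:end)          % columns QR pushed to the end
% Coefficients expressing each dropped column in terms of the kept ones,
% using A(:,E) = Q*R with the trailing block of R negligible
C = R(1:r,1:r) \ R(1:r,r+1:end);
err = norm(A(:,independentCols)*C - A(:,dependentCols))
```

The triangular solve for C reuses the factorization, so no extra decomposition is needed to get the dependency coefficients.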

John D'Errico on 3 August 2020

A random 6x3 matrix will almost never be rank deficient, at least, not if it is truly random. A slightly better measure of rank deficiency is rank. See how it will work in an example. I'll create a 6x4 array to make things interesting.

A = [rand(6,1),rand(6,1)*rand(1,3)];

The matrix A is known to be a rank 2 matrix based on how I created it. As such, again, I could pick column 1 and ANY of the other three columns, thus discarding any 2 of the columns 2:4. I'll show you how this will work.

rank(A)
ans =
     2

So we learn from rank the matrix is rank 2. Therefore we know we need to find two independent columns.

[Q,R,E] = qr(A,0);
diag(R)
ans =
      -1.5743
      0.77783
  -1.3837e-16
  -3.8753e-17
E
E =
     1     3     2     4

Did it work? QR does agree with rank. The pivoted QR is pretty much as good as rank, but rank gives you a hard number, whereas here we are counting the number of diagonal elements that are close to zero. So rank is a much better tool to just predict the numerical rank of a matrix.

Again, look at the diagonal elements of R. Here we see two elements that are essentially zero compared to the others.

Therefore we discard the last TWO columns indicated by the last two elements of E, as being the "most" dependent columns. What remains are columns 1 and 3.

A(:,E(1:2))
ans =
      0.55352      0.95123
       0.9532      0.68952
      0.76997      0.17197
      0.53268     0.037538
      0.60707      0.25574
       0.1352      0.26301

So the pivoted QR gives you what you need, and depending on what you needed to do, the pivoted QR might also provide other useful information. For example, you can use it for a projection into a rank 2 subspace. I would use it (and have used it) for example if I were working with triangles in 3 or 4 dimensions, where I then need to convert the problem into a 2-dimensional one.
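The projection into a rank 2 subspace mentioned above might look like the following sketch. The fixed seed, the relative tolerance 1e-10 and the names basis and coords are assumptions added for illustration.

```matlab
rng(0);                                  % fixed seed, only for reproducibility
A = [rand(6,1), rand(6,1)*rand(1,3)];    % same rank-2 construction as above
[Q,R,E] = qr(A,0);
d = abs(diag(R));
r = sum(d > 1e-10*d(1));                 % assumed relative tolerance
basis = Q(:,1:r);                        % orthonormal basis of the column space
coords = basis' * A;                     % r-by-4 coordinates of each column in that basis
err = norm(basis*coords - A)             % reconstruction error, near round-off
```

Each column of A is then represented by r numbers (here 2), which is the kind of dimension reduction described for the triangle example.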

HN on 4 August 2020

Edited: HN on 5 August 2020

Thank you so much, John D'Errico. What happens if our A is 3 x 6? R then has only three diagonal elements to check for values that are comparatively close to zero and should be discarded. Example:

A =
         0    1.0000         0   -0.0073         0   -0.2499
   -0.8660   -0.5000         0   -0.0320    0.0554   -0.2416
    0.8660   -0.5000         0    0.0283    0.0491   -0.2434
    
[Q, R, E] = qr(A,0)
Q =
   -0.8165    0.0000    0.5774
    0.4082   -0.7071    0.5774
    0.4082    0.7071    0.5774
R =
   -1.2247    0.0000    0.0060    0.0045    0.0427         0
         0    1.2247   -0.0013    0.0427   -0.0045         0
         0         0   -0.4243   -0.0063    0.0603         0
E =
     2     1     6     4     5     3  

The result is correct, since I know the rank and the independent columns from the physical meaning. However, discarding dependent columns based on the diagonal elements of R seems confusing. How can this be explained? How can the procedure be described in general, as pseudocode or something similar?

Bruno Luong on 4 August 2020

Edited: Bruno Luong on 5 August 2020

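@HN, QR is equivalent to the Gram-Schmidt orthogonalization process applied to the column vectors of the matrix (in practice MATLAB uses a different, numerically more stable implementation, such as Givens rotations or Householder reflections).

It builds a chain of growing subspaces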

V1 = span({c1})

V2 = span({c1 c2})

...

V6 = span({c1 c2 ... c6})

where c1, ... c6 are columns of the matrix A.

When it is run with permutation (third output argument p requested)

[Q,R,p] = qr(A,'vector')

at every iteration of the process, QR selects the "best" column, in the sense that the new vector ck has the largest component orthogonal to the previous subspace. The size of this orthogonal component is abs(R(k,k)), taken from the diagonal of R. Once the matrix rank r is reached, R(r+1,r+1) = 0 in exact arithmetic; because of numerical error we never test == 0 but compare against some tolerance. The first r vectors then form a set of r independent column vectors.

That's the meaning of diag(R).

The indices of the columns selected as above are given by p, meaning

c1 = A(:,p(1))

c2 = A(:,p(2))

c6 = A(:,p(6))

p(1:r) are therefore the indices of independent columns of A.
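Since pseudocode was asked for, here is a minimal sketch of the selection process described above, written as Gram-Schmidt with column pivoting. It is meant to illustrate the idea only; it is not how MATLAB's qr is implemented. The names W, d, tol and independentCols and the tolerance 1e-10 are assumptions. It uses the 3 x 6 matrix from HN's comment.

```matlab
A = [ 0       1.0000  0  -0.0073  0       -0.2499
     -0.8660 -0.5000  0  -0.0320  0.0554  -0.2416
      0.8660 -0.5000  0   0.0283  0.0491  -0.2434];
[m,n] = size(A);
W = A;                      % columns with components in the current subspace removed
p = 1:n;                    % column permutation
d = zeros(1,min(m,n));      % plays the role of abs(diag(R))
tol = 1e-10;                % assumed relative tolerance
r = 0;
for k = 1:min(m,n)
    res = sqrt(sum(W(:,k:n).^2, 1));     % residual norms of the remaining columns
    [d(k), j] = max(res);                % pick the column with the largest residual
    j = j + k - 1;
    if d(k) <= tol*d(1), break; end      % nothing significant left: rank reached
    W(:,[k j]) = W(:,[j k]);             % swap the chosen column into position k
    p([k j]) = p([j k]);
    q = W(:,k) / d(k);                   % new orthonormal direction
    W(:,k+1:n) = W(:,k+1:n) - q*(q'*W(:,k+1:n));   % remove it from the remaining columns
    r = k;
end
independentCols = p(1:r)
```

With only three rows the loop can run at most three times, which is why R has only three diagonal elements in the 3 x 6 case: the rank cannot exceed 3, and all three values of d are checked against the tolerance in the same way.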

Nguyen Le on 8 June 2023

How fast is this method compared to building a chain of growing subspaces, checking the rank, and keeping only the columns that increase the rank of the subspace?

Matt J on 8 June 2023

Edited: Matt J on 8 June 2023

@Nguyen Le I suspect that method would be a lot slower because it could require O(N) svd operations to compute the rank. Additionally, it would be difficult to decide what numerical tolerance parameter to choose for the submatrix rank computations. Consider for example,

A=[1e-20 1;
   1e-20 1];
B=[1e-20 2e-20;
   1e-20 2e-20];

Numerically, the first column of A should be considered to have rank 0, whereas the same column in matrix B should be considered to have rank 1. But there's no way to know that without somehow choosing rank()'s tolerance parameter adaptively based on the entire matrix. With the selections below, we get the opposite results from what we would want:

rank(A(:,1))
ans = 1
rank(B(:,1),1e-10)
ans = 0
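To illustrate what choosing the tolerance "adaptively based on the entire matrix" could mean, here is a sketch under one assumption: the tolerance is taken from the default formula documented for rank, max(size(X))*eps(norm(X)), but evaluated on the full matrix rather than on the single column. The names tolA, tolB, rA and rB are hypothetical.

```matlab
A = [1e-20 1;
     1e-20 1];
B = [1e-20 2e-20;
     1e-20 2e-20];
tolA = max(size(A)) * eps(norm(A));   % tolerance scaled by the whole of A
tolB = max(size(B)) * eps(norm(B));   % tolerance scaled by the whole of B
rA = rank(A(:,1), tolA)               % 0: the column is negligible relative to A
rB = rank(B(:,1), tolB)               % 1: the column is significant relative to B
```

This gives the desired answers for these two matrices, but it requires knowing the scale of the full matrix up front, which the pivoted QR gets for free from its first diagonal element.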

Answer 2

Matt J on 4 August 2020

https://fr.mathworks.com/matlabcentral/answers/574543-algorithm-to-extract-linearly-dependent-columns-in-a-matrix#answer_474988

See this FEX contribution

https://www.mathworks.com/matlabcentral/fileexchange/77437-extract-linearly-independent-subset-of-matrix-columns

HN on 4 August 2020

Edited: HN on 4 August 2020

Thank you, Matt J.

I found it just before you posted it here, and it clears everything up for me!

Thank you

Answer 3

Bruno Luong on 3 August 2020

https://fr.mathworks.com/matlabcentral/answers/574543-algorithm-to-extract-linearly-dependent-columns-in-a-matrix#answer_474601

Edited: Bruno Luong on 3 August 2020

Test matrix (10 x 6) with rank 4

M = rand(10,4)*rand(4,6)

Automatic selection of independent columns of M

[Q,R,p] = qr(M,'vector');
dr = abs(diag(R));
if dr(1)
    tol = 1e-10;
    r = find(dr>=tol*dr(1),1,'last');
    ci = p(1:r) % here is the index of independent columns
else
    r = 0;
    ci = [];
end
% Submatrix with r columns (and full column rank).
Mind=M(:,ci)
% These three rank estimates should be equal.
% If they are not, the cause is that MATLAB's tolerance selection for rank differs from the one above,
% and that rank uses the more robust SVD algorithm for rank estimation.
rank(Mind)
rank(M)
r
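As a follow-up sketch (not part of the original answer), the columns that were not selected can be written in terms of the selected ones with a least-squares solve. The seed and the names cdep, C and err are assumptions for illustration.

```matlab
rng(1);                               % fixed seed, only for reproducibility
M = rand(10,4) * rand(4,6);           % 10 x 6 test matrix of rank 4
[~,R,p] = qr(M,'vector');
dr = abs(diag(R));
r = find(dr >= 1e-10*dr(1), 1, 'last');
ci = p(1:r);                          % indices of independent columns
cdep = p(r+1:end);                    % indices of the remaining, dependent columns
C = M(:,ci) \ M(:,cdep);              % coefficients: M(:,cdep) is approximately M(:,ci)*C
err = norm(M(:,ci)*C - M(:,cdep))     % should be near round-off
```

A small err confirms that the selected columns span the whole column space of M.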

Bruno Luong on 5 August 2020

Edited: Bruno Luong on 5 August 2020

Matt: "but it seems to rely on the assumption that the rows of the matrix abs(R) from the QR decomposition are maximized at the diagonal."

Yes, this statement is true:

abs(R(i,i)) >= abs(R(i,j)) for all j>=i

But it is not merely an assumption; it is a consequence of the pivot selection of the QR algorithm with permutation.

Proof

At iteration #i, the algorithm selects, among the remaining (n-i+1) columns of A, the one that maximizes the norm of its projection onto the orthogonal complement of the subspace V_{i-1} (notation used in my explanation below John's answer). That maximal norm is abs(R(i,i)).

To simplify, let W_{k} denote the orthogonal complement of the subspace V_{k}.

Later on, at iteration step j > i, abs(R(i,j)) is the norm of the projection of A(:,p(j)) on span(Q(:,i)).

Since span(Q(:,i)) is included in W_{i-1} by construction,

norm(projection x on span(Q(:,i))) <= norm(projection x on W_{i-1}) for any vector x of the same length as the columns of A.

Applying this to x = A(:,p(j)), we get

    abs(R(i,j)) <= norm(projection A(:,p(j)), on W_{i-1}) <= norm(projection A(:,p(i)), on W_{i-1}) = abs(R(i,i))

where the second inequality holds because column p(i) was the one chosen as the maximizer at iteration #i.

This proves that diag(R) dominates the off-diagonal terms row-wise, so you can stop describing it as an "assumption".
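The row-wise dominance can also be checked numerically on a dense test matrix. This is only a sanity-check sketch: the seed, the slack of 1e-12 relative to the largest diagonal entry (to absorb round-off) and the variable names are assumptions.

```matlab
rng(2);                               % fixed seed, only for reproducibility
M = rand(10,4) * rand(4,6);           % dense 10 x 6 matrix of rank 4
[~,R,~] = qr(M,0);                    % three outputs requested, so column pivoting is used
d = abs(diag(R));
n = size(R,2);
slack = 1e-12 * d(1);                 % assumed allowance for round-off
% abs(R(i,j)) <= abs(R(i,i)) for every row i (entries below the diagonal are zero)
rowDominant = all(all(abs(R) <= repmat(d,1,n) + slack))
% the diagonal magnitudes should also be non-increasing
nonIncreasing = all(diff(d) <= slack)
```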

Henry Wolkowicz on 30 Oct 2021

I now see the problem, i.e. qr behaves differently with a sparse input matrix X. It is fine with X=full(X).

Bruno Luong on 31 Oct 2021

When you apply qr with permutation to a sparse matrix S

[Q,R,p] = qr(S,'vector')

MATLAB returns a permutation chosen so that R has a "good" sparsity pattern, and does not make the diagonal

abs(diag(R))

decreasing (that requirement usually makes R completely filled with nonzero elements).
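A minimal sketch of the workaround mentioned above, converting to full before the pivoted QR. The small matrix S and the tolerance are made up for illustration, and this is only practical when the full matrix fits in memory.

```matlab
S = sparse([1 0 1; 0 1 1; 0 0 0; 2 0 2]);   % small sparse matrix of rank 2 (col 3 = col 1 + col 2)
[~,R,p] = qr(full(S),0);                    % dense pivoted QR, so abs(diag(R)) is sorted
d = abs(diag(R));
r = sum(d > 1e-10*d(1));                    % assumed relative tolerance
ci = p(1:r)                                 % indices of independent columns of S
```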
