Finding common values in a matrix and creating a chain

I have an 8124x4 matrix A, and I want to save in a different matrix B all the rows that share a value in at least one element (possibly in different columns). The values have no other relationship within the A matrix.
Example:
Taking this as an example of a smaller A matrix, I want to save the 1st row in B, then the 4th row as well because it also has a "12" in it, and then the second row should also be saved in B because it has a "10" (like the fourth row).

4 comments

dpb on 24 Apr 2020
Edited: dpb on 24 Apr 2020
Is the first column data or a counter?
The first column is not data (it's like an ID for that row); I just want to compare the values in the 2nd, 3rd and 4th columns.
dpb on 24 Apr 2020
Figured, just checkin'...
BTW, for future, don't post images of data; paste the text so folks can just cut 'n paste for testing. Formatting as code is best...
ok, thanks


Accepted Answer

Ameer Hamza on 24 Apr 2020
Edited: Ameer Hamza on 25 Apr 2020
Try something like this
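A = [1 9 12 17; 2 3 90 10; 3 32 55 22; 4 12 2 10; 13 30 40 70; 89 101 90 98; 7 55 200 300; 10 39 29 122; 13 219 100 122; 1000 3233 4003 1220; 8328 12 32 124];
B = {};                          % start from an empty cell array (avoids clashing with an existing non-cell B)
B_counter = 1;                   % index of the chain currently being built
B{B_counter}(1, :) = A(1, :);    % seed the first chain with the first row of A
last_rows = A(1, :);             % rows added in the previous iteration
A(1, :) = [];                    % rows are removed from A once they are in B
count = 2;                       % next free row in the current chain
for i = 1:size(A,1)
    % rows of A sharing a value (columns 2:end) with the rows added last
    [r, ~] = find(ismember(A(:, 2:end), last_rows(:, 2:end)));
    if isempty(A)
        break;
    elseif isempty(r)
        % nothing matches: start a new chain from the first remaining row
        B_counter = B_counter + 1;
        B{B_counter}(1, :) = A(1, :);
        last_rows = A(1, :);
        A(1, :) = [];
        count = 2;
    else
        % r lists a row twice if it matches in two columns; duplicates are removed below
        B{B_counter}(count:count+numel(r)-1, :) = A(r, :);
        last_rows = A(r, :);
        A(r, :) = [];
        count = count+numel(r);
    end
end
% drop duplicate rows in each chain while keeping the original order
B = cellfun(@(b) {unique(b, 'rows', 'stable')}, B);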

24 comments

Yes, it works for that example, but for the matrix I need, an error appears and I don't know why.
"Subscripted assignment dimension mismatch."
"error in line 11"
"B(count, :) = A(r, :);"
Can you attach the real matrix? It will make it easier to debug the issue.
Armindo Barbosa on 24 Apr 2020
Edited: dpb on 25 Apr 2020
The original one has 8124 rows.
I added some random rows to the one above and now it has the same bug.
A = [1 9 12 17; 2 3 90 10; 3 32 55 22; 4 12 2 10; 13 30 40 70; 89 101 90 98; 7 55 200 300; 10 39 29 122; 13 219 100 122; 1000 3233 4003 1220; 8328 12 32 124];
Armindo, this problem happens when two rows meet the criteria. For example, the first row is
1 9 12 17
and if you look at the 4th and 11th rows
4 12 2 10 % 4th row has 12
8328 12 32 124 % 11th row has 12
both have 12 in them.
For this situation, I have modified the code in my answer to add only the first matching row to matrix B and leave the second in matrix A. Also, the process stops when no row in matrix A matches an element of the row last added to matrix B.
I think it's not working, because I tested for a value in my A matrix: that value appears 8 times at the beginning of the A matrix and only 2 times in the B matrix.
Yes, I mentioned that it will only add the first matching row to matrix B. Do you want to add all matching rows to matrix B?
Ah yes, do you know how to do that?
Armindo, see the updated code. Also note that the matrix B for the example matrix A is
1 9 12 17
4 12 2 10
8328 12 32 124
3 32 55 22
7 55 200 300
It stops after 5 rows because no other row of matrix A contains 55, 200, or 300.
The results for the matrix B with that code show even fewer rows than before.
That is because there is still ambiguity in the rule you defined. Can you clarify the rules in more detail? The code follows these rules:
1. In the first iteration, it adds the first row of matrix A to matrix B.
2. In the next iterations, it takes the last row of matrix B and finds out which rows of matrix A have those elements. It adds those rows to matrix B.
3. It stops when there is no element in matrix A matching the last row of matrix B.
All 3 rules are correct.
In rule 2, all the rows of A that were added to B have to be checked for equal values in A. If you add several rows at the same time, it is not only the last one that needs to be checked (I don't know if that is the problem).
Example: after rule 1, you find in the first iteration of rule 2 that there are 10 rows with matching elements. All 10 of those rows then have to be checked for common values with A, not only the 10th.
This is clearer now. Please check the updated code. Now, in each iteration, all the rows added in the previous iteration are used to find the common elements in matrix A.
Now there is another issue: by doing that, there are repeated rows in the B matrix.
I am not sure why that would happen. My code removes the rows from matrix A as they are added to matrix B. This problem does not appear in the small example matrix A I have in the answer.
How can I delete repeated rows, just to see if that's the only issue?
I think it could be that if a row has 2 elements in common rather than only one, that row is added 2 times. Look at the first rows of the B matrix: the rows with elements 2733 and 2850 are there twice because they have 2 elements in common with the first row.
You can try this to delete duplicate rows while keeping the order the same:
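The clarified rules amount to growing each chain one "round" at a time, where every row added in the previous round is used to look for new matches. A minimal sketch of that idea follows, using a logical mask instead of deleting rows from A; the variable names `remaining`, `frontier`, `members` and `chains` are illustrative and not taken from the answer's code, and the 5-row matrix is a shortened version of the example.

```matlab
% Shortened example matrix: column 1 is an ID, columns 2:end are the data
A = [1 9 12 17; 2 3 90 10; 3 32 55 22; 4 12 2 10; 13 30 40 70];

remaining = true(size(A, 1), 1);   % rows not yet assigned to any chain
chains = {};                       % one matrix per chain

while any(remaining)
    seed = find(remaining, 1);     % rule 1: first unused row starts a chain
    members = seed;
    frontier = seed;               % rows added in the previous round
    remaining(seed) = false;
    while ~isempty(frontier)
        vals = A(frontier, 2:end); % values of ALL rows added last round
        % rule 2: unused rows sharing at least one value with those rows
        hit = remaining & any(ismember(A(:, 2:end), vals), 2);
        frontier = find(hit);
        members = [members; frontier];
        remaining(hit) = false;
    end                            % rule 3: stop when nothing matches
    chains{end+1} = A(members, :);
end
```

Because each row is marked in the mask at most once, a row that matches in two columns is not added twice, so no de-duplication step should be needed in this variant. For the 5-row matrix above, the first chain holds the rows with IDs 1, 4 and 2, and the two remaining rows each form a chain of their own.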
unique(B, 'rows', 'stable')
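As a small illustration of what that call does, here is a sketch on a made-up B in which one row was appended twice (the matrix is hypothetical, chosen only to show the effect):

```matlab
% Hypothetical chain in which the row [4 12 2 10] was appended twice
B = [1 9 12 17; 4 12 2 10; 8328 12 32 124; 4 12 2 10];

% 'rows' compares whole rows; 'stable' keeps the first occurrence of each
% row and preserves the original order instead of sorting
B_clean = unique(B, 'rows', 'stable');
disp(B_clean)
```

Without 'stable', the rows would come back sorted, which would lose the order in which the chain was built.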
Yes, I think that solved it! Just another question: if I had to loop until the A matrix is empty (I know it should take 50 iterations to empty it), how can I create several matrices in a loop (B1, B2, B3, ..., B50)?
Please try the updated code. It creates a cell array of B matrices.
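With a cell array, the k-th matrix is simply B{k}, so no numbered variables are needed. A short sketch of how such a result could be used, with a made-up two-chain B (the contents and the names `n_chains` and `rows_per_chain` are illustrative only):

```matlab
% Hypothetical result with two chains of different lengths
B = {[1 9 12 17; 4 12 2 10], [3 32 55 22]};

n_chains = numel(B);                          % plays the role of "B1 ... Bn"
for k = 1:n_chains
    fprintf('chain %d has %d rows\n', k, size(B{k}, 1));
end

% number of rows in every chain at once
rows_per_chain = cellfun(@(b) size(b, 1), B);
```

A cell array is used here because the chains can have different numbers of rows, which an ordinary 3-D array could not hold.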
This error appears:
Cell contents assignment to a non-cell array object.
Error in Untitled (line 47)
B{B_counter}(1, :) = A(1, :);
dpb on 25 Apr 2020
Have you tried my alternate solution yet?
Armindo, can you try the code again after clearing the existing variables in your workspace? I guess you can try it after adding the line 'clear B', like this:
clear B;
B_counter = 1;
B{B_counter}(1, :) = A(1, :);
last_rows = A(1, :);
A(1, :) = [];
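The error most likely comes from a B left over from an earlier run as an ordinary numeric matrix, which brace indexing cannot assign into. A minimal sketch that reproduces the situation under that assumption (the `zeros(2, 4)` placeholder and the `failed` flag are illustrative; the exact error text differs between MATLAB releases):

```matlab
B = zeros(2, 4);                 % stand-in for a numeric B left in the workspace
try
    B{1}(1, :) = [1 9 12 17];    % brace assignment into a non-cell variable
    failed = false;
catch
    failed = true;               % this branch is expected to run
end

B = {};                          % resetting B (or 'clear B') removes the conflict
B{1}(1, :) = [1 9 12 17];        % now creates the first cell
```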
thanks!!
Glad to be of help.


More Answers (1)

dpb on 24 Apr 2020
Edited: dpb on 25 Apr 2020
Same idea, different cat skinned to get there...
u=unique(A(:,2:end)); % look for these values' places in A
ix=arrayfun(@(u) find(A(:,2:end)==u),u,'UniformOutput',false); % linear index in array
ix=ix(cellfun(@(r) length(r)>1,ix)); % keep only >1 match
[r,~]=cellfun(@(i) ind2sub(size(A(:,2:end)),i),ix,'uni',0); % back to r,c subscripts
r=unique(vertcat(r{:})); % only use row once
B=A(r,:) % those rows whole array
% with your A(:,2:end) results in
>> B
B =
9 12 17
3 90 10
12 2 10
>>
I just kept the data portion of the array; you can change references to A to A(:,2:end).
Also NB: the second output argument from ind2sub MUST be present even though we are throwing it away... otherwise it just returns the linear index again instead of the row.
ADDENDUM:
For your amended array, the above yields
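The point about the output arguments can be seen in isolation with a small sketch; the 4-by-3 size and the indices below are made up for illustration:

```matlab
sz = [4 3];                      % size of a hypothetical 4-by-3 data block
k  = [2; 7];                     % linear (column-major) indices into it

only_one   = ind2sub(sz, k);     % one output: the linear indices come back
[row, col] = ind2sub(sz, k);     % two outputs: row and column subscripts
```

Here `only_one` equals `k`, while `row` is [2; 3] and `col` is [1; 2], which is why the placeholder `~` for the second output cannot be dropped.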
>> B
B =
1 9 12 17
2 3 90 10
3 32 55 22
4 12 2 10
89 101 90 98
7 55 200 300
10 39 29 122
13 219 100 122
8328 12 32 124
>> sortrows(B,1)
ans =
1 9 12 17
2 3 90 10
3 32 55 22
4 12 2 10
7 55 200 300
10 39 29 122
13 219 100 122
89 101 90 98
8328 12 32 124
>>
What's not to like given initial description of wanted result?

4 comments

It shows this error:
Error using horzcat
Dimensions of matrices being concatenated are not consistent.
Error in teste (line 6)
B=A(unique([r{:}]),:);
dpb on 25 Apr 2020
Edited: dpb on 25 Apr 2020
That line isn't in the revised code above...it's
r=unique(vertcat(r{:}));
Be sure to use the whole revised code segment, not just pieces, as I recast it from the initial version to be more readable about what each step does... it also has ignoring the first column built in.
This does assume that each data row is a unique set of values; a duplicate within a row would also show up in the final result. It's possible to code for that, but the data don't show it and the specification/problem description didn't require it. :)
The B matrix appears equal to A in the end.
Only if all lines have duplicates or there are values duplicated across each line.
The result for your sample array is
>> whos A B
Name Size Bytes Class Attributes
A 11x4 352 double
B 9x4 288 double
>>
so two lines didn't have duplicates in any other line. Visual inspection confirms there is no row in B that is unique in all three data elements.
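One way to double-check that count is to tally how often each data value occurs and keep the rows containing a value that occurs more than once. This is only a sketch, not the code from the answer: it uses accumarray for the tally, the names `D`, `counts`, `shared` and `keep` are illustrative, and it relies on the same assumption that no value repeats within a single row.

```matlab
A = [1 9 12 17; 2 3 90 10; 3 32 55 22; 4 12 2 10; 13 30 40 70; ...
     89 101 90 98; 7 55 200 300; 10 39 29 122; 13 219 100 122; ...
     1000 3233 4003 1220; 8328 12 32 124];

D = A(:, 2:end);                       % data columns only (column 1 is an ID)
[vals, ~, idx] = unique(D);            % distinct values and where each occurs
counts = accumarray(idx(:), 1);        % number of occurrences of each value
shared = vals(counts > 1);             % values that occur more than once

keep = any(ismember(D, shared), 2);    % rows containing at least one shared value
B = A(keep, :);
```

For this sample array, the shared values are 10, 12, 32, 55, 90 and 122; rows 5 and 10 contain none of them, which leaves the 9 rows reported above.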
