# Intersect() with with repetition

57 views (last 30 days)

Show older comments

The syntax [C,ia,ib] = intersect(A,B,'rows') returns elements without repetitions. However, I need every potential combination of ia and ib that gives C. For example:

A=[1,1;1,1;1,2];

B=[0,1;1,1;1,1];

I need the output to be:

C=[1,1;1,1;1,1;1,1];

ia=[1;1;2;2];

ib=[2;3;2;3];

The answer here generates indcies into A and B that are intersecting elements: https://www.mathworks.com/matlabcentral/answers/338649-arrays-intersection-with-repetition However there are no correspondace between ia and ib generated by this method. e.g., A(ia,:)~=B(ib,:). it also doesn't give every potential combination of indices.

Any ideas?

Thanks

### Accepted Answer

Jan
on 9 Sep 2022

If you really need the redundant information in iAB_C, iAB_Cprime, idx1, idx2, this is faster than the original version tested with the data:

A = randi([0, 200], 1e6, 2);

B = randi([0, 200], 1e6, 2);

I get 1.12 s instead of 1.86 s.

function [C, iAB_C, iAB_Cprime, idx1, idx2] = repintersect_1b(A2, B2)

% Join data for faster sorting:

A = A2 * [4294967296; 1];

B = B2 * [4294967296; 1];

[uA, ~, iAx] = unique(A, "stable");

[a, idx] = sort(iAx);

aV = 1:numel(A);

aV = aV(idx).';

aI = RunLen(a);

[uB,~,iBx] = unique(B, "stable");

[b, idx] = sort(iBx);

bV = 1:numel(B);

bV = bV(idx).';

bL = cumsum([1, RunLen(b)]);

[C2, iuA, iuB] = intersect(uA, uB, "stable");

% Split data for the output:

C = [floor(C2 / 4294967296), rem(C2, 4294967296)];

iAB_C = cell(numel(iuA), 1);

a0 = 1;

for ii = 1:numel(iuA)

a1 = a0 + aI(ii); % Easier indexing for A

aa = aV(a0:a1 - 1);

a0 = a1;

na = numel(aa);

b0 = bL(iuB(ii)); % Need accumulated RunLength for B

b1 = bL(iuB(ii) + 1) - 1;

bb = bV(b0:b1);

nb = numel(bb);

qa = repmat(aa.', nb, 1); % Replace MESHGRID

qb = repmat(bb, 1, na);

iAB_C{ii} = [qa(:), qb(:)];

end

iAB_Cprime = cell2mat(iAB_C);

idx2 = cumsum(cellfun('size', iAB_C, 1)); % Faster than cellfun(@(x) size(x,1), C)

idx1 = [1; idx2(1:end-1) + 1];

end

% Helper function:

function n = RunLen(x)

d = [true; diff(x(:)) ~= 0]; % TRUE if values change

n = diff(find([d.', true])); % Indices of changes

end

I've omitted the temporarily used cell arrays.

Further time could be saved, if you do not create iAB_Cprime and the set of iAB_C and the indices, because both contain the same information.

##### 3 Comments

### More Answers (3)

the cyclist
on 18 Aug 2022

I frankly have not quite figured out the logic of what you want as output, but I'm pretty sure that you can construct what you want by using the ismember command instead of intersect. You'll probably need both "directions", and possibly all outputs:

A=[1,1;1,1;1,2];

B=[0,1;1,1;1,1];

[tf_ab, loc_ab] = ismember(A,B,"rows")

[tf_ba, loc_ba] = ismember(B,A,"rows")

##### 2 Comments

the cyclist
on 18 Aug 2022

Jan
on 18 Aug 2022

Edited: Jan
on 18 Aug 2022

A simple loop approach:

A = [1,1; ...

1,1; ...

1,2];

B = [0,1; ...

1,1; ...

1,1];

[C, iA, iB] = RepIntersectRows(A, B)

function [C, iA, iB] = RepIntersectRows(A, B)

% Sizes of inputs:

nA = size(A, 1);

nB = size(B, 1);

w = size(A, 2);

% Pre-allocate output:

C = zeros(nA * nB, w);

iA = zeros(nA * nB, 1);

iB = zeros(nA * nB, 1);

% Find matchs:

iC = 0;

for kA = 1:nA

a = A(kA, :);

for kB = 1:nB

if isequal(a, B(kB, :))

iC = iC + 1;

C(iC, :) = a;

iA(iC) = kA;

iB(iC) = kB;

end

end

end

% Crop unused elements:

C = C(1:iC, :);

iA = iA(1:iC);

iB = iB(1:iC);

end

If A and B have 1e4 rows, the runtime is 4 seconds already. But it is thought to clarify,what you exactly need. The output of your approach does not match the original question exactly. Before I optimize my code, I want to know, if it matchs your needs at all.

##### 4 Comments

Jan
on 18 Aug 2022

@Yi-xiao Liu: Matlab's unique and intersect are based on binary searchs, while the loops (e.g. in my prove of concept code) perform a linear search.

Binary search means, that the data are sorted at first, such that you do not have to compare all elements, but log2 of the elements only by dividing the inteval of interest by 2 in each step.

I'm try to find a shorter and faster solution, when I'm at home.

### See Also

### Categories

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!