Comparing each vector in a cell array to another vector
3 vues (au cours des 30 derniers jours)
Afficher commentaires plus anciens
Christopher Tran
le 5 Fév 2017
Modifié(e) : Stephen23
le 5 Fév 2017
Hello. I am wondering if there is an easier way to check vectors in my cell array are subsets of another given vector. I want to keep track of how many times the vectors in C are found in T, for many different T's. For example:
T = [1 2 3 4 5 6]
C = { [1 2], [3 4], [5,6], [7 8], [9, 10] }
% loop through each candidate
for j = 1:numCands
c = Ck{j};
% Candidate Counts. Check if candidate is a subset of the
% current transaction
if ismember(c,t)
candCounts(j) = candCounts(j) + 1;
end
% Candidate Tail Counts. Check if the tail of the candidate is
% a subset of the current transaction
if ismember(c(2:end),t)
candTailCounts(j) = candTailCounts(j) + 1;
end
end
What I have done is loop through each member of C and check using ismember() for each element. But this gets really long when C becomes very large. I was wondering if there was any clever way of vectorization which would result in a faster runtime. I would also like to add that T can be varying sized vectors. Not sure if that information is relevant or not.
Thanks in advance!
Edit: For future references here is what I ended up doing, which is similar to what the answers suggested:
% Make all candidates into matrix for faster calculation
allCands = vertcat(Ck{:});
% Get number of columns to get where there are subsets.
[~,col] = size(allCands);
% Find locations where the candidates are subsets of the
% transactions for both the candidates and tail counts.
memb1 = ismember(allCands,t);
memb2 = ismember(allCands(:,2:end),t);
% Subset check
subCheck1 = find(sum(memb1,2) == col);
subCheck2 = find(sum(memb2,2) == col - 1);
% Increment counts
candCounts(subCheck1) = candCounts(subCheck1) + 1;
candTailCounts(subCheck2) = candTailCounts(subCheck2) + 1;
1 commentaire
Walter Roberson
le 5 Fév 2017
Modifié(e) : Walter Roberson
le 5 Fév 2017
Does it happen to be the case that C and T only have integer elements in the range 0 to 65535 ?
Are all of the entries in C the same length? Are they always row vectors?
Réponse acceptée
Stephen23
le 5 Fév 2017
Modifié(e) : Stephen23
le 5 Fév 2017
Method one: one ismember call: the function ismember is going to to be slowest operation, so one simple speedup is to reduce how many times you call it: Note that a vector c will only be a member if the tail c(2:end) is a member, so you can test for the tail first, and only if the tail is a member do you need to check if the first element is also a member:
T = [1,2,3,4,5,6];
C = {[1,2],[3,4],[5,6],[7,8],[9,10]};
cntC = 0;
cntT = 0;
for k = 1:numel(C)
if ismember(C{k}(2:end),T)
cntT = cntT+1;
cntC = cntC+any(C{k}(1)==T);
end
end
Method two: Convert to a matrix: if all members of C are the same size vectors, then convert C to a matrix and perform one simple comparison using bsxfun:
M = vertcat(C{:});
U = reshape(T,1,1,[]);
X = any(bsxfun(@eq,M,U),3);
xT = all(X(:,2:end),2);
xC = xT&X(:,1);
cntT = sum(xT)
cntC = sum(xC)
5 commentaires
Stephen23
le 5 Fév 2017
"The problem with that is if the elements occur out of order, you would still get a match."
That is what a subset is, and the original question specifically asks for "check vectors in my cell array are subsets of another given vector".
For a subset the order is irrelevant.
Stephen23
le 5 Fév 2017
Modifié(e) : Stephen23
le 5 Fév 2017
@Christopher Tran: I am happy to take a look. Please clarify one point: do you want to match the values in the vectors in the same order, or is the order irrelevant (as per set theory and the definition of subset). For example, is [2,1] a subset of [1,2,3] or not?
Plus de réponses (1)
Walter Roberson
le 5 Fév 2017
If all members of C are the same length then you can use buffer() with overlap to create an array containing all the subsequences of that length, and transpose that and then convert C to a row matrix and then ismember() with rows option
1 commentaire
Christopher Tran
le 5 Fév 2017
Modifié(e) : Christopher Tran
le 5 Fév 2017
Voir également
Catégories
En savoir plus sur Operators and Elementary Operations dans Help Center et File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!