Efficient way to use regexp and contains and matching

Tiasa Ghosh on 7 Sep 2018
Commented: Tiasa Ghosh on 12 Sep 2018
Hello!
I am using the following code to match two cell array contents. But it takes way too long to process. Can anybody suggest a better way to code the same thing?
validVar = {};
str = {'abc=='; 'bac[2]'; 'fuh[2]'; 'fgh'};
list = {'abc(1)'; 'cde'; 'fgh'};
for x = 1:numel(list)
    expression = sprintf('%s..', list{x});
    for y = 1:numel(str)
        if ~isempty(regexp(str{y}, expression, 'match')) || contains(str{y}, list{x})
            validVar = [validVar; list{x}];
        end
    end
end
Also, the result gives me validVar containing only 'fgh', but I want 'abc(1)' in the list as well, since it is a part of str{1}. Is there a way to match the entries of list against str so that a list entry is added to validVar even if only part of it matches part of a str entry?
  3 comments
Tiasa Ghosh on 7 Sep 2018
My mistake. I have edited the question with examples and more bugs. I am using regexp as one of the search patterns so that the expression can also match extended forms of an entry, and the condition becomes true even if only part of the string matches part of the input.
Greg on 8 Sep 2018
First, your performance is suffering because you're looping over both lists. Both regexp and contains accept a whole cell array of text together with a single pattern, which removes one of the loops.
Second, if you know how to use regexp expertly (this is not a dig - regexp is extremely powerful but even more difficult to master), you could do all of your checking with one expression.
Finally, your requirements are very ill-formulated. What in the world does "... even if a part of list entry matches part of str entry" mean? A part of bac[2] matches a part of cde - the c character. I'm sure this isn't what you had in mind, so you need more explicit rules for validVar.
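Greg's first point can be sketched as follows. This is a minimal illustration using the question's data, not code posted in the thread; the names isValid, hitLiteral and hitRegex are made up for the example, and contains requires R2016b or later.

```matlab
str  = {'abc=='; 'bac[2]'; 'fuh[2]'; 'fgh'};
list = {'abc(1)'; 'cde'; 'fgh'};

isValid = false(size(list));            % one flag per list entry (hypothetical name)
for x = 1:numel(list)
    % contains accepts the whole cell array and returns a logical vector
    hitLiteral = contains(str, list{x});
    % regexp on a cell array returns one cell per element; an empty cell means no match
    hitRegex = ~cellfun('isempty', regexp(str, [list{x}, '..'], 'once'));
    isValid(x) = any(hitLiteral | hitRegex);
end
validVar = list(isValid);
```

Unlike the original double loop, this adds each list entry at most once, even if it matches several entries of str. With the data above it still returns only 'fgh', so the matching rule itself is unchanged.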


Accepted Answer

Guillaume on 10 Sep 2018
Right now, your double loops can be simplified to:
validVar = {};
for x = 1:numel(list)
    if ~isempty(cell2mat(regexp(str, [list{x}, '(..)?'], 'once'))) %match either list{x} or list{x} followed by any two characters.
        validVar = [validVar; list{x}];
    end
end
which should be a lot faster.
However, I don't think that's exactly what you want. I agree with Greg that it's not clear exactly what you are after. We need a precise rule for which patterns the regex should and should not match.
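To show what such a rule could look like, here is a sketch under one assumed rule that nobody in the thread confirmed: a list entry is valid if its leading identifier (the run of word characters at the start) equals the leading identifier of some str entry. The names listBase and strBase are hypothetical. The first lines also show why 'abc(1)' behaves unexpectedly as a pattern: unescaped parentheses form a regex group, so the pattern looks for the text abc1 rather than the literal abc(1).

```matlab
str  = {'abc=='; 'bac[2]'; 'fuh[2]'; 'fgh'};
list = {'abc(1)'; 'cde'; 'fgh'};

% Unescaped, '(1)' is a group, so the pattern does not match its own literal text.
unescapedHit = ~isempty(regexp('abc(1)', list{1}, 'once'));
% regexptranslate escapes the metacharacters: 'abc(1)' becomes 'abc\(1\)'.
escapedHit = ~isempty(regexp('abc(1)', regexptranslate('escape', list{1}), 'once'));

% ASSUMED rule: compare only the leading identifier of each entry.
listBase = regexp(list, '^\w+', 'match', 'once');   % {'abc'; 'cde'; 'fgh'}
strBase  = regexp(str,  '^\w+', 'match', 'once');   % {'abc'; 'bac'; 'fuh'; 'fgh'}
isValid  = ismember(listBase, strBase);
validVar = list(isValid);
```

Under this assumption validVar contains both 'abc(1)' and 'fgh', and the loop over list disappears as well. A different rule, such as matching anywhere inside the string, would need a different expression.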
  1 comment
Tiasa Ghosh on 12 Sep 2018
I realised where the question became vague, and the question is no longer relevant to me. Anyhow, thank you for your time and answer. I hope it helps somebody else in the future. :)
