Validate and get exact number of characters using the regular expression

Hi all,
I am using MATLAB R2016a and working with a set of strings like
str = 'ENGINE-45'; % this is okay (2 digits after the hyphen)
where I need to validate the string and extract it only when the number of digits after the hyphen (-) is correct. There should be exactly 2 digits, but in some cases there are 3, 4 or 5 digits, like
str = 'ACCCAT-455'; % this is not okay (3 digits)
str = 'VCCASNSRR-12344'; % this is not okay (5 digits)
I have used a regular expression to get the characters and digits as
out_str = regexp(str, '[A-Z]+-[0-9]{2}', 'match');
ans =
'VCCASNSRR-12' % used {2}, but the input has 5 digits; trying to restrict the match to exactly 2 digits
but this is not what I need. How can I get a match if and only if there are exactly 2 digits after the hyphen?

Accepted Answer

Try
out_str = regexp(str, '[A-Z]+-[0-9]{2}$', 'match');
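For anyone wondering why this works: the $ anchor only allows a match that ends at the end of the text, so a third digit after the two matched ones makes the whole match fail. A minimal sketch, assuming the inputs are held in a cell array of char vectors (the variable names candidates, hits and isValid are made up for illustration):

```matlab
% Cell array of char vectors, so no string arrays are needed
candidates = {'ENGINE-45', 'ACCCAT-455', 'VCCASNSRR-12344'};
pattern = '[A-Z]+-[0-9]{2}$';   % pattern from the answer above

% With 'once', each cell holds the matched text, or an empty char if nothing matched
hits = regexp(candidates, pattern, 'match', 'once');

% Logical mask: true where exactly two digits end the text
isValid = ~cellfun(@isempty, hits);
disp(candidates(isValid))
```

Note that this relies on the digits being the very last thing in the text; trailing characters after the digits would also make the match fail.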

9 Comments

For completeness, you can still use regex for the "numbers only" problem by using a look-behind and making only minimal changes to the original expression. It's surely not the only way, and I'm also pretty lame at regex. I'm sure only aliens can actually write regex expressions on demand without consulting the docs :) Also, now I know why I had trouble with word boundaries: I didn't know about \> and \<.
candidates = [...
"SOMETHING-2";...
"ENGINE-45";...
"ACCCAT-455";...
"ANOTHER-23";...
"PREFIX BLAH-64";...
"BLAH-75 SUFFIX";...
"VCCASNSRR-12344"]
FullMatch = regexp(candidates, '\<[A-Z]+-[0-9]{2}\>', 'match')
NumberMatch = regexp(candidates, '\<(?<=[A-Z]+-)[0-9]{2}\>', 'match')
Stephen23 on 8 Mar 2020
Edited: Stephen23 on 8 Mar 2020
+1 neat answer.
Tip: rather than [0-9] use \d
Thanks! I guess this decision depends on a few things like expression portability, and other (if any) characters to be included as "digits"
Thank you Alex, it worked perfectly!
But I have another situation, with an additional field in the string, i.e. an engine number like
str = 'ENGINE-12345-23';
I have used
>> regexp(str, '[A-Z]{6}+-[0-9]{5}+-[0-9]{2}$', 'match'); % it is working for me
>>ans =
1×1 cell array
{'ENGINE-12345-23'}
but for an input that starts with VCCASNSRR it should not match
>> str = 'VCCASNSRR-12345-23'
>> regexp(str, '[A-Z]{6}+-[0-9]{5}+-[0-9]{2}$', 'match')
ans =
1×1 cell array
{'ASNSRR-12345-23'} % this is not wanted; only exactly 6 characters should be accepted
How can I restrict the first and second fields (the parts separated by hyphens) to exactly 6 and 5 characters?
Stephen23 on 9 Mar 2020
Edited: Stephen23 on 12 Mar 2020
Prepend ^ to the start of the regular expression. Read the documentation to know what it does:
Stephen, do you mean prepend ^?
If you included \< and \> as my example shows, it should have worked too.
Your final question is unclear.
Sorry if I was unclear. I need to restrict the input string to specific field lengths:
6 characters-5 digits-2 digits
Example:
str = 'ENGINE-12345-23';
In the regular expression
'...-[0-9]{2}$'; % this restricts the last field after the hyphen (here 23), which is OK
% in the same way, the first two fields (ENGINE, 12345) need to be restricted to 6 and 5 characters respectively
In the suggested link, the anchors work with specific characters but not with a number of characters.
@J. Alex Lee: yes thank you, fixed now.
@Bhaskar R: use an anchor. It worked for me:
regexp(str, '^[A-Z]{6}-\d{5}-\d{2}$', 'match', 'once')
Yes sir, it worked perfectly.
Thank you :-)
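For later readers, the effect of the ^ anchor can be seen side by side. Without it, regexp is free to start the match in the middle of the text, which is why 'VCCASNSRR-12345-23' still produced 'ASNSRR-12345-23'. A small sketch, assuming a cell array of char vectors (the extra test inputs and the variable names are made up for illustration):

```matlab
candidates = {'ENGINE-12345-23', 'VCCASNSRR-12345-23', ...
              'ENGINE-1234-23',  'ENGINE-12345-234'};

unanchored = '[A-Z]{6}-\d{5}-\d{2}$';    % may start matching mid-text
anchored   = '^[A-Z]{6}-\d{5}-\d{2}$';   % must cover the whole text

looseHits  = regexp(candidates, unanchored, 'match', 'once');
strictHits = regexp(candidates, anchored,   'match', 'once');

% true only for inputs of the form: 6 letters - 5 digits - 2 digits
isValid = ~cellfun(@isempty, strictHits);
```

With both ^ and $ in place, the quantifiers {6}, {5} and {2} effectively become exact length checks, because nothing is allowed before or after the matched text.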


More Answers (1)

I can get you closer, but not exactly right...
regexp(s,'-\d{2}\>','match')
will return only matches of exactly two digits after the "-", but for those cases that do match, also returns the leading "-"
I'm not sure how to prevent that...I'm pretty lame at regular expressions.
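One possible way to drop the leading "-" is to put the digits in a capture group and ask regexp for 'tokens' instead of 'match'. This is only a sketch of that idea, with made-up variable names, not necessarily the neatest approach:

```matlab
s = 'ENGINE-45';

% The parentheses capture only the two digits; the hyphen is matched but not returned
tok = regexp(s, '-(\d{2})\>', 'tokens', 'once');   % 1x1 cell holding '45'
digitsOnly = tok{1};

% Three digits after the hyphen: \> fails after the second digit, so no token
noTok = regexp('ACCCAT-455', '-(\d{2})\>', 'tokens', 'once');
```

When nothing matches, the returned cell is empty, so check isempty before indexing into it.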

7 Comments

Maybe I misunderstood the requirements. Do you need just the digits, or also the string and hyphen preceding them?
candidates = [...
"SOMETHING-2",...
"ENGINE-45",...
"ACCCAT-455",...
"VCCASNSRR-12344"]
origmatches = regexp(candidates, '[A-Z]+-[0-9]{2}', 'match')
fullmatches = regexp(candidates, '[A-Z]+-[0-9]{2}$', 'match')
numericmatches = regexp(candidates,'-\d{2}\>','match')
If you want something like the last one but don't want the hyphen, I think you could use a lookbehind.
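A minimal sketch of that lookbehind idea, assuming char vector inputs (variable names are made up): the (?<=-) part requires a hyphen immediately before the digits without including it in the match.

```matlab
% Two digits preceded by a hyphen and followed by a word boundary
pattern = '(?<=-)\d{2}\>';

twoDigits   = regexp('ENGINE-45',  pattern, 'match', 'once');   % returns '45'
threeDigits = regexp('ACCCAT-455', pattern, 'match', 'once');   % empty, a third digit follows
```

Like the version that returns the hyphen, this only checks the digits after a hyphen and says nothing about the text in front of it.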
Hmmm...rereading, I see where you're coming from...it looks like maybe he is trying to extract the strings+digits, but only those with two digits after the hyphen. For that I'd probably just check whether the count is exactly two and forget the overhead of regexp entirely.
>> candidates(strlength(extractAfter(candidates,'-'))==2)
ans =
"ENGINE-45"
>>
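Since the question mentions R2016a, where (as far as I know) string arrays, strlength and extractAfter are not yet available, roughly the same count can be done on a cell array of char vectors. This is a sketch under the assumption that every entry contains at least one hyphen; tailLength and kept are made-up names:

```matlab
candidates = {'SOMETHING-2', 'ENGINE-45', 'ACCCAT-455', 'VCCASNSRR-12344'};

% Number of characters after the first hyphen in each entry
% (assumes a hyphen is present; entries without one would make cellfun error)
tailLength = cellfun(@(s) numel(s) - find(s == '-', 1), candidates);

% Keep only the entries with exactly two characters after the hyphen
kept = candidates(tailLength == 2);
```

Like the one-liner above, this counts characters rather than digits, so it does not verify that the two trailing characters are actually numeric.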
nice solution, but still assumes what I assumed with regex, that the numbers are terminal
Oh, no claims it's anything other than the same thing; just don't need to figure out regex that way... :)
I've seen comments that regex is also fairly expensive; how it would compare on a large array I've no idea, as I didn't test.
yep, didn't mean to suggest otherwise either.
I don't know relative computational cost either, and hadn't seen comments about it; will definitely keep in mind. But I suppose in any case it makes sense to reach for regex only when you're at the limit of simple text processing.
dpb on 8 Mar 2020
Edited: dpb on 8 Mar 2020
My thinking for past 40 years! I like the quip about only aliens...am in full agreement! :)
Some people when confronted with a problem think "I know, I'll use regular expressions." Now they have two problems.
Thanks for the response, dpb. My data varies along with another string, so I have to stick to regular expressions only for now.
