Regular expression to extract bigrams

string = 'ab bc cd ef gh ij kl'
What regular expression will extract the word bigrams from the given string?
I am using this code:
regexp(string,'\w* \w*','match');
The output is: 'ab bc' 'cd' 'ef' 'gh' 'ij' 'kl'
whereas the output I am expecting is:
  • 'ab bc'
  • 'bc cd'
  • 'cd ef'
  • 'ef gh'
  • 'gh ij'
  • 'ij kl'

2 comments

I believe the term is "bi-gram".
If the string was
'abc defg'
would you want the result to be
ab bc c<space> <space>d de ef fg
or
ab de
or
ab bc de ef fg
?
Or does it only need to work on letter pairs?
Yes, you are right, but I want to do it at the word level. When I write the regexp as:
regexp(string,'\w+ \w+','match')
the output is:
ans =
'ab bc' 'cd ef' 'gh ij'
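
Both attempts above return fewer pairs than expected because regexp reports non-overlapping matches: once 'ab bc' has been consumed, the next match cannot start at 'bc' again. One way around this, given here only as a minimal sketch, is to ask regexp for the start and end index of every word and then slice the original text from the start of word k to the end of word k+1. The variable names str, s, e and bigrams are illustrative choices, not taken from the thread.

```matlab
% Sketch: overlapping word bigrams built from word start/end indices.
str = 'ab bc cd ef gh ij kl';

% Locate each word once; s and e hold the first and last index of every word.
[s, e] = regexp(str, '\w+', 'start', 'end');

% Slice from the start of word k to the end of word k+1, so that
% consecutive bigrams share a word.
bigrams = arrayfun(@(k) str(s(k):e(k+1)), 1:numel(s)-1, ...
                   'UniformOutput', false);
```

Because the slices are taken from the original text, whatever separates two words (for example several spaces) is kept as-is inside each bigram.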


Accepted answer

Azzi Abdelmalek on 26 Sep 2013
Edited: Azzi Abdelmalek on 26 Sep 2013
EDIT
Is this what you want?
string = 'ab bc cd ef gh ij kl'
regexp(string,'\s+','split');

3 comments

arun on 26 Sep 2013
Edited: arun on 26 Sep 2013
@Azzi Abdelmalek
No, I want the following as output:
  • 'ab bc'
  • 'bc cd'
  • 'cd ef'
  • 'ef gh'
  • 'gh ij'
  • 'ij kl'
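string = 'ab bc cd ef gh ij kl'
% Split on whitespace to get the individual words
out=regexp(string,'\s+','split');
% Join each word with its successor to form the overlapping pairs
cellfun(@(x,y) [x ' ' y],out(1:end-1)', out(2:end)','UniformOutput',false)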
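
The split-then-pair idea above can also be written with a window size, which may be convenient if the pair length ever needs to change. This is only a sketch under some assumptions: strjoin is available (it was added in R2013a), the text has no leading or trailing whitespace (otherwise the split produces empty entries), and the names str, n, words and ngrams are hypothetical, chosen for illustration.

```matlab
% Sketch: word n-grams from a sliding window over the split words.
str = 'ab bc cd ef gh ij kl';
n = 2;                                  % hypothetical window size; 2 gives bigrams

% Split on runs of whitespace to get the individual words.
words = regexp(str, '\s+', 'split');

% Join words k .. k+n-1 with a single space for every valid window start k.
ngrams = arrayfun(@(k) strjoin(words(k:k+n-1), ' '), ...
                  1:numel(words)-n+1, 'UniformOutput', false);
```

With n = 2 this produces the same six pairs as the cellfun call, as a row rather than a column cell array.
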
arun on 26 Sep 2013
Thanks @Azzi Abdelmalek for your answer.


More answers (1)

Andrei Bobrov on 26 Sep 2013
z=regexp(string,'\w*','match')
strcat(z(1:end-1),{' '},z(2:end))
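
A note on why the space in the strcat call above is wrapped in a cell, {' '}: for character array inputs strcat removes trailing whitespace from each argument, so a bare ' ' would vanish, whereas for cell array inputs the whitespace is preserved. The short sketch below illustrates the difference; the variable names a, b, charResult and cellResult are illustrative only.

```matlab
% Sketch: strcat treats trailing whitespace differently for char and cell inputs.
a = 'ab';
b = 'bc';

% Char inputs: the lone space counts as trailing whitespace and is dropped.
charResult = strcat(a, ' ', b);          % 'abbc'

% Cell inputs: whitespace is kept, so the separator survives.
cellResult = strcat({a}, {' '}, {b});    % {'ab bc'}
```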
