Parsing or regexp HTML output from urlread
I need to extract the PubMed IDs from the HTML below, but I am not very fluent with regexp.
Can anyone show me how to extract the IDs and store them in a vector?
I'm guessing there is some way to say: what is between '<Id>' and '</Id>' store in...
<?xml version="1.0" ?>
<!DOCTYPE eSearchResult PUBLIC "-//NLM//DTD eSearchResult, 11 May 2002//EN" "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSearch_020511.dtd">
<eSearchResult><Count>8</Count><RetMax>8</RetMax><RetStart>0</RetStart><IdList>
<Id>16123227</Id>
<Id>9561342</Id>
<Id>8429296</Id>
<Id>1408722</Id>
<Id>2152845</Id>
<Id>2894889</Id>
<Id>2860133</Id>
<Id>6145799</Id>
</IdList><TranslationSet/><TranslationStack>
<TermSet><Term>"ulcerative colitis"[All Fields]</Term><Field>All Fields</Field><Count>33249</Count><Explode>N</Explode></TermSet>
<TermSet><Term>"Clonidine"[All Fields]</Term><Field>All Fields</Field><Count>16458</Count><Explode>N</Explode></TermSet>
<OP>AND</OP>
</TranslationStack><QueryTranslation>"ulcerative colitis"[All Fields] AND "Clonidine"[All Fields]</QueryTranslation></eSearchResult>
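One possible way to do what the question describes (take whatever sits between '<Id>' and '</Id>' and store it in a vector) is regexp with the 'tokens' option. This is only a minimal sketch: the variable names xml, tokens and ids are made up for illustration, the sample string is a shortened copy of the response above rather than a live urlread call, and it assumes every ID is a plain run of digits.

```matlab
% Shortened stand-in for the text returned by urlread (assumption: the real
% response has the same <Id>...</Id> layout as the sample in the question).
xml = ['<eSearchResult><Count>8</Count><IdList>' ...
       '<Id>16123227</Id><Id>9561342</Id><Id>8429296</Id>' ...
       '</IdList></eSearchResult>'];

% Capture the digits between each <Id> and </Id>. With 'tokens', regexp
% returns a cell array holding one 1x1 cell per match.
tokens = regexp(xml, '<Id>(\d+)</Id>', 'tokens');

% Flatten the nested cells into a cell array of strings, convert them to
% numbers, and transpose to get a column vector.
ids = str2double([tokens{:}]).';
```

Because the pattern names the Id tag explicitly, the numbers inside the Count elements are not picked up. If the IDs should stay as text instead, [tokens{:}] on its own gives a cell array of strings.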
Accepted Answer
More Answers (1)
Sean de Wolski
on 24 Jun 2013
1 Comment
Philip Spratt
on 24 Jun 2013