This question is closed.
urlread - missing some href's
Hi,
When I read the source code of a web page using urlread, some (not all) of the URLs are missing from the string I get. I think it has something to do with the HTML tag "div": I can't see any href inside a div element.
EXAMPLE:
Here is a snippet of the source code I get using "str = urlread('http://www.mathworks.de/matlabcentral/')" (lines 493-497):
<div class="spotlight custom" style="padding-left:12px;">
<<-images-blogs-blogs_spotlight_trials_gray.jpg>>
</div> </div>
Here is the corresponding snippet from my web browser:
<div class="spotlight custom" style="padding-left:12px;">
<http://www.mathworks.com/programs/trials/trial_request.html?eventid=56763&prodcode=ML&s_iid=mlcmain_trial_mlc_cta1
<<-images-blogs-blogs_spotlight_trials_gray.jpg>>
>
</div> </div>
The complete <a href="..."> element is missing. On this example page, 64 links are missing.
Thank you very much in advance, Hans
Answers (2)
Jan
on 7 Jul 2013
Edited: Jan
on 8 Jul 2013
I cannot reproduce this, because the linked document has changed.
Are you checking this in the Command Window, where "<a href..." is automatically rendered as a hyperlink? In that case the HREF is still there, but it is shown as an underlined link rather than as a string.
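One way to rule out a pure display effect is to count the anchors programmatically instead of reading the printed string. The following is a minimal sketch; the sample string and the example.com URL are hypothetical stand-ins for the output of urlread.

```matlab
% Hypothetical sample standing in for the string returned by urlread
str = ['<div class="spotlight custom" style="padding-left:12px;">' ...
       '<a href="http://www.example.com/trial.html">trial</a>' ...
       '</div>'];

% Count the anchor tags directly in the string (case-insensitive)
idx = strfind(lower(str), '<a href=');
numLinks = numel(idx);

% Extract the link targets with a regular expression
tokens = regexp(str, '<a\s+[^>]*href="([^"]*)"', 'tokens', 'ignorecase');
urls = cellfun(@(c) c{1}, tokens, 'UniformOutput', false);

fprintf('Found %d anchor tag(s)\n', numLinks);
```

If numLinks matches the number of links you expect, the anchors are present in the string and only their display in the Command Window differs.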
[EDITED] Workaround:
str = urlread('http://www.mathworks.de/matlabcentral/');
% Change the case of the tag, hoping the Command Window no longer renders it as a link
str = strrep(str, '<a href=', '<A HREF=');
disp(str)
Unfortunately, I cannot remember whether upper case disables the auto-formatting, but I have used a similar replacement to show such strings in plain text. Another idea would be to open the string in the editor:
str = urlread('http://www.mathworks.de/matlabcentral/');
matlab.desktop.editor.newDocument(str);
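In the same spirit, the string could also be written to a text file and inspected there, which avoids any link rendering. This is only a sketch: the sample string is a hypothetical stand-in for the urlread output, and the temporary file name comes from tempname.

```matlab
% Hypothetical sample standing in for the string returned by urlread
str = '<div><a href="http://www.example.com/trial.html">trial</a></div>';

% Write the raw string to a temporary text file
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
if fid == -1
    error('Could not open %s for writing.', fname);
end
fprintf(fid, '%s', str);
fclose(fid);

% Read the file back to confirm that nothing was lost on the way
readBack = fileread(fname);

% edit(fname)   % uncomment to open the file in the MATLAB Editor
```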
4 comments
Jan
on 8 Jul 2013
Edited: Jan
on 8 Jul 2013
Wow, I cannot edit my answer anymore. When I open it for editing, everything after the "[EDITED]" is invisible.
I actually wanted to add the method for older MATLAB versions (e.g. 2009a) there:
com.mathworks.mlservices.MLEditorServices.newDocument(str)
Jan
on 8 Jul 2013
Edited: Jan
on 8 Jul 2013
Strange: when I copy the full message to a new answer, I can re-open it completely. But when I open the above message, I see only the version from before the [EDITED] part was appended.
I'll ask Randy.
[EDITED] Dear Randy and other readers: Sorry, the effect disappeared after closing the browser and restarting(!) the machine. Apparently my computer had been reloading an outdated version from the Prism cache.
Ken Atwell
on 8 Jul 2013
urlread and your browser request the web site independently of each other, and there is no guarantee that exactly the same content will be returned in both cases. Here, the example you give involves a rather dynamically created page and a spotlight (okay, an "ad") for a MATLAB trial. This may or may not be offered depending on a host of factors, including the browser being used and even randomness.
In short, you should not count on urlread returning the same content as your browser for anything other than a completely static web page.
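To see which links actually differ between the two sources, the link targets of both copies can be compared. The sketch below uses two hypothetical sample strings. In practice, fromUrlread would come from urlread and fromBrowser from a page saved in the browser and loaded with fileread.

```matlab
% Hypothetical stand-ins for the two versions of the page
fromUrlread = '<div><a href="http://www.example.com/a.html">A</a></div>';
fromBrowser = ['<div><a href="http://www.example.com/a.html">A</a>' ...
               '<a href="http://www.example.com/trial.html">Trial</a></div>'];

% Extract all href targets from both strings
pattern = '<a\s+[^>]*href="([^"]*)"';
tokA = regexp(fromUrlread, pattern, 'tokens', 'ignorecase');
tokB = regexp(fromBrowser, pattern, 'tokens', 'ignorecase');
linksA = cellfun(@(c) c{1}, tokA, 'UniformOutput', false);
linksB = cellfun(@(c) c{1}, tokB, 'UniformOutput', false);

% Links served to the browser but not to urlread
onlyInBrowser = setdiff(linksB, linksA);
fprintf('%d link(s) appear only in the browser copy\n', numel(onlyInBrowser));
```

A non-empty onlyInBrowser indicates that the server delivered different content to the two clients, rather than urlread dropping parts of the page.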