webread not yielding actual website

Jakob Sievers

13 Jul 2022

0 Answers

Updated 14 Jul 2022

Hi there. I'm trying to learn how to extract information from websites. As an example, I'm trying to extract text from Facebook posts, but webread gives me something that appears to be quite different from what I actually see on the website. I'm a complete noob at this particular type of task, so I was hoping to get some pointers on how to get the text as I see it, rather than some obscured version. Thanks in advance!

3 comments

DGM on 14 Jul 2022

Edited: DGM on 14 Jul 2022

Considering the source, I'm going to guess it's dynamic content.

https://www.mathworks.com/matlabcentral/answers/1750720-webread-not-returning-full-html-contents

Without knowing what page and what content exactly is being targeted, it's hard to be sure.
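
One quick way to test the dynamic-content guess is to search the raw HTML that webread returns for a phrase that is visible in the browser. The following is only a minimal sketch: the URL and the phrase are hypothetical placeholders, not the page in question, and it assumes a MATLAB release that has `contains` (R2016b or later).

```matlab
% Hypothetical placeholders: substitute the target page and a phrase
% that you can actually see when viewing that page in a browser.
url    = 'https://www.example.com/';
phrase = 'Example Domain';

% Request the response as plain text so webread returns the raw HTML
% as a character vector instead of trying to convert it.
opts = weboptions('ContentType', 'text', 'Timeout', 15);
html = webread(url, opts);

% webread does not run JavaScript, so text that scripts insert after
% the page loads will be missing from the returned HTML.
if contains(html, phrase)
    disp('Phrase found in the static HTML.');
else
    disp('Phrase not found: the content is probably generated by scripts.');
end
```

If the phrase is missing, the text is most likely inserted by scripts after the page loads, and webread alone will not see it.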

Jakob Sievers on 14 Jul 2022

@DGM: reading through the references in the thread you linked, I think that may be exactly the problem: what you see on sites like Facebook is created not by basic HTML but by tons of scripts and such, which webread is then unable to extract.

Is there no way to dig deeper than webread using MATLAB? I'd really like to stay on the MATLAB platform, which I'm most familiar with, before considering other alternatives.
