
How can I use MATLAB to search for specific items on a website and download the resulting PNG files?

lt c on 26 Feb 2022
Edited: lt c on 26 Feb 2022
Sorry, I'm new to MATLAB. This is a common task:
1. Log on to a website;
2. Search for specific items;
3. Download those files to the local working directory (pwd).
Manually, once logged on to the website, there is a search box.
Entering the ID and clicking the 'search' button jumps to another webpage (https://xxx/#/report/list) that contains the specific file we want.
Clicking the PNG then opens a popup to save the file (a figure in PNG format) to the local working directory.
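% For privacy, 'xxx' replaces the domain name, username, and password.
list = xlsread('C:/Users/desktop/list.xlsx', 1, 'A1:A10'); % numeric IDs of the specific items
url = 'https://xxx/#/login';
options = weboptions('Username', 'xxx', 'Password', 'xxx');
S_log = webread(url, options); % page source returned as text
for k = 1:numel(list)
    % Build the pattern from the k-th ID; this only works if the PNG links appear in the page source
    pattern = sprintf('https://[^"\\s]*%d\\.png', list(k));
    S_file = regexp(S_log, pattern, 'match', 'once');
    if ~isempty(S_file)
        data = webread(S_file, options); % image data for the matched PNG
    end
end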
I tried to use the loop to feed the search page with IDs from the list, but I got stuck on the regexp line. I don't have the URL for the download page (when done manually, the popup appears after a click and the URL does not change). Any help with that? Thank you.
2 comments
DGM on 26 Feb 2022
Edited: DGM on 26 Feb 2022
That depends entirely on how the page is implemented. If the search interface or results page is generated by JavaScript or something similar, webread() might not be of any use. Just as so many site-level search tools have gross compatibility issues with all but a handful of browsers, problems with tools like webread() should be expected.
If there's some API for the site search, that might be useful.
A general solution that would satisfy the requirements for any website would be nontrivial. I don't even know if it would be possible without relying heavily on external tools.
Then again, not everything needs to be done in MATLAB anyway. You could always just use your browser, go to the site, enter the search, and use FlashGot or something. There are also browser automation tools, external download managers, cURL, etc.
TL;DR:
There is no practical way to provide a working solution (or determine whether one exists) without knowing what the specific targets are. One needs to observe the actual site, the page content, the URL structure, etc.
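If the site does turn out to expose a search API, the batch could be sketched roughly as follows. This is only an illustration under assumptions: the endpoint `https://xxx/api/report/search`, the `id` query parameter, the `pngUrl` response field, and the example IDs are all hypothetical, and the `Username`/`Password` options only help if the server accepts HTTP authentication rather than a form-based login.

```matlab
% Sketch only: the endpoint, the 'id' query parameter and the 'pngUrl'
% response field are hypothetical placeholders, not a known API.
apiUrl  = 'https://xxx/api/report/search';   % hypothetical search endpoint
ids     = [1001 1002 1003];                  % made-up example IDs
options = weboptions('Username', 'xxx', 'Password', 'xxx', 'Timeout', 15);
failed  = [];                                % IDs that could not be downloaded

for k = 1:numel(ids)
    try
        % webread appends name-value pairs as query parameters (...?id=1001)
        result  = webread(apiUrl, 'id', ids(k), options);
        outFile = fullfile(pwd, sprintf('%d.png', ids(k)));
        % Save the file that the (assumed) response field points to
        websave(outFile, result.pngUrl, options);
    catch err
        % Keep going so that one bad ID does not stop a long batch
        fprintf('ID %d failed: %s\n', ids(k), err.message);
        failed(end+1) = ids(k); %#ok<AGROW>
    end
end
```

Collecting the failed IDs instead of stopping at the first error is a design choice for lists with thousands of entries: the failures can be retried or inspected afterwards.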
lt c on 26 Feb 2022
Edited: lt c on 26 Feb 2022
The list contains thousands of IDs, which makes downloading time-consuming if done manually. In addition, the downloaded PNG files will be processed with MATLAB, so my original plan was to integrate both steps.
You are absolutely right. The actual webpage would clarify the problem. Unfortunately, for privacy reasons, I cannot share the URL of our project. After going through other posts, I got the idea of using regexp to build a working URL for downloading multiple files from the webpage. However, I had a hard time doing that when the download goes through a popup. Anyway, I will try to learn how to use FlashGot and cURL. Great point, thank you.
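For reference, the regexp idea mentioned above can be tried offline on a piece of page text before touching the real site. The sketch below assumes the PNG links appear as plain absolute URLs in the page source and that each file is named after its ID; the HTML snippet, the `/files/` path, and the IDs are made up for illustration.

```matlab
% Sketch only: the HTML snippet and the URL layout below are made up.
html = ['<a href="https://xxx/files/1001.png">fig</a>' ...
        '<a href="https://xxx/files/1002.png">fig</a>' ...
        '<a href="https://xxx/files/notes.txt">txt</a>'];
ids  = [1001 1002 1003];                     % IDs we are looking for

% Collect every absolute link that ends in .png
pngLinks = regexp(html, 'https://[^"\s]+\.png', 'match');

% Keep only the links whose file name (without extension) is a wanted ID
[~, names] = cellfun(@fileparts, pngLinks, 'UniformOutput', false);
idStrings  = arrayfun(@num2str, ids, 'UniformOutput', false);
wanted     = ismember(names, idStrings);
toDownload = pngLinks(wanted);

disp(toDownload(:))
```

In a real run, each entry of `toDownload` could then be passed to websave together with the weboptions object. If the links are injected by JavaScript after the page loads, they will not be in the text returned by webread and this approach finds nothing.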


Answers (0)
