check if url exists for large files without downloading the file

40 vues (au cours des 30 derniers jours)
Diana
Diana le 11 Oct 2016
I've searched this forum for a while and need some help with checking URLs. I have a set of files to download from a site and loop through the formatting options for the files basically the files are zip files and are extremely large (nearly 100MB). If I use the following statement:
raw_file = websave([p_raw,'\',fname_raw],[URL,MO_name,num2str(YR),'/raw/',fname_raw]);
Then for files that do not actually exist a dummy file with 0KB of data in it is made in the saved location. I want to avoid this. I've tried using:
[str status] = urlread([URL,MO_name,num2str(YR),'/raw/',fname_raw]);
but MATLAB still insists on trying to read the actual file instead of just telling me if it's there or not - wasting time in the process.
The function isurl does not apply to named URLs. How do I check merely the existence of a url without actually downloading it?
Thanks!

Réponses (4)

Steven Lord
Steven Lord le 11 Oct 2016
Try using Java.
url = 'http://www.mathworks.com';
J = java.net.URL(url);
conn = openConnection(J);
status = getResponseCode(conn)
When I ran this code I received status 200 which is OK. I tried it for a URL to a file that didn't exist and received code 404. I don't think creating an object using openConnection actually retrieves the data; I think you'd have to call that object's getContent method or something similar to obtain the data.

Image Analyst
Image Analyst le 11 Oct 2016
Did you try exist()?
itExists = exist(filename, 'file');
filename will the the url of the file you want to check on.
  1 commentaire
Diana
Diana le 11 Oct 2016
Modifié(e) : Walter Roberson le 11 Oct 2016
If you mean in this manner:
itexists = exist('http://amisr.com/database/tmp/loucks/Apr2016/raw/20160406.005.tar.gz','file')
nope - gives me a 0 return and I'm staring at the file on the URL.

Connectez-vous pour commenter.


Matthew Eicholtz
Matthew Eicholtz le 11 Oct 2016
It seems I stumbled upon a similar approach to Steven, but a few minutes too late!
I think this function would do the trick:
function tf = urlexist(url)
URL = java.net.URL(url); %create the URL object
% Get the proxy information using the MATLAB proxy API.
proxy = com.mathworks.webproxy.WebproxyFactory.findProxyForURL(URL);
% Open a connection to the URL.
if isempty(proxy)
urlConnection = URL.openConnection;
else
urlConnection = URL.openConnection(proxy);
end
% Try to start the input stream
try
inputStream = urlConnection.getInputStream;
tf = true;
catch
tf = false;
end
end

Fernando Bello
Fernando Bello le 28 Nov 2019
Maybe it is a little bit late, but I'm just reading this post because I had the same trouble.
I solved it this way:
url = ('https://matlab/file.nc'); % the URL where data is, it can be what ever you want, png, doc, mat, etc.
filename = 'filename_test'; % the name you want to save the file
options = weboptions('Username','username','Password','passwd','CertificateFilename','');
try
websave(filename,url,options);
disp('OK') %if the file exist it is downloades
catch
disp('URL read of link was unsuccessful') % if file do not exist
end

Catégories

En savoir plus sur File Operations dans Help Center et File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by