Problem reading HDF5 on S3

Ben Dichter on 22 Feb 2022
Edited: Ben Dichter on 6 Nov 2022
>> H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9')
Error using hdf5lib2
Unable to access
'https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9'.
The specified URL scheme is invalid.
Error in H5F.open (line 130)
file_id = H5ML.hdf5lib2('H5Fopen', filename, flags, fapl, is_remote);
This works when using the same URL with h5py in Python:
from h5py import File
file = File("https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9", "r", driver="ros3")
file.keys()
<KeysViewHDF5 ['acquisition', 'analysis', 'file_create_date', 'general', 'identifier', 'processing', 'session_description', 'session_start_time', 'specifications', 'stimulus', 'timestamps_reference_time', 'units']>

Accepted Answer

Ben Dichter on 6 Nov 2022
Edited: Ben Dichter on 6 Nov 2022
Two things were needed to solve this:
  1. You need to input an s3:// path, not an http:// or https:// path (see the sketch below).
  2. Delete or rename ~/.aws/credentials (on Windows, something like C:/Users/username/.aws/credentials).
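A minimal sketch of the resulting call, assuming the s3:// URI corresponds to the question's HTTPS URL (bucket dandiarchive, same object key):
% Open the file with the s3:// scheme; per the H5F documentation,
% H5F.open(URL) opens a remote file for read-only access.
% If a stale ~/.aws/credentials file exists, rename or remove it first
% so the request is made without those credentials (step 2 above).
fid = H5F.open('s3://dandiarchive/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9');
H5F.close(fid);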

More Answers (1)

Yongjian Feng on 23 Feb 2022
In Python, you seem to use the read-only flag ("r"). Maybe you want to try:
H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9', 'H5F_ACC_RDONLY')
  1 comment
Ben Dichter on 23 Feb 2022
That does not appear to be the issue. The documentation for H5F indicates that it should be possible to pass only the URL:
file_id = H5F.open(URL) opens the hdf5 file at a remote location
for read-only access and returns the file identifier, file_id.
Also, this use case is demonstrated in an example:
Example: Open a file in Amazon S3 in read-only mode with
default file access properties.
fid = H5F.open('s3://bucketname/path_to_file/my_file.h5');
H5F.close(fid);
I tried some optional arguments, which did not help:
>> H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9', 'H5F_ACC_RDONLY')
Not enough input arguments.
Error in H5F.open (line 130)
file_id = H5ML.hdf5lib2('H5Fopen', filename, flags, fapl, is_remote);
>> H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9', 'H5F_ACC_RDONLY', 'H5P_DEFAULT')
Error using hdf5lib2
Unable to access
'https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9'. The
specified URL scheme is invalid.
Error in H5F.open (line 130)
file_id = H5ML.hdf5lib2('H5Fopen', filename, flags, fapl, is_remote);
It looks like MATLAB is validating the input and requiring that the path start with s3://, which mine does not.
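For reference, a sketch of that scheme conversion for a virtual-hosted-style S3 URL (the regular expression and variable names here are illustrative, not from the H5F docs):
% Rewrite https://<bucket>.s3.amazonaws.com/<key> as s3://<bucket>/<key>,
% the form that H5F.open accepts.
url = 'https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9';
s3url = regexprep(url, '^https?://([^.]+)\.s3\.amazonaws\.com/', 's3://$1/');
% s3url is now 's3://dandiarchive/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9'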
