Why does Matlab transpose hdf5 data?
There is an apparent bug in Matlab's HDF5 read/write utilities that breaks interoperability with other code. Simple array datasets are read and written as the transpose of their actual shape. I imagine this is because Matlab uses column-major (Fortran-style) order, whereas the HDF5 standard uses row-major (C-style) order.
Minimal example that illustrates the problem:
h5create('test.h5', '/dataset', [2,3]);
h5write('test.h5', '/dataset', reshape(1:6,[2,3]))
Running the HDF5 utility h5ls on the output reveals the problem:
$ h5ls test.h5
dataset Dataset {3, 2}
This is not evident if only using the HDF5 tools from within Matlab, since reading the dataset in also transposes it back.
>> h5read('test.h5', '/dataset')
ans =
1 3 5
2 4 6
Matlab should either fix this in future versions or mention the convention in the documentation, since people mostly choose HDF5 for interoperability with other systems, and this can be a tricky bug to find.
In versions:
- h5ls: Version 1.8.14
- Matlab 8.6.0.267246 (R2015b) GLNXA64
1 Comment
Daniel Döhring
on 24 May 2019
Edited: Daniel Döhring
on 24 May 2019
Actually this bug seems to be still around. In my case, a (pseudo) multiarray of dimensions [...] is in Matlab internally permuted to [...]. As a consequence, it is impossible to write back a multiarray in dimensions [...], since Matlab does not represent matrices in [...] manner.
Accepted Answer
More Answers (3)
Kameron Harris
on 20 Oct 2016
Edited: Kameron Harris
on 20 Oct 2016
1 vote
Kameron Harris
on 20 Oct 2016
Edited: Kameron Harris
on 20 Oct 2016
0 votes
1 Comment
James Tursa
on 20 Oct 2016
The HDF Group's intent seems to be that applications should be able to write to the file in their native storage order. This seems reasonable to me, especially from a speed standpoint. Why cripple column-ordered languages (Fortran, MATLAB) with a hard requirement to permute the data each time you read/write?
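A minimal sketch of the storage-order point, using the 2-by-3 example from the question. It emulates how a row-major (C-style) reader would interpret MATLAB's unchanged linear buffer when told the reversed dimensions. The names `buf`, `fileDims`, and `C` are assumptions introduced for this illustration, and no HDF5 call is involved:

```matlab
% MATLAB stores A column by column, so its linear memory is 1 2 3 4 5 6.
A = reshape(1:6, [2,3]);
buf = A(:).';

% Dimensions listed slowest-changing first are MATLAB's size reversed.
fileDims = fliplr(size(A));   % [3 2], matching the {3, 2} reported by h5ls

% Emulate a row-major reader: element (i,j) sits at offset i*ncols + j.
C = zeros(fileDims);
for i = 0:fileDims(1)-1
    for j = 0:fileDims(2)-1
        C(i+1, j+1) = buf(i*fileDims(2) + j + 1);
    end
end

disp(C)                 % the 3-by-2 array a row-major program would see
disp(isequal(C, A.'))   % true: same buffer, viewed as the transpose
```

No element is moved at any point; only the order in which the dimensions are listed changes, which is why writing in native order is cheap.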
Kameron Harris
on 20 Oct 2016
Edited: Kameron Harris
on 20 Oct 2016
0 votes
2 Comments
James Tursa
on 20 Oct 2016
Well, so this pretty much answers the question. The HDF Group intended the various applications (Fortran, MATLAB, C, C++, Python, etc) to be able to write to the file in a native storage order and simply list the dimensions of the data in the file in a specified order (slowest changing first ... fastest changing last). It is then incumbent on the user to know what storage order his/her applications use if they are to share data through this file format ... and permute the data accordingly if necessary.
So given this language in the HDF doc, I would say MATLAB is doing everything correctly (but maybe could help the user out with some documentation about interoperability with other languages/applications).
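If the file has to present the dataset as 2-by-3 to a row-major reader, one option in line with "permute the data accordingly" is to reverse the dimension order before writing. A minimal sketch, assuming a hypothetical file name `test_c_order.h5` and the same 2-by-3 example as in the question:

```matlab
fname = 'test_c_order.h5';          % hypothetical file name for this example
if exist(fname, 'file'), delete(fname); end

A = reshape(1:6, [2,3]);
At = permute(A, ndims(A):-1:1);     % reverse the dimension order (a transpose in 2-D)

h5create(fname, '/dataset', size(At));   % MATLAB-side size is [3 2]
h5write(fname, '/dataset', At);

% Reading back in MATLAB returns the permuted array, so undo the permutation.
B = permute(h5read(fname, '/dataset'), ndims(At):-1:1);
disp(isequal(B, A))
```

With this layout, h5ls should report the dataset as {2, 3}, at the cost of an explicit permutation on every read and write from MATLAB.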
Kameron Harris
on 20 Oct 2016