Sometimes I get the following arrays from my HDF5 file:

val1 = {ndarray} [<HDF5 object reference> <HDF5 object reference> <HDF5 object reference>]

If I try to dereference an element with the HDF5 file object

f[val1[0]]

I get an error:

Argument 'ref' has incorrect type (expected h5py.h5r.Reference, got numpy.object_)
Suzan Cioc

1 Answer

I came across this question while trying to answer what turned out to be essentially the same question in another form. A dataset containing references to other objects is a slightly awkward case in HDF5, but reading it is fairly straightforward. The idea is to get the name of the referenced object, and then read that object directly from the file.

Given a single HDF5 reference, ref, and an open file, file, you can get the name of the referenced dataset by doing:

>>> name = h5py.h5r.get_name(ref, file.id)

Then read the dataset itself, as usual (the older .value attribute was deprecated and then removed in h5py 3.0, so index with [()] instead):

>>> data = file[name][()]  # ndarray with the data in it.

So to read all the referenced datasets, just map this process across the whole dataset of references.
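As a minimal sketch of that mapping, the snippet below builds a small throwaway file and reads every referenced dataset back. The helper name read_referenced, the dataset names a, b and refs, and the temporary file path are all made up for illustration; only h5py.h5r.get_name and ordinary file indexing come from the steps above.

```python
import os
import tempfile

import h5py
import numpy as np


def read_referenced(h5file, refs):
    """Return a list holding the data of each object referenced in refs.

    read_referenced is a hypothetical helper, not part of h5py.
    """
    arrays = []
    for ref in refs:
        # Resolve the reference to the path of the object it points at.
        name = h5py.h5r.get_name(ref, h5file.id)
        # Read the whole dataset into memory.
        arrays.append(h5file[name][()])
    return arrays


# Build a small demo file (names and contents are illustrative only).
path = os.path.join(tempfile.mkdtemp(), "refs_demo.h5")
with h5py.File(path, "w") as f:
    a = f.create_dataset("a", data=np.arange(3))
    b = f.create_dataset("b", data=np.arange(4.0))
    ref_dtype = h5py.special_dtype(ref=h5py.Reference)
    refs = f.create_dataset("refs", (2,), dtype=ref_dtype)
    refs[0] = a.ref
    refs[1] = b.ref

# Read the reference dataset, then map the lookup over its elements.
with h5py.File(path, "r") as f:
    arrays = read_referenced(f, f["refs"][()])
```

This assumes every reference points at a dataset; a reference to a group would need different handling, since indexing a group with [()] is not meaningful.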

bnaecker