To get an idea of the data layout you could execute
h5dump ./sv/train/digitStruct.mat
but there are also other methods, such as h5py's visit or visititems.
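As a rough illustration of the visititems route, here is a minimal sketch that collects the path, shape and dtype of the first few datasets in an open file. The helper name list_datasets and its limit parameter are my own and not part of h5py; only Group.visititems and h5py.Dataset come from the library.

```python
import h5py


def list_datasets(h5_file, limit=20):
    """Return (path, shape, dtype) for up to `limit` datasets of an open file.

    `list_datasets` and `limit` are hypothetical names used for illustration.
    """
    found = []

    def visitor(name, obj):
        # visititems calls this once per group and dataset in the file.
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape, str(obj.dtype)))
        # Returning anything other than None stops the traversal early.
        return True if len(found) >= limit else None

    h5_file.visititems(visitor)
    return found
```

Calling list_datasets(f) on a file opened with h5py.File('train/digitStruct.mat', 'r') and printing the result should give a quick overview. The limit is there because files of this kind can contain a very large number of small datasets.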
A useful reference that addresses a very similar (if not identical) problem is the following SO post:
h5py, access data in Datasets in SVHN
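For example, the snippet:
import h5py

def get_name(index, hdf5_data):
    # Each entry of /digitStruct/name is a reference to a dataset of character codes.
    name = hdf5_data['/digitStruct/name']
    # Dataset.value was removed in h5py 3.0; [()] reads the whole dataset instead.
    print(''.join(chr(v[0]) for v in hdf5_data[name[index][0]][()]))

labels_file = 'train/digitStruct.mat'
f = h5py.File(labels_file, 'r')  # open read-only
for j in range(33402):  # 33402 = number of entries in the training set
    get_name(j, f)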
will print the names of the files. For example, I get:
7459.png
7460.png
7461.png
7462.png
7463.png
7464.png
7465.png
You can generalize from here.
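One possible generalization is reading the bounding boxes in the same way. The sketch below is not taken from the linked post. It assumes that /digitStruct/bbox holds one reference per image to a group with the fields label, left, top, width and height, and that each field is either a 1x1 dataset (single digit) or a column of references to 1x1 datasets (several digits). Check this against the h5dump output before relying on it. The helper names read_field and read_bbox are my own.

```python
import h5py

# Assumed field names of each bbox group; verify them against your file.
BBOX_FIELDS = ('label', 'left', 'top', 'width', 'height')


def read_field(f, field):
    """Return one bbox field as a list of floats, one entry per digit."""
    if field.shape[0] > 1:
        # Assumed layout for several digits: each row references a 1x1 dataset.
        return [float(f[field[i][0]][0][0]) for i in range(field.shape[0])]
    # Assumed layout for a single digit: the value is stored inline.
    return [float(field[0][0])]


def read_bbox(f, index):
    """Return a dict mapping each field name to its per-digit values."""
    # Dereference the entry for this image, as get_name does for the names.
    group = f[f['/digitStruct/bbox'][index][0]]
    return {key: read_field(f, group[key]) for key in BBOX_FIELDS}
```

With the file opened as f, read_bbox(0, ...) style calls mirror get_name: read_bbox(f, j) returns the boxes for the image whose name get_name(j, f) prints.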