Questions tagged [h5py]

h5py is a NumPy-compatible Python module for handling Hierarchical Data Format (HDF5) files.

Main features

  • Free (BSD licensed)
  • Limited dependencies (Python, NumPy, HDF5 libraries)
  • Includes both a low-level, C-like HDF5 interface and a high-level, Python/NumPy-style interface
  • Direct interaction with datasets using NumPy metaphors, such as slicing
  • Datatypes specified using standard NumPy dtype objects
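
As a minimal sketch of the high-level interface listed above, the following writes a small array to a file and reads part of it back by slicing. The file name demo.h5, the dataset name "demo", and the helper roundtrip_demo are illustrative assumptions, not part of h5py itself.

```python
import os
import tempfile

import h5py
import numpy as np


def roundtrip_demo(path):
    """Write a small array to an HDF5 file, then read part of it back."""
    data = np.arange(12, dtype=np.float64).reshape(3, 4)

    with h5py.File(path, "w") as f:
        # The datatype is given as a standard NumPy dtype object.
        dset = f.create_dataset("demo", shape=data.shape, dtype=np.dtype("float64"))
        dset[...] = data   # NumPy-style assignment writes to the file
        dset[0, :] = -1.0  # a slice can be assigned to as well

    with h5py.File(path, "r") as f:
        dset = f["demo"]
        # Slicing reads only the selected region and returns a NumPy array.
        first_two_cols = dset[:, :2]
        return dset.shape, dset.dtype, first_two_cols


# Hypothetical usage: the file lives in a temporary directory and is discarded.
with tempfile.TemporaryDirectory() as tmp:
    shape, dtype, block = roundtrip_demo(os.path.join(tmp, "demo.h5"))
```

Only the slice that is requested is read from disk, which is what makes this style of access practical for datasets larger than memory.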

1070 questions

104 votes, 2 answers

Input and output numpy arrays to h5py

I have a Python code whose output is a sized matrix, whose entries are all of the type float. If I save it with the extension .dat the file size is of the order of 500 MB. I read that using h5py reduces the file size considerably. So, let's say I…
lovespeed

100 votes, 1 answer

Is there an analysis speed or memory usage advantage to using HDF5 for large array storage (instead of flat binary files)?

I am processing large 3D arrays, which I often need to slice in various ways to do a variety of data analysis. A typical "cube" can be ~100GB (and will likely get larger in the future). It seems that the typical recommended file format for large…
Caleb

55 votes, 2 answers

How to append data to one specific dataset in a hdf5 file with h5py

I am looking for a possibility to append data to an existing dataset inside a .h5 file using Python (h5py). A short intro to my project: I try to train a CNN using medical image data. Because of the huge amount of data and heavy memory usage during…
Midas.Inc

49 votes, 5 answers

Installing h5py on an Ubuntu server

I was installing h5py on an Ubuntu server. However it seems to return an error that h5py.h is not found. It gives the same error message when I install it using pip or the setup.py file. What am I missing here? I have Numpy version 1.8.1, which…
Devil

46 votes, 2 answers

Experience with using h5py to do analytical work on big data in Python?

I do a lot of statistical work and use Python as my main language. Some of the data sets I work with though can take 20GB of memory, which makes operating on them using in-memory functions in numpy, scipy, and PyIMSL nearly impossible. The…
Josh Hemann

41 votes, 2 answers

How to overwrite array inside h5 file using h5py

I'm trying to overwrite a numpy array that's a small part of a pretty complicated h5 file. I'm extracting an array, changing some values, then want to re-insert the array into the h5 file. I have no problem extracting the array that's nested. f1…
user3508433

32 votes, 4 answers

Error opening file in H5PY (File signature not found)

I've been using the following bit of code to open some HDF5 files, produced in MATLAB, in python using H5PY:

    import h5py as h5
    data='dataset.mat'
    f=h5.File(data, 'r')

However I'm getting the following error: OSError: Unable to open file (File…
Anisha Singh

30 votes, 2 answers

Incremental writes to hdf5 with h5py

I have got a question about how best to write to hdf5 files with python / h5py. I have data like:

    -----------------------------------------
    | timepoint | voltage1 | voltage2 | ...
    -----------------------------------------
    | 178       | 10       | 12…
user116293

30 votes, 6 answers

Combining hdf5 files

I have a number of hdf5 files, each of which has a single dataset. The datasets are too large to hold in RAM. I would like to combine these files into a single file containing all datasets separately (i.e. not to concatenate the datasets into a…
Bitwise

29 votes, 5 answers

How to store dictionary in HDF5 dataset

I have a dictionary, where the key is a datetime object and the value is a tuple of integers:

    >>> d.items()[0]
    (datetime.datetime(2012, 4, 5, 23, 30), (14, 1014, 6, 3, 0))

I want to store it in an HDF5 dataset, but if I try to just dump the dictionary h5py raises…
theta

27 votes, 3 answers

Storing a list of strings to a HDF5 Dataset from Python

I am trying to store a variable-length list of strings to an HDF5 dataset. The code for this is:

    import h5py
    h5File=h5py.File('xxx.h5','w')
    strList=['asas','asas','asas']
    h5File.create_dataset('xxx',(len(strList),1),'S10',strList)
    h5File.flush()…
gman

26 votes, 2 answers

how to export HDF5 file to NumPy using H5PY?

I have an existing hdf5 file with three arrays; I want to extract one of the arrays using h5py.
l.z.lz

24 votes, 6 answers

How to list all datasets in h5py file?

I have an h5py file storing numpy arrays, but I got an "Object doesn't exist" error when trying to open it with the dataset name I remember, so is there a way I can list what datasets the file has?

    with h5py.File('result.h5','r') as hf:
        #How…
matchifang

24 votes, 4 answers

Deleting hdf5 dataset using h5py

Is there any way to remove a dataset from an hdf5 file, preferably using h5py? Or alternatively, is it possible to overwrite a dataset while keeping the other datasets intact? To my understanding, h5py can read/write hdf5 files in 5 modes f =…
hsnee

24 votes, 3 answers

Check if node exists in h5py

I am wondering if there is a simple way to check if a node exists within an HDF5 file using h5py. I couldn't find anything in the docs, so right now I'm using exceptions, which is ugly.

    # check if node exists
    # first assume it exists
    e = True
    try: …
troy.unrau