Questions tagged [h5py]

h5py is a NumPy-compatible Python module for handling Hierarchical Data Format (HDF5) files.

Main features

  • Free (BSD licensed)
  • Limited dependencies (Python, NumPy, HDF5 libraries)
  • Includes both a low-level, C-like HDF5 interface and a high-level, Python/NumPy-style interface
  • Interacts directly with datasets using NumPy metaphors, such as slicing
  • Datatypes are specified using standard NumPy dtype objects
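
As a rough illustration of the high-level interface described above, the sketch below creates a dataset with a NumPy dtype and reads parts of it back with slicing. The file name `demo.h5` and the group and dataset names are made up for this example, and the in-memory `core` driver is used only so that nothing is written to disk.

```python
import numpy as np
import h5py

# Hypothetical names: "demo.h5", "experiment" and "samples" are illustrative only.
# driver="core" with backing_store=False keeps the file purely in memory.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    grp = f.create_group("experiment")

    # The datatype is given as a standard NumPy dtype object.
    dset = grp.create_dataset("samples", shape=(4, 3), dtype=np.dtype("float32"))

    # Write through NumPy-style indexing.
    dset[...] = np.arange(12, dtype="float32").reshape(4, 3)

    # Slicing reads only the selected region and returns a NumPy array.
    first_rows = dset[:2, :]
    column = dset[:, 1]

    stored_dtype = dset.dtype
    full_path = dset.name

print(first_rows.shape, column, stored_dtype, full_path)
```

The slices are taken inside the `with` block because a dataset cannot be read once its file has been closed; the resulting NumPy arrays remain usable afterwards.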

1070 questions
23
votes
3 answers

Fastest way to write HDF5 files with Python?

Given a large (10s of GB) CSV file of mixed text/numbers, what is the fastest way to create an HDF5 file with the same content, while keeping the memory usage reasonable? I'd like to use the h5py module if possible. In the toy example below, I've…
Nicholas Palko
  • 785
  • 2
  • 10
  • 21
22
votes
1 answer

Pandas can't read hdf5 file created with h5py

I get a pandas error when I try to read HDF5 format files that I have created with h5py. I wonder if I am just doing something wrong? import h5py import numpy as np import pandas as pd h5_file = h5py.File('test.h5',…
Masha L.
  • 241
  • 1
  • 2
  • 5
21
votes
3 answers

Close an open h5py data file

In our lab we store our data in HDF5 files through the Python package h5py. At the beginning of an experiment we create an HDF5 file and store array after array of data in the file (among other things). When an experiment fails or is…
Adriaan Rol
  • 401
  • 1
  • 4
  • 10
19
votes
8 answers

Save Keras ModelCheckpoints in Google Cloud Bucket

I'm working on training an LSTM network on Google Cloud Machine Learning Engine using Keras with the TensorFlow backend. I managed to deploy my model and perform a successful training task after some adjustments to gcloud and my Python script. I…
Kevin Katzke
  • 2,836
  • 3
  • 31
  • 42
18
votes
6 answers

Read HDF5 file into numpy array

I have the following code to read a hdf5 file as a numpy array: hf = h5py.File('path/to/file', 'r') n1 = hf.get('dataset_name') n2 = np.array(n1) and when I print n2 I get this: Out[15]: array([[, , …
e9e9s
  • 645
  • 2
  • 8
  • 21
18
votes
2 answers

HDF5 file created with h5py can't be opened by h5py

I created an HDF5 file apparently without any problems, under Ubuntu 12.04 (32bit version), using Anaconda as Python distribution and writing in ipython notebooks. The underlying data are all numpy arrays. For example, import numpy as np import…
Lilith-Elina
  • 1,304
  • 3
  • 17
  • 29
18
votes
2 answers

hdf5 / h5py ImportError: libhdf5.so.7

I'm working on a project involving network messaging queues (msgpack, zmq, ...) on a RHEL 6.3 (x86_64) system. I was installing the most recent packages of glib, gevent, pygobject, pygtk, and such in order to get pylab / matplotlib to work (which…
cronburg
  • 827
  • 1
  • 7
  • 22
17
votes
5 answers

How to differentiate between HDF5 datasets and groups with h5py?

I use the Python package h5py (version 2.5.0) to access my hdf5 files. I want to traverse the content of a file and do something with every dataset. Using the visit method: import h5py def print_it(name): dset = f[name] print(dset) …
Trilarion
  • 9,318
  • 9
  • 55
  • 91
17
votes
3 answers

'/' in names in HDF5 files confusion

I am experiencing some really weird interactions between h5py, PyTables (via Pandas), and C++ generated HDF5 files. It seems that h5check and h5py cope with type names containing '/' but pandas/PyTables cannot. Clearly, there is a gap in my…
17
votes
5 answers

Python particles simulator: out-of-core processing

Problem description: In writing a Monte Carlo particle simulator (Brownian motion and photon emission) in Python/NumPy, I need to save the simulation output (>>10GB) to a file and process the data in a second step. Compatibility with both Windows and…
user2304916
  • 7,133
  • 3
  • 29
  • 53
17
votes
5 answers

How to read a v7.3 mat file via h5py?

I have a struct array created by matlab and stored in v7.3 format mat file: struArray = struct('name', {'one', 'two', 'three'}, 'id', {1,2,3}, 'data', {[1:10], [3:9], [0]}) save('test.mat', 'struArray',…
Eastsun
  • 17,358
  • 4
  • 50
  • 76
16
votes
3 answers

Save pandas DataFrame using h5py for interoperability with other hdf5 readers

Here is a sample data frame: import pandas as pd NaN = float('nan') ID = [1, 2, 3, 4, 5, 6, 7] A = [NaN, NaN, NaN, 0.1, 0.1, 0.1, 0.1] B = [0.2, NaN, 0.2, 0.2, 0.2, NaN, NaN] C = [NaN, 0.5, 0.5, NaN, 0.5, 0.5, NaN] columns = {'A':A, 'B':B,…
Phil
  • 4,762
  • 1
  • 27
  • 53
16
votes
1 answer

How to partially copy an HDF5 file into a new one using Python, keeping the same structure?

I have a large hdf5 file that looks something like this: A/B/dataset1, dataset2 A/C/dataset1, dataset2 A/D/dataset1, dataset2 A/E/dataset1, dataset2 ... I want to create a new file with only that: A/B/dataset1, dataset2 A/C/dataset1, dataset2 What…
graham
  • 285
  • 1
  • 3
  • 10
16
votes
3 answers

h5py: Correct way to slice array datasets

I'm a bit confused here: As far as I have understood, h5py's .value method reads an entire dataset and dumps it into an array, which is slow and discouraged (and should generally be replaced by [()]). The correct way is to use numpy-esque…
JiaYow
  • 5,097
  • 3
  • 26
  • 36
15
votes
3 answers

How do I traverse a hdf5 file using h5py

How do I traverse all the groups and datasets of an hdf5 file using h5py? I want to retrieve all the contents of the file from a common root using a for loop or something similar.
Marcio
  • 515
  • 2
  • 7
  • 16