Quick HDF5 benchmarks

We compare the performance of reading a subset of a large array:

  • in memory with NumPy
  • with h5py
  • with memmap using an HDF5 file
  • with memmap using an NPY file

This illustrates the performance issues we hit with HDF5 in a very particular use case: accessing a small number of rows in a large "vertical" rectangular array (many more rows than columns).


In [4]:
import h5py
import numpy as np

In [5]:
np.random.seed(2016)

We'll use this function to bypass h5py's slow data access with a faster memory map (this only works on uncompressed, contiguous datasets):


In [6]:
def _mmap_h5(path, h5path):
    with h5py.File(path, 'r') as f:
        ds = f[h5path]
        # We get the dataset address in the HDF5 file.
        offset = ds.id.get_offset()
        # We ensure we have a non-compressed contiguous array.
        assert ds.chunks is None
        assert ds.compression is None
        assert offset > 0
        dtype = ds.dtype
        shape = ds.shape
    arr = np.memmap(path, mode='r', shape=shape, offset=offset, dtype=dtype)
    return arr

Number of rows in our test array:


In [7]:
shape = (100000, 1000)
n, ncols = shape

We generate a random array:


In [8]:
arr = np.random.rand(n, ncols).astype(np.float32)

We write it to a file:


In [12]:
%timeit with h5py.File('test.h5', 'w') as f: f['/test'] = arr


1 loops, best of 3: 413 ms per loop

We open the file once in read mode:


In [7]:
f = h5py.File('test.h5', 'r')
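As a quick sanity check (illustrative, not part of the benchmark), we can verify on a small throwaway file that the memmap view returned by the trick above is byte-for-byte identical to what h5py reads. The file name `check.h5` and the sizes here are assumptions for the sketch:

```python
# Sanity check: the raw memmap at the dataset's file offset should
# contain exactly the same data as h5py's own read.
import numpy as np
import h5py

def mmap_h5(path, h5path):
    with h5py.File(path, 'r') as g:
        ds = g[h5path]
        # Only valid for uncompressed, contiguous datasets.
        assert ds.chunks is None and ds.compression is None
        offset = ds.id.get_offset()
        dtype, shape = ds.dtype, ds.shape
    return np.memmap(path, mode='r', shape=shape, offset=offset, dtype=dtype)

small = np.random.rand(100, 10).astype(np.float32)
with h5py.File('check.h5', 'w') as g:
    g['/test'] = small  # default layout: contiguous, uncompressed

with h5py.File('check.h5', 'r') as g:
    via_h5py = g['/test'][:]
via_mmap = np.asarray(mmap_h5('check.h5', '/test'))

print(np.array_equal(small, via_h5py) and np.array_equal(small, via_mmap))
```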

In [13]:
%timeit np.save('test.npy', arr)


1 loops, best of 3: 628 ms per loop

For reference, we also time the creation of a raw memory-mapped file:

In [ ]:
%timeit arr = np.memmap('test.map', mode='w+', shape=shape, dtype=np.float32)

Slices


In [10]:
ind = slice(None, None, 100)

In [11]:
print('in memory')
%timeit arr[ind, :] * 1
print()
print('h5py')
%timeit f['/test'][ind, :] * 1
print()
print('memmap of HDF5 file')
%timeit _mmap_h5('test.h5', '/test')[ind, :] * 1
print()
print('memmap of NPY file')
%timeit np.load('test.npy', mmap_mode='r')[ind, :] * 1


in memory
1000 loops, best of 3: 741 µs per loop

h5py
100 loops, best of 3: 9.65 ms per loop

memmap of HDF5 file
100 loops, best of 3: 3.95 ms per loop

memmap of NPY file
100 loops, best of 3: 3.75 ms per loop

Fancy indexing

Fancy indexing is what we have to use in our particular use case.


In [13]:
ind = np.unique(np.random.randint(0, n, n // 100))

In [15]:
len(ind)


Out[15]:
999

In [16]:
print('in memory')
%timeit arr[ind, :] * 1
print()
print('h5py')
%timeit f['/test'][ind, :] * 1
print()
print('memmap of HDF5 file')
%timeit _mmap_h5('test.h5', '/test')[ind, :] * 1
print()
print('memmap of NPY file')
%timeit np.load('test.npy', mmap_mode='r')[ind, :] * 1


in memory
100 loops, best of 3: 2.05 ms per loop

h5py
10 loops, best of 3: 53.3 ms per loop

memmap of HDF5 file
100 loops, best of 3: 5.62 ms per loop

memmap of NPY file
100 loops, best of 3: 5.12 ms per loop

Note that h5py uses a slow algorithm for fancy indexing, so the HDF5 format itself is not the only cause of the slowdown.
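
This is why both memmap variants above beat h5py by an order of magnitude: the gather happens in NumPy's fast fancy-indexing code, and only the pages holding the selected rows are read from disk. A minimal sketch of that access pattern (file name `demo.npy` and smaller sizes are illustrative, not from the benchmark):

```python
# Sketch: fancy-index rows through a memmap so the gather happens in
# NumPy rather than in h5py's slower selection code.
import numpy as np

shape = (10000, 100)
arr = np.random.rand(*shape).astype(np.float32)
np.save('demo.npy', arr)

# Sorted, unique row indices, as in the benchmark above.
ind = np.unique(np.random.randint(0, shape[0], shape[0] // 100))

# NumPy's fancy indexing on the memmap touches only the needed pages.
rows = np.load('demo.npy', mmap_mode='r')[ind, :]

print(np.array_equal(rows, arr[ind, :]))  # the gather matches the in-memory result
```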