See, e.g., http://wesmckinney.com/blog/numpy-indexing-peculiarities/, which shows that np.take(a, indices)
can be much faster than a[indices].
In [1]:
import numpy as np
In [2]:
a = np.arange(10)
indices = np.array([0, 0, 0, 1, 2, 2, 3, 4, 5, 9])
print(a[indices])
print(a.take(indices))
%timeit a[indices]
%timeit a.take(indices)
In [3]:
a = np.arange(10)
a = a[:, np.newaxis]
indices = np.array([0, 0, 0, 1, 2, 2, 3, 4, 5, 9])
print(a[indices])
print(a.take(indices, axis=0))
%timeit a[indices]
%timeit a.take(indices, axis=0)
np.take is indeed faster here as well, though in this context not by as much as in other cases.
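Outside IPython, the same comparison can be sketched with the standard timeit module. This is only an illustration: the helper name compare_indexing, the array sizes, the seed and the repeat count are assumptions chosen for the example, and the measured times will vary by machine and NumPy version.

```python
import timeit

import numpy as np


def compare_indexing(n_rows=1000, n_cols=1, n_indices=1000, repeats=200, seed=0):
    """Hypothetical helper: time a[indices] against a.take(indices, axis=0).

    Returns (same_result, t_fancy, t_take), with times in seconds
    accumulated over `repeats` calls.
    """
    rng = np.random.RandomState(seed)
    a = rng.rand(n_rows, n_cols)
    indices = rng.randint(0, n_rows, size=n_indices)

    # Both spellings select the same rows along axis 0.
    same_result = bool(np.array_equal(a[indices], a.take(indices, axis=0)))

    t_fancy = timeit.timeit(lambda: a[indices], number=repeats)
    t_take = timeit.timeit(lambda: a.take(indices, axis=0), number=repeats)
    return same_result, t_fancy, t_take


same_result, t_fancy, t_take = compare_indexing()
print("identical results: %s" % same_result)
print("a[indices]: %.6f s, a.take: %.6f s" % (t_fancy, t_take))
```

No particular ratio between the two timings is guaranteed; the check on same_result is the only deterministic part.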
In [4]:
a = np.arange(10, dtype=float)
a = a[:, np.newaxis]
indices = np.array([0, 0, 0, 1, 2, 2, 3, 4, 5, 9])
out = np.empty((10, 1), dtype=float)
%timeit a.take(indices, axis=0)
%timeit a.take(indices, out=out, axis=0, mode='clip')
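By preallocating the output array, we can save a slight overhead. The mode='clip' argument matters here: with the default mode='raise', NumPy always buffers out, which reduces the benefit.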
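One caveat with mode='clip' is that it changes how out-of-range indices are handled. The sketch below, using the same small array as above plus a made-up index array bad, shows that the preallocated result matches plain indexing for valid indices, while an invalid index is silently clipped instead of raising.

```python
import numpy as np

a = np.arange(10, dtype=float)[:, np.newaxis]  # shape (10, 1)
indices = np.array([0, 0, 0, 1, 2, 2, 3, 4, 5, 9])

# Preallocated output: its shape must match the result, (len(indices), 1).
out = np.empty((len(indices), 1), dtype=float)
a.take(indices, axis=0, out=out, mode='clip')
matches_fancy = bool(np.array_equal(out, a[indices]))

# Illustrative out-of-range index: 12 is past the last row (index 9).
bad = np.array([0, 12])
clipped = a.take(bad, axis=0, mode='clip')  # 12 is clipped to 9

# The default mode='raise' rejects the same indices.
try:
    a.take(bad, axis=0)
    raised = False
except IndexError:
    raised = True

print(matches_fancy)
print(clipped.ravel())
print(raised)
```

If the indices are not known to be valid, it may be worth validating them once up front before relying on mode='clip' in a hot loop.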