In [1]:
from math import sqrt
from joblib import Parallel, delayed

In [3]:
%time Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))


CPU times: user 3.17 ms, sys: 2.99 ms, total: 6.16 ms
Wall time: 5.83 ms
Out[3]:
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

In [4]:
# using the threading backend

In [5]:
%time Parallel(n_jobs=2, backend="threading")(delayed(sqrt)(i ** 2) for i in range(10))


CPU times: user 9.32 ms, sys: 0 ns, total: 9.32 ms
Wall time: 110 ms
Out[5]:
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

Reusing a pool of workers

Some algorithms make several consecutive calls to a parallel function, interleaved with processing of the intermediate results. Calling Parallel several times in a loop is sub-optimal because each call creates and destroys a pool of workers (threads or processes), which can add significant overhead.

In this case it is more efficient to use the context manager API of the Parallel class, which reuses the same pool of workers across several calls to the Parallel object:


In [6]:
with Parallel(n_jobs=2) as parallel:
    accumulator = 0.
    n_iter = 0
    while accumulator < 1000:
        results = parallel(delayed(sqrt)(accumulator + i ** 2) for i in range(5))
        accumulator += sum(results)  # synchronization barrier
        n_iter += 1

In [8]:
(accumulator, n_iter)


Out[8]:
(1136.5969161564717, 14)
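
To check whether reusing the pool matters on a given machine, both variants can be timed side by side. The following is a minimal sketch, not a benchmark: the helper names `run_fresh_pool`, `run_reused_pool` and `timed` are hypothetical, and the threading backend is assumed only to keep worker start-up cheap. Both loops perform the same computation as the cell above and should return identical results.

```python
from math import sqrt
from time import perf_counter

from joblib import Parallel, delayed


def run_fresh_pool(limit=1000, n_jobs=2, backend="threading"):
    """Build a new Parallel object (and its workers) on every iteration."""
    accumulator, n_iter = 0.0, 0
    while accumulator < limit:
        results = Parallel(n_jobs=n_jobs, backend=backend)(
            delayed(sqrt)(accumulator + i ** 2) for i in range(5))
        accumulator += sum(results)
        n_iter += 1
    return accumulator, n_iter


def run_reused_pool(limit=1000, n_jobs=2, backend="threading"):
    """Keep one pool of workers alive for the whole loop."""
    accumulator, n_iter = 0.0, 0
    with Parallel(n_jobs=n_jobs, backend=backend) as parallel:
        while accumulator < limit:
            results = parallel(
                delayed(sqrt)(accumulator + i ** 2) for i in range(5))
            accumulator += sum(results)  # synchronization barrier
            n_iter += 1
    return accumulator, n_iter


def timed(func):
    """Return (result, elapsed seconds) for a zero-argument callable."""
    start = perf_counter()
    result = func()
    return result, perf_counter() - start
```

Calling `timed(run_fresh_pool)` and `timed(run_reused_pool)` gives the two wall-clock times. The size of the gap depends on the backend and the joblib version, so it is worth measuring rather than assuming.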

Working with numerical data in shared memory (memmapping)

Automated array to memmap conversion


In [9]:
import numpy as np
from joblib import Parallel, delayed
from joblib.pool import has_shareable_memory

Parallel(n_jobs=2, max_nbytes=1e6)(
    delayed(has_shareable_memory)(np.ones(int(i)))
    for i in [1e2, 1e4, 1e6])


---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-9-ea6067fff2b7> in <module>()
      1 import numpy as np
      2 from joblib import Parallel, delayed
----> 3 from joblib.pool import has_shareable_memory
      4 
      5 Parallel(n_jobs=2, max_nbytes=1e6)(

ImportError: cannot import name 'has_shareable_memory'

In [12]:
import joblib

In [14]:
joblib.__version__


Out[14]:
'0.12.1'
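
The import above fails on the joblib version reported here (0.12.1), so `has_shareable_memory` is evidently not exposed by `joblib.pool` in this release. The sketch below avoids that import. It assumes that "shareable" means the worker received a `numpy.memmap` (or a view on one), and the helpers `is_memmap_backed` and `memmap_flags` are hypothetical names written for this example, not part of joblib.

```python
import numpy as np
from joblib import Parallel, delayed


def is_memmap_backed(a):
    """Return True if `a`, or any array it is a view of, is a numpy.memmap."""
    while a is not None:
        if isinstance(a, np.memmap):
            return True
        # Follow the chain of views; non-array bases have no .base attribute.
        a = getattr(a, "base", None)
    return False


def memmap_flags(sizes, max_nbytes=1e6, n_jobs=2):
    """Send arrays of the given sizes to workers and report, for each one,
    whether the worker saw a memmap-backed array."""
    return Parallel(n_jobs=n_jobs, max_nbytes=max_nbytes)(
        delayed(is_memmap_backed)(np.ones(int(n))) for n in sizes)
```

Calling `memmap_flags([1e2, 1e4, 1e6])` mirrors the failing cell: with `max_nbytes=1e6`, only arrays larger than about 1 MB are expected to be dumped to disk and passed to the workers as memmaps, provided the backend is process-based.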
