Chapter 5, example 1

Here we illustrate the basic parallel computing capabilities of IPython.

First, IPython engines must be started, for example with the following command, which launches 2 engines (one per core):

ipcluster start -n 2

In [1]:
from IPython.parallel import Client

The Client allows us to launch jobs on the engines.


In [2]:
rc = Client()

We can obtain the engines' identifiers through the client.


In [3]:
rc.ids


Out[3]:
[0, 1]

ERRATUM: the original code did not contain %px before the import os statement. This magic command is necessary for the import to occur on all engines.


In [4]:
%px import os

The %px magic command allows us to execute commands in parallel on every engine.


In [5]:
%px print(os.getpid())


[stdout:0] 3256
[stdout:1] 1056

With %pxconfig, we can specify which engines subsequent commands should be executed on (here, the second engine).


In [6]:
%pxconfig --targets 1

In [7]:
%px print(os.getpid())


1056

Another possibility is to use the %%px cell magic to run an entire cell on all engines. The --targets option can accept a slice object (here, all engines except the last one).


In [8]:
%%px --targets :-1
print(os.getpid())


[stdout:0] 3256
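
The --targets value is interpreted like a Python slice over the list of engine ids; with two engines, :-1 keeps every engine but the last:

```python
ids = [0, 1]     # engine ids, as returned by rc.ids
print(ids[:-1])  # [0] -- only the first engine is targeted
```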

By default, the parallel calls are synchronous (blocking) but we can ask IPython to make asynchronous calls.


In [9]:
%%px --noblock
import time
time.sleep(1)
os.getpid()


Out[9]:
<AsyncResult: execute>

With asynchronous (non-blocking) calls, the results can later be retrieved from the engines with %pxresult, which blocks until they are available.


In [10]:
%pxresult


Out[1:4]: 1056
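
For comparison, the standard library offers the same non-blocking pattern: multiprocessing.pool's apply_async returns an AsyncResult whose get() blocks until the result is ready, much like %pxresult. A minimal sketch using a thread pool rather than IPython engines:

```python
from multiprocessing.pool import ThreadPool

def square(x):
    return x * x

pool = ThreadPool(2)                 # two workers, analogous to two engines
ar = pool.apply_async(square, (6,))  # returns immediately with an AsyncResult
result = ar.get()                    # blocks until the result is available
pool.close()
pool.join()
print(result)  # 36
```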

Another way to run tasks on the engines is to use map. First, we need to retrieve a view on the engines, which represents a particular subset of the running engines (here, all of them).


In [11]:
v = rc[:]

We import a module on each engine.


In [12]:
with v.sync_imports():
    import time


importing time on engine(s)

We define a simple function.


In [13]:
def f(x):
    time.sleep(1)
    return x * x

Now, we call map_sync, which is a synchronous and parallel version of Python's built-in map function. We execute f on all integers between 0 and 9 in parallel across all engines.


In [14]:
v.map_sync(f, range(10))


Out[14]:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
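
For reference, the built-in map gives the same result sequentially. Note that in Python 3, map returns a lazy iterator, so it must be wrapped in list to force evaluation (we drop the sleep here to keep the check fast):

```python
def f(x):
    return x * x  # same function, without the sleep

result = list(map(f, range(10)))
print(result)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```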

We check how much time the sequential version takes with the built-in map function (here map evaluates eagerly; in Python 3 you would need list(map(f, range(10))) to force execution).


In [15]:
timeit -n 1 -r 1 map(f, range(10))


1 loops, best of 1: 10 s per loop

And we compare with the time taken by the parallel version.


In [16]:
r = v.map(f, range(10))

In [17]:
r.ready(), r.elapsed


Out[17]:
(False, 0.065)

We wait and get the results.


In [18]:
r.get()


Out[18]:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [19]:
r.elapsed, r.serial_time


Out[19]:
(5.009, 10.0)
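
The serial_time attribute is the total computation time summed over the engines (about 10 s, one second per call), while elapsed is the wall-clock time (about 5 s, since each engine handles 5 of the 10 calls), giving a speedup close to 2, as expected with two engines:

```python
elapsed, serial_time = 5.009, 10.0  # values reported above
speedup = serial_time / elapsed
print(round(speedup, 2))  # 2.0
```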