Tutorial Brief

The IPython parallel architecture consists of four components:

  • The IPython engine
  • The IPython hub
  • The IPython schedulers
  • The controller client

Dependencies:

  • ZeroMQ (libzmq)

To install on Ubuntu:

sudo apt-get install libzmq-dev

  • pyzmq

To install on Ubuntu:

sudo easy_install pyzmq

or

sudo pip install pyzmq

Parallel and Distributed Computing

Parallel computing: The work is distributed across multiple processors or cores on a single machine.

Distributed computing: The work is distributed across multiple processors or cores located on several machines.

Load Balancing

This will be our first approach to parallel/distributed computing.

Start the engines

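I'll start the engines on a profile named "nbserver". One way to do this from a terminal (the engine count of 12 is chosen to match the output shown further down):

ipcluster start --profile=nbserver -n 12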

Import Client


In [1]:
from IPython.parallel import Client

Create a Client and a load_balanced_view


In [2]:
cluster = Client(profile="nbserver")
lb_view = cluster.load_balanced_view()

print("Profile: %s" % cluster.profile)
print("Engines: %s" % len(lb_view))


Profile: nbserver
Engines: 12

With no Load Balancing


In [3]:
def f(x):
    # CPU-bound toy workload: 100000 multiplications per call
    result = 1.0
    for counter in range(100000):
        result = result * x * 0.5
        if result % 5 == 0:
            result -= 4
    return result

In [4]:
%%timeit -r 1 -n 1
result = []
for i in range(1000):
    result.append(f(i))


1 loops, best of 1: 23.8 s per loop
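Outside a notebook the %%timeit cell magic is not available. A minimal sketch of the same single-run measurement in plain Python 3 follows. It assumes a much smaller workload so that it finishes quickly: the iterations parameter and the sizes used here are my own choices, not the values timed above.

```python
import time


def f(x, iterations=1000):
    # Same toy workload as above; `iterations` is an added knob (assumption)
    result = 1.0
    for counter in range(iterations):
        result = result * x * 0.5
        if result % 5 == 0:
            result -= 4
    return result


# Time one serial pass, like %%timeit -r 1 -n 1
start = time.perf_counter()
results = [f(i) for i in range(100)]
elapsed = time.perf_counter() - start

print("Serial run: %d results in %.4f s" % (len(results), elapsed))
```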

With Load Balancing

Using a load-balanced view can simplify the process of distributing code. There are two ways to use it: view.map() and the @view.parallel() function decorator.

view.map(f, *sequences, block=self.block, chunksize=1, ordered=True)
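In this signature, chunksize sets how many elements of the sequence go into each task (the default of 1 means one task per element), and ordered only matters when iterating over an asynchronous result as items arrive. As a rough illustration of what a larger chunksize means for the number of tasks, here is a plain-Python sketch. The chunked helper is hypothetical and is not how IPython splits the work internally.

```python
def chunked(sequence, chunksize=1):
    """Split a sequence into consecutive lists of at most `chunksize` items."""
    items = list(sequence)
    return [items[i:i + chunksize] for i in range(0, len(items), chunksize)]


# One task per element versus 50 elements per task
tasks_default = chunked(range(1000), chunksize=1)
tasks_batched = chunked(range(1000), chunksize=50)

print("chunksize=1:  %d tasks" % len(tasks_default))
print("chunksize=50: %d tasks" % len(tasks_batched))
```

Fewer, larger tasks mean less scheduling overhead per element, at the cost of coarser load balancing.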


In [7]:
def f(x):
    result = 1.0
    for counter in range(100000):
        result = (result * x * 0.5)
        if result % 5 == 0:
            result -= 4
    return result

In [9]:
%%timeit -r 1 -n 1
result = lb_view.map(f, range(1000), block=True)


1 loops, best of 1: 5.13 s per loop
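To put the two timings side by side, the speedup and the per-engine efficiency can be worked out from the numbers reported above (23.8 s serial, 5.13 s with map, 12 engines). This is only arithmetic on those measurements; the variable names are mine.

```python
serial_seconds = 23.8      # timing from the serial loop
parallel_seconds = 5.13    # timing from lb_view.map
engines = 12               # engine count reported by len(lb_view)

speedup = serial_seconds / parallel_seconds
efficiency = speedup / engines

print("Speedup: %.2fx" % speedup)
print("Efficiency: %.0f%%" % (efficiency * 100))
```

That is roughly a 4.6x speedup, well short of the 12x that perfectly linear scaling would give.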

@view.parallel() function decorator

Using the parallel function decorator is by far the simplest way to implement parallel computing in IPython. It offers little control over how the work is distributed, but it is quick to apply and works for most cases.

load_balanced_view.parallel(self, dist='b', block=None, **flags)

Decorator for making a ParallelFunction

In [10]:
@lb_view.parallel(block=True)
def f(x):
    result = 1.0
    for counter in range(100000):
        result = (result * x * 0.5)
        if result % 5 == 0:
            result -= 4
    return result

In [15]:
result = f.map(range(1000))
print("Results Count: %s" % len(result))


Results Count: 1000
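To make the mechanics less magical: the decorator replaces f with an object that carries a map() method. The sketch below is a local analogy in Python 3, built on the standard library's ThreadPool. The names local_parallel and LocalParallelFunction are hypothetical, and this is not IPython's ParallelFunction implementation.

```python
from multiprocessing.pool import ThreadPool


class LocalParallelFunction(object):
    """Hypothetical stand-in: wraps a function and adds a map() method."""

    def __init__(self, func, workers=4):
        self.func = func
        self.workers = workers

    def map(self, sequence):
        # Apply func to every element using a pool of local worker threads
        pool = ThreadPool(self.workers)
        try:
            return pool.map(self.func, sequence)
        finally:
            pool.close()
            pool.join()


def local_parallel(workers=4):
    # Decorator factory, mirroring the shape of @lb_view.parallel(...)
    def decorator(func):
        return LocalParallelFunction(func, workers)
    return decorator


@local_parallel(workers=4)
def square(x):
    return x * x


result = square.map(range(10))
print("Results Count: %s" % len(result))
```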

Asynchronous Processing


In [16]:
@lb_view.parallel(block=False)
def f(x):
    result = 1.0
    for counter in range(100000):
        result = (result * x * 0.5)
        if result % 5 == 0:
            result -= 4
    return result

In [31]:
result = f.map(range(1000))

Retrieving the results from AsyncMapResult


In [34]:
result


Out[34]:
<AsyncMapResult: finished>

In [33]:
print("Results Count: %s" % len(result.result))


Results Count: 1000
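With block=False the call returns immediately, so the result may not be finished yet when you look at it. Besides the .result attribute used above, IPython's AsyncResult objects provide ready(), wait(timeout) and get(). Here is a minimal sketch of that polling pattern in Python 3, using the standard library's ThreadPool as a local stand-in for the cluster. Its map_async result exposes methods with the same names, but this is an analogy, not IPython code, and slow_square is a made-up task.

```python
import time
from multiprocessing.pool import ThreadPool


def slow_square(x):
    # Hypothetical task: sleep briefly so the result is not ready at once
    time.sleep(0.01)
    return x * x


pool = ThreadPool(4)
async_result = pool.map_async(slow_square, range(20))  # returns immediately

polls = 0
while not async_result.ready():
    async_result.wait(0.05)  # block for at most 50 ms, then check again
    polls += 1

values = async_result.get()  # results come back in submission order
pool.close()
pool.join()

print("Results Count: %s" % len(values))
```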