Lecture 22: Parallel Computing



Start the engines and load other auxiliary libraries for function access.


In [1]:
import os, sys, time
import numpy

from ipyparallel import Client
rc = Client()    # connect to the running cluster of engines

In [17]:
dview = rc[:]    # a DirectView spanning all engines
e0 = rc[0]       # views of individual engines
e1 = rc[1]
e2 = rc[2]

dview.block = True    # make calls through the DirectView synchronous

We are using Python to demonstrate concepts. It is by no means the only way to program in parallel; my simulations are written using an entirely different library. I'm using this environment because it is interactive and can clarify the types of things we need to think about.

There are 4 engines available for computation.
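
A quick way to confirm this is to ask the Client for its engine IDs (a minimal check; the value shown in the comment is what we would expect with 4 engines):

print(rc.ids)    # engine IDs visible to the client, e.g. [0, 1, 2, 3]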


In [3]:
dview.apply_sync(os.getpid)


Out[3]:
[4528, 4525, 4529, 4530]

Scatter and Gather

Lots of parallel computations involve partitioning data onto processes.

DirectViews have scatter() and gather() methods to help with this.

Pass any container or numpy array, and IPython will partition the object onto the engines with scatter(), or reconstruct the full object in the Client with gather().
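
For instance, a NumPy array can be partitioned the same way. A minimal sketch (not executed in this session; 'b' is just an illustrative name):

import numpy

dview.scatter('b', numpy.arange(16.0))   # each engine receives a contiguous chunk
b = dview.gather('b')                    # the chunks are concatenated back on the client
print(b)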

The range() function creates a list of integers.


In [4]:
range(16)


Out[4]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

We will now scatter those numbers across our engines (processors).


In [5]:
dview.scatter('a',range(16))
dview['a']


Out[5]:
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]

We can examine engine 0 and see what data is present there.


In [6]:
e0['a']


Out[6]:
[0, 1, 2, 3]

Let us now re-assemble the data on our client machine.


In [7]:
dview.gather('a')


Out[7]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

What happened?

We distributed our data across processors. When we do this, we are programming (interactively) in the MIMD (multiple instruction, multiple data) paradigm.

We have to manage the data to/from the processors.

Executing Instructions

Now let us do something interesting with the data.


In [8]:
dview.scatter('a',range(16))
dview['a']


Out[8]:
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]

In [9]:
dview.scatter('a',range(16))
dview.execute("asum = sum(a)")
dview.gather('asum')


Out[9]:
[6, 22, 38, 54]

With Great Power...

We are responsible for managing the data to/from the processors. Our summation happened on each processor; now it is up to us to complete the reduction and get the result we want.


In [10]:
summed_parts = dview.gather('asum')
sum(summed_parts)


Out[10]:
120
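
Nothing forces every engine to run the same instructions on its chunk; that is the MIMD flexibility mentioned above. A small sketch using the single-engine views created earlier ('result' is just an illustrative name):

e0.execute("result = min(a)", block=True)   # engine 0 computes the minimum of its chunk
e1.execute("result = max(a)", block=True)   # engine 1 computes the maximum of its chunk
print(e0['result'])   # expected 0, the minimum of [0, 1, 2, 3]
print(e1['result'])   # expected 7, the maximum of [4, 5, 6, 7]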

Multiplexing Exercise - Monte Carlo π

Modified from https://github.com/minrk/IPython.parallel.tutorial

  • Min Ragan-Kelley, UC Berkeley Applied Science & Technology

A simple toy problem to get a handle on multiple engines is a Monte Carlo approximation of π.

Let's say we have a dartboard with a round target inscribed on a square board. If you threw darts randomly and they landed evenly distributed over the square board, what fraction of them would you expect to hit the target?

$$ \frac{A_c}{A_{sq}} = \frac{\pi r^2}{(2r)^2} = \frac{\pi}{4} $$
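
Rearranging, the fraction of darts that land inside the circle estimates $\pi/4$, so the estimator computed below is

$$ \pi \approx 4\,\frac{N_{\mathrm{hits}}}{N_{\mathrm{throws}}} $$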

In [11]:
from random import random
from math import pi

def mcpi(nsamples):
    # estimate pi by sampling points uniformly in the unit square and
    # counting how many fall inside the quarter circle of radius 1
    s = 0
    for i in xrange(nsamples):
        x = random()
        y = random()
        if x*x + y*y <= 1:
            s += 1
    return 4.*s/nsamples

In [12]:
for n in [10, 100, 1000, 10000, 100000, 1000000]:
    print "%8i" % n,
    for i in range(3):
        print "%.5f" % mcpi(n),
    print


      10 3.20000 3.20000 2.80000
     100 3.08000 3.08000 2.96000
    1000 3.06000 3.14400 3.14000
   10000 3.12400 3.16840 3.13720
  100000 3.13576 3.13420 3.13624
 1000000 3.14060 3.14224 3.14172

Parallel MC Estimation of $\pi$

We can now write a parallel version of our algorithm. In this case we simply partition groups of the random numbers to different processors and re-assemble the answer.


In [13]:
def mcpi(nsamples):
    # the import is inside the function so that it also runs on the
    # engines when the function is shipped to them
    from random import random
    s = 0
    for i in xrange(nsamples):
        x = random()
        y = random()
        if x*x + y*y <= 1:
            s += 1
    return 4.*s/nsamples

def multi_mcpi(dview, nsamples):
    p = len(dview.targets)
    if nsamples % p:
        # round nsamples up so it divides evenly among the engines
        nsamples += p - (nsamples % p)

    subsamples = nsamples/p

    # run mcpi(subsamples) on every engine, then average the estimates
    ar = dview.apply_async(mcpi, subsamples)
    return sum(ar)/p
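
Here apply_async ships mcpi to every engine in the view and returns an AsyncResult; summing over it waits for each engine's estimate to arrive, and dividing by the number of engines averages them into a single value.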

In [14]:
t0 = time.time()

mcpi(15000000)

t1 = time.time()

print ''
print t1-t0


4.46362805367

In [16]:
t0 = time.time()

multi_mcpi(dview, 15000000)

t1 = time.time()

print ''
print t1-t0


1.44683098793
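
On four engines the same 15,000,000-sample estimate takes roughly 1.4 s instead of about 4.5 s, a bit over a 3x speedup; the gap from the ideal 4x reflects the overhead of shipping the task to the engines and collecting the results.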