Start the engines and load the auxiliary libraries we need for function access.
In [1]:
import os, sys, time
import numpy
from ipyparallel import Client
rc = Client()   # connect to the running cluster of engines
In [17]:
dview = rc[:]        # a DirectView spanning all engines
e0 = rc[0]           # views of individual engines
e1 = rc[1]
e2 = rc[2]
dview.block = True   # calls on this view wait for their results
We are using Python to demonstrate the concepts. It is by no means the only way to program in parallel; my own simulations are written with an entirely different library. I'm using this environment because it is interactive and helps clarify the kinds of things we need to think about.
There are 4 engines available for computation. Asking each one for its process ID shows that every engine is a separate process.
In [3]:
dview.apply_sync(os.getpid)
Out[3]:
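If it helps to see which result came from which engine, here is a minimal sketch (assuming the four-engine cluster above) that pairs each engine id with the process ID it reports:
pids = dview.apply_sync(os.getpid)   # one PID per engine, in target order
dict(zip(rc.ids, pids))              # engine id -> process ID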
Lots of parallel computations involve partitioning data across processes.
DirectViews have scatter() and gather() methods to help with this.
Pass any container or numpy array, and IPython will partition the object across the engines with scatter(),
or reconstruct the full object on the Client with gather().
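As a quick sketch of that round trip (the variable name x is just illustrative), a NumPy array can be partitioned and reassembled the same way:
x = numpy.arange(8)
dview.scatter('x', x)      # each engine receives a contiguous chunk of x
dview['x']                 # the per-engine chunks
dview.gather('x')          # concatenated back into a single array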
The range() function produces a sequence of integers; wrapping it in list() gives a concrete list of 16 values to distribute.
In [4]:
list(range(16))
Out[4]:
We will now scatter those numbers across our engines (processors).
In [5]:
dview.scatter('a', list(range(16)))   # each engine gets a 4-element slice named 'a'
dview['a']                            # peek at every engine's local 'a'
Out[5]:
We can examine engine 0 and see what data is present there.
In [6]:
e0['a']
Out[6]:
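The same dictionary-style access also works for writing, so we can place data on a single engine; the name b below is just illustrative:
e0['b'] = 'only on engine 0'   # assign a variable in engine 0's namespace
e0['b']                        # read it back from that engine alone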
Let us now re-assemble the data on our client machine.
In [7]:
dview.gather('a')
Out[7]:
In [8]:
dview.scatter('a', list(range(16)))
dview['a']
Out[8]:
A common pattern is to compute on each engine's local chunk and then combine the partial results: here every engine sums its own slice of a, and gather() brings the partial sums back to the client.
In [9]:
dview.scatter('a', list(range(16)))
dview.execute("asum = sum(a)")   # each engine sums its local chunk
dview.gather('asum')             # collect the per-engine partial sums
Out[9]:
In [10]:
summed_parts = dview.gather('asum')
sum(summed_parts)   # combine the partial sums on the client: 120 == sum(range(16))
Out[10]:
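The same reduction can be written without explicit scatter/execute/gather calls. Here is a minimal sketch using map_sync (the helper partial_sum and the chunking are ours) that sends one chunk to each engine and combines the results on the client:
def partial_sum(chunk):
    return sum(chunk)

chunks = [list(range(i, i + 4)) for i in range(0, 16, 4)]   # four 4-element chunks
parts = dview.map_sync(partial_sum, chunks)                 # one partial sum per chunk
sum(parts)                                                  # 120, as before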
Modified from https://github.com/minrk/IPython.parallel.tutorial
A simple toy problem to get a handle on multiple engines is a Monte Carlo approximation of π.
Let's say we have a dartboard with a round target inscribed on a square board. If you threw darts randomly, and they landed evenly distributed over the square board, what fraction would you expect to hit the target? The answer is the ratio of the two areas, π/4, so counting hits and multiplying the hit fraction by 4 gives an estimate of π.
In [11]:
from random import random
from math import pi

def mcpi(nsamples):
    """Estimate pi by sampling nsamples points in the unit square."""
    s = 0
    for i in range(nsamples):
        x = random()
        y = random()
        if x*x + y*y <= 1:   # point falls inside the quarter circle
            s += 1
    return 4.0 * s / nsamples
In [12]:
for n in [10, 100, 1000, 10000, 100000, 1000000]:
    print("%8i" % n, end=" ")
    for i in range(3):
        print("%.5f" % mcpi(n), end=" ")
    print()
In [13]:
def mcpi(nsamples):
    # import inside the function so it is available in each engine's namespace
    from random import random
    s = 0
    for i in range(nsamples):
        x = random()
        y = random()
        if x*x + y*y <= 1:
            s += 1
    return 4.0 * s / nsamples

def multi_mcpi(dview, nsamples):
    p = len(dview.targets)
    if nsamples % p:
        # round up so the samples divide evenly among the engines
        nsamples += p - (nsamples % p)
    subsamples = nsamples // p                  # samples per engine
    ar = dview.apply_async(mcpi, subsamples)    # one estimate per engine
    return sum(ar) / p                          # average the engine estimates
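apply_async() returns an AsyncResult; summing it waits for and iterates over the per-engine estimates. For reference, a sketch of the blocking equivalent (with an illustrative chunk size) looks like this:
results = dview.apply_sync(mcpi, 1000000)   # blocks until every engine returns an estimate
sum(results) / len(results)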
In [14]:
t0 = time.time()
mcpi(15000000)
t1 = time.time()
print()
print(t1 - t0)
In [16]:
t0 = time.time()
multi_mcpi(dview, 15000000)
t1 = time.time()
print()
print(t1 - t0)
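To put the two runs side by side, a minimal sketch (re-running both versions; the timing variables are ours) reports the speedup directly:
t0 = time.time(); mcpi(15000000); t_serial = time.time() - t0
t0 = time.time(); multi_mcpi(dview, 15000000); t_parallel = time.time() - t0
print("speedup: %.2fx with %d engines" % (t_serial / t_parallel, len(rc.ids)))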