Monte Carlo: Calculating π

In this tutorial we will introduce Monte-Carlo simulations, which are an incredibly useful for simulating statistical systems. We will work through 2 simple examples to demonstrate the principles.

  1. calculation of pi

  2. Maxwell-Boltzmann velocity and speed distributions

Calculation of pi

Take a dart board. Draw a square, and a circle inside the square whose diameter equals the length of the side of the square. Throw some darts at it. Assuming you're as good at darts as me, they will be randomly populated inside the square (ignore any that land outside the square...). How many darts land inside the circle as well? It's a question of relative area:

The area of the square is $d^2$, and the area of circle is $\pi (d/2)^2$. The circle therefore covers $\pi/4$ of the area of the square. So assuming uniform distributions, the ratio of darts that land inside the circle to the total inside the square will also be $\pi/4$, which we can then rearrange to find pi.

So how many darts do you need to throw to get an accurate answer? Let's set up the program to find out.


In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle, Circle
from IPython.display import display

#create the plot so we can visualise it
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111)
#add a square
Rpatch = Rectangle((0,0),1,1,ec='k',fc='grey',alpha=0.35)
ax.add_patch(Rpatch)
#add circle
Cpatch = Circle((0.5,0.5),0.5,ec='k',fc='b',alpha=0.25)
ax.add_patch(Cpatch)
plt.draw()


Now we make use of random number generators to simulate our dart thrower. Let's add a few to the plot as an example:


In [2]:
xs = np.random.random(20)
ys = np.random.random(20)

ax.plot(xs,ys,'ro')
display(fig)


To decide if the points are inside the circle, we just need to look at their radius (relative to the center coordinate of the circle):


In [49]:
rad = np.sqrt((xs-0.5)**2+(ys-0.5)**2)
inside = rad[rad<0.5] # use logical statements to index lists, this selects the elements of rad whose value is less than 0.5

print len(inside), len(rad)

ratio = float(len(inside))/len(rad)
pi_guess = 4 * ratio

print 'Estimate of pi:',pi_guess
print 'Relative error (%):', 100*abs(pi_guess-np.pi)/np.pi


17 20
Estimate of pi: 3.4
Relative error (%): 8.22536130249

Not too bad for 20 points! Let's try increasing the numbers of darts though...

We'll rewrite the above code into a routine we can call with an argument that represents the number of points:


In [50]:
def estimate_pi(n):
    """ Estimate pi using the drunken dart player model. 
        Argument n (integer) is the number of 'darts' thrown
    """
    xs = np.random.random(n)
    ys = np.random.random(n)
    rad = np.sqrt((xs-0.5)**2+(ys-0.5)**2)
    inside = rad[rad<0.5]
    ratio = float(len(inside))/len(rad)
    pi_guess = 4 * ratio
    
    return pi_guess

Ok, let's try with a few different values:


In [51]:
for n in [20,200,2e3,2e4,2e6]:
    pi_guess = estimate_pi(n)
    print 'Number of points:', n
    print 'Estimate of pi:',pi_guess
    print 'Relative error (%):', 100*abs(pi_guess-np.pi)/np.pi
    print ''


Number of points: 20
Estimate of pi: 3.4
Relative error (%): 8.22536130249

Number of points: 200
Estimate of pi: 3.12
Relative error (%): 0.687315510657

Number of points: 2000.0
Estimate of pi: 3.134
Relative error (%): 0.24168167

Number of points: 20000.0
Estimate of pi: 3.1438
Relative error (%): 0.0702620184601

Number of points: 2000000.0
Estimate of pi: 3.141448
Relative error (%): 0.00460446677031

So as more points are added, the estimate gets closer to the real value. Sometimes low numbers appear to get lucky, so let's have a look at the variability:


In [52]:
pi_guesses = []
for i in range(20):
    pi_guesses.append(estimate_pi(200))

print 'Number of points: 200'
print pi_guesses
print 'Mean value:', np.array(pi_guesses).mean()
print 'Standard deviation:', np.array(pi_guesses).std()

pi_guesses = []
for i in range(20):
    pi_guesses.append(estimate_pi(2e6))

print 'Number of points: 2M'
print pi_guesses
print 'Mean value:', np.array(pi_guesses).mean()
print 'Standard deviation:', np.array(pi_guesses).std()


Number of points: 200
[3.06, 3.16, 3.0, 3.02, 2.92, 3.02, 3.16, 3.18, 3.08, 3.12, 3.26, 3.06, 2.98, 3.14, 3.24, 3.16, 3.2, 3.26, 3.1, 3.12]
Mean value: 3.112
Standard deviation: 0.0923904756996
Number of points: 2M
[3.141804, 3.142372, 3.141678, 3.141238, 3.141286, 3.142784, 3.140546, 3.14064, 3.142424, 3.139482, 3.141902, 3.141988, 3.141348, 3.142602, 3.142088, 3.142558, 3.14093, 3.14298, 3.1401, 3.14212]
Mean value: 3.1416435
Standard deviation: 0.000915795146307

Ok, so not only does the average get closer, but the standard deviation also decreases considerably when more points are used. Let's look at that graphically, using histograms to analyse the spread of values:


In [53]:
pi_guesses = []
for i in range(500):
    pi_guesses.append(estimate_pi(200))

print '200 darts (blue), 10000 darts (red)'
fig2 = plt.figure(figsize=(6,5))
ax = fig2.add_subplot(111)
ax.axvline(np.pi,lw=3,linestyle='dashed',color='k')
bad_hist = ax.hist(pi_guesses,bins=30,color='b',alpha=0.65,lw=0.2)
ax.set_xlim(2.7,3.5)


pi_guesses = []
for i in range(500):
    pi_guesses.append(estimate_pi(1e4))
    
good_hist = ax.hist(pi_guesses,bins=30,color='r',alpha=0.65,lw=0.2)
ax.set_xlabel('Estimate of $\pi$')
ax.set_ylabel('Number of occurrences')
plt.draw()

#display(fig2)
#display(fig3)


200 darts (blue), 10000 darts (red)

How many darts do you have to throw until you get a value accurate to a certain amount of significant figures? In other words, how does the relative error scale with the number of darts $n$? We leave that as an excercise to the reader (though a quick Googling will tell you the answer, I find it's always nicer if I have a code to test the theory!).

More information on this problem can be found here: http://en.wikipedia.org/wiki/Monte_Carlo_integration