Oct 30, 2014
Deterministic problems.
Some problems in science and technology are desrcribed by "exact" mathematics, leading to "precise" results
Example: throwing a ball up in the air ($y(t)=v_0t - \frac{1}{2}gt^2$)
Stochastic problems.
Some problems appear physically uncertain
Examples: rolling a die, molecular motion, games
Use random numbers to mimic the uncertainty of the experiment.
Python has a random
module for drawing random numbers.
random.random()
draws random numbers in $[0,1)$:
In [1]:
import random
random.random()
random.random()
random.random()
Notice
The sequence of random numbers is produced by a deterministic algorithm - the numbers just appear random.
random.random()
generates random numbers that are uniformly distributed in the interval $[0,1)$
random.uniform(a, b)
generates random numbers uniformly distributed in $[a,b)$
"Uniformly distributed" means that if we generate a large set of numbers, no part of $[a,b)$ gets more numbers than others
In [1]:
%matplotlib inline
N = 500 # no of samples
x = range(N)
y = [random.uniform(-1,1) for i in x]
from scitools.std import plot
plot(x, y, '+', axis=[0,N-1,-1.2,1.2])
In [1]:
from numpy import random
r = random.random() # one no between 0 and 1
r = random.random(size=10000) # array with 10000 numbers
r = random.uniform(-1, 10) # one no between -1 and 10
r = random.uniform(-1, 10, size=10000) # array
Vectorized drawing is important for speeding up programs!
Possible problem: two random
modules, one Python "built-in" and one in numpy
(np
)
Convention: use random
(Python) and np.random
In [1]:
random.uniform(-1, 1) # scalar number
import numpy as np
np.random.uniform(-1, 1, 100000) # vectorized
In [1]:
import random
r = random.randint(a, b) # a, a+1, ..., b
import numpy as np
r = np.random.randint(a, b+1, N) # b+1 is not included
r = np.random.random_integers(a, b, N) # b is included
In [1]:
import random
N = 10000
eyes = [random.randint(1, 6) for i in range(N)]
M = 0 # counter for successes: how many times we get 6 eyes
for outcome in eyes:
if outcome == 6:
M += 1
print 'Got six %d times out of %d' % (M, N)
print 'Probability:', float(M)/N
In [1]:
import sys, numpy as np
N = int(sys.argv[1])
eyes = np.random.randint(1, 7, N)
success = eyes == 6 # True/False array
six = np.sum(success) # treats True as 1, False as 0
print 'Got six %d times out of %d' % (six, N)
print 'Probability:', float(M)/N
Impoartant!
Use sum
from numpy
and not Python's built-in sum
function! (The latter is slow, often making a vectorized version slower than the scalar version.)
Debugging programs with random numbers is difficult because the numbers produced vary each time we run the program
For debugging it is important that a new run reproduces the sequence of random numbers in the last run
This is possible by fixing the seed of the random
module:
random.seed(121)
(int
argument)
In [1]:
import random
random.seed(2)
['%.2f' % random.random() for i in range(7)]
['%.2f' % random.random() for i in range(7)]
random.seed(2) # repeat the random sequence
['%.2f' % random.random() for i in range(7)]
In [1]:
awards = ['car', 'computer', 'ball', 'pen']
import random
random.choice(awards)
Alternatively, we can compute a random index:
In [1]:
index = random.randint(0, len(awards)-1)
awards[index]
We can also shuffle the list randomly, and then pick any element:
In [1]:
random.shuffle(awards)
awards[0]
In [1]:
# A: ace, J: jack, Q: queen, K: king
# C: clubs, D: diamonds, H: hearts, S: spades
def make_deck():
ranks = ['A', '2', '3', '4', '5', '6', '7',
'8', '9', '10', 'J', 'Q', 'K']
suits = ['C', 'D', 'H', 'S']
deck = []
for s in suits:
for r in ranks:
deck.append(s + r)
random.shuffle(deck)
return deck
deck = make_deck()
Draw a card at random:
In [1]:
deck = make_deck()
card = deck[0]
del deck[0]
card = deck.pop(0) # return and remove element with index 0
In [1]:
def deal_hand(n, deck):
hand = [deck[i] for i in range(n)]
del deck[:n]
return hand, deck
In [1]:
def deal(cards_per_hand, no_of_players):
deck = make_deck()
hands = []
for i in range(no_of_players):
hand, deck = deal_hand(cards_per_hand, deck)
hands.append(hand)
return hands
players = deal(5, 4)
import pprint; pprint.pprint(players)
Resulting output:
In [1]:
[['D4', 'CQ', 'H10', 'DK', 'CK'],
['D7', 'D6', 'SJ', 'S4', 'C5'],
['C3', 'DQ', 'S3', 'C9', 'DJ'],
['H6', 'H9', 'C6', 'D5', 'S6']]
In [1]:
def same_rank(hand, n_of_a_kind):
ranks = [card[1:] for card in hand]
counter = 0
already_counted = []
for rank in ranks:
if rank not in already_counted and \
ranks.count(rank) == n_of_a_kind:
counter += 1
already_counted.append(rank)
return counter
In [1]:
def same_suit(hand):
suits = [card[0] for card in hand]
counter = {} # counter[suit] = how many cards of suit
for suit in suits:
# attention only to count > 1:
count = suits.count(suit)
if count > 1:
counter[suit] = count
return counter
In [1]:
The hand D4, CQ, H10, DK, CK
has 1 pairs, 0 3-of-a-kind and
2+2 cards of the same suit.
The hand D7, D6, SJ, S4, C5
has 0 pairs, 0 3-of-a-kind and
2+2 cards of the same suit.
The hand C3, DQ, S3, C9, DJ
has 1 pairs, 0 3-of-a-kind and
2+2 cards of the same suit.
The hand H6, H9, C6, D5, S6
has 0 pairs, 1 3-of-a-kind and
2 cards of the same suit.
In [1]:
class Deck:
def __init__(self, shuffle=True):
ranks = ['A', '2', '3', '4', '5', '6', '7',
'8', '9', '10', 'J', 'Q', 'K']
suits = ['C', 'D', 'H', 'S']
self.deck = [s+r for s in suits for r in ranks]
random.shuffle(self.deck)
def hand(self, n=1):
"""Deal n cards. Return hand as list."""
hand = [self.deck[i] for i in range(n)]
del self.deck[:n]
# alternative:
# hand = [self.pop(0) for i in range(n)]
return hand
def putback(self, card):
"""Put back a card under the rest."""
self.deck.append(card)
In [1]:
class Card:
def __init__(self, suit, rank):
self.card = suit + str(rank)
class Hand:
def __init__(self, list_of_cards):
self.hand = list_of_cards
class Deck:
def __init__(self, shuffle=True):
ranks = ['A', '2', '3', '4', '5', '6', '7',
'8', '9', '10', 'J', 'Q', 'K']
suits = ['C', 'D', 'H', 'S']
self.deck = [Card(s,r) for s in suits for r in ranks]
random.shuffle(self.deck)
def deal(self, n=1):
hand = Hand([self.deck[i] for i in range(n)])
del self.deck[:n]
return hand
def putback(self, card):
self.deck.append(card)
Warning:
To print a Deck
instance, Card
and Hand
must have __repr__
methods that return a "pretty print" string (see the book), because print
on list object applies __repr__
to print each element.
Is the class version better than the function version?
Yes! The function version has functions updating a global variable deck
, as in
In [1]:
hand, deck = deal_hand(5, deck)
This is often considered bad programming. In the class version we avoid a global variable - the deck is stored and updated inside the class. Errors are less likely to sneak in in the class version.
What is the probability that a certain event $A$ happens?
Simulate $N$ events and count how many times $M$ the event $A$ happens. The probability of the event $A$ is then $M/N$ (as $N\rightarrow\infty$).
Example:
You throw two dice, one black and one green. What is the probability that the number of eyes on the black is larger than that on the green?
In [1]:
import random
import sys
N = int(sys.argv[1]) # no of experiments
M = 0 # no of successful events
for i in range(N):
black = random.randint(1, 6) # throw black
green = random.randint(1, 6) # throw green
if black > green: # success?
M += 1
p = float(M)/N
print 'probability:', p
In [1]:
import sys
N = int(sys.argv[1]) # no of experiments
import numpy as np
r = np.random.random_integers(1, 6, (2, N))
black = r[0,:] # eyes for all throws with black
green = r[1,:] # eyes for all throws with green
success = black > green # success[i]==True if black[i]>green[i]
M = np.sum(success) # sum up all successes
p = float(M)/N
print 'probability:', p
In [1]:
combinations = [(black, green)
for black in range(1, 7)
for green in range(1, 7)]
How many of the (black, green)
pairs that have
the property black > green
?
In [1]:
success = [black > green for black, green in combinations]
M = sum(success)
print 'probability:', float(M)/len(combinations)
Terminal> python black_gt_green_exact.py
probability: 0.416666666667
Terminal> time python black_gt_green.py 10000
probability: 0.4158
Terminal> time python black_gt_green.py 1000000
probability: 0.416516
real 0m1.725s
Terminal> time python black_gt_green.py 10000000
probability: 0.4164688
real 0m17.649s
Terminal> time python black_gt_green_vec.py 10000000
probability: 0.4170253
real 0m0.816s
In [1]:
import sys
N = int(sys.argv[1]) # no of experiments
import random
start_capital = 10
money = start_capital
for i in range(N):
money -= 1 # pay for the game
black = random.randint(1, 6) # throw black
green = random.randint(1, 6) # throw brown
if black > green: # success?
money += 2 # get award
net_profit_total = money - start_capital
net_profit_per_game = net_profit_total/float(N)
print 'Net profit per game in the long run:', net_profit_per_game
Terminaldd> python black_gt_green_game.py 1000000
Net profit per game in the long run: -0.167804
In [1]:
import sys
N = int(sys.argv[1]) # no of experiments
import numpy as np
r = np.random.random_integers(1, 6, size=(2, N))
money = 10 - N # capital after N throws
black = r[0,:] # eyes for all throws with black
green = r[1,:] # eyes for all throws with green
success = black > green # success[i] is true if black[i]>green[i]
M = np.sum(success) # sum up all successes
money += 2*M # add all awards for winning
print 'Net profit per game in the long run:', (money-10)/float(N)
In [1]:
hat = []
for color in 'black', 'red', 'blue':
for i in range(4):
hat.append(color)
Choose two balls at random:
In [1]:
import random
index = random.randint(0, len(hat)-1) # random index
ball1 = hat[index]; del hat[index]
index = random.randint(0, len(hat)-1) # random index
ball2 = hat[index]; del hat[index]
# or:
random.shuffle(hat) # random sequence of balls
ball1 = hat.pop(0)
ball2 = hat.pop(0)
In [1]:
def new_hat(): # make a new hat with 12 balls
return [color for color in 'black', 'red', 'blue'
for i in range(4)]
def draw_ball(hat):
index = random.randint(0, len(hat)-1)
color = hat[index]; del hat[index]
return color, hat # (return hat since it is modified)
# run experiments:
n = input('How many balls are to be drawn? ')
N = input('How many experiments? ')
M = 0 # no of successes
for e in range(N):
hat = new_hat()
balls = [] # the n balls we draw
for i in range(n):
color, hat = draw_ball(hat)
balls.append(color)
if balls.count('black') >= 2: # two black balls or more?
M += 1
print 'Probability:', float(M)/N
Terminal> python balls_in_hat.py
How many balls are to be drawn? 2
How many experiments? 10000
Probability: 0.0914
Terminal> python balls_in_hat.py
How many balls are to be drawn? 8
How many experiments? 10000
Probability: 0.9346
Terminal> python balls_in_hat.py
How many balls are to be drawn? 4
How many experiments? 10000
Probability: 0.4033
In [1]:
import random
number = random.randint(1, 100) # the computer's secret number
attempts = 0 # no of attempts to guess the number
guess = 0 # user's guess at the number
while guess != number:
guess = input('Guess a number: ')
attempts += 1
if guess == number:
print 'Correct! You used', attempts, 'attempts!'
break
elif guess < number: print 'Go higher!'
else: print 'Go lower!'
Idea: compute $f_m$ by averaging $N$ function values. To choose the $N$ coordinates $x_0,\ldots,x_{N-1}$ we use random numbers in $[a,b]$. Then
In [1]:
def MCint(f, a, b, n):
s = 0
for i in range(n):
x = random.uniform(a, b)
s += f(x)
I = (float(b-a)/n)*s
return I
In [1]:
def MCint_vec(f, a, b, n):
x = np.random.uniform(a, b, n)
s = np.sum(f(x))
I = (float(b-a)/n)*s
return I
Remark:
Monte Carlo integration is slow for $\int f(x)dx$ (slower than the Trapezoidal rule, e.g.), but very efficient for integrating functions of many variables $\int f(x_1,x_2,\ldots,x_n)dx_1dx_2\cdots dx_n$
Choose a box $B=[x_L,x_H]\times[y_L,y_H]$ with some geometric object $G$ inside, what is the area of $G$?
Method: draw $N$ points at random inside $B$, count how many, $M$, that fall within $G$, $G$'s area is then $M/N\times\hbox{area}(B)$
Special case: $G$ is the geometry between $y=f(x)$ and the $x$ axis for $x\in [a,b]$, i.e., the area of $G$ is $\int_a^bf(x)dx$, and our method gives $\int_a^bf(x)dx \approx {M\over N}m(b-a)$ if $B$ is the box $[a,b]\times [0,m]$
Scalar code:
In [1]:
def MCint_area(f, a, b, n, fmax):
below = 0 # counter for no of points below the curve
for i in range(n):
x = random.uniform(a, b)
y = random.uniform(0, fmax)
if y <= f(x):
below += 1
area = below/float(n)*(b-a)*fmax
return area
Vectorized code:
In [1]:
from numpy import *
def MCint_area_vec(f, a, b, n, fmax):
x = np.random.uniform(a, b, n)
y = np.random.uniform(0, fmax, n)
below = y[y < f(x)].size
area = below/float(n)*(b-a)*fmax
return area
Basics of random walk in 1D:
One particle moves to the left and right with equal probability
$n$ particles start at $x=0$ at time $t=0$ - how do the particles get distributed over time?
Applications:
molecular motion
heat transport
quantum mechanics
polymer chains
population genetics
brain research
hazard games
pricing of financial instruments
In [1]:
from scitools.std import plot
import random
np = 4 # no of particles
ns = 100 # no of steps
positions = zeros(np) # all particles start at x=0
HEAD = 1; TAIL = 2 # constants
xmax = sqrt(ns); xmin = -xmax # extent of plot axis
for step in range(ns):
for p in range(np):
coin = random_.randint(1,2) # flip coin
if coin == HEAD:
positions[p] += 1 # step to the right
elif coin == TAIL:
positions[p] -= 1 # step to the left
plot(positions, y, 'ko3',
axis=[xmin, xmax, -0.2, 0.2])
time.sleep(0.2) # pause between moves
In [1]:
mean_pos = mean(positions)
stdev_pos = std(positions) # "width" of particle cluster
# shape of particle cluster:
from scitools.std import compute_histogram
pos, freq = compute_histogram(positions, nbins=int(xmax),
piecewise_constant=True)
plot(pos, freq, 'b-')
In [1]:
moves = numpy.random.random_integers(1, 2, size=np*ns)
moves = 2*moves - 3 # -1, 1 instead of 1, 2
moves.shape = (ns, np)
Evolution through time:
In [1]:
positions = numpy.zeros(np)
for step in range(ns):
positions += moves[step, :]
# can do some statistics:
print numpy.mean(positions), numpy.std(positions)
In [1]:
def random_walk_2D(np, ns, plot_step):
xpositions = numpy.zeros(np)
ypositions = numpy.zeros(np)
NORTH = 1; SOUTH = 2; WEST = 3; EAST = 4
for step in range(ns):
for i in range(len(xpositions)):
direction = random.randint(1, 4)
if direction == NORTH:
ypositions[i] += 1
elif direction == SOUTH:
ypositions[i] -= 1
elif direction == EAST:
xpositions[i] += 1
elif direction == WEST:
xpositions[i] -= 1
return xpositions, ypositions
In [1]:
def random_walk_2D(np, ns, plot_step):
xpositions = zeros(np)
ypositions = zeros(np)
moves = numpy.random.random_integers(1, 4, size=ns*np)
moves.shape = (ns, np)
NORTH = 1; SOUTH = 2; WEST = 3; EAST = 4
for step in range(ns):
this_move = moves[step,:]
ypositions += where(this_move == NORTH, 1, 0)
ypositions -= where(this_move == SOUTH, 1, 0)
xpositions += where(this_move == EAST, 1, 0)
xpositions -= where(this_move == WEST, 1, 0)
return xpositions, ypositions
We plot every plot_step
step
One plot on the screen + one hardcopy for movie file
Extent of axis: it can be shown that after $n_s$ steps, the typical width of the cluster of particles (standard deviation) is of order $\sqrt{n_s}$, so we can set min/max axis extent as, e.g.,
In [1]:
xymax = 3*sqrt(ns); xymin = -xymax
Inside for
loop over steps:
In [1]:
# just plot every plot_step steps:
if (step+1) % plot_step == 0:
plot(xpositions, ypositions, 'ko',
axis=[xymin, xymax, xymin, xymax],
title='%d particles after %d steps' % \
(np, step+1),
savefig='tmp_%03d.png' % (step+1))
Can classes be used to implement a random walk?
Yes, it sounds natural with class Particle
, holding the position of a particle as attributes and with a method move
for moving the particle one step
Class Particles
holds a list of Particle
instances and has a method move
for moving all particles one step and a method moves
for moving all particles through all steps
Additional methods in class Particles
can plot and compute statistics
Downside: with class Particle
the code is scalar - a vectorized version must
use arrays inside class Particles
instead of a list of Particle
instances
The implementation is an exercise
Draw a uniformly distributed random number in $[0,1)$:
In [1]:
import random
r = random.random()
Draw a uniformly distributed random number in $[a,b)$:
In [1]:
r = random.uniform(a, b)
Draw a uniformly distributed random integer in $[a,b]$:
In [1]:
i = random.randint(a, b)
In [1]:
import numpy as np
r = np.random.random(n)
Draw $n$ uniformly distributed random numbers in $[a,b)$:
In [1]:
r = np.random.uniform(a, b, n)
Draw $n$ uniformly distributed random integers in $[a,b]$:
In [1]:
i = np.random.randint(a, b+1, n)
i = np.random.random_integers(a, b, n)
Probability: perform $N$ experiments, count $M$ successes, then success has probability $M/N$ ($N$ must be large)
Monte Carlo simulation: let a program do $N$ experiments and count $M$ (simple method for probability problems)
Recall difference equation for the development of an investment $x_0$ with annual interest rate $p$:
But:
In reality, $p$ is uncertain in the future
Let us model this uncertainty by letting $p$ be random
Assume the interest is added every month:
A particular realization $x_n, p_n$, $n=0,1,\ldots,N$, is called a path (through time) or a realization. We are interested in the statistics of many paths.
Remark:
The development of $p$ is like a random walk, but the "particle" moves at each time level with probability $1/M$ (not 1 - always - as in a normal random walk).
In [1]:
def simulate_one_path(N, x0, p0, M, m):
x = zeros(N+1)
p = zeros(N+1)
index_set = range(0, N+1)
x[0] = x0
p[0] = p0
for n in index_set[1:]:
x[n] = x[n-1] + p[n-1]/(100.0*12)*x[n-1]
# update interest rate p:
r = random.randint(1, M)
if r == 1:
# adjust gamma:
r = random.randint(1, 2)
gamma = m if r == 1 else -m
else:
gamma = 0
pn = p[n-1] + gamma
p[n] = pn if 1 <= pn <= 15 else p[n-1]
return x, p
In [1]:
def simulate_n_paths(n, N, L, p0, M, m):
xm = zeros(N+1)
pm = zeros(N+1)
for i in range(n):
x, p = simulate_one_path(N, L, p0, M, m)
# accumulate paths:
xm += x
pm += p
# compute average:
xm /= float(n)
pm /= float(n)
return xm, pm
In [1]:
x0 = 1 # initial investment
p0 = 5 # initial interest rate
N = 10*12 # number of months
M = 3 # p changes (on average) every M months
n = 1000 # number of simulations
m = 0.5 # adjustment of p
We may add some graphics in the program:
* plot some realizations of $x_n$ and $p_n$
* plot the mean $x_n$ with plus/minus one standard deviation
* plot the mean $p_n$ with plus/minus one standard deviation
See the book for graphics details (good example on updating several different plots simultaneously in a simulation)