Dice Simulaiton

In this excercise, we want to simulate the outcome of rolling dice. We will walk through several levels of building up funcitonality.

Single Die

Let's create a function that will return a random value between one and six that emulates the outcome of the roll of one die. Python has a random number package called random.



In [ ]:

    
import random
def single_die():
    """Outcome of a single die roll"""
    return random.randint(1,6)

Check

To check our function, let's call it 50 times and print the output. We should see numbers between 1 and 6.



In [ ]:

    
for _ in range(50):
    print(single_die(),end=' ')

Multiple Dice Roll

Now let's make a function that returns the sum of N 6-sided dice being rolled.



In [ ]:

    
def dice_roll(dice_count):
    """Outcome of a rolling dice_count dice
    
    Args:
        dice_count (int): number of dice to roll
    
    Returns:
        int: sum of face values of dice
        
    """
    out = 0
    for _ in range(dice_count):
        out += single_die()
    return out

Check

Let's perform the same check with 100 values and make sure we see values in the range of 2 to 12.



In [ ]:

    
for _ in range(100):
    print(dice_roll(2), end=' ')

Capture the outcome of multiple rolls

Write a function that will return a list of values for many dice rolls



In [ ]:

    
def dice_rolls(dice_count, rolls_count):
    """Return list of many dice rolls
    
    Args:
        dice_count (int): number of dice to roll
        rolls_count (int): number of rolls to do
        
    Returns:
        list: list of dice roll values.
        
    """
    out = []
    for _ in range(rolls_count):
        out.append(dice_roll(dice_count))
    return out

print(dice_rolls(2,100))

Plot Result

Make a function that plots the histogram of the dice values.



In [ ]:

    
import pylab as plt
%matplotlib inline

plt.rcParams['figure.figsize'] = (10, 4)

def dice_histogram(dice_count, rolls_count, bins):
    """Plots outcome of many dice rolls
    
    Args:
        dice_count (int): number of dice to roll
        rolls_count (int): number of rolls to do
        bins (int): number of histogram bins
        
    """
    plt.hist(dice_rolls(dice_count, rolls_count),bins)
    plt.show()
    
dice_histogram(2, 10000, 200)

Aside

The outputs follow a binomial distribution. As the number of dice increase, the binomial distribution approaches a Gaussian distribution due to the Central Limit Theorem (CLT). Try making a histogram with 100 dice. The resulting plot is a "Bell Curve" that represents the Gaussian distribution.



In [ ]:

    
dice_histogram(100, 10000, 200)

Slow?

That seemed slow. How do we time it?



In [ ]:

    
import time
start = time.time()

dice_histogram(100, 10000, 200)

print(time.time()-start, 'seconds')

Seems like a long time... Can we make it faster? Yes!

Optimize w/ Numpy

Using lots of loops in python is not usually the most efficient way to accomplish numeric tasks. Instead, we should use numpy. With numpy we can "vectorize" operations and under the hood numpy is doing the computation with C code that has a python interface. We don't have to worry about anything under the hood.

2-D Array of Values

Start by checking out numpy's randint function. Let's rewrite dice_rolls using numpy functions and no loops.

To do this, we are going to use np.random.randint to create a 2-D array of random dice rolls. That array will have dice_count rows and rolls_count columns--ie, the size of the array is (dice_count, rolls_count).



In [ ]:

    
import numpy as np
np.random.randint(1,7,(2,10))

The result is a np.array object with is like a list, but better. The most notable difference is that we can to element-wise math operations on numpy arrays easily.

Column sum

To find the roll values, we need to sum up the 2-D array by each column.



In [ ]:

    
np.sum(np.random.randint(1,7,(2,10)),axis=0)

Let's use this knowledge to rewrite `dice_rolls`



In [ ]:

    
def dice_rolls_np(dice_count, rolls_count):
    """Return list of many dice rolls
    
    Args:
        dice_count (int): number of dice to roll
        rolls_count (int): number of rolls to do
        
    Returns:
        np.array: list of dice roll values.
        
    """
    
    return np.sum(
        np.random.randint(1,7,(dice_count,rolls_count)),
        axis=0)


print(dice_rolls(2,100))

Histogram and timeit



In [ ]:

    
def dice_histogram_np(dice_count, rolls_count, bins):
    """Plots outcome of many dice rolls
    
    Args:
        dice_count (int): number of dice to roll
        rolls_count (int): number of rolls to do
        bins (int): number of histogram bins
        
    """
    plt.hist(dice_rolls_np(dice_count, rolls_count),bins)
    plt.show()
    
start = time.time()

dice_histogram_np(100, 10000, 200)

print(time.time()-start, 'seconds')

That is way faster!

`%timeit`

Jupyter has a magic function to time function execution. Let's try that:



In [ ]:

    
%timeit dice_rolls_np(100, 1000)



In [ ]:

    
%timeit dice_rolls(100, 1000)

The improvement in the core function call was two orders of magnitude, but when we timed it initially, we were also waiting for the plot to render which consumed the majority of the time.

Risk Game Simulation

In risk two players roll dice in each battle to determine how many armies are lost on each side.

Here are the rules:

The attacking player rolls three dice
The defending player rolls two dice
The defending player wins dice ties
The dice are matched in sorted order
The outcome is a measure of the net increase in armies for the the attacking player with values of -2, -1, 0, 1, 2

Let's make a function that simulates the outcome of one Risk battle and outputs the net score.

The functions we created in the first part of this tutorial are not useful for this task.



In [ ]:

    
def risk_battle():
    """Risk battle simulation"""
    
    # get array of three dice values
    attacking_dice = np.random.randint(1,7,3)
    
    # get array of two dice values
    defending_dice = np.random.randint(1,7,2)
    
    # sort both sets and take top two values
    attacking_dice_sorted = np.sort(attacking_dice)[::-1]
    defending_dice_sorted = np.sort(defending_dice)[::-1]
    
    # are the attacking values greater?
    attack_wins = attacking_dice_sorted[:2] > defending_dice_sorted[:2]
    
    # convert boolean values to -1, +1
    attack_wins_pm = attack_wins*2 - 1
    
    # sum up these outcomes
    return np.sum(attack_wins_pm)
    
for _ in range(50):
    print(risk_battle(), end=' ')

Histogram

Let's plot the histogram. Instead of making a function, let's just use list comprehension to make a list and then plot.



In [ ]:

    
outcomes = [risk_battle() for _ in range(10000)]
plt.hist(outcomes)
plt.show()

Expected Margin

If we run many simulations, how many armies do we expect the attacker to be ahead by on average?



In [ ]:

    
np.mean([risk_battle() for _ in range(10000)])

Dice Simulaiton

Single Die

Check

Multiple Dice Roll

Check

Capture the outcome of multiple rolls

Plot Result

Aside

Slow?

Optimize w/ Numpy

2-D Array of Values

Column sum

Let's use this knowledge to rewrite dice_rolls

Histogram and timeit

%timeit

Risk Game Simulation

Histogram

Expected Margin

Let's use this knowledge to rewrite `dice_rolls`

`%timeit`