nbinteract
was designed to make interactive explanations easy to create. In this tutorial, we will show the process of writing a simulation from scratch and visualizing the results interactively.
In this section, you will create an interactive simulation of the Monty Hall Problem. You may continue writing code in the notebook from the previous section or create a new one for this section.
The Monty Hall Problem (Wikipedia) is a famous probability problem that has stumped many, mathematicians included. The problem goes something like this:
Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to switch your choice to door No. 2?" Is it to your advantage to switch your choice?
Perhaps unintuitively, you will win the prize about twice as often if you switch doors. We can show this through simulation.
One way to write an interactive explanation is to write functions and create interactions for each one as applicable. Composing the functions allows you to create more complicated processes. nbinteract
also provides tools for interactive visualizations as we will soon see.
Let's start with defining a function to simulate one round of the Monty Hall Problem.
In [1]:
from ipywidgets import interact
import numpy as np
import random
PRIZES = ['Car', 'Goat 1', 'Goat 2']
def monty_hall(example_num=0):
'''
Simulates one round of the Monty Hall Problem. Outputs a tuple of
(result if stay, result if switch, result behind opened door) where
each results is one of PRIZES.
'''
pick = random.choice(PRIZES)
opened = random.choice(
[p for p in PRIZES if p != pick and p != 'Car']
)
remainder = next(p for p in PRIZES if p != pick and p != opened)
return (pick, remainder, opened)
Note that the example_num
argument is passed in but not used in the monty_hall
function. Although it's unneeded for the function, it is easier to use interact
to call functions when they have arguments to manipulate:
In [2]:
interact(monty_hall, example_num=(0, 100));
By interacting with the function above, we are able to informally verify that the function never allows the host to open a door with a car behind it. Even though the function is random we are able to use interaction to examine its long-term behavior!
We'll continue by defining a function to simulate a game of Monty Hall and output the winning strategy for that game:
In [3]:
def winner(example_num=0):
'''
Plays a game of Monty Hall. If staying with the original door wins
a car, return 'stay'. Otherwise, the remaining door contains the car
so 'switch' would have won.
'''
picked, _, _ = monty_hall()
return 'stay' if picked == 'Car' else 'switch'
interact(winner, example_num=(0, 100));
Again, a bit of interaction lets us quickly examine the behavior of winner
. We can see that switch
appears more often than stay
.
Let's create an interactive bar chart of the number of times each strategy wins. We'll use nbinteract
's plotting functionality.
nbi.bar
creates a bar chart:
In [4]:
import nbinteract as nbi
nbi.bar(['a', 'b'], [4, 6])
To make an interactive chart, pass a response function in place of one or both of bar
's arguments.
In [5]:
# This function generates the x-values
def categories(n):
return list('abcdefg')[:n]
# This function generates the y-values (heights of bars)
# The y response function always takes in the x-values as its
# first argument
def offset_y(xs, offset):
num_categories = len(xs)
return np.arange(num_categories) + offset
# Each argument of the response functions is passed in as a keyword
# argument to `nbi.bar` in the same format as `interact`
nbi.bar(categories, offset_y, n=(1, 7), offset=(0, 10))
Now, let's turn back to our original goal: plotting the winners as games are played.
We can call winner
many times and use nbi.bar
to show the bar chart as it's built over the trials.
Note that we compute the results before defining our function won
. This has two benefits over running the simulation directly in won
:
won
, moving the slider from 500 to a different number back to 500 will give us a slightly different bar chart.
In [6]:
categories = ['stay', 'switch']
winners = [winner() for _ in range(1000)]
# Note that the the first argument to the y response function
# will be the x-values which we don't need
def won(_, num_games):
'''
Outputs a 2-tuple of the number of times each strategy won
after num_games games.
'''
return (winners[:num_games].count('stay'),
winners[:num_games].count('switch'))
nbi.bar(categories, won, num_games=(1, 1000))
Note that by default the plot will adjust its y-axis to match the limits of the data. We can manually set the y-axis limits to better visualize this plot being built up. We will also add labels to our plot:
In [7]:
options = {
'title': 'Number of times each strategy wins',
'xlabel': 'Strategy',
'ylabel': 'Number of wins',
'ylim': (0, 700),
}
nbi.bar(categories, won, options=options, num_games=(1, 1000))
We can get even fancy and use the Play
widget from ipywidgets
to animate the plot.
In [8]:
from ipywidgets import Play
nbi.bar(categories, won, options=options,
num_games=Play(min=0, max=1000, step=10, value=0, interval=17))
Now we have an interactive, animated bar plot showing the distribution of wins over time for both Monty Hall strategies. This is a convincing argument that switching is better than staying. In fact, the bar plot above suggests that switching is about twice as likely to win as staying!
Is switching actually twice as likely to win? We can again use simulation to answer this question by simulating sets of 50 games at a time. recording the proportion of times switch
wins.
In [9]:
def prop_wins(sample_size):
'''Returns proportion of times switching wins after sample_size games.'''
return sum(winner() == 'switch' for _ in range(sample_size)) / sample_size
interact(prop_wins, sample_size=(10, 100));
We can then define a function to play sets of games and generate a list of win proportions for each set:
In [10]:
def generate_proportions(sample_size, repetitions):
'''
Returns an array of length reptitions. Each element in the list is the
proportion of times switching won in sample_size games.
'''
return np.array([prop_wins(sample_size) for _ in range(repetitions)])
interact(generate_proportions, sample_size=(10, 100), repetitions=(10, 100));
Interacting with generate_proportions
shows the relationship between its arguments sample_size
and repetitions
more quickly than reading the function itself!
We can then use nbi.hist
to show these proportions being computed over runs.
Again, we pre-compute the simulations and interact with a function that takes a slice of the simulations to make the interaction faster.
In [11]:
# Play the game 10 times, recording the proportion of times switching wins.
# Repeat 100 times to record 100 proportions
proportions = generate_proportions(sample_size=10, repetitions=100)
def props_up_to(num_sets):
return proportions[:num_sets]
nbi.hist(props_up_to, num_sets=Play(min=0, max=100, value=0, interval=50))
As with last time, it's illustrative to specify the limits of the axes:
In [12]:
options = {
'title': 'Distribution of win proportion over 100 sets of 10 games when switching',
'xlabel': 'Proportions',
'ylabel': 'Percent per area',
'xlim': (0.3, 1),
'ylim': (0, 3),
'bins': 7,
}
nbi.hist(props_up_to, options=options, num_sets=Play(min=0, max=100, value=0, interval=50))
We can see that the distribution of wins is centered at roughly 0.66 but the distribution almost spans the entire x-axis. Will increasing the sample size make our distribution more narrow? Will increasing repetitions do the trick? Or both? We can find out through simulation and interaction.
We'll start with increasing the sample size:
In [13]:
varying_sample_size = [generate_proportions(sample_size, repetitions=100)
for sample_size in range(10, 101)]
def props_for_sample_size(sample_size):
return varying_sample_size[sample_size - 10]
changed_options = {
'title': 'Distribution of win proportions as sample size increases',
'ylim': (0, 6),
'bins': 20,
}
nbi.hist(props_for_sample_size,
options={**options, **changed_options},
sample_size=Play(min=10, max=100, value=10, interval=50))
So increasing the sample size makes the distribution narrower. We can now see more clearly that the distribution is centered at 0.66.
We can repeat the process for the number of repetitions:
In [14]:
varying_reps = [generate_proportions(sample_size=10, repetitions=reps) for reps in range(10, 101)]
def props_for_reps(reps):
return varying_reps[reps - 10]
changed_options = {
'title': 'Distribution of win proportions as repetitions increase',
'ylim': (0, 5),
}
nbi.hist(props_for_reps,
options={**options, **changed_options},
reps=Play(min=10, max=100, value=10, interval=50))
Our distribution gets smoother as the number of repetitions increases since we have more proportions to plot. However, the spread of the distribution remains the same.
Through simulation, we've shown that increasing the sample size causes our distribution of proportions to become more narrowly distributed. Increasing the number of repetitions makes our distribution look smoother when plotted but doesn't affect the distribution spread.
We've also seen through increasing the sample size shows that our distribution is centered at 0.66. This implies that the probability of winning the Monty Hall game when switching is around 0.66, and thus the probability of winning the Monty Hall game when staying is around 0.33.
Through this extended example, we've shown how interaction and visualization can help explore ideas and convincingly explain concepts. Congrats on making it through the tutorial! Let's now publish your results to the web.
Like the previous section of the tutorial, you should open a terminal and run the following command:
nbinteract {notebook}
Where {notebook}
is replaced with the name of the notebook you wish to convert (e.g. nbinteract monty.ipynb
). Now, commit and push your changes:
git add -A
git commit -m "Publish nb"
git push origin master
As before, you will now have a URL pointing to the interactive webpage generated by this tutorial:
{username}.github.io/nbinteract-tutorial/monty.html
Where {username}
is replaced with your GitHub username. For example, if my
username is SamLau95
, my URL is:
SamLau95.github.io/nbinteract-tutorial/monty.html