The Polya urn model is popular both as a statistical model and as a way to illustrate certain thought experiments.
Typically, these exercises involve randomly selecting colored balls from an urn, and each selection can change the properties of the urn's remaining contents. A common question to ask is: given some number of colors and some number of balls, what are the chances of randomly selecting a ball of a specific color?
In the code below, you'll finish writing a function that encodes this information into a dictionary, where the keys are the colors and the values are the counts of balls of that color. The contents of the urn will be handed to you as a list, where each element represents one ball in the urn and the element's value is that ball's color.
For example, the list ["blue", "blue", "green", "blue"] should result in the dictionary {"blue": 3, "green": 1}. Use the urn_dict dictionary object to store the results.
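As a way to check your own answer, here is a minimal sketch of the same tally using the standard library's collections.Counter. The name count_colors is hypothetical and is not part of the assignment; it is only one possible approach, not the expected solution.

```python
from collections import Counter

def count_colors(balls):
    """Hypothetical helper: tally how many times each color appears in a list."""
    # Counter counts occurrences of each element; dict() turns it into a plain dictionary.
    return dict(Counter(balls))

example = ["blue", "blue", "green", "blue"]
print(count_colors(example))  # {'blue': 3, 'green': 1}
```

Comparing the result of your own function against count_colors on a few small lists is a quick sanity check before running the tests.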
In [ ]:
def urn_to_dict(urn_list):
    urn_dict = {}
    ### BEGIN SOLUTION
    ### END SOLUTION
    return urn_dict
In [ ]:
u1 = ["green", "green", "blue", "green"]
a1 = set({("green", 3), ("blue", 1)})
assert a1 == set(urn_to_dict(u1).items())
u2 = ["red", "blue", "blue", "green", "yellow", "black", "black", "green", "blue", "yellow", "red", "green", "blue", "black", "yellow", "yellow", "yellow", "green", "blue", "red", "red", "blue", "red", "blue", "yellow", "yellow", "yellow"]
a2 = set({('black', 3), ('blue', 7), ('green', 4), ('red', 5), ('yellow', 8)})
assert a2 == set(urn_to_dict(u2).items())
In this part, you'll write code to compute the probabilities of certain colors using the dictionary object from the previous part. Your code will receive a dictionary of colors with their counts, and a "query" color, and you will need to return the chances of randomly selecting a ball of that query color.
Remember, probability is a fraction: the numerator is the number of occurrences of the event you're interested in, and the denominator is the total number of possible outcomes. Here, that means the number of balls of the query color divided by the total number of balls in the urn.
For example, if the input dictionary is {"red": 3, "blue": 1} and the query color is "blue", then the fraction you would return is 1/4, or 0.25 (probabilities should always be between 0 and 1). Put your answer in the variable prob.
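The fraction described above can be sketched as follows. This is only an illustration under two assumptions: the helper name color_fraction is hypothetical, and an empty urn or a color missing from the dictionary is treated as probability 0.

```python
def color_fraction(counts, query):
    """Hypothetical helper: fraction of balls in `counts` that have color `query`."""
    total = sum(counts.values())  # denominator: every ball in the urn
    if total == 0:
        return 0.0  # assumption: an empty urn gives probability 0
    # dict.get returns 0 when the query color is not in the urn at all
    return counts.get(query, 0) / total

print(color_fraction({"red": 3, "blue": 1}, "blue"))  # 0.25
```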
In [ ]:
def chances_of_color(counts, query):
    prob = 0.0
    ### BEGIN SOLUTION
    ### END SOLUTION
    return prob
In [ ]:
import numpy.testing as t
c1 = {"blue": 3, "red": 1}
t.assert_allclose(chances_of_color(c1, "blue"), 0.75)
In [ ]:
import numpy.testing as t
c2 = {"red": 934, "blue": 493859, "yellow": 31, "green": 3892, "black": 487}
t.assert_allclose(chances_of_color(c2, "green"), 0.007796427505443677)
In [ ]:
import numpy.testing as t
c3 = {"red": 5, "blue": 5, "yellow": 5, "green": 5, "black": 5}
t.assert_allclose(chances_of_color(c3, "orange"), 0.0)
In this part, you'll do the opposite of what you implemented in Part B: you'll get a dictionary and a query color, but you'll need to return the chances of drawing a ball that is not the same color as the query.
For example, if the input dictionary is {"red": 3, "blue": 1} and the query color is "blue", then the fraction you would return is 3/4, or 0.75. Put your answer in the variable prob.
HINT: You can use the function you wrote in Part B to help!
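The idea behind the hint is that the two probabilities always add up to 1, since every ball either matches the query color or does not. Here is a small sketch that counts the non-matching balls directly and confirms the two fractions sum to 1. The name complement_fraction is hypothetical, and the sketch assumes the urn is not empty.

```python
def complement_fraction(counts, query):
    """Hypothetical helper: fraction of balls whose color is NOT `query`.

    Assumes the urn contains at least one ball.
    """
    total = sum(counts.values())
    matching = counts.get(query, 0)  # 0 if the color never appears
    return (total - matching) / total

counts = {"red": 3, "blue": 1}
p_blue = counts["blue"] / sum(counts.values())
p_not_blue = complement_fraction(counts, "blue")
print(p_blue, p_not_blue, p_blue + p_not_blue)  # 0.25 0.75 1.0
```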
In [ ]:
def chances_of_not_color(counts, query):
    prob = 0.0
    ### BEGIN SOLUTION
    ### END SOLUTION
    return prob
In [ ]:
import numpy.testing as t
c1 = {"blue": 3, "red": 1}
t.assert_allclose(chances_of_not_color(c1, "blue"), 0.25)
In [ ]:
import numpy.testing as t
c2 = {"red": 934, "blue": 493859, "yellow": 31, "green": 3892, "black": 487}
t.assert_allclose(chances_of_not_color(c2, "blue"), 0.010705063871811693)
In [ ]:
import numpy.testing as t
c3 = {"red": 5, "blue": 5, "yellow": 5, "green": 5, "black": 5}
t.assert_allclose(chances_of_not_color(c3, "orange"), 1.0)
Even more interesting is when we start talking about combinations of colors. Let's say I'm reaching into a Polya urn to pull out two balls; it's valuable to know what my chances of at least one ball being a certain color would be.
In the function below, complete the code that will compute the chances of drawing at least one ball of the given color, provided you randomly select a certain number of balls.
Remember, you compute probability exactly as before--the number of events of interest (selecting a certain number of balls with at least one of a certain color) divided by the total number of possible events (all possible draws)--only this time you'll need to account for combinations of multiple balls. To do this, you'll have the help of the itertools module. If you can't remember how the module works, consult its documentation.
For example, if I give you an urn list of ["blue", "green", "red"], the number 2, and the query color "blue", then you would return 2/3, or about 0.66667. (There are three possible combinations of 2 balls: blue-green, blue-red, and green-red. Two of these three combinations contain the query color blue.)
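The counting in this example can be sketched with itertools.combinations, which yields every unordered selection of a given size. The function name fraction_with_color is hypothetical, and the sketch assumes the number of balls drawn is no larger than the number of balls in the urn. Balls are treated as distinct by position, so two balls of the same color still count as two different balls.

```python
import itertools

def fraction_with_color(balls, k, color):
    """Hypothetical helper: fraction of k-ball draws containing at least one `color` ball.

    Assumes 0 <= k <= len(balls), so at least one draw exists.
    """
    # Every unordered way of picking k balls out of the urn
    draws = list(itertools.combinations(balls, k))
    # Count the draws in which the query color appears at least once
    hits = sum(1 for draw in draws if color in draw)
    return hits / len(draws)

print(fraction_with_color(["blue", "green", "red"], 2, "blue"))  # 0.6666666666666666
```

Enumerating every draw is fine for urns of this size, but the number of combinations grows quickly as the urn and the draw size get larger.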
In [ ]:
import itertools
def select_chances(urn_list, number, color):
    prob = 0.0
    ### BEGIN SOLUTION
    ### END SOLUTION
    return prob
In [ ]:
import numpy.testing as t
q1 = ["blue", "green", "red"]
t.assert_allclose(select_chances(q1, 2, "red"), 2/3)
q2 = ["red", "blue", "blue", "green", "yellow", "black", "black", "green", "blue", "yellow", "red", "green", "blue", "black", "yellow", "yellow", "yellow", "green", "blue", "red", "red", "blue", "red", "blue", "yellow", "yellow", "yellow"]
t.assert_allclose(select_chances(q2, 3, "red"), 0.4735042735042735)
One final wrinkle: let's say I'm no longer picking colored balls simultaneously from the urn, but rather in sequence--that is, one right after the other. Now I can ask, for a given urn and a certain number of balls I'm going to pick, what are the chances that I draw a ball of a certain color first?
For example, if I give you an urn list of ["blue", "green", "red"], the number 2, and the query color "blue", then you would return 2/6, or about 0.33333.
(There are six possible ways of drawing two balls in sequence: blue-green, blue-red, green-blue, green-red, red-blue, and red-green. Two of those six involve drawing the blue one first.)
You are welcome to again use itertools. Remember to store your answer in the variable prob.
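When order matters, itertools.permutations plays the role that combinations played for simultaneous draws. The sketch below lists the ordered draws for the three-ball example and counts those that start with the query color. The name fraction_color_first is hypothetical, and the sketch assumes at least one ball is drawn and that the urn holds at least that many balls.

```python
import itertools

def fraction_color_first(balls, k, color):
    """Hypothetical helper: fraction of ordered k-ball draws whose first ball is `color`.

    Assumes 1 <= k <= len(balls).
    """
    # Every ordered way of drawing k balls, one after the other, without replacement
    draws = list(itertools.permutations(balls, k))
    # Keep only the draws whose first ball has the query color
    hits = sum(1 for draw in draws if draw[0] == color)
    return hits / len(draws)

# Show each ordered draw of two balls from the three-ball urn
for draw in itertools.permutations(["blue", "green", "red"], 2):
    print(draw)

print(fraction_color_first(["blue", "green", "red"], 2, "blue"))  # 0.3333333333333333
```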
In [ ]:
import itertools
def select_chances_first(urn_list, number, color):
    prob = 0.0
    ### BEGIN SOLUTION
    ### END SOLUTION
    return prob
In [ ]:
import numpy.testing as t
q1 = ["blue", "green", "red"]
t.assert_allclose(select_chances_first(q1, 2, "red"), 2/6)
q2 = ["red", "blue", "blue", "green", "yellow", "black", "black", "green", "blue", "yellow", "red", "green", "blue", "black", "yellow", "yellow", "yellow", "green", "blue", "red", "red", "blue", "red", "blue", "yellow", "yellow", "yellow"]
t.assert_allclose(select_chances_first(q2, 3, "red"), 0.18518518518518517)