The Polya urn model is popular both in statistics and as a way to illustrate certain thought exercises.
Typically, these exercises involve randomly selecting colored balls from an urn, and each selection can change what remains in the urn. A common question to ask is: given some number of colors and some number of balls, what are the chances of randomly selecting a ball of a specific color?
Write a function named urn_to_dict which takes a list of colors as its only argument and returns a dictionary of color counts.
The contents of the urn will be handed to you in a list form (the input argument), where each element of the list represents a ball in an urn, and the element itself will be a certain color. You then need to count how many times each color occurs in the list, and assemble those counts in the dictionary that your function should return.
For example, the list ["blue", "blue", "green", "blue"] should result in the dictionary {"blue": 3, "green": 1}. Use the urn_dict dictionary object to store the results.
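As a minimal sketch of the counting step described above (one possible approach, not the only one), the function below walks the list once and keeps a running tally per color. It assumes every element of the list is hashable, such as a string.

```python
def urn_to_dict(urn):
    """Count how many times each color appears in the urn list."""
    urn_dict = {}
    for color in urn:
        # Colors not seen yet start at 0; then add one for this ball.
        urn_dict[color] = urn_dict.get(color, 0) + 1
    return urn_dict


print(urn_to_dict(["blue", "blue", "green", "blue"]))  # {'blue': 3, 'green': 1}
```

Using dict.get with a default of 0 avoids a separate check for whether a color has already been seen.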
In [ ]:
u1 = ["green", "green", "blue", "green"]
a1 = set({("green", 3), ("blue", 1)})
assert a1 == set(urn_to_dict(u1).items())
In [ ]:
u2 = ["red", "blue", "blue", "green", "yellow", "black", "black", "green", "blue", "yellow", "red", "green", "blue", "black", "yellow", "yellow", "yellow", "green", "blue", "red", "red", "blue", "red", "blue", "yellow", "yellow", "yellow"]
a2 = set({('black', 3), ('blue', 7), ('green', 4), ('red', 5), ('yellow', 8)})
assert a2 == set(urn_to_dict(u2).items())
In this part, you'll write code to compute the probabilities of certain colors using the dictionary object from the previous part. Your code will receive a dictionary of colors with their counts (i.e., the output of Part A) and a "query" color, and you will need to return the chances of randomly selecting a ball of that query color.
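Write a function named chances_of_color which takes a dictionary of color counts and a query color, and returns the probability of drawing a ball of that color.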
Remember, probability is a fraction: the numerator is the number of occurrences of the event you're interested in, and the denominator is the number of all possible events. It's kind of like an average.
For example, if the input dictionary is {"red": 3, "blue": 1} and the query color is "blue", then the fraction you would return is 1/4, or 0.25 (probabilities should always be between 0 and 1).
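The fraction described above can be sketched as follows. This is one possible approach; returning 0.0 for an empty urn is an assumption made here, since the text does not say what should happen in that case.

```python
def chances_of_color(urn_dict, color):
    """Probability of drawing one ball of the query color."""
    total = sum(urn_dict.values())  # denominator: all balls in the urn
    if total == 0:
        # Assumption: an empty urn gives probability 0 for every color.
        return 0.0
    # Numerator: balls of the query color (0 if the color is absent).
    return urn_dict.get(color, 0) / total


print(chances_of_color({"red": 3, "blue": 1}, "blue"))    # 0.25
print(chances_of_color({"red": 3, "blue": 1}, "orange"))  # 0.0
```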
In [ ]:
import numpy.testing as t
c1 = {"blue": 3, "red": 1}
t.assert_allclose(chances_of_color(c1, "blue"), 0.75)
In [ ]:
import numpy.testing as t
c2 = {"red": 934, "blue": 493859, "yellow": 31, "green": 3892, "black": 487}
t.assert_allclose(chances_of_color(c2, "green"), 0.007796427505443677)
In [ ]:
import numpy.testing as t
c3 = {"red": 5, "blue": 5, "yellow": 5, "green": 5, "black": 5}
t.assert_allclose(chances_of_color(c3, "orange"), 0.0)
In this part, you'll do the opposite of what you implemented in Part B: you'll get a dictionary and a query color, but you'll need to return the chances of drawing a ball that is not the same color as the query.
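Write a function named chances_of_not_color which takes a dictionary of color counts and a query color, and returns the probability of drawing a ball that is not that color.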
For example, if the input dictionary is {"red": 3, "blue": 1} and the query color is "blue", then the fraction you would return is 3/4, or 0.75.
HINT: You can use the function you wrote in Part B to help!
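The hint relies on the complement rule: the chance of not drawing a color is 1 minus the chance of drawing it. Below is a minimal sketch under that assumption. The chances_of_color helper is repeated only so the block runs on its own, and its handling of an empty urn is an assumption.

```python
def chances_of_color(urn_dict, color):
    """Probability of drawing one ball of the query color."""
    total = sum(urn_dict.values())
    if total == 0:
        return 0.0  # assumption: an empty urn gives probability 0
    return urn_dict.get(color, 0) / total


def chances_of_not_color(urn_dict, color):
    """Probability of drawing one ball that is NOT the query color."""
    # Complement rule: P(not color) = 1 - P(color).
    return 1.0 - chances_of_color(urn_dict, color)


print(chances_of_not_color({"red": 3, "blue": 1}, "blue"))  # 0.75
```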
In [ ]:
import numpy.testing as t
c1 = {"blue": 3, "red": 1}
t.assert_allclose(chances_of_not_color(c1, "blue"), 0.25)
In [ ]:
import numpy.testing as t
c2 = {"red": 934, "blue": 493859, "yellow": 31, "green": 3892, "black": 487}
t.assert_allclose(chances_of_not_color(c2, "blue"), 0.010705063871811693)
In [ ]:
import numpy.testing as t
c3 = {"red": 5, "blue": 5, "yellow": 5, "green": 5, "black": 5}
t.assert_allclose(chances_of_not_color(c3, "orange"), 1.0)
Even more interesting is when we start talking about combinations of colors. Let's say I'm reaching into a Polya urn to pull out two balls; it's valuable to know what my chances of at least 1 ball being a certain color would be.
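Write a function named select_chances which takes an urn list, the number of balls to draw, and a query color, and returns the probability that at least one of the drawn balls is the query color.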
Remember, you compute probability exactly as before--the number of events of interest (selecting a certain number of balls with at least one of a certain color) divided by the total number of possible events (all possible draws)--only this time you'll need to account for combinations of multiple balls.
For example, if I give you an urn list of ["blue", "green", "red"], the number 2, and the query color "blue", then you would return 2/3, or 0.66666. (There are three possible combinations of 2 balls: blue-green, blue-red, and green-red. Two of these three combinations contain the query color blue.)
HINT: It will be very, very helpful if you make use of the itertools module for generating combinations of colored balls. If you can't remember how the module works, consult its documentation. Seriously though, it will vastly simplify your life in this question.
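To illustrate the hint, here is a minimal sketch built on itertools.combinations, which treats balls as distinct by position, so two balls of the same color still count as separate balls. It assumes the number of balls drawn is between 1 and the size of the urn; other values are not handled.

```python
from itertools import combinations


def select_chances(urn, n, color):
    """Probability that an unordered draw of n balls contains the color."""
    # Every unordered group of n balls, distinguished by position in the list.
    draws = list(combinations(urn, n))
    # Count the groups that include at least one ball of the query color.
    hits = sum(1 for draw in draws if color in draw)
    return hits / len(draws)


# 2 of the 3 possible pairs contain blue.
print(select_chances(["blue", "green", "red"], 2, "blue"))
```

Enumerating every combination is fine for small urns like the ones here, but the number of combinations grows quickly as the urn or the draw size grows.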
In [ ]:
import numpy.testing as t
q1 = ["blue", "green", "red"]
t.assert_allclose(select_chances(q1, 2, "red"), 2/3)
In [ ]:
q2 = ["red", "blue", "blue", "green", "yellow", "black", "black", "green", "blue", "yellow", "red", "green", "blue", "black", "yellow", "yellow", "yellow", "green", "blue", "red", "red", "blue", "red", "blue", "yellow", "yellow", "yellow"]
t.assert_allclose(select_chances(q2, 3, "red"), 0.4735042735042735)
One final wrinkle: let's say I'm no longer picking colored balls simultaneously from the urn, but rather in sequence--that is, one right after the other. Now I can ask, for a given urn and a certain number of balls I'm going to pick, what are the chances that I draw a ball of a certain color first?
For example, if I give you an urn list of ["blue", "green", "red"], the number 2, and the query color "blue", then you would return 2/6, or 0.33333. (There are six possible ways of drawing two balls in sequence: blue-green, blue-red, green-blue, green-red, red-blue, and red-green. Two of those six involve drawing the blue one first.)
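Write a function named select_chances_first which takes an urn list, the number of balls to draw, and a query color, and returns the probability that the first ball drawn is the query color.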
You are welcome to again use itertools.
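Because order now matters, one way to sketch this is with itertools.permutations in place of combinations. This is only one possible approach, and it assumes the number of balls drawn is between 1 and the size of the urn.

```python
from itertools import permutations


def select_chances_first(urn, n, color):
    """Probability that the first ball of an ordered draw of n is the color."""
    # Every ordered sequence of n balls, distinguished by position in the list.
    draws = list(permutations(urn, n))
    # Count the sequences whose first ball is the query color.
    hits = sum(1 for draw in draws if draw[0] == color)
    return hits / len(draws)


# 2 of the 6 ordered pairs start with blue.
print(select_chances_first(["blue", "green", "red"], 2, "blue"))
```

The result does not depend on how many balls are drawn after the first, so it matches the single-draw probability of the query color; enumerating the permutations just makes that visible.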
In [ ]:
import numpy.testing as t
q1 = ["blue", "green", "red"]
t.assert_allclose(select_chances_first(q1, 2, "red"), 2/6)
In [ ]:
q2 = ["red", "blue", "blue", "green", "yellow", "black", "black", "green", "blue", "yellow", "red", "green", "blue", "black", "yellow", "yellow", "yellow", "green", "blue", "red", "red", "blue", "red", "blue", "yellow", "yellow", "yellow"]
t.assert_allclose(select_chances_first(q2, 3, "red"), 0.18518518518518517)