Assignment 3: Probabilistic modeling

In this assignment, you will work with probabilistic models known as Bayesian networks to efficiently calculate the answer to probability questions concerning discrete random variables.

To help, we've provided a package called pbnt that supports the representation of Bayesian networks and automatic inference (!) of marginal probabilities. Note that you need numpy to run pbnt.

This assignment is due on T-Square on October 15th by 9:35 AM.

Warning

Due to compatibility bugs in pbnt, this assignment requires Python 2.7 to run, which in turn requires iPython2. So you should run this notebook using

ipython2 notebook probability_notebook.ipynb

If you don't have iPython2 installed, you can download it here, unzip it, and install it using

python setup.py install

If you don't have iPython2 installed and you don't want to have more than one version installed, you can set it up using a virtual environment (virtualenv).

You can find instructions on how to set up a virtualenv under the following links:

Once you have virtualenv installed, you should navigate to the directory containing probability_notebook.ipynb and run the command

virtualenv .

which will create a subdirectory called "bin" containing scripts to run the virtual environment. You'll then call

source bin/activate

to activate the virtualenv. You'll see that your command line now looks something like this:

(assignment_3)my-laptop:probability_assignment user_name$

From here, you can install iPython2 to your local directory by running the command

pip2.7 install "ipython[notebook]"

and then you'll be able to open probability_notebook.ipynb using iPython2.

Whenever you want to quit your virtual environment, just type

deactivate

and you can reactivate the environment later with the same command you used before. If you ever want to get rid of the virtualenv files entirely, just delete the "bin" folder.
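
Once the virtual environment is active, a quick optional check from inside the notebook is to confirm that you're running Python 2.7 and that numpy imports cleanly (just a suggestion, not a required step):

import sys
import numpy
print sys.version        # should start with 2.7
print numpy.__version__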


In [257]:
"""Testing pbnt.
Run this before anything else
to get pbnt to work!"""
import sys
# from importlib import reload
if('pbnt/combined' not in sys.path):
    sys.path.append('pbnt/combined')
from exampleinference import inferenceExample

# Should output:
# ('The marginal probability of sprinkler=false:', 0.80102921)
#('The marginal probability of wetgrass=false | cloudy=False, rain=True:', 0.055)

inferenceExample()


('The marginal probability of sprinkler=false:', 0.80102921)
('The marginal probability of wetgrass=false | cloudy=False, rain=True:', 0.055)

Part 1: Bayesian network tutorial

40 points total

To start, design a basic probabilistic model for the following system.

There's a nuclear power plant in which an alarm is supposed to ring when the core temperature, indicated by a gauge, exceeds a fixed threshold. For simplicity, we assume that the temperature is represented as either high or normal. Use the following Boolean variables in your implementation:

  • A = alarm sounds
  • FA = alarm is faulty
  • G = gauge reading (high = True, normal = False)
  • FG = gauge is faulty
  • T = actual temperature (high = True, normal = False)

In addition, assume that the gauge is more likely to fail when the temperature is high.

You will test your implementation at the end of the section.

1a: Casting the net

10 points

Design a Bayesian network for this system, using pbnt to represent the nodes and conditional probability arcs connecting nodes. Don't worry about the probabilities for now. Fill out the function below to create the net.

The following command will create a BayesNode with 2 values, an id of 0 and the name "alarm":

A_node = BayesNode(0,2,name='alarm')

You will use BayesNode.add_parent() and BayesNode.add_child() to connect nodes. For example, to connect the alarm and temperature nodes that you've already made (i.e. assuming that temperature affects the alarm probability):

A_node.add_parent(T_node)
T_node.add_child(A_node)

You can run probability_tests.network_setup_test() to make sure your network is set up correctly.


In [258]:
from Node import BayesNode
from Graph import BayesNet
def make_power_plant_net():
    """Create a Bayes Net representation of
    the above power plant problem."""
    # One binary node per variable.
    T_node = BayesNode(0, 2, name='temperature')
    G_node = BayesNode(1, 2, name='gauge')
    A_node = BayesNode(2, 2, name='alarm')
    F_G_node = BayesNode(3, 2, name='faulty gauge')
    F_A_node = BayesNode(4, 2, name='faulty alarm')
    
    # Temperature influences both the gauge reading and whether the gauge fails.
    T_node.add_child(G_node)
    G_node.add_parent(T_node)
    T_node.add_child(F_G_node)
    F_G_node.add_parent(T_node)
    
    # The gauge reading also depends on whether the gauge is faulty.
    F_G_node.add_child(G_node)
    G_node.add_parent(F_G_node)
    
    # The alarm depends on the gauge reading and on whether the alarm is faulty.
    G_node.add_child(A_node)
    A_node.add_parent(G_node)
    F_A_node.add_child(A_node)
    A_node.add_parent(F_A_node)
    
    nodes = [T_node, G_node, F_G_node, A_node, F_A_node]
    
    return BayesNet(nodes)

In [259]:
from probability_tests import network_setup_test
power_plant = make_power_plant_net()
network_setup_test(power_plant)


correct number of nodes
correct number of edges between nodes

1b: Polytrees

5 points

Is the network for the power plant system a polytree? Why or why not? Choose from the following answers.


In [260]:
def is_polytree():
    """Multiple choice question about polytrees."""
    
    choice = 'c'
    answers = {
        'a' : 'Yes, because it can be decomposed into multiple sub-trees.',
        'b' : 'Yes, because its underlying undirected graph is a tree.',
        'c' : 'No, because its underlying undirected graph is not a tree.',
        'd' : 'No, because it cannot be decomposed into multiple sub-trees.'
    }
    return answers[choice]

1c: Setting the probabilities

15 points

Assume that the following statements about the system are true:

  1. The temperature gauge reads the correct temperature with 95% probability when it is not faulty and 20% probability when it is faulty. For simplicity, say that the gauge's "true" value corresponds with its "hot" reading and "false" with its "normal" reading, so the gauge would have a 95% chance of returning "true" when the temperature is hot and it is not faulty.
  2. The alarm is faulty 15% of the time.
  3. The temperature is hot (call this "true") 20% of the time.
  4. When the temperature is hot, the gauge is faulty 80% of the time. Otherwise, the gauge is faulty 5% of the time.
  5. The alarm responds correctly to the gauge 55% of the time when the alarm is faulty, and it responds correctly to the gauge 90% of the time when the alarm is not faulty. For instance, when it is faulty, the alarm sounds 55% of the time that the gauge is "hot" and remains silent 55% of the time that the gauge is "normal."

Knowing these facts, set the conditional probabilities for the necessary variables on the network you just built.

Using pbnt's Distribution class: if you wanted to set the distribution for P(A) to 70% true, 30% false, you would invoke the following commands.

A_distribution = DiscreteDistribution(A_node)
index = A_distribution.generate_index([],[])
A_distribution[index] = [0.3,0.7]
A_node.set_dist(A_distribution)

If you wanted to set the distribution for P(A|G) to be

| $G$ | $P(A=true \mid G)$ |
|---|---|
| T | 0.75 |
| F | 0.85 |

you would invoke:

from numpy import zeros, float32
dist = zeros([G_node.size(), A_node.size()], dtype=float32)
dist[0,:] = [0.15, 0.85]
dist[1,:] = [0.25, 0.75]
A_distribution = ConditionalDiscreteDistribution(nodes=[G_node,A_node], table=dist)
A_node.set_dist(A_distribution)

Modeling a three-variable relationship is a bit trickier. If you wanted to set the following distribution for $P(A|G,T)$ to be

| $G$ | $T$ | $P(A=true \mid G, T)$ |
|---|---|---|
| T | T | 0.15 |
| T | F | 0.6 |
| F | T | 0.2 |
| F | F | 0.1 |

you would invoke:

from numpy import zeros, float32
dist = zeros([G_node.size(), T_node.size(), A_node.size()], dtype=float32)
dist[1,1,:] = [0.85, 0.15]
dist[1,0,:] = [0.4, 0.6]
dist[0,1,:] = [0.8, 0.2]
dist[0,0,:] = [0.9, 0.1]
A_distribution = ConditionalDiscreteDistribution(nodes=[G_node, T_node, A_node], table=dist)
A_node.set_dist(A_distribution)

The key is to remember that 0 represents the index of the false probability, and 1 represents true.
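
If you want to convince yourself that the indexing works the way you expect, you can read entries back out of the dist array from the example above and check that every conditional row sums to 1 (a quick sketch reusing the names defined above):

# Index order is [G, T, A]; 0 = False, 1 = True.
print dist[1, 0, 1]       # P(A=true | G=true, T=false), i.e. 0.6
print dist.sum(axis=2)    # each row of a conditional table should sum to 1.0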

You can check your probability distributions with probability_tests.probability_setup_test().


In [261]:
from numpy import zeros, float32
import Distribution
from Distribution import DiscreteDistribution, ConditionalDiscreteDistribution
def set_probability(bayes_net):
    """Set probability distribution for each
    node in the power plant system."""
    
    A_node = bayes_net.get_node_by_name("alarm")
    F_A_node = bayes_net.get_node_by_name("faulty alarm")
    G_node = bayes_net.get_node_by_name("gauge")
    F_G_node = bayes_net.get_node_by_name("faulty gauge")
    T_node = bayes_net.get_node_by_name("temperature")
    nodes = [A_node, F_A_node, G_node, F_G_node, T_node]
    
    #for completely independent nodes
    T_dist = DiscreteDistribution(T_node)
    index = T_dist.generate_index([],[])
    T_dist[index] = [0.8,0.2]
    T_node.set_dist(T_dist)
    
    F_A_dist = DiscreteDistribution(F_A_node)
    index = F_A_dist.generate_index([],[])
    F_A_dist[index] = [0.85,0.15]
    F_A_node.set_dist(F_A_dist)
    
    #for single parent node
    dist = zeros([T_node.size(), F_G_node.size()], dtype=float32)
    dist[0,:] = [0.95,0.05]
    dist[1,:] = [0.2,0.8]
    F_G_dist = ConditionalDiscreteDistribution(nodes = [T_node, F_G_node], table=dist)
    F_G_node.set_dist(F_G_dist)
    
    #for double parent node
    dist = zeros([F_G_node.size(), T_node.size(), G_node.size()], dtype=float32)
    dist[0,0,:] = [0.95,0.05]
    dist[0,1,:] = [0.05,0.95]
    dist[1,0,:] = [0.2,0.8]
    dist[1,1,:] = [0.8,0.2]
    G_dist = ConditionalDiscreteDistribution(nodes=[F_G_node,T_node,G_node], table=dist)
    G_node.set_dist(G_dist)
    
    dist = zeros([F_A_node.size(),G_node.size(),A_node.size()], dtype=float32)
    dist[0,0,:] = [0.9,0.1]
    dist[0,1,:] = [0.1,0.9]
    dist[1,0,:] = [0.55,0.45]
    dist[1,1,:] = [0.45,0.55]
    A_dist = ConditionalDiscreteDistribution(nodes=[F_A_node,G_node,A_node], table=dist)
    A_node.set_dist(A_dist)
    
    return bayes_net

In [262]:
set_probability(power_plant)
from probability_tests import probability_setup_test
probability_setup_test(power_plant)


correct temperature distribution size
correct temperature distribution
correct faulty gauge distribution size
correct faulty gauge distribution
correct gauge distribution size
correct alarm distribution size
correct faulty alarm distribution size
correct faulty alarm distribution

1d: Probability calculations

10 points

To finish up, you're going to perform inference on the network to calculate the following probabilities:

  • the marginal probability that the alarm sounds
  • the marginal probability that the gauge shows "hot"
  • the probability that the temperature is actually hot, given that the alarm sounds and the alarm and gauge are both working

You'll fill out the "get_prob" functions to calculate the probabilities.

Here's an example of how to do inference for the marginal probability of the "faulty alarm" node being True (assuming "bayes_net" is your network):

F_A_node = bayes_net.get_node_by_name('faulty alarm')
engine = JunctionTreeEngine(bayes_net)
Q = engine.marginal(F_A_node)[0]
index = Q.generate_index([True],range(Q.nDims))
prob = Q[index]

To compute a conditional probability, set the evidence variables before computing the marginal, as seen below (here we're computing $P(A = false \mid F_A = true, T = false)$):

engine.evidence[F_A_node] = True
engine.evidence[T_node] = False
Q = engine.marginal(A_node)[0]
index = Q.generate_index([False],range(Q.nDims))
prob = Q[index]

If you need to sanity-check to make sure you're doing inference correctly, you can run inference on one of the probabilities that we gave you in 1c. For instance, running inference on $P(T=true)$ should return 0.19999994 (i.e. almost 20%). You can also calculate the answers by hand to double-check.
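
For example, a minimal version of that sanity check might look like the following (assuming power_plant is the network you built and set probabilities for above):

from Inference import JunctionTreeEngine
T_node = power_plant.get_node_by_name('temperature')
engine = JunctionTreeEngine(power_plant)
Q = engine.marginal(T_node)[0]
index = Q.generate_index([True], range(Q.nDims))
print Q[index]   # should print roughly 0.19999994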


In [263]:
from Inference import JunctionTreeEngine
def get_alarm_prob(bayes_net, alarm_rings):
    """Calculate the marginal 
    probability of the alarm 
    ringing (T/F) in the 
    power plant system."""
    A_node = bayes_net.get_node_by_name('alarm')
    engine = JunctionTreeEngine(bayes_net)
    Q = engine.marginal(A_node)[0]
    index = Q.generate_index([alarm_rings],range(Q.nDims))
    alarm_prob = Q[index]
    
    return alarm_prob

In [264]:
def get_gauge_prob(bayes_net, gauge_hot):
    """Calculate the marginal
    probability of the gauge 
    showing hot (T/F) in the 
    power plant system."""
    G_node = bayes_net.get_node_by_name('gauge')
    engine = JunctionTreeEngine(bayes_net)
    Q = engine.marginal(G_node)[0]
    index = Q.generate_index([gauge_hot],range(Q.nDims))
    gauge_prob = Q[index]
    
    return gauge_prob

In [265]:
from Inference import JunctionTreeEngine
def get_temperature_prob(bayes_net,temp_hot):
    """Calculate theprobability of the 
    temperature being hot (T/F) in the
    power plant system, given that the
    alarm sounds and neither the gauge
    nor alarm is faulty."""
    T_node = bayes_net.get_node_by_name('temperature')
    A_node = bayes_net.get_node_by_name('alarm')
    F_A_node = bayes_net.get_node_by_name('faulty alarm')
    F_G_node = bayes_net.get_node_by_name('faulty gauge')
    engine = JunctionTreeEngine(bayes_net)
    engine.evidence[A_node] = True
    engine.evidence[F_A_node] = False
    engine.evidence[F_G_node] = False
    Q = engine.marginal(T_node)[0]
    index = Q.generate_index([temp_hot],range(Q.nDims))
    temp_prob = Q[index]
    
    return temp_prob

In [266]:
print get_alarm_prob(power_plant,True)
print get_gauge_prob(power_plant,True)
print get_temperature_prob(power_plant,True)


0.2498
0.14
0.244318

Part 2: Sampling

60 points total

For the main exercise, consider the following scenario.

There are three frisbee teams that play each other: the Airheads, the Buffoons, and the Clods (A, B, and C for short). Each match is between two teams, and each team can win, lose, or draw. Each team has a fixed but unknown skill level, represented as an integer from 0 to 3, and the probability of each match outcome depends on the difference in skill levels between the two teams.

We want to predict the outcome of the matches, given prior knowledge of previous matches. Rather than using exact inference, we will do so by sampling the network with two Markov chain Monte Carlo methods: Metropolis-Hastings (2b) and Gibbs sampling (2c).

2a: Build the network

10 points

Build a Bayes Net to represent the three teams and their influences on the match outcomes. Assume the following variable conventions:

| variable name | description |
|---|---|
| A | A's skill level |
| B | B's skill level |
| C | C's skill level |
| AvB | outcome of A vs. B (0 = A wins, 1 = B wins, 2 = tie) |
| BvC | outcome of B vs. C (0 = B wins, 1 = C wins, 2 = tie) |
| CvA | outcome of C vs. A (0 = C wins, 1 = A wins, 2 = tie) |

Assume that each team has the following prior distribution of skill levels:

| skill level | P(skill level) |
|---|---|
| 0 | 0.15 |
| 1 | 0.45 |
| 2 | 0.30 |
| 3 | 0.10 |

In addition, assume that the differences in skill levels correspond to the following probabilities of winning:

| skill difference (T2 - T1) | T1 wins | T2 wins | Tie |
|---|---|---|---|
| 0 | 0.10 | 0.10 | 0.80 |
| 1 | 0.20 | 0.60 | 0.20 |
| 2 | 0.15 | 0.75 | 0.10 |
| 3 | 0.05 | 0.90 | 0.05 |

For example, if A (T1) has skill 1 and B (T2) has skill 3, the difference is 2, so B wins with probability 0.75, A wins with probability 0.15, and they tie with probability 0.10. When T1 is the stronger team, swap the "T1 wins" and "T2 wins" columns.
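
Putting the structure together, the network you build should encode the factorization

$P(A, B, C, AvB, BvC, CvA) = P(A)\,P(B)\,P(C)\,P(AvB \mid A, B)\,P(BvC \mid B, C)\,P(CvA \mid C, A)$

with the skill-level table supplying the three priors and the skill-difference table supplying the conditional distribution of each match node given its two parent teams.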

In [267]:
def get_game_network():
    """Create a Bayes Net representation
    of the game problem."""
    
    #create the network
    A = BayesNode(0,4,name='Ateam')
    B = BayesNode(1,4,name='Bteam')
    C = BayesNode(2,4,name='Cteam')
    AvB = BayesNode(3,3,name='AvB')
    BvC = BayesNode(4,3,name='BvC')
    CvA = BayesNode(5,3,name='CvA')
    
    A.add_child(AvB)
    A.add_child(CvA)
    B.add_child(AvB)
    B.add_child(BvC)
    C.add_child(BvC)
    C.add_child(CvA)
    AvB.add_parent(A)
    AvB.add_parent(B)
    BvC.add_parent(B)
    BvC.add_parent(C)
    CvA.add_parent(C)
    CvA.add_parent(A)
    
    nodes = [A,B,C,AvB,BvC,CvA]
    game_net = BayesNet(nodes)
    
    # All three teams share the same prior over skill levels.
    for team in [A, B, C]:
        skill_dist = DiscreteDistribution(team)
        index = skill_dist.generate_index([],[])
        skill_dist[index] = [0.15, 0.45, 0.30, 0.10]
        team.set_dist(skill_dist)
    
    
    # All three matches share the same skill-difference table, so build it
    # once and reuse it.  Index order is [T1 skill, T2 skill, outcome], with
    # outcome 0 = T1 wins, 1 = T2 wins, 2 = tie.
    dist = zeros([4,4,3], dtype=float32)
    for t1 in range(4):
        for t2 in range(4):
            diff = t2 - t1
            if diff == 0:
                dist[t1,t2,:] = [0.10, 0.10, 0.80]
            elif diff == 1:
                dist[t1,t2,:] = [0.20, 0.60, 0.20]
            elif diff == 2:
                dist[t1,t2,:] = [0.15, 0.75, 0.10]
            elif diff == 3:
                dist[t1,t2,:] = [0.05, 0.90, 0.05]
            elif diff == -1:
                dist[t1,t2,:] = [0.60, 0.20, 0.20]
            elif diff == -2:
                dist[t1,t2,:] = [0.75, 0.15, 0.10]
            else:  # diff == -3
                dist[t1,t2,:] = [0.90, 0.05, 0.05]
    
    AvB.set_dist(ConditionalDiscreteDistribution(nodes=[A,B,AvB], table=dist))
    BvC.set_dist(ConditionalDiscreteDistribution(nodes=[B,C,BvC], table=dist))
    CvA.set_dist(ConditionalDiscreteDistribution(nodes=[C,A,CvA], table=dist))
    
    return game_net

In [268]:
game_net = get_game_network()

2b: Metropolis-Hastings sampling

15 points

Now you will implement the Metropolis-Hastings algorithm, a method for estimating a probability distribution when it is prohibitively expensive (even for inference!) to compute the distribution exactly. You'll do this in MH_sampling(), which takes a Bayesian network and an initial state as parameters and returns a sample state drawn from the network's distribution. The method should perform just a single iteration of the algorithm. If an initial value is not given, default to a state chosen uniformly at random from the possible states.

The general idea is to build an approximation of a latent probability distribution by repeatedly generating a "candidate" value for each random variable in the system, and then probabilistically accepting or rejecting the candidate value based on an underlying acceptance function. These slides provide a nice intro, and this cheat sheet provides an explanation of the details.
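
Concretely, with a symmetric proposal (for example, re-drawing one randomly chosen variable uniformly at random), the acceptance probability reduces to

$\alpha = \min\left(1, \frac{P(x')}{P(x)}\right)$

where $x$ is the current state, $x'$ is the candidate, and $P$ is the (unnormalized) joint probability of a full assignment: accept the candidate with probability $\alpha$, otherwise keep the current state.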

Hint 1: in both Metropolis-Hastings and Gibbs sampling, you'll need access to each node's probability distribution. You can access this distribution by calling

A_node.dist.table

which will return the same numpy array that you provided when constructing the probability distribution.
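
For instance, a quick check against the match table you built in 2a (assuming game_net from above) might look like:

AvB_node = game_net.get_node_by_name('AvB')
table = AvB_node.dist.table
print table[1, 1, 2]   # P(AvB = tie | A skill = 1, B skill = 1), i.e. 0.8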

Hint 2: you'll also want to use the random package (e.g. random.randint()) for the probabilistic choices that sampling makes.

Hint 3: in order to count the sample states later on, you'll want to make sure the sample that you return is hashable. One way to do this is by returning the sample as a tuple.


In [269]:
import random
def MH_sampling(bayes_net, initial_value):
    """Complete a single iteration of the 
    Metropolis-Hastings algorithm given a
    Bayesian network and an initial state
    value. Returns the state sampled from
    the probability distribution."""
    # Default: choose a starting state uniformly at random.
    # State layout: [A skill, B skill, C skill, AvB, BvC, CvA].
    if not initial_value:
        initial_value = [random.randint(0,3) for _ in range(3)] + \
                        [random.randint(0,2) for _ in range(3)]
    
    # Look up each node's distribution table by node id.
    nodes = dict((node.id, node) for node in bayes_net.nodes)
    Adist = nodes[0].dist.table
    Bdist = nodes[1].dist.table
    Cdist = nodes[2].dist.table
    AvBdist = nodes[3].dist.table
    BvCdist = nodes[4].dist.table
    CvAdist = nodes[5].dist.table
    
    def joint_prob(state):
        """Unnormalized joint probability of a full state assignment."""
        return (Adist[state[0]] * Bdist[state[1]] * Cdist[state[2]] *
                AvBdist[state[0], state[1], state[3]] *
                BvCdist[state[1], state[2], state[4]] *
                CvAdist[state[2], state[0], state[5]])
    
    # Propose a candidate state by re-drawing one randomly chosen variable
    # (skill variables take values 0-3, match variables 0-2).
    var_id = random.randint(0,5)
    new_val = random.randint(0,3) if var_id < 3 else random.randint(0,2)
    candidate = list(initial_value)   # copy so the current state is not mutated
    candidate[var_id] = new_val
    
    # Accept with probability min(1, P(candidate)/P(current)).
    p_x = joint_prob(initial_value)
    p_x_dash = joint_prob(candidate)
    acceptance = 1.0 if p_x == 0 else min(1.0, p_x_dash / p_x)
    
    if random.random() < acceptance:
        sample = candidate
    else:
        sample = list(initial_value)
        
    return sample

In [270]:
# arbitrary initial state for the game system
initial_value = [0,0,0,0,2,1] 
sample = MH_sampling(game_net, initial_value)
print sample


[1, 0, 0, 0, 2, 1]

2c: Gibbs sampling

15 points

Implement the Gibbs sampling algorithm, which is a special case of Metropolis-Hastings. You'll do this in Gibbs_sampling(), which takes a Bayesian network and an initial state value as parameters and returns a sample state drawn from the network's distribution. The method should consist of just a single iteration of the algorithm. If no initial value is provided, default to a state chosen uniformly at random from the possible states.

You may find this helpful in understanding the basics of Gibbs sampling over Bayesian networks. Make sure to identify what makes it different from Metropolis-Hastings.
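
The key difference is that Gibbs sampling never rejects: each step picks one variable $X_i$ and re-draws it directly from its conditional distribution given the current values of all the other variables,

$P(X_i \mid \text{all other variables}) \propto P(X_i, \text{all other variables})$

which, in a Bayes net, only involves the factors that mention $X_i$ (its own conditional probability table and those of its children).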


In [15]:
import random
def Gibbs_sampling(bayes_net, initial_value):
    """Complete a single iteration of the
    Gibbs sampling algorithm given a
    Bayesian network and an initial state
    value. Returns the state sampled from
    the probability distribution."""
    # Default: choose a starting state uniformly at random.
    # State layout: [A skill, B skill, C skill, AvB, BvC, CvA].
    if not initial_value:
        return [random.randint(0,3) for _ in range(3)] + \
               [random.randint(0,2) for _ in range(3)]
    
    sample = list(initial_value)
    nodes = dict((node.id, node) for node in bayes_net.nodes)
    Adist = nodes[0].dist.table
    Bdist = nodes[1].dist.table
    Cdist = nodes[2].dist.table
    AvBdist = nodes[3].dist.table
    BvCdist = nodes[4].dist.table
    CvAdist = nodes[5].dist.table
    
    def joint_prob(state):
        """Unnormalized joint probability of a full state assignment."""
        return (Adist[state[0]] * Bdist[state[1]] * Cdist[state[2]] *
                AvBdist[state[0], state[1], state[3]] *
                BvCdist[state[1], state[2], state[4]] *
                CvAdist[state[2], state[0], state[5]])
    
    # Pick one variable and resample it from its conditional distribution
    # given the current values of all the other variables.  The conditional
    # is proportional to the joint, so enumerate the variable's values and
    # normalize.
    var_id = random.randint(0,5)
    num_values = 4 if var_id < 3 else 3
    weights = []
    for value in range(num_values):
        candidate = list(sample)
        candidate[var_id] = value
        weights.append(joint_prob(candidate))
    
    # Draw the new value in proportion to the weights.
    r = random.random() * sum(weights)
    cumulative = 0.0
    for value in range(num_values):
        cumulative += weights[value]
        if r < cumulative:
            sample[var_id] = value
            break
    
    return sample
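
A quick way to exercise the sampler by hand (just a suggestion, using the game_net built in 2a):

state = [0, 0, 0, 0, 2, 1]
for i in range(5):
    state = Gibbs_sampling(game_net, state)
    print state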

2d: Comparing sampling methods

15 points

Suppose that you know the outcomes of two of the three games: A beats B, and A draws with C. Start by calculating the posterior distribution for the outcome of the BvC match in calculate_posterior(), using the inference methods from 1d.

Then estimate the likelihood of the different outcomes for the third match by running Gibbs sampling until it converges to a stationary distribution. We'll say that the sampler has converged when, for 10 successive iterations, the estimated outcome distribution for the third match changes from the previous estimate by less than 0.1%.

Repeat this experiment for Metropolis-Hastings sampling.

Which algorithm converges more quickly, and by approximately what factor? For instance, if Metropolis-Hastings takes twice as many iterations to converge as Gibbs sampling, you'd say that Gibbs sampling converged faster by a factor of 2. Fill in sampling_question() to answer both parts.


In [16]:
def calculate_posterior(games_net):
    """Calculate the posterior distribution
    of the BvC match given that A won against
    B and tied C. Return a list of probabilities
    corresponding to win, loss and tie likelihood."""
    posterior = [0,0,0]
    engine = JunctionTreeEngine(games_net)
    AvB = games_net.get_node_by_name('AvB')
    BvC = games_net.get_node_by_name('BvC')
    CvA = games_net.get_node_by_name('CvA')
    engine.evidence[AvB] = 0   # A beat B
    engine.evidence[CvA] = 2   # A and C tied
    Q = engine.marginal(BvC)[0]
    index = Q.generate_index([0],range(Q.nDims))
    posterior[0] = Q[index]
    index = Q.generate_index([1],range(Q.nDims))
    posterior[1] = Q[index]
    index = Q.generate_index([2],range(Q.nDims))
    posterior[2] = Q[index]
    return posterior
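
One possible shape for the convergence experiment, shown here only as a rough sketch (the helper name estimate_bvc_gibbs and its parameters are illustrative, not part of the assignment): clamp the AvB and CvA evidence, count BvC outcomes as you sample, and stop once the running estimate has changed by less than 0.1% for 10 successive iterations.

def estimate_bvc_gibbs(bayes_net, delta=0.001, patience=10, max_iter=100000):
    """Illustrative sketch: estimate the BvC outcome distribution with Gibbs
    sampling, holding AvB = 0 (A beats B) and CvA = 2 (A and C tie) fixed.
    Returns the estimated distribution and the number of iterations used."""
    state = [0, 0, 0, 0, 0, 2]            # [A, B, C, AvB, BvC, CvA]
    counts = [0, 0, 0]
    prev_estimate = [0.0, 0.0, 0.0]
    steady = 0
    for iteration in range(1, max_iter + 1):
        state = list(Gibbs_sampling(bayes_net, state))
        state[3], state[5] = 0, 2         # re-clamp the evidence variables
        counts[state[4]] += 1
        estimate = [c / float(iteration) for c in counts]
        if max(abs(e - p) for e, p in zip(estimate, prev_estimate)) < delta:
            steady += 1
        else:
            steady = 0
        prev_estimate = estimate
        if steady >= patience:
            break
    return estimate, iteration

The Metropolis-Hastings experiment is the same loop with MH_sampling() in place of Gibbs_sampling(); comparing the two iteration counts (and both estimates against calculate_posterior) gives you what you need for compare_sampling() and sampling_question().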

In [17]:
iter_counts = [1e1,1e3,1e5,1e6]
def compare_sampling(bayes_net, posterior):
    """Compare Gibbs and Metropolis-Hastings
    sampling by calculating how long it takes
    for each method to converge to the 
    provided posterior."""
    # TODO: finish this function
    return Gibbs_convergence, MH_convergence

In [18]:
def sampling_question():
    """Question about sampling performance."""
    # TODO: assign value to choice and factor
    choice = 1
    options = ['Gibbs','Metropolis-Hastings']
    factor = 0
    return options[choice], factor

In [19]:
# test your sampling methods here
posterior = calculate_posterior(game_net)
compare_sampling(game_net, posterior)


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-19-7cd9985469f3> in <module>()
      1 # test your sampling methods here
      2 posterior = calculate_posterior(game_net)
----> 3 compare_sampling(game_net, posterior)

<ipython-input-17-78785d811a6e> in compare_sampling(bayes_net, posterior)
      6     provided posterior."""
      7     # TODO: finish this function
----> 8     return Gibbs_convergence, MH_convergence

NameError: global name 'Gibbs_convergence' is not defined

2e: Theoretical follow-up

5 points

Suppose there are now $n$ teams in the competition, and all matches have been played except for the last match. Using inference by enumeration, how does the complexity of predicting the last match vary with $n$?

Fill in complexity_question() to answer, using big-O notation. For example, write 'O(n^2)' for second-degree polynomial runtime.


In [ ]:
def complexity_question():
    # TODO: write an expression for complexity
    complexity = 'O(2^n)'
    return complexity

You're done! Submit this file as "probability_notebook.ipynb" on T-Square before October 15 at 9:35 AM. This assignment will be graded on the accuracy of the functions you completed.