It is probably rare that a user wants to directly manipulate factors unless they are developing a new algorithm, but it's still important to see how factor operations are done in pyBN. Moreover, the ease-of-use and transparency of pyBN's factor operations mean it can be a great teaching/learning tool!
In this tutorial, I will go over the main operations you can perform with factors. Let's start by actually creating a factor. To do that, we first read in a Bayesian network from one of the included networks:
In [3]:
from pyBN import *
bn = read_bn('data/cmu.bn')
In [4]:
print bn.V
print bn.E
As you can see, we have a Bayesian network with 5 nodes and some edges between them. Let's create a factor now. This is easy in pyBN - just pass in the BayesNet object and the name of the variable.
In [6]:
alarm_factor = Factor(bn,'Alarm')
Now that we have a factor, we can explore its properties. Every factor in pyBN has the following attributes:
*self.bn* : a BayesNet object
*self.var* : a string
The random variable to which this Factor belongs
*self.scope* : a list
The RV, and its parents (the RVs involved in the
conditional probability table)
*self.card* : a dictionary, where
key = an RV in self.scope, and
val = integer cardinality of the key (i.e. how
many possible values it has)
*self.stride* : a dictionary, where
key = an RV in self.scope, and
val = integer stride (i.e. how many rows in the
CPT until the NEXT value of RV is reached)
*self.cpt* : a nested numpy array
The probability values for self.var conditioned
on its parents
In [8]:
print alarm_factor.bn
print alarm_factor.var
print alarm_factor.scope
print alarm_factor.card
print alarm_factor.stride
print alarm_factor.cpt
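To make the stride bookkeeping concrete, here is a minimal sketch (not pyBN's actual code) of how strides can be derived from a scope ordering and cardinalities, assuming the first variable in the scope varies fastest in the flattened CPT:

```python
def compute_strides(scope, card):
    # The stride of a variable is how many rows of the flattened CPT
    # you must skip before that variable's value changes.
    strides = {}
    step = 1
    for rv in scope:          # earlier scope entries vary faster
        strides[rv] = step
        step *= card[rv]
    return strides

# Hypothetical scope/cardinalities mirroring the Alarm factor
scope = ['Alarm', 'Burglary', 'Earthquake']
card = {'Alarm': 2, 'Burglary': 2, 'Earthquake': 2}
print(compute_strides(scope, card))  # {'Alarm': 1, 'Burglary': 2, 'Earthquake': 4}
```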
Along with those attributes, there are a number of methods (functions) at hand:
*multiply_factor* :
Multiply two factors together. The factor
multiplication algorithm used here is adapted
from the Koller and Friedman (PGM) textbook.
*sumover_var* :
Sum the factor over one *rv*, keeping it constant. You
end up with a 1-D factor whose scope is ONLY *rv*
and whose length equals the cardinality of *rv*.
*sumout_var_list* :
Remove a collection of rv's from the factor
by summing out (i.e. calling sumout_var) over
each rv.
*sumout_var* :
Remove passed-in *rv* from the factor by summing
over everything else.
*maxout_var* :
Remove *rv* from the factor by taking, for each
instantiation of the remaining variables, the
maximum value over *rv*.
*reduce_factor_by_list* :
Reduce the factor by numerous sets of
[rv,val]
*reduce_factor* :
Condition the factor by eliminating any sets of
values that don't align with a given [rv, val]
*to_log* :
Convert probabilities to log space from
normal space.
*from_log* :
Convert probabilities from log space to
normal space.
*normalize* :
Make relevant collections of probabilities sum to one.
Here is a look at Factor Multiplication:
In [26]:
import numpy as np
f1 = Factor(bn,'Alarm')
f2 = Factor(bn,'Burglary')
f1.multiply_factor(f2)
f3 = Factor(bn,'Burglary')
f4 = Factor(bn,'Alarm')
f3.multiply_factor(f4)
print np.round(f1.cpt,3)
print '\n',np.round(f3.cpt,3)
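Under the hood, the Koller–Friedman factor product iterates over assignments of the union of the two scopes and multiplies the matching entries from each factor. Here is a self-contained sketch of the idea, using dict-based CPTs and made-up numbers rather than pyBN's flat arrays:

```python
from itertools import product

def factor_product(scope1, cpt1, scope2, cpt2, card):
    # Result scope is the union of the two scopes (order preserved).
    scope = list(dict.fromkeys(scope1 + scope2))
    out = {}
    for assign in product(*(range(card[v]) for v in scope)):
        a = dict(zip(scope, assign))
        # Multiply the entries of each factor that agree with this assignment
        out[assign] = (cpt1[tuple(a[v] for v in scope1)] *
                       cpt2[tuple(a[v] for v in scope2)])
    return scope, out

# P(A|B) * P(B) with invented numbers
cpt_a = {(0, 0): 0.1, (1, 0): 0.9, (0, 1): 0.7, (1, 1): 0.3}  # keys: (A, B)
cpt_b = {(0,): 0.6, (1,): 0.4}                                 # keys: (B,)
scope, joint = factor_product(['A', 'B'], cpt_a, ['B'], cpt_b,
                              {'A': 2, 'B': 2})
print(joint[(0, 0)])  # 0.1 * 0.6
```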
Here is a look at "sumover_var":
In [25]:
f = Factor(bn,'Alarm')
print f.cpt
print f.scope
print f.stride
f.sumover_var('Burglary')
print '\n',f.cpt
print f.scope
print f.stride
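Conceptually, "sumover_var" collapses the whole table onto one variable. A numpy sketch of that marginalization (not pyBN's actual code), assuming a flat CPT in which the first scope variable varies fastest, i.e. has stride 1:

```python
import numpy as np

def sumover(scope, card, cpt, rv):
    # Reshape the flat CPT so each scope variable gets its own axis.
    # numpy is row-major, so the fastest-varying variable goes last.
    arr = np.asarray(cpt).reshape([card[v] for v in reversed(scope)])
    keep = len(scope) - 1 - scope.index(rv)
    other = tuple(i for i in range(arr.ndim) if i != keep)
    return arr.sum(axis=other)   # 1-D result of length card[rv]

# Invented 2x2 factor over scope ['A', 'B'] (A has stride 1)
cpt = [0.1, 0.2, 0.3, 0.4]
print(sumover(['A', 'B'], {'A': 2, 'B': 2}, cpt, 'A'))  # ~ [0.4 0.6]
```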
Here is a look at "sumout_var", which is essentially the opposite of "sumover_var":
In [16]:
f = Factor(bn,'Alarm')
f.sumout_var('Earthquake')
print f.stride
print f.scope
print f.card
print f.cpt
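"sumout_var" can be sketched the same way: instead of keeping only *rv*'s axis, we sum away exactly that axis and keep all the others (again an assumed stride layout, not pyBN's actual code):

```python
import numpy as np

def sumout(scope, card, cpt, rv):
    arr = np.asarray(cpt).reshape([card[v] for v in reversed(scope)])
    axis = len(scope) - 1 - scope.index(rv)
    new_scope = [v for v in scope if v != rv]
    # Ravel back to a flat CPT; the first remaining variable still varies fastest
    return new_scope, arr.sum(axis=axis).ravel()

cpt = [0.1, 0.2, 0.3, 0.4]                     # invented, scope ['A', 'B']
scope, new_cpt = sumout(['A', 'B'], {'A': 2, 'B': 2}, cpt, 'B')
print(scope, new_cpt)   # ['A'] and the B-marginalized values
```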
Additionally, you can sum out a LIST of variables with "sumout_var_list". Notice how summing out every variable in the scope except for ONE variable is equivalent to summing over that ONE variable:
In [24]:
f = Factor(bn,'Alarm')
print f.cpt
f.sumout_var_list(['Burglary','Earthquake'])
print f.scope
print f.stride
print f.cpt
f1 = Factor(bn,'Alarm')
print '\n',f1.cpt
f1.sumover_var('Alarm')
print f1.scope
print f1.stride
print f1.cpt
Even more, you can use "maxout_var" to take the maximum over a variable in the factor. This is a fundamental operation in Max-Sum Variable Elimination for MAP inference. Notice how the variable being maxed out is removed from the scope, since only its best instantiation is retained.
In [27]:
f = Factor(bn,'Alarm')
print f.scope
print f.cpt
f.maxout_var('Burglary')
print '\n', f.scope
print f.cpt
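"maxout_var" follows the same axis logic as summing out, but replaces the sum with a max — exactly the max-marginalization step of Max-Sum variable elimination. A sketch under the same assumed stride layout:

```python
import numpy as np

def maxout(scope, card, cpt, rv):
    arr = np.asarray(cpt).reshape([card[v] for v in reversed(scope)])
    axis = len(scope) - 1 - scope.index(rv)
    # Keep, for every remaining assignment, the best value of rv
    return [v for v in scope if v != rv], arr.max(axis=axis).ravel()

cpt = [0.1, 0.2, 0.3, 0.4]                     # invented, scope ['A', 'B']
scope, new_cpt = maxout(['A', 'B'], {'A': 2, 'B': 2}, cpt, 'B')
print(scope, new_cpt)   # ['A'] with the max over B for each value of A
```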
You can also use "reduce_factor" to reduce a factor based on evidence. This is different from "sumover_var" because "reduce_factor" does not sum over anything; it simply removes any parent-child instantiations which are not consistent with the evidence. There should be little need for normalization afterwards, because the CPT is already normalized over the rv-val evidence, but we normalize anyway to guard against rounding. This function is essential whenever users pass evidence into an inference query.
In [29]:
f = Factor(bn, 'Alarm')
print f.scope
print f.cpt
f.reduce_factor('Burglary','Yes')
print '\n', f.scope
print f.cpt
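Reduction is essentially slicing: keep only the rows whose *rv* equals the observed value, drop *rv* from the scope, and renormalize. A simplified sketch — a real implementation would first map the string value (e.g. 'Yes') to its index; here I use integer values directly:

```python
import numpy as np

def reduce_factor(scope, card, cpt, rv, val):
    arr = np.asarray(cpt).reshape([card[v] for v in reversed(scope)])
    axis = len(scope) - 1 - scope.index(rv)
    sliced = np.take(arr, val, axis=axis).ravel()  # rows consistent with rv=val
    return [v for v in scope if v != rv], sliced / sliced.sum()

cpt = [0.1, 0.2, 0.3, 0.4]                     # invented, scope ['A', 'B']
scope, new_cpt = reduce_factor(['A', 'B'], {'A': 2, 'B': 2}, cpt, 'B', 1)
print(scope, new_cpt)   # ['A'] with the entries where B=1, renormalized
```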
Another piece of functionality is the capability to convert the factor probabilities to/from log space. This is important for MAP inference, since the log of a product of probabilities equals the sum of their logs, and sums are cheaper and more numerically stable than long chains of products.
In [32]:
f = Factor(bn,'Alarm')
print f.cpt
f.to_log()
print np.round(f.cpt,2)
f.from_log()
print f.cpt
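The identity that makes log space useful is log(p·q) = log p + log q: long chains of multiplications become sums, avoiding floating-point underflow. A quick numpy check with made-up numbers:

```python
import numpy as np

probs = np.array([0.95, 0.05, 0.29, 0.71])
logs = np.log(probs)

# Product in normal space equals the exp of the sum in log space
assert np.isclose(probs.prod(), np.exp(logs.sum()))

# Round-tripping through log space recovers the original probabilities
assert np.allclose(np.exp(logs), probs)
```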
Lastly, we have normalization. This function does most of its work behind the scenes, cleaning up the factor probabilities after multiplication or reduction. Still, it's an important function of which users should be aware.
In [34]:
f = Factor(bn, 'Alarm')
print f.cpt
f.cpt[0]=20
f.cpt[1]=20
f.cpt[4]=0.94
f.cpt[7]=0.15
print f.cpt
f.normalize()
print f.cpt
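Normalization here means making each conditional distribution, not the whole table, sum to one. Assuming the factor's own variable has stride 1, every consecutive block of card[var] entries is one such distribution. A sketch of the idea (not pyBN's actual code), using corrupted values like those in the cell above:

```python
import numpy as np

def normalize(cpt, k):
    # Each consecutive run of k entries is one conditional distribution
    arr = np.asarray(cpt, dtype=float).reshape(-1, k)
    return (arr / arr.sum(axis=1, keepdims=True)).ravel()

print(normalize([20, 20, 0.94, 0.15], 2))  # first pair -> [0.5, 0.5]
```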
That's all for factor operations with pyBN. While these functions are the behind-the-scenes drivers of most inference queries, it is still useful for users to see how they operate. These operations have all been optimized for speed so that inference queries can be as fast as possible.