In [1]:
from discrete_probability import *
Let's start with a simple three variable Bayesian network. First we need to define our variables.
In [2]:
E, H, S = variables = list(map(Variable, 'EHS'))
# equivalent to E, H, S = variables = [Variable('E'), Variable('H'), Variable('S')]
By default the values a variable can take are True and False.
In [3]:
print(E)
print(E.values)
An Assignment represents a variable taking on one of its values and can be made using the << operator.
In [4]:
e = E << True
print(e)
We can get all of the assignments for a variable using get_assignments()
which returns all of the assignments as an unpackable list.
In [5]:
e, e_ = E
h, h_ = H
s, s_ = S
print(h)
print(s_)
Now we'll define a small data set over these variables that we can use to learn a parameterization for our network.
In [6]:
data_header = [H, S, E]
data_samples = [
[True, False, True],
[True, False, True],
[False, True, False],
[False, False, True],
[True, False, False],
[True, False, True],
[False, False, False],
[True, False, True],
[True, False, True],
[False, False, True],
[True, False, True],
[True, True, True],
[True, False, True],
[True, True, True],
[True, False, True],
[True, False, True]]
Currently the learning routines work on samples represented as Assignments (though this will probably change later). We can use the helper function data_to_assignments to convert our data.
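Conceptually this just pairs each value in a sample with its header variable. A rough sketch of an equivalent conversion (my own illustration, not the library's actual implementation) might look like this:

# hypothetical stand-in for data_to_assignments
def to_assignments(header, samples):
    # build an Assignment for every (variable, value) pair in each sample
    return [[variable << value for variable, value in zip(header, sample)]
            for sample in samples]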
In [7]:
data_assignments = data_to_assignments(data_header, data_samples)
print(data_assignments)
Before creating our Bayesian network let's represent this data using a Table. This class can represent either a marginal or a conditional table depending on the construction arguments; a single argument creates a marginal table. For our three binary variables the marginal table has $2^3=8$ rows.
In [8]:
P = Table(variables)
P.learn_from_complete_data(data_assignments)
Out[8]:
Table provides methods for marginalization (marginalize_over, marginalize_out) and conditioning (condition, condition_on).
In [9]:
print(P.marginalize_over([E, S]))
In [10]:
print(P.marginalize_out([E, S]))
In [11]:
print(P.condition_on([H]))
In [12]:
print(P.condition([E], [H]))
Using these operations alone we can already answer quite a few queries. However, a much easier interface is provided that matches conventional notation. For example, to represent the marginal over $E$ given evidence $H=True$ and $S=False$ we often write $P(E\mid h,\bar{s})$. We can write this query in almost exactly the same way (appending an underscore in lieu of a bar) and get the marginal.
In [13]:
P(E|h,s_)
Out[13]:
Specific probabilities such as $P(e\mid h,\bar{s})$ can also be written.
In [14]:
P(e|h,s_)
Out[14]:
Unconditioned marginals can also be found such as $P(E,S)$.
In [15]:
P(E,S)
Out[15]:
In [16]:
P(e,s_)
Out[16]:
Note that those lower-case assignments were defined in cell [5]. If they are not defined we can make the assignments explicitly; for example, the query below is equivalent to the one above.
In [17]:
P(E<<True, S<<False)
Out[17]:
Now we'll create a simple Bayesian network. To define the graph we need the variables that will act as nodes and the edges between the nodes. A directed edge can be specified using < and >, with the operator pointing at the child (so S < H and H > E both make H the parent).
In [18]:
bn = BayesianNetwork(variables, [S < H, H > E])
If you're running this in an IPython Notebook you can view the graph using some slick d3.js
(based on http://bl.ocks.org/mbostock/1153292).
In [19]:
bn.display(180, 180)
Out[19]:
Since the JavaScript doesn't carry over nicely in the viewer, here's a screenshot of the graph.
In [20]:
from IPython.core.display import Image
Image(filename='hse_graph.png')
Out[20]:
At first our distribution is unparameterized, which means that the network is "invalid".
In [21]:
print(bn.is_valid)
We can parameterize it from the same data set.
In [22]:
bn.learn_from_complete_data(data_assignments)
print(bn.is_valid)
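Assuming the parameters are the usual maximum-likelihood (relative-frequency) estimates, we can check one of the learned family parameters, $P(e\mid h)$, by counting in the data ourselves:

# sanity check of P(e | h) by counting in data_samples (header order is [H, S, E])
n_h  = sum(1 for h, s, e in data_samples if h)        # samples with H = True
n_eh = sum(1 for h, s, e in data_samples if h and e)  # samples with H = True and E = True
print(float(n_eh) / n_h)                              # 11/12, should match the learned P(e | h)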
Unfortunately I haven't written any of the Bayesian network inference algorithms yet (variable elimination, factor elimination, belief propagation, stochastic sampling, network polynomial, etc.), but we can turn the network into a marginal Table and use that for inference. Converting to a full Table defeats the point of a Bayesian network (a compact representation), but it will let us do inference for now.
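To make clear what such a conversion involves, here is a small standalone sketch (plain Python, not the library's as_marginal_table implementation) that assembles the joint by the chain rule $P(h,s,e)=P(h)\,P(s\mid h)\,P(e\mid h)$. The resulting table has one entry per full assignment, which is exactly the exponential blow-up the network structure avoids:

from itertools import product

# CPTs for the structure H -> S, H -> E, using the relative frequencies from the data above
p_h  = {True: 12.0/16, False: 4.0/16}                   # P(H)
p_sh = {(True, True): 2.0/12, (True, False): 10.0/12,   # P(S | H), keyed by (h, s)
        (False, True): 1.0/4, (False, False): 3.0/4}
p_eh = {(True, True): 11.0/12, (True, False): 1.0/12,   # P(E | H), keyed by (h, e)
        (False, True): 2.0/4, (False, False): 2.0/4}

joint = {}
for h, s, e in product([True, False], repeat=3):
    joint[(h, s, e)] = p_h[h] * p_sh[(h, s)] * p_eh[(h, e)]

print(joint[(True, True, True)])  # ~0.1146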
In [23]:
Pb = bn.as_marginal_table()
print(Pb)
You might notice that this marginal table (above), although learned from the same data, is not the same as the marginal Table learned directly from the data (below). This is due to the additional conditional independence constraints imposed by the structure of the Bayesian network.
In [24]:
print(P)
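To make the difference concrete (again assuming relative-frequency estimates), compare the two joints at the single assignment $h,s,e$. The table learned directly from the data gives

$$P(h,s,e)=\frac{2}{16}=0.125,$$

while the network's factorization gives

$$P(h)\,P(s\mid h)\,P(e\mid h)=\frac{12}{16}\cdot\frac{2}{12}\cdot\frac{11}{12}\approx 0.115.$$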
Also note that this difference exists in spite of the two distributions sharing the same family marginals.
In [25]:
print(P(H))
print(Pb(H))
In [26]:
print(P(S|H))
print(Pb(S|H))
In [27]:
print(P(E|H))
print(Pb(E|H))
In fact, we can see the conditional independences implied by the Bayesian network. Using d-separation (and the Markovian assumption on families) we expect $E$ to be independent of $S$ given $H$; in other words, $P(E\mid s,H)=P(E\mid \bar{s},H)$. In the conditional table below we see that the probabilities for $E$ do not change with respect to $S$ because $H$ is given.
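To see why, note that under the network's factorization the terms involving $S$ cancel once $H$ is given:

$$P(e\mid s,h)=\frac{P(h)\,P(s\mid h)\,P(e\mid h)}{P(h)\,P(s\mid h)}=P(e\mid h).$$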
In [28]:
print(Pb(E|S,H))
In [29]:
print(Pb(E|h,s))
print(Pb(E|h,s_))
In [30]:
print(Pb(E|h_,s))
print(Pb(E|h_,s_))
However, when $H$ is not given we expect, due to the lack of d-separation (an open path through $H$), that $E$ is dependent on $S$. We see that this is the case in the conditional table below.
In [31]:
print(Pb(E|S))
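The dependence arises from marginalizing out the common parent:

$$P(e\mid s)=\sum_{h}P(h\mid s)\,P(e\mid h),$$

so observing $S$ changes our belief about $H$, which in turn shifts the distribution of $E$.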
Also note the lack of these conditional independences in the original marginal Table.
In [32]:
print(P(E|S,H))
Again, the Bayesian network imposes conditional independence constraints on the distribution. Specifically, this conditional independence arises from the lack of an edge between $S$ and $E$. If we were to represent the original distribution exactly as a Bayesian network, the resulting structure would be a complete graph (every node connected to every other node).
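That also shows why a complete structure gives up the compactness: a fully connected network over $n$ binary variables needs $\sum_{i=1}^{n}2^{\,i-1}=2^{n}-1$ free parameters, exactly as many as the full joint table, whereas the three-node network above needs only $1+2+2=5$.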