Bayes Nets

A Julia package for Bayesian Networks

A Bayesian Network (BN) represents a probability distribution over a set of variables, $P(x_1, x_2, \ldots, x_n)$. They leverage variable relations in order to efficiently decompose the joint distribution into smaller conditional probability distributions.

A BN is defined by a directed acyclic graph and a set of conditional probability distributions. Each node in the graph corresponds to a variable $x_i$ and is associated with a conditional probability distribution $P(x_i \mid \text{parents}(x_i))$.

Installation


In [ ]:
Pkg.add("BayesNets");

Visualization of network structure is provided by the TikzGraphs package. Installation requirements (e.g., PGF/Tikz and pdf2svg) are provided here.

Use


In [1]:
srand(0) # seed the random number generator to 0, for a reproducible demonstration
using BayesNets


INFO: Recompiling stale cache file /home/tim/.julia/lib/v0.5/BayesNets.ji for module BayesNets.

Representation

Bayesian Networks are represented with the BayesNet type. This type contains the directed acyclic graph (a LightTables.DiGraph) and a list of conditional probability distributions (a list of CPDs)

Here we construct the BayesNet $a \rightarrow b$, with Gaussians $a$ and $b$:

$$ a = \mathcal{N}(0,1) \qquad b = \mathcal{N}(2a +3,1) $$

In [2]:
bn = BayesNet()
push!(bn, StaticCPD(:a, Normal(1.0)))
push!(bn, LinearGaussianCPD(:b, [:a], [2.0], 3.0, 1.0))


Out[2]:

Conditional Probability Distributions

Conditional Probablity Distributions, $P(x_i \mid \text{parents}(x_i))$, are defined in BayesNets.CPDs. Each CPD knows its own name, the names of its parents, and is associated with a distribution from Distributions.jl.

CPDForm Description
StaticCPD Any Distributions.distribution; indepedent of any parents
FunctionalCPD Allows for a quick-and-dirty CPD with a custom eval function
CategoricalCPD Categorical distribution, assumes integer parents in $1:N$
LinearGaussianCPD Linear Gaussian, assumes target and parents are numeric
ConditionalLinearGaussianCPD A linear Gaussian for each discrete parent instantiation

Each CPD can be learned from data using fit.

Here we learn the same network as above.


In [3]:
a = randn(100)
b = randn(100) .+ 2*a .+ 3

data = DataFrame(a=a, b=b)
cpdA = fit(StaticCPD{Normal}, data, :a)
cpdB = fit(LinearGaussianCPD, data, :b, [:a])

bn2 = BayesNet([cpdA, cpdB])


Out[3]:

Each CPD implements four functions:

  • name(cpd) - obtain the name of the variable target variable
  • parents(cpd) - obtain the list of parents
  • nparams(cpd - obtain the number of free parameters in the CPD
  • cpd(assignment) - allows calling cpd() to obtain the conditional distribution
  • Distributions.fit(Type{CPD}, data, target, parents)

In [5]:
cpdB(:a=>0.5)


Out[5]:
Distributions.Normal{Float64}(μ=3.7835395874388134, σ=2.236109637835203)

Several functions conveniently condition and then produce their return values:


In [6]:
rand(cpdB, :a=>0.5) # condition and then sample
pdf(cpdB, :a=>1.0, :b=>3.0) # condition and then compute pdf(distribution, 3)
logpdf(cpdB, :a=>1.0, :b=>3.0) # condition and then compute logpdf(distribution, 3);

The NamedCategorical distribution allows for String or Symbol return values. The FunctionalCPD allows for crafting quick and simple CPDs:


In [7]:
bn2 = BayesNet()
push!(bn2, StaticCPD(:sighted, NamedCategorical([:bird, :plane, :superman], [0.40, 0.55, 0.05])))
push!(bn2, FunctionalCPD{Bernoulli}(:happy, [:sighted], a->Bernoulli(a == :superman ? 0.95 : 0.2)))


Out[7]:

Likelihood

A Bayesian Network represents a joint probability distribution, $P(x_1, x_2, \ldots, x_n)$. Assignments are represented as dictionaries mapping variable names (Symbols) to variable values. We can evaluate probabilities as we would with Distributions.jl, only we use exclamation points as we modify the internal state when we condition:


In [8]:
pdf(bn, :a=>0.5, :b=>2.0) # evaluate the probability density


Out[8]:
0.01900834726778591

We can also evaluate the likelihood of a dataset:


In [9]:
data = DataFrame(a=[0.5,1.0,2.0], b=[4.0,5.0,7.0])
pdf(bn, data)    #  0.00215
logpdf(bn, data) # -6.1386;

Or the likelihood for a particular cpd:


In [10]:
pdf(cpdB, data)    #  0.006
logpdf(cpdB, data) # -5.201


Out[10]:
-5.200981176187056

Sampling

Assignments can be sampled from a BayesNet.


In [11]:
rand(bn)


Out[11]:
Dict{Symbol,Any} with 2 entries:
  :a => 1.06468
  :b => 6.03699

In [12]:
rand(bn, 5)


Out[12]:
ab
10.76230399659611613.4917223005147227
2-0.80055017339008790.23443764725795835
32.21158109549336377.515865521050183
4-0.273598490270001943.7133964333979224
5-0.425156346926224062.4710294668620785

Samples can be drawn that are consistent with a provided assignment:


In [13]:
bn = BayesNet()
push!(bn, StaticCPD(:a, Categorical([0.3,0.7])))
push!(bn, StaticCPD(:b, Categorical([0.6,0.4])))
push!(bn, CategoricalCPD{Bernoulli}(:c, [:a, :b], [2,2], [Bernoulli(0.1), Bernoulli(0.2), Bernoulli(1.0), Bernoulli(0.4)]))


Out[13]:

In [14]:
rand(bn, 5, Assignment(:c=>1))


Out[14]:
abc
1221
2211
3211
4211
5221

One can also use weighted sampling:


In [15]:
rand_table_weighted(bn; nsamples=5, consistent_with=Assignment(:c=>1))


Out[15]:
abcp
11110.09090909090909091
22110.18181818181818182
32110.18181818181818182
42210.36363636363636365
52110.18181818181818182

Parameter Learning

BayesNets.jl supports parameter learning for an entire graph.


In [16]:
# specify each node's CPD type individually
fit(BayesNet, data, (:a=>:b), [StaticCPD{Normal}, LinearGaussianCPD])


Out[16]:

In [17]:
# specify a single CPD type for all nodes
fit(BayesNet, data, (:a=>:b), LinearGaussianCPD)


Out[17]:

Fitting can be done for specific BayesNets types as well:


In [18]:
data = DataFrame(c=[1,1,1,1,2,2,2,2,3,3,3,3], 
                 b=[1,1,1,2,2,2,2,1,1,2,1,1],
                 a=[1,1,1,2,1,1,2,1,1,2,1,1])

fit(DiscreteBayesNet, data, (:a=>:b, :a=>:c, :b=>:c))


Out[18]:

Fitting a DiscreteCPD, which is a CategoricalCPD{Categorical}, can be done with a specified number of categories. This prevents cases where your test data does not provide an example for every category.


In [20]:
cpd = fit(DiscreteCPD, DataFrame(a=[1,2,1,2,2]), :a, ncategories=3);
cpd = fit(DiscreteCPD, data, :b, [:a], parental_ncategories=[3], target_ncategories=3);

Structure Learning

Structure learning can be done as well.


In [22]:
using Discretizers
using RDatasets
iris = dataset("datasets", "iris")
names(iris)
data = DataFrame(
    SepalLength = iris[:SepalLength],
    SepalWidth = iris[:SepalWidth],
    PetalLength = iris[:PetalLength],
    PetalWidth = iris[:PetalWidth],
    Species = encode(CategoricalDiscretizer(iris[:Species]), iris[:Species]),
)
data[1:3,:] # only display a subset...


INFO: Precompiling module RData.
Out[22]:
SepalLengthSepalWidthPetalLengthPetalWidthSpecies
15.13.51.40.21
24.93.01.40.21
34.73.21.30.21

In [23]:
typealias CLG ConditionalLinearGaussianCPD
params = K2GraphSearch([:Species, :SepalLength, :SepalWidth, :PetalLength, :PetalWidth], 
                       [StaticCPD{Categorical}, CLG, CLG, CLG, CLG],
                       max_n_parents=2)
bn = fit(BayesNet, data, params)
bn


Out[23]:

A ScoringFunction allows for extracting a scoring metric for a CPD given data. The negative BIC score is implemented in NegativeBayesianInformationCriterion.

A GraphSearchStrategy defines a structure learning algorithm. The K2 algorithm is defined through K2GraphSearch and GreedyHillClimbing is implemented for discrete Bayesian networks and the Bayesian score:


In [24]:
data = DataFrame(c=[1,1,1,1,2,2,2,2,3,3,3,3], 
                 b=[1,1,1,2,2,2,2,1,1,2,1,1],
                 a=[1,1,1,2,1,1,2,1,1,2,1,1])
params = GreedyHillClimbing(ScoreComponentCache(data), max_n_parents=3, prior=UniformPrior())
bn = fit(DiscreteBayesNet, data, params)


Out[24]:

A whole suite of features are supported for DiscreteBayesNets.


In [25]:
bayesian_score(bn, data, params.prior) # compute the Bayesian score of the data under the BayesNet


Out[25]:
-31.28804624550449

In [26]:
count(bn, :a, data) # obtain a list of counts for the node


Out[26]:
acount
119
223

In [27]:
statistics(bn.dag, data) # sufficient statistics from a discrete dataset


Out[27]:
3-element Array{Array{Int64,2},1}:
 [4; 4; 4]                 
 [3 1 3; 1 3 1]            
 [3 1 3 0 2 0; 0 0 0 1 1 1]

In [28]:
table(bn, :b) # obtain the factor table for a node


Out[28]:
abp
1110.7777777777777778
2210.0
3120.2222222222222222
4221.0

In [29]:
table(bn, :c, :a=>1) # obtain a factor table matching a particular assignment


Out[29]:
bacp
11110.42857142857142855
22110.0
31120.14285714285714285
42121.0
51130.42857142857142855
62130.0

Reading from XDSL

One can read discrete Bayesian networks from the .XDSL file format.


In [30]:
bn = readxdsl(Pkg.dir("BayesNets", "test", "sample_bn.xdsl"))


Out[30]:

In [ ]: