A Julia package for Bayesian Networks
A Bayesian Network (BN) represents a probability distribution over a set of variables, $P(x_1, x_2, \ldots, x_n)$. They leverage variable relations in order to efficiently decompose the joint distribution into smaller conditional probability distributions.
A BN is defined by a directed acyclic graph and a set of conditional probability distributions. Each node in the graph corresponds to a variable $x_i$ and is associated with a conditional probability distribution $P(x_i \mid \text{parents}(x_i))$.
In [ ]:
Pkg.add("BayesNets");
Visualization of network structure is provided by the TikzGraphs package. Installation requirements (e.g., PGF/Tikz and pdf2svg) are provided here.
In [1]:
srand(0) # seed the random number generator to 0, for a reproducible demonstration
using BayesNets
Bayesian Networks are represented with the BayesNet
type. This type contains the directed acyclic graph (a LightTables.DiGraph) and a list of conditional probability distributions (a list of CPDs)
Here we construct the BayesNet $a \rightarrow b$, with Gaussians $a$ and $b$:
$$ a = \mathcal{N}(0,1) \qquad b = \mathcal{N}(2a +3,1) $$
In [2]:
bn = BayesNet()
push!(bn, StaticCPD(:a, Normal(1.0)))
push!(bn, LinearGaussianCPD(:b, [:a], [2.0], 3.0, 1.0))
Out[2]:
Conditional Probablity Distributions, $P(x_i \mid \text{parents}(x_i))$, are defined in BayesNets.CPDs. Each CPD knows its own name, the names of its parents, and is associated with a distribution from Distributions.jl.
CPDForm |
Description |
---|---|
StaticCPD |
Any Distributions.distribution ; indepedent of any parents |
FunctionalCPD |
Allows for a quick-and-dirty CPD with a custom eval function |
CategoricalCPD |
Categorical distribution, assumes integer parents in $1:N$ |
LinearGaussianCPD |
Linear Gaussian, assumes target and parents are numeric |
ConditionalLinearGaussianCPD |
A linear Gaussian for each discrete parent instantiation |
Each CPD can be learned from data using fit
.
Here we learn the same network as above.
In [3]:
a = randn(100)
b = randn(100) .+ 2*a .+ 3
data = DataFrame(a=a, b=b)
cpdA = fit(StaticCPD{Normal}, data, :a)
cpdB = fit(LinearGaussianCPD, data, :b, [:a])
bn2 = BayesNet([cpdA, cpdB])
Out[3]:
Each CPD
implements four functions:
name(cpd)
- obtain the name of the variable target variableparents(cpd)
- obtain the list of parentsnparams(cpd
- obtain the number of free parameters in the CPDcpd(assignment)
- allows calling cpd()
to obtain the conditional distributionDistributions.fit(Type{CPD}, data, target, parents)
In [5]:
cpdB(:a=>0.5)
Out[5]:
Several functions conveniently condition and then produce their return values:
In [6]:
rand(cpdB, :a=>0.5) # condition and then sample
pdf(cpdB, :a=>1.0, :b=>3.0) # condition and then compute pdf(distribution, 3)
logpdf(cpdB, :a=>1.0, :b=>3.0) # condition and then compute logpdf(distribution, 3);
The NamedCategorical distribution allows for String or Symbol return values. The FunctionalCPD allows for crafting quick and simple CPDs:
In [7]:
bn2 = BayesNet()
push!(bn2, StaticCPD(:sighted, NamedCategorical([:bird, :plane, :superman], [0.40, 0.55, 0.05])))
push!(bn2, FunctionalCPD{Bernoulli}(:happy, [:sighted], a->Bernoulli(a == :superman ? 0.95 : 0.2)))
Out[7]:
A Bayesian Network represents a joint probability distribution, $P(x_1, x_2, \ldots, x_n)$. Assignments are represented as dictionaries mapping variable names (Symbols) to variable values. We can evaluate probabilities as we would with Distributions.jl, only we use exclamation points as we modify the internal state when we condition:
In [8]:
pdf(bn, :a=>0.5, :b=>2.0) # evaluate the probability density
Out[8]:
We can also evaluate the likelihood of a dataset:
In [9]:
data = DataFrame(a=[0.5,1.0,2.0], b=[4.0,5.0,7.0])
pdf(bn, data) # 0.00215
logpdf(bn, data) # -6.1386;
Or the likelihood for a particular cpd:
In [10]:
pdf(cpdB, data) # 0.006
logpdf(cpdB, data) # -5.201
Out[10]:
In [11]:
rand(bn)
Out[11]:
In [12]:
rand(bn, 5)
Out[12]:
Samples can be drawn that are consistent with a provided assignment:
In [13]:
bn = BayesNet()
push!(bn, StaticCPD(:a, Categorical([0.3,0.7])))
push!(bn, StaticCPD(:b, Categorical([0.6,0.4])))
push!(bn, CategoricalCPD{Bernoulli}(:c, [:a, :b], [2,2], [Bernoulli(0.1), Bernoulli(0.2), Bernoulli(1.0), Bernoulli(0.4)]))
Out[13]:
In [14]:
rand(bn, 5, Assignment(:c=>1))
Out[14]:
One can also use weighted sampling:
In [15]:
rand_table_weighted(bn; nsamples=5, consistent_with=Assignment(:c=>1))
Out[15]:
In [16]:
# specify each node's CPD type individually
fit(BayesNet, data, (:a=>:b), [StaticCPD{Normal}, LinearGaussianCPD])
Out[16]:
In [17]:
# specify a single CPD type for all nodes
fit(BayesNet, data, (:a=>:b), LinearGaussianCPD)
Out[17]:
Fitting can be done for specific BayesNets types as well:
In [18]:
data = DataFrame(c=[1,1,1,1,2,2,2,2,3,3,3,3],
b=[1,1,1,2,2,2,2,1,1,2,1,1],
a=[1,1,1,2,1,1,2,1,1,2,1,1])
fit(DiscreteBayesNet, data, (:a=>:b, :a=>:c, :b=>:c))
Out[18]:
Fitting a DiscreteCPD
, which is a CategoricalCPD{Categorical}
, can be done with a specified number of categories. This prevents cases where your test data does not provide an example for every category.
In [20]:
cpd = fit(DiscreteCPD, DataFrame(a=[1,2,1,2,2]), :a, ncategories=3);
cpd = fit(DiscreteCPD, data, :b, [:a], parental_ncategories=[3], target_ncategories=3);
In [22]:
using Discretizers
using RDatasets
iris = dataset("datasets", "iris")
names(iris)
data = DataFrame(
SepalLength = iris[:SepalLength],
SepalWidth = iris[:SepalWidth],
PetalLength = iris[:PetalLength],
PetalWidth = iris[:PetalWidth],
Species = encode(CategoricalDiscretizer(iris[:Species]), iris[:Species]),
)
data[1:3,:] # only display a subset...
Out[22]:
In [23]:
typealias CLG ConditionalLinearGaussianCPD
params = K2GraphSearch([:Species, :SepalLength, :SepalWidth, :PetalLength, :PetalWidth],
[StaticCPD{Categorical}, CLG, CLG, CLG, CLG],
max_n_parents=2)
bn = fit(BayesNet, data, params)
bn
Out[23]:
A ScoringFunction
allows for extracting a scoring metric for a CPD given data.
The negative BIC score is implemented in NegativeBayesianInformationCriterion.
A GraphSearchStrategy
defines a structure learning algorithm.
The K2 algorithm is defined through K2GraphSearch
and GreedyHillClimbing
is implemented for discrete Bayesian networks and the Bayesian score:
In [24]:
data = DataFrame(c=[1,1,1,1,2,2,2,2,3,3,3,3],
b=[1,1,1,2,2,2,2,1,1,2,1,1],
a=[1,1,1,2,1,1,2,1,1,2,1,1])
params = GreedyHillClimbing(ScoreComponentCache(data), max_n_parents=3, prior=UniformPrior())
bn = fit(DiscreteBayesNet, data, params)
Out[24]:
A whole suite of features are supported for DiscreteBayesNets.
In [25]:
bayesian_score(bn, data, params.prior) # compute the Bayesian score of the data under the BayesNet
Out[25]:
In [26]:
count(bn, :a, data) # obtain a list of counts for the node
Out[26]:
In [27]:
statistics(bn.dag, data) # sufficient statistics from a discrete dataset
Out[27]:
In [28]:
table(bn, :b) # obtain the factor table for a node
Out[28]:
In [29]:
table(bn, :c, :a=>1) # obtain a factor table matching a particular assignment
Out[29]:
In [30]:
bn = readxdsl(Pkg.dir("BayesNets", "test", "sample_bn.xdsl"))
Out[30]:
In [ ]: