Using otagrum


In [1]:
import openturns as ot
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb

Importing the otagrum module


In [2]:
import otagrum as otagr

Creating the CBN structure

We begin by creating the CBN that will be used throughout this example.

To do so, we need a NamedDAG structure...


In [3]:
dag = gum.DAG()

In [4]:
mapping = {}
mapping['A'] = dag.addNode() # Add node A
mapping['B'] = dag.addNode() # Add node B
mapping['C'] = dag.addNode() # Add node C
mapping['D'] = dag.addNode() # Add node D

In [5]:
dag.addArc(mapping['A'], mapping['C']) # Arc A -> C
dag.addArc(mapping['B'], mapping['C']) # Arc B -> C
dag.addArc(mapping['C'], mapping['D']) # Arc C -> D

In [6]:
dag


Out[6]:
[DAG: 0 -> 2, 1 -> 2, 2 -> 3]

In [7]:
structure = otagr.NamedDAG(dag, list(mapping.keys()))

In [8]:
gnb.showDot(structure.toDot())


[DAG: A -> C, B -> C, C -> D]

Parameters of the CBN

... and a collection of local conditional copulas.


In [9]:
lcc_list = []  # local conditional copulas, one per node
for i in range(structure.getSize()):
    # The local copula links a node to its parents:
    # its dimension is the number of parents plus one.
    dim_lcc = structure.getParents(i).getSize() + 1
    R = ot.CorrelationMatrix(dim_lcc)
    for j in range(dim_lcc):
        for k in range(j):
            R[j, k] = 0.6  # constant pairwise correlation
    # Gaussian copula extracted from a correlated standard normal distribution
    lcc_list.append(ot.Normal([0.0] * dim_lcc, [1.0] * dim_lcc, R).getCopula())
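
As a quick sanity check, each local copula should have dimension equal to the number of parents of its node plus one: 1 for A and B (no parents), 3 for C (parents A and B) and 2 for D (parent C).

print([c.getDimension() for c in lcc_list])  # expected: [1, 1, 3, 2]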

Creation of the CBN

Now that we have a NamedDAG structure and a collection of local conditional copulas, we can construct a CBN.


In [10]:
cbn = otagr.ContinuousBayesianNetwork(structure, lcc_list)

Sampling from CBN

Having a CBN, we can now sample from it.


In [11]:
ot.RandomGenerator.SetSeed(10)  # set the random seed for reproducibility
sample = cbn.getSample(1000)
train = sample[:-100]  # first 900 points for learning
test = sample[-100:]   # last 100 points for validation

Learning the structure with continuous PC

Now that we have data, we can use it to learn the structure with the continuous PC algorithm.


In [12]:
# maxConditioningSetSize bounds the size of the conditioning sets used in the
# independence tests; alpha is the significance level of those tests.
learner = otagr.ContinuousPC(sample, maxConditioningSetSize=5, alpha=0.1)

We first learn the skeleton, that is the undirected structure.


In [13]:
skeleton = learner.learnSkeleton()

In [14]:
skeleton


Out[14]:
[Skeleton: 0 -- 2, 1 -- 2, 2 -- 3]

Then we look for the v-structures (patterns A → C ← B where A and B are not adjacent), leading to a Partially Directed Acyclic Graph (PDAG).


In [15]:
pdag = learner.learnPDAG()

In [16]:
pdag


Out[16]:
[PDAG: 0 -> 2, 1 -> 2, 2 -> 3]

Finally, the remaining edges are oriented by propagating constraints.


In [17]:
ndag = learner.learnDAG()

In [18]:
gnb.showDot(ndag.toDot())


[DAG: A -> C, B -> C, C -> D]

The true structure has been recovered.
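
This can be checked programmatically, for instance by comparing the parent set of each node by name. The sketch below assumes NamedDAG exposes getDescription() (the node names) in addition to the getSize() and getParents() accessors used earlier.

def parent_sets(named_dag):
    # Map each node name to the sorted list of its parents' names
    names = named_dag.getDescription()
    return {names[i]: sorted(names[int(p)] for p in named_dag.getParents(i))
            for i in range(named_dag.getSize())}

print(parent_sets(ndag) == parent_sets(structure))  # True if the structure is recovered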

Learning with continuous MIIC

Otagrum provides another learning algorithm to learn the structure: the continuous MIIC algorithm.


In [19]:
learner = otagr.ContinuousMIIC(sample)

This algorithm relies on the computation of mutual information, which is done through the copula function; hence, a copula model for the data is needed. The continuous MIIC algorithm can use either Gaussian copulas (parametric) or Bernstein copulas (non-parametric) to compute mutual information. Moreover, due to the finite sample size, the mutual information estimators need to be corrected. Two kinds of correction are provided: NoCorr (no correction) and Naive (a fixed correction subtracted from the raw mutual information estimators). These behaviours can be changed as follows:


In [20]:
#learner.setCMode(otagr.CorrectedMutualInformation.CModeTypes_Bernstein) # By default
learner.setCMode(otagr.CorrectedMutualInformation.CModeTypes_Gaussian) # To use Gaussian copulas
learner.setKMode(otagr.CorrectedMutualInformation.KModeTypes_Naive) # By default
#learner.setKMode(otagr.CorrectedMutualInformation.KModeTypes_NoCorr) # To use the raw estimators
learner.setAlpha(0.01) # Set the correction value for the Naive behaviour (0.01 by default)

As with the PC algorithm, we can learn the skeleton, the PDAG and the DAG:


In [21]:
skeleton = learner.learnSkeleton()

In [22]:
skeleton


Out[22]:
[Skeleton: 0 -- 2, 1 -- 2, 2 -- 3]

In [23]:
pdag = learner.learnPDAG()

In [24]:
pdag


Out[24]:
[PDAG: 0 -> 2, 1 -> 2, 2 -> 3]

In [25]:
dag = learner.learnDAG()

In [26]:
gnb.showDot(dag.toDot())


[DAG: A -> C, B -> C, C -> D]

Learning parameters

Bernstein copulas are used to learn the local conditional copulas associated with each node.


In [27]:
lcc_list = []
for i in range(train.getDimension()):
    # Node i together with its parents in the learned structure
    indices = [i] + [int(n) for n in ndag.getParents(i)]
    dim_lcc = len(indices)
    if dim_lcc == 1:
        # No parent: the local copula is the (univariate) uniform copula
        bernsteinCopula = ot.Uniform(0.0, 1.0).getCopula()
    else:
        # Bin count K chosen from the sample size and the dimension
        K = otagr.ContinuousTTest.GetK(len(train), dim_lcc)
        bernsteinCopula = ot.EmpiricalBernsteinCopula(train.getMarginal(indices), K, True)
    lcc_list.append(bernsteinCopula)

We can now create the learned CBN.


In [28]:
lcbn = otagr.ContinuousBayesianNetwork(ndag, lcc_list) # Learned CBN

And compare the mean log-likelihood of the true and the learned models on the test sample.


In [29]:
def compute_mean_LL(cbn, test):
    # Average log-density of the model over the test sample
    ll = 0.0
    for t in test:
        ll += cbn.computeLogPDF(t)
    ll /= len(test)
    return ll
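
Since ContinuousBayesianNetwork follows the ot.Distribution interface, the same quantity can presumably be computed in one call (an untested shortcut, assuming computeLogPDF accepts a whole Sample and returns a Sample of log-densities):

mean_ll = cbn.computeLogPDF(test).computeMean()[0]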

In [30]:
true_LL = compute_mean_LL(cbn, test)
print(true_LL)


0.316209061434805

In [31]:
exp_LL = compute_mean_LL(lcbn, test)
print(exp_LL)


0.1289760450076917
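
As expected, the mean log-likelihood of the learned model on the test sample is lower than that of the true model; the gap quantifies the loss incurred by estimating both the structure and the local copulas from data:

print(true_LL - exp_LL)  # positive gap: the true model fits the test data better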