In [1]:
import openturns as ot
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
Importing the otagrum module
In [2]:
import otagrum as otagr
Creating the CBN structure
We begin by creating the CBN that will be used throughout this example.
To do so, we need a NamedDAG structure...
In [3]:
dag = gum.DAG()
In [4]:
mapping = {}
mapping['A'] = dag.addNode() # Add node A
mapping['B'] = dag.addNode() # Add node B
mapping['C'] = dag.addNode() # Add node C
mapping['D'] = dag.addNode() # Add node D
In [5]:
dag.addArc(mapping['A'], mapping['C']) # Arc A -> C
dag.addArc(mapping['B'], mapping['C']) # Arc B -> C
dag.addArc(mapping['C'], mapping['D']) # Arc C -> D
In [6]:
dag
Out[6]:
In [7]:
structure = otagr.NamedDAG(dag, list(mapping.keys()))
In [8]:
gnb.showDot(structure.toDot())
Parameters of the CBN
... and a collection of local conditional copulas.
In [9]:
lcc_list = [] # Local Conditional Copulas
for i in range( structure.getSize() ):
dim_lcc = structure.getParents(i).getSize() + 1
R = ot.CorrelationMatrix(dim_lcc)
for j in range(dim_lcc):
for k in range(j):
R[j, k] = 0.6
lcc_list.append( ot.Normal([0.0]*dim_lcc, [1.0]*dim_lcc, R).getCopula() )
Creation of the CBN
Now that we have a NamedDAG structure and a collection of local conditional copulas, we can construct a CBN.
In [10]:
cbn = otagr.ContinuousBayesianNetwork(structure, lcc_list)
Sampling from CBN
Having a CBN, we can now sample from it.
In [11]:
ot.RandomGenerator.SetSeed(10) # Set random seed
sample = cbn.getSample(1000)
train = sample[:-100]
test = sample[-100:]
Learning the structure with continuous PC
Now that we have data, we can use it to learn the structure with the continuous PC algorithm.
In [12]:
learner = otagr.ContinuousPC(sample, maxConditioningSetSize=5, alpha=0.1)
We first learn the skeleton, that is the undirected structure.
In [13]:
skeleton = learner.learnSkeleton()
In [14]:
skeleton
Out[14]:
Then we look for the v-structures, leading to a Partially Directed Acyclic Graph (PDAG)
In [15]:
pdag = learner.learnPDAG()
In [16]:
pdag
Out[16]:
Finally, the remaining edges are oriented by propagating constraints
In [17]:
ndag = learner.learnDAG()
In [18]:
gnb.showDot(ndag.toDot())
The true structure has been recovered.
Otagrum provides another learning algorithm to learn the structure: the continuous MIIC algorithm.
In [19]:
learner = otagr.ContinuousMIIC(sample)
This algorithm relies on the computing of mutual information which is done through the copula function. Hence, a copula model for the data is needed. The continuous MIIC algorithm can make use of Gaussian copulas (parametric) or Bernstein copulas (non-parametric) to compute mutual information. Moreover, due to finite sampling size, the mutual information estimators need to be corrected. Two kind of correction are provided: NoCorr (no correction) or Naive (a fixed correction is substracted from the raw mutual information estimators). Those behaviours can be changed as follows:
In [20]:
#learner.setCMode(otagr.CorrectedMutualInformation.CModeTypes_Bernstein) # By default
learner.setCMode(otagr.CorrectedMutualInformation.CModeTypes_Gaussian) # To use Gaussian copulas
learner.setKMode(otagr.CorrectedMutualInformation.KModeTypes_Naive) # By default
#learner.setKMode(otagr.CorrectedMutualInformation.KModeTypes_NoCorr) # To use the raw estimators
learner.setAlpha(0.01) # Set the correction value for the Naive behaviour, it is set to 0.01 by default
As with PC algorithm we can learn the skeleton, PDAG and DAG using
In [21]:
skeleton = learner.learnSkeleton()
In [22]:
skeleton
Out[22]:
In [23]:
pdag = learner.learnPDAG()
In [24]:
pdag
Out[24]:
In [25]:
dag = learner.learnDAG()
In [26]:
gnb.showDot(dag.toDot())
Learning parameters
Bernstein copulas are used to learn the local conditional copulas associated to each node
In [27]:
lcc_list = []
for i in range(train.getDimension()):
indices = [i] + [int(n) for n in ndag.getParents(i)]
dim_lcc = len(indices)
if dim_lcc == 1:
bernsteinCopula = ot.Uniform(0.0, 1.0).getCopula()
elif dim_lcc > 1:
K = otagr.ContinuousTTest.GetK(len(train), dim_lcc)
bernsteinCopula = ot.EmpiricalBernsteinCopula(train.getMarginal(indices), K, True)
lcc_list.append(bernsteinCopula)
We can now create the learned CBN
In [28]:
lcbn = otagr.ContinuousBayesianNetwork(ndag, lcc_list) # Learned CBN
And compare the mean loglikelihood between the true and the learned models
In [29]:
def compute_mean_LL(cbn, test):
ll = 0
for t in test:
ll += cbn.computeLogPDF(t)
ll /= len(test)
return ll
In [30]:
true_LL = compute_mean_LL(cbn, test)
print(true_LL)
In [31]:
exp_LL = compute_mean_LL(lcbn, test)
print(exp_LL)