Module `hierarchical` in `bayesiantests`

The module compares the performance of two classifiers that have been assessed by *m* runs of *k*-fold cross-validation on *q* datasets. It returns the probabilities that, based on the measured performance, one model is better than the other, or vice versa, or that they are within the region of practical equivalence.

This notebook demonstrates the use of the module.
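The input is therefore a matrix of score differences, with one row per dataset and one column per cross-validation measurement (the layout assumed here). As a minimal, hedged sketch of how such a matrix could be assembled, where the arrays `acc_first` and `acc_second` are hypothetical placeholders and not part of the module:

```
import numpy as np

# Hypothetical per-fold accuracies: q = 10 datasets, m*k = 10*10 = 100 measurements per dataset.
acc_first = np.random.rand(10, 100)   # first (left) classifier
acc_second = np.random.rand(10, 100)  # second (right) classifier

# Differences: positive values mean the second (right) classifier scored higher.
scores = acc_second - acc_first
```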

In [1]:

```
import numpy as np
scores = np.loadtxt('Data/diffNbcHnb.csv', delimiter=',')
names = ("HNB", "NBC")
print(scores)
```
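Before running the test, it can be useful to confirm the layout of the loaded matrix; a quick check, assuming (as above) one row per dataset and one column per cross-validation measurement:

```
# Number of datasets (rows) and of cross-validation measurements per dataset (columns),
# e.g. 10 runs x 10 folds = 100 columns.
print(scores.shape)
```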

To analyse this data, we will use the function `hierarchical` in the module `bayesiantests`, which accepts the following arguments.

```
scores: a 2-d array of differences.
rope: the region of practical equivalence. We consider two classifiers equivalent if the difference in their
performance is smaller than rope.
rho: correlation due to cross-validation
names: the names of the two classifiers; if x is a vector of differences, positive values mean that the second
(right) model had a higher score.
```

The `hierarchical` function uses **Stan** through the Python module **pystan**.
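Since the model is compiled and sampled with Stan, `pystan` must be importable in the environment; a quick check (plain PyStan code, nothing specific to `bayesiantests`):

```
# Verify that the Stan backend is available; this raises ImportError if pystan is missing.
import pystan
print(pystan.__version__)
```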

In [8]:

```
import bayesiantests as bt
rope = 0.01  # we consider two classifiers equivalent when the difference in accuracy is less than 1%
rho = 1/10   # we are performing 10-fold, 10-run cross-validation
pleft, prope, pright = bt.hierarchical(scores, rope, rho)
```

The first value (`pleft`) is the probability that the difference of accuracies is negative (and, therefore, in favor of HNB). The third value (`pright`) is the probability that the difference of accuracies is positive (and, therefore, in favor of NBC). The second value (`prope`) is the probability that the two classifiers are practically equivalent, i.e., that the difference lies within the rope.

In the above case, HNB performs better than the naive Bayes classifier with a probability of 0.9965, and the two are practically equivalent with a probability of 0.002. We can therefore conclude, with high probability, that HNB is better than NBC.
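Since the three values partition the posterior probability, a quick sanity check is that they sum (up to Monte Carlo rounding) to one:

```
# pleft + prope + pright should be approximately 1
print(pleft + prope + pright)
```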

If we add the arguments `verbose` and `names`, the function also prints out the probabilities.

In [9]:

```
pl, pe, pr = bt.hierarchical(scores, rope, rho, verbose=True, names=names)
```

The posterior distribution can be plotted:

- using the function `hierarchical_MC(scores, rope, rho, names=names)` we generate the samples of the posterior;
- using the function `plot_posterior(samples, names=('C1', 'C2'))` we then plot the posterior in the probability simplex.

In [10]:

```
%matplotlib inline
import matplotlib.pyplot as plt
samples = bt.hierarchical_MC(scores, rope, rho, names=names)
#plt.rcParams['figure.facecolor'] = 'black'
fig = bt.plot_posterior(samples, names)
plt.savefig('triangle_hierarchical.png', facecolor="black")
plt.show()
```
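If one wants to work with the Monte Carlo samples directly, here is an illustrative sketch; it assumes that `samples` is an array with one row per posterior sample and three columns giving the probabilities of the left, rope, and right regions (check the module's source if the layout differs), and it merely reports the fraction of samples in which each region dominates:

```
import numpy as np

# Fraction of posterior samples in which each region (left, rope, right) has the
# largest probability -- an illustrative summary, not necessarily how
# bt.hierarchical aggregates its output.
dominant = np.argmax(samples, axis=1)
print([float(np.mean(dominant == i)) for i in range(3)])
```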
