This is a companion Jupyter notebook to the work Multivariate Dependencies Beyond Shannon Information by Ryan G. James and James P. Crutchfield. This worksheet was written by Ryan G. James. It primarily makes use of the dit
package for information theory calculations.
In [ ]:
from __future__ import division, print_function
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from dit import ditParams, Distribution
from dit.distconst import uniform
ditParams['repr.print'] = ditParams['print.exact'] = True
In [ ]:
from dit.example_dists.mdbsi import dyadic, triadic
dists = [('dyadic', dyadic), ('triadic', triadic)]
In [ ]:
from dit.profiles import ExtropyPartition, ShannonPartition
def print_partition(dists, partition):
    ps = [str(partition(dist)).split('\n') for _, dist in dists]
    print('\t' + '\t\t\t\t'.join(name for name, _ in dists))
    for lines in zip(*ps):
        print('\t\t'.join(lines))
In [ ]:
print_partition(dists, ShannonPartition)
Both I-diagrams are identical. This implies that no Shannon measure (entropy, mutual information, conditional mutual information [including the transfer entropy], co-information, etc.) can differentiate these two patterns of dependency.
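As a quick spot check of this claim, a few representative Shannon quantities can be evaluated directly on both distributions; the particular variable groupings below are an arbitrary illustration, not an exhaustive test.
In [ ]:
from dit.multivariate import coinformation, total_correlation
from dit.shannon import entropy as H

for name, dist in dists:
    # I[X0 : X1 | X2]; any other grouping gives matching values as well.
    cmi = coinformation(dist, [[0], [1]], [2])
    print(name, H(dist), total_correlation(dist), cmi)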
In [ ]:
print_partition(dists, ExtropyPartition)
Similarly, the X-Diagrams are identical and so no extropy-based measure can differentiate the distributions.
In [ ]:
from prettytable import PrettyTable
from dit.multivariate import (entropy,
                              coinformation,
                              total_correlation,
                              dual_total_correlation,
                              independent_information,
                              caekl_mutual_information,
                              interaction_information,
                              intrinsic_total_correlation,
                              gk_common_information,
                              wyner_common_information,
                              exact_common_information,
                              functional_common_information,
                              mss_common_information,
                              tse_complexity,
                              )
from dit.other import (extropy,
                       disequilibrium,
                       perplexity,
                       LMPR_complexity,
                       renyi_entropy,
                       tsallis_entropy,
                       )
In [ ]:
def print_table(title, table, dists):
    pt = PrettyTable(field_names=[''] + [name for name, _ in table])
    for name, _ in table:
        pt.float_format[name] = ' 5.3'
    for name, dist in dists:
        pt.add_row([name] + [measure(dist) for _, measure in table])
    print("\n{}".format(title))
    print(pt.get_string())
Entropies generally capture the uncertainty contained in a distribution. Here, we compute the Shannon entropy, the Rényi entropy of order 2 (also known as the collision entropy), and the Tsallis entropy of order 2. Though we only compute the order-2 values, any order produces identical values for the two distributions.
In [ ]:
entropies = [('H', entropy),
             ('Renyi (α=2)', lambda d: renyi_entropy(d, 2)),
             ('Tsallis (q=2)', lambda d: tsallis_entropy(d, 2)),
            ]
In [ ]:
print_table('Entropies', entropies, dists)
The entropies for both distributions are identical. This is not surprising: they have the same probability mass function.
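To support the claim that any order behaves the same way, the Rényi and Tsallis entropies can also be evaluated at a couple of other, arbitrarily chosen orders.
In [ ]:
for order in [0.5, 3]:
    print('order', order,
          [renyi_entropy(dist, order) for _, dist in dists],
          [tsallis_entropy(dist, order) for _, dist in dists])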
Mutual informations are multivariate generalizations of the standard Shannon mutual information. By far, the most widely used (and often simply assumed to be the only) generalization is the total correlation, sometimes called the multi-information. It is defined as: $$ T[\mathbf{X}] = \sum_i H[X_i] - H[\mathbf{X}] = \sum_{\mathbf{x}} p(\mathbf{x}) \log_2 \frac{p(\mathbf{x})}{p(x_1)p(x_2)\ldots p(x_n)} $$
Other generalizations exist, though, including the co-information, the dual total correlation, and the CAEKL mutual information.
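As a quick sanity check of the definition above, the total correlation should equal the sum of the marginal entropies minus the joint entropy; this is verified here for the dyadic distribution (the choice of distribution is arbitrary).
In [ ]:
marginal_sum = sum(entropy(dyadic, [i]) for i in range(3))
joint = entropy(dyadic)
print(marginal_sum - joint, total_correlation(dyadic))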
In [ ]:
mutual_informations = [('I', coinformation),
                       ('T', total_correlation),
                       ('B', dual_total_correlation),
                       ('J', caekl_mutual_information),
                       ('II', interaction_information),
                      ]
In [ ]:
print_table('Mutual Informations', mutual_informations, dists)
That none of these generalizations distinguishes the two distributions is not surprising: each can be defined as a function of the I-diagram atoms, and the I-diagrams are identical here.
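To make the "function of the I-diagram" point concrete, here is one such identity checked numerically: the dual total correlation equals the joint entropy minus the sum of the single-variable conditional entropies (a sketch for the three-variable case, using dit.shannon.conditional_entropy).
In [ ]:
from dit.shannon import conditional_entropy

for name, dist in dists:
    residual = sum(conditional_entropy(dist, [i], [j for j in range(3) if j != i])
                   for i in range(3))
    print(name, entropy(dist) - residual, dual_total_correlation(dist))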
In [ ]:
common_informations = [('K', gk_common_information),
                       ('C', lambda d: wyner_common_information(d, niter=1, polish=False)),
                       ('G', lambda d: exact_common_information(d, niter=1, polish=False)),
                       ('F', functional_common_information),
                       ('M', mss_common_information),
                      ]
In [ ]:
print_table('Common Informations', common_informations, dists)
As it turns out, only the Gács-Körner common information, K, distinguishes the two.
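One way to probe the difference is to evaluate the Gács-Körner common information on each pair of variables as well as on all three; the pairwise groupings below are an illustrative choice.
In [ ]:
for name, dist in dists:
    pairwise = [gk_common_information(dist, [[i], [j]]) for i, j in [(0, 1), (0, 2), (1, 2)]]
    print(name, pairwise, gk_common_information(dist))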
In [ ]:
other_measures = [('IMI', lambda d: intrinsic_total_correlation(d, d.rvs[:-1], d.rvs[-1])),
                  ('X', extropy),
                  ('R', independent_information),
                  ('P', perplexity),
                  ('D', disequilibrium),
                  ('LMPR', LMPR_complexity),
                  ('TSE', tse_complexity),
                 ]
In [ ]:
print_table('Other Measures', other_measures, dists)
Several other measures fail to differentiate our two distributions. For many of these (X, P, D, LMPR) this is because they are defined relative to the probability mass function alone. For the others, it is due to the equality of the I-diagrams. Only the intrinsic mutual information, IMI, can distinguish the two.
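For reference, the lambda in the table above simply splits the variable indices using the distribution's rvs attribute; spelled out for the three-variable case, it computes the intrinsic total correlation between the first two variables with the third playing the role of the eavesdropper.
In [ ]:
print(dyadic.rvs)  # the single-variable index lists
print(intrinsic_total_correlation(dyadic, [[0], [1]], [2]),
      intrinsic_total_correlation(triadic, [[0], [1]], [2]))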
In [ ]:
from dit.profiles import *

def plot_profile(dists, profile):
    n = len(dists)
    plt.figure(figsize=(8*n, 6))
    ent = max(entropy(dist) for _, dist in dists)
    for i, (name, dist) in enumerate(dists):
        ax = plt.subplot(1, n, i+1)
        profile(dist).draw(ax=ax)
        if profile not in [EntropyTriangle, EntropyTriangle2]:
            ax.set_ylim((-0.1, ent + 0.1))
        ax.set_title(name)
In [ ]:
plot_profile(dists, ComplexityProfile)
Once again, these two profiles are identical because the I-diagrams are identical. The complexity profile therefore incorrectly suggests that there is no information at the scale of three variables.
In [ ]:
plot_profile(dists, MUIProfile)
The marginal utility of information is based on a linear programming problem whose constraints are values from the I-diagram, and so here again the two distributions are undifferentiated.
In [ ]:
plot_profile(dists, SchneidmanProfile)
The connected informations are based on differences between maximum entropy distributions with differing $k$-way marginal distributions fixed. Here, the two distributions are differentiated.
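If the numeric values behind the plot are wanted, the profile objects expose them directly; the .profile attribute used here is an assumption about dit's profile classes, which store their values as a dict keyed by scale.
In [ ]:
for name, dist in dists:
    print(name, SchneidmanProfile(dist).profile)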
In [ ]:
plot_profile(dists, EntropyTriangle)
Both distributions are at an identical location in the multivariate entropy triangle.
In [ ]:
from dit.pid.helpers import compare_measures
In [ ]:
for name, dist in dists:
compare_measures(dist, name=name)
Here we see that the PID determines that, in the dyadic distribution, two random variables each uniquely contribute a bit of information to the third, whereas in the triadic distribution the two random variables redundantly influence the third with one bit and synergistically with another.
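To see a single decomposition with the sources and target spelled out explicitly, one particular PID (the Williams-Beer measure) can be instantiated directly; the (sources, target) calling convention below is an assumption about dit's PID interface, and compare_measures above already sweeps several such measures.
In [ ]:
from dit.pid import PID_WB

print(PID_WB(dyadic, [[0], [1]], [2]))
print(PID_WB(triadic, [[0], [1]], [2]))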
In [ ]:
from itertools import product
In [ ]:
outcomes_a = [
    (0,0,0,0),
    (0,2,3,2),
    (1,0,2,1),
    (1,2,1,3),
    (2,1,3,3),
    (2,3,0,1),
    (3,1,1,2),
    (3,3,2,0),
]
outcomes_b = [
    (0,0,0,0),
    (0,0,1,1),
    (0,1,0,1),
    (0,1,1,0),
    (1,0,0,1),
    (1,0,1,0),
    (1,1,0,0),
    (1,1,1,1),
]
# Combine the two patterns coordinate-wise: each symbol packs an outcomes_a
# symbol (high bits) together with an outcomes_b symbol (low bit).
outcomes = [tuple(2*a + b for a, b in zip(a_, b_)) for a_, b_ in product(outcomes_a, outcomes_b)]
quadradic = uniform(outcomes)
In [ ]:
# Each of the six independent bits a-f is shared by exactly one pair of the four variables.
dyadic2 = uniform([(4*a+2*c+e, 4*a+2*d+f, 4*b+2*c+f, 4*b+2*d+e) for a, b, c, d, e, f in product([0,1], repeat=6)])
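A quick sanity check on the two constructions (an illustrative check, not part of the original analysis): both distributions should be uniform over 64 outcomes of four variables, and so carry six bits of joint entropy.
In [ ]:
print(len(quadradic.outcomes), len(dyadic2.outcomes))
print(entropy(quadradic), entropy(dyadic2))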
In [ ]:
dists2 = [('dyadic2', dyadic2), ('quadradic', quadradic)]
In [ ]:
print_partition(dists2, ShannonPartition)
In [ ]:
print_partition(dists2, ExtropyPartition)
In [ ]:
print_table('Entropies', entropies, dists2)
In [ ]:
print_table('Mutual Informations', mutual_informations, dists2)
In [ ]:
print_table('Common Informations', common_informations, dists2)
In [ ]:
print_table('Other Measures', other_measures, dists2)
In [ ]:
plot_profile(dists2, ComplexityProfile)
In [ ]:
plot_profile(dists2, MUIProfile)
In [ ]:
plot_profile(dists2, SchneidmanProfile)
In [ ]:
plot_profile(dists2, EntropyTriangle)