The script "calculate_mutual_info.py" takes as an input a file containing various time-series replicas: each column will be interpreted as different replica and each row will be a different value as a function of time.
The replicas needs to have the same number of time-measures (i.e. same number of rows).
The output will contain a symmetric matrix of size (N x N) where N = number of replicas, which contains the Mutual Information of each replica against the others (on the diagonal the values of Information Entropy of each replica).
The script starts by loading the needed packages:
In [ ]:
import ts
import matplotlib.pyplot as plt
import numpy as np
from argparse import ArgumentParser
then the argument parser is defined:
In [ ]:
parser = ArgumentParser( description = 'Calculate Mutual Information')
#
# SEE FILE FOR DETAILS.
#
options = parser.parse_args()
The Input file format has been already described. Other options give the possibility to :
By default the program assumes that the data are 1-dimensional time series, so if the input files contains N columns it will generate N replicas. But the data can also be multi dimensional: if the user specify that the data are k-dimensional, if the input files contains N columns it will generate N/k replicas. In the case the user specifies that the data has to be represented in k($>1$) dimensions, by default the script assumes that the values of the various dimensions of a given replicas are consecutives columns.
EXAMPLE: If we specify --dim 3 and tha files contains 6 columns, the program will genrate 2 3-dim replicas, and it will assume that the column in the input file are:
X1 Y1 Z1 X2 Y2 Z2
i.e. : the 1-st column is the 1-st dimension of the 1-st replica, the 2-nd column in the 2-nd dimension of the 1-st replica and so on.
Specifing the option --interleave, the user can modify this behaviour and the script will instead assume that the input data are organized as the following:
X1 X2 Y1 Y2 Z1 Z2
i.e. : the first N/k colum are the 1-st dimension of replicas, followed by N/K columns containing the 2-nd dimension and son on.
In [ ]:
def interleave(data,ndim):
nfr, nrep = data.shape
out = np.zeros(data.shape)
for i in range(nrep/ndim):
or j in range(ndim):
out[:,ndim*i+j] = data[:,j*(nrep/ndim)+i]
return out
Following our exploration of the script we now enter in the actual execution.
Firstly the options are stored in more readable variables.
In [ ]:
f_dat = options.dat
f_out = options.out
stride = options.stride
and finally the data is read from the file specified from the user and if the --interleave option has been selected the data is reorganized as described above and stored in a numpy array named dat
In [ ]:
dat = np.loadtxt(f_dat)
dat = dat[::stride]
if (options.interleave) & (options.ndim != 1):
dat = interleave(dat,options.ndim)
Then the dat array is used to create an instance of the TimeSer Object defined in the ts module.
In [ ]:
DATA= ts.TimeSer(dat,len(dat),dim=options.ndim,nbins=options.nbins)
DATA.calc_bins(opt=options.opt)
In the TimeSer Object a series of procedre for the calculation of Statistical Entropy, Mutual Information and Information Transfer are available.
The most crucial and time consuming functions multiple are actually wrapper for FORTRAN90 code that have been compiled with the f2py tool. These function variants are identified by the post-fix "for".
Some of the functions have also been parallelized with OpenMP. The Module automatically identify the number of processors available on the computer, and automatically gnereates an equal number of threads and equally distributes the calculations among these threads. The parallelized versions are identified by the post-fix "omp".
In the script here presented we use the "OMP" version of the "mutual_info()" function [called "mutual_info_omp()"], wich produces as an output two numpy arrays:
In [ ]:
M, E = DATA.mutual_info_omp()
Then finally an image representing the Mutual Information matrix is generated using matplotlib.
In [ ]:
fig = plt.figure()
ax = fig.add_subplot(111)
mat = ax.matshow(M)
fig.colorbar(mat)
plt.show()
If asked the image is also saved to a file in the SVG format (Scalable Vector Graphics) that can be easily opened with any vector graphic editor (e.g. Inkscape, Adobe Illustrator)
In [ ]:
if options.plot:
fig.savefig(f_out.split('.')[0]+".svg",format='svg')
And the Mutual Information Matrix is also saved to disk in text format.
In [ ]:
np.savetxt(f_out,M)
quit()