Load modules
In [1]:
import pprint
pp = pprint.PrettyPrinter(indent=4)
import pandas as pd
import numpy as np
from datetime import datetime
import oxyba as ox
import overgang as og
%load_ext autoreload
%autoreload 2
%reload_ext autoreload
In [2]:
df = pd.read_csv("demo-data.csv", delimiter=',')
datatable = np.array(df)
Transform data
In [3]:
datatable[:,1] = [datetime.strptime(d,'%Y-%m-%d') for d in datatable[:,1]]
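As a small illustration of the date conversion above, the sketch below applies the same `strptime` call to a toy object array. The rows are made up for the example and are not taken from demo-data.csv; the column order (id, date, rating) is likewise an assumption.

```python
from datetime import datetime

import numpy as np

# Hypothetical rows (id, date string, rating); not the real demo data.
table = np.array(
    [[1, "2015-03-31", "AA"],
     [1, "2016-03-31", "A+"]],
    dtype=object,
)

# Replace the ISO date strings in column 1 with datetime objects.
table[:, 1] = [datetime.strptime(d, "%Y-%m-%d") for d in table[:, 1]]

# Differences between observations are now timedelta objects.
elapsed = table[1, 1] - table[0, 1]
print(type(table[0, 1]).__name__, elapsed.days)
```

The array has to be of dtype `object` (as `np.array(df)` is for mixed columns) so that it can hold the datetime objects.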
map_to_encodings
In [4]:
mapping = [['AAA'], ['AA+', 'AA', 'AA-'], ['A+', 'A', 'A-'],
           ['BBB+', 'BBB', 'BBB-'], ['BB+', 'BB', 'BB-'],
           ['B+', 'B', 'B-'], ['CCC+', 'CCC', 'CCC-', 'CC', 'C'],
           ['DDD', 'DD', 'D', 'RD']]
datatable[:,2] = ox.mapencode(datatable[:,2], mapping, nastate=True)
numstates = len(mapping) + 1
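To make the idea of a group-based encoding concrete, here is a minimal pure-Python sketch. It is not the oxyba implementation: `encode_by_group` is a hypothetical helper, and placing the extra "not available" state at the last index is an assumption suggested only by `numstates = len(mapping) + 1`.

```python
def encode_by_group(labels, mapping):
    """Map each label to the index of the group that contains it.

    Labels found in no group get the extra index len(mapping),
    an assumed stand-in for a "not available" state.
    """
    lookup = {label: idx
              for idx, group in enumerate(mapping)
              for label in group}
    na_index = len(mapping)
    return [lookup.get(label, na_index) for label in labels]


# Hypothetical, shortened mapping for illustration only.
toy_mapping = [['AAA'], ['AA+', 'AA', 'AA-'], ['D', 'RD']]
codes = encode_by_group(['AAA', 'AA-', 'NR', 'D'], toy_mapping)
print(codes)
```

With three groups there are four states in total, which mirrors the `len(mapping) + 1` count used above.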
Transform tabular data to list of lists
In [5]:
#datalist = og.table_transform(datatable, lastdate=datetime(2018,1,1))
datalist = og.table_transform(datatable)
The datalist object has the following structure:
In [6]:
pp.pprint(datalist[20:22])
ctmc_fit with debug=True raises an exception if something is wrong with the data.
With debug=False (the default), ctmc_fit is very fast but might crash at a later point or generate bogus results.
In [7]:
try:
    transmat, genmat, transcount, statetime = og.ctmc_fit(
        datalist, numstates, 1.0, toltime=1e-8, debug=True)
except Exception as e:
    print(e)
ctmc_fit2 only issues a warning message, then tries to autocorrect the data and proceed.
Obviously, this kind of implementation invites sloppy data preprocessing,
but it is a painless way to get quick results for a small data set.
In [8]:
transmat, genmat, transcount, statetime = og.ctmc_fit2(datalist, numstates)
At least we got a result with ctmc_fit2.
In [9]:
print("Transition Matrix\n", transmat.round(2))
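The names of the four return values (transmat, genmat, transcount, statetime) suggest the textbook duration estimator for a continuous-time Markov chain: off-diagonal generator entries are transition counts divided by the time spent in the origin state, and the transition matrix over a horizon t is the matrix exponential of the generator times t. The sketch below illustrates that estimator with made-up numbers. Whether overgang computes its results exactly this way is an assumption; `generator_from_counts` and `matrix_exponential` are hypothetical helpers, and the latter is a simple numpy-only stand-in for a library routine such as `scipy.linalg.expm`.

```python
import numpy as np


def generator_from_counts(transcount, statetime):
    """Estimate a generator: q_ij = N_ij / T_i, rows summing to zero."""
    counts = np.asarray(transcount, dtype=float)
    times = np.asarray(statetime, dtype=float)
    gen = np.zeros_like(counts)
    visited = times > 0                      # skip states never occupied
    gen[visited] = counts[visited] / times[visited, None]
    np.fill_diagonal(gen, 0.0)               # ignore any i -> i counts
    np.fill_diagonal(gen, -gen.sum(axis=1))  # diagonal = -(row sum)
    return gen


def matrix_exponential(mat, terms=30):
    """Truncated Taylor series with scaling and squaring."""
    norm = np.linalg.norm(mat, ord=np.inf)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    scaled = mat / (2 ** squarings)
    result = np.eye(mat.shape[0])
    term = np.eye(mat.shape[0])
    for k in range(1, terms):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result


# Made-up counts and occupation times for three states; state 2 absorbs.
toy_count = [[0, 2, 0], [1, 0, 1], [0, 0, 0]]
toy_time = [10.0, 5.0, 2.0]

toy_gen = generator_from_counts(toy_count, toy_time)
toy_trans = matrix_exponential(toy_gen * 1.0)  # horizon of one time unit
print(toy_trans.round(3))
```

Each row of the resulting transition matrix sums to one, and the absorbing state keeps probability one of staying where it is.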
ctmc_fit with debug=True internally calls ctmc_datacheck at the beginning.
ctmc_datacheck iterates over datalist, checks for all kinds of error causes, and immediately raises an exception when it finds one.
In [10]:
try:
    og.ctmc_datacheck(datalist, numstates, toltime=1e-8)
except Exception as e:
    print(e)
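The fail-fast style of check described above can be sketched as follows. This is not the overgang code: `check_histories` is a hypothetical function, and the layout it expects (one pair of parallel lists per entity, states and numeric times) is an assumption made for illustration that may not match what og.table_transform returns. The specific checks are also guesses at what such a routine might cover.

```python
def check_histories(histories, numstates, toltime=1e-8):
    """Raise ValueError at the first problem found in any history."""
    for idx, (states, times) in enumerate(histories):
        if len(states) != len(times):
            raise ValueError(
                f"entry {idx}: {len(states)} states but {len(times)} times")
        for state in states:
            if not 0 <= state < numstates:
                raise ValueError(
                    f"entry {idx}: state {state} outside 0..{numstates - 1}")
        # Consecutive observations must be more than toltime apart.
        for earlier, later in zip(times, times[1:]):
            if later - earlier <= toltime:
                raise ValueError(
                    f"entry {idx}: times {earlier} and {later} are not "
                    f"increasing by more than {toltime}")


# Hypothetical histories: the second one repeats a timestamp.
good = [([0, 1, 1], [0.0, 1.0, 2.5])]
bad = good + [([2, 3], [4.0, 4.0])]

check_histories(good, numstates=4)
try:
    check_histories(bad, numstates=4)
except ValueError as err:
    message = str(err)
    print(message)
```

Stopping at the first error keeps the message specific to one entry, which makes it easy to inspect that entry directly afterwards.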
In [11]:
datalist[40]
Out[11]: