This notebook demonstrates the IOHMM model with parameters learned in a supervised way. This corresponds to the frequency-counting procedure used for a supervised HMM. See the notes at http://www.cs.columbia.edu/4761/notes07/chapter4.3-HMM.pdf.

SupervisedIOHMM


In [1]:
from __future__ import division

import json
import warnings


import numpy as np
import pandas as pd


from IOHMM import SupervisedIOHMM
from IOHMM import OLS, CrossEntropyMNL


warnings.simplefilter("ignore")

Load speed data


In [2]:
speed = pd.read_csv('../data/speed.csv')
speed.head()


Out[2]:
Unnamed: 0 rt corr Pacc prev
0 1 6.456770 cor 0.0 inc
1 2 5.602119 cor 0.0 cor
2 3 6.253829 inc 0.0 cor
3 4 5.451038 inc 0.0 inc
4 5 5.872118 inc 0.0 inc

Label some/all states

In this code, the states are given as a dictionary: each key is an index in the sequence (e.g. 0, 5) and each value is a one-out-of-n (one-hot) array whose kth entry is 1 if the hidden state is k, where n is the total number of states.

In the following example, we assume that the "corr" column gives the correct hidden states.


In [3]:
states = {}
corr = np.array(speed['corr'])
for i in range(len(corr)):
    # one-hot encoding: 'cor' -> state 1, anything else ('inc') -> state 0
    if corr[i] == 'cor':
        states[i] = np.array([0, 1])
    else:
        states[i] = np.array([1, 0])
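The dictionary format described above can also be built for only some positions of a sequence. The following is a minimal sketch; `one_hot_states` and its arguments are hypothetical names introduced for illustration and are not part of the IOHMM package.

```python
import numpy as np


def one_hot_states(labels, label_to_state, num_states):
    """Map {sequence index: label} to {sequence index: one-hot array}.

    Hypothetical helper for illustration only.
    """
    states = {}
    for idx, label in labels.items():
        vec = np.zeros(num_states, dtype=int)
        vec[label_to_state[label]] = 1  # kth entry is 1 for hidden state k
        states[idx] = vec
    return states


# label only positions 0 and 5 of a sequence
partial_states = one_hot_states({0: 'inc', 5: 'cor'},
                                {'inc': 0, 'cor': 1},
                                num_states=2)
print(partial_states)
```

Keeping the label-to-state mapping explicit makes it easy to see which hidden state index each label corresponds to when reading the trained parameters later.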

Set up a simple model manually


In [4]:
# we choose 2 hidden states in this model
SHMM = SupervisedIOHMM(num_states=2)

# we set only one output 'rt' modeled by a linear regression model
SHMM.set_models(model_emissions = [OLS()], 
                model_transition=CrossEntropyMNL(solver='lbfgs'),
                model_initial=CrossEntropyMNL(solver='lbfgs'))

# we set no covariates for the initial/transition/emission models
SHMM.set_inputs(covariates_initial = [], covariates_transition = [], covariates_emissions = [[]])

# set the response of the emission model
SHMM.set_outputs([['rt']])

# set the data and ground truth states
SHMM.set_data([[speed, states]])

Start training


In [5]:
SHMM.train()

See the training results


In [6]:
# the coefficients of the output model for each state
print(SHMM.model_emissions[0][0].coef)
print(SHMM.model_emissions[1][0].coef)


[[ 5.70451774]]
[[ 6.13678825]]

In [7]:
# the scale/dispersion of the output model for each state
print(np.sqrt(SHMM.model_emissions[0][0].dispersion))
print(np.sqrt(SHMM.model_emissions[1][0].dispersion))


[[ 0.35831781]]
[[ 0.47356034]]
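Since the emission model here has no covariates, the single coefficient is just an intercept, and an intercept-only least-squares fit equals the mean of the responses assigned to that state. The sketch below illustrates this on made-up numbers using plain NumPy rather than the IOHMM `OLS` class; how that class normalizes its dispersion (for example dividing by n or by n minus the number of parameters) is not confirmed here, so the residual standard deviation is shown only as an approximation of the printed scale.

```python
import numpy as np

# toy data (made up for illustration): responses and their labeled states
rt = np.array([6.0, 5.5, 6.5, 5.0, 5.8])
state = np.array([1, 1, 0, 0, 0])

intercepts = {}
resid_stds = {}
for k in (0, 1):
    y = rt[state == k]
    X = np.ones((len(y), 1))  # intercept-only design matrix
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    intercepts[k] = beta[0]
    # residual standard deviation, dividing by n (an assumption)
    resid_stds[k] = np.sqrt(np.mean((y - beta[0]) ** 2))
    print(k, intercepts[k], y.mean(), resid_stds[k])
```

Under this reading, the two coefficients printed above would be the average `rt` of the observations labeled with each state.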

In [8]:
# the transition probabilities out of each state
print(np.exp(SHMM.model_transition[0].predict_log_proba(np.array([[]]))))
print(np.exp(SHMM.model_transition[1].predict_log_proba(np.array([[]]))))


[[ 0.38392857  0.61607143]]
[[ 0.21165647  0.78834353]]
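With no covariates, each row of the transition matrix is expected to be close to the empirical frequency of moves out of that state, which is the counting procedure used for a supervised HMM. The sketch below shows that counting on a short made-up state sequence; it uses plain NumPy and is not the code path taken by `SupervisedIOHMM`.

```python
import numpy as np

# toy labeled state sequence (made up for illustration)
seq = np.array([1, 1, 0, 1, 1, 1, 0, 0, 1])
num_states = 2

# count transitions prev -> nxt between consecutive time steps
counts = np.zeros((num_states, num_states))
for prev, nxt in zip(seq[:-1], seq[1:]):
    counts[prev, nxt] += 1

# normalize each row so it sums to 1
trans = counts / counts.sum(axis=1, keepdims=True)
print(trans)
```

Row k of `trans` plays the same role as the probabilities printed for `SHMM.model_transition[k]` above.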

Save the trained model


In [9]:
json_dict = SHMM.to_json('../models/SupervisedIOHMM/')
json_dict


Out[9]:
{'data_type': 'SupervisedIOHMM',
 'properties': {'covariates_emissions': [[]],
  'covariates_initial': [],
  'covariates_transition': [],
  'model_emissions': [[{'data_type': 'OLS',
     'properties': {'alpha': 0,
      'coef': {'data_type': 'numpy.ndarray',
       'path': '../models/SupervisedIOHMM/model_emissions/state_0/emission_0/coef.npy'},
      'dispersion': {'data_type': 'numpy.ndarray',
       'path': '../models/SupervisedIOHMM/model_emissions/state_0/emission_0/dispersion.npy'},
      'est_stderr': False,
      'fit_intercept': True,
      'l1_ratio': 0,
      'max_iter': 100,
      'n_targets': 1,
      'reg_method': None,
      'solver': 'svd',
      'stderr': {'data_type': 'numpy.ndarray',
       'path': '../models/SupervisedIOHMM/model_emissions/state_0/emission_0/stderr.npy'},
      'tol': 0.0001}}],
   [{'data_type': 'OLS',
     'properties': {'alpha': 0,
      'coef': {'data_type': 'numpy.ndarray',
       'path': '../models/SupervisedIOHMM/model_emissions/state_1/emission_0/coef.npy'},
      'dispersion': {'data_type': 'numpy.ndarray',
       'path': '../models/SupervisedIOHMM/model_emissions/state_1/emission_0/dispersion.npy'},
      'est_stderr': False,
      'fit_intercept': True,
      'l1_ratio': 0,
      'max_iter': 100,
      'n_targets': 1,
      'reg_method': None,
      'solver': 'svd',
      'stderr': {'data_type': 'numpy.ndarray',
       'path': '../models/SupervisedIOHMM/model_emissions/state_1/emission_0/stderr.npy'},
      'tol': 0.0001}}]],
  'model_initial': {'data_type': 'CrossEntropyMNL',
   'properties': {'alpha': 0,
    'coef': {'data_type': 'numpy.ndarray',
     'path': '../models/SupervisedIOHMM/model_initial/coef.npy'},
    'est_stderr': False,
    'fit_intercept': True,
    'l1_ratio': 0,
    'max_iter': 100,
    'n_classes': 2,
    'reg_method': 'l2',
    'solver': 'lbfgs',
    'stderr': {'data_type': 'numpy.ndarray',
     'path': '../models/SupervisedIOHMM/model_initial/stderr.npy'},
    'tol': 0.0001}},
  'model_transition': [{'data_type': 'CrossEntropyMNL',
    'properties': {'alpha': 0,
     'coef': {'data_type': 'numpy.ndarray',
      'path': '../models/SupervisedIOHMM/model_transition/state_0/coef.npy'},
     'est_stderr': False,
     'fit_intercept': True,
     'l1_ratio': 0,
     'max_iter': 100,
     'n_classes': 2,
     'reg_method': 'l2',
     'solver': 'lbfgs',
     'stderr': {'data_type': 'numpy.ndarray',
      'path': '../models/SupervisedIOHMM/model_transition/state_0/stderr.npy'},
     'tol': 0.0001}},
   {'data_type': 'CrossEntropyMNL',
    'properties': {'alpha': 0,
     'coef': {'data_type': 'numpy.ndarray',
      'path': '../models/SupervisedIOHMM/model_transition/state_1/coef.npy'},
     'est_stderr': False,
     'fit_intercept': True,
     'l1_ratio': 0,
     'max_iter': 100,
     'n_classes': 2,
     'reg_method': 'l2',
     'solver': 'lbfgs',
     'stderr': {'data_type': 'numpy.ndarray',
      'path': '../models/SupervisedIOHMM/model_transition/state_1/stderr.npy'},
     'tol': 0.0001}}],
  'num_states': 2,
  'responses_emissions': [['rt']]}}

In [10]:
with open('../models/SupervisedIOHMM/model.json', 'w') as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)

Load back the trained model


In [11]:
SHMM_from_json = SupervisedIOHMM.from_json(json_dict)

See if the coefficients are any different


In [12]:
# the coefficients of the output model for each state, read from the reloaded model
print(SHMM_from_json.model_emissions[0][0].coef)
print(SHMM_from_json.model_emissions[1][0].coef)


[[ 5.70451774]]
[[ 6.13678825]]

Set up the model using a config file instead of doing it manually


In [13]:
with open('../models/SupervisedIOHMM/config.json') as json_data:
    json_dict = json.load(json_data)

SHMM_from_config = SupervisedIOHMM.from_config(json_dict)

Set data and start training


In [14]:
SHMM_from_config.set_data([[speed, states]])
SHMM_from_config.train()

See if the training results are any different


In [15]:
# the coefficients of the output model for each state
print(SHMM_from_config.model_emissions[0][0].coef)
print(SHMM_from_config.model_emissions[1][0].coef)


[[ 5.70451774]]
[[ 6.13678825]]
