In [1]:
import numpy as np
import matplotlib.pyplot as pl
%matplotlib inline

Abstract

  • HMMs can model only a sub-class of linearly dependent processes (LDPs). This article characterizes LDPs using observable operator models (OOMs); the learning algorithm has time complexity $O(N + nm^3)$, where $N$ is the length of the training set, $n$ is the number of distinguishable observations, and $m$ is the dimensionality of the model's state space

Introduction

  • LDPs introduced to decide when two HMMs are equivalent
  • See (Ito, 1992)
  • Linearly dependent processes
  • Notation:
    • Let $(X_n)_{n=0,1,2,\dots}$ be a discrete-time stochastic process
    • For $P(X_0 = a_0,\dots,X_N = a_N)$, write $P(a_0 \dots a_N)$ or $P(\bar{a})$
    • For $P(X_{N+1} = b_1, \dots, X_{N+M} = b_M \mid X_0 = a_0, \dots, X_N = a_N)$, write $P(b_1 \dots b_M \mid a_0 \dots a_N)$ or $P(\bar{b} \mid \bar{a})$
  • From HMMs to OOMs
    • In an HMM, we have a set of hidden states $S$, e.g. $S = \{ s_1,s_2 \}$, and a set of observable events $O$, e.g. $O = \{ a,b \}$
    • The transition rates between hidden states are represented by a transition matrix $M$ $$ \begin{array}{cc} M = \left[ \begin{array}{cc} 0.0 & 1.0 \\ 0.5 & 0.5 \end{array} \right] & M^T = \left[ \begin{array}{cc} 0.0 & 0.5 \\ 1.0 & 0.5 \end{array} \right] \end{array}$$
    • Observation probabilities are given by diagonal observation matrices $O_a$ and $O_b$ $$ \begin{array}{cc} O_a = \left[ \begin{array}{cc} 0.5 & 0.0 \\ 0.0 & 1.0 \end{array} \right] & O_b = \left[ \begin{array}{cc} 0.5 & 0.0 \\ 0.0 & 0.0 \end{array} \right] \end{array}$$
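The example above can be checked numerically. The sketch below builds the observable operators from $M$ and the observation matrices and computes sequence probabilities; the operator convention $T_x = M^\top O_x$ (column-vector states, emit then transition) and the choice of the stationary distribution as initial state $w_0$ are my assumptions, not fixed by these notes:

```python
import numpy as np

# Transition matrix: row i holds transition probabilities out of state s_i
M = np.array([[0.0, 1.0],
              [0.5, 0.5]])
# Diagonal observation matrices for symbols a and b (per-state emission probs)
O_a = np.diag([0.5, 1.0])
O_b = np.diag([0.5, 0.0])

# Observable operators on column-vector states: weight each state by the
# emission probability of the symbol, then propagate through transitions
T = {"a": M.T @ O_a, "b": M.T @ O_b}

# Stationary initial distribution: w0 = M^T w0, components summing to 1
w0 = np.array([1.0, 2.0]) / 3.0

def seq_prob(seq):
    """P(x_1 ... x_n) = 1^T T_{x_n} ... T_{x_1} w0."""
    w = w0
    for x in seq:
        w = T[x] @ w
    return w.sum()

print(seq_prob("a"), seq_prob("b"))  # P(a) = 5/6, P(b) = 1/6
```

A handy sanity check: the probabilities of all sequences of a fixed length sum to 1.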

[skipping ahead a little]

  • HMM vs. OOM:
    • An HMM is a structure $(\newcommand{\reals}{\mathbb{R}} \reals^m, \{ T_a, T_b \}, w_0)$
  • Definition: an OOM is a structure of the same form, $(\reals^m, \{ \tau_a : a \in O \}, w_0)$, but with general linear operators $\tau_a$ in place of the HMM-derived $T_a$, $T_b$
  • Interpretable OOMs
    • Definition:
      • Let $O$ be a finite set of observables and let $k \geq 1$. A $k$-event is a nonempty subset of $O^k$
      • Let $m \geq 1$. An $m$-dimensional OOM is interpretable with respect to $k$-events $A_1, \dots, A_m$ if the components of its states can be read as probabilities of those events
  • Learning OOMs from data:
    • In an interpretable OOM, $$\begin{align} w_0 & = (P(A_1),\dots,P(A_m)) \\ \tau_a w_0 & = (P(a A_1),\dots,P( a A_m)) \\ \tau_a \tau_b w_0 & = (P(b a A_1),\dots,P(b a A_m)) \\ & \dots \end{align}$$
    • $w_0, \tau_a w_0, \tau_a \tau_b w_0, \dots$ can be estimated from data by counting frequencies
      • Can obtain $\tau_a$ from argument-value pairs $$ \begin{align} w_0 & \to \tau_a w_0\\ \tau_b w_0 & \to \tau_a \tau_b w_0\\ & \dots \end{align} $$
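The argument-value idea above can be sketched numerically. Here I reuse the operators from the two-state HMM example as ground truth and recover $\tau_a$ by solving the linear system $\tau_a V = W_a$, where the columns of $V$ are the argument states $w_0, \tau_b w_0$ and the columns of $W_a$ are their images under $\tau_a$. In practice these vectors would be frequency estimates from data; the exact matrices and the convention $T_x = M^\top O_x$ are my assumptions:

```python
import numpy as np

# Ground-truth operators from the two-state HMM example (T_x = M^T O_x)
M = np.array([[0.0, 1.0], [0.5, 0.5]])
tau_a = M.T @ np.diag([0.5, 1.0])
tau_b = M.T @ np.diag([0.5, 0.0])
w0 = np.array([1.0, 2.0]) / 3.0  # stationary initial state

# Argument-value pairs: columns of V are states, columns of W_a their images
V = np.column_stack([w0, tau_b @ w0])                     # w0, tau_b w0
W_a = np.column_stack([tau_a @ w0, tau_a @ tau_b @ w0])   # values under tau_a

# Recover the operator: tau_a = W_a V^{-1}; this needs V to have full rank,
# i.e. the argument states must be linearly independent
tau_a_est = W_a @ np.linalg.inv(V)

print(np.allclose(tau_a_est, tau_a))  # True
```

With frequencies counted from a finite sample, the entries of $V$ and $W_a$ are only estimates, so $\tau_a$ is recovered approximately rather than exactly.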
