Markov Models


Exploratory notebook related to introductory concepts and theory behind Markov Models. Includes toy examples implementation and relative visualization.

In [ ]:
# basic libraries import
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

%matplotlib notebook


Markov Models

Also called Markov Chains, Markov Models are a common approach to treat sequential data. Sequential data can be seen as a set of dependent data points for which we consequently lose the independent and identically distributed (i.i.d.) assumption.

We still assume stationarity and limited horizon, meaning that future predictions depend only on a fixed amount of most recent observations. For example if we assume that a data point depends only on the previous most recent observation, we are dealing with first-order Markov chain.

Transition Matrix

A model can be represented via a transition matrix. Each entry represents the probability to move from current state (row value) to next state (column value). Each row must sum up to 1.

In [ ]:
# possible states of our system
states = ['sun', 'rain', 'cloud']

# transition matrix
data = [
    [0.5, 0.1, 0.4],
    [0.2, 0.4, 0.4],
    [0.4, 0.3, 0.3]
transition_matrix = pd.DataFrame(data, index=states, columns=states)

In [ ]:

In [ ]:
# each row must sum up to 1
for state in states:
    assert transition_matrix.loc[state].sum() == 1.0

One can derive a transition matrix from historical data

In [ ]:
observations = np.array(["sun", "sun", "cloud", "cloud", "rain", "cloud", "sun", "sun", "rain", "cloud", "cloud", "sun"])

In [ ]:
# count how many time each transition (from_state -> to_state) occured in the past
transition_matrix = pd.DataFrame(np.zeros((len(states), len(states))), index=states, columns=states)
for (from_state, to_state) in list(zip(observations, np.roll(observations, -1)))[:-1]: 
    transition_matrix.set_value(from_state, to_state, transition_matrix.loc[from_state, to_state]+1)

In [ ]:

In [ ]:
# normalize to obtain transition probabilities
transition_matrix = transition_matrix.div(transition_matrix.sum(axis=1), axis=0)

In [ ]:

Latent Variable and State Space

If we were to consider a second-order Markov chain we would have to specify the transition probabilities for each pair of states. The number of parameters then grows exponentially with the horizon/window size, so in order to overcome this limitation, latent variables have been introduced. This leads to state space models.

  • Hidden Markov Model (HMM) if latent variables are discrete
  • Linear Dynamical Systems (LDS) if latent variables are continuous (Gaussian)

Hidden Markov Model (HMM)

As we saw before, for HMM we are dealing with discrete latent variables. This means that each state of the model can emit/generate different outputs, each of which has an associated emission probability. In other words each state has a probability distribution of possible outputs. Notice also that each output can belong to more that one state.

TODO: concrete example of HMM

In [ ]: