Exploratory notebook covering introductory concepts and theory behind Markov Models, with toy example implementations and related visualizations.
In [ ]:
# basic library imports
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib notebook
sns.set_context("paper")
Also called Markov Chains, Markov Models are a common approach to modeling sequential data. Sequential data can be seen as a set of dependent data points, for which we consequently lose the independent and identically distributed (i.i.d.) assumption.
We still assume stationarity and a limited horizon, meaning that future predictions depend only on a fixed number of the most recent observations. For example, if we assume that a data point depends only on the single most recent observation, we are dealing with a first-order Markov chain.
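Formally, the first-order Markov assumption states that the conditional distribution of the current observation given all previous observations depends only on the immediately preceding one:

```latex
p(x_t \mid x_1, \dots, x_{t-1}) = p(x_t \mid x_{t-1})
```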
A model can be represented via a transition matrix. Each entry represents the probability of moving from the current state (row) to the next state (column). Each row must sum to 1.
In [ ]:
# possible states of our system
states = ['sun', 'rain', 'cloud']
# transition matrix
data = [
[0.5, 0.1, 0.4],
[0.2, 0.4, 0.4],
[0.4, 0.3, 0.3]
]
transition_matrix = pd.DataFrame(data, index=states, columns=states)
In [ ]:
transition_matrix
In [ ]:
# each row must sum up to 1 (use np.isclose to avoid float equality issues)
for state in states:
    assert np.isclose(transition_matrix.loc[state].sum(), 1.0)
One can also derive a transition matrix from historical data.
In [ ]:
observations = np.array(["sun", "sun", "cloud", "cloud", "rain", "cloud", "sun", "sun", "rain", "cloud", "cloud", "sun"])
In [ ]:
# count how many times each transition (from_state -> to_state) occurred in the past
transition_matrix = pd.DataFrame(np.zeros((len(states), len(states))), index=states, columns=states)
for (from_state, to_state) in zip(observations[:-1], observations[1:]):
    transition_matrix.loc[from_state, to_state] += 1
In [ ]:
transition_matrix
In [ ]:
# normalize to obtain transition probabilities
transition_matrix = transition_matrix.div(transition_matrix.sum(axis=1), axis=0)
In [ ]:
transition_matrix
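Once we have a transition matrix, we can use it generatively: starting from an initial state, we repeatedly sample the next state from the current row's probability distribution. A minimal sketch (the matrix values below are the toy probabilities from earlier; `sample_sequence` is a helper name introduced here for illustration):

In [ ]:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

states = ['sun', 'rain', 'cloud']
transition_matrix = pd.DataFrame(
    [[0.5, 0.1, 0.4],
     [0.2, 0.4, 0.4],
     [0.4, 0.3, 0.3]],
    index=states, columns=states)

def sample_sequence(transition_matrix, start_state, n_steps, rng):
    """Draw a state sequence by repeatedly sampling the next state
    from the current state's row of transition probabilities."""
    sequence = [start_state]
    for _ in range(n_steps):
        probs = transition_matrix.loc[sequence[-1]].values
        sequence.append(rng.choice(transition_matrix.columns, p=probs))
    return sequence

sequence = sample_sequence(transition_matrix, 'sun', 10, rng)
print(sequence)
```

Each call produces a different plausible "weather" trajectory; fixing the random seed makes the run reproducible.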
If we were to consider a second-order Markov chain, we would have to specify the transition probabilities for each pair of states. The number of parameters then grows exponentially with the horizon/window size: with K states, a chain of order M requires on the order of K^M (K-1) free parameters. To overcome this limitation, latent variables have been introduced, which leads to state space models.
As we saw before, for Hidden Markov Models (HMMs) we are dealing with discrete latent variables. This means that each state of the model can emit/generate different outputs, each of which has an associated emission probability. In other words, each state has a probability distribution over possible outputs. Notice also that each output can be emitted by more than one state.
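Emission probabilities can be arranged in a matrix just like transitions. A minimal sketch, assuming hypothetical observable outputs ('walk', 'shop', 'clean') and made-up emission values for the weather states:

In [ ]:

```python
import numpy as np
import pandas as pd

states = ['sun', 'rain', 'cloud']
outputs = ['walk', 'shop', 'clean']

# hypothetical emission probabilities: rows are hidden states,
# columns are observable outputs; each row sums to 1
emission_matrix = pd.DataFrame(
    [[0.6, 0.3, 0.1],
     [0.1, 0.4, 0.5],
     [0.3, 0.4, 0.3]],
    index=states, columns=outputs)

# the same output ('shop') can be emitted by more than one state
assert (emission_matrix['shop'] > 0).all()
# each row is a valid probability distribution
assert np.allclose(emission_matrix.sum(axis=1), 1.0)
```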
TODO: concrete example of HMM
In [ ]: