INF 674 S7: Markov Chains

Céline Comte & Fabien Mathieu

2017-2018

The objective of this practical is to make you see a Markov chain in action. In particular, you will observe what happens if the conditions of the Perron-Frobenius theorem are not met.

Things to know (theoretical take-away)

Right stochastic matrix Let $A = (a_{i,j})$ be an $n\times n$ matrix. We say that $A$ is a right stochastic matrix if:

  • The coefficients of $A$ are non-negative: $\forall i, j, a_{i,j}\geq 0$.
  • Each row sums to 1: $\forall i, \sum_j a_{i,j} = 1$.

Homogeneous Markov chain: A right stochastic matrix $A$ can define a Markov chain that describes the evolution of a distribution over $n$ states as follows: if one is in state $i$ at step $k$ of the process, the probability of being in state $j$ at step $k+1$ is $a_{i,j}$.

With this notation, the evolution of a Markov chain is easy to study: if $P_k$ is the probability distribution at step $k$ ($P_k\geq 0, \sum_{i}P_k[i]=1$), then we have

$$P_{k+1}=P_k A.$$
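As a quick illustration (on a hypothetical two-state chain, not part of the practical), one step of the evolution is a single vector-matrix product:

```python
import numpy as np

# A toy two-state chain (transition probabilities chosen arbitrarily):
# from state 0, stay with probability 0.9, move to state 1 with probability 0.1;
# from state 1, move to state 0 with probability 0.5, stay with probability 0.5.
A = np.array([[0.9, 0.1],
              [0.5, 0.5]])

P0 = np.array([1.0, 0.0])  # start in state 0 with certainty
P1 = P0 @ A                # one step: P_{k+1} = P_k A
print(P1)                  # [0.9 0.1]
```

Note that the result is again a distribution: non-negative and summing to 1, since $A$ is right stochastic.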

Irreducibility: Let $A$ be a non-negative matrix ($\forall i,j, A[i,j]\geq 0$). Let $G=(V,E)$ be the oriented graph associated to $A$: $(i,j)\in E$ if, and only if, $A[i,j]>0$. The following propositions are equivalent:

  • $A$ is irreducible
  • $\forall i,j, \exists k>0, A^k[i,j]>0$
  • $G$ is strongly connected: $\forall (i,j)\in V^2$, there exists an oriented path in $G$ from $i$ to $j$.

Intuitively, irreducibility means that starting from any state, any state can be reached with positive (i.e. $>0$) probability after some number of steps.

Aperiodicity: (from Wikipedia) For an index $i$, the period of $i$ is the greatest common divisor of all natural numbers $k$ such that $A^k[i,i]>0$. When $A$ is irreducible, the period of every index is the same and is called the period of $A$.

$A$ is said to be aperiodic if the period of each index is 1.

Intuitively, a period $k>1$ indicates that the length of any cycle in $G$ must be a multiple of $k$.
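To make the definition concrete, here is a small sketch (the function name is ours, and we truncate the gcd to powers $k \leq k_{\max}$, which is enough for these small examples) that estimates the period of an index:

```python
import numpy as np
from math import gcd
from functools import reduce

def period(A, i, k_max=50):
    """Estimate the period of index i: gcd of all k <= k_max with A^k[i,i] > 0."""
    ks = []
    Ak = np.eye(len(A))
    for k in range(1, k_max + 1):
        Ak = Ak @ A           # Ak is now A^k
        if Ak[i, i] > 0:
            ks.append(k)
    return reduce(gcd, ks) if ks else 0

# Deterministic cycle on 3 states: every return to a state takes a multiple of 3 steps.
C = np.roll(np.eye(3), 1, axis=1)
print(period(C, 0))  # 3
```

On the cyclic matrix `C`, the chain is irreducible but has period 3, so the Perron-Frobenius theorem below does not apply to it.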

Perron-Frobenius Theorem (a variant, actually): If $A$ is right stochastic, irreducible and aperiodic, then $A^k\xrightarrow[k\to \infty]{} B$, where $B$ is the right stochastic matrix having all its rows equal to the same row vector $Q$ defined as the unique normalized solution to the equation $QA = Q$.

Interpretation: When the conditions of the Perron-Frobenius theorem are met, the process eventually converges to a unique distribution that does not depend on the initial state, which is forgotten.

Markov chains animations

To start with, a small experiment that plays with the assumptions of the Perron-Frobenius theorem.

Consider the following game: you have a circular game board made of $ n = 36 $ squares numbered from $ 0 $ to $ 35 $. At (discrete) turn $ t=0 $, a player stands on square $ 0 $. Between two turns, the player moves forward a certain number of squares. Remember that the board is circular: if the player is on square $35$ and moves one square forward, she lands on square $0$. We propose to visualize the evolution of the (probabilistic) position of the player on the board.

We give you the code to visualize the game where, at each turn, the player moves one square forward.

First, evaluate the following cells. Note that you should install ffmpeg if it is not available on your system (https://ffmpeg.org/ ). Linux users may need to install the package libav-tools. If you are on a TPT machine, ffmpeg is not available and you need to use the writer avconv instead.


In [ ]:
%pylab inline

In [ ]:
from matplotlib import animation, rc
from IPython.display import HTML

# use writer = 'avconv' on TPT computers
rc('animation', html='html5', writer = 'ffmpeg')

# xkcd()

n = 36

Remark: uncomment the use of xkcd() if you want to pimp your animations a little bit.

The following function will be used to display the evolution of a distribution. It takes two arguments:

  • next_step is a function that takes a distribution as input and returns the resulting distribution after one step of the Markov process.
  • k_max indicates the number of steps you want to watch.

In [ ]:
def evolution(next_step, k_max):
    # Create the figure and axes used by the animation
    fig, ax = subplots()
    # Initial probability distribution: the position of the player is known.
    P = zeros(n)
    P[0] = 1
    # Display the distribution as a bar plot
    pbar = ax.bar(range(n), P, 1)
    xlim([0, n])
    # Init is only required for blitting to give a clean slate.
    def init():
        for rect, y in zip(pbar, P):
            rect.set_height(y)
        return pbar
    def animate(i):
        P[:] = next_step(P)  # Update the values using the next_step function
        for rect, y in zip(pbar, P):
            rect.set_height(y)
        return pbar
    ani = animation.FuncAnimation(fig, animate, frames=k_max, init_func=init,
                                  interval=50, blit=True)
    return ani

The rule for the toy example is to move one square forward. This can easily be done with the function roll from the numpy package.


In [ ]:
def next_step(P):
    # Roll, baby!
    return roll(P, 1)

We now call the function evolution. The %%capture magic hides the static picture that would be shown otherwise.


In [ ]:
%%capture
ani = evolution(next_step,180)

And... Showtime! (This can take a few seconds to initiate, depending on the k_max you set.)


In [ ]:
ani

Question 1

Are the conditions of the Perron-Frobenius theorem met? Justify your answer:

  • using the theoretical definitions
  • by commenting the animation above

Answer:

Question 2

We now change the rules. At each turn, the player tosses an unbiased coin. If it comes up heads, she moves forward $ a = 1 $ square. If it comes up tails, she moves forward $ b = 2 $ squares.

As we are now playing with more general Markov chains, roll may not be enough. You may want to use dot(P, A), which multiplies a vector $P$ and a matrix $A$, with a properly built matrix $A$.


In [ ]:
def next_stepA(P):
    return dot(P, A)
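One possible way to build such a matrix (a sketch of the intended construction; with `%pylab inline` the `numpy` import below is already implicit) is to put, on each row $i$, probability $1/2$ on the two squares reachable from $i$:

```python
import numpy as np

n = 36
a, b = 1, 2

# From square i, move to (i + a) % n or (i + b) % n with probability 1/2 each.
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + a) % n] += 0.5
    A[i, (i + b) % n] += 0.5

# One step from the initial distribution (player on square 0):
P = np.zeros(n)
P[0] = 1
P = P @ A
print(P[1], P[2])  # 0.5 0.5
```

Using `+=` rather than `=` keeps the matrix stochastic even when the two moves land on the same square, which will matter in the later questions.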

Visualize the evolution of the position of the player. Comment what you observe.

Answer:

Question 3

New rules. We now assume $ a = 1 $, $ b = i^2 $, where $ i $ is the index of the square where the player stands.

Visualize the evolution of the position of the player. Comment what you observe.

Answer:

Question 4

New rules. We assume now $ a = 1 $, $ b = 7 $.

Visualize the evolution of the position of the player. Comment what you observe.

Answer:

Question 5

New rules. We assume now $ a = 2 $, $ b = 4 $.

Visualize the evolution of the position of the player. Comment what you observe.

Answer:

Question 6

New rule: when the player stands on square 35, she rolls a die. If she scores 6 (which happens with probability $1/6$), she realizes that the game is pointless and quits the game. Redo Questions 2 to 5 with this new rule (theory, practice, discussion...).

Answer:

PageRank on a small graph (if time)

The goal of this last part is to test a few variants of PageRank on small graphs and see how they differ.

It is entirely optional, but it can help you before running PageRank on bigger graphs.

Let $ h $ be an integer (use $ h $ = 3 for toy evaluation, $ h = 9 $ for complete evaluation).

Let $ G $ be the following graph:

  • There are $ n = 2^{h+1}-1 $ nodes numbered from $0$ to $ 2^{h+1}-2 $.
  • The oriented edges are as follows:
    • For all nodes $ i\geq 1 $, there is an edge from $ i $ to $ \lfloor (i-1) / 2 \rfloor $.
    • For all nodes $ i\geq 1 $, there is an edge from $ i $ to $ i - 1 $.
    • For all nodes $j$ such that $ j \in [2^h-1, 2^{h+1}-2] $, there is an edge from $0$ to $j$.
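To make sure the definition above is clear, here is a sketch (variable names are ours) that enumerates the oriented edges of $G$:

```python
import numpy as np

h = 3                     # toy evaluation
n = 2 ** (h + 1) - 1      # number of nodes

# Enumerate the oriented edges of G as defined above.
edges = set()
for i in range(1, n):
    edges.add((i, (i - 1) // 2))   # edge towards node floor((i-1)/2)
    edges.add((i, i - 1))          # edge towards the previous node
for j in range(2 ** h - 1, n):
    edges.add((0, j))              # node 0 points to the last 2^h nodes

print(len(edges))  # 35 distinct edges for h = 3 (the two rules coincide for i = 1)
```

Note that for $i = 1$ the two rules give the same edge $(1, 0)$, so node 1 has out-degree 1 while every other node $i \geq 2$ has out-degree 2.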

Question 1

Build the stochastic matrix $A$ associated to $G$, defined by $ A[i,j] = 1/\deg_+(i) $ if there is an edge from $ i $ to $ j $, and $0$ otherwise, where $\deg_+(i)$ denotes the out-degree of $i$.

Answer:

Question 2

Try to guess which nodes of the graph are important. Justify your answer (simply copying/pasting the answer to the next question is not enough).

Answer:

Question 3

Compute the PageRank $P$, which is a solution of $ P = P A $. Proceed iteratively, starting from $ P = \mathbf{1}/n $ and updating $ P \leftarrow P A $ until $ ||P-PA||_1<\epsilon $ (recommendation: $ \epsilon = 10^{-8} $). Display the current top ten (the ten nodes with the highest values in the current $P$) for each of the first 10 iterations, then the total number of iterations and the final top ten.

Configure a maximal number of authorized iterations to avoid infinite loops (recommendation: 2000 iterations). When the maximal number of iterations is reached, your function should display a message saying that the process has not converged, along with the last top ten obtained.
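The iteration described above can be sketched as follows (the function name is ours, and the demo matrix at the end is a hypothetical 2-state chain; the real input is the matrix $A$ built in Question 1):

```python
import numpy as np

def pagerank(A, eps=1e-8, max_iter=2000):
    """Iterate P <- P A from the uniform distribution until the L1 change
    drops below eps, printing the top ten of the first 10 iterations."""
    n = len(A)
    P = np.ones(n) / n
    for it in range(max_iter):
        Q = P @ A
        if it < 10:
            print(f"iteration {it}: top ten = {np.argsort(-Q)[:10]}")
        if np.abs(P - Q).sum() < eps:
            print(f"converged after {it + 1} iterations")
            return Q
        P = Q
    print("warning: not converged; last top ten:", np.argsort(-P)[:10])
    return P

# Demo on a hypothetical 2-state chain, whose stationary distribution is (5/6, 1/6).
print(pagerank(np.array([[0.9, 0.1], [0.5, 0.5]])))
```

Computing `Q = P @ A` once per iteration and reusing it both for the stopping test and for the update avoids multiplying by $A$ twice per step.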

Answer:

Question 4

We add $b$ new nodes to $G$ ($ b = 10 $) numbered from $ n $ to $ n+b-1 $. For each new node $ i $, we add an edge from $ i-1 $ to $i$.

  • Do Questions 1, 2, 3 over with this new graph. What happens (in theory and in practice)?
  • Use $ P \leftarrow P A / ||PA||_1 $ for updating $P$. What happens?

Answer:

Question 5

We add one edge to the previous graph, from $ n+b-1 $ to $ n+b-2 $.

  • Do Questions 1, 2, 3 over with this new graph, along with the update proposed in Question 4. What happens (in theory and in practice)?
  • Use $ P \leftarrow d P A + (1-d)\mathbf{1}/n $ for updating $P$, with $d = 0.85$. What happens?
  • (Bonus) Try numerically to simply use $ P \leftarrow d P A + \mathbf{1} $ for updating $P$. Can you explain the result?

Answer:

Question 6

In case you didn't do it already, please compare and discuss the convergence and rankings you observed in this practical.

Answer: