In [ ]:
%%HTML
<style>
.output_png {
    display: table-cell;
    text-align: center;
    vertical-align: middle;
}

body:after {
background-image: url('../04-Small-Worlds/lincs.png');
background-size: 200px 100px;
position: fixed;
bottom: 1em;
right: 8em;
width: 200px; 
height: 100px;
content:"";
}
</style>

Markov Chains & PageRank

Céline Comte, Fabien Mathieu

ACN Master

Roadmap

1. Introduction

2. Markov Chains

3. PageRank

Introduction

Online resources

For theoretical background, look at

How to find the page I am looking for?

  • I know it already (bookmark...)
  • I surf from page to page
  • I use a search engine

Ranking: A Big Issue

  • One request, lots of answers:
    • Google: 11 billion results
    • Algorithm: 128 million results
    • Platypus: 9 million results
  • Humans seldom read past the first page (10 results)
  • How to select relevant pages?

An Old Issue

Solutions

  • Content of pages (pertinence)
  • Study the URL
  • Number of incoming links
  • Hyperlinks

Markov Chains

Definitions

  • $ V $: a set of states
  • Random process $ X_k, k\in \mathbb{N} $: set of random variables on $ V $
  • Markov chain: $\forall k$, $$P(X_k=j|X_{k-1} = i_{k-1}, \ldots, X_0 = i_0) = P(X_k=j | X_{k-1} = i_{k-1})$$
  • Transition coefficients: $ p^k_{i,j} = P(X_k = j | X_{k-1} = i) $
  • Homogeneous chain: $ p^k_{i,j} = p_{i,j} $ (independent of $ k $).

Stochastic matrix

  • $ A = (p_{i,j}) $: right stochastic matrix (each row sums to $1$).
  • If $ x_k $ (row vector) represents the state distribution at step $ k $, then $ x_k = x_0 A^k $.
  • Interpretation: multiply by $ A $ $\Rightarrow$ move one step.
  • Proof: just write it.
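A minimal numpy sketch of this interpretation (the two-state matrix is a made-up example): multiplying the state distribution by $A$ advances the chain one step.

```python
import numpy as np

# Hypothetical two-state chain; right stochastic: each row sums to 1
A = np.array([[0.9, 0.1],
              [0.5, 0.5]])
assert np.allclose(A.sum(axis=1), 1)

x0 = np.array([1.0, 0.0])                # start in state 0
x3 = x0 @ np.linalg.matrix_power(A, 3)   # x_3 = x_0 A^3

# One multiplication by A = one step of the chain
x = x0
for _ in range(3):
    x = x @ A
assert np.allclose(x, x3)   # both paths give the same distribution
```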

Transition Graph

A Markov chain can be represented by a weighted oriented graph $ G = (V,E,m) $

  • $ V $: states
  • $ E $: non-zero transitions
  • $ m(e) $: probability of transition $ e $

Spoiler: PageRank interprets a Web Graph as a transition graph

Aperiodic, Irreducible

  • Aperiodic: all nodes have period 1
  • Irreducible: $ \forall i,j, \exists k, (A^k)_{i,j} > 0 $
  • Irreducibility is linked to connectivity
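For a finite chain, irreducibility can be checked mechanically: $A$ is irreducible iff $(I+A)^{n-1}$ has no zero entry. A sketch on two made-up matrices:

```python
import numpy as np

def is_irreducible(A: np.ndarray) -> bool:
    """Finite chain: A is irreducible iff (I + A)^(n-1) is entrywise positive."""
    n = A.shape[0]
    M = np.linalg.matrix_power(np.eye(n) + A, n - 1)
    return bool((M > 0).all())

# Deterministic 2-cycle: irreducible (but periodic)
cycle = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
# Absorbing state 1: reducible
absorbing = np.array([[0.5, 0.5],
                      [0.0, 1.0]])
print(is_irreducible(cycle), is_irreducible(absorbing))  # → True False
```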

Stochastic Perron-Frobenius

Main Theorem

Let $A$ be a stochastic matrix

  1. The spectral radius of $ A $ is $1$ (it is an eigenvalue).
  2. If $ A $ is irreducible, there is a unique probability distribution $P$ that is a left eigenvector for $1$. Moreover, $ P > 0 $.
  3. If $ A $ is also aperiodic, all eigenvalues but $1$ have modulus $< 1$.

Proof of 1. as exercise

Convergence of $A$

Let $ A $ be stochastic, irreducible, aperiodic, and $ P $ the corresponding probability eigenvector.

Then $ A^k \underset{k\to \infty}{\longrightarrow} \mathbf{1}.P $

Proof: $ A^k $ converges to $ A^\infty $, stochastic rank 1 projection over $ P $.

Corollary

Let $ P_0 $ be a probability distribution over $ V $. Then $ P_k := P_0 A^k \underset{k\to \infty}{\longrightarrow} P $.

Proof: straightforward

Interpretation

  • An irreducible, aperiodic Markov chain forgets its starting point.
  • $ P $: stationary distribution.
  • $ P > 0 $: all states are visited infinitely often.
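A quick numerical illustration on a made-up chain (random positive rows, hence irreducible and aperiodic): two different starting distributions converge to the same $P$.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical chain: strictly positive rows, normalized to sum to 1
A = rng.random((4, 4)) + 0.1
A /= A.sum(axis=1, keepdims=True)

def iterate(P0, steps=200):
    P = P0
    for _ in range(steps):
        P = P @ A
    return P

P_a = iterate(np.array([1.0, 0, 0, 0]))   # start concentrated on state 0
P_b = iterate(np.ones(4) / 4)             # uniform start
assert np.allclose(P_a, P_b)              # the chain forgot its start
assert np.allclose(P_a @ A, P_a)          # stationary: P A = P
assert (P_a > 0).all()                    # P > 0
```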

Removing hypotheses

Perron-Frobenius conditions are not always met:

  • reducible matrices
  • periodic graphs
  • sub-stochastic matrices

Reducible matrices

Any directed graph can be split into strongly connected components (SCCs). If $A$ is reducible, $G$ possesses more than one SCC:

  • At least one recurrent SCC
  • Possibly transient SCC's

Reducible matrices

Up to a permutation of states we have: $$A = \begin{pmatrix} T & E \\ 0 & R \end{pmatrix}\text{, with } R = \begin{pmatrix} R_1 & 0 & \cdots & 0 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & R_d \end{pmatrix} $$

Reducible matrices

If the recurrent SCCs are aperiodic and irreducible, $$A^k \underset{k\to \infty}{\longrightarrow} \begin{pmatrix} 0 & F \\ 0 & R^\infty \end{pmatrix}\text{, with } \left\{ \begin{array}{l} R^\infty = \begin{pmatrix} R^\infty_1 & 0 & \cdots & 0 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & R^\infty_d \end{pmatrix} \\ \text{ and }\\ F = (I-T)^{-1}ER^\infty \end{array} \right. $$

Interpretation

  • $ R^\infty $: one cannot escape a recurrent SCC.
  • $ T^k \to 0 $: the probability of being in a transient state goes to $0$.
  • $ F $ describes the dispatch from transient to recurrent:
    • $ (I-T)^{-1} = \sum_k T^k $: walk within transient SCCs;
    • $ E $: transient-to-recurrent transition;
    • $ R^\infty $: stationary state.
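The formula $F = (I-T)^{-1} E R^\infty$ can be checked numerically on a toy example with one transient state feeding two absorbing states:

```python
import numpy as np

# One transient state (0) feeding two absorbing recurrent states (1, 2):
# block form A = [[T, E], [0, R]] with T = [[0.5]], E = [[0.3, 0.2]], R = I
A = np.array([[0.5, 0.3, 0.2],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
T, E = A[:1, :1], A[:1, 1:]
R_inf = np.eye(2)                        # absorbing blocks: R^∞ = I

F = np.linalg.inv(np.eye(1) - T) @ E @ R_inf
A_inf = np.linalg.matrix_power(A, 100)   # numerical stand-in for A^∞
assert np.allclose(A_inf[:1, 1:], F)     # F = (I - T)^{-1} E R^∞
print(F)  # → [[0.6 0.4]]
```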

Periodic Matrices

  • Cycles
  • Bipartite graphs

If irreducible:

  • Keep existence of a unique eigenprobability.
  • Lose standard convergence (keep Cesàro convergence)

Periodic Matrices

Let $ A $ be stochastic, irreducible, associated to probability $ P $. Let $ B = \alpha A + (1-\alpha) I $, for $ \alpha \in ]0,1[ $, and let $ P_0 $ be a probability distribution. Then $$P_0 B^k\underset{k\to \infty}{\longrightarrow} P$$

Interpretation: uniform loops preserve $P$ and suppress periodicity.
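Illustration on the deterministic 2-cycle (a minimal made-up example): powers of $A$ oscillate, while the lazy chain $B = \alpha A + (1-\alpha)I$ converges to $P = (1/2, 1/2)$.

```python
import numpy as np

A = np.array([[0.0, 1.0],    # deterministic 2-cycle: period 2
              [1.0, 0.0]])
alpha = 0.5
B = alpha * A + (1 - alpha) * np.eye(2)   # uniform self-loops

P0 = np.array([1.0, 0.0])
# A oscillates: P0 @ A^k is (1,0) or (0,1) depending on the parity of k
assert np.allclose(P0 @ np.linalg.matrix_power(A, 101), [0.0, 1.0])
# B converges to the eigenprobability P = (1/2, 1/2) of A
Pk = P0 @ np.linalg.matrix_power(B, 100)
assert np.allclose(Pk, [0.5, 0.5])
```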

Substochastic matrices

$ A $ sub-stochastic: $ 0\leq A \lneq B $, with $ B $ stochastic.

$ A $ sub-irreducible: $ 0\leq A \lneq B $, with $ B $ stochastic irreducible.

Theorem: if $ A $ sub-irreducible, $ A^k \to 0 $

Interpretation: leak (aka incomplete Markov Chain)
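A toy illustration of the leak: a 2-cycle where one state loses 10% of its mass at each step, so the total mass decays to $0$.

```python
import numpy as np

# Sub-irreducible: 2-cycle where state 1 leaks 10% of its mass each step
A = np.array([[0.0, 1.0],
              [0.9, 0.0]])
mass = [np.linalg.matrix_power(A, k).sum() for k in (1, 10, 100)]
print(mass)   # total mass decays geometrically toward 0
assert np.linalg.matrix_power(A, 200).max() < 1e-4   # A^k → 0
```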

Pseudo-Recurrent Component

Let $ A $ be non-negative, associated to $G$. An SCC is pseudo-recurrent iff:

  • Its spectral radius is maximal (compared to the other SCCs);
  • Any SCC reachable from it has a strictly smaller spectral radius.

Theorem

  • The spectral radius of $A$ is that of the pseudo-recurrent (PR) SCCs;
  • There is a positive maximal eigenvalue (unique if aperiodic), with multiplicity equal to the number of PR SCCs;
  • To each PR SCC $ C $ is associated a non-negative eigenvector with support $ \uparrow C $.

Interpretation

If $ A $ is sub-irreducible, $ A^k $ goes to $ 0 $, but:

  • Convergence is not uniform.
  • Pseudo-recurrent components are the slowest $ \rightarrow $ they will prevail in $ A^k $.
  • If there is a unique pseudo-recurrent SCC $ C $ with $ \uparrow C = V $, $ A $ is called pseudo-irreducible.

Take-Away

  • Markov Chain $ \Leftrightarrow $ random walk in a graph.
  • Perfect case (fully defined, irreducible, aperiodic) $ \Rightarrow $ convergence to a unique stationary distribution.
  • Otherwise, workarounds are required

PageRank Basics

Use hyperlinks

  • Simple method: Indegree
  • Problem: lacks robustness (easily manipulated)
  • Brin & Page: a page is important if referenced by important pages
    • Recursive indegree
    • Must be turned into equation

Random surfer

  • Assumes important pages are easily found thanks to structure
  • Model: user randomly clicks from page to page
  • Importance $ \Leftrightarrow $ probability
  • A few details to check

Web graph transition matrix

$ A = (a_{i,j}) $, with $ a_{i,j} = 1 / d(i) $ if $ i \rightarrow j $, and $0$ otherwise, where $ d(i) $ is the out-degree of page $ i $.

  • $ A $ is substochastic.
  • $ A $ stochastic $ \Leftrightarrow $ no leaf (page without outgoing links).
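A sketch of the construction on a hypothetical 4-page graph where page 3 is a leaf:

```python
import numpy as np

# Made-up toy web graph: page -> pages it links to; page 3 is a leaf
links = {0: [1, 2], 1: [2], 2: [0, 1, 3], 3: []}
n = len(links)
A = np.zeros((n, n))
for i, out in links.items():
    for j in out:
        A[i, j] = 1 / len(out)   # a_{i,j} = 1/d(i) if i -> j

row_sums = A.sum(axis=1)
print(row_sums)   # the leaf row sums to 0: A is only sub-stochastic
assert row_sums[3] == 0 and np.allclose(row_sums[:3], 1)
```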

PageRank: simple definition

  • Build $A$ from Web Graph, assume PF conditions are met
  • Choose some $P_0$ over $V$
  • Do $P_{n+1} = P_n A$ as long as necessary
  • PR on simple graph?
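A sketch of this simple definition on a made-up strongly connected, aperiodic graph (so PF conditions hold):

```python
import numpy as np

# Toy strongly connected graph with no leaf: PF conditions are met
links = {0: [1, 2], 1: [2], 2: [0]}
n = len(links)
A = np.zeros((n, n))
for i, out in links.items():
    A[i, out] = 1 / len(out)

P = np.ones(n) / n            # any P0 works
for _ in range(1000):         # "as long as necessary"
    P = P @ A
assert np.allclose(P @ A, P)  # converged to the stationary distribution
print(P.round(3))             # → [0.4 0.2 0.4]
```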

Coping with Web Graph Reality

  • Stochastic
  • Irreducible
  • Fit in memory

Stochasticity of a Web graph

A web graph possesses many leaves:

  • Pages with no hrefs
  • Unexplored pages (the crawl is incomplete)

As a result, probability leaks.

Normalized PageRank

Simple workaround for leaves: $P_{n+1} = P_n A / ||P_n A||_1$

PageRank with damping

  • Actual class of PageRank to use
  • $P_{n+1} = \alpha P_n A +(1-\alpha) Z$
    • $Z$: zapping distribution
    • $\alpha$: damping factor
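A sketch of the damped iteration with a uniform zapping distribution $Z$; the toy graph is made up, and $\alpha = 0.85$ is the usual value.

```python
import numpy as np

def pagerank(A, alpha=0.85, Z=None, n_iter=100):
    """Damped iteration: P_{n+1} = alpha * P_n A + (1 - alpha) * Z (sketch)."""
    n = A.shape[0]
    if Z is None:
        Z = np.ones(n) / n        # uniform zapping distribution
    P = Z.copy()
    for _ in range(n_iter):
        P = alpha * (P @ A) + (1 - alpha) * Z
    return P

# Reducible toy graph: without damping, all mass gets trapped in {2, 3}
links = {0: [1], 1: [2], 2: [3], 3: [2]}
A = np.zeros((4, 4))
for i, out in links.items():
    A[i, out] = 1 / len(out)

P = pagerank(A)
assert np.isclose(P.sum(), 1)     # A is stochastic here: mass is conserved
assert (P > 0).all()              # zapping keeps every page reachable
print(P.round(3))
```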

Interpretations

  • Random walk with reset
  • Propagation of vanishing fluid injected from $Z$
  • Fixed point equation

Memory footprint

Usually, $A$ does not fit into memory.

Solution: cf Wikipedia practical

Customization

In practice, ranking corresponds to a request.

  • Mix PR with other metrics
  • Customize PR itself

Customization

Three ways to customize $A$:

  • $\alpha$
  • $A$
  • $Z$ (cf practical)