Community detection algorithms

On this seminar your are asked to implement simple community detection algorightm. It is called Markov Cluster Algorithm (MCL).

It consists of the following steps: <br> Input: Transfer matrix $T = D^{-1}A$ <br> Output: Adjacency matrix $M^*$ <br> <br>

  1. Set $M = T$
  2. repeat:
    1. Expansion Step: $M = M^p$ (usually $p=2$)
    2. Inflation Step: Raise every entry of $M$ to the power $\alpha$ (usualy $\alpha=2$)
    3. Renormalize: Normalize each column by its sum
    4. Prunning: Remove entries that are close to $0$
  3. until $M$ converges
  4. $M^* = M$ <br> <br>

As a result you should get a cluster matrix s.t. elements of the cluster correspont to nonzero elements of the rows of the matrix. <br>

  • What advantages and disadvantages does this method have?
  • Run this method for network 1, 2 and 3.

In [ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# plt.xkcd()

import networkx as nx
import scipy
import scipy.io
%matplotlib inline

In [ ]:
data = scipy.io.loadmat('./data/network1.mat')
A = data['A'].astype( 'float' )
plt.spy( A )
comm = data[ 'Comm' ]
# Continue here
#

In [ ]:
def markov_clust( A, p = 2, alpha = 2, tol = 1e-8, rel_eps = 1e-4, niter = 1e3 ) :
    i = 0
## Convert into a transfer matrix
    M = np.dot( np.diag( 1.0 / np.sum( A, axis = 1, dtype = np.float64 ) ), A )
    while i < niter :
## Remember the old matrix
        M_prime = M
## Expansion step
        M = np.linalg.matrix_power( M, p )
## Inflation step
        M = np.power( M, alpha )
## Renormalisation step
        M = np.dot( np.diag( 1.0 / np.sum( M, axis = 1, dtype = np.float64 ) ), M )
## Pruning 
        M[ np.abs( M ) < tol ] = 0
        if np.sum( np.abs( M - M_prime ) / ( np.abs( M_prime ) + rel_eps ) ) < rel_eps :
            break
        i += 1
    return (M, i)

In [ ]:
M, i = markov_clust( A )
plt.spy( M )
print i
  1. Set $M = T$
  2. repeat:
    1. Expansion Step: $M = M^p$ (usually $p=2$)
    2. Inflation Step: Raise every entry of $M$ to the power $\alpha$ (usualy $\alpha=2$)
    3. Renormalize: Normalize each column by its sum
    4. Prunning: Remove entries that are close to $0$
  3. until $M$ converges
  4. $M^* = M$