In [3]:
import networkx as nx
import numpy as np
import pandas as pd
In [5]:
graph=nx.karate_club_graph()
For undirected networks, the node visit frequency of node $\alpha$ simply corresponds to the relative weight $\omega_\alpha$ of the links connected to the node. The relative weight is the total weight of the links connected to the node divided by twice the total weight of all links in the network, which corresponds to the total weight of all link-ends. (Rosvall, M., Axelsson, D., & Bergstrom, C. T. (2010). The map equation. The European Physical Journal Special Topics, 178(1), 13–23.)
In our case this is further simplified as the graph is not directed (weights are implicitly 1). This makes in priciple the computation of page rank obsolete as the node visit frequencies are given by the relative weight. The computation of the exit node probability $q_{i\curvearrowright} = \tau \frac{n-n_i}{n} \sum_{\alpha\in i} p_\alpha + (1 -\tau ) \sum_{\alpha \in i} \sum_{\beta \notin i} p_\alpha \omega_{\alpha\beta}$ would also be simplified as $q_{i\curvearrowright}$ becomes $\omega_{\curvearrowright}$:
$\omega_{i\curvearrowright}$ for the relative weight of links exiting module $i$, and then $\omega_{\curvearrowright} = \sum_{i=1}^{m} \omega_{i\curvearrowright}$ for the total relative weight of links between modules.
However, ommiting the pagerank calculation could potentially complicate other things down the road as maybe later we would like to use directed graphs and it introduces additional logic as opposed to one short method call (as exemplified in the cell below).
So there is a tradeoff to be made. The table bewlow shows the difference between the two approaches exemplified by using the infamous karate network. The question is if these differences are neglecable.
$L(M) = \omega_{\curvearrowright} \log_2(\omega_{\curvearrowright}) - 2 \sum_{i=1}^{m} \omega_{i\curvearrowright} \log_2(\omega_{i\curvearrowright}) - \sum_{\alpha=1}^{n} \omega_{\alpha} \log_2(\omega_{\alpha}) + \sum_{i=1}^{m} (\omega_{i\curvearrowright} + \omega_i) \log_2(\omega_{i\curvearrowright} + \omega_i) $
In [6]:
page_rank=nx.pagerank(graph).values()
In [10]:
relative_weight = [float(graph.degree(i))/sum(graph.degree().values()) for i in graph]
In [38]:
difference=map(lambda x, y: abs(x-y), page_rank, relative_weight)
In [57]:
pd.DataFrame(difference, columns=['Absolute differences pagerank vs. relative weight (true empirical value)'])
Out[57]: