In [3]:
import networkx as nx
import numpy as np
import pandas as pd

In [5]:
graph=nx.karate_club_graph()

For undirected networks, the node visit frequency of node $\alpha$ simply corresponds to the relative weight $\omega_\alpha$ of the links connected to the node. The relative weight is the total weight of the links connected to the node divided by twice the total weight of all links in the network, which corres􏰓ponds to the total weight of all link-ends. (Rosvall, M., Axelsson, D., & Bergstrom, C. T. (2010). The map equation. The European Physical Journal Special Topics, 178(1), 13–23.)

In our case this is further simplified as the graph is not directed (weights are implicitly 1). This makes in priciple the computation of page rank obsolete as the node visit frequencies are given by the relative weight. The computation of the exit node probability $q_{i\curvearrowright} = \tau \frac{n-n_i}{n} \sum_{\alpha\in i} p_\alpha + (1 -\tau ) \sum_{\alpha \in i} \sum_{\beta \notin i} p_\alpha \omega_{\alpha\beta}$ would also be simplified as $q_{i\curvearrowright}$ becomes $\omega_{\curvearrowright}$:

$\omega_{i\curvearrowright}$ for the relative weight of links exiting module $i$, and then $\omega_{\curvearrowright}􏰐 = \sum_{i=1}^{m} \omega_{i\curvearrowright}􏰐$ for the total relative weight of links between modules.

However, ommiting the pagerank calculation could potentially complicate other things down the road as maybe later we would like to use directed graphs and it introduces additional logic as opposed to one short method call (as exemplified in the cell below).

So there is a tradeoff to be made. The table bewlow shows the difference between the two approaches exemplified by using the infamous karate network. The question is if these differences are neglecable.

$L(M) = \omega_{\curvearrowright} \log_2(\omega_{\curvearrowright}) - 2 \sum_{i=1}^{m} \omega_{i\curvearrowright}􏰐 \log_2(\omega_{i\curvearrowright}) - \sum_{\alpha=1}^{n} \omega_{\alpha} \log_2(\omega_{\alpha}) + \sum_{i=1}^{m} (\omega_{i\curvearrowright}􏰐 + \omega_i) \log_2(\omega_{i\curvearrowright} + \omega_i) $


In [6]:
page_rank=nx.pagerank(graph).values()

In [10]:
relative_weight = [float(graph.degree(i))/sum(graph.degree().values()) for i in graph]

In [38]:
difference=map(lambda x, y: abs(x-y), page_rank, relative_weight)

In [57]:
pd.DataFrame(difference, columns=['Absolute differences pagerank vs. relative weight (true empirical value)'])


Out[57]:
Absolute differences pagerank vs. relative weight (true empirical value)
0 0.005562
1 0.004814
2 0.007024
3 0.002601
4 0.002749
5 0.003472
6 0.003472
7 0.001150
8 0.002286
9 0.001488
10 0.002749
11 0.003155
12 0.001825
13 0.002515
14 0.001715
15 0.001715
16 0.003965
17 0.001738
18 0.001715
19 0.000374
20 0.001715
21 0.001738
22 0.001715
23 0.000530
24 0.001845
25 0.001775
26 0.002223
27 0.000002
28 0.000342
29 0.000646
30 0.001052
31 0.001305
32 0.005231
33 0.008056