Tutorial "Algorithmic Methods for Network Analysis with NetworKit" (Part 3)

Determining Important Nodes

There are a number of ways to measure the importance of nodes in a network. Possibly the easiest is the degree, i.e. the number of neighbors. In a social network, for example, a person who knows many others could be an important person. However, is this notion really meaningful? Probably not, since it does not consider the importance of the neighbors. Also, there is an interesting effect in social networks with respect to neighborhood sizes. Let us investigate this effect a little bit:

Q&A Session #4

Do you think your number of online friends is above/below/on average? (You do not have to answer this question openly.) Answer (may be secret):
What do you expect: How many people (in percent) in a social network have fewer friends than their friends on average? Answer (choose one): a) 0 - 25% b) 26 - 50% c) 51 - 75% d) 76 - 100%
Use the Facebook graph. Compute for each vertex the average degree of its neighbors. Answer:
Count the number of persons whose friends have on average more friends. What is their percentage in this network? Answer:



In [1]:

    
from networkit import *



In [2]:

    
%matplotlib inline



In [3]:

    
cd ~/Documents/workspace/NetworKit









    



/Users/Henning/Documents/workspace/NetworKit



In [4]:

    
G = readGraph("input/MIT8.edgelist", Format.EdgeListTabZero)



In [5]:

    
# Code for 3-3) and 3-4)
%matplotlib inline
import matplotlib.pyplot as plt

# def avgFriendDegree(v):



In [6]:

    
count = 0 # count the number of persons whose friends have on average more friends

Thus, ... % of the users in this network have fewer friends than their friends have on average. While this result cannot be generalized exactly like this to other networks, the qualitative effect is often seen in social (and other) networks. Thus, let us now consider measures that broaden the rather narrow scope of the degree.

$k$-core Decomposition

Thus, the next concept we consider is the $k$-core decomposition. To answer the following Q&A session, go back to the lecture slides.

Q&A Session #5

What is the definition of an $i$-core? (Note that $k$ is often used for the largest core only!) Answer:
Why do you think it can be considered a more robust measure for importance compared to the degree? Answer:
Compute the $k$-core decomposition of the three networks used before. Then print the non-empty $i$-shells by using the method coreNumbers(). What results (similarities/differences) do you expect? Are these expectations met by the results? Answer:
What disadvantage do you see when using $k$-core decomposition to rate nodes according to their importance? Answer:



In [7]:

    
# Code for 5-3)
mit8 = readGraph("input/MIT8.edgelist", Format.EdgeListTabZero)
airf1 = readGraph("input/airfoil1.graph", Format.METIS)
gen = generators.ErdosRenyiGenerator(1000, 0.01)
er1000 = gen.generate()

# for g in {mit8, airf1, er1000}:

Centrality Measures

The $k$-core decomposition is rather, as the name suggests, a decomposition of the vertices into discrete subsets. Nodes with the same coreness (i.e. in the same shell) have equal importance. Rankings where many vertices are equally important are often not very meaningful. That is why the $k$-core decomposition should not be interpreted as a fine-grained ranking mechanism.

Q&A Session #6

Take the Facebook graph MIT8 and find the most central nodes. Take the relevance of their neighbors into account. Consider that MIT8 models a social network, not a web graph. Which algorithm would you choose? (Hint: Look at the lecture slides!) Answer:
What are the 15 most important nodes according to the method in 1)? Answer:
What other centrality measures do you recall? Answer:

After you answered the questions, proceed with Tutorial #4.



In [8]:

    
# Code for 6-1) and 6-2)