Aitor Morales-Gregorio
Python Module of the Week on the 29.12.2019 @ INM-6
At best in a fresh environment with python3 and matplotlib:
conda env create -n nx python=3 matplotlib
pip install networkx
You can also use the environment file in this repository 'env_nx.yml' and create your environment simply by:
conda env create -f env_nx.yml
And activate it:
conda activate nx
The following basic graph types are provided as Python classes:
Graph
This class implements an undirected graph. It ignores multiple edges between two nodes. It does allow self-loop edges between a node and itself.
DiGraph
Directed graphs, that is, graphs with directed edges. Provides operations common to directed graphs, (a subclass of Graph).
MultiGraph
A flexible graph class that allows multiple undirected edges between pairs of nodes. The additional flexibility leads to some degradation in performance, though usually not significant.
MultiDiGraph
A directed version of a MultiGraph.
This tutorial partially follows the tutorials in the official NetworkX documentation (which is very good!)
In [1]:
import networkx as nx
In [2]:
import matplotlib.pyplot as plt
from random import randint
from pprint import pprint
In [3]:
# Let's create a fresh network!
G = nx.Graph()
nx.draw(G)
In [4]:
# You can add a node like so
G.add_node(1)
nx.draw(G)
In [5]:
# Or you can add other more meaningful nodes all from a list
G.clear()
G.add_nodes_from(['BSc', 'MSc', 'PhD', 'PostDoc', 'Prof', 'Technician'])
nx.draw(G, with_labels=True)
In [6]:
# Adding edges is pretty much the same, just with tuples instead of single values
G.add_edge('BSc', 'MSc')
G.add_edges_from([('MSc', 'PhD'), ('PhD', 'PostDoc'), ('PostDoc', 'Prof')])
for node in list(G.nodes):
G.add_edge('Technician', node)
nx.draw(G, with_labels=True)
In [7]:
# One can get the adjacency matrix
A = nx.adjacency_matrix(G)
print('-- This is a Scipy sparse matrix:\n')
print(A)
print('\n-- This is a numpy matrix:\n')
print(A.todense())
In [8]:
# Degree
list(G.degree())
Out[8]:
In [9]:
# Clustering coefficient
nx.clustering(G)
Out[9]:
In [10]:
# Betweenness centrality
nx.betweenness_centrality(G)
Out[10]:
In [11]:
# Rich club coefficient
## nx.rich_club_coefficient(G) --> ERROR: Not implemented for graphs with self loops
# I destroy edges by removing the node
G.remove_node('Technician')
G.add_node('Technician')
nx.draw(G, with_labels=True)
plt.show()
nx.rich_club_coefficient(G)
Out[11]:
Link to documentation of rich club coefficient function
In [12]:
WG = nx.Graph()
WG.add_nodes_from(['BSc', 'MSc', 'PhD', 'PostDoc', 'Prof', 'Technician'])
'''
Any keyword argument passed to an edge can be treated as a weight
'''
WG.add_edge('BSc', 'MSc', friendliness=randint(10, 99))
WG.add_edge('MSc', 'PhD', friendliness=randint(10, 99))
WG.add_edge('PhD', 'PostDoc', friendliness=randint(10, 99))
WG.add_edge('PostDoc', 'Prof', friendliness=randint(10, 99))
for node in list(WG.nodes):
WG.add_edge('Technician', node, friendliness=randint(10, 99))
'''
One can define at which position each node should be plotted
This can be done manually when creating the nodes:
>>> G.add_node(label, pos=(x,y))
...
>>> pos = nx.get_node_attributes(G, 'pos')
Or much more conveniently, using some optimized algorithm like
>> nx.spring_layout(G)
'''
pos = nx.spring_layout(WG)
nx.draw(WG, pos, with_labels=True)
# Additionally we can print the edge weights
labels = nx.get_edge_attributes(WG,'friendliness')
foo = nx.draw_networkx_edge_labels(WG, pos, edge_labels=labels)
In [13]:
# Degree
print('Degrees of unweighted graph:')
pprint(list(WG.degree()))
print('\nDegrees of weighted graph (a.k.a. intensity):')
pprint(list(WG.degree(weight='friendliness')))
In [14]:
# Clustering coefficient
print('Clustering of unweighted graph:')
pprint(nx.clustering(WG))
print('\nClustering of weighted graph:')
pprint(nx.clustering(WG, weight='friendliness'))
To create a directed graph you have to use:
In [15]:
# The only difference is that the new graph is an instance of nx.DiGraph instead of nx.Graph
DG = nx.DiGraph()
DG.add_nodes_from(['BSc', 'MSc', 'PhD', 'PostDoc', 'Prof', 'Technician'])
'''
When generating edges:
* Source: first tuple element
* Target: second tuple element
'''
DG.add_edges_from([('BSc', 'MSc'), ('MSc', 'PhD'), ('PhD', 'PostDoc'), ('PostDoc', 'Prof')])
for node in list(G.nodes):
DG.add_edge('Technician', node)
nx.draw(DG, with_labels=True)
NetworkX offers a ton of graph generating functions!
In [16]:
# A more realistic network
G_r = nx.fast_gnp_random_graph(n=19, p=0.2, seed=None, directed=True)
nx.draw(G_r, with_labels=True)
plt.show()
print("\033[1m Documentation for nx.fast_gnp_random_graph\033[0m")
print(nx.fast_gnp_random_graph.__doc__)
In [17]:
# Adjacency matrix
A = nx.adjacency_matrix(G_r)
print('\n-- This is a numpy matrix:\n')
print(A.todense())
Networkx offers a wide variety of IO functions for most standard formats:
See all the possibilities for IO in the documentation
This section shows an example of how to load long-range connectivity data (given as a .json file) using pandas.
In [18]:
import pandas as pd
df = pd.read_json('cocomac.json')
df.head()
Out[18]:
In [19]:
G = nx.convert_matrix.from_pandas_adjacency(df)
pos = nx.spring_layout(G)
nx.draw(G, pos, with_labels=True)
In [20]:
# One can also change the drawing style of course
nx.draw(G, pos, with_labels=True, node_size=600, edge_color='gray')
plt.show()
In [21]:
import seaborn as sns
degrees = [d for n, d in G.degree()]
sns.distplot(degrees)
plt.ylabel('Normalized density')
plt.xlabel('Node degree')
plt.show()
nx.draw(G, pos, with_labels=True, node_color=degrees, node_size=600, cmap=plt.cm.Blues, edge_color='gray')
plt.show()
In [22]:
# Clustering coefficient
c_dict = nx.clustering(G)
c_coef = list(c_dict.values())
sns.distplot(c_coef, bins=10)
plt.ylabel('Normalized density')
plt.xlabel('Clustering coefficient')
plt.show()
nx.draw(G, pos, with_labels=True, node_color=c_coef, node_size=600, cmap=plt.cm.Reds, edge_color='gray')
plt.show()
In [23]:
# Betweenness centrality
bc_dict = nx.betweenness_centrality(G)
bc = list(bc_dict.values())
sns.distplot(bc, bins=10)
plt.ylabel('Normalized density')
plt.xlabel('Betweenness centrality')
plt.show()
nx.draw(G, pos, with_labels=True, node_color=bc, node_size=600, cmap=plt.cm.Greens, edge_color='gray')
plt.show()
In [28]:
import holoviews as hv
import math
hv.extension('bokeh', 'matplotlib')
In [56]:
k = 2/(math.sqrt(G.order()))
hv_obj = hv.Graph.from_networkx(G, nx.spring_layout, iterations=1000, k=k)
labels = hv.Labels(hv_obj.nodes, ['x', 'y'], 'index')
(hv_obj.options(width=600, height=400) * labels.opts(text_font_size='8pt', text_color='white', bgcolor='gray'))
Out[56]: