Python 3.6 Jupyter Notebook

Introduction to NetworkX

Your completion of the notebook exercises will be graded based on your ability to do the following:

Understand: Do your pseudo-code and comments show evidence that you recall and understand technical concepts?

Apply: Are you able to execute code (using the supplied examples) that performs the required functionality on supplied or generated data sets?

Analyze: Are you able to pick the relevant method or library to resolve specific stated questions?

Notebook objectives

By the end of this notebook, you will be expected to:

  • Create basic graphs using NetworkX;
  • Use graph generators to explore classic graphs;
  • Visualize graph objects; and
  • Compute neighborhood information from a NetworkX graph object.

List of exercises

Exercise 1: Create and manipulate NetworkX graphs.

Notebook introduction

This week, the practical assessments will focus on the study of networks. In this notebook, you will start with an introduction to NetworkX.

NetworkX is a Python language software package used to create, manipulate, and study the structure, dynamics, and function of complex networks. The first version of this software package was designed and written by Aric Hagberg, Dan Schult, and Pieter Swart between 2002 and 2003.

A network or graph is a set of vertices or nodes, with relationships between nodes represented by a set of lines. These lines can include arrows to depict a directional relationship.

With NetworkX you can load and store networks in standard and nonstandard data formats, generate numerous types of random and classic networks, analyze network structure, build network models, design new network algorithms, draw networks, and much more.

To access and use the NetworkX module functionality, it first needs to be imported into your notebook.

Here are some additional links that will provide you with solid foundational knowledge of NetworkX:

Note:
It is strongly recommended that you save and checkpoint after applying significant changes or completing exercises. This allows you to return the notebook to a previous state should you wish to do so. On the Jupyter menu, select "File", then "Save and Checkpoint" from the dropdown menu that appears.

Load libraries and set options


In [ ]:
# Load relevant libraries.
import networkx as nx  
import matplotlib.pylab as plt
%matplotlib inline
import pygraphviz as pgv
import random

from IPython.display import Image, display

1. Graph creation

With NetworkX, graph objects can be created in one of three ways:

  1. Adding edges and nodes explicitly.

  2. Importing data from data sources.

  3. Graph generators.

This notebook predominantly investigates graph exploration using the first approach, with a few remarks made on the other graph creation approaches.

1.1 Adding edges and nodes explicitly

First, create a graph object by explicitly adding nodes to said object.

1.1.1 Instantiate an empty, undirected graph object


In [ ]:
# Instantiate an empty network undirected graph object, and assign to variable G.
G=nx.Graph()

1.1.2 Add nodes


In [ ]:
# Add a node (1) to G.
G.add_node(1)

In [ ]:
# Add another node ('x') to G.
G.add_node('x')

1.1.3 Visualize the graph structure

A graph is an abstract mathematical object without a specific representation in the Cartesian coordinate space. Therefore, graph visualization is somewhat arbitrary. Notebook 2 will look at algorithms that have been proposed to aid in presenting graph objects. This notebook, however, will use a function - "pydot" - defined below, and which has some appealing aesthetics.


In [ ]:
def pydot(G):
    '''
    A function for graph visualization using the dot framework
    '''
    pdot = nx.drawing.nx_pydot.to_pydot(G)
    display(Image(pdot.create_png()))

You can now visualize the simple graph that you have defined and populated with node information.


In [ ]:
pydot(G)

1.1.4 Use edge information to populate a graph object

Alternatively, you can populate node information by starting off from an edge pair or a list of edge pairs. Such a pairing may or may not include the strength, or other attributes, that describe the relationship between the pair(s). The special edge attribute "weight" should always be numerical, and holds values used by algorithms requiring weighted edges. When specifying edges as tuples, the optional third argument refers to the weight.


In [ ]:
# Add an edge between two nodes, 1 and 3. 
# Note that nodes are automatically added if they do not exist.
G.add_edge(1,3)

In [ ]:
# Add edge information, and specify the value of the weight attribute.
G.add_edge(2,'x',weight=0.9)
G.add_edge(1,'x',weight=3.142)

In [ ]:
# Add edges from a list of tuples.
# In each tuple, the first 2 elements are nodes, and third element is value of the weight attribute. 
edgelist=[('a','b',5.0),('b','c',3.0),('a','c',1.0),('c','d',7.3)] 
G.add_weighted_edges_from(edgelist)

1.1.5 Graph visualization


In [ ]:
# Visualize the graph object.
pydot(G)

In [ ]:
# Visualize the graph object, including weight information.
for u,v,d in G.edges(data=True):
    d['label'] = d.get('weight','')
pydot(G)

1.1.6 A node can be any hashable object

A node can be any of the so-called hashable objects:

An object is hashable if it has a hash value that never changes during its lifetime (it needs a "_hash_()" method), and can be compared to other objects (it needs an "_eq_()" or "_cmp_()" method). Hashable objects that compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

While all of Python’s immutable built-in objects are hashable, no mutable containers (such as lists or dictionaries) are. Objects that are instances of user-defined classes are hashable by default; they all compare unequal, and their hash value is their id().

Examples of hashable objects in Python include strings, numbers, files, functions, etc. In the following two examples, a node that is a math function and a node that is a file object are added to the graph object.


In [ ]:
# Add a sine function, imported from the math module, as a node.
from math import sin
G.add_node(sin)

In [ ]:
# Add file handle object as node.
fh = open("../data/CallLog.csv","r") # handle to file object.
G.add_node(fh)

1.2 Examining the Graph() object

You can examine the nodes and edges in your graph using various commands.

1.2.1 Getting node information


In [ ]:
# List the nodes in your graph object.
list(G.nodes())

In [ ]:
# How many nodes are contained within your graph model?
G.number_of_nodes()

In [ ]:
# Alternative method for getting the number nodes.
G.order()

1.2.2 Getting edge information


In [ ]:
# List the edges in the graph object.
list(G.edges())

In [ ]:
# How many edges do you have? 
G.number_of_edges()

In [ ]:
# Alternative method for getting number of edges.
G.size()

1.2.3 Getting edge weight information

The most direct way to get edge weight data is by using the "get_edge_data" method, which returns the attribute dictionary associated with an edge pairing.


In [ ]:
for (u, v, wt) in G.edges.data('weight'):
    if wt != None:
        print('(%s, %s, %.3f)' % (u, v, wt))
    if wt is None:
        print(u,v, wt)

In the dict output, the label key was added above when you wanted to show weight attribute information when visualizing the graph. By default, it is not included when adding weight information to the edges.

Print the weight information for all of the edges in your graph object.


In [ ]:
for n1,n2,attr in G.edges(data=True):
         print(n1,n2,attr)

1.2.4 Getting neighbor information

It is also possible to get a list of the neighbors associated with a given node. In the following cell, invoke the graph method "neighbors" and specify the node whose neighbors you are interested in.


In [ ]:
list(G.neighbors('x'))

You can also print the list of all nodes and their corresponding neighbors. The code below prints the node, and the node's neigbhors as list (that is, enclosed between two square brackets).


In [ ]:
for node in G.nodes():
         print(node, list(G.neighbors(node)))

1.2.5 Removing nodes or edges

Removing edges and nodes from a graph is very simple, and is illustrated in the following cell.


In [ ]:
# Add a set of edges from a list of tuples.
e = [(1 ,2) ,(1 ,3)]
G.add_edges_from(e)

In [ ]:
# Remove edge (1,2).
G.remove_edge(1,2)

In [ ]:
# Similarly, you can also remove a node, and all edges linked to that node will also fall away.
G.remove_node(3)

In [ ]:
# Multiple edge or node removal is also possible.
G.remove_edges_from(e)
Important: Removing a node not in the graph raises an error.

In [ ]:
# Trying to remove a node not in the graph raises an error.
G.remove_node(3)

1.2.6 Cleaning up


In [ ]:
# Close the file handle object used above.
fh.close()

# Remove the graph object from the workspace.
del G

1.3 Graph generators

NetworkX also has standard algorithms to create network topologies. The following cells include some examples that you are encouraged to build, analyze, and visualize, using the tools described above, as well as other tools that will be introduced later.

1.3.1 Small, famous graphs


In [ ]:
# Generate some of the small, famous graphs.
petersen=nx.petersen_graph()
tutte=nx.tutte_graph()
maze=nx.sedgewick_maze_graph()
tet=nx.tetrahedral_graph()

In [ ]:
# Plot one of the small, famous graphs.
pydot(petersen)

1.3.2 Classic graphs


In [ ]:
# Generate some classic graphs.
K_5=nx.complete_graph(5)
K_3_5=nx.complete_bipartite_graph(3,5)
barbell=nx.barbell_graph(10,10)
lollipop=nx.lollipop_graph(10,20)

In [ ]:
# Plot one of the classic graphs.
pydot(barbell)

1.3.3 Random graphs


In [ ]:
# Generate some random graphs for arbitrary parameter values.
er=nx.erdos_renyi_graph(10,0.15)
ws=nx.watts_strogatz_graph(30,3,0.1)
ba=nx.barabasi_albert_graph(10,5)
red=nx.random_lobster(20,0.9,0.9)

In [ ]:
# Plot one of the random graphs.
pydot(ba)


Exercise 1 Start.

Note:

This exercise contains five sections. It is broken up into these sections in order to make it easier to follow. Complete all five sections before saving and submitting your notebook.

Exercise 1.1: Instructions

Create an Erdos Renyi random graph. Your graph should have 30 nodes, where each of the edges are chosen with a probability of 0.15, using NetworkX's graph generator methods. Set the argument for the seed parameter to "random.seed(10)". Assign your graph to a variable "G".

Hint: An Erdos Renyi random graph is generated with NetworkX using the following:

G = nx.erdos_renyi_graph(nodes, probability, seed_value)

In [ ]:
# Your answer here.

Exercise 1.2: Instructions

Compute the number of edges in the graph, using one of the methods provided by NetworkX.


In [ ]:
# Your answer here.

Exercise 1.3: Instructions

Write a piece of code that prints the nodel label and neighbors for each node in the graph "G" that you created. Your code should be reusable for other graph objects.


In [ ]:
# Your answer here.

Exercise 1.4: Instructions

Write code to find a node with the largest number of neighbors in the graph. Your code should print both the node label and the number of neighbors it has.


In [ ]:
# Your answer here.

Exercise 1.5: Instructions

Remove the node with the most neighbors (found in Exercise 1.4) from the graph.


In [ ]:
# Your answer here.


Exercise 1 End.

Exercise complete:

This is a good time to "Save and Checkpoint".

2. Submit your notebook

Please make sure that you:

  • Perform a final "Save and Checkpoint";
  • Download a copy of the notebook in ".ipynb" format to your local machine using "File", "Download as", and "IPython Notebook (.ipynb)"; and
  • Submit a copy of this file to the Online Campus.

In [ ]: