NetworkX is a python module. To start exploring NetworkX we simply need to start a python session (Like the IPython session you are in now!), and type
In [ ]:
import networkx
All of NetworkX's data structures and functions can then be accessed using the syntax networkx.[Object]
, where [Object]
is the function or data structure you need. Of course you would replace [Object]
with the function you wanted. For example to make a graph, we'd write:
In [ ]:
G = networkx.Graph()
Usually to save ourselves some keystrokes, we'll import NetworkX using a shorter variable name
In [ ]:
import networkx as nx
One of the main strengths of NetworkX is its flexible graph data structures. There are four data structures
Graph
: Undirected GraphsDiGraph
: Directed GraphsMultiGraph
: Undirected multigraphs, ie graphs which allow for multiple edges between nodesMultiDiGraph
: Directed MultigraphsEach of these has the same basic structure, attributes and features, with a few minor differences.
Creating Graphs is as simple as calling the appropriate constructor.
In [ ]:
G = nx.Graph()
D = nx.DiGraph()
M = nx.MultiGraph()
MD = nx.MultiDiGraph()
You can also add attributes to a graph during creation, either by providing a dictionary, or simply using keyword arguments
In [ ]:
G = nx.Graph(DateCreated='2015-01-10',name="Terry")
In [ ]:
G.graph
The graph attribute is just a dictionary and can be treated as one, so you can add and delete more information from it.
In [ ]:
G.graph['Current']=False
del G.graph['name']
In [ ]:
G.graph
Next we'll cover how to add and remove nodes, as well as check for their existance in a graph and add attributes to both!
There are two main functions for adding nodes. add_node
, and add_nodes_from
. The former takes single values, and the latter takes any iterable (list, set, iterator, generator). Nodes can be of any immutable type. This means numbers (ints and floats complex), strings, bytes, tuples or frozen sets. They cannot be mutable, such as lists, dictionaries or sets. Nodes in the same graph do not have to be of the same type
In [ ]:
# Adding single nodes of various types
G.add_node(0)
G.add_node('A')
G.add_node(('x',1.2))
# Adding collections of nodes
G.add_nodes_from([2,4,6,8,10])
G.add_nodes_from(set([10+(3*i)%5 for i in range(10,50)]))
Accessing nodes is done using the nodes
function which is a member of the Graph
object.
In [ ]:
G.nodes()
Sometimes to save memory we might only want to access a list of nodes one at a time, so we can use an iterator. These are especially useful in long running loops to save memory.
In [ ]:
for n in G.nodes_iter():
if type(n)== str:
print(n + ' is a string!')
else:
print(str(n) + " is not a string!")
In the future more functions of NetworkX will exclusively use iterators to save memory and be more Python 3 like...
We can also check to see if a graph has a node several different ways. The easiest is just using the in
keyword in python, but there is also the has_node
function.
In [ ]:
13 in G
In [ ]:
9 in G
In [ ]:
G.has_node(13)
In [ ]:
G.has_node(9)
You can also add attributes to nodes. This can be handy for storing information about nodes within the graph object. This can be done when you create new nodes using keyword arguments to the add_node
and add_nodes_from
function
In [ ]:
G.add_node('Spam',company='Hormel',food='meat')
When using add_nodes_from
you provide a tuple with the first element being the node, and the second being a dictionary of attributes for that node. You can also add attributes which will be applied to all added nodes using keyword arguments
In [ ]:
G.add_nodes_from([('Bologna',{'company':'Oscar Meyer'}),
('Bacon',{'company':'Wright'}),
('Sausage',{'company':'Jimmy Dean'})],food='meat')
To list node attributes you need to provide the data=True
keyword to the nodes
and nodes_iter
functions
In [ ]:
G.nodes(data=True)
Attributes are stored in a special dictionary within the graph called node
you can access, edit and remove attributes there
In [ ]:
G.node['Spam']
In [ ]:
G.node['Spam']['Delicious'] = True
G.node[6]['integer'] = True
In [ ]:
G.nodes(data=True)
In [ ]:
del G.node[6]['integer']
In [ ]:
G.nodes(data=True)
Similiarly, you can remove nodes with the remove_node
and remove_nodes_from
functions
In [ ]:
G.remove_node(14)
G.remove_nodes_from([10,11,12,13])
In [ ]:
G.nodes()
In [ ]:
Using the spaces provided below make a new graph, FizzBuzz
. Add nodes labeled 0 to 100 to the graph. Each node should have an attribute 'fizz' and 'buzz'. If the nodes label is divisble by 3 fizz=True
if it is divisble by 5 buzz=True
, otherwise both are false.
In [ ]:
Adding edges is similar to adding nodes. They can be added, using either add_edge
or add_edges_from
. They can also have attributes in the same way nodes can. If you add an edge that includes a node that doesn't exist it will create it for you
In [ ]:
G.add_edge('Bacon','Sausage',breakfast=True)
G.add_edge('Ham','Bacon',breakfast=True)
G.add_edge('Spam','Eggs',breakfast=True)
Here we are using a list comprehension. This is an easy way to construct lists using a single line. Learn more about list comprehensions here.
In [ ]:
G.add_edges_from([(i,i+2) for i in range(2,8,2)])
In [ ]:
G.edges()
In [ ]:
G.edges(data=True)
Removing edges is accomplished by using the remove_edge
or remove_edges_from
function. Remove edge attributes can be done by indexing into the graph
In [ ]:
G['Spam']['Eggs']
In [ ]:
del G['Spam']['Eggs']['breakfast']
In [ ]:
G.remove_edge(2,4)
In [ ]:
G.edges(data=True)
You can check for the existance of edges with has_edge
In [ ]:
G.has_edge(2,4)
In [ ]:
G.has_edge('Ham','Bacon')
For directed graphs, ordering matters. add_edge(u,v)
will add an edge from u
to v
In [ ]:
D.add_nodes_from(range(10))
In [ ]:
D.add_edges_from([(i,i+1 % 10) for i in range(0,10)])
In [ ]:
D.edges()
In [ ]:
D.has_edge(0,1)
In [ ]:
D.has_edge(1,0)
You can also access edges for only a subset of nodes by passing edges a collection of nodes
In [ ]:
D.edges([3,4,5])
For the FizzBuzz
graph above, add edges betweeen two nodes u
and v
if they are both divisible by 2 or by 7. Each edge should include attributes div2
and div7
which are true if u
and v
are divisible by 2 and 7 respecitively. Exclude self loops.
In [ ]:
In [ ]:
Multigraphs can have multiple edges between any two nodes. They are referenced by a key.
In [ ]:
M.add_edge(0,1)
M.add_edge(0,1)
In [ ]:
M.edges()
The keys of the edges can be accessed by using the keyword keys=True
. This will give a tuple of (u,v,k)
, with the edge being u
and v
and the key being k
.
In [ ]:
M.edges(keys=True)
MultiDraphs
and MultiDiGraphs
are similar to Graphs
and DiGraphs
in most respects
In addition to adding nodes and edges one at a time networkx has some convenient functions for adding complete subgraphs. But beware, these may be removed, or the API changed in the future.
In [ ]:
G.add_cycle(range(100,110))
In [ ]:
G.edges()
Basic graph properties are functions which are member of the Graph
class itself. We'll explore different metrics in part III.
The order of a graph is the number of nodes, it can be accessed by calling G.order()
or using the builtin length function: len(G)
.
In [ ]:
G.order()
In [ ]:
len(G)
The number of edges is usually referred to as the size of the graph, and can be accessed by G.size()
. You could also find out by calling len(G.edges())
, but this is much slower.
In [ ]:
G.size()
For multigraphs it counts the number of edges includeing multiplicity
In [ ]:
M.size()
Node neighbors can be accessed via the neighbors
In [ ]:
G.neighbors('Bacon')
In the case of directed graphs, neighbors are only those originating at the node.
In [ ]:
D.add_edges_from([(0,i) for i in range(5,10)])
D.neighbors(0)
For multigraphs, neighbors are only reported once.
In [ ]:
M.neighbors(0)
The degree of a graph can be found using the degree
function for undirected graphs, and in_degree
and out_degree
for directed graphs. They both return a dictionary with the node as the keys of the dictionary and the degree as the value
In [ ]:
G.degree()
In [ ]:
D.in_degree()
In [ ]:
D.out_degree()
Both of these can be called on a single node or a subset of nodes if not all degrees are needed
In [ ]:
D.in_degree(5)
In [ ]:
D.out_degree([0,1,2])
You can also calculate weighted degree. To do this each edge has to have specific attribute to be used as a weight.
In [ ]:
WG = nx.Graph()
WG.add_star(range(5))
WG.add_star(range(5,10))
WG.add_edges_from([(i,2*i %10) for i in range(10)])
for (u,v) in WG.edges_iter():
WG[u][v]['product'] = (u+1)*(v+1)
In [ ]:
WG.degree(weight='product')
Let's make a network of the people in this room. First, create a graph called C
. Everyone state their name (one at a time) and where they are from. Add nodes to the graph representing each individual, with an attribute denoting where they are from. Add edges to the graph between an individual and their closest three classmates. Have each edge have an attribute that indicates whether there was a previous relationship between the two. If none existed have relationship=None
, if it does exist have the relationship stated, e.g. relationship='Cousin-in-law'
In [ ]:
How many nodes are in the Graph? How many Edges? What is the degree of the graph?
In the next section we'll learn more about saving and loading graphs, as well as operations on graphs, but for now just run the code below.
In [ ]:
nx.write_gpickle(C,'./data/Classroom.pickle')