```
In [ ]:
```import networkx as nx
from datetime import datetime
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

As mentioned earlier, networks, also known as graphs, are composed of individual entities and the relationships between them. The technical terms for these are nodes and edges, and when we draw them we typically use circles (nodes) and lines (edges).

In this notebook, we will work with a synthetic (i.e. simulated) social network, in which nodes are individual people, and edges represent their relationships. If two nodes have an edge between them, then those two individuals know one another.

In the `networkx` implementation, graph objects store their data in dictionaries.

Nodes are part of the attribute `Graph.node`, which is a dictionary where each key is a node ID and each value is a dictionary of attributes.

Edges are part of the attribute `Graph.edge`, which is a nested dictionary. Data are accessed as such: `G.edge[node1][node2]['attr_name']`.

Because of the dictionary implementation of the graph, any hashable object can be a node. This means strings and tuples, but not lists and sets.
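To make the hashability point concrete, here is a minimal sketch on a toy graph. The names (`'alice'`, `('bob', 2)`) and the `age` and `weight` attributes are made up for illustration and are not part of the synthetic network. I read node attributes through `G.nodes(data=True)` and edge attributes by subscripting the graph, since those calls should behave the same regardless of which NetworkX version you have installed.

```python
import networkx as nx

G = nx.Graph()

# Strings and tuples are hashable, so both work as node IDs.
G.add_node('alice', age=30)        # hypothetical node and attribute
G.add_node(('bob', 2), age=25)     # a tuple as a node ID
G.add_edge('alice', ('bob', 2), weight=1.5)

# G.nodes(data=True) yields (node_id, attribute_dict) pairs.
node_attrs = dict(G.nodes(data=True))
alice_age = node_attrs['alice']['age']

# Subscripting the graph walks the nested edge dictionary.
edge_weight = G['alice'][('bob', 2)]['weight']

# A list is unhashable, so it cannot be used as a node.
try:
    G.add_node(['carol', 3])
    list_rejected = False
except TypeError:
    list_rejected = True
```

If you need a collection as a node ID, converting it to a tuple first is usually the simplest workaround.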

With this synthetic social network, we will attempt to answer the following basic questions using the NetworkX API:

- How many people are present in the network?
- What is the distribution of attributes of the people in this network?
- How many relationships are represented in the network?
- What is the distribution of the number of friends that each person has?
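As a rough preview of the last question, counting friends comes down to counting node degrees. The sketch below uses a tiny hand-made graph rather than the synthetic network, and wraps `G.degree()` in `dict()` so that it should work whether your NetworkX version returns a dictionary or a view object.

```python
from collections import Counter

import networkx as nx

# A toy friendship graph (not the synthetic social network).
G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3), (1, 4), (2, 3)])

# Map each person to their number of friends (their degree).
degrees = dict(G.degree())

# Tally how many people have each friend count.
degree_distribution = Counter(degrees.values())
```

Here `degree_distribution` maps a friend count to the number of people who have that many friends.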

First off, let's load up the synthetic social network. This will walk you through some of the basics of NetworkX.

For those who are interested, I simply created an Erdős-Rényi graph with `n=30` and `p=0.1`. I used randomized functions that I wrote to generate attributes and append them to each node and edge. I then pickled the graph to disk.
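For the curious, a graph of this kind could be generated roughly as follows. This is only a sketch under my own assumptions, not the actual script: the seed, the `age` and `date` attribute names, and the value ranges are made up, and `nx.erdos_renyi_graph` numbers its nodes from 0, so the IDs will not necessarily match the pickled file.

```python
import random
from datetime import datetime, timedelta

import networkx as nx

random.seed(42)  # arbitrary seed, only for reproducibility

# Erdős-Rényi graph: 30 nodes, each possible edge present with probability 0.1.
G = nx.erdos_renyi_graph(n=30, p=0.1, seed=42)

# Re-adding an existing node simply updates its attribute dictionary.
for n in list(G.nodes()):
    G.add_node(n,
               sex=random.choice(['Male', 'Female']),  # assumed values
               age=random.randint(10, 70))             # assumed range

# Likewise, re-adding an existing edge updates its attributes.
start = datetime(2000, 1, 1)  # assumed earliest date
for u, v in list(G.edges()):
    G.add_edge(u, v, date=start + timedelta(days=random.randint(0, 3650)))
```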

```
In [ ]:
```G = nx.read_gpickle('Synthetic Social Network.pkl')  # If you are using Python 2.7, read in 'Synthetic Social Network 27.pkl' instead
nx.draw(G)

```
In [ ]:
```# Who is represented in the network?
G.nodes()

Exercise: Can you write a single line of code that returns the number of individuals represented?

```
In [ ]:
```

```
In [ ]:
```# Who is connected to whom in the network?
G.edges()

```
In [ ]:
```len(G.edges())

```
In [ ]:
```# Let's get a list of nodes with their attributes.
G.nodes(data=True)
# NetworkX will return a list of tuples in the form (node_id, attribute_dictionary)

```
In [ ]:
```from collections import Counter
Counter([d['sex'] for n, d in G.nodes(data=True)])

Edges can also store attributes in their attribute dictionary.

```
In [ ]:
```G.edges(data=True)

`.year`, `.month`, `.day`.
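When an edge attribute holds a `datetime` object, those fields can be read straight off it inside a comprehension. The sketch below uses a toy graph with a hypothetical `date` attribute; the attribute name in the synthetic network may differ, so check the output of `G.edges(data=True)` first.

```python
from collections import Counter
from datetime import datetime

import networkx as nx

# Toy graph; 'date' is an assumed attribute name.
G = nx.Graph()
G.add_edge(1, 2, date=datetime(2011, 3, 5))
G.add_edge(2, 3, date=datetime(2012, 7, 19))
G.add_edge(1, 3, date=datetime(2012, 11, 2))

# Count how many relationships were formed in each year.
years = Counter(d['date'].year for _, _, d in G.edges(data=True))
```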

```
In [ ]:
```

We found out that two individuals were left out of the network: individuals no. 31 and 32. They are one male (31) and one female (32), aged 22 and 24 respectively. They got to know each other on 2010-01-09, and both of them got to know individual 7 on 2009-12-11. Use the functions `G.add_node()` and `G.add_edge()` to introduce this data into the network.

If you need more help, check out https://networkx.github.io/documentation/latest/tutorial/tutorial.html

```
In [ ]:
```

While we're on the matter of graph construction, let's take a look at our tutorial class. On your sheet of paper, you should have a list of names: these are the people whose names you knew prior to coming to class.

As we iterate over the class, I would like you to holler out your name, your nationality, and in a very slow fashion, the names of the people who you knew in the class.

```
In [ ]:
```## You may choose to join me in this endeavor together.
ptG = nx.DiGraph() #ptG stands for PyCon Tutorial Graph.
# Add in nodes and edges
ptG.add_node('', nationality='') # (my own TextExpander shortcut is ;addnode)
ptG.add_edge('', '') # (my own TextExpander shortcut is ;addedge)
# We are now going to draw the network using a hive plot, grouping the nodes by the top two nationality groups, and 'others'
# for the third group.
nodes = dict()
nodes['group1'] = [] #list comprehension here
nodes['group2'] = [] #list comprehension here
nodes['group3'] = [] #list comprehension here
edges = dict()
edges['group1'] = [] #list comprehension here
nodes_cmap = dict()
nodes_cmap['group1'] = 'blue'
nodes_cmap['group2'] = 'green'
nodes_cmap['group3'] = 'black'
edges_cmap = dict()
edges_cmap['group1'] = 'black'
from hiveplot import HivePlot
h = HivePlot(nodes, edges, nodes_cmap, edges_cmap)
# h.set_minor_angle(np.pi / 32) #optional
h.draw()

A similar pattern can be used for edges:

```
[n2 for n1, n2, d in G.edges(data=True)]
```

or

```
[n2 for _, n2, d in G.edges(data=True)]
```

If the graph you are constructing is a directed graph, with a "source" and "sink" available, then I would recommend the following pattern:

```
[(sc, sk) for sc, sk, d in G.edges(data=True)]
```

or

`[d['attr'] for sc, sk, d in G.edges(data=True)]`
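Here is how those patterns play out on a small directed graph. The node names and the `weight` attribute are made up for illustration. The last comprehension adds a filter on the attribute dictionary, which is a small extension of the same idea.

```python
import networkx as nx

# Toy directed graph; 'weight' is a hypothetical edge attribute.
D = nx.DiGraph()
D.add_edge('a', 'b', weight=2)
D.add_edge('b', 'c', weight=5)

# (source, sink) pairs for every edge.
pairs = [(sc, sk) for sc, sk, d in D.edges(data=True)]

# One attribute value per edge.
weights = [d['weight'] for sc, sk, d in D.edges(data=True)]

# Only the edges whose attribute passes a condition.
heavy = [(sc, sk) for sc, sk, d in D.edges(data=True) if d['weight'] > 3]
```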

```
In [ ]:
```nx.draw(G)

`with_labels=True` argument.

```
In [ ]:
```nx.draw(G, with_labels=True)

However, note that if the number of nodes in the graph gets really large, node-link diagrams can begin to look like massive hairballs. This is undesirable for graph visualization.

Instead, we can use a matrix to represent them. The nodes are on the x- and y-axes, and a filled square represents an edge between two nodes. This is done by using the `nx.to_numpy_matrix(G)` function.

We then use `matplotlib`'s `pcolor(numpy_array)` function to plot. Because `pcolor` cannot take in numpy matrices, we will cast the matrix as an array of arrays, and then get `pcolor` to plot it.
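To see what that matrix contains, here is a hand-built version for a four-node path graph. This is only an illustrative sketch, not what NetworkX does internally; in practice you would call the conversion function. Note that NetworkX 3.0 and later no longer ship `to_numpy_matrix`; `nx.to_numpy_array` is the replacement there and returns a plain array, so no cast is needed.

```python
import numpy as np
import networkx as nx

# Toy path graph: 0 - 1 - 2 - 3.
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 3)])

# Fix a node ordering and map each node to a row/column index.
nodes = sorted(G.nodes())
index = {n: i for i, n in enumerate(nodes)}

# Fill in a 1 wherever an edge exists; undirected edges are symmetric.
adj = np.zeros((len(nodes), len(nodes)))
for u, v in G.edges():
    adj[index[u], index[v]] = 1
    adj[index[v], index[u]] = 1
```

Each filled cell in the `pcolor` plot corresponds to a 1 in this array, so an undirected graph always gives a picture that is symmetric about the diagonal.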

```
In [ ]:
```matrix = nx.to_numpy_matrix(G)
plt.pcolor(np.array(matrix))
plt.gca().set_aspect('equal') # set aspect ratio equal on the current axes to get a square visualization
plt.xlim(min(G.nodes()), max(G.nodes())) # set x and y limits to the number of nodes present.
plt.ylim(min(G.nodes()), max(G.nodes()))
plt.title('Adjacency Matrix')
plt.show()

Let's try another visualization, the Circos plot. We can order the nodes in the Circos plot according to the node ID, but any other ordering is possible as well. Edges are drawn between two nodes.

Credit goes to Justin Zabilansky (MIT) for the implementation.

```
In [ ]:
```from circos import CircosPlot
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111)
nodes = sorted(G.nodes())
edges = G.edges()
c = CircosPlot(nodes, edges, radius=10, ax=ax)
c.draw()

```
In [ ]:
```