# DH3501: Advanced Social Networks

Class 5: NetworkX and Centrality

Western University
Department of Modern Languages and Literatures
Digital Humanities – DH 3501

Instructor: David Brown
E-mail: dbrow52@uwo.ca
Office: AHB 1R14

## So... impressions of the NetworkX API? Hard? Easy? Let's go over the basics.

``````

In :

# Usually we import networkx as nx.
import networkx as nx

# Instantiate a graph.
g = nx.Graph()

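# Add a list of nodes (example labels).
g.add_nodes_from([1, 2, 3, 4, 5])

# Add a list of edges (example pairs).
g.add_edges_from([(1, 2), (2, 3), (3, 4), (4, 5)])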

# Remove a node.
g.remove_node(5)

``````

What about removing an edge? Multiple nodes? Multiple edges? All nodes and edges? Use the following cell to figure out how to delete nodes and edges and to clear the entire graph.

``````

In :

``````
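One possible answer, as a minimal sketch: the graph `h` and its node labels are made up for illustration so that the class graph `g` is left alone. It uses `remove_edge`, `remove_edges_from`, `remove_nodes_from`, and `clear`.

```python
import networkx as nx

# A small throwaway graph to practice on (labels are arbitrary).
h = nx.Graph()
h.add_nodes_from([1, 2, 3, 4, 5])
h.add_edges_from([(1, 2), (2, 3), (3, 4), (4, 5)])

# Remove a single edge; both endpoints stay in the graph.
h.remove_edge(1, 2)

# Remove several edges at once.
h.remove_edges_from([(2, 3), (3, 4)])

# Remove several nodes at once; their incident edges go with them.
h.remove_nodes_from([4, 5])

remaining_nodes = sorted(h.nodes())
remaining_edges = list(h.edges())

# Remove every node and edge, leaving an empty graph.
h.clear()
```

Note that removing a node also removes its edges, while removing an edge never removes a node.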

What happens if we add multiple nodes with the same name?

``````

In :

``````
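A quick way to check, sketched on a hypothetical graph `h` with made-up labels: adding a label that already exists does not raise an error and does not create a second node, and any keyword attributes passed along update the existing node.

```python
import networkx as nx

h = nx.Graph()
h.add_node("a")
h.add_node("a")                      # same label again: no error, no new node
h.add_nodes_from(["a", "b", "b"])    # duplicates in a list are collapsed too
node_count = h.number_of_nodes()

# Re-adding a node with keyword arguments updates its attributes.
h.add_node("a", color="red")
h.add_node("a", color="blue")
a_attrs = dict(h.nodes["a"])
```

Nodes behave like dictionary keys: each label appears once, and the latest attribute value wins.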
``````

In :

# Do you remember how to look at the nodes in the graph? How about the edges?

``````
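As a reminder, here is a minimal sketch on a hypothetical graph `h`: `nodes()` and `edges()` return views that can be turned into lists, and `data=True` includes the attribute dictionaries.

```python
import networkx as nx

h = nx.Graph()
h.add_edges_from([(1, 2), (2, 3)])

node_list = list(h.nodes())                   # all node labels
edge_list = list(h.edges())                   # all edges as (u, v) pairs
edges_with_data = list(h.edges(data=True))    # (u, v, attribute dict) triples
neighbors_of_2 = list(h.neighbors(2))         # nodes adjacent to node 2
```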

## Node and edge attributes

``````

In :

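# Give node 1 a "name" attribute, then look up its attribute dictionary.
# (Node 1 is an assumed example; any node in the graph works the same way.)
g.add_node(1, name="Joebob")
g.nodes[1]

``````
``````

Out:

{'name': 'Joebob'}

``````
``````

In :

g.nodes[1]["name"] = "Dave"
g.nodes[1]["job"] = "Ph.D. Student"
g.nodes[1]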

``````
``````

Out:

{'job': 'Ph.D. Student', 'name': 'Dave'}

``````
``````

In :

# Add an edge with attributes.
g.add_edge("s", "p", type="knows")
g["s"]["p"]

``````
``````

Out:

{'type': 'knows'}

``````
``````

In :

g["s"]["p"]["type"] = "follows"
g["s"]["p"]["weight"] = 1
g["s"]["p"]

``````
``````

Out:

{'type': 'follows', 'weight': 1}

``````

## Graph generators

``````

In :

rand = nx.gnp_random_graph(20, 0.25)
sf = nx.scale_free_graph(20)

``````
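It can help to inspect what a generator returns before analysing it. The sketch below passes a `seed` (an addition not used in the cell above) so the random graphs are reproducible, and checks the graph type: `gnp_random_graph` returns an undirected graph, while `scale_free_graph` returns a directed multigraph.

```python
import networkx as nx

# The seed value is arbitrary; it only makes the result repeatable.
rand = nx.gnp_random_graph(20, 0.25, seed=42)
sf = nx.scale_free_graph(20, seed=42)

rand_summary = (rand.number_of_nodes(), rand.is_directed(), rand.is_multigraph())
sf_summary = (sf.number_of_nodes(), sf.is_directed(), sf.is_multigraph())

print("random graph:", rand_summary, "edges:", rand.number_of_edges())
print("scale-free graph:", sf_summary, "edges:", sf.number_of_edges())
```

The directedness of the scale-free graph is what makes in-degree and out-degree meaningful for it later on.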

## Drawing

``````

In :

# Configure the plotting environment.
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = 17, 12

``````
``````

In :

nx.draw_networkx(rand)

``````
``````

In :

nx.draw(sf)

``````

## Analytics

NetworkX has tons of analytics algorithms ready to go out of the box. There are too many to go over in class, but we'll be seeing them throughout the term. Today, we will focus on the concept of centrality, how it is measured, and how it is interpreted.

## Centrality

Centrality is a micro measure that compares a node to all of the other nodes in the network. It is a way to measure the power, influence, and overall importance of a node in a network. However, measures of centrality must be interpreted within the context of the network; that is, a given measure does not carry the same connotations in every network.

There are four main groups or types of centrality measurement:

• Degree - how connected a node is (how many adjacent edges it possesses).
• Closeness - how close a node is to all of the other nodes in the graph, and how easily it can reach them.
• Betweenness - how important a node is in terms of connecting other nodes.
• Neighbors' characteristics - how important or central a node's neighbors are.

In small groups, discuss the different measures of centrality in terms of how they were presented in T & K. What do these measures mean? What examples did they use in the book? Can you think of some examples of your own of how these measures could be used and how you would interpret them?

### Degree Centrality

``````

In :

g = nx.scale_free_graph(50)
dc = nx.degree_centrality(g)
idc = nx.in_degree_centrality(g)
odc = nx.out_degree_centrality(g)

``````
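To see what the numbers mean, here is a minimal sketch that recomputes degree centrality by hand on a small star graph (chosen for illustration, not the class graph). The value is a node's degree divided by `n - 1`, the number of other nodes it could be tied to.

```python
import networkx as nx

# A small star: node 0 is tied to nodes 1-4, which are tied only to node 0.
star = nx.star_graph(4)
n = star.number_of_nodes()

# Degree centrality = degree / (n - 1).
by_hand = {node: deg / (n - 1) for node, deg in star.degree()}
from_nx = nx.degree_centrality(star)

print(by_hand)
print(from_nx)
```

In a directed graph, in-degree and out-degree centrality apply the same division to incoming and outgoing edges separately.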

### Closeness Centrality

``````

In :

cc = nx.closeness_centrality(g)

``````
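A minimal sketch of the idea behind the number, assuming a connected, undirected path graph picked for illustration: closeness is `n - 1` divided by the sum of shortest-path distances from a node to every other node, so nodes in the middle of the path score higher than nodes at the ends.

```python
import networkx as nx

# A path 0 - 1 - 2 - 3 - 4; node 2 sits in the middle.
path = nx.path_graph(5)
n = path.number_of_nodes()

# All-pairs shortest-path lengths: {source: {target: distance}}.
dist = dict(nx.shortest_path_length(path))

# Closeness = (n - 1) / (sum of distances to all other nodes).
by_hand = {u: (n - 1) / sum(d.values()) for u, d in dist.items()}
from_nx = nx.closeness_centrality(path)

print(by_hand)
```

For directed or disconnected graphs NetworkX adjusts this calculation, so the hand formula above should only be expected to match on connected, undirected graphs.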

### Betweenness Centrality

``````

In :

bc = nx.betweenness_centrality(g)

``````
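The sketch below illustrates how betweenness can differ from degree. The graph is hypothetical: two triangles joined only through a node labelled `x`. Every shortest path between the two triangles passes through `x`, so it scores highest on betweenness despite having only two ties.

```python
import networkx as nx

bridge = nx.Graph()
bridge.add_edges_from([
    ("a", "b"), ("b", "c"), ("a", "c"),   # first triangle
    ("d", "e"), ("e", "f"), ("d", "f"),   # second triangle
    ("c", "x"), ("x", "d"),               # x is the only link between them
])

# Raw counts of shortest paths passing through each node.
raw = nx.betweenness_centrality(bridge, normalized=False)
# Default: counts rescaled by the number of node pairs, giving values in [0, 1].
norm = nx.betweenness_centrality(bridge)
degree = dict(bridge.degree())

for node in sorted(bridge.nodes()):
    print(node, degree[node], raw[node], round(norm[node], 3))
```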

### Centrality summary

``````

In :

import pandas as pd
cent_df = pd.DataFrame({"deg": dc, "indeg": idc, "outdeg": odc, "close": cc, "betw": bc})
cent_df.describe()

``````
``````

Out:

            betw      close        deg      indeg     outdeg
count  50.000000  50.000000  50.000000  50.000000  50.000000
mean    0.003223   0.070292   0.077551   0.038776   0.038776
std     0.010941   0.023337   0.135184   0.112557   0.035766
min     0.000000   0.000000   0.020408   0.000000   0.000000
25%     0.000000   0.062272   0.020408   0.000000   0.020408
50%     0.000000   0.071429   0.020408   0.000000   0.020408
75%     0.000000   0.076923   0.061224   0.020408   0.040816
max     0.068240   0.113379   0.775510   0.632653   0.142857
``````

## What's a histogram?

``````

In :

cent_df.hist()

``````
``````

Out:

array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7f2bfde2f3d0>,
<matplotlib.axes._subplots.AxesSubplot object at 0x7f2bfd92a810>],
[<matplotlib.axes._subplots.AxesSubplot object at 0x7f2bfd8aa6d0>,
<matplotlib.axes._subplots.AxesSubplot object at 0x7f2bfd875bd0>],
[<matplotlib.axes._subplots.AxesSubplot object at 0x7f2bfd7fcd90>,
<matplotlib.axes._subplots.AxesSubplot object at 0x7f2bfd77ca50>]], dtype=object)

``````
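A histogram groups values into bins and counts how many values fall into each bin. As a minimal sketch, assuming a small star graph instead of the class data, the counts can be built by hand with `collections.Counter` and compared with `nx.degree_histogram`, which returns a list whose index is the degree and whose value is the number of nodes with that degree.

```python
from collections import Counter

import networkx as nx

star = nx.star_graph(4)                       # one hub, four leaves
degrees = [deg for _, deg in star.degree()]   # one degree value per node

# Count how many nodes share each degree value.
counts = Counter(degrees)

# NetworkX's built-in version: position i holds the number of nodes of degree i.
hist = nx.degree_histogram(star)

# A text version of the plot: one '#' per node in each bin.
for value in sorted(counts):
    print(value, "#" * counts[value])
```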

## Other centrality measures

### Eigenvector centrality and the grey cardinal

What is a "grey cardinal"? What examples did they use in the book? Can you think of another figure from pop culture that would be a "grey cardinal"?
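To make the idea concrete, here is a minimal sketch on a made-up court network (all names are hypothetical). The `advisor` and the `outsider` each have exactly one tie, so their degree is identical, but the advisor's single tie is to the best-connected node. Eigenvector centrality rewards ties to well-connected neighbors, so the advisor scores higher.

```python
import networkx as nx

court = nx.Graph()
# The ruler is tied to five nobles.
court.add_edges_from([("ruler", f"noble{i}") for i in range(1, 6)])
court.add_edge("noble1", "noble2")
# The advisor's only tie is to the ruler.
court.add_edge("ruler", "advisor")
# The outsider's only tie is to a messenger on the edge of the network.
court.add_edges_from([("noble5", "messenger"), ("messenger", "outsider")])

# max_iter is raised above the default only as a precaution for convergence.
ec = nx.eigenvector_centrality(court, max_iter=1000)
degree = dict(court.degree())

for name in ("ruler", "advisor", "outsider"):
    print(name, degree[name], round(ec[name], 3))
```

Note that `nx.eigenvector_centrality` is not implemented for multigraphs, so it cannot be applied directly to the output of `nx.scale_free_graph`.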

## Thought experiment: The meaning of degree centrality

Come up with a network representation for the data contained within both Facebook and Twitter. Think of all of the possible content and all of the relationships that can exist between users and between users and content. Then answer the following questions:

• What does degree mean in each network?
• Does the meaning of degree depend on the type of edges counted?
• Are the relationships symmetrical, or would it be a directed network?
• If directed, what are the differences between in-degree and out-degree? What does it mean to have a high in- or out-degree?
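As a starting point for the discussion, here is one possible sketch, not the only reasonable model. It assumes Twitter-style "follows" are directed edges from follower to followed and Facebook-style friendships are undirected edges. The user names are invented.

```python
import networkx as nx

# Directed: an edge (u, v) means "u follows v".
follows = nx.DiGraph()
follows.add_edges_from([
    ("ann", "bob"), ("cat", "bob"), ("dan", "bob"),   # three users follow bob
    ("bob", "ann"),                                   # bob follows ann back
])
followers = dict(follows.in_degree())    # in-degree: how many follow this user
following = dict(follows.out_degree())   # out-degree: how many this user follows

# Undirected: a friendship holds in both directions at once.
friends = nx.Graph()
friends.add_edges_from([("ann", "bob"), ("bob", "cat")])
friend_count = dict(friends.degree())

print("bob:", followers["bob"], "followers,", following["bob"], "following")
print("bob:", friend_count["bob"], "friends")
```

Adding content nodes (posts, likes, retweets) would introduce further edge types, and the meaning of degree would then depend on which of those edges are counted.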