In [1]:
# Usually we import networkx as nx.
import networkx as nx
# Instantiate a graph.
g = nx.Graph()
# Add a node.
g.add_node(1)
# Add a list of nodes.
g.add_nodes_from([2, 3, 4, 5])
# Add an edge.
g.add_edge(1, 2)
# Add a list of edges.
g.add_edges_from([(2, 3), (3, 4)])
# Remove a node.
g.remove_node(5)
What about removing an edge? Multiple nodes? Multiple edges? All nodes and edges? Use the following cell to figure out how to delete nodes and edges and to clear the entire graph.
In [2]:
# Your code goes here.
What happens if we add multiple nodes with the same name?
In [3]:
g.add_edges_from([(1,2),(1,3)])
g.add_node(1)
g.add_edge(1,2)
g.add_node("spam") # adds node "spam"
g.add_nodes_from("spam") # adds 4 nodes: 's', 'p', 'a', 'm'
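One way to check the answer to the question above is to compare node and edge counts before and after re-adding something that already exists. The sketch below uses a separate graph named `h` (a name chosen here for illustration, so the notebook's `g` is left alone) and assumes a recent NetworkX release.

```python
import networkx as nx

# A fresh graph, separate from the notebook's `g`, so the counts are easy to follow.
h = nx.Graph()
h.add_edges_from([(1, 2), (1, 3)])
before = (h.number_of_nodes(), h.number_of_edges())

# Re-adding an existing node or edge is silently ignored: no error, no duplicate.
h.add_node(1)
h.add_edge(1, 2)
after = (h.number_of_nodes(), h.number_of_edges())
print(before, after)

# A string passed to add_nodes_from is iterated character by character,
# so "spam" contributes the four nodes 's', 'p', 'a', 'm'.
h.add_nodes_from("spam")
string_nodes = sorted(n for n in h.nodes if isinstance(n, str))
print(string_nodes)
```

Nodes behave like dictionary keys: adding the same key twice leaves a single entry, which is why the counts do not change.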
In [4]:
# Do you remember how to look at the nodes in the graph? How about the edges?
# Your code goes here.
In [5]:
# Add a node with attributes (passed as keyword arguments in NetworkX 2.x and later).
g.add_node(4, name="Joebob")
g.nodes[4]
Out[5]:
In [6]:
g.nodes[4]["name"] = "Dave"
g.nodes[4]["job"] = "Ph.D. Student"
g.nodes[4]
Out[6]:
In [7]:
# Add an edge with attributes (passed as keyword arguments in NetworkX 2.x and later).
g.add_edge("s", "p", type="knows")
g["s"]["p"]
Out[7]:
In [8]:
g["s"]["p"]["type"] = "follows"
g["s"]["p"]["weight"] = 1
g["s"]["p"]
Out[8]:
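Attributes can also be read back in bulk rather than one node or edge at a time. The following is a minimal sketch, assuming a recent NetworkX release; the graph `h` is a stand-in built here so the example runs on its own.

```python
import networkx as nx

# Stand-in graph with one attributed node and one attributed edge.
h = nx.Graph()
h.add_node(4, name="Dave", job="Ph.D. Student")
h.add_edge("s", "p", type="follows", weight=1)

# data=True yields each node (or edge) together with its attribute dictionary.
for node, attrs in h.nodes(data=True):
    print(node, attrs)
for u, v, attrs in h.edges(data=True):
    print(u, v, attrs)

# Collect a single attribute across the whole graph.
names = nx.get_node_attributes(h, "name")      # only nodes that have "name"
weights = nx.get_edge_attributes(h, "weight")  # keyed by (u, v) tuples
print(names, weights)
```

Nodes without the requested attribute (here "s" and "p") are simply left out of the dictionary returned by `get_node_attributes`.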
In [9]:
rand = nx.gnp_random_graph(20, 0.25)
sf = nx.scale_free_graph(20)
In [10]:
# Configure the plotting environment.
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = 17, 12
In [11]:
nx.draw_networkx(rand)
In [12]:
nx.draw(sf)
NetworkX has tons of analytics algorithms ready to go out of the box. There are too many to go over in class, but we'll be seeing them throughout the term. Today, we will focus on the concept of centrality, how it is measured, and how it is interpreted.
Centrality is a micro-level measure that compares a node to all of the other nodes in the network. It is a way to measure the power, influence, and overall importance of a node. However, measures of centrality must be interpreted within the context of the network; that is, a given centrality score does not carry the same connotations in every network.
There are four main groups or types of centrality measurement:
In small groups, discuss the different measures of centrality in terms of how they were presented in T & K. What do these measures mean? What examples did they use in the book? Can you think of some examples of your own showing how these measures could be used and how you would interpret them?
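To make the idea of comparing a node to the rest of the network concrete, here is a small sketch on a star graph, a toy network chosen for this illustration and not taken from the reading. It recomputes degree centrality by hand as degree divided by (n - 1) and compares it with NetworkX's built-in function, then looks at betweenness for the same nodes.

```python
import networkx as nx

# Star graph: node 0 is the hub, nodes 1-4 are leaves connected only to the hub.
star = nx.star_graph(4)
n = star.number_of_nodes()

# Degree centrality by hand: the fraction of the other nodes a node is connected to.
manual = {node: deg / (n - 1) for node, deg in star.degree()}
builtin = nx.degree_centrality(star)
print(manual)
print(builtin)

# Betweenness: the share of shortest paths between other pairs that pass through a node.
betw = nx.betweenness_centrality(star)
print(betw)
```

In this toy network the hub scores highest on both measures because every leaf-to-leaf path runs through it; in less regular networks the measures can disagree, which is where interpretation matters.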
In [13]:
# scale_free_graph returns a directed multigraph, so in- and out-degree centrality are defined.
g = nx.scale_free_graph(50)
dc = nx.degree_centrality(g)
idc = nx.in_degree_centrality(g)
odc = nx.out_degree_centrality(g)
In [14]:
cc = nx.closeness_centrality(g)
In [15]:
bc = nx.betweenness_centrality(g)
In [16]:
import pandas as pd
cent_df = pd.DataFrame({"deg": dc, "indeg": idc, "outdeg": odc, "close": cc, "betw": bc})
cent_df.describe()
Out[16]:
In [17]:
cent_df.hist()
Out[17]:
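Summary statistics and histograms describe the network as a whole; to find out which nodes are the central ones, the table can be sorted. The sketch below is self-contained and uses a small seeded random graph (`g2`, with parameters chosen arbitrarily for illustration) in place of the notebook's `g`.

```python
import networkx as nx
import pandas as pd

# Small undirected random graph; the seed only makes the run repeatable.
g2 = nx.gnp_random_graph(30, 0.2, seed=42)

# One row per node, one column per centrality measure.
cent = pd.DataFrame({
    "deg": nx.degree_centrality(g2),
    "close": nx.closeness_centrality(g2),
    "betw": nx.betweenness_centrality(g2),
})

# The five nodes with the highest betweenness, shown with their other scores.
top = cent.sort_values("betw", ascending=False).head(5)
print(top)

# The single highest-scoring node for each measure.
print(cent.idxmax())
```

Comparing the rows of `top` across columns is a quick way to spot nodes that rank high on one measure but not on another.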
What is a "grey cardinal"? What examples did they use in the book? Can you think of another figure from pop culture that would be a "grey cardinal"?
Come up with a network representation for the data contained within both Facebook and Twitter. Think of all of the possible content and all of the relationships that can exist between users and between users and content. Then answer the following questions: