Computing in Context

Social Science Track

Lecture the Fifth, in which we briefly explore

Networks


In [1]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from IPython.display import Image
plt.style.use('ggplot') # Make the graphs a bit prettier
plt.rcParams['figure.figsize'] = (15, 8)

For a fine tutorial, from which I'm showing slides, see http://www.scottbot.net/HIAL/?page_id=41142

How to represent networks with numbers


In [2]:
Image("https://upload.wikimedia.org/wikipedia/commons/thumb/2/28/6n-graph2.svg/300px-6n-graph2.svg.png")
###h/t wikipedia


Out[2]:

A long way

1 IS CONNECTED TO 1

1 IS CONNECTED TO 2

1 IS CONNECTED TO 5

2 IS CONNECTED TO 1

. . . .

adjacency matrix

$\begin{pmatrix} 1 & 1 & 0 & 0 & 1 & 0\\ 1 & 0 & 1 & 0 & 1 & 0\\ 0 & 1 & 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 & 1 & 1\\ 1 & 1 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0\\ \end{pmatrix}$

A $1$ at column $i$ and row $j$ means that $i$ and $j$ are connected.

$6$ is connected only to $4$; $1$ is connected to itself, $2$, and $5$.

This matrix is symmetric: it assumes a connection from $1$ to $2$ is the same as one from $2$ to $1$. Such a graph is called an undirected graph.

Often this is not the case. Think of friending someone on a social network: A friending B does not mean B friends A. Such a graph is called a directed graph.
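As a rough sketch of "representing a network with numbers", the matrix above can be rebuilt from a plain list of edges. The edge list is read off the matrix shown earlier. The two-person `directed` example is made up purely to illustrate the friending case.

```python
import numpy as np

# Undirected edges of the six-node example graph, including the loop at node 1.
edges = [(1, 1), (1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]

n = 6
adjacency = np.zeros((n, n), dtype=int)
for i, j in edges:
    # Node labels start at 1, array indices at 0.
    adjacency[i - 1, j - 1] = 1
    adjacency[j - 1, i - 1] = 1  # undirected: record both directions

print(adjacency)
print((adjacency == adjacency.T).all())  # an undirected graph gives a symmetric matrix

# Hypothetical directed case: A friends B, but B does not friend A.
directed = np.zeros((2, 2), dtype=int)
directed[0, 1] = 1
print((directed == directed.T).all())  # no longer symmetric
```

Comparing a matrix with its own transpose is a quick way to check whether a dataset treats connections as one-way or two-way.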

An example: membership in patriotic clubs

I'm taking this example from http://kieranhealy.org/blog/archives/2013/06/09/using-metadata-to-find-paul-revere/ and moving it from R to Python.


In [3]:
membership_matrix=pd.read_csv("https://raw.githubusercontent.com/kjhealy/revere/master/data/PaulRevereAppD.csv", index_col=[0])

In [4]:
membership_matrix


Out[4]:
StAndrewsLodge LoyalNine NorthCaucus LongRoomClub TeaParty BostonCommittee LondonEnemies
Adams.John 0 0 1 1 0 0 0
Adams.Samuel 0 0 1 1 0 1 1
Allen.Dr 0 0 1 0 0 0 0
Appleton.Nathaniel 0 0 1 0 0 1 0
Ash.Gilbert 1 0 0 0 0 0 0
Austin.Benjamin 0 0 0 0 0 0 1
Austin.Samuel 0 0 0 0 0 0 1
Avery.John 0 1 0 0 0 0 1
Baldwin.Cyrus 0 0 0 0 0 0 1
Ballard.John 0 0 1 0 0 0 0
Barber.Nathaniel 0 0 1 0 1 1 1
Barnard.Samuel 0 0 0 0 1 0 0
Barrett.Samuel 1 0 0 0 0 0 1
Bass.Henry 0 1 1 0 1 0 1
Bell.William 1 0 0 0 0 0 0
Blake.Increase 1 0 0 0 0 0 0
Boit.John 0 0 1 0 0 0 0
Bolter.Thomas 0 0 0 0 1 0 0
Boyer.Peter 0 0 0 0 0 0 1
Boynton.Richard 0 0 0 0 0 1 1
Brackett.Jos 0 0 0 0 0 0 1
Bradford.John 0 0 0 0 0 1 1
Bradlee.David 0 0 0 0 1 0 0
Bradlee.Josiah 0 0 0 0 1 0 0
Bradlee.Nathaniel 0 0 0 0 1 0 0
Bradlee.Thomas 0 0 0 0 1 0 0
Bray.George 1 0 0 0 0 0 0
Breck.William 0 0 1 0 0 0 0
Bewer.James 0 0 0 0 1 0 0
Brimmer.Herman 0 0 0 0 0 0 1
... ... ... ... ... ... ... ...
Swan.James 0 0 1 0 1 0 0
Sweetser.John 0 0 0 0 0 1 0
Symmes.Eben 0 0 1 0 0 0 0
Symmes.John 0 0 1 0 0 0 0
Tabor.Philip 1 0 0 0 0 0 0
Tileston.Thomas 0 0 1 0 0 0 0
Trott.George 0 1 0 0 0 0 0
Tyler.Royall 0 0 0 1 0 0 0
Urann.Thomas 1 0 1 0 1 0 0
Vernon.Fortesque 0 0 0 0 0 0 1
Waldo.Benjamin 0 0 0 0 0 0 1
Warren.Joseph 1 0 1 1 0 1 1
Webb.Joseph 1 0 0 0 0 0 0
Webster.Thomas 1 0 0 0 0 0 0
Welles.Henry 1 1 0 0 0 0 0
Wendell.Oliver 0 0 0 0 0 1 1
Wheeler.Josiah 0 0 0 0 1 0 0
White.Samuel 0 0 1 0 0 0 0
Whitten.John 1 0 0 0 0 0 0
Whitwell.Samuel 0 0 0 0 0 0 1
Whitwell.William 0 0 0 0 0 0 1
Williams.Jeremiah 0 0 0 0 1 0 0
Williams.Jonathan 0 0 0 0 0 0 1
Williams.Thomas 0 0 0 0 1 0 0
Willis.Nathaniel 0 0 0 0 1 0 0
Wingfield.William 1 0 0 0 0 0 0
Winslow.John 0 0 0 1 0 0 0
Winthrop.John 0 0 1 0 0 0 1
Wyeth.Joshua 0 0 0 0 1 0 0
Young.Thomas 0 0 1 0 1 1 0

254 rows × 7 columns


In [5]:
membership_matrix.shape


Out[5]:
(254, 7)

We have a matrix of 254 rows (people) and 7 columns (clubs).

discerning connections!

We are interested in

  • who is connected to whom

  • which organization is most connected to which other organization.

Our quarry is something like the similarity matrices we looked at in doing our work with text.

We are going to create an adjacency matrix, another square matrix that indicates who is connected to whom.

a little more matrix math

(You are not responsible for this but I want you to know the smoke and mirrors.)

We have to do a little basic arithmetic with matrices to make this work.

First we need the idea of the transpose, which is just flipping the rows and columns of a matrix.

$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}^{\mathrm{T}} = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}$

The adjacency matrix is the product of a matrix and its transpose.

Say we have a $254 \times 7$ matrix; its transpose will be a $7 \times 254$ matrix.

If we multiply an $M \times N$ matrix by an $N \times M$ matrix, we get an $M \times M$ matrix.

So let's start with the adjacency of every person to every other person.

Our goal, then, is a symmetrical matrix that matches $M$ people to $M$ people.

In Python we can use the .dot method to perform the necessary form of matrix multiplication. We can do this directly on pandas dataframes.
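Before doing this on the real data, here is a small sketch on a made-up table, so the result can be checked by eye. The names `toy`, `Ann`, `Bob`, `Cat`, `ClubA` and `ClubB` are invented for illustration.

```python
import pandas as pd

# Hypothetical toy membership table: 3 people (rows), 2 clubs (columns).
toy = pd.DataFrame(
    [[1, 0], [1, 1], [0, 1]],
    index=["Ann", "Bob", "Cat"],
    columns=["ClubA", "ClubB"],
)

# (3 x 2) times (2 x 3) gives a 3 x 3 person-by-person matrix.
toy_people = toy.dot(toy.T)
print(toy_people.shape)
print(toy_people)
```

Each entry counts the clubs two people share: Ann and Bob share one club, Ann and Cat share none, and the diagonal counts how many clubs each person belongs to.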


In [6]:
person_adjacency=membership_matrix.dot(membership_matrix.T)
# multiplying the membership matrix by its transpose

In [7]:
person_adjacency


Out[7]:
Adams.John Adams.Samuel Allen.Dr Appleton.Nathaniel Ash.Gilbert Austin.Benjamin Austin.Samuel Avery.John Baldwin.Cyrus Ballard.John ... Whitwell.William Williams.Jeremiah Williams.Jonathan Williams.Thomas Willis.Nathaniel Wingfield.William Winslow.John Winthrop.John Wyeth.Joshua Young.Thomas
Adams.John 2 2 1 1 0 0 0 0 0 1 ... 0 0 0 0 0 0 1 1 0 1
Adams.Samuel 2 4 1 2 0 1 1 1 1 1 ... 1 0 1 0 0 0 1 2 0 2
Allen.Dr 1 1 1 1 0 0 0 0 0 1 ... 0 0 0 0 0 0 0 1 0 1
Appleton.Nathaniel 1 2 1 2 0 0 0 0 0 1 ... 0 0 0 0 0 0 0 1 0 2
Ash.Gilbert 0 0 0 0 1 0 0 0 0 0 ... 0 0 0 0 0 1 0 0 0 0
Austin.Benjamin 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
Austin.Samuel 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
Avery.John 0 1 0 0 0 1 1 2 1 0 ... 1 0 1 0 0 0 0 1 0 0
Baldwin.Cyrus 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
Ballard.John 1 1 1 1 0 0 0 0 0 1 ... 0 0 0 0 0 0 0 1 0 1
Barber.Nathaniel 1 3 1 2 0 1 1 1 1 1 ... 1 1 1 1 1 0 0 2 1 3
Barnard.Samuel 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Barrett.Samuel 0 1 0 0 1 1 1 1 1 0 ... 1 0 1 0 0 1 0 1 0 0
Bass.Henry 1 2 1 1 0 1 1 2 1 1 ... 1 1 1 1 1 0 0 2 1 2
Bell.William 0 0 0 0 1 0 0 0 0 0 ... 0 0 0 0 0 1 0 0 0 0
Blake.Increase 0 0 0 0 1 0 0 0 0 0 ... 0 0 0 0 0 1 0 0 0 0
Boit.John 1 1 1 1 0 0 0 0 0 1 ... 0 0 0 0 0 0 0 1 0 1
Bolter.Thomas 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Boyer.Peter 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
Boynton.Richard 0 2 0 1 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 1
Brackett.Jos 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
Bradford.John 0 2 0 1 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 1
Bradlee.David 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Bradlee.Josiah 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Bradlee.Nathaniel 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Bradlee.Thomas 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Bray.George 0 0 0 0 1 0 0 0 0 0 ... 0 0 0 0 0 1 0 0 0 0
Breck.William 1 1 1 1 0 0 0 0 0 1 ... 0 0 0 0 0 0 0 1 0 1
Bewer.James 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Brimmer.Herman 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Swan.James 1 1 1 1 0 0 0 0 0 1 ... 0 1 0 1 1 0 0 1 1 2
Sweetser.John 0 1 0 1 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 1
Symmes.Eben 1 1 1 1 0 0 0 0 0 1 ... 0 0 0 0 0 0 0 1 0 1
Symmes.John 1 1 1 1 0 0 0 0 0 1 ... 0 0 0 0 0 0 0 1 0 1
Tabor.Philip 0 0 0 0 1 0 0 0 0 0 ... 0 0 0 0 0 1 0 0 0 0
Tileston.Thomas 1 1 1 1 0 0 0 0 0 1 ... 0 0 0 0 0 0 0 1 0 1
Trott.George 0 0 0 0 0 0 0 1 0 0 ... 0 0 0 0 0 0 0 0 0 0
Tyler.Royall 1 1 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 1 0 0 0
Urann.Thomas 1 1 1 1 1 0 0 0 0 1 ... 0 1 0 1 1 1 0 1 1 2
Vernon.Fortesque 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
Waldo.Benjamin 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
Warren.Joseph 2 4 1 2 1 1 1 1 1 1 ... 1 0 1 0 0 1 1 2 0 2
Webb.Joseph 0 0 0 0 1 0 0 0 0 0 ... 0 0 0 0 0 1 0 0 0 0
Webster.Thomas 0 0 0 0 1 0 0 0 0 0 ... 0 0 0 0 0 1 0 0 0 0
Welles.Henry 0 0 0 0 1 0 0 1 0 0 ... 0 0 0 0 0 1 0 0 0 0
Wendell.Oliver 0 2 0 1 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 1
Wheeler.Josiah 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
White.Samuel 1 1 1 1 0 0 0 0 0 1 ... 0 0 0 0 0 0 0 1 0 1
Whitten.John 0 0 0 0 1 0 0 0 0 0 ... 0 0 0 0 0 1 0 0 0 0
Whitwell.Samuel 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
Whitwell.William 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
Williams.Jeremiah 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Williams.Jonathan 0 1 0 0 0 1 1 1 1 0 ... 1 0 1 0 0 0 0 1 0 0
Williams.Thomas 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Willis.Nathaniel 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Wingfield.William 0 0 0 0 1 0 0 0 0 0 ... 0 0 0 0 0 1 0 0 0 0
Winslow.John 1 1 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 1 0 0 0
Winthrop.John 1 2 1 1 0 1 1 1 1 1 ... 1 0 1 0 0 0 0 2 0 1
Wyeth.Joshua 0 0 0 0 0 0 0 0 0 0 ... 0 1 0 1 1 0 0 0 1 1
Young.Thomas 1 2 1 2 0 0 0 0 0 1 ... 0 1 0 1 1 0 0 1 1 3

254 rows × 254 columns

Just as easily we can get the adjacency among the clubs.

This time we multiply the transpose by the original matrix.

(Remember: matrix multiplication is not commutative.)
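A quick sketch of why the order matters, using a made-up four-person, two-club matrix `m` (hypothetical data, not the Revere set): the two products do not even have the same shape.

```python
import numpy as np

# Hypothetical membership matrix: 4 people (rows) by 2 clubs (columns).
m = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [1, 0]])

people = m @ m.T   # 4 x 4: shared clubs for each pair of people
clubs = m.T @ m    # 2 x 2: shared members for each pair of clubs

print(people.shape, clubs.shape)

# The diagonal of the club matrix is each club's size (its column sum).
print(np.diag(clubs), m.sum(axis=0))
```

In a club-by-club product the off-diagonal entries count members two clubs have in common, while the diagonal simply counts each club's members.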


In [8]:
club_adjacency=membership_matrix.T.dot(membership_matrix)

In [9]:
club_adjacency


Out[9]:
StAndrewsLodge LoyalNine NorthCaucus LongRoomClub TeaParty BostonCommittee LondonEnemies
StAndrewsLodge 53 2 3 2 3 1 3
LoyalNine 2 10 3 0 2 0 3
NorthCaucus 3 3 59 5 13 9 16
LongRoomClub 2 0 5 17 2 5 5
TeaParty 3 2 13 2 97 3 8
BostonCommittee 1 0 9 5 3 21 11
LondonEnemies 3 3 16 5 8 11 62

In [10]:
club_adjacency["StAndrewsLodge"]


Out[10]:
StAndrewsLodge     53
LoyalNine           2
NorthCaucus         3
LongRoomClub        2
TeaParty            3
BostonCommittee     1
LondonEnemies       3
Name: StAndrewsLodge, dtype: int64

In [11]:
[value for value in club_adjacency["StAndrewsLodge"]]


Out[11]:
[53, 2, 3, 2, 3, 1, 3]

In [12]:
club_adjacency_names=[club for club in club_adjacency]

Python package for networks: networkx


In [13]:
import networkx as nx

networkx introduces a new data type, the graph, with lots of operations to create, modify, analyze, draw, import, and export graphs.

Find the full documentation at https://networkx.github.io/

Begin by initializing a graph, then add edges, nodes or both.
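A minimal sketch of the different ways to populate a graph; `toy_graph` and the node names are made up for illustration.

```python
import networkx as nx

toy_graph = nx.Graph()
toy_graph.add_node("a")                # a single, isolated node
toy_graph.add_nodes_from(["b", "c"])   # several nodes at once
toy_graph.add_edge("a", "b")           # an edge between existing nodes
toy_graph.add_edge("c", "d")           # "d" is created automatically

print(sorted(toy_graph.nodes()))
print(toy_graph.number_of_edges())
```

Adding an edge creates any missing endpoints, which is why the cells that follow only ever add edges.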


In [14]:
my_first_graph=nx.Graph() # initialize a graph

In [15]:
my_first_graph.add_edge(1,2)

In [16]:
nx.draw(my_first_graph)



In [17]:
my_first_graph.add_edges_from([(2,3), (2,4)])  #add a list of edges, in the form of tuples (x,y)

In [18]:
nx.draw(my_first_graph)



In [19]:
nx.number_of_nodes(my_first_graph)


Out[19]:
4

In [20]:
my_first_graph.nodes()


Out[20]:
[1, 2, 3, 4]

In [21]:
my_first_graph.edges()


Out[21]:
[(1, 2), (2, 3), (2, 4)]

In [22]:
nx.betweenness_centrality(my_first_graph)


Out[22]:
{1: 0.0, 2: 1.0, 3: 0.0, 4: 0.0}

Both nodes and edges can have additional attributes.

Most important for us is "weight": how strong the connection between two nodes is.
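As a small sketch of how edge attributes are stored and read back (the graph `weighted` and its node names are invented for illustration):

```python
import networkx as nx

weighted = nx.Graph()
weighted.add_edge("x", "y", weight=3)
weighted.add_edge("y", "z")   # no weight attribute is stored on this edge

print(weighted["x"]["y"]["weight"])          # read one attribute
print(weighted["y"]["z"].get("weight", 1))   # fall back to 1 if it is absent
print(list(weighted.edges(data=True)))       # every edge with its attributes
```

An edge's attributes behave like a dictionary, so `.get` with a default is a safe way to read a weight that may not have been set.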


In [23]:
my_first_graph.add_edge(4,5, weight=3)

In [24]:
nx.draw(my_first_graph)


The default simple graphing method does not show the weighted edge.

We will fix this.
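One possible way to make weights visible, sketched on a small stand-in graph `g`: pass the weights as line widths and label the weighted edges. Treat the layout seed and the choice of width as arbitrary assumptions, not the only way to do it.

```python
import matplotlib.pyplot as plt
import networkx as nx

g = nx.Graph()
g.add_edge(1, 2)              # unweighted edge
g.add_edge(2, 4)              # unweighted edge
g.add_edge(4, 5, weight=3)    # weighted edge

# Use each edge's weight (defaulting to 1) as its line width.
widths = [data.get("weight", 1) for _, _, data in g.edges(data=True)]
# Collect labels only for the edges that actually carry a weight.
labels = nx.get_edge_attributes(g, "weight")

pos = nx.spring_layout(g, seed=1)   # fixed seed so the layout is repeatable
nx.draw_networkx_nodes(g, pos)
nx.draw_networkx_labels(g, pos)
nx.draw_networkx_edges(g, pos, width=widths)
nx.draw_networkx_edge_labels(g, pos, edge_labels=labels)
plt.axis("off")   # outside a notebook, follow this with plt.show()
```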

For examples of drawing in networkx, see https://networkx.github.io/documentation/latest/gallery.html


In [25]:
G=nx.Graph()

In [26]:
for club in club_adjacency:
  for i, j in enumerate(club_adjacency[club]):
    G.add_edge(club, club_adjacency_names[i], weight=j)
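Note that the loop above adds an edge for every pair, so the diagonal becomes self-loops and pairs with nothing in common become edges of weight 0. If that is not wanted, one option is to filter while building. This is a sketch on a made-up 3 x 3 table, `toy_adjacency`, which stands in for `club_adjacency`.

```python
import networkx as nx
import pandas as pd

# Hypothetical 3 x 3 adjacency table standing in for club_adjacency.
toy_adjacency = pd.DataFrame(
    [[5, 2, 0], [2, 4, 1], [0, 1, 3]],
    index=["A", "B", "C"],
    columns=["A", "B", "C"],
)

filtered = nx.Graph()
filtered.add_nodes_from(toy_adjacency.index)
for row in toy_adjacency.index:
    for col in toy_adjacency.columns:
        weight = int(toy_adjacency.loc[row, col])
        # Skip the diagonal (self-loops) and pairs with nothing in common.
        if row != col and weight > 0:
            filtered.add_edge(row, col, weight=weight)

print(list(filtered.edges(data=True)))
```

Whether to keep zero-weight edges depends on the analysis: weighted measures ignore them, but unweighted ones treat every edge as a real connection.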

In [27]:
pos=nx.spring_layout(G,iterations=20)
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_nodes(G, pos)
nx.draw_networkx_labels(G, pos)
plt.show()



In [28]:
H=nx.Graph()

In [29]:
person_adjacency_names=[person for person in person_adjacency]
for person in person_adjacency:
  for i, j in enumerate(person_adjacency[person]):
    if person != person_adjacency_names[i]:
        H.add_edge(person, person_adjacency_names[i], weight=j)

In [30]:
nx.draw(H)



In [31]:
pos=nx.spring_layout(H, iterations=30)
plt.figure(figsize=(40,40))
nx.draw_networkx_edges(H, pos, width=.015)
nx.draw_networkx_nodes(H, pos, node_size=50)
nx.draw_networkx_labels(H, pos)
plt.axis('off')
plt.show()


Graphing is the prelude to analysis, not the thing itself

Just as term-document matrices are the fundamental form for undertaking text analysis, putting data into a graph form, as in networkx, allows a wide range of analysis, including sophisticated supervised and unsupervised learning.

Google's PageRank, the core of its search algorithm, is a graph analysis algorithm.

The intuition is that websites (nodes) referred to by other high-ranking websites (nodes) should be ranked more highly.
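To make that intuition concrete, here is a deliberately simplified sketch of the idea on a made-up four-page web. It is not Google's implementation. The `links` data and the damping value of 0.85 are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical four-page web: links[i] lists the pages that page i links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n = len(links)
damping = 0.85  # assumed damping factor for this sketch

# Column j spreads page j's rank evenly over the pages it links to.
transition = np.zeros((n, n))
for source, targets in links.items():
    for target in targets:
        transition[target, source] = 1.0 / len(targets)

# Start with equal ranks and repeatedly pass rank along the links.
rank = np.full(n, 1.0 / n)
for _ in range(100):
    rank = (1 - damping) / n + damping * transition @ rank

print(rank.round(3))
```

Page 2, which three other pages link to, ends up with the highest rank, and page 3, which nobody links to, ends up with the lowest.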

Black box time!

There are lots of centrality measures, most built into networkx.

We'll use a fairly sophisticated one, eigenvector centrality, built into networkx.

If you want to unblackbox, you're ready for a more advanced course!
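For a small peek inside the box, here is a rough sketch of the idea behind eigenvector centrality on a made-up four-node graph: repeatedly replace each node's score with the sum of its neighbours' scores. The networkx implementation differs in its details; this only shows the principle.

```python
import numpy as np

# Hypothetical graph: a triangle 0-1-2 with an extra node 3 attached to node 0.
adjacency = np.array([[0, 1, 1, 1],
                      [1, 0, 1, 0],
                      [1, 1, 0, 0],
                      [1, 0, 0, 0]], dtype=float)

scores = np.ones(len(adjacency))
for _ in range(200):
    # Each node's new score is the sum of its neighbours' scores.
    scores = adjacency @ scores
    scores /= np.linalg.norm(scores)  # rescale so the values stay bounded

print(scores.round(3))

# Cross-check against the leading eigenvector computed directly.
eigenvalues, eigenvectors = np.linalg.eigh(adjacency)
leading = np.abs(eigenvectors[:, -1])
print(np.allclose(scores, leading, atol=1e-6))
```

Node 0 scores highest because it has the most neighbours, and two of them are themselves well connected.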


In [32]:
H_centrality=nx.eigenvector_centrality(H, max_iter=30, weight='weight')  # use the shared-membership counts as edge weights

In [33]:
H_centrality


Out[33]:
{'Adams.John': 0.042042617911528087,
 'Adams.Samuel': 0.082652308324596649,
 'Allen.Dr': 0.036137828617833566,
 'Appleton.Nathaniel': 0.046733890801182211,
 'Ash.Gilbert': 0.0088622466677354388,
 'Austin.Benjamin': 0.031513355025979455,
 'Austin.Samuel': 0.031513355025979455,
 'Avery.John': 0.035111625311478385,
 'Baldwin.Cyrus': 0.031513355025979455,
 'Ballard.John': 0.036137828617833566,
 'Barber.Nathaniel': 0.15749939125034759,
 'Barnard.Samuel': 0.083263395828609463,
 'Barrett.Samuel': 0.039998750675148034,
 'Bass.Henry': 0.15058746829025768,
 'Bell.William': 0.0088622466677354388,
 'Bewer.James': 0.083263395828609449,
 'Blake.Increase': 0.0088622466677354406,
 'Boit.John': 0.036137828617833566,
 'Bolter.Thomas': 0.083263395828609449,
 'Boyer.Peter': 0.031513355025979455,
 'Boynton.Richard': 0.042152575755578896,
 'Brackett.Jos': 0.031513355025979455,
 'Bradford.John': 0.042152575755578896,
 'Bradlee.David': 0.083263395828609449,
 'Bradlee.Josiah': 0.083263395828609449,
 'Bradlee.Nathaniel': 0.083263395828609449,
 'Bradlee.Thomas': 0.083263395828609449,
 'Bray.George': 0.0088622466677354406,
 'Breck.William': 0.036137828617833566,
 'Brimmer.Herman': 0.031513355025979455,
 'Brimmer.Martin': 0.031513355025979455,
 'Broomfield.Henry': 0.031513355025979455,
 'Brown.Enoch': 0.031513355025979455,
 'Brown.Hugh': 0.0088622466677354406,
 'Brown.John': 0.031513355025979455,
 'Bruce.Stephen': 0.083263395828609449,
 'Burbeck.Edward': 0.0088622466677354406,
 'Burbeck.William': 0.0088622466677354406,
 'Burt.Benjamin': 0.036137828617833566,
 'Burton.Benjamin': 0.083263395828609449,
 'Cailleteau.Edward': 0.0088622466677354388,
 'Callendar.Elisha': 0.0088622466677354406,
 'Campbell.Nicholas': 0.083263395828609449,
 'Cazneau.Capt': 0.036137828617833566,
 'Chadwell.Mr': 0.036137828617833566,
 'Champney.Caleb': 0.036137828617833566,
 'Chase.Thomas': 0.15058746829025771,
 'Cheever.Ezekiel': 0.067019755249596524,
 'Chipman.Seth': 0.0088622466677354406,
 'Chrysty.Thomas': 0.036137828617833566,
 'Church.Benjamin': 0.082652308324596649,
 'Clarke.Benjamin': 0.083263395828609449,
 'Cleverly.Stephen': 0.0039290771674795697,
 'Cochran.John': 0.083263395828609449,
 'Colesworthy.Gilbert': 0.083263395828609449,
 'Collier.Gershom': 0.083263395828609463,
 'Collins.Ezra': 0.0088622466677354406,
 'Collson.Adam': 0.11828681873223398,
 'Condy.JamesFoster': 0.1481235480058298,
 'Cooper.Samuel': 0.08872836634856221,
 'Cooper.William': 0.006300892669768088,
 'Crafts.Thomas': 0.012671935782633193,
 'Crane.John': 0.083263395828609449,
 'Davis.Caleb': 0.042152575755578896,
 'Davis.Edward': 0.031513355025979455,
 'Davis.Robert': 0.083263395828609449,
 'Davis.William': 0.031513355025979455,
 'Dawes.Thomas': 0.006300892669768088,
 'Dennie.William': 0.046733890801182211,
 'Deshon.Moses': 0.0088622466677354406,
 'Dexter.Samuel': 0.006300892669768088,
 'Dolbear.Edward': 0.083263395828609449,
 'Doyle.Peter': 0.0088622466677354388,
 'Eaton.Joseph': 0.083263395828609449,
 'Eayres.Joseph': 0.11370550368663061,
 'Eckley.Unknown': 0.083263395828609449,
 'Edes.Benjamin': 0.0396929403570817,
 'Emmes.Samuel': 0.036137828617833566,
 'Etheridge.William': 0.083263395828609449,
 'Fenno.Samuel': 0.083263395828609449,
 'Ferrell.Ambrose': 0.0088622466677354406,
 'Field.Joseph': 0.0039290771674795697,
 'Flagg.Josiah': 0.0088622466677354406,
 'Fleet.Thomas': 0.006300892669768088,
 'Foster.Bos': 0.031513355025979455,
 'Foster.Samuel': 0.083263395828609449,
 'Frothingham.Nathaniel': 0.083263395828609449,
 'Gammell.John': 0.083263395828609463,
 'Gill.Moses': 0.031513355025979455,
 'Gore.Samuel': 0.083263395828609463,
 'Gould.William': 0.0088622466677354388,
 'Graham.James': 0.0088622466677354388,
 'Grant.Moses': 0.1481235480058298,
 'Gray.Wait': 0.0088622466677354406,
 'Greene.Nathaniel': 0.083263395828609449,
 'Greenleaf.Joseph': 0.077232256607993793,
 'Greenleaf.William': 0.04215257575557891,
 'Greenough.Newn': 0.031513355025979455,
 'Ham.William': 0.0088622466677354406,
 'Hammond.Samuel': 0.083263395828609449,
 'Hancock.Eben': 0.031513355025979455,
 'Hancock.John': 0.037461302865924771,
 'Hendley.William': 0.083263395828609449,
 'Hewes.George': 0.083263395828609449,
 'Hickling.William': 0.036137828617833566,
 'Hicks.John': 0.083263395828609449,
 'Hill.Alexander': 0.011036364864839903,
 'Hitchborn.Nathaniel': 0.0088622466677354406,
 'Hitchborn.Thomas': 0.036137828617833566,
 'Hobbs.Samuel': 0.083263395828609449,
 'Hoffins.John': 0.0088622466677354406,
 'Holmes.Nathaniel': 0.036137828617833566,
 'Hooton.John': 0.083263395828609449,
 'Hopkins.Caleb': 0.031513355025979455,
 'Hoskins.William': 0.036137828617833566,
 'Howard.Samuel': 0.083263395828609449,
 'Howe.Edward': 0.083263395828609449,
 'Hunnewell.Jonathan': 0.083263395828609449,
 'Hunnewell.Richard': 0.083263395828609449,
 'Hunstable.Thomas': 0.083263395828609449,
 'Hunt.Abraham': 0.083263395828609463,
 'Ingersoll.Daniel': 0.083263395828609449,
 'Inglish.Alexander': 0.0088622466677354388,
 'Isaac.Pierce': 0.031513355025979455,
 'Ivers.James': 0.031513355025979455,
 'Jarvis.Charles': 0.031513355025979455,
 'Jarvis.Edward': 0.0088622466677354406,
 'Jefferds.Unknown': 0.0088622466677354388,
 'Jenkins.John': 0.0088622466677354406,
 'Johnston.Eben': 0.031513355025979455,
 'Johonnott.Gabriel': 0.036137828617833566,
 'Kent.Benjamin': 0.036137828617833566,
 'Kerr.Walter': 0.0088622466677354406,
 'Kimball.Thomas': 0.036137828617833566,
 'Kinnison.David': 0.083263395828609449,
 'Lambert.John': 0.006300892669768088,
 'Lee.Joseph': 0.083263395828609449,
 'Lewis.Phillip': 0.0088622466677354388,
 'Lincoln.Amos': 0.083263395828609449,
 'Loring.Matthew': 0.083263395828609449,
 'Lowell.John': 0.036137828617833566,
 'MacKintosh.Capt': 0.083263395828609449,
 'MacNeil.Archibald': 0.083263395828609449,
 'Machin.Thomas': 0.083263395828609449,
 'Mackay.William': 0.011036364864839903,
 'Marett.Phillip': 0.0088622466677354388,
 'Marlton.John': 0.0088622466677354388,
 'Marshall.Thomas': 0.031513355025979455,
 'Marson.John': 0.031513355025979455,
 'Mason.Jonathan': 0.031513355025979455,
 'Matchett.John': 0.036137828617833566,
 'May.John': 0.083263395828609449,
 'McAlpine.William': 0.0088622466677354388,
 'Melville.Thomas': 0.083263395828609449,
 'Merrit.John': 0.036137828617833566,
 'Milliken.Thomas': 0.0088622466677354406,
 'Molineux.William': 0.12802527438105318,
 'Moody.Samuel': 0.0088622466677354388,
 'Moore.Thomas': 0.083263395828609449,
 'Morse.Anthony': 0.083263395828609449,
 'Morton.Perez': 0.036137828617833566,
 'Mountford.Joseph': 0.083263395828609449,
 'Newell.Eliphelet': 0.083263395828609449,
 'Nicholls.Unknown': 0.0088622466677354406,
 'Noyces.Nat': 0.031513355025979455,
 'Obear.Israel': 0.0088622466677354388,
 'Otis.James': 0.017175438417510437,
 'Palfrey.William': 0.0088622466677354406,
 'Palmer.Joseph': 0.083263395828609449,
 'Palms.Richard': 0.036137828617833566,
 'Parker.Jonathan': 0.083263395828609449,
 'Parkman.Elias': 0.067019755249596524,
 'Partridge.Sam': 0.031513355025979455,
 'Payson.Joseph': 0.083263395828609463,
 'Pearce.Isaac': 0.036137828617833566,
 'Pearce.IsaacJun': 0.036137828617833566,
 'Peck.Samuel': 0.091265814157785341,
 'Peck.Thomas': 0.036137828617833566,
 'Peters.John': 0.083263395828609449,
 'Phillips.John': 0.0088622466677354406,
 'Phillips.Samuel': 0.006300892669768088,
 'Phillips.William': 0.031513355025979455,
 'Pierce.William': 0.083263395828609449,
 'Pierpont.Robert': 0.011036364864839903,
 'Pitts.John': 0.031513355025979455,
 'Pitts.Lendall': 0.083263395828609449,
 'Pitts.Samuel': 0.031513355025979455,
 'Porter.Thomas': 0.083263395828609449,
 'Potter.Edward': 0.0088622466677354388,
 'Powell.William': 0.04215257575557891,
 'Prentiss.Henry': 0.083263395828609449,
 'Prince.Job': 0.031513355025979455,
 'Prince.John': 0.083263395828609449,
 'Proctor.Edward': 0.1481235480058298,
 'Pulling.John': 0.067019755249596524,
 'Pulling.Richard': 0.0088622466677354388,
 'Purkitt.Henry': 0.083263395828609449,
 'Quincy.Josiah': 0.017175438417510437,
 'Randall.John': 0.083263395828609449,
 'Revere.Paul': 0.16004633269693017,
 'Roby.Joseph': 0.083263395828609449,
 'Roylson.Thomas': 0.031513355025979455,
 'Ruddock.Abiel': 0.067019755249596524,
 'Russell.John': 0.083263395828609463,
 'Russell.William': 0.083263395828609449,
 'Sessions.Robert': 0.083263395828609449,
 'Seward.James': 0.0088622466677354406,
 'Sharp.Gibbens': 0.036137828617833566,
 'Shed.Joseph': 0.083263395828609449,
 'Sigourney.John': 0.036137828617833566,
 'Simpson.Benjamin': 0.083263395828609463,
 'Slater.Peter': 0.083263395828609463,
 'Sloper.Ambrose': 0.0088622466677354406,
 'Smith.John': 0.0039290771674795697,
 'Spear.Thomas': 0.083263395828609449,
 'Sprague.Samuel': 0.083263395828609449,
 'Spurr.John': 0.083263395828609449,
 'Stanbridge.Henry': 0.0088622466677354406,
 'Starr.James': 0.083263395828609449,
 'Stearns.Phineas': 0.083263395828609449,
 'Stevens.Ebenezer': 0.083263395828609449,
 'Stoddard.Asa': 0.036137828617833566,
 'Stoddard.Jonathan': 0.036137828617833566,
 'Story.Elisha': 0.11828681873223398,
 'Swan.James': 0.11828681873223397,
 'Sweetser.John': 0.011036364864839903,
 'Symmes.Eben': 0.036137828617833566,
 'Symmes.John': 0.036137828617833566,
 'Tabor.Philip': 0.0088622466677354388,
 'Tileston.Thomas': 0.036137828617833566,
 'Trott.George': 0.0039290771674795697,
 'Tyler.Royall': 0.006300892669768088,
 'Urann.Thomas': 0.12589136707860255,
 'Vernon.Fortesque': 0.031513355025979455,
 'Waldo.Benjamin': 0.031513355025979455,
 'Warren.Joseph': 0.090442273025718459,
 'Webb.Joseph': 0.0088622466677354388,
 'Webster.Thomas': 0.0088622466677354406,
 'Welles.Henry': 0.01267193578263319,
 'Wendell.Oliver': 0.042152575755578896,
 'Wheeler.Josiah': 0.083263395828609449,
 'White.Samuel': 0.036137828617833566,
 'Whitten.John': 0.0088622466677354406,
 'Whitwell.Samuel': 0.031513355025979455,
 'Whitwell.William': 0.031513355025979455,
 'Williams.Jeremiah': 0.083263395828609449,
 'Williams.Jonathan': 0.031513355025979455,
 'Williams.Thomas': 0.083263395828609449,
 'Willis.Nathaniel': 0.083263395828609449,
 'Wingfield.William': 0.0088622466677354406,
 'Winslow.John': 0.006300892669768088,
 'Winthrop.John': 0.067019755249596524,
 'Wyeth.Joshua': 0.083263395828609449,
 'Young.Thomas': 0.12802527438105318}

argh, need to sort that


In [34]:
sorted(H_centrality, key=H_centrality.get, reverse=True)[:10]  #useful syntax to know for sorting dictionaries!


Out[34]:
['Revere.Paul',
 'Barber.Nathaniel',
 'Chase.Thomas',
 'Bass.Henry',
 'Proctor.Edward',
 'Condy.JamesFoster',
 'Grant.Moses',
 'Young.Thomas',
 'Molineux.William',
 'Urann.Thomas']
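The sorting idiom above is worth unpacking. Here is a small sketch on a made-up dictionary, `scores`, showing the same trick and a variant that keeps the values alongside the names.

```python
# Hypothetical scores, standing in for a centrality dictionary.
scores = {"a": 0.2, "b": 0.9, "c": 0.5}

# Sorting the dictionary itself iterates over its keys; key=scores.get
# tells sorted() to compare the keys by their values.
top_keys = sorted(scores, key=scores.get, reverse=True)[:2]

# Sorting the (key, value) pairs keeps the values alongside the names.
top_pairs = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:2]

print(top_keys)
print(top_pairs)
```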

In [35]:
for name in sorted(H_centrality, key=H_centrality.get, reverse=True)[0:10]:
    print(name, H_centrality[name])


Revere.Paul 0.160046332697
Barber.Nathaniel 0.15749939125
Chase.Thomas 0.15058746829
Bass.Henry 0.15058746829
Proctor.Edward 0.148123548006
Condy.JamesFoster 0.148123548006
Grant.Moses 0.148123548006
Young.Thomas 0.128025274381
Molineux.William 0.128025274381
Urann.Thomas 0.125891367079

Revere is our terrorist mastermind!

Get them on a do-not-fly, er, do-not-ride list!

The point isn't that such techniques are always right:

  • they do uncover relationships

  • they have tons of false positives

