# DH3501: Advanced Social NetworksClass 6: Formal Definitions and the Jackson Text

Western University
Department of Modern Languages and Literatures
Digital Humanities – DH 3501

Instructor: David Brown
E-mail: dbrow52@uwo.ca
Office: AHB 1R14

## So, what do you think of the Jackson text?

• The Jackson text may seem a bit mathy or formal...but if you are going to study networks, you have to get used to a bit of math.

• The most important thing is to take your time and remain calm when you read it:

## Formal representation of networks

Graph theory uses a variety of symbols, including set-builder notation:

• $\{i | i \in g\}$

Set of all i such that i is an element in g...

• A set is an unordered collection of unique elements.
• Python implements a set data structure.

How is a set different from a Python list???



In [1]:

set([1,2,3,4])




Out[1]:

{1, 2, 3, 4}



## Some common symbols we will see in this class:

Set:

• $\{1,...,n\}$

Matrix:

• $g = \begin{bmatrix}0 & 1\\1 & 0\end{bmatrix}$

Membership:

• $ij \in g$

Owns, has member:

• $g \ni ij$

Set such that:

• $\{x | x \in \mathbb{R} \}$
• or
• $\{x : x \in \mathbb{R} \}$

Subset, subset or equal set:

• $g' \subset g$
• $g' \subseteq g$

Superset, superset or equal set:

• $g \supset g'$
• $g \supseteq g'$

Union:

• $a \cup b$

Intersect:

• $a \cap b$

Anyone remember what the difference between a set union and a set intersect is?

## Summation

Stats class anyone?

$\sum\limits_{i=1}^ni=1 + 2 + 3 + \dots + n$

• Summation is the operation of adding a sequence of numbers.

## Anyone freaking yet?

IT'S OK! We've still got Python to help us. Let's put all of this in terms we understand.

## Coding challenge: Translation to Python

All of the above statements can be expressed in Python. Translate them!

For example:

# 1
N = set(range(1, 11))


Translate the following:

1. N = $\{1,\dots,10\}$
2. $5 \in N$
3. $\{4 | 4 \in N \}$
4. $\{1, 3, 4\} \subset N$
5. $a = \{1, 2, 3\}$
6. $b = \{3, 4, 5\}$
7. $a \cap b$
8. $a \cup b$

Hint: Ever looked at the Python docs for set?



In [2]:

# Translate here. Add as many new cells as you like.



### Everyone knows how to do a summation in Python right?



In [3]:

sum(range(5))




Out[3]:

10



What if we need to apply an operation to each element in the summation first?

$\sum\limits_{i=1}^ni^2=1^2 + 2^2 + 3^2 + \dots + n^2$

Implement a one liner in Python that performs the above summation with n = 4.



In [4]:

# One liner here :)
sum(map(lambda x: x * x, range(5)))




Out[4]:

30



## Ok down to business: Formal representations of graphs a la Jackson

pp. 41-44

Jackson defines a graph as a set $N = \{1,...,n\}$ of n nodes involved in a network of relationships defined as a real valued $n x n$ matrix $g$, where $g_{ij}$ represents the relation between $i$ and $j$.

• $graph = (N,g)$
• $N = \{1, 2, 3\}$
• $g = \begin{bmatrix}0 & 1 & 0\\1 & 0 & 1\\0 & 1 & 0\end{bmatrix}$

A network is directed if if it is possible that:

• $g_{ij} \ne g_{ji}$

Edges can also be represented as a set of tuples:

• $g = \{(1,2), (2,3)\}$

A subset of edges:

• $g' \subset g = \{ij | ij \in g' \} \subset \{ij | ij \in g\}$

The network created by deleting all edges except those that are between nodes in the set $S$ where $S \subset N$:

• $[g|s]_{ij} = \begin{cases} 1\ \text{if}\ i \in S, j \in S, g_{ij} = 1,\ \text{else}\ \\ 0 \end{cases}$

The set of all undirected and unweighted networks on $N$:

• $G(N)$

The set of nodes that have at least one link in the network $g$:

• $N(g) = \{i|\exists j\ \text{s.t.}\ ij \in g,\ \text{or}\ ji \in g\}$

Can you translate that to set builder speak?

## Isomorphism

Isomorphism is a mapping (bijection) that preserves edge structure.

Jackson p. 43 one line at a time:

• $(N, g)$ and $(N', g')$ are isomorphic if there exists a bijection

• $f : N \to N'$ such that

• $ij \in g$ if and only if

• $f(i)f(j) \in g'$

## Paths and Cycles

Page 44-45

Definitions:

• A walk is a sequence of links connecting a sequence of nodes.

• A cycle is a walk that starts and ends at the same node, with all nodes appearing once except the starting node which also appears as the ending node.

• A path is a walk where a node appears at most once in the sequence.

• A geodesic between two nodes is a shortest path between them.

Formally...

A path in a network $g \in G(N)$ betgween nodes $i$ and $j$ is a sequence of links

• $i_1,i_2,\dots,i_{K-1},i_K$ such that

• $i_ki_{k+1} \in g$ for each

• $k \in \{1,\dots,K-1\}$, with

• $i_1 = i$ and $i_K = j$, and such that each node in the sequence

• $i_1,\dots,i_K$ is distinct.

A walk in a network $g \in G(N)$ betgween nodes $i$ and $j$ is a sequence of links

• $i_1,i_2,\dots,i_{K-1},i_K$ such that

• $i_kI_{k+1} \in g$ for each

• $k \in \{1,\dots,K-1\}$, with

• $i_1 = i$ and $i_K = j$

WAIT! WHAT'S THE DIFFERENCE BETWEEN A PATH AND A WALK AGAIN?

## Components and connectivity

Remember that a network is connected (path connected) if every two nodes in the network are connected by some path in the network.

• $(N,g)$ is connected if for each $i \in N$ and $j \in N$ there exists a path in $(N,g)$ between $i$ and $j$

A component of a network $(N,g)$, is a nonempty subnetwork $(N',g')$ such that $\emptyset \ne N' \subset N, g' \subset g,$ and

• $(N',g')$ is connected, and
• if $i \in N'$ and $ij \in g,$ then $j \in N'$ and $ij \in g'$

Set of components of a network $(N,g)$ is denoted $C(N,g)$

## Node neighborhoods

pp. 50-51

The neighborhood of a node i is the set of nodes that i is linked to.

• $N_i(g)=\{j: g_{ij} = 1\}$

Given some set of nodes $S$, the neighborhood of $S$ is the union of the neighborhoods of its members:

• $N_S(g) = \bigcup\limits_{i \in S} N_i(g) = \{j: \exists i \in S, g_{ij}=1\}$

We can also extend the neighborhood of a node, remember we tried to do this in class 4...

• $N_i^k(g) = N_i(g) \cup \bigg( \bigcup\limits_{j \in N_i(g)} N_j^{k-1}(g) \bigg)$

## Degree

pp. 51

The degree of node is the number of links that are connected to it.

A node $i$'s degree in a network $g$, denoted $d_i(g)$ is:

• $d_i(g) = \# \{j: g_{ji} = 1 \} = \# N_i(g)$

## Degree Distribution

Degree distribution is one of the fundamental ways to describe a graph. It describes the relative frequencies of the nodes that have different degrees in the graph.

• Two famous degree distributions, resulting from two kinds of networks:
• Scale free network
• Random network
• Lot's more on this later this term...

Look at the following examples. Is their anything about the structure of the two graphs that strikes you as different?



In [5]:

# Config environment for code examples.
%matplotlib inline
import networkx as nx
import matplotlib as plt
plt.rcParams['figure.figsize'] = 17, 12




In [6]:

g = nx.scale_free_graph(40).to_undirected()
nx.draw_networkx(g)







In [7]:

g = nx.gnp_random_graph(40, 0.1)
nx.draw_networkx(g)






## Measures of centrality

pp. 61-65

Degree centrality - a normalized measure of a node's connectivity:

• $\frac{d_i(g)}{n - 1}$

Closeness centrality - a the inverse of the average distance between $i$ and any other node:

• $\frac{n-1}{\sum\limits_{j \ne i}\ell(i,j)}$ , where
• $\ell(i,j)$ is the number of links in the shortest path between $i$ and $j$

Betweenness centrality - the proportion of how many shortest paths a node lies on averaged across all pairs of nodes in a network:

• $Ce_i^B(g)= \sum\limits_{k \ne j: i \notin \{k,j\}} \frac{P_i(kj)/P(kj)}{(n-1)(n-2)/2}$

## Coding challenge: Plotting degree distribution

Use NetworkX to generate a scale free graph and a gnp random graph, then plot the degree distribution of each graph...

HINTS:

• Degree distribution plots the probability $P(k)$ that a node has the degree $k$
• You calculate $P(k)$ by dividing the number of nodes in the network with degree $k$ by the total number of nodes $n$ in the graph.
• $P(k) = \frac{\# nodes\ with\ degree\ k} {\# nodes}$
• On the x axis of the graph you plot the degree, $k$
• On the y-axis of the graph you plot the probability $P(k)$
• Some useful functions:
nx.degree(graph)
plt.scatter(xvals, yvals)
plt.yscale("log")
plt.xscale("log")



In [8]:

def degree_distribution(g):
"""
:param g: networkx.Graph
:returns ?: whatever you need to plot the degree distribution.
"""
pass

# Here you can either script or write a function to plot the degree distributions.
# Use as many cells as you need.