A SIMPLE STORY (1) The fate of Saddam and network science

- Invasion that started in March 19, 2003. Many of the regime's high ranking officials, including Saddam Hussein, avoided capture.
Hussein was last spotted kissing a baby in Baghdad in April 2003, and then his trace went cold.

Designed a deck of cards, each card engraved with the images of the 55 most wanted.

- It worked: by May 1, 2003, 15 men on the cards were captured, and by the end of the month another 12 were under custody.
- Yet, the ace of spades, i.e. Hussein himself, remained at large.

The capture of Saddam Hussein:

shows the strong predictive power of networks.

underlies the need to obtain accurate maps of the networks we aim to study; and the often heroic difficulties we encounter during the mapping process.

demonstrates the remarkable stability of these networks: The capture of Hussein was not based on fresh intelligence, but rather on his pre-invasion social links, unearthed from old photos stacked in his family album.

shows that the choice of network we focus on makes a huge difference: the hierarchical tree, that captured the official organization of the Iraqi government, was of no use when it came to Saddam Hussein's whereabouts.

A SIMPLE STORY (2): August 15, 2003 blackout.

An important theme of this class:

we must understand how network structure affects the robustness of a complex system.

develop quantitative tools to assess the interplay between network structure and the dynamical processes on the networks, and their impact on failures.

We will learn that failures reality failures follow reproducible laws, that can be quantified and even predicted using the tools of network science.

[adj., v. kuh m-pleks, kom-pleks; n. kom-pleks] –adjective

- composed of many interconnected parts; compound; composite: a complex highway system.

- characterized by a very complicated or involved arrangement of parts, units, etc.: complex machinery.

- so complicated or intricate as to be hard to understand or deal with: a complex problem.
`Source: Dictionary.com`

- so complicated or intricate as to be hard to understand or deal with: a complex problem.

a scientific theory which asserts that some systems display behavioral phenomena that are completely inexplicable by any conventional analysis of the systems’ constituent parts. These phenomena, commonly referred to as emergent behaviour, seem to occur in many complex systems involving living organisms, such as a stock market or the human brain. Source: John L. Casti, Encyclopædia Britannica

- Social graph
- Organization
- Brain
- finantial network
- business
- Internet
- Genes

The human genome project, completed in 2001, offered the first comprehensive list of all human genes. Yet, to fully understand how our cells function, and the origin of disease, we need accurate maps that tell us how these genes and other cellular components interact with each other.

Terrorism is one of the maladies of the 21st century, absorbing significant resources to combat it worldwide. Network thinking is increasingly present in the arsenal of various law enforcement agencies in charge of limiting terrorist activities. It is used to disrupt the financial network of terrorist organizations, to map terrorist networks, and to uncover the role of their members and their capabilities. While much of the work in this area is classified, several success stories have surfaced. Examples include the use of social networks to capture Saddam Hussein or the capture of the individuals behind the March 11, 2004 Madrid train bombings through the examination of the mobile call net- work.

While the H1N1 pandemic was not as devastating as it was feared at the beginning of the outbreak in 2009, it gained a special role in the history of epidemics: it was the first pandemic whose course and time evolution was accurately predicted months before the pandemic reached its peak (Image 1.9) [14]. This was possible thanks to fundamental advances in understanding the role of networks in the spread of viruses. Indeed, before 2000 epidemic modeling was dominated by compartment models, assuming that everyone can infect everyone else one word the same socio-physical compartment. The emergence of a network-based framework has fundamentally changed this, offering a new level of predictability in epidemic phenomena. Today epidemic prediction is one of the most active applications of network science. It is the source several fundamental results, covered in this book, that are used to predict the spread of both biological and electronic viruses. The impact of these advances are felt beyond biological viruses. In January 2010 network science tools have predicted the conditions necessary for the emergence of viruses spreading through mobile phones. The first major mobile epidemic outbreak that started in the fall of 2010 in China, infecting over 300,000 phones each day, closely followed the predicted scenario.

The human brain, consisting of hundreds of billions of interlinked neurons, is one of the least understood net- works from the perspective of network science. The reason is simple: we lack maps telling us which neurons link to each other. The only fully mapped neural map available for research is that of the C.Elegans worm, with only 300 neurons. Should detailed maps of mammalian brains become available, brain research could become the most prolific application area of network science. Driven by the potential impact of such maps, in 2010 the National Institutes of Health has initiated the Connectome project, aimed at developing the technologies that could provide an accurate neuron-level map of mammalian brains.

1735: Euler’s theorem:

- If a graph has more than two nodes of odd degree, there is no path.
- If a graph is connected and has no odd degree nodes, it has at least one path.

network often refers to real systems

- www,
- social network
- metabolic network.

Language: (Network, node, link)

graph: mathematical representation of a network

- web graph,
- social graph (a Facebook term)

Language: (Graph, vertex, edge)

The choice of the proper network representation determines our ability to use network theory successfully.

In some cases there is a unique, unambiguous representation. In other cases, the representation is by no means unique. For example, the way we assign the links between a group of individuals will determine the nature of the question we can study.

If you connect individuals that work with each other, you will explore the professional network.

```
In [174]:
```%matplotlib inline
import matplotlib.pyplot as plt
import networkx as nx
Gu = nx.Graph()
for i, j in [(1, 2), (1, 4), (4, 2), (4, 3)]:
Gu.add_edge(i,j)
nx.draw(Gu, with_labels = True)

```
```

```
In [173]:
```import networkx as nx
Gd = nx.DiGraph()
for i, j in [(1, 2), (1, 4), (4, 2), (4, 3)]:
Gd.add_edge(i,j)
nx.draw(Gd, with_labels = True)

```
```

```
In [175]:
```nx.draw(Gu, with_labels = True)

```
```

```
In [177]:
```nx.draw(Gd, with_labels = True)

```
```

```
In [20]:
```import numpy as np
x = [1, 1, 1, 2, 2, 3]
np.mean(x), np.sum(x), np.std(x)

```
Out[20]:
```

```
In [22]:
```plt.hist(x)
plt.show()

```
```

```
In [113]:
```from collections import defaultdict, Counter
freq = defaultdict(int)
for i in x:
freq[i] +=1
freq

```
Out[113]:
```

```
In [37]:
```freq_sum = np.sum(freq.values())
freq_sum

```
Out[37]:
```

```
In [38]:
```px = [float(i)/freq_sum for i in freq.values()]
px

```
Out[38]:
```

```
In [41]:
```plt.plot(freq.keys(), px, 'r-o')
plt.show()

```
```

```
In [182]:
```plt.figure(1)
plt.subplot(121)
pos = nx.spring_layout(Gu) #定义一个布局，此处采用了spring布局方式
nx.draw(Gu, pos, with_labels = True)
plt.subplot(122)
nx.draw(Gd, pos, with_labels = True)

```
```

bipartite graph (or bigraph) is a graph whose nodes can be divided into two disjoint sets U and V such that every link connects a node in U to one in V; that is, U and V are independent sets.

- Hits algorithm
- recommendation system

- for a connected graph: where $d_{ij}$ is the distance from node i to node j

- In an undirected graph $d_{ij} =d_{ji}$ , so we only need to count them once:

has a path from each node to every other node and vice versa (e.g. AB path and BA path).

it is connected if we disregard the edge directions.

Strongly connected components can be identified, but not every node is partof a nontrivial strongly connected component.

- In-component: nodes that can reach the scc,
- Out-component: nodes that can be reached from the scc.

```
In [215]:
```G1 = nx.complete_graph(4)
pos = nx.spring_layout(G1) #定义一个布局，此处采用了spring布局方式
nx.draw(G1, pos = pos, with_labels = True)

```
```

```
In [206]:
```print(nx.transitivity(G1))

```
```

```
In [216]:
```G2 = nx.Graph()
for i, j in [(1, 2), (1, 3), (1, 0), (3, 0)]:
G2.add_edge(i,j)
nx.draw(G2,pos = pos, with_labels = True)

```
```

```
In [198]:
```print(nx.transitivity(G2))

```
```

```
In [219]:
```G3 = nx.Graph()
for i, j in [(1, 2), (1, 3), (1, 0)]:
G3.add_edge(i,j)
nx.draw(G3, pos =pos, with_labels = True)

```
```

```
In [200]:
```print(nx.transitivity(G3))

```
```

THREE CENTRAL QUANTITIES IN NETWORK SCIENCE

- A. Degree distribution: $p_k$
- B. Path length: $<d>$
- C. Clustering coefficient: $C_i$

Definition: A random graph is a graph of N nodes where each pair of nodes is connected by probability p.

N lableled nodes are connected with L randomly placed links. Erdös-Rényi(1959)

Each pair of N labeled nodes is connected with probability p. Gilbert (1959)

```
In [ ]:
```

```
In [ ]:
```

```
In [ ]:
```

```
In [ ]:
```

```
In [ ]:
```

```
In [ ]:
```

```
In [ ]:
```

```
In [ ]:
```

```
In [ ]:
```

```
In [ ]:
```

```
In [ ]:
```

```
In [144]:
```import networkx as nx
import matplotlib.pyplot as plt
BA= nx.random_graphs.barabasi_albert_graph(500,1) #生成n=20、m=1的BA无标度网络
pos = nx.spring_layout(BA) #定义一个布局，此处采用了spring布局方式
nx.draw(BA,pos,with_labels=False,node_size = 30) #绘制图形
plt.show()

```
```

```
In [145]:
```nx.degree_histogram(BA)[:3]

```
Out[145]:
```

```
In [146]:
```BA.degree().items()[:3]

```
Out[146]:
```

```
In [147]:
```plt.hist(BA.degree().values())
plt.show()

```
```

```
In [148]:
```def plotDegreeDistributionLongTail(G):
degs = defaultdict(int)
for i in G.degree().values(): degs[i]+=1
items = sorted ( degs.items () )
x, y = np.array(items).T
plt.plot(x, y, 'b-o')
plt.legend(['Degree'])
plt.xlabel('$K$', fontsize = 20)
plt.ylabel('$P_K$', fontsize = 20)
plt.title('$Degree\,Distribution$', fontsize = 20)
plt.show()
plotDegreeDistributionLongTail(BA)

```
```

```
In [150]:
```def plotDegreeDistribution(G):
degs = defaultdict(int)
for i in G.degree().values(): degs[i]+=1
items = sorted ( degs.items () )
x, y = np.array(items).T
plt.plot(x, y, 'b-o')
plt.xscale('log')
plt.yscale('log')
plt.legend(['Degree'])
plt.xlabel('$K$', fontsize = 20)
plt.ylabel('$P_K$', fontsize = 20)
plt.title('$Degree\,Distribution$', fontsize = 20)
plt.show()
plotDegreeDistribution(BA)

```
```

```
In [192]:
```BA= nx.random_graphs.barabasi_albert_graph(50000,10) #生成n=50000、m=1的BA无标度网络
plotDegreeDistribution(BA)

```
```

度分布的n阶矩被定义为：

$ <k^n> = \sum_{k_{min}}^{k_{max}}k^np_k = \int_{k_{min}}^{k_{max}}k^np_kdk $ （1）

低阶矩具有明确的统计意义：

- n = 1的时候，一阶矩是$<k^{}>$，即平均度。
- n = 2的时候，二阶矩是$<k^2>$，可以帮助计算方差 $ \delta^2 = <k^2> - <k^{}>^{2} $，测量了度的离散程度（the spread in the degrees）。
- n = 3的时候，三阶矩是$<k^3>$, 决定了度分布的偏度（skewness），测量了$p_k$围绕着
的对称性。

对于无标度网络而言，满足幂律分布：

$p(k) = Ck^{-\gamma}$ （2）

由公式（1）和（2）可以得到：

$ <k^n> = \int_{k_{min}}^{k_{max}}k^np_kdk = C \frac{k_{max}^{n- \gamma +1} - k_{min}^{n - \gamma +1}}{n - \gamma + 1} $ （3）

可以使用wolframalpha的积分计算器积分来进行简单验证，例如x^(n-r)dx从10到100积分 网页链接：[http://www.wolframalpha.com/input/?i=integrate+x%5E%28n-r%29+dx+from+10+to+100][1]

显然：

- 当$n - \gamma +1 <= 0$时，随着$k_{max}增加，$$k_{max}^{n- \gamma +1} \rightarrow 0$。所有满足$n <= \gamma -1$的n阶矩都是有限的。
- 当$n - \gamma +1 >0 $时，随着$k_{max}增加，$$k_{max}^{n- \gamma +1} \rightarrow \infty$。所有满足$n > \gamma -1$的n阶矩都是无极限的。

对于无标度网络而言，一般幂参数$2 < \gamma < 3$，所以：

对于n = 1的情况，即一阶矩平均度$<k^{}>$是有限的。

但对于n >= 2的情况，即$k^2$或$k^3$是无极限的。二阶和高阶矩无穷大是“无标度”的来源

```
In [ ]:
```