Such basic features of a network as those computed in Q&A Session #1 can also be displayed easily with one of NetworKit's convenience functions (before we can show them we have to import the Python module networkit again and change to the correct folder):

```
In [6]:
```from networkit import *

```
In [7]:
```cd ~/Documents/workspace/NetworKit

```
```

```
In [8]:
```%matplotlib inline

```
In [9]:
```G = readGraph("input/PGPgiantcompo.graph", Format.METIS)
pf = profiling.Profile.create(G, preset="minimal")
pf.show()

```
```

Now we know how to access the most fundamental features of a network. Some properties that could be regarded as fundamental are missing because their calculation would take too long for larger graphs.

Before we look at different network types, let us reflect on the distant roots of network analysis. Graph theory is one of its ancestors and has been around for several centuries now. As an example, think of Euler's problem from 1736 of finding a 'Eulerian tour/cycle' over the Pregel bridges in Königsberg.

Do you remember how the problem of deciding wheter a graph has a Eulerian cycle can be solved for arbitrary undirected graphs? If not, use the web to find helpful characterizations of graphs with Eulerian cycles. Write it down here.

**Answer:**Then enter code below to decide whether a graph has such a cycle or not.

**Insert code in next cell**Test it on the PGP graph! What is the result? Does the result meet your expectations?

**Answer:**

```
In [10]:
```# 2-1) An undirected graph is Eulerian iff all its vertices have even degree
# 2-2) and 2-3) Decide whether graph is Eulerian or not
eulerian = True
for v in G.nodes():
if (G.degree(v) % 2 != 0):
eulerian = False
break
if (eulerian == True):
print("2-3) Graph is Eulerian")
else:
print("2-3) Graph is NOT Eulerian")

```
```

As indicated by the previous task, graph theory and graph algorithms have been research foci for quite some time. In comparison, complex networks and their analysis have become a focus of investigation only recently. The reason for our interest in complex networks is their similarity to many real-world phenomena such as social interactions, web graphs, food webs, protein interactions and so forth.

But what makes a network complex or not complex?

Well, complex networks have 'non-trivial' topological features. Let us explore this statement in more detail with the help of data. To this end, we look at a social network, a technical mesh and an Erdös-Renyi random graph.

```
In [11]:
```# Load/generate 3 graphs of different types
mit8 = readGraph("input/MIT8.edgelist", Format.EdgeListTabZero)
airf1 = readGraph("input/airfoil1.graph", Format.METIS)
gen = generators.ErdosRenyiGenerator(1000, 0.01)
er1000 = gen.generate()

Some context on these networks is given below, first for MIT8. It stems from a larger collection of Facebook networks from the early days of the online social network. MI8 models

"Facebook friendships at 100 US universities at some time in 2005, as well as a number of node attributes such as dorm, gender, graduation year, and academic major. The data was apparently provided directly by Facebook. (...) It does not include the names of individual or even of any of the node attributes (they have been given integer ids)." http://sociograph.blogspot.de/2011/03/facebook100-data-and-parser-for-it.html

The airfoil1 graph is a mesh that stems from a two-dimensional numerical simulation where one is interested in the airflow around an airplane wing. Meshes are usually easy to visualize, two-dimensional meshes are even planar in most (if not all reasonable) cases. The picture below illustrates this. The colors signify different vertex blocks, which we can ignore here.

```
In [12]:
```from IPython.core.display import Image
Image('input/airfoil1-10p.png')

```
Out[12]:
```

Some types of realistic networks such as social ones usually have more than one connected component. However, even if there is more than one connected component, there is usually **only one big one**. Let us take a closer look at these *giant components* in particular, the differences of the three networks and other interesting properties in general.

Print the NetworKit overview for each of the three graphs!

**Answer:**For those graphs with more than one connected component, extract the largest connected component $C$ and continue working with $C$. Enter the code for this in the cell below this one.

What are the most striking topological differences between the three graphs in terms of the analytics kernels presented in the lecture?

**Answer:**Pick an arbitrary pair of vertices $(u, v)$ in each network. Since we work with connected graphs, the two nodes are connected. What is the smallest number of hops a virus needs to make in the network to reach $v$ if it starts at $u$? What if $u$ and $v$ are chosen as worst case pair?

**Answer:**What do you think: Why are complex networks more difficult to work with, what makes them 'complex'?

**Answer:**Which parts of your answer to (5) concern theoretical analysis, which concern computational aspects?

**Answer:**

```
In [16]:
```# Code for Q&A Session #3
# 3-1)
pf = profiling.Profile.create(airf1, preset="minimal")
pf.show()
pf = profiling.Profile.create(er1000, preset="minimal")
pf.show()
pf = profiling.Profile.create(mit8, preset="minimal")
pf.show()
# 3-2) extract largest connected component
mit8lc = workflows.extractLargestComponent(mit8)
pf = profiling.Profile.create(mit8lc, preset="minimal")
pf.show()
# for comparison:
# 3-3) Look at diameter, clustering coefficients, k-core decomposition, degree distribution
# 3-4)
for g in {mit8lc, airf1, er1000}:
u = g.randomNode()
v = g.randomNode()
if g.isWeighted():
print("weighted")
dijkstra = graph.Dijkstra(g, u)
dijkstra.run()
dists = dijkstra.getDistances()
else:
print("unweighted")
bfs = graph.BFS(g, u)
bfs.run()
dists = bfs.getDistances()
print("distance from ", u, " to ", v, ": ", dists[v])
print("Diameter of graph ", g.toString(), ": ", distance.Diameter(g))
# 3-5)
# Low diameter: $k$-hop exploration visits large parts of the graph even for small $k$, cache inefficiency
# Degree distribution: load imbalance in parallel computations; may require more flexible data structure
# Visualization: Traditional visualization tools often produce 'hairballs'

```
```