Global network visualisation tutorial

In the introductory_tutorial we ran through building structural covariance network analyses using scona🍪.

In this tutorial we'll cover some of the visualisation tools that communicate global network measures from your results.

Get set up

Import the modules you need


In [1]:
import scona as scn
import scona.datasets as datasets
import numpy as np
import networkx as nx
import pandas as pd
from IPython.display import display

import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

%load_ext autoreload
%autoreload 2

Read in the data, build a network and calculate the network metrics

If you're not sure about this step, please check out the introductory_tutorial notebook for more explanation.


In [2]:
# Read in sample data from the NSPN WhitakerVertes PNAS 2016 paper.
df, names, covars, centroids = datasets.NSPN_WhitakerVertes_PNAS2016.import_data()

# calculate residuals of the matrix df for the columns of names
df_res = scn.create_residuals_df(df, names, covars)

# create a correlation matrix over the columns of df_res
M = scn.create_corrmat(df_res, method='pearson')

# Initialise a weighted graph G from the correlation matrix M
G = scn.BrainNetwork(network=M, parcellation=names, centroids=centroids)

# Threshold G at cost 10 to create a binary graph with 10% as many edges as the complete graph G.
G10 = G.threshold(10)

# Create a GraphBundle object that contains the G10 graph called "real_graph"
bundleGraphs = scn.GraphBundle([G10], ["real_graph"])

# Add ten random graphs to this bundle
# (In real life you'd want more than 10 random graphs,
# but this step can take quite a long time to run so 
# for the demo we just create 10)
bundleGraphs.create_random_graphs("real_graph", 10)


        Creating 10 random graphs - may take a little while
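
To make "cost 10" more concrete, here is a minimal sketch of cost thresholding in plain Python. The helper keep_strongest_edges and the toy weights are made up for illustration. scona's own threshold method may do more than this (for example, take extra steps to keep the graph connected), so treat it only as a picture of what the percentage means.

```python
def keep_strongest_edges(weights, cost):
    """Keep the strongest `cost` percent of all possible edges.

    `weights` maps an (i, j) node pair to its correlation value.
    """
    n_keep = int(len(weights) * cost / 100)
    ranked = sorted(weights, key=weights.get, reverse=True)
    return set(ranked[:n_keep])


# Hypothetical 5-node example: 10 possible edges with made-up weights.
weights = {
    (0, 1): 0.9, (0, 2): 0.1, (0, 3): 0.4, (0, 4): 0.2, (1, 2): 0.8,
    (1, 3): 0.3, (1, 4): 0.05, (2, 3): 0.7, (2, 4): 0.6, (3, 4): 0.5,
}

# At cost 30 we keep 3 of the 10 possible edges, as an unweighted edge set.
kept = keep_strongest_edges(weights, cost=30)
print(sorted(kept))
```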

Visualise the degree distribution: plot_degree

The degree of each node is the number of edges adjacent to the node. For example if a node is connected to four other nodes then its degree is 4. If it is connected to 50 other nodes, its degree is 50.
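
As a minimal sketch of that definition (plain Python on a made-up edge list, not scona code), counting how often each node appears as an edge endpoint gives its degree:

```python
from collections import Counter

# Hypothetical toy graph: each tuple is one undirected edge.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]

# Every edge adds 1 to the degree of both of its endpoints.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# Node 0 is connected to four other nodes, so its degree is 4.
print(dict(degree))
```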

Brain networks are usually "scale-free" which means that their degree distribution follows a power law. You can think of them as having a "heavy tail": there are a small number of nodes that have a large number of connections.

This is in contrast to, for example, an Erdős–Rényi graph, where each pair of nodes is connected with the same fixed probability, independently of every other pair. This graph is often called a binomial graph because each possible connection is a yes-no draw, so the node degrees follow a binomial distribution.
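
The sketch below builds such a random graph by hand, using only the standard library. The function name erdos_renyi_degrees and the values of n_nodes and p are illustrative assumptions, not scona settings. In a graph like this the degrees cluster around (n_nodes - 1) * p and there is no heavy tail of highly connected nodes.

```python
import random


def erdos_renyi_degrees(n_nodes, p, seed=0):
    """Return the degree of each node in a G(n, p) random graph."""
    rng = random.Random(seed)
    degrees = [0] * n_nodes
    for u in range(n_nodes):
        for v in range(u + 1, n_nodes):
            # Each possible edge is an independent yes/no draw.
            if rng.random() < p:
                degrees[u] += 1
                degrees[v] += 1
    return degrees


n_nodes, p = 300, 0.1  # illustrative values only
degrees = erdos_renyi_degrees(n_nodes, p)
mean_degree = sum(degrees) / n_nodes

print("expected mean degree:", (n_nodes - 1) * p)
print("observed mean degree:", round(mean_degree, 2))
print("largest degree:", max(degrees))
```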

One of the first things to check for the structural covariance network analysis with scona is that our degree distribution shows this pattern.

Look at the data

The degree distribution is already saved in the G10 graph object. But we'll just spend a few moments showing how you can access that information.

You can make a dictionary of the node ids (the dictionary key) and their degree (the dictionary value).


In [3]:
degrees = dict(G10.degree())

# Print the degree of every 50th node to show what's inside this dictionary
for node_id, degree in list(degrees.items())[::50]:
    print('Node: {:3d} has degree = {:2d}'.format(node_id, degree))


Node:   0 has degree = 47
Node:  50 has degree = 16
Node: 100 has degree = 80
Node: 150 has degree =  9
Node: 200 has degree = 11
Node: 250 has degree = 37
Node: 300 has degree = 25

You can see the information for a specific node from the graph itself.

Note, though, that the degree still needs to be calculated. It hasn't been added to the nodal attributes yet.


In [4]:
# Display the nodal attributes
G10.nodes[150]


Out[4]:
{'name': 'lh_insula_part3',
 'x': -37.400137000000001,
 'y': -8.5937070000000002,
 'z': 4.4363890000000001,
 'centroids': array([-37.400137,  -8.593707,   4.436389])}

scona has a command for that. Let's go ahead and add the degree to the nodal attributes...along with a few other measures.


In [5]:
# Calculate nodal measures for graph G10
G10.calculate_nodal_measures()

# Display the nodal attributes
G10.nodes[150]


Out[5]:
{'name': 'lh_insula_part3',
 'x': -37.400137000000001,
 'y': -8.5937070000000002,
 'z': 4.4363890000000001,
 'centroids': array([-37.400137,  -8.593707,   4.436389]),
 'module': 1,
 'degree': 9,
 'closeness': 0.39308578745198464,
 'betweenness': 0.0011242664849842761,
 'shortest_path_length': 2.5357142857142856,
 'clustering': 0.5277777777777778,
 'participation_coefficient': 0.0}

Look at all that information!

We only want to visualise the degree distribution at the moment though.

Import the code you need: plot_degree_dist


In [6]:
# import the function to plot the degree distribution
from scona.visualisations import plot_degree_dist

Plot the degree distribution

We only need the BrainNetwork graph to plot the degree distribution.

Default settings

By default we add an Erdős–Rényi random graph that has the same number of nodes as our BrainNetwork Graph for comparison. The default colours are blue for the degree distribution of the real graph and a grey line for the random graph.


In [7]:
plot_degree_dist(G10)


Without the random graph

The random graph is a good sanity check that your degree distribution is not random...but it rather swamps the plot. So this example allows you to plot only the degree distribution of the real graph, without the random graph.


In [8]:
plot_degree_dist(G10, binomial_graph=False)


Save the plot

You can save this figure in any location.

You can do that by passing a file name and (optional) directory path to the figure_name option. If you don't set a directory path the figure will be saved in the local directory.

For this tutorial we'll save the output in a figures folder inside this tutorials directory.


In [9]:
plot_degree_dist(G10, binomial_graph=False, figure_name="figures/DegreeDistribution.png")


☝️ Did you see a warning message?

The code checks whether the directory that you want to save your figure to actually exists. If it doesn't, the code creates the directory, but gives you a little warning first so that it doesn't come as a surprise (for example, if you have tried to save your figure in the wrong place!).

We have the tutorials/figures directory specifically ignored in this project so we shouldn't ever see changes there.

Note that if you don't pass a file extension the file will be saved as a png by default.

If you add a file extension allowed by matplotlib (eg .jpg, .svg, .pdf etc) then the figure will be saved in that format.
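
The saving behaviour described above can be pictured with the following sketch. It is not scona's implementation: resolve_figure_path is a hypothetical helper written for this tutorial, and it only assumes the two rules given in the text (fall back to png, and warn before creating a missing directory).

```python
import warnings
from pathlib import Path


def resolve_figure_path(figure_name, default_suffix=".png"):
    """Hypothetical helper: prepare a file path the way the text describes."""
    path = Path(figure_name)
    if path.suffix == "":
        # No file extension given, so fall back to png.
        path = path.with_suffix(default_suffix)
    if not path.parent.exists():
        # Warn first, then create the missing directory.
        warnings.warn("{} does not exist; creating it.".format(path.parent))
        path.parent.mkdir(parents=True, exist_ok=True)
    return path


print(resolve_figure_path("DegreeDistribution"))
print(resolve_figure_path("DegreeDistribution.svg"))
```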

Change the colours

You can pass a pair of colours to the plot_degree_dist function.

The first colour is that of the histogram for the real graph.

The second colour is the line for the Erdős–Rényi graph.

In the example below, we've chosen red and black 🎨


In [10]:
plot_degree_dist(G10, color=["red", "black"])


Report the global measures of the graph: report_global_measures

One of the first things we want to know is how the global attributes of the network compare to those of random networks.

Specifically we'll calculate:

  • a: assortativity
  • C: clustering
  • E: efficiency
  • L: shortest path
  • M: modularity
  • sigma: small world

and plot a bar chart that compares the real network to the random graphs.

Calculate the global measures


In [11]:
# Calculate the global measures
bundleGraphs_measures = bundleGraphs.report_global_measures()

# Show the dataframe so we can see the measures
display(bundleGraphs_measures)


               assortativity  average_clustering  average_shortest_path_length  efficiency  modularity
real_graph          0.090769            0.449889                      2.376243    0.479840    0.382855
real_graph_R0      -0.088234            0.226946                      2.084902    0.519406    0.121712
real_graph_R1      -0.092617            0.247408                      2.079741    0.520277    0.124544
real_graph_R2      -0.087816            0.223323                      2.084056    0.519651    0.130772
real_graph_R3      -0.096873            0.230480                      2.085008    0.519368    0.123251
real_graph_R4      -0.079377            0.223141                      2.084310    0.519523    0.123564
real_graph_R5      -0.088001            0.233167                      2.078832    0.520325    0.128539
real_graph_R6      -0.093329            0.227506                      2.076399    0.520728    0.125078
real_graph_R7      -0.061999            0.227014                      2.097064    0.517494    0.129245
real_graph_R8      -0.080110            0.227883                      2.088434    0.518872    0.120861
real_graph_R9      -0.073402            0.234468                      2.088646    0.518867    0.116025
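
The table above has no column for the small world coefficient. As a rough worked example, one common definition is sigma = (C / C_rand) / (L / L_rand), where C_rand and L_rand are averages over the random graphs. That formula is an assumption here: scona's own calculation may differ in detail. The numbers are copied from the table.

```python
from statistics import mean

# Clustering (C) and shortest path length (L) of the real graph.
real_C, real_L = 0.449889, 2.376243

# The same two measures for the 10 random graphs.
random_C = [0.226946, 0.247408, 0.223323, 0.230480, 0.223141,
            0.233167, 0.227506, 0.227014, 0.227883, 0.234468]
random_L = [2.084902, 2.079741, 2.084056, 2.085008, 2.084310,
            2.078832, 2.076399, 2.097064, 2.088434, 2.088646]

# sigma > 1 means more clustering than random at a similar path length.
sigma = (real_C / mean(random_C)) / (real_L / mean(random_L))
print(round(sigma, 2))
```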

Now you have everything you need to plot the network measures of the BrainNetwork graph and compare them to the values obtained from the 10 random graphs stored inside the graph bundle bundleGraphs.

Import the code you need: plot_network_measures


In [12]:
# import the function to plot network measures
from scona.visualisations import plot_network_measures

Plot the measures

There are 2 required parameters for the plot_network_measures function:

  1. a GraphBundle object (e.g. bundleGraphs)
  2. the name of the real graph in your GraphBundle (e.g. "real_graph")

Default settings

The default colours are blue and grey, and by default the error bars show 95% confidence intervals.


In [13]:
plot_network_measures(bundleGraphs, real_network="real_graph")


Save the figure

You'll probably want to save the beautiful figure you've made!

You can do that by passing a file name and (optional) directory path to the figure_name option. If you don't set a directory path the figure will be saved in the local directory.

For this tutorial we'll save the output in a figures folder inside this tutorials directory.

For fun, we'll also adjust the colours to make the real network orange (#FF4400) and the random network turquoise (#00BBFF).


In [14]:
plot_network_measures(bundleGraphs, "real_graph",
                      figure_name="figures/NetworkMeasuresDemo",
                      color=["#FF4400", "#00BBFF"])


Hide the legend

You might not want to show the legend. That's fine!

We'll also use this example to save an svg file.


In [15]:
plot_network_measures(bundleGraphs, "real_graph",
                      figure_name="figures/NetworkMeasuresDemoNoLegend.svg",
                      show_legend=False)


Only show the real graph

You might not want to show the random graphs.

In this case you have to create a new graph bundle that only contains the real graph, and pass that to the plot_network_measures function.

For this example we've also changed the colour to green (to show off 😉).


In [16]:
# Create a new graph bundle
realBundle = scn.GraphBundle([G10], ["real_graph"])

plot_network_measures(realBundle, real_network="real_graph",
                      color=["green"])


Change the type of error bars

The variability of the measures obtained from the random graphs is, by default, shown as the 95% confidence interval.

They're calculated by bootstrapping the random graphs. There's more information in the seaborn documentation if you're curious.

But you don't have to calculate them. You can plot the standard deviations instead if you'd prefer. (These are a bit larger than the 95% confidence intervals so they're a bit easier to see in the plot below.)
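
To see the difference between the two kinds of error bar, here is a small standard-library sketch that bootstraps a 95% confidence interval for the mean and compares it with the standard deviation. It uses the average_clustering values of the 10 random graphs from the global measures table. The resampling scheme (2000 resamples, percentile interval) is an assumption for illustration and is not necessarily what the plotting code does internally.

```python
import random
from statistics import mean, stdev

# average_clustering of the 10 random graphs, from the global measures table.
values = [0.226946, 0.247408, 0.223323, 0.230480, 0.223141,
          0.233167, 0.227506, 0.227014, 0.227883, 0.234468]

n_boot = 2000
rng = random.Random(0)

# Resample the values with replacement and record the mean each time.
boot_means = sorted(
    mean(rng.choices(values, k=len(values))) for _ in range(n_boot)
)

# The middle 95% of the bootstrap means is the confidence interval.
ci_low = boot_means[n_boot * 25 // 1000]
ci_high = boot_means[n_boot * 975 // 1000 - 1]

print("mean:", round(mean(values), 4))
print("95% CI:", round(ci_low, 4), "to", round(ci_high, 4))
print("standard deviation:", round(stdev(values), 4))
```

The confidence interval describes how well the mean is pinned down, so it shrinks as you add random graphs, whereas the standard deviation describes the spread of the individual values.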


In [17]:
plot_network_measures(bundleGraphs,
                      real_network="real_graph",
                      ci="sd")


Alternatively you could show the 99% confidence interval.


In [18]:
plot_network_measures(bundleGraphs, real_network="real_graph",
                      ci=99)


Run with 100 random graphs

You can't publish results with 10 random graphs. These don't give meaningful variations. So let's add 90 more random graphs.

(This still isn't enough, but much better than 10! We'd recommend that you run 1000 random graphs for publication quality results.)

This takes some time (around 5 minutes) so the cell below is commented out by default. Remove the # at the start of each of the lines below to run the commands yourself.


In [19]:
#bundleGraphs.create_random_graphs("real_graph", 90)
#print (len(bundleGraphs))

Congratulations! 🎉

You created an additional 90 random graphs, giving you a total of 100 random graphs and 1 real graph, and you managed to answer some of your emails while waiting.

Here's a beautiful plot of your network measures with 95% confidence intervals...which you can't see because the random networks are all so similar to each other 🤦


In [20]:
#plot_network_measures(bundleGraphs, real_network="real_graph")

Plot the rich club coefficient: plot_rich_club

The plot_rich_club function plots the rich club values per degree alongside the rich club values of random networks that preserve the degree distribution of the real network.

The function requires a GraphBundle object, which is scona's way of handling comparisons across networks. Essentially it is a dictionary with the names of the BrainNetwork objects (strings) as keys and the BrainNetwork objects themselves as values.

You also need to pass the name of the real graph in the GraphBundle (e.g. "Real_Graph") as a string.
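
If the rich club coefficient is new to you, the sketch below shows the usual unnormalised definition: for a degree k, take only the nodes with degree greater than k and ask what fraction of the possible edges between them actually exist. This is a plain Python illustration on a made-up graph, and the function name is hypothetical. scona's own calculation may differ in detail.

```python
def rich_club_coefficient(edges, k):
    """Density of the subgraph made of nodes with degree greater than k."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    rich = {node for node, d in degree.items() if d > k}
    if len(rich) < 2:
        return None  # undefined with fewer than two rich nodes

    n_rich_edges = sum(1 for u, v in edges if u in rich and v in rich)
    return 2 * n_rich_edges / (len(rich) * (len(rich) - 1))


# Hypothetical toy graph: nodes 0, 1 and 2 form a tightly linked core,
# and each of them has one extra neighbour of degree 1.
edges = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 4), (2, 5)]

print(rich_club_coefficient(edges, k=0))  # all six nodes count as "rich"
print(rich_club_coefficient(edges, k=1))  # only the core nodes remain
```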

Let's create input for the function


In [2]:
# Assumption: H is the thresholded BrainNetwork G10 built earlier in this tutorial
H = G10
# Instantiate the GraphBundle object with the BrainNetwork graph and its name
bundleGraphs = scn.GraphBundle([H], ["Real_Graph"])

This creates a dictionary-like object with BrainNetwork H keyed by 'Real_Graph'.


In [3]:
bundleGraphs


Out[3]:
{'Real_Graph': <scona.classes.BrainNetwork at 0x7fa66f8ce160>}

Now add a series of random graphs created by edge swap randomisation of H (keyed by 'Real_Graph').

The create_random_graphs method of the GraphBundle class takes in a real network (in our case Real_Graph) and creates a number (10 in the example below) of random graphs. The output is a dictionary of all these graphs.
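
Edge swap randomisation can be sketched in a few lines of plain Python. The idea is to pick two edges a-b and c-d and rewire them to a-d and c-b, which changes who is connected to whom but leaves every node's degree untouched. The functions below are hypothetical illustrations, not scona's code. The real create_random_graphs method may differ, for example in how many swaps it makes or in whether it checks that the graph stays connected.

```python
import random


def double_edge_swap(edges, n_swaps, seed=0):
    """Rewire an undirected edge list while keeping every node's degree."""
    rng = random.Random(seed)
    edges = [tuple(sorted(e)) for e in edges]
    present = set(edges)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        # Proposed rewiring: a-b, c-d  ->  a-d, c-b
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if len({a, b, c, d}) < 4 or new1 in present or new2 in present:
            continue  # would create a self-loop or a duplicate edge
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
        done += 1
    return edges


def degree_sequence(edge_list):
    """Return a sorted list of (node, degree) pairs."""
    counts = {}
    for u, v in edge_list:
        counts[u] = counts.get(u, 0) + 1
        counts[v] = counts.get(v, 0) + 1
    return sorted(counts.items())


# Hypothetical toy graph: a ring of 8 nodes, so every node has degree 2.
ring = [(i, (i + 1) % 8) for i in range(8)]
rewired = double_edge_swap(ring, n_swaps=10)

# The degrees are unchanged even though the edges have moved.
print(degree_sequence(ring) == degree_sequence(rewired))
```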


In [4]:
# Note that 10 is not usually a sufficient number of random graphs to do meaningful analysis,
# it is used here for time considerations
bundleGraphs.create_random_graphs("Real_Graph", 10)


        Creating 10 random graphs - may take a little while

In [5]:
bundleGraphs


Out[5]:
{'Real_Graph': <scona.classes.BrainNetwork at 0x7fa66f8ce160>,
 'Real_Graph_R0': <scona.classes.BrainNetwork at 0x7fa66cc893c8>,
 'Real_Graph_R1': <scona.classes.BrainNetwork at 0x7fa6a26c6b70>,
 'Real_Graph_R2': <scona.classes.BrainNetwork at 0x7fa68f7a0d30>,
 'Real_Graph_R3': <scona.classes.BrainNetwork at 0x7fa68f7a0e10>,
 'Real_Graph_R4': <scona.classes.BrainNetwork at 0x7fa68f7a0b00>,
 'Real_Graph_R5': <scona.classes.BrainNetwork at 0x7fa66cc89400>,
 'Real_Graph_R6': <scona.classes.BrainNetwork at 0x7fa66cc894a8>,
 'Real_Graph_R7': <scona.classes.BrainNetwork at 0x7fa66cc89438>,
 'Real_Graph_R8': <scona.classes.BrainNetwork at 0x7fa66cc89390>,
 'Real_Graph_R9': <scona.classes.BrainNetwork at 0x7fa66cc89358>}

Well done! The required input is ready: a GraphBundle that contains the real network, keyed by "Real_Graph", and 10 random graphs. Now let's plot the rich club coefficient values of our BrainNetwork graph and compare them to the values obtained from the 10 random graphs stored inside the GraphBundle.


In [6]:
# import the function to plot rich club values
from scona.visualisations import plot_rich_club

In [7]:
# plot the figure and display without saving to a file
plot_rich_club(bundleGraphs, real_network="Real_Graph")



In [8]:
# show rich club values for degrees from 55 to 65
rich_club_df = bundleGraphs.report_rich_club()
rich_club_df.iloc[55:66, :]


Out[8]:
    Real_Graph  Real_Graph_R0  Real_Graph_R1  Real_Graph_R2  Real_Graph_R3  Real_Graph_R4  Real_Graph_R5  Real_Graph_R6  Real_Graph_R7  Real_Graph_R8  Real_Graph_R9
55    0.566783       0.500581       0.494774       0.472706       0.500581       0.480836       0.486643       0.479675       0.477352       0.499419       0.501742
56    0.574390       0.509756       0.501220       0.473171       0.501220       0.482927       0.485366       0.481707       0.480488       0.502439       0.508537
57    0.578205       0.510256       0.511538       0.480769       0.510256       0.488462       0.487179       0.491026       0.488462       0.503846       0.515385
58    0.599099       0.522523       0.513514       0.483483       0.533033       0.490991       0.496997       0.496997       0.490991       0.513514       0.518018
59    0.606723       0.534454       0.527731       0.494118       0.546218       0.500840       0.500840       0.500840       0.505882       0.529412       0.526050
60    0.606723       0.534454       0.527731       0.494118       0.546218       0.500840       0.500840       0.500840       0.505882       0.529412       0.526050
61    0.615054       0.552688       0.535484       0.509677       0.556989       0.505376       0.518280       0.513978       0.522581       0.537634       0.556989
62    0.625287       0.560920       0.547126       0.512644       0.560920       0.503448       0.521839       0.528736       0.519540       0.542529       0.558621
63    0.652422       0.564103       0.561254       0.535613       0.561254       0.521368       0.532764       0.541311       0.532764       0.566952       0.572650
64    0.653333       0.580000       0.570000       0.536667       0.560000       0.543333       0.540000       0.546667       0.540000       0.573333       0.583333
65    0.663043       0.586957       0.583333       0.547101       0.561594       0.550725       0.539855       0.547101       0.536232       0.576087       0.594203

More examples of plotting rich club values:

  • save the figure in the current directory (the one this notebook or Python file is running from) and set different colours (#FF4400 for the real values, #00BBFF for the random values).

In [9]:
plot_rich_club(bundleGraphs, real_network="Real_Graph",
               figure_name="Rich_club_values",
               color=["#FF4400", "#00BBFF"])


  • save the figure to the location given by figure_name, without the legend.

Pass your own location (the path to the file) to figure_name in order to save the figure.

Note: if the location does not exist, we will warn you and try to create the necessary directories automatically.


In [10]:
plot_rich_club(bundleGraphs, real_network="Real_Graph", figure_name="/home/pilot/GSoC/mynewdir1/Rich_Club_Values", 
               show_legend=False)


/home/pilot/anaconda3/lib/python3.6/site-packages/scona/helpers.py:25: UserWarning: The path - /home/pilot/GSoC/mynewdir1/Rich_Club_Values does not exist. But we will create this directory for you and store the figure there.
  "directory for you and store the figure there.".format(path_name))

  • plot the rich club values for the real network only (the BrainNetwork graph) and set the colour to green.

To do this, simply create a GraphBundle that contains no random graphs.


In [11]:
realGraph = scn.GraphBundle([H], ["Real_Graph"])

In [12]:
realGraph


Out[12]:
{'Real_Graph': <scona.classes.BrainNetwork at 0x7fa66f8ce160>}

In [14]:
plot_rich_club(realGraph, real_network="Real_Graph", color=["green"])




