Abstract

This notebook compares candidate mechanism perturbation amplitude (CMPA) scores across multiple disease stages in Alzheimer's disease experiments, using the PyBEL web service.


In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import os
import seaborn as sns
from pandas.plotting import andrews_curves, parallel_coordinates
from sklearn.cluster import KMeans
from sklearn import preprocessing
import time

In [2]:
#%config InlineBackend.figure_format = 'svg'
%matplotlib inline

In [3]:
np.random.seed(5)

In [4]:
time.asctime()


Out[4]:
'Mon Feb 26 16:33:56 2018'

In [5]:
downloads = os.path.join(os.path.expanduser('~'), 'Downloads')

early = os.path.join(downloads, '1.csv')
moderate = os.path.join(downloads, '2.csv')
severe = os.path.join(downloads, '3.csv')

In [6]:
scaler = preprocessing.StandardScaler()

In [7]:
early_df = pd.read_csv(early).rename(index=str, columns={"avg": "EarlyAvg"})
moderate_df = pd.read_csv(moderate).rename(index=str, columns={"avg": "ModerateAvg"})
severe_df = pd.read_csv(severe).rename(index=str, columns={"avg": "SevereAvg"})

df = pd.concat(
    [
        early_df[['Namespace', 'Name', 'EarlyAvg']], 
        moderate_df['ModerateAvg'], 
        severe_df['SevereAvg']
    ], 
    axis=1
)

df = df[df['EarlyAvg'].notnull()]
df = df[df['EarlyAvg'] != 0]

cols = ['EarlyAvg', 'ModerateAvg', 'SevereAvg']

df.to_csv(os.path.join(os.path.expanduser('~'), 'Desktop', 'time_series_cmpa.csv'))
df.head()


Out[7]:
Namespace Name EarlyAvg ModerateAvg SevereAvg
1 GOBP response to oxidative stress 11.922756 6.904458 0.818689
3 GOBP mitochondrial calcium ion homeostasis -10.220753 -2.898852 -2.534374
4 GOBP electron transport chain -10.220753 -2.898852 -2.534374
5 GOBP calcium ion homeostasis 2.326490 10.082439 6.821336
6 GOBP microglial cell activation involved in immune ... 0.276019 -4.376121 -2.382717

In [8]:
sns.pairplot(df[['EarlyAvg', 'ModerateAvg', 'SevereAvg']])
plt.show()


Compute the Pearson correlation between mechanisms over the three-stage time series.


In [9]:
corr_df = df[cols].T.corr()
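To make the transposition explicit: `DataFrame.corr()` correlates columns, so transposing first yields a mechanism-by-mechanism matrix in which each entry is the Pearson correlation between two three-point profiles. A minimal sketch on made-up values follows (the row names `a`, `b`, `c` and all numbers are illustrative and not taken from the data).

```python
import numpy as np
import pandas as pd

cols = ['EarlyAvg', 'ModerateAvg', 'SevereAvg']

# Toy profiles (hypothetical values, for illustration only)
toy = pd.DataFrame(
    [[1.0, 2.0, 3.0],    # rising
     [2.0, 4.0, 6.0],    # rising, twice the amplitude
     [3.0, 2.0, 1.0]],   # falling
    index=['a', 'b', 'c'],
    columns=cols,
)

# corr() works column-wise, so transpose to correlate rows (mechanisms)
toy_corr = toy[cols].T.corr()

# The same coefficient computed directly with NumPy for one pair of rows
r_ab = np.corrcoef(toy.loc['a'], toy.loc['b'])[0, 1]
```

Pearson correlation ignores amplitude, so rows `a` and `b` correlate perfectly despite their different scales. With only three time points per mechanism, coefficients are easily pushed towards the extremes, which is worth keeping in mind when reading the heatmap.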

Hierarchical clustering of the correlation matrix reveals three general patterns of biological processes across disease progression.


In [10]:
cg = sns.clustermap(corr_df)
plt.setp(cg.ax_heatmap.yaxis.get_majorticklabels(), rotation=0)
plt.savefig(os.path.join(os.path.expanduser('~'), 'Desktop', 'time_series_clustering_ad.pdf'))
plt.show()
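The clustermap groups mechanisms visually. To obtain explicit group labels, the hierarchical clustering can be cut into a fixed number of flat clusters. The sketch below uses SciPy on toy data and assumes average linkage with Euclidean distance, which are the `clustermap` defaults; the choice of three clusters mirrors the observation above and is not derived from the data.

```python
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage

cols = ['EarlyAvg', 'ModerateAvg', 'SevereAvg']

# Toy profiles (hypothetical): two rising, two falling, two peaking mid-way
toy = pd.DataFrame(
    [[1, 2, 3], [2, 4, 7], [3, 2, 1], [8, 5, 1], [1, 5, 1], [2, 9, 3]],
    columns=cols, dtype=float,
)
toy_corr = toy.T.corr()

# Cluster the rows of the correlation matrix (average linkage, Euclidean)
Z = linkage(toy_corr.values, method='average', metric='euclidean')

# Cut the tree into at most three flat clusters
flat = fcluster(Z, t=3, criterion='maxclust')
```

Rows that share a value in `flat` fall under the same branch of the dendrogram, so the result can be attached to the frame as an extra column and compared with the k-means classes.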


Assign classes based on a simple k-means clustering.


In [11]:
km = KMeans(n_clusters=5)
km.fit(df[cols])

df['label'] = km.labels_
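K-means label numbers carry no meaning on their own and can change between runs, so it helps to inspect the centroids before interpreting a class. A small sketch on toy data follows; the values, `n_clusters=2`, `n_init=10` and `random_state=0` are illustrative choices, not the settings used in this notebook.

```python
import pandas as pd
from sklearn.cluster import KMeans

cols = ['EarlyAvg', 'ModerateAvg', 'SevereAvg']

# Toy profiles (hypothetical): three near zero, three strongly positive early
toy = pd.DataFrame(
    [[0.1, -0.2, 0.0], [0.3, 0.1, -0.1], [-0.2, 0.0, 0.2],
     [9.0, 3.0, 2.0], [10.0, 2.5, 3.0], [11.0, 3.5, 2.5]],
    columns=cols,
)

# Fix random_state so the label assignment is repeatable
km_toy = KMeans(n_clusters=2, n_init=10, random_state=0).fit(toy[cols])
toy['label'] = km_toy.labels_

# One row per class: the mean profile that defines it, plus the class size
centers = pd.DataFrame(km_toy.cluster_centers_, columns=cols)
centers['size'] = toy['label'].value_counts().sort_index().values
```

Reading `centers` row by row shows which label corresponds to which temporal pattern, independent of the order in which the labels happened to be assigned.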

Parallel coordinates immediately reveal groups of patterns in how each mechanism relates to disease progression. Andrews curves, which map each row to a finite Fourier series, can reveal further patterns in the frequency domain.


In [12]:
parallel_coordinates(df[['EarlyAvg', 'ModerateAvg', 'SevereAvg', 'label']], 'label')
plt.savefig(os.path.join(os.path.expanduser('~'), 'Desktop', 'time_series_pc.pdf'))
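The Andrews curves mentioned above are not drawn in this notebook. A minimal sketch of how they could be produced with the already-imported `pandas.plotting.andrews_curves` follows. Each row (x1, x2, x3) is mapped to the curve f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t). The sketch runs on toy data with made-up values and labels; with the notebook's own frame, the first argument would be the same column selection passed to `parallel_coordinates`.

```python
import matplotlib
matplotlib.use('Agg')  # only needed outside a notebook; omit with %matplotlib inline
import pandas as pd
from pandas.plotting import andrews_curves

# Toy profiles with class labels (hypothetical values, for illustration only)
toy = pd.DataFrame({
    'EarlyAvg':    [6.5, 7.0, -6.4, -10.2],
    'ModerateAvg': [1.7, 2.5, -1.7, -2.9],
    'SevereAvg':   [3.5, -1.8, -3.5, -2.5],
    'label':       [0, 0, 1, 1],
})

# One curve per row, coloured by the class column
ax = andrews_curves(toy, 'label')
ax.set_title('Andrews Curves (toy data)')
```

Rows with similar profiles produce curves that stay close together over the whole range of t, so well-separated classes appear as distinct bundles.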


Retry the whole analysis, but min-max normalize each column first.


In [13]:
df_norm = df[cols].apply(lambda x: (x - np.min(x)) / (np.max(x) - np.min(x)))

norm_corr_df = df_norm[cols].T.corr()
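The lambda above is column-wise min-max scaling to the range [0, 1]. As a cross-check, the same result can be sketched with scikit-learn's `MinMaxScaler`, which is not used in the original notebook (it only instantiates a `StandardScaler`). The toy values are made up.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

cols = ['EarlyAvg', 'ModerateAvg', 'SevereAvg']

# Toy scores (hypothetical values, for illustration only)
toy = pd.DataFrame(
    [[-10.0, 2.0, 0.5], [0.0, 4.0, 1.5], [30.0, -6.0, 2.5]],
    columns=cols,
)

# Manual min-max scaling, one column at a time
manual = toy[cols].apply(lambda x: (x - x.min()) / (x.max() - x.min()))

# Equivalent scaling with scikit-learn
scaled = pd.DataFrame(
    MinMaxScaler().fit_transform(toy[cols]),
    columns=cols,
    index=toy.index,
)

same = np.allclose(manual.values, scaled.values)
```

Because each column is scaled independently, the relative magnitudes across stages within a single mechanism are not preserved, so row-wise correlations computed on the normalized frame can differ from those on the raw scores.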

In [14]:
sns.pairplot(df_norm[['EarlyAvg', 'ModerateAvg', 'SevereAvg']])
plt.show()



In [16]:
#cg = sns.clustermap(norm_corr_df)
#plt.setp(cg.ax_heatmap.yaxis.get_majorticklabels(), rotation=0)
#plt.show()

In [17]:
km_norm = KMeans(n_clusters=6)
# Fit on the normalized data rather than the raw scores
km_norm.fit(df_norm[cols])

df_norm['label'] = km_norm.labels_

In [18]:
plt.title('Parallel Coordinates on Normalized Data')
parallel_coordinates(df_norm, 'label')
plt.show()


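In the run shown below, the largest class (label 3) contains candidate mechanisms whose CMPA scores are mostly small in magnitude and change little over time. Because k-means label numbers are arbitrary, the index of this class may differ between runs. The members of each class are enumerated below.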


In [19]:
df[df['label'] == 0]


Out[19]:
Namespace Name EarlyAvg ModerateAvg SevereAvg label
5 GOBP calcium ion homeostasis 2.326490 10.082439 6.821336 0
12 GOBP memory -0.808335 4.813867 7.480508 0
43 GOBP cell proliferation 0.896167 2.478145 1.950688 0
45 GOBP Notch signaling pathway -1.822177 4.178887 8.694139 0
57 GOBP calcium ion import 4.482384 0.409248 3.796884 0
64 GOBP zinc ion homeostasis 0.248634 -0.319183 5.076812 0
68 GOBP generation of neurons 0.564985 7.706614 5.794660 0
74 GOBP glucose metabolic process -0.608376 0.656481 6.756742 0
78 GOBP regulation of synaptic plasticity -1.059119 5.082835 3.217849 0
96 GOBP chemokine biosynthetic process 1.645633 2.851218 1.701339 0
112 GOBP synapse assembly 2.657504 3.908540 1.516200 0
115 GOBP regulation of dendritic cell dendrite assembly 0.383572 1.839345 2.373630 0
125 GOBP positive regulation of synaptic plasticity 2.328523 3.033854 3.471117 0
126 GOBP cAMP-mediated signaling 0.248634 -0.319183 5.076812 0
132 MESHPP Insulin Resistance 3.282943 5.047754 3.314902 0
133 GOBP response to dexamethasone 0.523132 7.729366 5.828308 0
136 GOBP regulation of catalytic activity -1.810759 0.378933 6.071634 0
186 GOBP learning -0.488665 7.753769 3.582260 0
187 GOBP negative regulation of microtubule depolymeriz... 0.564985 7.706614 5.794660 0
240 GOBP negative regulation of synaptic transmission, ... 0.248634 -0.319183 5.076812 0
257 GOBP leukocyte mediated immunity 0.248634 -0.319183 5.076812 0
269 GOBP glucocorticoid receptor signaling pathway 0.248634 -0.319183 5.076812 0
275 GOBP amyloid fibril formation -0.103646 2.756664 2.765434 0
322 GOBP energy reserve metabolic process 3.658878 -0.806825 7.737790 0

In [20]:
df[df['label'] == 1]


Out[20]:
Namespace Name EarlyAvg ModerateAvg SevereAvg label
3 GOBP mitochondrial calcium ion homeostasis -10.220753 -2.898852 -2.534374 1
4 GOBP electron transport chain -10.220753 -2.898852 -2.534374 1
13 GOBP regulation of synaptic transmission, cholinergic -6.319755 -1.844384 -3.837040 1
22 GOBP cognition -13.098864 -2.769259 -2.702428 1
33 GOBP mitochondria-nucleus signaling pathway -5.657756 -1.472886 2.661210 1
34 GOBP mitochondrion organization -6.785946 -2.283427 1.524850 1
35 GOBP mitochondrial transport -6.450751 -1.657418 -3.457478 1
36 GOBP protein import into mitochondrial matrix -10.220753 -2.898852 -2.534374 1
60 GOBP reelin-mediated signaling pathway -6.596265 -1.350615 -3.165148 1
67 GOBP positive regulation of neuron apoptotic process -12.844473 -19.054702 -0.414471 1
88 GOBP regulation of synaptic activity -4.746485 -3.727622 -5.885100 1
90 GOBP sodium ion homeostasis -6.450751 -1.657418 -3.457478 1
191 GOBP neuron development -15.301905 2.727764 1.894956 1
210 GOBP blood circulation -6.138729 -2.983396 0.207620 1
217 GOBP nitric oxide-cGMP-mediated signaling pathway -6.450751 -1.657418 -3.457478 1
358 GOBP positive regulation of synaptic transmission, ... -6.450751 -1.657418 -3.457478 1

In [21]:
df[df['label'] == 2]


Out[21]:
Namespace Name EarlyAvg ModerateAvg SevereAvg label
1 GOBP response to oxidative stress 11.922756 6.904458 0.818689 2
7 GOBP inflammatory response 8.303427 7.448499 7.812735 2
8 GOBP reactive oxygen species metabolic process 10.220753 2.898852 2.534374 2
9 GOBP neuron apoptotic process 8.260572 -5.967573 5.291992 2
11 GOBP cell death 30.477614 14.481597 14.626048 2
19 MESHPP Lipid Peroxidation 12.901503 3.314835 6.914957 2
32 GOBP response to reactive oxygen species 6.450751 1.657418 3.457478 2
54 GOBP ERK1 and ERK2 cascade 6.273226 0.171814 2.436955 2
62 GOBP negative regulation of calcium-mediated signaling 7.957573 -4.954710 4.877306 2
77 GOBP neuron projection development 10.665303 2.719991 -2.536432 2
80 GOBP long term synaptic depression 6.988615 -0.175062 3.212686 2
97 GOBP chronic inflammatory response 8.345290 -2.137638 -3.575626 2
113 GOBP regulation of neuronal synaptic plasticity 6.032889 0.767076 3.318471 2
116 GOBP brain development 7.968004 1.583114 5.738014 2
117 GOBP maintenance of permeability of blood-brain bar... 5.309949 2.006059 1.902776 2
118 GOBP copper ion export 6.450751 1.657418 3.457478 2
137 MESHPP Oxidative Stress 14.865581 6.765523 12.177025 2
184 GOBP neurogenesis 14.324659 4.749923 2.662253 2
216 GOBP vasoconstriction 10.981457 2.183312 6.768618 2
218 GOBP leukotriene production involved in inflammator... 6.450751 1.657418 3.457478 2
319 GOBP hormone secretion 6.450751 1.657418 3.457478 2
323 GOBP negative regulation of neuron apoptotic process 10.862169 -1.532971 4.019595 2
324 GOBP positive regulation of calcium-mediated signaling 6.450751 1.657418 3.457478 2
332 GOBP cell-matrix adhesion 11.310004 3.724304 -2.769315 2
336 GOBP synaptic transmission, glutamatergic 7.540003 2.482869 -1.846210 2
344 GOBP DNA biosynthetic process 7.438875 2.396486 -1.932782 2
354 PMIBP Neurotoxicity 6.450751 1.657418 3.457478 2
381 GOBP glutamate secretion 6.450751 1.657418 3.457478 2

In [22]:
df[df['label'] == 3]


Out[22]:
Namespace Name EarlyAvg ModerateAvg SevereAvg label
6 GOBP microglial cell activation involved in immune ... 0.276019 -4.376121 -2.382717 3
10 GOBP apoptotic process 1.056038 4.296424 -7.694186 3
14 GOBP neuronal signal transduction -1.066868 -2.614872 -2.639730 3
15 GOBP response to endoplasmic reticulum stress -0.730211 -0.553103 -5.064365 3
16 GOBP response to unfolded protein 1.567032 0.929631 -0.787240 3
17 GOBP endoplasmic reticulum calcium ion homeostasis 1.761505 2.375824 1.026676 3
18 GOBP regulation of ryanodine-sensitive calcium-rele... -3.740118 -1.692425 1.193665 3
21 GOBP cholesterol homeostasis 0.536060 0.432731 1.045690 3
23 GOBP neuron death -1.181309 -5.315558 3.885110 3
24 GOBP positive regulation of glial cell apoptotic pr... 1.372896 -2.790113 -1.101693 3
25 GOBP positive regulation of synaptic transmission, ... 1.788065 -1.173172 -1.033474 3
28 GOBP autophagy 1.505998 -0.771143 -0.146501 3
29 GOBP regulation of actin filament binding -0.106042 -0.076021 0.274029 3
30 GOBP regulation of mitochondrial membrane potential -0.053021 -0.038010 0.137015 3
31 GOBP clathrin-dependent endocytosis -2.545447 0.157776 -1.181924 3
37 GOBP lipid oxidation 2.657761 1.147501 0.497833 3
38 GOBP mitochondrion disassembly 2.228225 1.165980 0.537949 3
40 GOBP cell cycle 0.060439 1.115912 0.801196 3
41 GOBP cell communication -1.316057 -0.628754 -0.319455 3
48 GOBP calcium-dependent cell-matrix adhesion 0.201542 -0.022169 0.002663 3
50 GOBP regulation of cell-cell adhesion 0.201542 -0.022169 0.002663 3
51 GOBP immune response to tumor cell 0.101018 0.848721 0.502803 3
53 GOBP regulation of cell cycle -0.127760 0.920557 -0.427034 3
56 GOBP neuron migration 0.106042 0.076021 -0.274029 3
59 GOBP regulation of long-term synaptic potentiation -0.575578 -0.620103 1.084054 3
63 GOBP negative regulation of neuron projection regen... 6.086485 -6.156048 -2.239952 3
69 GOBP positive regulation of protein neddylation -0.194863 -0.562894 -0.578285 3
71 GOBP protein catabolic process -0.194863 -0.562894 -0.578285 3
72 GOBP cell redox homeostasis -0.369230 -0.869931 -0.786539 3
73 GOBP astrocyte activation 1.991491 -3.801164 -2.983187 3
... ... ... ... ... ... ...
357 GOBP lactic acid secretion 0.978040 1.195253 0.724081 3
359 GOBP cytokine production involved in inflammatory r... -0.248634 0.319183 -5.076812 3
360 GOBP axonogenesis -0.978040 -1.195253 -0.724081 3
361 GOBP T cell tolerance induction -0.082593 -1.023411 -0.390744 3
363 GOBP JAK-STAT cascade 0.547550 0.114759 -1.066254 3
368 GOBP cytokine production 0.643670 0.988891 0.387738 3
369 GOBP oxidative phosphorylation -0.354541 0.013414 -0.139847 3
371 PTS AKT signaling 0.100065 -0.258637 -0.112185 3
372 PTS ERK cascade 0.100065 -0.258637 -0.112185 3
374 GOBP regulation of phagocytosis 0.101018 0.848721 0.502803 3
375 GOBP negative regulation of phagocytosis 0.803265 0.776024 0.587589 3
376 GOBP cytokine secretion 0.566575 0.807244 0.474455 3
377 GOBP regulation of dendritic cell antigen processin... 0.202036 1.697443 1.005606 3
378 MESHPP Mitochondrial Swelling 0.133062 1.139162 -2.575527 3
388 GOBP actin polymerization or depolymerization -0.048206 0.022424 0.155939 3
392 GOBP viral entry into host cell 0.449980 -0.038318 -0.219236 3
393 GOBP positive regulation of viral entry into host cell 0.449980 -0.038318 -0.219236 3
394 MESHPP Cell Transformation, Viral 0.449980 -0.038318 -0.219236 3
395 MESHPP Virus Internalization 0.449980 -0.038318 -0.219236 3
400 GOBP regulation of cAMP biosynthetic process 0.468332 -0.710783 -0.214147 3
403 PTS Regulation of Wnt signaling pathway 0.729428 -0.122743 0.724946 3
405 PTS p53 signaling pathway 0.675989 0.624059 -0.528864 3
407 GOBP cell cycle G1/S phase transition 0.877882 -0.678267 -0.318032 3
408 GOBP SNARE complex assembly 0.413223 0.285728 0.568462 3
411 GOBP focal adhesion assembly -0.007247 -1.279725 -0.173218 3
412 GOBP reactive oxygen species biosynthetic process -0.054095 0.382340 0.030319 3
422 GOBP positive regulation of protein serine/threonin... 0.714148 -0.650780 -0.254970 3
425 GOBP monocyte activation 0.108747 0.079412 -0.420922 3
431 GOBP tumor necrosis factor secretion 0.063234 0.275451 0.118232 3
433 GOBP cell activation 0.224645 -0.624579 -1.758995 3

178 rows × 6 columns


In [23]:
df[df['label'] == 4]


Out[23]:
Namespace Name EarlyAvg ModerateAvg SevereAvg label
98 GOBP production of molecular mediator involved in i... -3.234381 -15.750005 -14.480265 4
340 GOBP glial cell differentiation 8.239913 -19.845326 -31.309422 4

Conclusions

Patients measured at multiple time points could be temporally aligned based on these patterns, which may also help identify disease subtypes.