Cluster Analysis of ABIDE subjects

I have run the scores pipeline on guilimin (some subjects failed), and now I want to cluster the subjects by their stability maps, seed maps, and other available metrics.

Some of the functions I use here are part of a small package I have started in order to wrap up code that I reuse constantly. This makes it less straightforward to run this notebook on a different machine, but it makes the whole thing faster and cleaner for me. The package is located here.

Imports


In [1]:
import os
import sys
import nibabel as nib
import brainbox as bb
from matplotlib import pyplot as plt
import scipy.spatial.distance as dist
import scipy.cluster.hierarchy as clh
from IPython.display import clear_output

Paths


In [2]:
in_path = '/data1/abide/Out/Remote/some_failed/out'
out_path = '/data1/abide/Analysis/Remote/some_missing'

Selections


In [3]:
# Network to investigate - in the case of multi-network files
network = 8
# Scale to investigate - chooses the hierarchical cutoff
scale = 0
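
The comment above says that the scale chooses the hierarchical cutoff. How the package maps a scale to a cutoff is not shown in this notebook, so the following is only a rough sketch of one common way to cut a linkage tree into a fixed number of flat clusters with SciPy. The synthetic data and the name `n_clusters` are assumptions made for illustration.

```python
import numpy as np
import scipy.cluster.hierarchy as clh

rng = np.random.RandomState(0)
# Two well separated synthetic groups of 5 observations each (illustrative only)
data = np.vstack([rng.randn(5, 3), rng.randn(5, 3) + 10.0])

# Ward linkage computed directly on the observation vectors
tree = clh.linkage(data, method='ward')

# Cut the tree so that at most n_clusters flat clusters remain
n_clusters = 2  # hypothetical cutoff, not the notebook's `scale`
labels = clh.fcluster(tree, t=n_clusters, criterion='maxclust')
```

With `criterion='maxclust'`, the threshold `t` is read as the maximum number of clusters rather than as a distance.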

Grab the data


In [4]:
file_dict = bb.fileOps.grab_files(in_path, '.nii.gz')

In [5]:
matching = [s for s in file_dict['sub_name'] if "NYU" in s]

In [6]:
len(matching)


Out[6]:
960
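
The `matching` list is not used further down. If the goal is to restrict the clustering to one site, one possible approach is a boolean mask over the subject axis. This is a sketch that assumes the order of `file_dict['sub_name']` matches the columns of the loaded arrays, which this notebook does not confirm. The subject names and the array below are made up.

```python
import numpy as np

# Hypothetical subject names; the real entries of file_dict['sub_name'] may look different
sub_names = ['NYU_0001', 'KKI_0002', 'NYU_0003', 'UM_0004']
# Synthetic (features x subjects) array standing in for array_dict[metric]
feat = np.arange(8, dtype=float).reshape(2, 4)

# Boolean mask over subjects whose name contains the site label
mask = np.array(['NYU' in name for name in sub_names])
# Keep only the columns (subjects) that belong to the site
feat_nyu = feat[:, mask]
```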

In [7]:
array_dict = bb.dataOps.read_files(file_dict, network)


I found 6070 files to load.
 80.0 % done 107.08 seconds to go. This one took 0.0910949707031
We are done

In [6]:
metric = 'stability_maps'
feat = array_dict[metric]
eucl = dist.squareform(dist.pdist(feat.T, 'euclidean'))
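
`pdist` computes distances between the rows of its input, so the transpose above suggests that `feat` stores one subject per column (an assumption on my part). A small sketch with synthetic numbers shows the condensed vector that `pdist` returns and the square matrix that `squareform` builds from it.

```python
import numpy as np
import scipy.spatial.distance as dist

# Synthetic stand-in for array_dict[metric]: 4 features (rows) x 3 subjects (columns)
feat = np.array([[0.0, 3.0, 0.0],
                 [0.0, 4.0, 0.0],
                 [0.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]])

# Condensed form: one entry per subject pair, n * (n - 1) / 2 values in total
condensed = dist.pdist(feat.T, 'euclidean')
# Square form: symmetric subjects x subjects matrix with a zero diagonal
eucl = dist.squareform(condensed)
```

The first two subjects differ by the vector (3, 4, 0, 0), so `eucl[0, 1]` is 5.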

In [16]:
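# Make a quick figure: dendrogram on top, reordered distance matrix below
f = plt.figure(figsize=(8, 10))
ax1 = f.add_axes([0.3, 0.71, 0.6, 0.2])
# linkage expects a condensed distance vector; a square matrix would be
# read as observation vectors, so convert eucl back with squareform
Y = clh.linkage(dist.squareform(eucl), method='ward')
Z1 = clh.dendrogram(Y)
ax1.set_xticks([])
ax1.set_yticks([])

axm = f.add_axes([0.3, 0.1, 0.6, 0.6])
# Reorder rows and columns of the distance matrix by the dendrogram leaves
idx = Z1['leaves']
tmp = eucl[idx, :]
D = tmp[:, idx]

# Show negative distances so that similar subjects get high values
a = axm.matshow(-D, aspect='auto')
axm.set_xticks([])
axm.set_yticks([])
f.colorbar(a, orientation='horizontal')
f.suptitle('Stability Maps')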


Out[16]:
<matplotlib.text.Text at 0x145550d0>
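
SciPy's `linkage` reads a one-dimensional input as a condensed distance vector and a two-dimensional input as observation vectors. The sketch below, on synthetic points, builds the tree from the condensed distances and reorders the square matrix by the dendrogram leaves in one step with `np.ix_`. All data and variable names here are illustrative.

```python
import numpy as np
import scipy.spatial.distance as dist
import scipy.cluster.hierarchy as clh

rng = np.random.RandomState(1)
# Two synthetic groups of 4 points each, with the rows shuffled afterwards
points = np.vstack([rng.randn(4, 2), rng.randn(4, 2) + 8.0])
rng.shuffle(points)  # shuffles the rows in place

condensed = dist.pdist(points, 'euclidean')
square = dist.squareform(condensed)

# Build the tree from the condensed distances
tree = clh.linkage(condensed, method='ward')
# no_plot=True returns the leaf order without drawing anything
leaves = clh.dendrogram(tree, no_plot=True)['leaves']
# Reorder rows and columns together
reordered = square[np.ix_(leaves, leaves)]
```

Passing `points` itself to `linkage` gives the same tree, because the distances are then computed internally with the default Euclidean metric.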

In [ ]:
metric = 'rmap_part'
feat = array_dict[metric]
eucl = dist.squareform(dist.pdist(feat.T, 'euclidean'))

In [19]:
# Make a quick figure
f = plt.figure(figsize=(8, 10))
ax1 = f.add_axes([0.3, 0.71, 0.6, 0.2])
# linkage expects a condensed distance vector, not the square matrix
Y = clh.linkage(dist.squareform(eucl), method='ward')
Z1 = clh.dendrogram(Y)
ax1.set_xticks([])
ax1.set_yticks([])

axm =  f.add_axes([0.3, 0.1, 0.6, 0.6])
idx = Z1['leaves']
tmp = eucl[idx, :]
D = tmp[:, idx]

a = axm.matshow(-D, aspect='auto')
axm.set_xticks([])
axm.set_yticks([])
f.colorbar(a, orientation='horizontal')
f.suptitle('RMap Part')


Out[19]:
<matplotlib.text.Text at 0x16468390>

In [20]:
metric = 'rmap_cores'
feat = array_dict[metric]
eucl = dist.squareform(dist.pdist(feat.T, 'euclidean'))
# Make a quick figure
f = plt.figure(figsize=(8, 10))
ax1 = f.add_axes([0.3, 0.71, 0.6, 0.2])
# linkage expects a condensed distance vector, not the square matrix
Y = clh.linkage(dist.squareform(eucl), method='ward')
Z1 = clh.dendrogram(Y)
ax1.set_xticks([])
ax1.set_yticks([])

axm =  f.add_axes([0.3, 0.1, 0.6, 0.6])
idx = Z1['leaves']
tmp = eucl[idx, :]
D = tmp[:, idx]

a = axm.matshow(-D, aspect='auto')
axm.set_xticks([])
axm.set_yticks([])
f.colorbar(a, orientation='horizontal')
f.suptitle('RMap Cores')


Out[20]:
<matplotlib.text.Text at 0x1a53da10>

In [21]:
metric = 'dual_regression'
feat = array_dict[metric]
eucl = dist.squareform(dist.pdist(feat.T, 'euclidean'))
# Make a quick figure
f = plt.figure(figsize=(8, 10))
ax1 = f.add_axes([0.3, 0.71, 0.6, 0.2])
# linkage expects a condensed distance vector, not the square matrix
Y = clh.linkage(dist.squareform(eucl), method='ward')
Z1 = clh.dendrogram(Y)
ax1.set_xticks([])
ax1.set_yticks([])

axm =  f.add_axes([0.3, 0.1, 0.6, 0.6])
idx = Z1['leaves']
tmp = eucl[idx, :]
D = tmp[:, idx]

a = axm.matshow(-D, aspect='auto')
axm.set_xticks([])
axm.set_yticks([])
f.colorbar(a, orientation='horizontal')
f.suptitle('Dual Regression')


Out[21]:
<matplotlib.text.Text at 0x1c714110>
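
The four figure cells above repeat the same steps with a different metric and title. One way to avoid the repetition would be a small helper along these lines. The function name `plot_clustered_distance` is hypothetical and not part of brainbox, and the sketch assumes that `feat` is a (features x subjects) array as in the cells above.

```python
import numpy as np
import scipy.spatial.distance as dist
import scipy.cluster.hierarchy as clh
from matplotlib import pyplot as plt


def plot_clustered_distance(feat, title):
    """Hypothetical helper: dendrogram plus reordered distance matrix.

    feat is assumed to be a (features x subjects) array.
    Returns the figure and the leaf order of the dendrogram.
    """
    # Condensed distances for linkage, square form for the heat map
    condensed = dist.pdist(feat.T, 'euclidean')
    eucl = dist.squareform(condensed)

    f = plt.figure(figsize=(8, 10))
    ax1 = f.add_axes([0.3, 0.71, 0.6, 0.2])
    tree = clh.linkage(condensed, method='ward')
    dendro = clh.dendrogram(tree, ax=ax1)
    ax1.set_xticks([])
    ax1.set_yticks([])

    axm = f.add_axes([0.3, 0.1, 0.6, 0.6])
    # Reorder rows and columns by the dendrogram leaves
    idx = dendro['leaves']
    D = eucl[np.ix_(idx, idx)]
    # Negative distances so that similar subjects get high values
    img = axm.matshow(-D, aspect='auto')
    axm.set_xticks([])
    axm.set_yticks([])
    f.colorbar(img, ax=axm, orientation='horizontal')
    f.suptitle(title)
    return f, idx
```

In this notebook, a call could then look like `plot_clustered_distance(array_dict['stability_maps'], 'Stability Maps')`.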

In [ ]:
array_dict.keys()
