BrainHacks: PCA and K-means

The first part of the code applies PCA to the data. This particular dataset was masked with the "Harvox Heschls" mask, and the data were z-scored before PCA.
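The z-scoring step is not shown in this notebook. As a minimal sketch of what it typically involves, assuming rows are samples and columns are voxels (the original orientation is not stated), each column is centered and scaled to unit variance. The function name `zscore_columns` and the synthetic array are hypothetical stand-ins, not the code or data used here.

```python
import numpy as np

def zscore_columns(x):
    """Z-score each column of a 2-D array: zero mean, unit variance."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    # Guard against constant columns, which would otherwise divide by zero.
    std[std == 0] = 1.0
    return (x - mean) / std

# Synthetic stand-in for the masked data (rows = samples, columns = voxels).
rng = np.random.default_rng(0)
demo = rng.normal(loc=5.0, scale=2.0, size=(20, 4))
z = zscore_columns(demo)
```

PCA centers the data itself, but scaling each column beforehand keeps high-variance voxels from dominating the components.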


In [3]:
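#PCA
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA  # assumed: scikit-learn's PCA

# Load the masked, z-scored data
d = np.load("dataMask2.npy")
# Fit PCA and project the data onto the first two principal components
pca = PCA(n_components=2)
dpca = pca.fit_transform(d)

plt.scatter(dpca[:,0], dpca[:,1], marker='o', color='b')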


Out[3]:
<matplotlib.collections.PathCollection at 0x14a42128>
  • Using only two components, we can distinguish five well-separated clusters in the scatter plot.
    The next step is to see which covariates correspond to these clusters.
  • We applied k-means with two clusters to see whether they match the musician/non-musician covariate.
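One way to check the visual impression of five clusters is to compare silhouette scores across several values of k. The sketch below is an illustration only: it uses synthetic points in place of `dpca`, and scikit-learn's `KMeans` and `silhouette_score`, which may differ from the k-means routine used in this notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the 2-D PCA projection: five tight, well-separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 20]], dtype=float)
points = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Fit k-means for several k and record the mean silhouette score of each fit.
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(points)
    scores[k] = silhouette_score(points, labels)

# The k with the highest score is the best-supported cluster count.
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Replacing `points` with the real `dpca` array would show whether k=5 is also favored numerically, and how much structure a two-cluster fit leaves unexplained.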

In [4]:
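#K-means
# kmeans must return (labels, centroids) in this order; note that
# scipy.cluster.vq.kmeans2 returns (centroids, labels) instead.
idx, ctrs = kmeans(dpca, 2)

# Plot each cluster in its own color, then the centroids in black
plt.scatter(dpca[(idx==0),0], dpca[(idx==0),1], marker='o', color='r')
plt.scatter(dpca[(idx==1),0], dpca[(idx==1),1], marker='o', color='b')
plt.scatter(ctrs[:,0], ctrs[:,1], marker='o', color='k', linewidths=5)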


Out[4]:
<matplotlib.collections.PathCollection at 0x14ada4a8>
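To test whether the two k-means clusters line up with the musician/non-musician covariate, the labels can be cross-tabulated against it. The following is a sketch under assumptions: `cluster_idx` and `is_musician` are hypothetical arrays with made-up values, not the study data, and the adjusted Rand index is just one possible agreement measure.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score

def contingency(labels, covariate):
    """Count samples for each (cluster label, covariate value) pair."""
    labels = np.asarray(labels)
    covariate = np.asarray(covariate)
    table = np.zeros((labels.max() + 1, covariate.max() + 1), dtype=int)
    # Accumulate one count per sample at its (label, covariate) cell.
    np.add.at(table, (labels, covariate), 1)
    return table

# Hypothetical example values, not taken from the study.
cluster_idx = np.array([0, 0, 0, 1, 1, 1, 1, 0])
is_musician = np.array([1, 1, 1, 0, 0, 0, 1, 0])

table = contingency(cluster_idx, is_musician)
ari = adjusted_rand_score(is_musician, cluster_idx)
print(table)
print(ari)
```

The adjusted Rand index does not depend on which cluster is called 0 or 1, which matters because k-means label numbering is arbitrary. A value near 1 indicates close agreement with the covariate, and a value near 0 indicates agreement no better than chance.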

In [ ]:
#...work in progress (Imane, do this next)