Loading spike sorting data in Python

This notebook shows how to use the klustaviewa package to load Klusters files in Python.

Loading the data in memory

We first need to import KlustaViewa and a few other packages.


In [1]:
import pandas as pd
import klustaviewa as kv


2013-05-09 18:06:58,641  DEBUG    correlograms:17         Trying to load the compiled Cython version of the correlogramscomputations...

In [2]:
%pylab


Welcome to pylab, a matplotlib-based Python environment [backend: module://IPython.zmq.pylab.backend_inline].
For more information, type 'help(pylab)'.

We specify the path to any file of the data set (.xml, .clu.x, etc.).


In [3]:
filename = 'data/test.xml'

Now we create a loader object with the filename as an argument. It offers convenient methods to access parts of the data. This call may take a while because all files are loaded fully into memory.


In [4]:
loader = kv.KlustersLoader(filename=filename)


2013-05-09 18:06:58,674  INFO     loader:683              Opening data/test.xml.

Selecting spikes

Spikes or clusters are selected with loader.select(). Any subsequent call to the loader takes this selection into account.


In [5]:
# These are spike indices.
spikes = [10, 20, 30]
# We need to pass spikes explicitly as a keyword argument.
loader.select(spikes=spikes)

Now we can access the different data sets.


In [6]:
clusters = loader.get_clusters()  # shape=Nspikes
waveforms = loader.get_waveforms()  # shape=Nspikes x Nsamples x Nchannels
features = loader.get_features()  # shape=Nspikes x Nfeatures
masks = loader.get_masks()  # shape=Nspikes x Nchannels
spiketimes = loader.get_spiketimes()  # shape=Nspikes

These objects are Pandas objects (Series for 1D data, DataFrame for 2D data, Panel for 3D data). To get NumPy arrays, use kv.get_array(obj). To select a portion of the data corresponding to specific spikes, use kv.select(obj, spikes).
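The same two operations can be sketched with plain Pandas and NumPy. This is only an illustration on a hand-built Series, and it assumes (without checking the klustaviewa source) that kv.get_array and kv.select behave like a conversion to an array and a label-based selection on the spike index.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Series returned by loader.get_spiketimes():
# the index holds spike indices, the values hold spike times in seconds.
spiketimes = pd.Series([0.05965, 0.13390, 0.20825], index=[10, 20, 30])

# Plain NumPy array of the values (the spike indices are dropped).
times_array = np.asarray(spiketimes)

# Label-based selection: keep only spikes #10 and #30.
subset = spiketimes.loc[[10, 30]]

print(times_array.shape)    # (3,)
print(list(subset.index))   # [10, 30]
```

Note that position 0 of the array corresponds to spike #10, not to spike #0: the array is indexed by position within the selection, while the Pandas object is indexed by spike number.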


In [7]:
waveforms_array = kv.get_array(waveforms)
# Plot the waveforms of the first selected spike (spike #10).
plot(waveforms_array[0,...]);



In [8]:
# Showing the times of the selected spikes. The spike index is shown along with the spike time.
print(spiketimes)


10    0.05965
20    0.13390
30    0.20825

Selecting clusters

To select clusters, one can use loader.select(clusters=clusters).


In [9]:
# We need to pass clusters explicitly as a keyword argument.
loader.select(clusters=range(2, 12))
# Now we can access the data.
clusters = loader.get_clusters()  # shape=Nspikes
waveforms = loader.get_waveforms()  # shape=Nspikes x Nsamples x Nchannels
features = loader.get_features()  # shape=Nspikes x Nfeatures
masks = loader.get_masks()  # shape=Nspikes x Nchannels
masks_full = loader.get_masks(full=True)  # shape=Nspikes x Nfeatures
spiketimes = loader.get_spiketimes()  # shape=Nspikes

In [10]:
print(clusters)


2      2
4     10
5     10
9      2
10     9
13    11
15     6
23     2
24     6
27     6
29     4
30     4
32     4
34     7
40     5
...
9933     6
9936    11
9946     4
9952     4
9953     2
9954     2
9956     2
9960     2
9964     2
9965    10
9967    11
9971     3
9978     7
9984    11
9998     6
Length: 2942
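Conceptually, selecting clusters amounts to keeping the spikes whose cluster number belongs to the requested set. A minimal sketch with plain Pandas on synthetic cluster assignments (the values below are made up for illustration, and this is not the loader's actual implementation):

```python
import pandas as pd

# Synthetic cluster assignments: index = spike index, value = cluster number.
all_clusters = pd.Series([1, 0, 2, 12, 10, 10, 1], index=[0, 1, 2, 3, 4, 5, 6])

# Keep only the spikes assigned to clusters 2 through 11.
wanted = list(range(2, 12))
selected = all_clusters[all_clusters.isin(wanted)]

print(list(selected.index))  # [2, 4, 5]
```

The index of the result keeps the original spike numbers, which is why the printed Series above starts at spike 2 rather than 0.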

Here is how to create a new Pandas DataFrame with the spike times and the clusters.


In [11]:
frame = pd.DataFrame(dict(clusters=clusters, spiketimes=spiketimes))

We can display the head of the frame nicely.


In [12]:
frame.head()


Out[12]:
    clusters  spiketimes
2          2     0.04160
4         10     0.04535
5         10     0.04685
9          2     0.05895
10         9     0.05965
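Once the data is in a DataFrame, the usual Pandas tools apply. As a sketch, the five rows displayed by frame.head() are re-entered by hand here so that the example runs without the data set, then summarized per cluster:

```python
import pandas as pd

# The five rows shown by frame.head(), typed in manually.
frame = pd.DataFrame(
    {
        "clusters": [2, 10, 10, 2, 9],
        "spiketimes": [0.04160, 0.04535, 0.04685, 0.05895, 0.05965],
    },
    index=[2, 4, 5, 9, 10],
)

# Number of spikes and first/last spike time for each cluster.
summary = frame.groupby("clusters")["spiketimes"].agg(["count", "min", "max"])
print(summary)
```

With the real frame, the same groupby call would give the spike count and the time span of every selected cluster.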

To plot the first feature of each selected spike against its spike time:


In [13]:
scatter(kv.get_array(spiketimes), kv.get_array(features)[:,0])
xlim(0, loader.get_duration());


To discover the other methods offered by the loader, use tab completion with loader.get_<TAB>.

Computing correlograms


In [14]:
# ncorrbins is the number of bins in the correlograms, corrbin is the bin size in seconds.
correlograms = kv.compute_correlograms(spiketimes, clusters, ncorrbins=100, corrbin=.001)

In [15]:
bar(arange(100), correlograms[2, 2], ec='none');
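To make clear what a correlogram measures, here is a minimal NumPy sketch: a histogram of the time differences between the spikes of two clusters, restricted to a window of ncorrbins * corrbin seconds centered on zero. The function name and its handling of zero lags are illustrative assumptions; this is not the code used by kv.compute_correlograms, which may differ in binning conventions and is certainly more efficient.

```python
import numpy as np

def correlogram(times_a, times_b, ncorrbins=100, corrbin=0.001):
    """Histogram of the differences t_b - t_a over a window of
    ncorrbins * corrbin seconds centered on zero (illustrative sketch)."""
    times_a = np.asarray(times_a, dtype=float)
    times_b = np.asarray(times_b, dtype=float)
    half_width = ncorrbins * corrbin / 2.0
    # All pairwise differences: simple, but quadratic in memory.
    diffs = (times_b[None, :] - times_a[:, None]).ravel()
    # Zero lags are dropped; in an autocorrelogram they are the spikes
    # paired with themselves. Exactly coincident spikes are dropped too.
    diffs = diffs[diffs != 0]
    edges = np.linspace(-half_width, half_width, ncorrbins + 1)
    counts, _ = np.histogram(diffs, bins=edges)
    return counts

# Autocorrelogram of three synthetic spike times (in seconds).
times = [0.0, 0.0105, 0.0312]
auto = correlogram(times, times, ncorrbins=100, corrbin=0.001)
print(len(auto), auto.sum())
```

An autocorrelogram computed this way is symmetric around zero lag, since every pair of spikes contributes one positive and one negative difference.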


Computing the correlation matrix


In [16]:
correlations = kv.compute_correlations(kv.get_array(features), kv.get_array(clusters), kv.get_array(masks_full))
matrix = kv.normalize(kv.get_similarity_matrix(correlations))

In [17]:
imshow(matrix, interpolation='none')


Out[17]:
<matplotlib.image.AxesImage at 0xb363358>
