In this tutorial we will look at imagined movement. Movement is controlled in the motor cortex, where the EEG shows a prominent rhythm in the 8–12 Hz band, called mu activity. When we move a limb, this mu activity decreases in the specific region of the motor cortex that deals with that limb. This decrease is called Event-Related Desynchronization (ERD). By measuring the amount of mu activity at different locations over the motor cortex, we can determine which limb the subject is moving. The same effect occurs when the subject is not actually moving a limb, but merely imagining the movement.
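To make the ERD effect concrete, here is a small illustrative sketch (not part of the tutorial dataset; the signal is synthetic and all names are made up) that band-pass filters a simulated channel to the mu band and compares band power at rest and during movement:

```python
import numpy as np
import scipy.signal

# Illustrative sketch: simulate a channel whose 10 Hz mu rhythm is strong
# at rest and attenuated during (imagined) movement, then measure mu band
# power in both segments.
fs = 100.0                      # sample rate in Hz
t = np.arange(0, 4, 1 / fs)     # 4 seconds of data
rng = np.random.RandomState(0)

mu = np.sin(2 * np.pi * 10 * t)    # 10 Hz mu rhythm
noise = 0.5 * rng.randn(t.size)
rest = mu + noise                  # full mu amplitude at rest
movement = 0.2 * mu + noise        # ERD: mu suppressed during movement

# Band-pass both segments to the mu band (8-12 Hz) and compare variance,
# a common proxy for band power.
b, a = scipy.signal.butter(4, [8 / (fs / 2), 12 / (fs / 2)], btype='band')
power_rest = np.var(scipy.signal.filtfilt(b, a, rest))
power_move = np.var(scipy.signal.filtfilt(b, a, movement))

print(power_rest > power_move)  # True: mu power drops during movement (ERD)
```

The classifier built later in this tutorial exploits exactly this drop in band power, measured at different locations over the motor cortex.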
The dataset for this tutorial is provided by the fourth BCI competition, which you will have to download yourself. First, go to http://www.bbci.de/competition/iv/#download and fill in your name and email address. An email will be sent to you automatically containing a username and password for the download area.
Download Data Set 1, from Berlin, the 100Hz version in MATLAB format: http://bbci.de/competition/download/competition_iv/BCICIV_1_mat.zip and unzip it in a subdirectory called 'data_set_IV'. This subdirectory should be inside the directory in which you've stored the tutorial files.
If you've followed the instructions above, the following code should load the data:
First we load the BCI competition data, assuming it is stored at the location described above. It is wrapped in a Psychic dataset.
In [45]:
import psychic
import numpy as np
import scipy.io
m1 = scipy.io.loadmat('data_set_IV/BCICIV_calib_ds1a.mat', struct_as_record=True)

# The EEG is stored as samples x channels; transpose to channels x samples
data = m1['cnt'].T
nchannels, nsamples = data.shape

# Event onsets (in samples) and their class codes (-1 or 1)
event_onsets = m1['mrk'][0][0][0]
event_codes = m1['mrk'][0][0][1]
labels = np.zeros((1, nsamples), int)
labels[0, event_onsets] = event_codes

sample_rate = m1['nfo']['fs'][0][0][0][0]
ids = np.arange(nsamples) / float(sample_rate)
feat_lab = [s[0].encode('utf8') for s in m1['nfo']['clab'][0][0][0]]

d = psychic.DataSet(data, labels, ids, feat_lab=feat_lab)
print d
This dataset represents the ongoing EEG recording. Each sample is called an instance and has 59 features: one for each channel. Like in the previous tutorial, the data is sliced into trials by defining a marker dictionary containing class labels for each event code:
In [48]:
trials = psychic.nodes.Slice({-1: 'right', 1:'foot'}, (0.5, 2.5)).train_apply(d)
print trials
Each instance is now a trial, so trials.data is 3-dimensional: channels x samples x trials. trials.Y contains the class labels: there are 100 trials of each class. Again, check back with tutorial 2 if this sounds unfamiliar.
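Under the hood, this slicing amounts to cutting a fixed window out of the continuous recording around each event onset. A rough sketch of the idea in plain NumPy (the function name and signature are assumptions for illustration, not Psychic's actual implementation):

```python
import numpy as np

# Hypothetical sketch of what the Slice node does internally: cut a window
# around each event onset out of the continuous recording.
def slice_trials(data, onsets, window, sample_rate):
    """data: channels x samples; onsets: sample indices of the events;
    window: (start, end) in seconds relative to each onset."""
    start = int(window[0] * sample_rate)
    stop = int(window[1] * sample_rate)
    trials = [data[:, o + start:o + stop] for o in onsets]
    # Stack to channels x samples x trials, matching the tutorial's layout
    return np.dstack(trials)

# Toy example: 2 channels, 1000 samples at 100 Hz, two events
data = np.arange(2000).reshape(2, 1000)
trials = slice_trials(data, [100, 500], (0.5, 2.5), 100.0)
print(trials.shape)  # (2, 200, 2): 2 channels, 200 samples, 2 trials
```

With the (0.5, 2.5) window used above, each trial starts half a second after the cue and is two seconds long.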
Much of the functionality in Psychic comes in the form of 'nodes'. A node is a class with two methods: train and apply. The train method configures the node based on some sample data. The apply method applies the transformation the node was designed to perform to some data.
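The idea can be sketched with a toy node (a hypothetical example, not Psychic's actual code):

```python
import numpy as np

class MeanCenterNode:
    """Toy node: train() learns the channel means from sample data,
    apply() subtracts those means from new data."""
    def train(self, data):
        # data: channels x samples; remember one mean per channel
        self.means_ = data.mean(axis=1, keepdims=True)
        return self

    def apply(self, data):
        return data - self.means_

node = MeanCenterNode()
node.train(np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]))
centered = node.apply(np.array([[2.0], [20.0]]))
print(centered.tolist())  # [[0.0], [0.0]]
```

The key point is that whatever was learned during train (here: the channel means) is stored in the node and reused by every subsequent apply.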
The first step is to band-pass filter the data to the 8–15 Hz range. This is done by constructing a psychic.nodes.Filter node, which takes a filter design function as parameter, or one of the convenience wrappers that implement common filters. Next we call the train method of the node, which uses the filter design function to design the filter. Finally, we use apply to actually filter the data.
In [70]:
# Construct a filter node, a Butterworth filter in this case
filt = psychic.nodes.Butterworth(6, (8, 15))
# Use the filter design on the data, the result is stored internally
filt.train(trials)
# Now the filter is ready for use
trials_filt = filt.apply(trials)
print trials_filt
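For reference, this is roughly what such a Butterworth band-pass node might do internally, sketched directly with scipy.signal (an assumption about the internals, not Psychic's actual implementation) and using this dataset's 100 Hz sample rate:

```python
import numpy as np
import scipy.signal

# Design a 6th-order Butterworth band-pass for 8-15 Hz; cutoffs are given
# as fractions of the Nyquist frequency (fs / 2).
fs = 100.0
b, a = scipy.signal.butter(6, [8 / (fs / 2), 15 / (fs / 2)], btype='band')

# Apply the filter along the sample axis of a channels x samples array.
# filtfilt runs the filter forwards and backwards, so there is no phase shift.
data = np.random.RandomState(0).randn(3, 1000)
filtered = scipy.signal.filtfilt(b, a, data, axis=1)
print(filtered.shape)  # (3, 1000)
```

Filtering forwards and backwards is important here: a plain causal filter would shift the signal in time, which would smear the trial windows.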
Before moving on, we need to split the data into a train and a test set. Remember that we calculate the CSP and train the classifier on the training data, and check the performance on the test data.
Psychic offers the psychic.cv.strat_splits function that makes splitting data easy. It creates so-called splits containing randomly chosen trials, while making sure that each split contains an equal number of trials for each class. Below, the data is split into two splits:
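The idea behind stratified splitting can be sketched as follows (a hypothetical re-implementation for illustration, not Psychic's actual code):

```python
import numpy as np

# Shuffle the trials of each class separately, then deal them round-robin
# over the requested splits, so every split gets an equal number of trials
# per class.
def strat_splits(labels, n_splits, seed=0):
    rng = np.random.RandomState(seed)
    splits = [[] for _ in range(n_splits)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        for i, trial in enumerate(idx):
            splits[i % n_splits].append(trial)
    return [np.sort(s) for s in splits]

labels = np.array([0] * 10 + [1] * 10)   # 10 trials per class
train_idx, test_idx = strat_splits(labels, 2)
print([int(np.sum(labels[train_idx] == c)) for c in (0, 1)])  # [5, 5]
```

Stratification matters because an unlucky plain random split could leave one class under-represented in the training set, biasing the classifier.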
In [50]:
train, test = psychic.cv.strat_splits(trials_filt, 2)
print train
print test
Note that both the train and the test set have an equal number of trials belonging to each class (50).
Recall from the previous tutorial the steps we performed to classify the trials:
1. Apply a CSP spatial filter to the trials.
2. Take the log of the variance of each component as features.
3. Classify the features with a linear discriminant analysis (LDA).
Nodes can be linked together forming a chain. Conveniently, scikit-learn nodes can be included as well. The code below constructs three nodes that implement the steps above and links them up as a chain:
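For reference, the CSP computation itself can be sketched as a generalized eigenvalue decomposition of the per-class covariance matrices. This is a sketch under the assumption that trials are stored as channels x samples x trials, not Psychic's actual implementation:

```python
import numpy as np
import scipy.linalg

def csp(trials_a, trials_b, n_components=2):
    """Return spatial filters (one per column) that maximize the variance
    of one class while minimizing the variance of the other."""
    def mean_cov(trials):
        covs = [np.cov(trials[:, :, i]) for i in range(trials.shape[2])]
        return np.mean(covs, axis=0)

    cov_a, cov_b = mean_cov(trials_a), mean_cov(trials_b)
    # Generalized eigenvectors of (cov_a, cov_a + cov_b): the components at
    # the extremes of the eigenvalue spectrum are the most discriminative.
    eigvals, W = scipy.linalg.eigh(cov_a, cov_a + cov_b)
    order = np.argsort(eigvals)
    pick = np.r_[order[:n_components // 2],
                 order[-(n_components - n_components // 2):]]
    return W[:, pick]

# Toy data: class A has high variance on channel 0, class B on channel 1
rng = np.random.RandomState(0)
a = rng.randn(2, 100, 20) * np.array([3.0, 1.0])[:, None, None]
b = rng.randn(2, 100, 20) * np.array([1.0, 3.0])[:, None, None]
W = csp(a, b, n_components=2)
print(W.shape)  # (2, 2): one spatial filter per column
```

Keeping only the first and last components, as the pipeline below does, is the usual choice: those are the filters with the most extreme variance ratios between the two classes.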
In [65]:
from sklearn.lda import LDA
pipeline = psychic.nodes.Chain([
# Only use the 2 most useful CSP components (first and last)
psychic.nodes.spatialfilter.CSP(2),
# ApplyOverInstances applies the given function to each instance. We give it
# a function that calculates the log-variance of each CSP component.
psychic.nodes.ApplyOverInstances(lambda x: np.log(np.var(x, axis=1))),
# Finally the Linear Discriminant Analysis
LDA(),
])
# Train the pipeline using the training data
pipeline.train(train)
# Apply the pipeline on the test data
result = pipeline.apply(test)
print result
result.data contains the output of the LDA: the predicted class labels. result.labels still contains the true class labels. The code below uses these to calculate the accuracy and the confusion matrix:
In [66]:
print psychic.perf.accuracy(result)
psychic.perf.conf_mat(result)
The result may vary every time you run the code, since the train and test set are random splits of the data. It should however be around 80%.
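For reference, accuracy and the confusion matrix amount to simple counting. A sketch with made-up labels (not the tutorial's actual output, and not Psychic's implementation):

```python
import numpy as np

def accuracy(true, pred):
    # Fraction of trials where the predicted label matches the true label
    return np.mean(np.asarray(true) == np.asarray(pred))

def conf_mat(true, pred, n_classes=2):
    # m[t, p] counts trials of true class t predicted as class p
    m = np.zeros((n_classes, n_classes), int)
    for t, p in zip(true, pred):
        m[t, p] += 1
    return m

true = [0, 0, 0, 1, 1, 1, 1, 0]
pred = [0, 0, 1, 1, 1, 0, 1, 0]
print(accuracy(true, pred))   # 0.75
print(conf_mat(true, pred))   # rows: true class, columns: predicted class
```

The confusion matrix is worth inspecting alongside the accuracy: it shows whether errors are spread evenly over the classes or concentrated in one of them.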