Mumodo Demo Notebook - Updated on 24.04.2015
Summary: This notebook showcases working with real tracking data. In particular, data from a Kinect V2 sensor is imported and analyzed.
(c) Dialogue Systems Group, University of Bielefeld
In [1]:
%matplotlib inline
import math
import matplotlib.pyplot as plt
from mumodo.mumodoIO import open_streamframe_from_xiofile, open_intervalframe_from_textgrid
from mumodo.plotting import plot_scalar, plot_annotations
from mumodo.analysis import create_intervalframe_from_streamframe, convert_times_of_tier
We are going to import data recorded with a Microsoft Kinect V2 for Windows sensor from an XIO file.
In [2]:
KinectData = open_streamframe_from_xiofile("sampledata/test.xio.gz", 'VeniceHubReplay/Venice/Body1')
In [3]:
KinectData[:2]
Out[3]:
The sensor can record data for up to 6 bodies, but in this recording only two were tracked, so we keep just those two columns:
In [4]:
skeletons = KinectData.ix[:, ['JointPositions3', 'JointPositions4']]
skeletons[:2]
Out[4]:
Commonly we need to look for (and drop) NaN (not-a-number) values:
In [5]:
print len(skeletons), len(skeletons.dropna()) #row counts before and after dropping NaNs
skeletons.dropna(inplace=True)
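If we wanted to know which of the two columns contributes the missing values, pandas can count them per column. A one-liner sketch (to be run before the inplace drop above):
skeletons.isnull().sum() #number of NaNs per column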
Each cell in the table has a snapshot of the whole skeleton (25 joints) at that moment in time.
In [6]:
#verify that every snapshot contains all 25 joints
set(skeletons['JointPositions3'].map(lambda x: len(x))), set(skeletons['JointPositions4'].map(lambda x: len(x)))
Out[6]:
We can decode each cell into individual joints using the following enumeration from the sensor's documentation:
From http://msdn.microsoft.com/en-us/library/microsoft.kinect.kinect.jointtype.aspx
typedef enum _JointType
{
    JointType_SpineBase = 0,
    JointType_SpineMid = 1,
    JointType_Neck = 2,
    JointType_Head = 3,
    JointType_ShoulderLeft = 4,
    JointType_ElbowLeft = 5,
    JointType_WristLeft = 6,
    JointType_HandLeft = 7,
    JointType_ShoulderRight = 8,
    JointType_ElbowRight = 9,
    JointType_WristRight = 10,
    JointType_HandRight = 11,
    JointType_HipLeft = 12,
    JointType_KneeLeft = 13,
    JointType_AnkleLeft = 14,
    JointType_FootLeft = 15,
    JointType_HipRight = 16,
    JointType_KneeRight = 17,
    JointType_AnkleRight = 18,
    JointType_FootRight = 19,
    JointType_SpineShoulder = 20,
    JointType_HandTipLeft = 21,
    JointType_ThumbLeft = 22,
    JointType_HandTipRight = 23,
    JointType_ThumbRight = 24,
    JointType_Count = (JointType_ThumbRight + 1)
} JointType;
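For convenience, we could collapse this enumeration into a Python dictionary and index joints by name instead of by magic number. A minimal sketch of our own (the name JOINT_INDEX is not part of mumodo):
#map joint names to their indices in a JointPositions array
#(our own helper, transcribed from the enumeration above)
JOINT_INDEX = {'SpineBase': 0, 'SpineMid': 1, 'Neck': 2, 'Head': 3,
               'ShoulderLeft': 4, 'ElbowLeft': 5, 'WristLeft': 6, 'HandLeft': 7,
               'ShoulderRight': 8, 'ElbowRight': 9, 'WristRight': 10, 'HandRight': 11,
               'HipLeft': 12, 'KneeLeft': 13, 'AnkleLeft': 14, 'FootLeft': 15,
               'HipRight': 16, 'KneeRight': 17, 'AnkleRight': 18, 'FootRight': 19,
               'SpineShoulder': 20, 'HandTipLeft': 21, 'ThumbLeft': 22,
               'HandTipRight': 23, 'ThumbRight': 24}
With this in place, the x[11] in the next cell could equivalently be written x[JOINT_INDEX['HandRight']].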
In [7]:
#Create new columns for some of the joints we are interested in
skeletons['HandRight3'] = skeletons['JointPositions3'].map(lambda x: x[11])
skeletons['HandRight4'] = skeletons['JointPositions4'].map(lambda x: x[11])
During the demo video included in the sample data, the two people perform a "high-five" clap three times with their right hands. We would like to measure the distance between their hands during this joint gesture.
In [8]:
clap_times = open_intervalframe_from_textgrid("sampledata/test.TextGrid")['CLAPS']
clap_times
Out[8]:
We want to see the distance before and after the clap instant, so let's turn this into an IntervalFrame instead:
In [9]:
context = 2 #seconds of context before and after each clap
clap_times['start_time'] = clap_times['time'] - context
clap_times['end_time'] = clap_times['time'] + context
del clap_times['time']
clap_times['text'] = clap_times['mark']
del clap_times['mark']
clap_times = clap_times.ix[:, ['start_time', 'end_time', 'text']] #standard IntervalFrame column order
clap_times
Out[9]:
In addition, we need to offset the tracking data in order to synchronize it with these times. See the notebook "ComputingOffset" for more details.
In [10]:
skeletons.index -= 9616 #shift the (millisecond) index by the pre-computed offset
Next we define a function to compute the Euclidean distance (alternatively, we could use an implementation from scipy or numpy):
In [11]:
def euclidean_distance(a, b):
    """compute the Euclidean distance between two SFVec3f points"""
    return math.sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2 + (a.z - b.z) ** 2)
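As an alternative, the same measure can be obtained from numpy's vector norm. A minimal sketch (euclidean_distance_np is our own name; it assumes, as in the docstring above, that a and b expose x, y and z attributes):
import numpy as np

def euclidean_distance_np(a, b):
    """the same computation via the norm of the difference vector"""
    return np.linalg.norm(np.array([a.x - b.x, a.y - b.y, a.z - b.z]))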
In [12]:
#create and populate the new column with the distance between the two right hands
skeletons['HandDistance'] = [euclidean_distance(a, b)
                             for a, b in zip(skeletons['HandRight3'],
                                             skeletons['HandRight4'])]
In [13]:
plot_scalar(skeletons, ['HandDistance'])
We can also plot the data only around the clap episodes:
In [14]:
episode_no = 2 #the index of the interval in the IntervalFrame of clap episodes
start = int(1000 * clap_times['start_time'].ix[episode_no]) #convert start and end times to ms
end = int(1000 * clap_times['end_time'].ix[episode_no])
plot_scalar(skeletons, ['HandDistance'], start, end)
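To inspect all episodes rather than just one, we could loop over the IntervalFrame in the same way (a sketch reusing only the calls shown above):
#plot the hand distance around each annotated clap in turn
for i in clap_times.index:
    start = int(1000 * clap_times['start_time'].ix[i]) #convert to ms
    end = int(1000 * clap_times['end_time'].ix[i])
    plot_scalar(skeletons, ['HandDistance'], start, end)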
Conversely, we can try to detect the claps automatically from the distance measure, e.g. by thresholding it:
In [15]:
detected_claps = create_intervalframe_from_streamframe(skeletons, 'HandDistance', lambda x: x < 0.2, 40) #intervals where the hands are closer than 0.2 m
convert_times_of_tier(detected_claps, lambda x: float(x) / 1000) #convert times from ms to seconds, to match the TextGrid annotations
detected_claps
Out[15]:
In [16]:
#plot the detected claps as well as the annotated claps
plot_annotations({'annotated': open_intervalframe_from_textgrid("sampledata/test.TextGrid")['CLAPS'],
'detected': detected_claps}, linespan = 10, hscale=2)
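Beyond eyeballing the plot, a quick sanity check could count how many annotated clap times fall inside a detected interval. A minimal sketch (it assumes, as displayed above, that the annotated tier is a point tier with a 'time' column and that detected_claps has the standard 'start_time'/'end_time' columns, with both tiers now in seconds):
#count annotated claps that land inside some detected interval
annotated = open_intervalframe_from_textgrid("sampledata/test.TextGrid")['CLAPS']
hits = sum(1 for t in annotated['time']
           if ((detected_claps['start_time'] <= t) &
               (detected_claps['end_time'] >= t)).any())
print hits, "of", len(annotated), "annotated claps detected"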