Basic IO

Mumodo Demo Notebook -- Update on 24.04.2015

Summary: This notebook describes the basic IO functions for importing data from various file types.

(c) Dialogue Systems Group, University of Bielefeld


In [1]:
from mumodo.mumodoIO import open_intervalframe_from_textgrid, open_streamframe_from_xiofile, \
                            save_intervalframe_to_textgrid, save_streamframe_to_xiofile, \
                            quantize, open_intervalframe_from_increco
from mumodo.xiofile import XIOFile
from mumodo.increco import IncReco
from mumodo.InstantIO import MFVec2f
import pickle
import pandas as pd

IntervalFrames from Praat TextGrids

The high level functions allow importing Interval and Stream Frames from files, e.g.


In [2]:
transcriptions = open_intervalframe_from_textgrid("sampledata/test.TextGrid")

The function we just ran returns a Python dictionary with the names of the tiers as keys and the IntervalFrames as values. One of these ('CLAPS') is in fact a PointFrame.


In [3]:
transcriptions.keys()


Out[3]:
[u'CLAPS', u'S', u'O']

In [4]:
transcriptions['S']


Out[4]:
start_time end_time text
0 1.30 1.860000 Hello
1 2.88 3.500000 I 'm Spyros
2 4.86 8.280000 Here in the Dialogue Systems Group, in the Uni...
3 8.50 10.400000 We have developed Mumodo, and Venice
4 11.58 11.840000 <CLAP>
5 14.10 17.220000 Well, right now we are being recorded by a cam...
6 17.54 18.840314 and a Microsoft Kinect sensor
7 19.30 21.100000 But how will we get the data from Kinect?
8 27.70 31.480000 We are using this timecode to synchronize the ...
9 31.62 32.860000 With the audio and video
10 33.76 34.000000 <CLAP>
11 37.40 41.300000 We can process the data that comes from Venice...
12 47.60 47.820000 <CLAP>
13 48.94 49.660000 Goodbye

In [5]:
transcriptions['CLAPS']


Out[5]:
time mark
0 11.654230 First Clap
1 33.824485 Second Clap
2 47.672685 Third Clap

To save back into a TextGrid, we use a similar function, packing our Interval and PointFrames into a dict. The line below saves a Praat TextGrid with two copies of the 'CLAPS' tier.


In [6]:
save_intervalframe_to_textgrid({'the_claps': transcriptions['CLAPS'],
                                'the_claps_copy': transcriptions['CLAPS']}, 'newtextgrid.TextGrid')

StreamFrames from XIO Files

StreamFrames can be loaded from XIO files (see below). It is important to know the sensorname (the second argument), as data from many sensors can be stored in a single XIO file. See the XIO Files section below for how to find out these sensornames.


In [7]:
mystreamframe1 = open_streamframe_from_xiofile('sampledata/test.xio.gz', 'VeniceHubReplay/Venice/Body1')
mystreamframe2 = open_streamframe_from_xiofile('sampledata/test.xio.gz', 'VeniceHubReplay/Kinect/Face')


opening compressed file ...
opening file without indexing
opening compressed file ...
opening file without indexing

In [8]:
mystreamframe1[:1]


Out[8]:
JointPositions1 JointPositions2 JointPositions3 JointPositions4 JointPositions5 JointPositions6 time
0 [] [] [0.957876 -0.152858 1.7562, 0.945315 0.162776 ... [-0.345256 -0.750549 0.922279, -0.359958 -0.48... [] [] 1429192624579

In [9]:
mystreamframe2[-1:]


Out[9]:
FaceBoxBottomRight FaceBoxTopLeft FaceEyeLeft FaceEyeRight FaceMouthLeftCorner FaceMouthRightCorner FaceNose FaceProperties FaceRotation time
67160 [0.0 0.0, 0.0 0.0] [0.0 0.0, 0.0 0.0] [0.0 0.0, 0.0 0.0] [0.0 0.0, 0.0 0.0] [0.0 0.0, 0.0 0.0] [0.0 0.0, 0.0 0.0] [0.0 0.0, 0.0 0.0] [11111111, 11111111] [0.0 0.0 0.0 0.0, 0.0 0.0 0.0 0.0] 1429192691739

StreamFrames can be saved back to XIO files as follows. Note that we use exactly the same sensornames, although this is optional.

NOTE: The resulting XIO file will NOT be the same as the input file, due to quantization (see below).


In [10]:
#Running the line below creates a new 1.5 MB file and takes about 20 seconds
save_streamframe_to_xiofile({'VeniceHubReplay/Kinect/Face': mystreamframe2,
                             'VeniceHubReplay/Venice/Body1': mystreamframe1},
                            'newxiofile.xio.gz')

Other Options for Saving Interval and Stream Frames

Instead of saving a StreamFrame to an XIO file or an IntervalFrame to a TextGrid, there are also the following options:

  • Save/Load the Stream or Interval Frame as a CSV (thank you pandas!)
  • Save/Load the Stream or Interval Frame using pickle

Both of these methods are faster than saving/loading XIO files, so they are particularly helpful for StreamFrames.

When saving/loading from CSV, all objects (except primitives such as floats) are turned into strings; that is, you have to parse the data back into objects yourself. This is not much of a problem for IntervalFrames, which can be safely saved to CSV files.
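As an illustration of what that parsing involves, here is a hypothetical stdlib-only parser for a stringified 2D-vector field; mumodo's InstantIO constructors do this job for you, as the next cells show:

```python
def parse_mfvec2f(s):
    """Parse a string like '[0.0 0.0, 0.0 0.0]' into a list of (x, y) tuples.
    A hypothetical sketch of the parsing that mumodo's MFVec2f performs."""
    s = s.strip()[1:-1]  # drop the surrounding brackets
    if not s:
        return []
    # components within a vector are space-separated; vectors are comma-separated
    return [tuple(float(c) for c in part.split()) for part in s.split(",")]

parse_mfvec2f("[0.5 1.0, 2.0 3.5]")
# -> [(0.5, 1.0), (2.0, 3.5)]
```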


In [11]:
mystreamframe2.to_csv( "streamframeas.csv" )
streamframefromcsv = pd.DataFrame.from_csv("streamframeas.csv")
assert (mystreamframe2['time'] == streamframefromcsv['time']).all()

In [12]:
#the StreamFrame loaded from CSV has type str for this cell
type(mystreamframe2['FaceEyeLeft'].ix[66199]), type(streamframefromcsv['FaceEyeLeft'].ix[66199])


Out[12]:
(mumodo.InstantIO.MFVec2f, str)

In [13]:
#reconstruct the objects in the column
streamframefromcsv['FaceEyeLeft'] = streamframefromcsv['FaceEyeLeft'].map(lambda x: MFVec2f(x))
#check again
type(mystreamframe2['FaceEyeLeft'].ix[66199]), type(streamframefromcsv['FaceEyeLeft'].ix[66199])


Out[13]:
(mumodo.InstantIO.MFVec2f, mumodo.InstantIO.MFVec2f)

When saving/loading using the pickle module, the objects are preserved


In [14]:
pickle.dump(mystreamframe2, open( "pickledstreamframe", "wb" ) )
unpickledstreamframe = pickle.load( open( "pickledstreamframe", "rb" ) )
assert (mystreamframe2['time'] == unpickledstreamframe['time']).all()

In [15]:
type(mystreamframe2['FaceEyeLeft'].ix[66199]), type(unpickledstreamframe['FaceEyeLeft'].ix[66199])


Out[15]:
(mumodo.InstantIO.MFVec2f, mumodo.InstantIO.MFVec2f)

However, pickling/unpickling requires the object's class to be defined identically in both Python sessions (the one that saves and the one that loads). See the pickle module documentation for details.

XIOFiles

NOTE: mumodo inherits the XIO format from the FAME software developed by the AI Group at Bielefeld University; it is also used by the venice.hub software developed by the Dialogue Systems Group. Although the venice.hub format is simpler and newer, mumodo remains fully backwards compatible with the original XIO format.

NOTE: Most of the time you will not have to deal with XIO files directly, but it is good to know a little bit about their structure

XIO files are XML files that contain typed, timed events (TTE). The XIOFile class handles such files, e.g. to read an existing XIO file:


In [16]:
f = XIOFile("sampledata/test.xio.gz", 'r')


opening compressed file ...
opening file without indexing

Let's have a look at a few of the raw lines in the file. Note that the arguments 0, 1 below are times in milliseconds relative to the start of the file. Because the format is very verbose and consumes a lot of disk space, it is typically compressed by means of the gzip module, hence the .gz extension. Mumodo can handle both compressed and uncompressed files. This is what the data in the file looks like:


In [17]:
for line in f.xio_quicklinegen(0, 1, parsed=False):
    print line


<mfvec3f value="[-0.345256 -0.750549 0.922279, -0.359958 -0.484651 0.837801, -0.365955 -0.2177 0.739759, -0.429324 -0.0832201 0.740274, -0.458221 -0.201857 0.742501, -0.533512 -0.444521 0.793084, -0.390071 -0.300809 0.549919, -0.398528 -0.205008 0.63399, -0.279814 -0.299241 0.805644, -0.40141 -0.149542 0.542428, -0.437572 0.0808716 0.556279, -0.451535 0.156582 0.563531, -0.437481 -0.722323 0.931979, -0.391703 -1.02876 0.989874, -0.426539 -0.604366 0.782481, -0.321543 -0.567326 0.807825, -0.23228 -0.736532 0.860271, -0.110944 -0.445889 1.03068, -0.362375 -0.460863 0.802363, -0.382999 -0.562574 0.700017, -0.366035 -0.2841 0.766712, -0.412893 -0.154321 0.615675, -0.409121 -0.178938 0.586998, -0.45652 0.190995 0.533486, -0.437145 0.152636 0.547568]" timestamp="1429192624579" sensorName="VeniceHubReplay/Venice/Body1/JointPositions4"/>

<mfvec3f value="[]" timestamp="1429192624580" sensorName="VeniceHubReplay/Venice/Body1/JointPositions6"/>

<mfvec3f value="[]" timestamp="1429192624580" sensorName="VeniceHubReplay/Venice/Body1/JointPositions5"/>

<mfvec2f value="[0.0 0.0, 0.0 0.0]" timestamp="1429192624580" sensorName="VeniceHubReplay/Kinect/Face/FaceEyeRight"/>

Each of these lines represents a typed, timed event (TTE): each event has a timestamp, a type, a value and a sensorname/fieldname combo. The latter is parsed as follows: anything after the last "/" in the sensorName attribute becomes the fieldname (and eventually a column in an imported StreamFrame), while the rest is the sensorname itself. Here are the same lines as above, but now parsed. Note that the value strings have already been parsed into basic InstantIO objects (see the Basic Types demo notebook).
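As a stdlib-only sketch of the idea (not mumodo's actual implementation), a raw line like the ones above could be parsed and split into sensorname/fieldname like this; the value is kept as a plain string here rather than an InstantIO object:

```python
import xml.etree.ElementTree as ET

def parse_tte_line(line):
    """Parse one raw TTE line into a dict (a sketch, not mumodo's code)."""
    elem = ET.fromstring(line)
    # anything after the last "/" is the fieldname; the rest is the sensorname
    sensorname, _, fieldname = elem.get("sensorName").rpartition("/")
    return {"valuetype": elem.tag,
            "sensorname": sensorname,
            "fieldname": fieldname,
            "value": elem.get("value"),      # left as a string in this sketch
            "time": int(elem.get("timestamp"))}

line = ('<mfvec3f value="[]" timestamp="1429192624580" '
        'sensorName="VeniceHubReplay/Venice/Body1/JointPositions6"/>')
parse_tte_line(line)
# -> {'valuetype': 'mfvec3f', 'sensorname': 'VeniceHubReplay/Venice/Body1',
#     'fieldname': 'JointPositions6', 'value': '[]', 'time': 1429192624580}
```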


In [18]:
for line in f.xio_quicklinegen(0, 1):
    print line


{'valuetype': 'mfvec3f', 'fieldname': 'JointPositions4', 'sensorname': 'VeniceHubReplay/Venice/Body1', 'value': <mumodo.InstantIO.MFVec3f object at 0x10f629710>, 'time': 1429192624579}
{'valuetype': 'mfvec3f', 'fieldname': 'JointPositions6', 'sensorname': 'VeniceHubReplay/Venice/Body1', 'value': <mumodo.InstantIO.MFVec3f object at 0x11098a710>, 'time': 1429192624580}
{'valuetype': 'mfvec3f', 'fieldname': 'JointPositions5', 'sensorname': 'VeniceHubReplay/Venice/Body1', 'value': <mumodo.InstantIO.MFVec3f object at 0x112fd5b10>, 'time': 1429192624580}
{'valuetype': 'mfvec2f', 'fieldname': 'FaceEyeRight', 'sensorname': 'VeniceHubReplay/Kinect/Face', 'value': <mumodo.InstantIO.MFVec2f object at 0x112fd5b50>, 'time': 1429192624580}

Useful XIO commands

Two typical actions that you may need to do on XIO files are:

  • finding out the minimum timestamp of an XIO file
  • finding out all possible sensornames that can be imported as streamframes

The first is very easy; here is a one-liner:


In [19]:
XIOFile("sampledata/test.xio.gz", "r").min_time


opening compressed file ...
opening file without indexing
Out[19]:
1429192624579

The second is a little trickier: it requires opening the XIO file in indexed mode, a debug mode that pre-parses the entire file. To limit this, we can parse only part of the file (here, 1000 lines), provided we know that data from all the sensors we want has been logged within that first part:


In [20]:
XIOFile("sampledata/test.xio.gz", indexing=True, maxlines=1000).fieldnames.keys()


opening compressed file ...
indexing ...
unable to parse line  2   <venice>

done! (indexed 1000 lines)
Out[20]:
['', 'VeniceHubReplay/Venice/Body1', 'VeniceHubReplay/Kinect/Face']

Quantization

When importing a StreamFrame from an XIOFile, you convert this sequential, asynchronous event stream into a table. It is quite common for data from the same sensor to be logged asynchronously, with slightly different timestamps (this is partly due to the way venice.hub and other loggers work). For example, data from the same sensor could arrive at the following timestamps (made relative to min_time for convenience):


In [21]:
timestamps = []
for line in f.xio_quicklinegen(5000, 5300):
    if 'Body' in line['sensorname']:
        timestamps.append(line['time'] - f.min_time)
print timestamps


[5015, 5016, 5016, 5016, 5016, 5016, 5050, 5050, 5050, 5050, 5050, 5050, 5082, 5083, 5083, 5083, 5083, 5083, 5115, 5116, 5116, 5116, 5116, 5116, 5149, 5150, 5150, 5150, 5150, 5150, 5182, 5183, 5183, 5183, 5183, 5183, 5215, 5216, 5216, 5217, 5217, 5217, 5249, 5249, 5249, 5249, 5249, 5249, 5281, 5282, 5282, 5282, 5282, 5282]

We notice that the sensor outputs 6 values roughly every 33 ms, which corresponds to 30 frames per second (fps). If we allowed each timestamp to become a row in a StreamFrame, we would have many empty cells. To avoid this, we quantize the data as follows:

When a new event is parsed (e.g. at time 5015), a window is opened for 5 ms (configurable), and all events received within that window are added to the same frame. This is the job of the quantize() function.

See below how the objects at the timestamps above are packed into dictionaries. Each dictionary is a "frame". The timestamp shown for each frame corresponds to the time of the first event in the XIO file that was added to that frame, i.e. the start of the window; in reality, some of the field values may have been logged up to 5 ms later.

As a result, the timestamps of the individual events are "quantized" into the times of the frames. When writing this data back to an XIO file, the events will carry these new timestamps instead of their original ones.
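The windowing idea can be sketched in a few lines; this is a simplified illustration over assumed (time, fieldname, value) tuples, not mumodo's actual quantize():

```python
def quantize_events(events, window=5):
    """Group (time, fieldname, value) events into frames: a new frame opens
    at the first event's timestamp and absorbs all events arriving within
    `window` ms of that opening time. A sketch of the idea only."""
    frames = []
    current = None
    for t, field, value in events:
        if current is None or t - current["time"] > window:
            current = {"time": t}  # the frame keeps the window's start time
            frames.append(current)
        current[field] = value
    return frames

events = [(5015, "JointPositions1", "a"), (5016, "JointPositions2", "b"),
          (5050, "JointPositions1", "c"), (5050, "JointPositions2", "d")]
quantize_events(events)
# -> [{'time': 5015, 'JointPositions1': 'a', 'JointPositions2': 'b'},
#     {'time': 5050, 'JointPositions1': 'c', 'JointPositions2': 'd'}]
```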


In [22]:
list(quantize(f.xio_quicklinegen(5000, 5300), 'VeniceHubReplay/Venice/Body1'))


Out[22]:
[{'JointPositions1': <mumodo.InstantIO.MFVec3f at 0x112fd5cd0>,
  'JointPositions2': <mumodo.InstantIO.MFVec3f at 0x10fecebd0>,
  'JointPositions3': <mumodo.InstantIO.MFVec3f at 0x112fd5e90>,
  'JointPositions4': <mumodo.InstantIO.MFVec3f at 0x112fd5a50>,
  'JointPositions5': <mumodo.InstantIO.MFVec3f at 0x10f6299d0>,
  'JointPositions6': <mumodo.InstantIO.MFVec3f at 0x10f629950>,
  'time': 1429192629594},
 {'JointPositions1': <mumodo.InstantIO.MFVec3f at 0x10fece710>,
  'JointPositions2': <mumodo.InstantIO.MFVec3f at 0x10fecea10>,
  'JointPositions3': <mumodo.InstantIO.MFVec3f at 0x112fd5d50>,
  'JointPositions4': <mumodo.InstantIO.MFVec3f at 0x10fece550>,
  'JointPositions5': <mumodo.InstantIO.MFVec3f at 0x10fece790>,
  'JointPositions6': <mumodo.InstantIO.MFVec3f at 0x10fece650>,
  'time': 1429192629629},
 {'JointPositions1': <mumodo.InstantIO.MFVec3f at 0x11098a990>,
  'JointPositions2': <mumodo.InstantIO.MFVec3f at 0x11098a6d0>,
  'JointPositions3': <mumodo.InstantIO.MFVec3f at 0x112fd5f10>,
  'JointPositions4': <mumodo.InstantIO.MFVec3f at 0x10feceb90>,
  'JointPositions5': <mumodo.InstantIO.MFVec3f at 0x11098a8d0>,
  'JointPositions6': <mumodo.InstantIO.MFVec3f at 0x11098a710>,
  'time': 1429192629661},
 {'JointPositions1': <mumodo.InstantIO.MFVec3f at 0x102f3ebd0>,
  'JointPositions2': <mumodo.InstantIO.MFVec3f at 0x102f3e490>,
  'JointPositions3': <mumodo.InstantIO.MFVec3f at 0x10fece1d0>,
  'JointPositions4': <mumodo.InstantIO.MFVec3f at 0x102f3ec10>,
  'JointPositions5': <mumodo.InstantIO.MFVec3f at 0x102f3e510>,
  'JointPositions6': <mumodo.InstantIO.MFVec3f at 0x102f3eb90>,
  'time': 1429192629694},
 {'JointPositions1': <mumodo.InstantIO.MFVec3f at 0x112fd5f90>,
  'JointPositions2': <mumodo.InstantIO.MFVec3f at 0x112fd5b10>,
  'JointPositions3': <mumodo.InstantIO.MFVec3f at 0x102f3f350>,
  'JointPositions4': <mumodo.InstantIO.MFVec3f at 0x112fd5ed0>,
  'JointPositions5': <mumodo.InstantIO.MFVec3f at 0x112fd5d90>,
  'JointPositions6': <mumodo.InstantIO.MFVec3f at 0x112fd5c10>,
  'time': 1429192629728},
 {'JointPositions1': <mumodo.InstantIO.MFVec3f at 0x10fece810>,
  'JointPositions2': <mumodo.InstantIO.MFVec3f at 0x10fece2d0>,
  'JointPositions3': <mumodo.InstantIO.MFVec3f at 0x112fd5c50>,
  'JointPositions4': <mumodo.InstantIO.MFVec3f at 0x102f3e4d0>,
  'JointPositions5': <mumodo.InstantIO.MFVec3f at 0x10fece910>,
  'JointPositions6': <mumodo.InstantIO.MFVec3f at 0x10fece250>,
  'time': 1429192629761},
 {'JointPositions1': <mumodo.InstantIO.MFVec3f at 0x10f6128d0>,
  'JointPositions2': <mumodo.InstantIO.MFVec3f at 0x10f612150>,
  'JointPositions3': <mumodo.InstantIO.MFVec3f at 0x102f3f310>,
  'JointPositions4': <mumodo.InstantIO.MFVec3f at 0x10f612f90>,
  'JointPositions5': <mumodo.InstantIO.MFVec3f at 0x10f612890>,
  'JointPositions6': <mumodo.InstantIO.MFVec3f at 0x10f612090>,
  'time': 1429192629794},
 {'JointPositions1': <mumodo.InstantIO.MFVec3f at 0x10f609dd0>,
  'JointPositions2': <mumodo.InstantIO.MFVec3f at 0x102f3ec50>,
  'JointPositions3': <mumodo.InstantIO.MFVec3f at 0x10f609d50>,
  'JointPositions4': <mumodo.InstantIO.MFVec3f at 0x10f609650>,
  'JointPositions5': <mumodo.InstantIO.MFVec3f at 0x10f609d90>,
  'JointPositions6': <mumodo.InstantIO.MFVec3f at 0x10f609d10>,
  'time': 1429192629828},
 {'JointPositions1': <mumodo.InstantIO.MFVec3f at 0x112fd5fd0>,
  'JointPositions2': <mumodo.InstantIO.MFVec3f at 0x112fd5dd0>,
  'JointPositions3': <mumodo.InstantIO.MFVec3f at 0x112fd5f50>,
  'JointPositions4': <mumodo.InstantIO.MFVec3f at 0x112fd5d10>,
  'JointPositions5': <mumodo.InstantIO.MFVec3f at 0x112fd5e50>,
  'JointPositions6': <mumodo.InstantIO.MFVec3f at 0x112fd5c90>,
  'time': 1429192629860}]

Inc_Reco files

inc_recos are special files that are useful within the Incremental Unit (IU) framework.

They store information about units at different update times. Automatic Speech Recognition (ASR) results that are output incrementally can be stored in these files. Each update time is accompanied by a "chunk" of the output (the output at that time).

Here is what they look like:


In [23]:
with open('sampledata/test.inc_reco') as f:
    lines = 0
    while lines < 22:
        print f.readline()[:-1]
        lines += 1


Time: 2.00
1.888250	2.044875	oh

Time: 2.20
1.888250	2.044875	oh
2.044875	2.111625	i
2.111625	2.194750	don't

Time: 2.80
1.888250	2.044875	oh
2.044875	2.111625	i
2.111625	2.194750	don't
2.479500	2.537625	i
2.537625	2.657625	don't
2.657625	2.747625	know
2.747625	2.837625	i

Time: 2.95
1.888250	2.044875	oh
2.044875	2.111625	i
2.111625	2.194750	don't
2.479500	2.537625	i
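A minimal parser for this format might look like the following sketch (a hypothetical helper, not the IncReco class):

```python
def parse_inc_reco(text):
    """Parse inc_reco text into a list of {'Time': float, 'Chunk': [...]} dicts.
    A rough sketch of the file format shown above, not mumodo's IncReco."""
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines separate chunks
        if line.startswith("Time:"):
            chunks.append({"Time": float(line.split()[1]), "Chunk": []})
        else:
            # each unit line is tab-separated: start, end, word
            chunks[-1]["Chunk"].append(line.split("\t"))
    return chunks

sample = ("Time: 2.00\n1.888250\t2.044875\toh\n\n"
          "Time: 2.20\n1.888250\t2.044875\toh\n2.044875\t2.111625\ti\n")
parse_inc_reco(sample)[0]
# -> {'Time': 2.0, 'Chunk': [['1.888250', '2.044875', 'oh']]}
```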

But the IncReco class handles these files nicely, e.g.


In [24]:
myincreco = IncReco("sampledata/test.inc_reco")

In [25]:
#get all the update times
print myincreco.get_times()


[2.0, 2.2, 2.8, 2.95, 3.15, 3.45, 3.7, 4.11, 4.42, 4.59, 5.44]

In [26]:
#get the latest chunk at a specific time
myincreco.get_latest_chunk(5)


Out[26]:
{'Chunk': [['1.888250', '2.044875', 'oh'],
  ['2.044875', '2.111625', 'i'],
  ['2.111625', '2.194750', "don't"],
  ['2.479500', '2.537625', 'i'],
  ['2.537625', '2.657625', "don't"],
  ['2.657625', '2.747625', 'know'],
  ['2.747625', '2.837625', 'i'],
  ['2.837625', '2.927625', 'had'],
  ['2.927625', '2.957625', 'a'],
  ['2.957625', '3.107625', 'little'],
  ['3.107625', '3.197625', 'bit'],
  ['3.197625', '3.307625', 'more'],
  ['3.307625', '3.507625', 'time'],
  ['3.507625', '3.577625', 'to'],
  ['3.577625', '3.827625', 'think'],
  ['3.827625', '3.987625', 'about'],
  ['3.987625', '4.047625', 'it'],
  ['4.047625', '4.077625', 'i'],
  ['4.077625', '4.197625', 'was'],
  ['4.197625', '4.517625', 'thinking'],
  ['4.517625', '4.727625', 'of'],
  ['4.727625', '5.037625', 'like']],
 'Time': 4.59}

In [27]:
#get the very last chunk -> final output
myincreco.get_last_chunk()


Out[27]:
{'Chunk': [['1.888250', '2.044875', 'oh'],
  ['2.044875', '2.111625', 'i'],
  ['2.111625', '2.194750', "don't"],
  ['2.479500', '2.537625', 'i'],
  ['2.537625', '2.657625', "don't"],
  ['2.657625', '2.747625', 'know'],
  ['2.747625', '2.837625', 'i'],
  ['2.837625', '2.927625', 'had'],
  ['2.927625', '2.957625', 'a'],
  ['2.957625', '3.107625', 'little'],
  ['3.107625', '3.197625', 'bit'],
  ['3.197625', '3.307625', 'more'],
  ['3.307625', '3.507625', 'time'],
  ['3.507625', '3.577625', 'to'],
  ['3.577625', '3.827625', 'think'],
  ['3.827625', '3.987625', 'about'],
  ['3.987625', '4.047625', 'it'],
  ['4.047625', '4.077625', 'i'],
  ['4.077625', '4.197625', 'was'],
  ['4.197625', '4.517625', 'thinking'],
  ['4.517625', '4.727625', 'of'],
  ['4.727625', '5.037625', 'like'],
  ['5.037625', '5.324375', 'uh']],
 'Time': 5.44}

In addition, you can import an IncReco such as the above as a dictionary of IntervalFrames (one for each chunk):


In [28]:
myrecodict = open_intervalframe_from_increco('sampledata/test.inc_reco')
print myrecodict.keys()


['4.11', '3.45', '5.44', '2.95', '3.15', '4.42', '3.7', '2.8', '4.59', '2.2', '2.0']

In [29]:
#display the final output intervalframe
myrecodict['5.44']


Out[29]:
start_time end_time text
0 1.888250 2.044875 oh
1 2.044875 2.111625 i
2 2.111625 2.194750 don't
3 2.479500 2.537625 i
4 2.537625 2.657625 don't
5 2.657625 2.747625 know
6 2.747625 2.837625 i
7 2.837625 2.927625 had
8 2.927625 2.957625 a
9 2.957625 3.107625 little
10 3.107625 3.197625 bit
11 3.197625 3.307625 more
12 3.307625 3.507625 time
13 3.507625 3.577625 to
14 3.577625 3.827625 think
15 3.827625 3.987625 about
16 3.987625 4.047625 it
17 4.047625 4.077625 i
18 4.077625 4.197625 was
19 4.197625 4.517625 thinking
20 4.517625 4.727625 of
21 4.727625 5.037625 like
22 5.037625 5.324375 uh