Managing Corpora and Resources with mumodo

Mumodo Demo Notebook - Updated on 25.04.2015

Summary: This notebook describes how to create and use mumodo files, which automate data import and create resource objects that abstract away from files

(c) Dialogue Systems Group, University of Bielefeld


In [1]:
import mumodo.corpus as cp
#import pandas as pd
from IPython.display import Audio, HTML

Working with mumodo objects (IntervalFrames, StreamFrames, etc.) is easy, but mumodo goes beyond that: it offers a way of managing multimodal corpora as objects that have resources. Resources abstract away from files, so that you do not need to look around for files when doing analysis. In addition to providing easy access to multimodal resources, mumodo files are a great way to organize corpora by keeping track of all data and metadata files.

Loading mumodos and accessing resources

NOTE: Mumodo uses moviepy (which is based on ffmpeg) to handle audio and video data, as well as PIL (Python Imaging Library) to handle images. Refer to the documentation of those packages for more information.

Let's load a mumodo!


In [2]:
mumodofromfile = cp.read_mumodo_from_file('sampledata/test.mumodo')
mumodofromfile


Out[2]:
[Mumodo
 name: A test corpus
 description: Two people clapping their hands three times
 url: possible.url.goes.gere
 localpath: sampledata/
 files: ['test.mp4', 'test.wav', 'testimage.png', 'test.xio.gz', 'test.TextGrid', 'test.eaf']
 ID: None
 resources: ['clap_points', 'image', 'transcriptionO', 'kinect_body', 'EAF', 'transcriptionS', 'video', 'audio']]

The function returns a list (with only one item in this case) of the mumodos contained in the file we opened. This is the mumodo we need. We can already see the names of the resources it contains.


In [3]:
mymumodo = mumodofromfile[0]
mymumodo.get_resource_names()


Out[3]:
['EAF',
 'audio',
 'clap_points',
 'image',
 'kinect_body',
 'transcriptionO',
 'transcriptionS',
 'video']

Using these names (which must be unique), one can retrieve a resource object:


In [4]:
myresource = mymumodo['transcriptionS']
myresource


Out[4]:
TextGridTierResource
name: transcriptionS
description: transcription of speaker S
filename: test.TextGrid
units: seconds
tiername: S

Mumodos are also iterable, if one needs to get all resources:


In [5]:
#NOTE: uncomment to see long text
#for resource in mymumodo:
#    print resource

Resources have sensible methods, e.g.:


In [6]:
myresource.get_tier()


Out[6]:
    start_time   end_time                                               text
0         1.30   1.860000                                              Hello
1         2.88   3.500000                                        I 'm Spyros
2         4.86   8.280000  Here in the Dialogue Systems Group, in the Uni...
3         8.50  10.400000               We have developed Mumodo, and Venice
4        11.58  11.840000                                             <CLAP>
5        14.10  17.220000  Well, right now we are being recorded by a cam...
6        17.54  18.840314                      and a Microsoft Kinect sensor
7        19.30  21.100000          But how will we get the data from Kinect?
8        27.70  31.480000  We are using this timecode to synchronize the ...
9        31.62  32.860000                           With the audio and video
10       33.76  34.000000                                             <CLAP>
11       37.40  41.300000  We can process the data that comes from Venice...
12       47.60  47.820000                                             <CLAP>
13       48.94  49.660000                                            Goodbye
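Since get_tier() returns a pandas DataFrame, ordinary pandas operations apply to the result. A minimal sketch (using a hypothetical miniature tier with the same columns as above) that computes interval durations and filters out the <CLAP> events:

```python
import pandas as pd

# hypothetical miniature tier with the same columns as the output above
tier = pd.DataFrame({'start_time': [1.30, 11.58, 33.76],
                     'end_time': [1.86, 11.84, 34.00],
                     'text': ['Hello', '<CLAP>', '<CLAP>']})

# duration of each labeled interval
tier['duration'] = tier['end_time'] - tier['start_time']

# keep only the clap events
claps = tier[tier['text'] == '<CLAP>']
```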

Most resources have a get_slice() and a show() method, which do what you expect:


In [7]:
mymumodo['clap_points'].get_slice(10,12)


Out[7]:
       time        mark
0  11.65423  First Clap
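For a point tier, get_slice() simply keeps the points whose times fall within the requested window. A rough sketch of the idea in plain pandas (slice_points is a hypothetical helper, not mumodo's actual code):

```python
import pandas as pd

# the three clap points from the corpus above
points = pd.DataFrame({'time': [11.654230, 33.824485, 47.672685],
                       'mark': ['First Clap', 'Second Clap', 'Third Clap']})

def slice_points(df, start, end):
    # keep only the rows whose time lies within [start, end]
    return df[(df['time'] >= start) & (df['time'] <= end)]

window = slice_points(points, 10, 12)
```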

In [8]:
mymumodo['image'].show()

In [9]:
#this prints the DataFrame, which does not look good in IPython,
#but is needed in other interpreters. You can use get_tier() instead,
#as shown above
mymumodo['clap_points'].show()


        time         mark
0  11.654230   First Clap
1  33.824485  Second Clap
2  47.672685   Third Clap

Running show() on an Audio or Video resource causes it to be played by an external player. IPython will be busy until the external player exits. (Here the player path is not valid, so show() reports that it could not play.)


In [10]:
myvideo = mymumodo['video']
myvideo.set_player('/Set/Path/To/A/Player/')
myvideo.show()


Out[10]:
"I couldn't play"

Instead, one can use IPython functionality to display the AV data directly in the notebook:


In [11]:
Audio(mymumodo['audio'].get_filepath())


Out[11]:

In [12]:
def show_html_video(fname, mimetype):
    """Load the video in the file `fname`, with the given mimetype,
    and display it as an HTML5 video.
    """
    with open(fname, "rb") as f:
        video_encoded = f.read().encode("base64")
    video_tag = '<video controls alt="test" src="data:video/{0};base64,{1}">'.format(mimetype, video_encoded)
    return HTML(data=video_tag)

In [13]:
show_html_video(mymumodo['video'].get_filepath(), 'mp4')


Out[13]:
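Note that .encode("base64") on a byte string only exists on Python 2; on Python 3 the same helper can be written with the standard base64 module (a sketch; show_html_video_py3 is a hypothetical name):

```python
import base64

def show_html_video_py3(fname, mimetype):
    """Python 3 variant: read the raw bytes and base64-encode them with
    the base64 module (bytes objects have no .encode("base64") there)."""
    with open(fname, "rb") as f:
        video_encoded = base64.b64encode(f.read()).decode("ascii")
    # wrap the returned tag in IPython.display.HTML to render it
    return '<video controls src="data:video/{0};base64,{1}">'.format(
        mimetype, video_encoded)
```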

See below for a comprehensive overview of supported resource types.

Creating mumodo files

First, let us look at what a mumodo file looks like:


In [14]:
print cp.serialize_mumodo(mymumodo)


!Mumodo
ID: null
description: Two people clapping their hands three times
files:
- test.mp4
- test.wav
- testimage.png
- test.xio.gz
- test.TextGrid
- test.eaf
localpath: sampledata/
name: A test corpus
parent: null
url: possible.url.goes.gere

!Resource
description: the ELAN file accompanying this data
filename: test.eaf
name: EAF
units: null

!AudioResource
channel: null
description: Audio of interaction
filename: test.wav
name: audio
player: null
units: seconds

!TextGridTierResource
description: point tier with clap events
filename: test.TextGrid
name: clap_points
tiername: CLAPS
units: seconds

!ImageResource
description: Image of interaction
filename: testimage.png
name: image
units: null

!XIOStreamResource
description: Kinect Data of interaction
filename: test.xio.gz
kwargs:
  timestamp_offset: -9616
name: kinect_body
sensorname: VeniceHubReplay/Venice/Body1
units: ms

!TextGridTierResource
description: transcription of speaker O
filename: test.TextGrid
name: transcriptionO
tiername: O
units: seconds

!TextGridTierResource
description: transcription of speaker S
filename: test.TextGrid
name: transcriptionS
tiername: S
units: seconds

!VideoResource
channel: null
description: Video of interaction
filename: test.mp4
name: video
player: /Set/Path/To/A/Player/
units: seconds


There are two ways to create a mumodo file.

  • Type it! This works well with small corpora with few resources.
  • Programmatically, as shown below. This is useful for large corpora with repeated file structures.

In [15]:
#create a mumodo
programmed_mumodo = cp.Mumodo(name='A test corpus',
                   description='Two people clapping their hands three times',
                   url='possible.url.goes.gere',
                   localpath='sampledata/')

In [16]:
#Populate a list of resources
programmed_resources = []
programmed_resources.append(cp.VideoResource(name='video',
                            description='Video of interaction',
                            filename='test.mp4',
                            units='seconds',
                            player='/Set/Path/To/A/Player/'))
programmed_resources.append(cp.AudioResource(name='audio',
                            description='Audio of interaction',
                            filename='test.wav',
                            units='seconds'))
programmed_resources.append(cp.ImageResource(name='image',
                            description='Image of interaction',
                            filename='testimage.png'))
programmed_resources.append(cp.XIOStreamResource(name='kinect_body',
                            description='Kinect Data of interaction',
                            filename='test.xio.gz',
                            units='ms',
                            sensorname='VeniceHubReplay/Venice/Body1',
                            kwargs={'timestamp_offset': -9616}))
programmed_resources.append(cp.TextGridTierResource(name='transcriptionO',
                              description='transcription of speaker O',
                              filename='test.TextGrid',
                              units='seconds',
                              tiername='O'))
programmed_resources.append(cp.TextGridTierResource(name='transcriptionS',
                              description='transcription of speaker S',
                              filename='test.TextGrid',
                              units='seconds',
                              tiername='S'))
programmed_resources.append(cp.TextGridTierResource(name='clap_points',
                              description='point tier with clap events',
                              filename='test.TextGrid',
                              units='seconds',
                              tiername='CLAPS'))
programmed_resources.append(cp.Resource(name='EAF',
                            description='the ELAN file accompanying this data',
                            filename='test.eaf'))

In [17]:
#add the resources to the mumodo!
for resource in programmed_resources:
    programmed_mumodo.add_resource(resource)

In [18]:
#Write the mumodo to a file
cp.write_mumodo_to_file([programmed_mumodo], 'myprogrammed.mumodo')

In [19]:
#Is it the same?
assert(cp.serialize_mumodo(cp.read_mumodo_from_file('myprogrammed.mumodo')[0]) == cp.serialize_mumodo(mymumodo))

Resource Types

Here is a comprehensive list of supported mumodo resource types:

  • Resource
    • TextResource
    • BinaryResource
    • CSVResource
    • ImageResource
    • BaseAVResource
      • AudioResource
      • VideoResource
    • BaseStreamResource
      • XIOStreamResource
      • CSVStreamResource
      • PickledStreamResource
    • BaseTierResource
      • TextGridTierResource
      • CSVTierResource
      • PickledTierResource

The classes whose names start with "Base" are not supposed to be instantiated. However, there is a use for the plain Resource class, namely for non-importable data to be included in the mumodo for completeness.
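The hierarchy can be pictured as plain class inheritance. A toy sketch of the idea (illustrative only, with made-up internals, not mumodo's actual implementation):

```python
class Resource(object):
    """Generic resource: a named pointer to a (possibly non-importable)
    file. Sketch of the idea, not mumodo's actual code."""
    def __init__(self, name, filename, description=''):
        self.name = name
        self.filename = filename
        self.description = description

class BaseTierResource(Resource):
    """'Base' classes define a shared interface and are not meant to be
    instantiated directly."""
    def get_tier(self):
        raise NotImplementedError("use a concrete subclass")

class TextGridTierResource(BaseTierResource):
    """Concrete subclass: knows which tier of the TextGrid file to load."""
    def __init__(self, name, filename, tiername, description=''):
        Resource.__init__(self, name, filename, description)
        self.tiername = tiername

    def get_tier(self):
        # a real implementation would parse the TextGrid file here
        return []
```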