Managing Corpora and Resources with mumodo

Mumodo Demo Notebook - Updated on 25.04.2015

Summary: This notebook describes how to create and use mumodo files, which automate data import and create resource objects that abstract away from the underlying files

(c) Dialogue Systems Group, University of Bielefeld


In [1]:
import mumodo.corpus as cp
#import pandas as pd
from IPython.display import Audio, HTML

Working with mumodo objects (IntervalFrames, StreamFrames, etc.) is easy, but mumodo goes beyond that. It offers a way of managing multimodal corpora as objects that have resources. Resources abstract away from files, so that you do not need to look around for files when doing analysis. In addition to providing easy access to multimodal resources, mumodo files are a great way to organize corpora by keeping track of all data and metadata files.

Loading mumodos and accessing resources

NOTE: Mumodo uses moviepy (which is based on ffmpeg) to handle audio and video data, as well as PIL (the Python Imaging Library) to handle images. Refer to the documentation of those packages for more information.

Let's load a mumodo!


In [2]:
mumodofromfile = cp.read_mumodo_from_file('sampledata/test.mumodo')
mumodofromfile


Out[2]:
[Mumodo
 name: A test corpus
 description: Two people clapping their hands three times
 url: possible.url.goes.gere
 localpath:                 sampledata/
 files: ['test.mp4', 'test.wav', 'testimage.png', 'test.xio.gz', 'test.TextGrid', 'test.eaf']
 ID:                 None
 resources: ['clap_points', 'image', 'transcriptionO', 'kinect_body', 'EAF', 'transcriptionS', 'video', 'audio']]

The function returns a list of the mumodos contained in the file we opened (only one in this case). This is the mumodo we need, and we can already see the names of the resources it contains.


In [3]:
mymumodo = mumodofromfile[0]
mymumodo.get_resource_names()


Out[3]:
['EAF',
 'audio',
 'clap_points',
 'image',
 'kinect_body',
 'transcriptionO',
 'transcriptionS',
 'video']

Using these names (which must be unique), one can retrieve a resource object:


In [4]:
myresource = mymumodo['transcriptionS']
myresource


Out[4]:
TextGridTierResource
name: transcriptionS
description: transcription of speaker S
filename: test.TextGrid
units: seconds
tiername: S

Mumodos are also iterable, in case one needs to access all resources:


In [5]:
#NOTE: uncomment to see the full (long) output
#for resource in mymumodo:
#    print(resource)

Resources have sensible methods, e.g.:


In [6]:
myresource.get_tier()


Out[6]:
start_time end_time text
0 1.30 1.860000 Hello
1 2.88 3.500000 I 'm Spyros
2 4.86 8.280000 Here in the Dialogue Systems Group, in the Uni...
3 8.50 10.400000 We have developed Mumodo, and Venice
4 11.58 11.840000 <CLAP>
5 14.10 17.220000 Well, right now we are being recorded by a cam...
6 17.54 18.840314 and a Microsoft Kinect sensor
7 19.30 21.100000 But how will we get the data from Kinect?
8 27.70 31.480000 We are using this timecode to synchronize the ...
9 31.62 32.860000 With the audio and video
10 33.76 34.000000 <CLAP>
11 37.40 41.300000 We can process the data that comes from Venice...
12 47.60 47.820000 <CLAP>
13 48.94 49.660000 Goodbye

Most resources have a get_slice() and a show() method, which do what you would expect:


In [7]:
mymumodo['clap_points'].get_slice(10,12)


Out[7]:
time mark
0 11.65423 First Clap

In [8]:
mymumodo['image'].show()

In [9]:
#show() prints the DataFrame, which does not look as good in IPython,
#but is needed in other interpreters. You can use get_tier() instead,
#as shown above
mymumodo['clap_points'].show()


        time         mark
0  11.654230   First Clap
1  33.824485  Second Clap
2  47.672685   Third Clap
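Point tiers like this one also lend themselves to simple pandas analysis. As a sketch (using the clap times printed above, typed in by hand rather than read from the resource), diff() gives the gaps between consecutive events:

```python
import pandas as pd

# Clap times as printed by show() above
claps = pd.DataFrame({'time': [11.654230, 33.824485, 47.672685],
                      'mark': ['First Clap', 'Second Clap', 'Third Clap']})

# diff() yields the gap between consecutive events (NaN for the first row)
gaps = claps['time'].diff().dropna()
print(gaps.round(2).tolist())  # → [22.17, 13.85]
```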

Running show() on an audio or video resource causes it to be played by an external player. IPython will be busy until the external player exits:


In [10]:
myvideo = mymumodo['video']
myvideo.set_player('/Set/Path/To/A/Player/')
myvideo.show()


Out[10]:
"I couldn't play"

Instead, one can use IPython's display functionality to play the AV data directly in the notebook:


In [11]:
Audio(mymumodo['audio'].get_filepath())


Out[11]:
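Video can be embedded in the notebook in a similar way. IPython's Audio class handles the audio file directly; for video, one option is to build an HTML5 video element and pass it to the HTML class imported above. The helper below is a sketch (video_tag is a hypothetical name, not part of mumodo), assuming the video resource exposes get_filepath() like the audio resource does:

```python
def video_tag(path, width=480):
    """Build an HTML5 snippet for inline playback; pass it to IPython.display.HTML."""
    return '<video controls src="{}" width="{}"></video>'.format(path, width)
```

In the notebook, one would then display it with, e.g., HTML(video_tag(mymumodo['video'].get_filepath())).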