Mumodo Demo Notebook - Updated on 25.04.2015
Summary: This notebook describes how to create and use mumodo files, which automate data import and create resource objects that abstract away from files.
(c) Dialogue Systems Group, University of Bielefeld
In [1]:
import mumodo.corpus as cp
#import pandas as pd
from IPython.display import Audio, HTML
Working with mumodo objects (IntervalFrames, StreamFrames, etc) is easy, but mumodo goes beyond that. It offers a way of managing multimodal corpora as objects that have resources. Resources abstract away from files, so that you do not need to look around for files when doing analysis. In addition to easy access to multimodal resources, mumodo files are a great way to organize corpora by keeping track of all data and metadata files.
NOTE: Mumodo uses moviepy (which is based on ffmpeg) to handle audio and video data, and PIL (Python Imaging Library) to handle images. Refer to the documentation of those packages for more information.
Let's load a mumodo!
In [2]:
mumodofromfile = cp.read_mumodo_from_file('sampledata/test.mumodo')
mumodofromfile
Out[2]:
The function returns a list of the mumodos contained in the file we opened (only one item in this case). That single item is the mumodo we need. We can already see the names of the resources it contains.
In [3]:
mymumodo = mumodofromfile[0]
mymumodo.get_resource_names()
Out[3]:
Using these names, which must be unique, one can retrieve a resource object:
In [4]:
myresource = mymumodo['transcriptionS']
myresource
Out[4]:
Mumodos are also iterable, in case one needs to go through all resources:
In [5]:
#NOTE: uncomment to see long text
#for resource in mymumodo:
#    print(resource)
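The access pattern shown so far (lookup by unique name, plus iteration over all resources) can be mimicked with a small stand-in class. The sketch below is not mumodo's implementation. `ToyResource`, `ToyCorpus`, `add_resource` and the file paths are hypothetical names invented for illustration; only `get_resource_names()`, the `['name']` lookup and the iteration mirror what the cells above do.

```python
class ToyResource(object):
    """Hypothetical stand-in for a resource: a name plus the file behind it."""

    def __init__(self, name, filepath):
        self.name = name
        self.filepath = filepath

    def __repr__(self):
        return 'ToyResource({!r}, {!r})'.format(self.name, self.filepath)


class ToyCorpus(object):
    """Hypothetical stand-in for a mumodo: resources addressed by unique name."""

    def __init__(self):
        self._resources = {}
        self._order = []  # remembers insertion order for listing and iteration

    def add_resource(self, resource):
        # Names act as lookup keys, so a duplicate would make lookup ambiguous.
        if resource.name in self._resources:
            raise ValueError('duplicate resource name: ' + resource.name)
        self._resources[resource.name] = resource
        self._order.append(resource.name)

    def get_resource_names(self):
        return list(self._order)

    def __getitem__(self, name):
        # Enables corpus['some_name']
        return self._resources[name]

    def __iter__(self):
        # Enables "for resource in corpus"
        return (self._resources[name] for name in self._order)


corpus = ToyCorpus()
# The paths below are made up; the analysis code never has to mention them again.
corpus.add_resource(ToyResource('transcriptionS', 'toy/transcription.TextGrid'))
corpus.add_resource(ToyResource('clap_points', 'toy/claps.csv'))

names = corpus.get_resource_names()
first = corpus['transcriptionS']
all_resources = list(corpus)
```

Once the resources are registered, analysis code refers to them by name only, which is the sense in which resources abstract away from files.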
Resources have sensible methods, e.g.:
In [6]:
myresource.get_tier()
Out[6]:
Most resources have a get_slice() and a show() method, which do what you expect:
In [7]:
mymumodo['clap_points'].get_slice(10,12)
Out[7]:
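As a rough illustration of what a slice of time-stamped points could look like, here is a minimal sketch in plain Python. It assumes that the two arguments are a start and an end time and that both boundaries are inclusive. Neither assumption is confirmed by this notebook, and `slice_points` and the sample data are hypothetical, not part of mumodo.

```python
def slice_points(points, start, end):
    """Keep the (time, label) pairs whose time lies in [start, end]."""
    return [(t, label) for (t, label) in points if start <= t <= end]


# Made-up clap times, for illustration only
claps = [(9.5, 'clap'), (10.2, 'clap'), (11.7, 'clap'), (12.4, 'clap')]

window = slice_points(claps, 10, 12)
```

Here `window` holds only the two points at 10.2 and 11.7, since the other two fall outside the requested range.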
In [8]:
mymumodo['image'].show()
In [9]:
#this prints the DataFrame, which does not look good in IPython,
#but is needed in other interpreters. You can use get_tier() instead,
#as shown above
mymumodo['clap_points'].show()
Running show() on an audio or video resource causes it to be played by an external player. IPython will be busy until the external player exits.
In [10]:
myvideo = mymumodo['video']
myvideo.set_player('/Set/Path/To/A/Player/')
myvideo.show()
Out[10]:
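Why the interpreter stays busy can be illustrated with the standard library. The sketch below assumes that the player is started as a child process and waited on. How mumodo actually launches the player is not shown in this notebook, and `play_blocking`, `fake_player` and `clip.wav` are hypothetical names.

```python
import subprocess
import sys


def play_blocking(player_command, media_path):
    """Run an external player on media_path and wait until it exits."""
    # subprocess.call returns only after the child process has terminated,
    # so the calling interpreter is blocked for the whole playback.
    return subprocess.call(list(player_command) + [media_path])


# Stand-in "player": a Python one-liner that exits immediately.
fake_player = [sys.executable, '-c', 'import sys; sys.exit(0)']
exit_code = play_blocking(fake_player, 'clip.wav')
```

With a real media player in place of `fake_player`, the call would return only once the player window is closed.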
Instead, one can use IPython functionality to display the AV data in a notebook
In [11]:
Audio(mymumodo['audio'].get_filepath())
Out[11]:
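Video can be embedded in a similar way through the `HTML` class imported in the first cell. The helper below is a minimal sketch: `video_tag` and the example path are hypothetical, and it only builds the markup for an HTML5 `<video>` element.

```python
from xml.sax.saxutils import quoteattr


def video_tag(filepath, width=480):
    """Build an HTML5 <video> element pointing at filepath."""
    # quoteattr escapes the path and wraps it in quotes for use as an attribute
    return '<video controls width="{0}" src={1}></video>'.format(
        int(width), quoteattr(filepath))


# Made-up path, for illustration only
tag = video_tag('sampledata/test.mp4')
```

In a notebook, the result could then be displayed with `HTML(video_tag(mymumodo['video'].get_filepath()))`. This assumes that the video resource offers `get_filepath()` like the audio resource above, and that the file sits where the notebook server can serve it and is in a format the browser can play.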