The MEANS package provides a set of routines for serialising (saving) MEANS-specific objects to a human-readable YAML format and deserialising (loading) them back.
These routines all reside inside the means.io package, which is imported by default from the package root.
This tutorial will quickly show how to include them in your workflow.
First, let's import the means package together with the example models:
In [1]:
import means
import means.examples
Let's use a $p53$ model as an example object for serialisation.
In [2]:
p53 = means.examples.MODEL_P53
In [3]:
p53
Out[3]:
Serialisation to a string is done via the means.io.dump(object) function, which takes any serialisable object as its argument:
In [4]:
serialised_p53 = means.io.dump(p53)
In [5]:
print serialised_p53
Note how this format is equally easy for both humans and machines to read. This double readability is the main reason why the YAML format has been chosen to serialise MEANS objects.
Deserialisation can be done via the means.io.load(serialised_string) function, as shown below:
In [6]:
deserialised_p53 = means.io.load(serialised_p53)
In [7]:
deserialised_p53
Out[7]:
Since most serialisation will be done to and from files, MEANS provides a pair of convenience functions for this,
namely the to_file and from_file functions:
In [8]:
# Store P53 to file
means.io.to_file(p53, 'p53-tutorial.txt')
In [9]:
# Read P53 back from the file
p53_from_file = means.io.from_file('p53-tutorial.txt')
In [10]:
p53_from_file
Out[10]:
The serialisation functions dump and to_file handle lists of items equally well. For instance, we could serialise both the $p53$ model and the $Hes1$ model at the same time:
In [11]:
p53_hes1 = [means.examples.MODEL_P53, means.examples.MODEL_HES1]
print means.io.dump(p53_hes1)
The form of the items being serialised is completely free (you can serialise anything using these routines).
However, some objects may not be as human-readable when serialised as MEANS objects are, so take care over what is being serialised when human readability is important.
All MEANS objects also have shorthand methods to_file and from_file to write them to and read them from files:
In [12]:
# Write p53 to file
p53.to_file('p53-shorthand.txt')
In [13]:
# Read it back again from the file;
# this will raise an error if the serialised object is not a Model
means.Model.from_file('p53-shorthand.txt')
Out[13]:
If human readability is not important, binary serialisation using Python's pickle module is suggested instead of the MEANS IO routines, as it is much faster. The following example shows how to serialise a model object to a string using pickle.
In [14]:
# The C-based pickle implementation (cPickle) is faster than the pure-Python one, so use that
import cPickle as pickle
# Highest protocol should provide best compression and speed
pickled_p53 = pickle.dumps(p53, pickle.HIGHEST_PROTOCOL)
depickled_p53 = pickle.loads(pickled_p53)
# Display the unpickled model
depickled_p53
Out[14]:
Similarly, the pickle.dump and pickle.load functions can be used to write the pickled binary data to a file and read it back.
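As a minimal sketch of this file-based variant, the snippet below pickles an object to a temporary file and loads it back. A plain dictionary (`stand_in_model`, a hypothetical name) is used in place of `p53` so that the snippet runs without MEANS installed; any picklable object, including a MEANS model, should be handled the same way. Note that the file has to be opened in binary mode for both writing and reading.

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2: C-based implementation
except ImportError:
    import pickle  # Python 3: the C accelerator is used automatically

# Hypothetical stand-in for a MEANS model; use p53 in the notebook instead
stand_in_model = {'name': 'example', 'parameters': [1.0, 2.5]}

# Reserve a temporary file name so nothing is left in the working directory
fd, path = tempfile.mkstemp(suffix='.pickle')
os.close(fd)

try:
    # Write the pickled bytes to the file (binary mode is required)
    with open(path, 'wb') as f:
        pickle.dump(stand_in_model, f, pickle.HIGHEST_PROTOCOL)

    # Read the object back from the same file
    with open(path, 'rb') as f:
        restored_model = pickle.load(f)
finally:
    # Clean up the temporary file even if pickling fails
    os.unlink(path)

print(restored_model == stand_in_model)
```

Using a temporary file here is only a convenience for the sketch; in a real pipeline the path would normally be chosen by the caller.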
To illustrate the difference between binary and human-readable serialisation, the following tests compare the runtime of a full round trip in each of the two forms:
In [15]:
%timeit means.io.load(means.io.dump(p53))
In [16]:
%timeit pickle.loads(pickle.dumps(p53, pickle.HIGHEST_PROTOCOL))
The binary serialisation is about 28 times faster than the human-readable one. However, the human-readable form is easier to integrate with other processes and is platform-independent, whereas pickle is a Python-specific format.
When designing a pipeline, it is important to take these trade-offs into account and choose the appropriate solution for your particular case.
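Outside IPython, where the `%timeit` magic is not available, a comparable measurement can be sketched with the standard `timeit` module. The snippet below is only an illustration under stated assumptions: `payload` is a hypothetical stand-in for `p53`, and `json` stands in for a text-based format so that the code runs without MEANS installed. The timings it prints will therefore differ from the YAML figures quoted above. In a real comparison the body of `text_round_trip` would be `means.io.load(means.io.dump(p53))`.

```python
import json
import timeit

try:
    import cPickle as pickle  # Python 2: C-based implementation
except ImportError:
    import pickle  # Python 3

# Hypothetical stand-in payload; in the notebook this would be p53
payload = {'name': 'example', 'values': list(range(1000))}


def text_round_trip():
    # json is used here only as a stand-in for a human-readable format
    return json.loads(json.dumps(payload))


def binary_round_trip():
    # Same call pattern as the pickle example in the tutorial
    return pickle.loads(pickle.dumps(payload, pickle.HIGHEST_PROTOCOL))


# Run each round trip the same number of times and report the mean duration
repeats = 1000
text_seconds = timeit.timeit(text_round_trip, number=repeats)
binary_seconds = timeit.timeit(binary_round_trip, number=repeats)

print('text:   %.6f s per round trip' % (text_seconds / repeats))
print('binary: %.6f s per round trip' % (binary_seconds / repeats))
```

Measuring the complete dump-and-load round trip, as the notebook cells do, keeps the comparison fair when one direction of a format is much slower than the other.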
In [17]:
import os
# Cleanup:
os.unlink('p53-tutorial.txt')
os.unlink('p53-shorthand.txt')