In this tutorial, we show how to read several data sets and save the results in a single file.
In the YEAST case, we have 36 experiments stored in 6 files called alpha0, alpha1, alpha5, alpha10, alpha20 and alpha45.
We first want to read all of them and build a single dataframe. This can be done using the class called MassSpecAlignmentYeast.
In [1]:
%pylab inline
from msdas import *
from msdas import yeast
By default, if you read a file called alpha0, columns with measurements are renamed with the filename as a prefix. E.g., a column called t0 is renamed as alpha0_t0. This avoids clashes between identical names across files (t0 may appear in all files). If you want to use specific prefixes instead, they can be provided with the prefixes argument, as in the example further down.
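To make the renaming rule concrete, here is a minimal pandas sketch of the same idea. It is only an illustration and not the msdas implementation: the helper prefix_measurement_columns, the Protein column and the toy values are all hypothetical.

```python
import pandas as pd


def prefix_measurement_columns(df, prefix, measurement_columns):
    """Return a copy of df with the given columns renamed to '<prefix>_<name>'.

    Hypothetical helper written for this illustration; it is not part of msdas.
    """
    mapping = {name: f"{prefix}_{name}" for name in measurement_columns}
    return df.rename(columns=mapping)


# Toy tables standing in for two input files (column names and values are made up)
alpha0 = pd.DataFrame({"Protein": ["P1", "P2"], "t0": [1.0, 2.0], "t1": [1.5, 2.5]})
alpha1 = pd.DataFrame({"Protein": ["P1", "P2"], "t0": [0.9, 2.1], "t1": [1.4, 2.6]})

# Both files contain t0 and t1; prefixing makes the names unique across files
renamed0 = prefix_measurement_columns(alpha0, "alpha0", ["t0", "t1"])
renamed1 = prefix_measurement_columns(alpha1, "alpha1", ["t0", "t1"])

print(list(renamed0.columns))
print(list(renamed1.columns))
```

Only the measurement columns are renamed; shared annotation columns keep their names so that they can still be used to match rows between files.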
In [2]:
filenames = yeast.get_yeast_filenames()
In [3]:
import pandas as pd
df1 = pd.read_csv(filenames[0])
df2 = pd.read_csv(filenames[1])
df1.columns
Out[3]:
In [4]:
df2.columns
Out[4]:
In [5]:
m = MassSpecAlignmentYeast(filenames, prefixes=["a0", "a1", "a5", "a10", "a20", "a45"], verbose=False)
We have merged the 6 yeast data sets. The data is now available as a dataframe in m.df.
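For readers who want an intuition of what merging several tables into one dataframe can look like, the sketch below uses plain pandas on toy data. It assumes an outer join on a shared identifier column; the Identifier column name and the values are hypothetical, and the actual alignment performed by MassSpecAlignmentYeast may differ.

```python
from functools import reduce

import pandas as pd

# Toy tables with already prefixed measurement columns (values are made up).
# Each table lacks one of the three identifiers.
tables = [
    pd.DataFrame({"Identifier": ["P1_S10", "P2_S20"], "a0_t0": [1.0, 2.0]}),
    pd.DataFrame({"Identifier": ["P2_S20", "P3_S30"], "a1_t0": [2.1, 3.1]}),
    pd.DataFrame({"Identifier": ["P1_S10", "P3_S30"], "a5_t0": [0.8, 2.9]}),
]

# Outer-join the tables one after the other so that no identifier is lost;
# measurements that are absent from a table become NaN in the result.
merged = reduce(
    lambda left, right: pd.merge(left, right, on="Identifier", how="outer"),
    tables,
)

print(merged.shape)
print(merged.columns.tolist())
```

An outer join is used here so that an identifier seen in only some of the files is kept in the result, at the cost of missing values in the other columns.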
In [6]:
# .ix was removed in pandas 1.0; .loc selects by label (assumes the default integer index)
m.df.loc[0:3]
Out[6]:
In [7]:
m.df.columns
Out[7]:
In [8]:
m.df.shape
Out[8]:
In [9]:
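from easydev import TempFile
# Wrap the aligned data in a reader so that it can be exported
r = readers.MassSpecReader(m)
f = TempFile()  # a temporary named file
r.to_csv(f.name)  # save the merged data set as a single CSV file
f.delete()  # remove the temporary file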
In [10]:
r.plot_phospho_stats()