The omicexperiment Python package

The omicexperiment package has the ultimate goal of providing a pleasant API for rapid analysis of 'omic experiments' in an interactive environment.

The R bioinformatics community already provides comparable functionality. Examples include DESeqDataSet (from the package DESeq2), MRExperiment (from the package metagenomeSeq), and phyloseq-class (from the package phyloseq). To my knowledge, nothing comparably powerful is available to Python users.

The philosophy of this package is to build on the solid foundations of the Python scientific stack rather than reinvent the wheel. Packages such as numpy and pandas are powerful, optimized libraries for working with matrix and tabular data, respectively. This package's backend thus consists almost entirely of pandas DataFrames and pandas APIs.

Example code

from omicexperiment.experiment.microbiome import MicrobiomeExperiment
from omicexperiment.transforms.filters import Sample, Taxonomy

mapping = "example_map.tsv"
biom = "example_fungal.biom"
tax = "blast_tax_assignments.txt"

# our Experiment object
exp = MicrobiomeExperiment(biom, mapping, tax)

# include only samples with more than 90000 counts
exp.dapply(Sample.count > 90000)

# collapse the OTUs in the _data_ DataFrame to genus level assignment
exp.dapply(Taxonomy.groupby('genus'))  # any taxonomic rank can be passed
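
Since the backend is built on pandas, it may help to see roughly what the two transforms above amount to in plain pandas. The following is only a minimal sketch and not the package's actual implementation: the count table, the sample and OTU names, and the genus assignments are made up for illustration, and the layout (OTUs as rows, samples as columns) is an assumption.

```python
import pandas as pd

# Toy OTU count table (made up for illustration):
# rows are OTUs, columns are samples. This layout is an assumption.
counts = pd.DataFrame(
    {
        "sample_a": [60000, 35000, 5000],
        "sample_b": [20000, 10000, 1000],
        "sample_c": [50000, 45000, 2000],
    },
    index=["otu_1", "otu_2", "otu_3"],
)

# Hypothetical genus assignment for each OTU.
genus = pd.Series(
    {"otu_1": "Candida", "otu_2": "Candida", "otu_3": "Malassezia"},
    name="genus",
)

# Keep only samples whose total count exceeds 90000.
sample_totals = counts.sum(axis=0)
filtered = counts.loc[:, sample_totals > 90000]

# Collapse OTUs to genus level by summing counts within each genus.
by_genus = filtered.groupby(genus).sum()

print(by_genus)
```

In this sketch the filter is a boolean mask over per-sample totals and the collapse is a groupby on the row index followed by a sum. The package presumably wraps similar steps behind the Sample and Taxonomy filter objects.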


Installation

Just clone the git repository. In the future, the package will be uploaded to PyPI and will be pip-installable.

License

This package is released as open source under a BSD license. Please see COPYING.txt.

Documentation

Please see the accompanying docs file.

Contributing and use in your research

Please be advised that, at this stage, this package was developed as a quick coding hack over the course of one week. Even basic testing is not yet implemented, so be careful. Contributors are welcome to improve the software. I expect the package to keep growing as I add functionality and improvements, for as long as I need it in my own research.

