README

This repository contains all of the analysis notebooks required to reproduce the manuscript:

Analysis of matched tumor and normal profiles reveals common transcriptional and epigenetic signals shared across cancer types.

Andrew M. Gross, Jason F. Kreisberg, Trey Ideker

Analysis

All analysis for the manuscript is recorded in a series of Jupyter (formerly IPython) Notebooks.

To view them, please follow the GitHub or NBViewer links.

Dependencies

This code uses a number of features of the scientific Python stack as well as a small set of standard R libraries. Thus far, it has only been tested in a Linux environment; it may take some modification to run on other operating systems. I highly recommend installing a scientific Python distribution such as Anaconda or Enthought to handle the majority of the Python dependencies in this project (other than rPy2). These are both free for academic use.

Python Dependencies

  • Numpy and Scipy, numeric calculations and statistics in Python
  • matplotlib, plotting in Python
  • Pandas, data-frames for Python, handles the majority of data-structures
  • rPy2, communication between R and Python
    • Not included in the Anaconda or Enthought distributions
    • I recommend installing with pip install rpy2
    • Needs R to be compiled with shared libraries
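Before opening the notebooks, it can help to confirm that the Python dependencies listed above are importable. The following is a minimal sketch using only the standard library; the helper name find_missing and the REQUIRED_MODULES list are illustrative and not part of this repository. It only checks that each module can be located, not that its version is compatible or that rPy2 can find R.

```python
import importlib.util

# Import names for the Python dependencies listed above.
# This list is an assumption based on the README, not a file in the repository.
REQUIRED_MODULES = ["numpy", "scipy", "matplotlib", "pandas", "rpy2"]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing Python dependencies: " + ", ".join(missing))
    else:
        print("All Python dependencies were found.")
```

If rpy2 is reported as missing, install it with pip as recommended above.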

My Internal Package Dependencies

These are Python packages that I use internally for things such as statistics and visualization. They are all available on my GitHub page; I recommend downloading them and installing each one with python setup.py install. I apologize for the generic names. I am hoping to develop these a bit more and make them into proper packages in my next code refactor.

  • Figures

    • Code for better figure generation, mainly using Pandas data-structures
    • I am slowly phasing this out and replacing it with the very nice seaborn library
  • Stats

    • Contains two packages, Stats and Helpers
    • Stats has a number of helper functions that wrap calls to R or scipy statistics functions and allow them to play nicer with Pandas data-structures
    • Helpers has functions for a number of common tasks, which I invoke to make code a bit more readable
  • NotebookImport

    • Utility for importing IPython notebooks as modules
    • Code taken from MinRK's Gist
    • This depends on the IPython/Jupyter version you are using, so you may get deprecation warnings. I am trying to keep this up to date, but I am not sure how backwards compatible things are
  • MethylTools

    • Utility for organizing probe annotations for the Illumina methylation450k chip
    • Has some R dependencies
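To give a rough idea of what importing a notebook as a module (the NotebookImport entry above) involves, here is a simplified, standard-library-only sketch. It is not the NotebookImport code and not MinRK's Gist: the function name load_notebook_as_module is hypothetical, it assumes the nbformat 4 JSON layout (a top-level "cells" list), and it simply drops lines that begin with IPython magics or shell escapes instead of running them through IPython.

```python
import json
import types


def load_notebook_as_module(path, module_name="notebook_module"):
    """Execute the code cells of a .ipynb file and return them as a module.

    Hypothetical helper for illustration; assumes nbformat 4 JSON.
    """
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)

    module = types.ModuleType(module_name)
    module.__file__ = path
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source as either a single string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        # Drop lines starting with % or ! (IPython magics and shell escapes),
        # since plain Python cannot execute them. This is a crude filter.
        lines = [
            line for line in source.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        # Run the cell in the module's namespace so later cells see earlier names.
        exec("\n".join(lines), module.__dict__)
    return module
```

Names defined in the notebook's code cells then become attributes of the returned module. Anything that relies on IPython-specific syntax will not work with this sketch, which is one reason the real utility is tied to the IPython/Jupyter version.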

R Dependencies