In [21]:
#### REMOVE in README.md ####
import JGV as package
from IPython.display import display, Markdown
display(Markdown("# {} {} package documentation\n".format(package.__name__, package.__version__)))
display(Markdown("\n---\n"))
display(Markdown("\n**{}**\n".format(package.__description__)))
display(Markdown("\n---\n"))
display(Markdown("{}\n".format(package.__long_description__)))
display(Markdown("* Author: {} - {}\n".format(package.__author__, package.__email__)))
display(Markdown("* URL: {}\n".format(package.__url__)))
display(Markdown("* Licence: {}\n".format(package.__licence__)))
#############################
Ideally, before installation, create a clean python3 virtual environment to deploy the package, using virtualenvwrapper for example (see http://www.simononsoftware.com/virtualenv-tutorial-part-2/).
In [20]:
#### REMOVE in README.md ####
import JGV as package
from IPython.display import display, Markdown
if "__install_requires__" in package.__dict__:
    display(Markdown("## Python packages dependencies:\n"))
    for dep in package.__install_requires__:
        display(Markdown("* {}\n".format(dep)))
#############################
Install the package with pip3. All the required dependencies will be automatically installed. With pip 19.0 or later, omit --process-dependency-links, which those versions no longer accept.
In [ ]:
pip3 install git+https://github.com/a-slide/JupyterGenoViewer.git --process-dependency-links
To update the package:
In [ ]:
pip3 install git+https://github.com/a-slide/JupyterGenoViewer.git --upgrade --process-dependency-links
The package is meant to be used in Jupyter Notebook 4.0.0 or later.
Launch the notebook from a terminal:
In [ ]:
jupyter notebook
If your web browser does not launch automatically, open the following URL manually: http://localhost:8888/tree
From the Jupyter home page, navigate to the directory you want to work in, then create a new Python3 notebook.
In the notebook, import matplotlib and use the Jupyter magic command to enable direct plotting in the current notebook.
In [36]:
import matplotlib.pyplot as pl
%matplotlib inline
Default matplotlib parameters can be defined at the beginning of the notebook as well (see http://matplotlib.org/users/customizing.html for more options).
In [37]:
pl.rcParams['figure.figsize'] = 20,7
pl.rcParams['font.family'] = 'sans-serif'
pl.rcParams['font.sans-serif'] = ['DejaVu Sans']
pl.style.use('ggplot')
JGV is first initialized with a reference genome. Then annotation and alignment files can be added. Finally, coverage and feature localization plots can be generated.
Each function has specific options that are detailed in the testing notebook provided with the package, also available as an HTML version on nbviewer: Test_notebook
In [24]:
from JGV.JGV import JGV
One can also import the jprint and jhelp functions from JGV to get improved versions of the default print and help functions in Jupyter:
In [25]:
from JGV.JGV import jhelp, jprint
Sample test files can be loaded from the package as well:
In [26]:
example_bam = JGV.example_bam()
example_fasta = JGV.example_fasta()
example_gtf = JGV.example_gtf()
example_gff3 = JGV.example_gff3()
jprint(example_bam)
jprint(example_fasta)
jprint(example_gtf)
jprint(example_gff3)
JGV starts by creating a Reference object from a fasta file:
In [27]:
j = JGV(fp=example_fasta, verbose=True)
One can also give a list of chromosomes to select from the fasta file:
In [28]:
j = JGV(fp=example_fasta, verbose=True, ref_list=["I","II","III"])
Finally, instead of a fasta file, one can provide a tab-separated index file containing at least 2 columns, the refid (chromosome name) and the length of the sequence, such as a fasta index created by faidx or written with the output_index option of JGV:
In [29]:
j = JGV(fp=example_fasta, verbose=True, output_index=True)
In [30]:
index = "/home/aleg/Programming/Python3/JupyterGenoViewer/JGV/data/yeast.tsv"
j = JGV(index, verbose=True)
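For illustration, a two-column index of the kind described above can be built from a fasta file with plain Python. This is a minimal sketch, not JGV's own code: the function name fasta_to_index and the output layout (refid, tab, length, no header line) are assumptions, so check a file written with output_index=True for the exact format JGV expects.

```python
import io

def fasta_to_index(handle):
    """Return a list of (refid, length) tuples read from an open fasta handle.

    Hypothetical helper, not part of JGV.
    """
    index = []
    refid, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: store the previous one, if any
            if refid is not None:
                index.append((refid, length))
            fields = line[1:].split()
            # Keep only the first word of the header as the reference id
            refid, length = (fields[0] if fields else ""), 0
        else:
            length += len(line)
    if refid is not None:
        index.append((refid, length))
    return index

# Tiny in-memory fasta standing in for a real file
fasta = io.StringIO(">I some description\nACGT\nAC\n>II\nGGG\n")
index = fasta_to_index(fasta)
for refid, length in index:
    print("{}\t{}".format(refid, length))
```

Sequence lines are summed per record, so multi-line fasta records are handled without loading whole sequences into memory.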
Once initialized, a JGV object can parse and save annotation files (gff3, gtf and bed).
In [31]:
j.add_annotation(example_gtf, name="yeastMine")
Several annotations can be loaded. Warnings are raised if chromosomes found in the reference sequence have no feature in the annotation file.
In [32]:
j.add_annotation(example_gff3, name="Ensembl")
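The kind of check behind such a warning can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: refids_without_features is a hypothetical helper, the annotation lines are made-up gff3-like records, and JGV's actual check may be implemented differently.

```python
def refids_without_features(reference_ids, annotation_lines):
    """Return the reference ids that never appear in column 1 of the annotation.

    Hypothetical helper, not part of JGV.
    """
    seen = set()
    for line in annotation_lines:
        # Skip blank lines and gff/gtf comment or directive lines
        if not line.strip() or line.startswith("#"):
            continue
        seen.add(line.split("\t")[0])
    return [refid for refid in reference_ids if refid not in seen]

reference_ids = ["I", "II", "III"]
# Made-up records in the 9-column gff3 layout
annotation_lines = [
    "##gff-version 3",
    "I\tsrc\tgene\t10\t200\t.\t+\t.\tID=g1",
    "III\tsrc\tgene\t5\t90\t.\t-\t.\tID=g2",
]
missing = refids_without_features(reference_ids, annotation_lines)
for refid in missing:
    print("Warning: no feature found for reference sequence {}".format(refid))
```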
Information about the annotations can be obtained with annotation_summary
In [33]:
j.annotation_summary()
JGV objects can also parse and compute the coverage from alignment files (bam, sam and bed).
In [34]:
j.add_alignment(example_bam, name="RNA-Seq")
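To make the notion of coverage concrete, here is a minimal sketch that counts how many intervals overlap each position. It assumes bed-style 0-based, half-open intervals and uses a hypothetical helper, interval_coverage; it does not reproduce how JGV reads bam or sam files or stores coverage internally.

```python
from collections import Counter

def interval_coverage(intervals):
    """Count, for each position, how many half-open [start, end) intervals cover it.

    Hypothetical helper, not part of JGV.
    """
    depth = Counter()
    for start, end in intervals:
        for pos in range(start, end):
            depth[pos] += 1
    return depth

# Three made-up aligned reads on the same reference sequence
reads = [(0, 5), (3, 8), (4, 6)]
depth = interval_coverage(reads)
for pos in sorted(depth):
    print(pos, depth[pos])
```

A per-position counter is easy to follow but slow for real data; binning positions or using a sweep over interval ends would scale better.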
Similar to annotation, JGV also has an alignment_summary function
In [35]:
j.alignment_summary()
A simple visualization gives a first idea of the sequencing coverage, with many customization options:
In [24]:
r = j.refid_coverage_plot()
In [31]:
r = j.refid_coverage_plot(norm_depth=False, norm_len=False, log=True, color="dodgerblue", alpha=0.5)
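The effect of options like these can be sketched on plain numbers. The semantics below are assumptions made for illustration, not taken from the JGV source: norm_depth is assumed to divide by the total read count, norm_len by the reference length, and log to apply a log10 transform. The helper normalize_counts and the counts are hypothetical; use jhelp on the function for the documented behaviour.

```python
import math

def normalize_counts(counts, lengths, norm_depth=True, norm_len=True, log=False):
    """Scale per-reference read counts (assumed semantics, not JGV's code)."""
    total = sum(counts.values())
    out = {}
    for refid, n in counts.items():
        value = float(n)
        if norm_depth and total:
            value /= total            # fraction of all reads
        if norm_len:
            value /= lengths[refid]   # per base of reference
        if log:
            value = math.log10(value) if value > 0 else float("-inf")
        out[refid] = value
    return out

# Made-up read counts and reference lengths
counts = {"I": 100, "II": 300}
lengths = {"I": 1000, "II": 2000}
raw = normalize_counts(counts, lengths, norm_depth=False, norm_len=False)
scaled = normalize_counts(counts, lengths)
logged = normalize_counts(counts, lengths, norm_depth=False, norm_len=False, log=True)
print(raw, scaled, logged)
```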
interval_plot is the most useful function of the package. It has a large panel of options to customize the plots and adapts automatically to plot all the annotations and alignment coverage over a defined genomic interval or an entire chromosome:
In [33]:
j.interval_plot("VI", feature_types=["gene", "transcript", "CDS"])
In [37]:
j.interval_plot("VI", start=220000, end=225000)
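Restricting a plot to a genomic interval comes down to keeping the features that overlap it. The following is a minimal sketch of that selection, assuming closed [start, end] coordinates as in gff3 and gtf; features_in_interval is a hypothetical helper and the feature records are made up, so this is not how interval_plot is implemented.

```python
def features_in_interval(features, refid, start, end):
    """Keep features on refid that overlap the closed interval [start, end].

    Hypothetical helper, not part of JGV.
    """
    return [
        f for f in features
        if f["refid"] == refid and f["start"] <= end and f["end"] >= start
    ]

# Made-up features; coordinates are for illustration only
features = [
    {"refid": "VI", "start": 219000, "end": 220500, "type": "gene"},
    {"refid": "VI", "start": 226000, "end": 227000, "type": "gene"},
    {"refid": "V", "start": 221000, "end": 222000, "type": "gene"},
    {"refid": "VI", "start": 221000, "end": 224000, "type": "CDS"},
]
selected = features_in_interval(features, "VI", 220000, 225000)
for f in selected:
    print(f["type"], f["start"], f["end"])
```

A feature that only partially overlaps the window, like the first one here, is kept, which matches what one usually wants when drawing features that extend past the plot edges.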