This notebook creates the BED file provided in
WARNING: you need an account on synapse to get the FastQ files.
If you just want to test the BED file, download it directly:
wget https://github.com/sequana/resources/raw/master/coverage/JB409847.bed.bz2
and jump to the section Using-Sequana-library-to-detect-ROI-in-the-coverage-data
otherwise, first download the FastQ from Synapse, its reference genome and its genbank annotation. Then, map the reads using BWA to get a BAM file. The BAM file is converted to a BED, which is going to be one input file to our analysis. Finally, we use the coverage tool from Sequana project (i) with the standalone (sequana_coverage) and (ii) the Python library to analyse the BED file.
Versions used:
In [1]:
%pylab inline
matplotlib.rcParams['figure.figsize'] = [10,7]
In [2]:
!sequana_coverage --download-reference JB409847 --download-genbank JB409847
In [3]:
# to install synapseclient, use
# pip install synapseclient
import synapseclient
l = synapseclient.login()
_ = l.get("syn10638367", downloadLocation=".", ifcollision="overwrite.local")
In [4]:
!sequana_mapping --file1 JB409847_R1_clean.fastq.gz --reference JB409847.fa
In [5]:
!bedtools genomecov -d -ibam JB409847.fa.sorted.bam> JB409847.bed
In [6]:
from sequana import GenomeCov
b = GenomeCov("JB409847.bed", "JB409847.gbk")
chromosome = b.chr_list[0]
chromosome.running_median(4001, circular=True)
chromosome.compute_zscore(k=2)
# you can replace the 2 previous lines by since version 0.6.4
# chromosome.run(4001, k=2, circular=True)
In [7]:
chromosome.plot_coverage()
In [ ]: