In [20]:
config = {
    'data_folder': 'probes',
    'RUN'        : 'run_BWA_JKL_23_JU_1_orig',
    'chromosome' : 'SL2.40ch12',
    'scaffold'   : 'SL2.40sc05611',
    'out_folder' : 'reports'
}

# Optional BAC-specific overrides; disabled unless this guard is changed to True.
if False:
    config['BAC'            ] = 'JBPP0904'
    config['BAC_coord_start'] = 967164
    config['BAC_coord_end'  ] = 970727
    config['BAC_coord'      ] = '%012d-%012d' % ( config['BAC_coord_start'], config['BAC_coord_end'  ] )
    config['RUN'            ] = 'run_BWA_JKL_23_JU_1_orig_PROBES'
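
Note on the `BAC_coord` format: `'%012d-%012d'` zero-pads each coordinate to 12 digits. A likely reason (an assumption, the notebook does not state it) is that padded coordinate strings sort lexicographically in the same order as the numbers. A minimal sketch using the values from the disabled block above:

```python
# Same coordinates as in the disabled BAC block of the config cell.
bac_coord_start = 967164
bac_coord_end = 970727

# %012d pads each integer with leading zeros to a width of 12 characters.
bac_coord = '%012d-%012d' % (bac_coord_start, bac_coord_end)
print(bac_coord)  # 000000967164-000000970727
```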

In [21]:
%run -i probes_cfg.ipynb

In [22]:
%run -i probes_cfg_header.ipynb


Singularity Report

Run: run_BWA_JKL_23_JU_1_orig

Chromosome: SL2.40ch12

Scaffold: SL2.40sc05611

Config
	RUN            : run_BWA_JKL_23_JU_1_orig
	chromosome     : SL2.40ch12
	data_folder    : probes
	out_extensions : ['eps', 'png', 'pdf']
	out_folder     : reports
	scaffold       : SL2.40sc05611
Max Rows     : 10000
Column Names
	K-mer Coverage
	Sequencing Coverage
	Ns
	AGP Contig
	AGP Gap
	K-mer Coverage averaged: 500 bp
	K-mer Coverage averaged: 5 Kbp
	K-mer Coverage averaged: 1 Mbp
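
The "averaged" columns read like moving averages of the per-base K-mer coverage over windows of the stated size. The windowing code lives in `probes_cfg_images.ipynb` and is not shown here, so the following is only a sketch under assumptions: a centered window for the plain columns, and trailing/leading windows for the "before"/"after" variants that appear later in the statistics. It uses `Series.rolling` from current pandas; the toy data and the window size are hypothetical.

```python
import pandas as pd

# Toy per-position coverage; the real column has one row per base.
cov = pd.Series([0.0, 0.5, 1.0, 1.0, 0.5, 0.0, 1.0], name='K-mer Coverage')

window = 3  # hypothetical window; the report uses 500 bp up to 1 Mbp

# Centered window: each position is averaged with neighbours on both sides.
centered = cov.rolling(window, center=True, min_periods=1).mean()

# Trailing window (assumed meaning of "before"): the position and those preceding it.
before = cov.rolling(window, min_periods=1).mean()

# Leading window (assumed meaning of "after"): reverse, trailing mean, reverse back.
after = cov[::-1].rolling(window, min_periods=1).mean()[::-1]

print(pd.DataFrame({'centered': centered, 'before': before, 'after': after}))
```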

In [23]:
%run -i probes_cfg_images.ipynb


Populating the interactive namespace from numpy and matplotlib
WARNING: pylab import has clobbered these variables: ['axes', 'f', 'axis']
`%matplotlib` prevents importing * from pylab and numpy

Files

Input Files

KmerCoverageFile      : True  probes/run_BWA_JKL_23_JU_1_orig/S_lycopersicum_chromosomes.fa_S_lycopersicum_scaffolds.sam.SL2.40ch12.sam.SL2.40sc05611.sam.cov.prop.cov.gz
SequencingCoverageFile: True  probes/mapping/out/S_lycopersicum_chromosomes.fa_S_lycopersicum_chromosomes.pos.SL2.40ch12.pos.cov.SL2.40sc05611.cov.gz
AgpContigFile         : True  probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.contig.agp.cov.gz
AgpGapFile            : True  probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.gap.agp.cov.gz
AgpOtherFile          : True  probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.other.agp.cov.gz
AgpUnknownFile        : True  probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.unknown.agp.cov.gz
NsFile                : True  probes/Ns/S_lycopersicum_scaffolds.fa_NONE.tab.SL2.40ch12.tab.SL2.40sc05611.tab.cov.gz

all files present
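
The `True` flag printed next to each input path is presumably the result of an existence check. A minimal sketch of such a check follows; the helper name `check_inputs` and the demo paths are hypothetical and not taken from the notebook.

```python
import os
import tempfile

def check_inputs(files):
    """Return {label: (exists, path)} for a mapping of label -> path."""
    return {label: (os.path.exists(path), path) for label, path in files.items()}

# Demo with one temporary file that exists and one made-up path that does not.
with tempfile.NamedTemporaryFile(suffix='.cov.gz') as tmp:
    status = check_inputs({
        'KmerCoverageFile': tmp.name,
        'NsFile': tmp.name + '.missing',
    })

for label, (exists, path) in sorted(status.items()):
    print('%-22s: %-5s %s' % (label, exists, path))

if all(exists for exists, _ in status.values()):
    print('all files present')
else:
    print('some files are missing')
```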

Output Files

Combined graph                  :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.pdf
Gaps Distribution               :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.pdf
K-mer Coverage Distribution     :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.pdf
K-mer Coverage Stats            :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.pdf
Ns Distribution                 :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.pdf
Sequencing Coverage Distribution:
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.pdf
Sequencing Coverage Stats       :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.pdf
all_data                        :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
all_data_full                   :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
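
The image paths above follow a regular pattern: `<out_folder>/<ext>/<RUN>_<chromosome>_<scaffold>_prop.<name>.<ext>`, with spaces in the plot name replaced by underscores. The sketch below reproduces that pattern; `output_path` is a hypothetical helper, not the notebook's own function, and it covers only the image outputs (the CSV entries keep a `.csv` suffix regardless of folder).

```python
def output_path(config, name, ext):
    """Build <out_folder>/<ext>/<RUN>_<chromosome>_<scaffold>_prop.<name>.<ext>."""
    base = '%s_%s_%s_prop.%s.%s' % (
        config['RUN'], config['chromosome'], config['scaffold'],
        name.replace(' ', '_'), ext)
    return '/'.join([config['out_folder'], ext, base])

config = {
    'RUN': 'run_BWA_JKL_23_JU_1_orig',
    'chromosome': 'SL2.40ch12',
    'scaffold': 'SL2.40sc05611',
    'out_folder': 'reports',
}

for ext in ['eps', 'png', 'pdf']:
    print(' - ' + output_path(config, 'Combined graph', ext))
```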

Read Files

K-mer Coverage File

probes/run_BWA_JKL_23_JU_1_orig/S_lycopersicum_chromosomes.fa_S_lycopersicum_scaffolds.sam.SL2.40ch12.sam.SL2.40sc05611.sam.cov.prop.cov.gz
Loaded 1203463 rows and 9 columns

Sequencing Coverage File

probes/mapping/out/S_lycopersicum_chromosomes.fa_S_lycopersicum_chromosomes.pos.SL2.40ch12.pos.cov.SL2.40sc05611.cov.gz
Loaded 1203463 rows and 2 columns

AGP

Contig

probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.contig.agp.cov.gz
Loaded 1203463 rows and 2 columns

Gap

probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.gap.agp.cov.gz
Loaded 1203463 rows and 2 columns

Unknown

probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.unknown.agp.cov.gz
Loaded 1203463 rows and 2 columns

Other

probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.other.agp.cov.gz
Loaded 1203463 rows and 2 columns

Ns

probes/Ns/S_lycopersicum_scaffolds.fa_NONE.tab.SL2.40ch12.tab.SL2.40sc05611.tab.cov.gz
Loaded 1203463 rows and 2 columns

Merge

Saved 1203463 rows and 15 columns
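
The column count is consistent with a join on a shared position column: 9 columns from the K-mer coverage file plus one value column from each of the six 2-column files gives 15. The merge code itself is not shown, so the following is a sketch with toy tables, assuming an inner join on a column named `Position`.

```python
import pandas as pd

# Toy stand-ins for the loaded tables: each has a Position column plus values.
kmer = pd.DataFrame({'Position': [0, 1, 2], 'K-mer Coverage': [1.0, 0.5, 0.0]})
seq = pd.DataFrame({'Position': [0, 1, 2], 'Sequencing Coverage': [92, 48, 0]})
ns = pd.DataFrame({'Position': [0, 1, 2], 'Ns': [0, 0, 1]})

merged = kmer
for table in (seq, ns):
    # Inner join on Position keeps exactly one row per base.
    merged = merged.merge(table, on='Position')

print('Saved %d rows and %d columns' % merged.shape)  # Saved 3 rows and 4 columns
```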

Stats

Describe

             Position  K-mer Coverage  K-mer Coverage averaged: 500 bp  \
count  1203463.000000  1203463.000000                   1203463.000000   
mean    601731.000000        0.723294                         0.721793   
std     347409.987842        0.377425                         0.300210   
min          0.000000        0.000000                         0.000000   
25%     300865.500000        0.521739                         0.622060   
50%     601731.000000        0.956522                         0.851688   
75%     902596.500000        1.000000                         0.934132   
max    1203462.000000        1.000000                         0.998004   

       K-mer Coverage averaged: 2.5 Kbp  K-mer Coverage averaged: 5 Kbp  \
count                    1203463.000000                  1203463.000000   
mean                           0.722996                        0.723085   
std                            0.252134                        0.225579   
min                            0.000000                        0.000000   
25%                            0.647289                        0.660703   
50%                            0.814283                        0.794441   
75%                            0.899640                        0.876877   
max                            0.994767                        0.984203   

       K-mer Coverage averaged: 50 Kbp  K-mer Coverage averaged: 1 Mbp  \
count                   1203463.000000                  1203463.000000   
mean                          0.722214                        0.716346   
std                           0.124127                        0.041315   
min                           0.000000                        0.000000   
25%                           0.662426                        0.703745   
50%                           0.759005                        0.725087   
75%                           0.809083                        0.733580   
max                           0.877406                        0.821217   

       K-mer Coverage averaged: 5 Kbp before  \
count                         1203463.000000   
mean                                0.723151   
std                                 0.225565   
min                                 0.000000   
25%                                 0.661111   
50%                                 0.794493   
75%                                 0.876877   
max                                 0.984203   

       K-mer Coverage averaged: 5 Kbp after      AGP Contig         AGP Gap  \
count                        1203463.000000  1203463.000000  1203463.000000   
mean                               0.721766        0.954131        0.100714   
std                                0.227992        0.209202        0.300949   
min                                0.000000        0.000000        0.000000   
25%                                0.659455        1.000000        0.000000   
50%                                0.794650        1.000000        0.000000   
75%                                0.876955        1.000000        0.000000   
max                                0.984203        1.000000        1.000000   

       AGP Unknown  AGP Other              Ns  Sequencing Coverage  
count      1203463    1203463  1203463.000000       1203463.000000  
mean             0          0        0.045900           109.520162  
std              0          0        0.209268           248.109755  
min              0          0        0.000000             0.000000  
25%              0          0        0.000000            48.000000  
50%              0          0        0.000000            92.000000  
75%              0          0        0.000000           144.000000  
max              0          0        1.000000         21485.000000  

Quantiles (K-mer Coverage)

0.010    0.000000
0.025    0.000000
0.050    0.000000
0.100    0.000000
0.200    0.304348
0.300    0.652174
0.400    0.869565
0.500    0.956522
0.600    1.000000
0.700    1.000000
0.800    1.000000
0.900    1.000000
0.950    1.000000
0.975    1.000000
0.990    1.000000
dtype: float64

Percentiles (positions with K-mer Coverage == 1)

count      1203463
count == 1 532012
prop  == 1 0.442067599918
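
The `prop == 1` line is the number of positions whose value equals exactly 1, divided by the total number of positions. Judging by the quantiles, the column in question is K-mer Coverage, though that is an inference. A short sketch with a hypothetical helper and the counts printed above:

```python
def proportion_equal(values, target):
    """Fraction of entries in values that equal target."""
    values = list(values)
    return sum(1 for v in values if v == target) / float(len(values))

# Toy column: two of four positions are fully covered.
print(proportion_equal([1.0, 0.5, 1.0, 0.0], 1.0))  # 0.5

# Re-deriving the reported proportion from the reported counts.
count = 1203463        # total positions, from the report
count_eq_1 = 532012    # positions with value == 1, from the report
prop_eq_1 = count_eq_1 / float(count)
print(round(prop_eq_1, 6))  # 0.442068
```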

Median

Position                                 601731.000000
K-mer Coverage                                0.956522
K-mer Coverage averaged: 500 bp               0.851688
K-mer Coverage averaged: 2.5 Kbp              0.814283
K-mer Coverage averaged: 5 Kbp                0.794441
K-mer Coverage averaged: 50 Kbp               0.759005
K-mer Coverage averaged: 1 Mbp                0.725087
K-mer Coverage averaged: 5 Kbp before         0.794493
K-mer Coverage averaged: 5 Kbp after          0.794650
AGP Contig                                    1.000000
AGP Gap                                       0.000000
AGP Unknown                                   0.000000
AGP Other                                     0.000000
Ns                                            0.000000
Sequencing Coverage                          92.000000
dtype: float64

MAD (mean absolute deviation)

Position                                 300865.750000
K-mer Coverage                                0.319989
K-mer Coverage averaged: 500 bp               0.238182
K-mer Coverage averaged: 2.5 Kbp              0.191599
K-mer Coverage averaged: 5 Kbp                0.167375
K-mer Coverage averaged: 50 Kbp               0.094501
K-mer Coverage averaged: 1 Mbp                0.025689
K-mer Coverage averaged: 5 Kbp before         0.167307
K-mer Coverage averaged: 5 Kbp after          0.169331
AGP Contig                                    0.087531
AGP Gap                                       0.181141
AGP Unknown                                   0.000000
AGP Other                                     0.000000
Ns                                            0.087586
Sequencing Coverage                          63.680164
dtype: float64
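
The MAD values above match the mean absolute deviation around the mean, which is what `DataFrame.mad()` computed in the pandas version listed in the footer, rather than the median absolute deviation. That method was removed in pandas 2.0, so the sketch below computes the same quantity in plain Python and checks it against two of the reported values; `mean_abs_dev` is a hypothetical helper.

```python
def mean_abs_dev(values):
    """Mean absolute deviation around the mean."""
    values = list(values)
    mean = sum(values) / float(len(values))
    return sum(abs(v - mean) for v in values) / len(values)

# Position runs from 0 to 1203462, one row per base.
n = 1203463
position_mad = mean_abs_dev(range(n))
print('%.6f' % position_mad)  # 300865.750000

# For a 0/1 column with mean p, the mean absolute deviation is 2 * p * (1 - p).
p = 0.954131  # reported mean of AGP Contig (rounded to 6 decimals)
contig_mad = 2 * p * (1 - p)
print('%.4f' % contig_mad)  # 0.0875, close to the reported 0.087531
```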

Plot

K-mer Coverage Stats

Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.pdf

Sequencing Coverage Stats

Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.pdf

K-mer Coverage Distribution

Number of rows: 1018544
Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.pdf

Sequencing Coverage Distribution

Number of rows: 1150567
Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.pdf

Gaps Distribution

Number of rows: 121205
Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.pdf

Ns Distribution

Number of rows: 55239
Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.pdf
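
The "Number of rows" figures are each smaller than the full 1203463 rows, which suggests the distributions are drawn only from rows where the column is nonzero. This is an inference from the numbers, not something the notebook states. For the two 0/1 columns it can be cross-checked: the row count divided by the total should reproduce the column mean in the Describe table.

```python
total_rows = 1203463

# Row counts reported for the Gaps and Ns distributions.
gap_rows = 121205
ns_rows = 55239

# For a 0/1 column, (number of nonzero rows) / (total rows) equals the mean.
gap_mean = gap_rows / float(total_rows)
ns_mean = ns_rows / float(total_rows)

print('AGP Gap mean: %.6f' % gap_mean)  # AGP Gap mean: 0.100714
print('Ns mean     : %.6f' % ns_mean)   # Ns mean     : 0.045900
```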

CSV Output

Original Size 1203463 rows and 15 columns
SAMPLING EVERY 121 ROWS
Saving full data to:  reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
Saving full data to:  reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
Saving full data to:  reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
New Size 9946 rows and 15 columns
Saving data to     : reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
Saving data to     : reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
Saving data to     : reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
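
The sampling step and the reduced size are consistent with the `Max Rows : 10000` setting in the header: the step is the row count divided by the maximum, rounded up, and every step-th row is kept. The sketch below reproduces both numbers; `sampling_step` is a hypothetical helper, and the use of a plain stride (as in `df.iloc[::step]`) is an assumption about how the notebook samples.

```python
import math

def sampling_step(n_rows, max_rows):
    """Smallest integer step that leaves at most max_rows rows."""
    return max(1, int(math.ceil(n_rows / float(max_rows))))

n_rows, max_rows = 1203463, 10000
step = sampling_step(n_rows, max_rows)

# Number of rows kept when taking rows 0, step, 2*step, ...
sampled = len(range(0, n_rows, step))

print('SAMPLING EVERY %d ROWS' % step)  # SAMPLING EVERY 121 ROWS
print('New Size %d rows' % sampled)     # New Size 9946 rows
```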

Combined Graph

Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.pdf

In [24]:
%run -i probes_cfg_footer.ipynb


Saulo Aflitos Last updated: 30/06/2015 

CPython 2.7.9
IPython 3.0.0

numpy 1.9.2
scipy 0.15.1
matplotlib 1.4.3
pandas 0.16.0

compiler   : GCC 4.4.7 20120313 (Red Hat 4.4.7-1)
system     : Linux
release    : 3.13.0-46-generic
machine    : x86_64
processor  : x86_64
CPU cores  : 80
interpreter: 64bit
host name  : assembly
Git hash   : 1c900a288ea3ef7300d0e778c72d6b7129071fef