In [26]:
# Report settings; the probes_cfg*.ipynb notebooks run below with `%run -i` share this namespace.
config = {
    'data_folder'   : 'probes',
    'RUN'           : 'run_BWA_JKL_23_JU_1_orig',
    'chromosome'    : 'SL2.40ch12',
    'scaffold'      : 'SL2.40sc04878',
    'out_folder'    : 'reports'
}

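# Disabled block: BAC-specific overrides (BAC name, coordinates and an
# alternative RUN). Set the condition to True to enable them.
if False:
    config['BAC'            ] = 'JBSC0201'
    config['BAC_coord_start'] = 622710
    config['BAC_coord_end'  ] = 624594
    # Zero-pad both coordinates to 12 digits: 'start-end'
    config['BAC_coord'      ] = '%012d-%012d' % ( config['BAC_coord_start'], config['BAC_coord_end'] )
    config['RUN'            ] = 'run_BWA_JKL_23_JU_1_orig_PROBES'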
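
The disabled branch above builds `BAC_coord` by zero-padding both coordinates to 12 digits. A minimal sketch of that formatting, using the values from the cell; the helper name `format_coord` is hypothetical and does not come from the notebooks:

```python
def format_coord(start, end, width=12):
    """Zero-pad both coordinates to `width` digits and join them with '-'.

    Hypothetical helper; it mirrors the '%012d-%012d' expression in the cell.
    """
    return '%0*d-%0*d' % (width, start, width, end)


bac_coord = format_coord(622710, 624594)
print(bac_coord)  # 000000622710-000000624594
```

Fixed-width padding keeps such strings in coordinate order when they are sorted as text.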

In [27]:
%run -i probes_cfg.ipynb

In [28]:
%run -i probes_cfg_header.ipynb


Singularity Report

Run: run_BWA_JKL_23_JU_1_orig

Chromosome: SL2.40ch12

Scaffold: SL2.40sc04878

Config
	RUN            : run_BWA_JKL_23_JU_1_orig
	chromosome     : SL2.40ch12
	data_folder    : probes
	out_extensions : ['eps', 'png', 'pdf']
	out_folder     : reports
	scaffold       : SL2.40sc04878
Max Rows     : 10000
Column Names
	K-mer Coverage
	Sequencing Coverage
	Ns
	AGP Contig
	AGP Gap
	K-mer Coverage averaged: 500 bp
	K-mer Coverage averaged: 5 Kbp
	K-mer Coverage averaged: 1 Mbp

In [29]:
%run -i probes_cfg_images.ipynb


Populating the interactive namespace from numpy and matplotlib
WARNING: pylab import has clobbered these variables: ['axes', 'f', 'axis']
`%matplotlib` prevents importing * from pylab and numpy

Files

Input Files

KmerCoverageFile      : True  probes/run_BWA_JKL_23_JU_1_orig/S_lycopersicum_chromosomes.fa_S_lycopersicum_scaffolds.sam.SL2.40ch12.sam.SL2.40sc04878.sam.cov.prop.cov.gz
SequencingCoverageFile: True  probes/mapping/out/S_lycopersicum_chromosomes.fa_S_lycopersicum_chromosomes.pos.SL2.40ch12.pos.cov.SL2.40sc04878.cov.gz
AgpContigFile         : True  probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.contig.agp.cov.gz
AgpGapFile            : True  probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.gap.agp.cov.gz
AgpOtherFile          : True  probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.other.agp.cov.gz
AgpUnknownFile        : True  probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.unknown.agp.cov.gz
NsFile                : True  probes/Ns/S_lycopersicum_scaffolds.fa_NONE.tab.SL2.40ch12.tab.SL2.40sc04878.tab.cov.gz

all files present

Output Files

Combined graph                  :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.pdf
Gaps Distribution               :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.pdf
K-mer Coverage Distribution     :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.pdf
K-mer Coverage Stats            :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.pdf
Ns Distribution                 :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.pdf
Sequencing Coverage Distribution:
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.pdf
Sequencing Coverage Stats       :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.eps
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.png
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.pdf
all_data                        :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
all_data_full                   :
 - reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
 - reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
 - reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
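
The image paths listed above appear to follow one pattern: `<out_folder>/<ext>/<RUN>_<chromosome>_<scaffold>_prop.<title>.<ext>`, with spaces in the title replaced by underscores. The sketch below reproduces that pattern from the config values. The function `image_paths` is hypothetical and is not taken from `probes_cfg_images.ipynb`; the CSV entries do not follow the pattern exactly, since they keep a `.csv` suffix inside each extension folder.

```python
def image_paths(config, title, extensions=('eps', 'png', 'pdf')):
    """Return one output path per extension for a plot title (hypothetical helper)."""
    stem = '%s_%s_%s_prop.%s' % (
        config['RUN'],
        config['chromosome'],
        config['scaffold'],
        title.replace(' ', '_'),
    )
    # '/' is used directly so the result matches the listing on any platform.
    return ['/'.join([config['out_folder'], ext, '%s.%s' % (stem, ext)])
            for ext in extensions]


config = {
    'RUN': 'run_BWA_JKL_23_JU_1_orig',
    'chromosome': 'SL2.40ch12',
    'scaffold': 'SL2.40sc04878',
    'out_folder': 'reports',
}

paths = image_paths(config, 'K-mer Coverage Stats')
for path in paths:
    print(path)
```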

Read Files

K-mer Coverage File

probes/run_BWA_JKL_23_JU_1_orig/S_lycopersicum_chromosomes.fa_S_lycopersicum_scaffolds.sam.SL2.40ch12.sam.SL2.40sc04878.sam.cov.prop.cov.gz
Loaded 5717763 rows and 9 columns

Sequencing Coverage File

probes/mapping/out/S_lycopersicum_chromosomes.fa_S_lycopersicum_chromosomes.pos.SL2.40ch12.pos.cov.SL2.40sc04878.cov.gz
Loaded 5717763 rows and 2 columns

AGP

Contig

probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.contig.agp.cov.gz
Loaded 5717763 rows and 2 columns

Gap

probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.gap.agp.cov.gz
Loaded 5717763 rows and 2 columns

Unknown

probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.unknown.agp.cov.gz
Loaded 5717763 rows and 2 columns

Other

probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.other.agp.cov.gz
Loaded 5717763 rows and 2 columns

Ns

probes/Ns/S_lycopersicum_scaffolds.fa_NONE.tab.SL2.40ch12.tab.SL2.40sc04878.tab.cov.gz
Loaded 5717763 rows and 2 columns

Merge

Saved 5717763 rows and 15 columns

Stats

Describe

             Position  K-mer Coverage  K-mer Coverage averaged: 500 bp  \
count  5717763.000000  5717763.000000                   5717763.000000   
mean   2858881.000000        0.406522                         0.405711   
std    1650576.147944        0.390383                         0.271069   
min          0.000000        0.000000                         0.000000   
25%    1429440.500000        0.000000                         0.187885   
50%    2858881.000000        0.347826                         0.402065   
75%    4288321.500000        0.782609                         0.603575   
max    5717762.000000        1.000000                         0.998004   

       K-mer Coverage averaged: 2.5 Kbp  K-mer Coverage averaged: 5 Kbp  \
count                    5717763.000000                  5717763.000000   
mean                           0.406334                        0.406386   
std                            0.240735                        0.220669   
min                            0.000000                        0.000000   
25%                            0.239104                        0.259757   
50%                            0.410514                        0.414674   
75%                            0.575874                        0.562974   
max                            0.986405                        0.980360   

       K-mer Coverage averaged: 50 Kbp  K-mer Coverage averaged: 1 Mbp  \
count                   5717763.000000                  5717763.000000   
mean                          0.406687                        0.400120   
std                           0.127137                        0.069638   
min                           0.000000                        0.000000   
25%                           0.331733                        0.382051   
50%                           0.410566                        0.394978   
75%                           0.492965                        0.418949   
max                           0.746490                        0.681233   

       K-mer Coverage averaged: 5 Kbp before  \
count                         5717763.000000   
mean                                0.406200   
std                                 0.220758   
min                                 0.000000   
25%                                 0.259505   
50%                                 0.414517   
75%                                 0.562800   
max                                 0.980360   

       K-mer Coverage averaged: 5 Kbp after      AGP Contig         AGP Gap  \
count                        5717763.000000  5717763.000000  5717763.000000   
mean                               0.406442        0.952585        0.057403   
std                                0.220757        0.212524        0.232611   
min                                0.000000        0.000000        0.000000   
25%                                0.259774        1.000000        0.000000   
50%                                0.414708        1.000000        0.000000   
75%                                0.563087        1.000000        0.000000   
max                                0.980360        1.000000        1.000000   

          AGP Unknown  AGP Other              Ns  Sequencing Coverage  
count  5717763.000000    5717763  5717763.000000       5717763.000000  
mean         0.000017          0        0.047431           132.764166  
std          0.004182          0        0.212559           147.548038  
min          0.000000          0        0.000000             0.000000  
25%          0.000000          0        0.000000            80.000000  
50%          0.000000          0        0.000000           116.000000  
75%          0.000000          0        0.000000           157.000000  
max          1.000000          0        1.000000         12518.000000  
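
The "averaged" columns in the table above look like window means of K-mer Coverage: the mean stays near 0.41 while the standard deviation shrinks as the window grows from 500 bp to 1 Mbp. How the notebooks compute them is not shown, so the following is only a sketch under stated assumptions: a centered window for the plain columns, a trailing window for "before" and a leading window for "after". It uses the current pandas `Series.rolling` API; the pandas 0.16 release listed in the footer predates it.

```python
import pandas as pd

# Toy per-position coverage; the real series has 5717763 positions.
coverage = pd.Series([0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0])
window = 4  # stands in for 500 bp, 5 Kbp, ... (assumed window sizes in positions)

# Assumption: plain "averaged" columns use a window centered on each position.
centered = coverage.rolling(window, center=True, min_periods=1).mean()

# Assumption: "before" averages the positions up to and including the current one.
before = coverage.rolling(window, min_periods=1).mean()

# Assumption: "after" averages the current position and the ones that follow it;
# reversing, rolling and reversing again gives a leading window.
after = coverage[::-1].rolling(window, min_periods=1).mean()[::-1]

print(pd.DataFrame({'raw': coverage, 'centered': centered,
                    'before': before, 'after': after}))
```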

Quantiles

0.010    0.000000
0.025    0.000000
0.050    0.000000
0.100    0.000000
0.200    0.000000
0.300    0.000000
0.400    0.130435
0.500    0.347826
0.600    0.521739
0.700    0.695652
0.800    0.869565
0.900    1.000000
0.950    1.000000
0.975    1.000000
0.990    1.000000
dtype: float64
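
The quantile listing above matches the K-mer Coverage column: its 0.5 quantile, 0.347826, equals the K-mer Coverage median reported elsewhere in this output. A minimal sketch of producing such a listing with `pandas.Series.quantile`, on toy data; that the report uses exactly this call is an assumption:

```python
import pandas as pd

# Probabilities shown in the listing above.
probs = [0.01, 0.025, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5,
         0.6, 0.7, 0.8, 0.9, 0.95, 0.975, 0.99]

# Toy stand-in for the K-mer Coverage column.
coverage = pd.Series([0.0, 0.0, 0.25, 0.5, 1.0])

# Passing a list returns a Series indexed by probability.
quantiles = coverage.quantile(probs)
print(quantiles)
```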

Percentiles

count      5717763
count == 1 724324
prop  == 1 0.126679612289
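
The three figures above are a total row count, the number of rows whose value equals 1, and their ratio (724324 / 5717763 ≈ 0.1267). The column is presumably K-mer Coverage, since its 0.9 quantile is already 1.0. A sketch of that computation, with a hypothetical helper name:

```python
import pandas as pd


def proportion_equal(series, value=1.0):
    """Return (total rows, rows equal to `value`, proportion). Hypothetical helper."""
    total = len(series)
    hits = int((series == value).sum())
    return total, hits, hits / float(total)


toy = pd.Series([1.0, 0.5, 1.0, 0.0])
total, hits, prop = proportion_equal(toy)
print('count      %d' % total)
print('count == 1 %d' % hits)
print('prop  == 1 %s' % prop)

# Consistency check on the figures reported above.
reported = 724324 / 5717763.0
```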

Median

Position                                 2858881.000000
K-mer Coverage                                 0.347826
K-mer Coverage averaged: 500 bp                0.402065
K-mer Coverage averaged: 2.5 Kbp               0.410514
K-mer Coverage averaged: 5 Kbp                 0.414674
K-mer Coverage averaged: 50 Kbp                0.410566
K-mer Coverage averaged: 1 Mbp                 0.394978
K-mer Coverage averaged: 5 Kbp before          0.414517
K-mer Coverage averaged: 5 Kbp after           0.414708
AGP Contig                                     1.000000
AGP Gap                                        0.000000
AGP Unknown                                    0.000000
AGP Other                                      0.000000
Ns                                             0.000000
Sequencing Coverage                          116.000000
dtype: float64

MAD

Position                                 1429440.750000
K-mer Coverage                                 0.355904
K-mer Coverage averaged: 500 bp                0.226379
K-mer Coverage averaged: 2.5 Kbp               0.196377
K-mer Coverage averaged: 5 Kbp                 0.178040
K-mer Coverage averaged: 50 Kbp                0.098770
K-mer Coverage averaged: 1 Mbp                 0.038964
K-mer Coverage averaged: 5 Kbp before          0.178125
K-mer Coverage averaged: 5 Kbp after           0.178113
AGP Contig                                     0.090333
AGP Gap                                        0.108216
AGP Unknown                                    0.000035
AGP Other                                      0.000000
Ns                                             0.090363
Sequencing Coverage                           62.134356
dtype: float64
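
The values above are consistent with the mean absolute deviation (the average of |x - mean|) rather than the median absolute deviation. For positions 0..N-1 the mean absolute deviation is (N² - 1) / (4N), about 1429440.75 for N = 5717763, which matches the Position row; the median absolute deviation would be 1429441. For a 0/1 column with mean p it is 2p(1 - p), about 0.0903 for AGP Contig (p = 0.952585). The sketch below checks both figures with NumPy. The function name is hypothetical, and it is written out by hand because `DataFrame.mad` is no longer available in recent pandas releases.

```python
import numpy as np


def mean_abs_dev(values):
    """Mean absolute deviation around the mean (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    return np.abs(values - values.mean()).mean()


n = 5717763  # number of rows reported by the notebook
mad_position = mean_abs_dev(np.arange(n))
closed_form = (n ** 2 - 1) / (4.0 * n)

# Binary column: mean absolute deviation is 2 * p * (1 - p).
p_contig = 0.952585  # mean of AGP Contig from the Describe table
mad_contig = 2 * p_contig * (1 - p_contig)

print('%.6f %.6f %.6f' % (mad_position, closed_form, mad_contig))
```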

Plot

K-mer Coverage Stats

Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.pdf

Sequencing Coverage Stats

Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.pdf

K-mer Coverage Distribution

Number of rows: 3743704
Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.pdf

Sequencing Coverage Distribution

Number of rows: 5452930
Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.pdf

Gaps Distribution

Number of rows: 328216
Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.pdf

Ns Distribution

Number of rows: 271199
Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.pdf

CSV Output

Original Size 5717763 rows and 15 columns
SAMPLING EVERY 572 ROWS
Saving full data to:  reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
Saving full data to:  reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
Saving full data to:  reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
New Size 9997 rows and 15 columns
Saving data to     : reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
Saving data to     : reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
Saving data to     : reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
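
The sampling figures above fit a simple rule: with `Max Rows` set to 10000, the step is the row count divided by the limit, rounded up (5717763 / 10000 gives 572), and keeping every 572nd row starting at row 0 leaves 9997 rows. A sketch of that arithmetic; the helper name is hypothetical, and the real notebook may compute the step differently:

```python
import math


def sampling_step(n_rows, max_rows):
    """Smallest step that keeps at most `max_rows` rows (hypothetical helper)."""
    return max(1, int(math.ceil(n_rows / float(max_rows))))


n_rows, max_rows = 5717763, 10000
step = sampling_step(n_rows, max_rows)

# Rows kept when taking every `step`-th row from row 0; with pandas this
# corresponds to df.iloc[::step].
kept = len(range(0, n_rows, step))

print('%d %d' % (step, kept))  # 572 9997
```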

Combined Graph

Saving Image: reports/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.eps
Saving Image: reports/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.png
Saving Image: reports/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.pdf

In [30]:
%run -i probes_cfg_footer.ipynb


Saulo Aflitos Last updated: 30/06/2015 

CPython 2.7.9
IPython 3.0.0

numpy 1.9.2
scipy 0.15.1
matplotlib 1.4.3
pandas 0.16.0

compiler   : GCC 4.4.7 20120313 (Red Hat 4.4.7-1)
system     : Linux
release    : 3.13.0-46-generic
machine    : x86_64
processor  : x86_64
CPU cores  : 80
interpreter: 64bit
host name  : assembly
Git hash   : 9ac2441d68018b7599187271721aab7af6b50b5a