In [8]:
# Settings shared with the notebooks executed below via %run -i.
config = {
    'data_folder': 'probes',
    'RUN'        : 'run_BWA_JKL_23_JU_1_orig',
    'chromosome' : 'SL2.40ch12',
    'scaffold'   : 'SL2.40sc05611',
    'out_folder' : 'reports_short'
}

# Optional BAC-specific settings; disabled for this run.
if False:
    config['BAC'            ] = 'JBPP0904'
    config['BAC_coord_start'] = 967164
    config['BAC_coord_end'  ] = 970727
    # Zero-padded "start-end" label, 12 digits per coordinate.
    config['BAC_coord'      ] = '%012d-%012d' % ( config['BAC_coord_start'], config['BAC_coord_end'] )
    config['RUN'            ] = 'run_BWA_JKL_23_JU_1_orig_PROBES'
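The disabled branch above builds `BAC_coord` with the `'%012d-%012d'` format. As a small illustration of what that string would look like if the branch were enabled, the sketch below applies the same format to the two coordinates from the cell. The variable names are chosen here for illustration and are not part of the notebook.

```python
# Coordinates copied from the disabled branch of the config cell.
bac_coord_start = 967164
bac_coord_end = 970727

# Each coordinate is zero-padded to 12 digits and joined with a hyphen.
bac_coord = '%012d-%012d' % (bac_coord_start, bac_coord_end)
print(bac_coord)  # 000000967164-000000970727

# Fixed-width padding keeps plain string sorting in numeric order.
labels = ['%012d' % n for n in (970727, 12, 967164)]
print(sorted(labels))
```

Zero-padding to a fixed width is a common choice when such labels end up in file names, since directory listings then sort by coordinate.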
In [9]:
%run -i probes_cfg_short.ipynb
In [10]:
%run -i probes_cfg_short_header.ipynb
Singularity Report
Run: run_BWA_JKL_23_JU_1_orig
Chromosome: SL2.40ch12
Scaffold: SL2.40sc05611
Config
RUN : run_BWA_JKL_23_JU_1_orig
chromosome : SL2.40ch12
data_folder : probes
out_extensions : ['eps', 'png', 'pdf']
out_folder : reports_short
scaffold : SL2.40sc05611
Max Rows : 10000
Column Names
K-mer Coverage averaged: 50 Kbp
Sequencing Coverage
Ns
AGP Contig
AGP Gap
In [12]:
%run -i probes_cfg_short_images.ipynb
has config global
Populating the interactive namespace from numpy and matplotlib
WARNING: pylab import has clobbered these variables: ['axes', 'f', 'axis', 'title']
`%matplotlib` prevents importing * from pylab and numpy
Files
Input Files
KmerCoverageFile : True probes/run_BWA_JKL_23_JU_1_orig/S_lycopersicum_chromosomes.fa_S_lycopersicum_scaffolds.sam.SL2.40ch12.sam.SL2.40sc05611.sam.cov.prop.cov.gz
SequencingCoverageFile: True probes/mapping/out/S_lycopersicum_chromosomes.fa_S_lycopersicum_chromosomes.pos.SL2.40ch12.pos.cov.SL2.40sc05611.cov.gz
AgpContigFile : True probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.contig.agp.cov.gz
AgpGapFile : True probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.gap.agp.cov.gz
AgpOtherFile : True probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.other.agp.cov.gz
AgpUnknownFile : True probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.unknown.agp.cov.gz
NsFile : True probes/Ns/S_lycopersicum_scaffolds.fa_NONE.tab.SL2.40ch12.tab.SL2.40sc05611.tab.cov.gz
all files present
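The listing above pairs each input name with a `True`/`False` flag and its path, followed by "all files present". The code that produces it lives in `probes_cfg_short_images.ipynb` and is not shown here, so the following is only a minimal sketch of such a check, assuming the flag is a plain file-existence test. `check_inputs` is a hypothetical helper name.

```python
import os

def check_inputs(files):
    """Report whether each named input path exists (hypothetical helper)."""
    status = {name: os.path.exists(path) for name, path in files.items()}
    for name, path in files.items():
        print("%-22s: %s %s" % (name, status[name], path))
    if all(status.values()):
        print("all files present")
    return status

# One path copied from the log; it prints False unless the data is available locally.
inputs = {
    "NsFile": "probes/Ns/S_lycopersicum_scaffolds.fa_NONE.tab.SL2.40ch12.tab.SL2.40sc05611.tab.cov.gz",
}
status = check_inputs(inputs)
```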
Output Files
Combined graph :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.pdf
Gaps Distribution :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.pdf
K-mer Coverage Distribution :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.pdf
K-mer Coverage Stats :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.pdf
Ns Distribution :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.pdf
Sequencing Coverage Distribution:
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.pdf
Sequencing Coverage Stats :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.pdf
all_data :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
all_data_full :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
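The image paths listed above appear to follow the pattern `<out_folder>/<ext>/<RUN>_<chromosome>_<scaffold>_prop.<title>.<ext>`, with spaces in the title replaced by underscores. The sketch below rebuilds them from the config values under that assumption. `output_paths` is a hypothetical helper, not a function from the notebook, and it does not cover the CSV entries, which the log places in the same folders with a `.csv` suffix.

```python
def output_paths(config, title, extensions=("eps", "png", "pdf")):
    """Build one output path per extension (pattern inferred from the log)."""
    stem = "%s_%s_%s_prop.%s" % (
        config["RUN"],
        config["chromosome"],
        config["scaffold"],
        title.replace(" ", "_"),
    )
    return ["/".join([config["out_folder"], ext, stem + "." + ext])
            for ext in extensions]

config = {
    "RUN": "run_BWA_JKL_23_JU_1_orig",
    "chromosome": "SL2.40ch12",
    "scaffold": "SL2.40sc05611",
    "out_folder": "reports_short",
}
paths = output_paths(config, "Combined graph")
for p in paths:
    print(p)
```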
Read Files
K-mer Coverage File
probes/run_BWA_JKL_23_JU_1_orig/S_lycopersicum_chromosomes.fa_S_lycopersicum_scaffolds.sam.SL2.40ch12.sam.SL2.40sc05611.sam.cov.prop.cov.gz
Loaded 1203463 rows and 9 columns
Sequencing Coverage File
probes/mapping/out/S_lycopersicum_chromosomes.fa_S_lycopersicum_chromosomes.pos.SL2.40ch12.pos.cov.SL2.40sc05611.cov.gz
Loaded 1203463 rows and 2 columns
AGP
Contig
probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.contig.agp.cov.gz
Loaded 1203463 rows and 2 columns
Gap
probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.gap.agp.cov.gz
Loaded 1203463 rows and 2 columns
Unknown
probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.unknown.agp.cov.gz
Loaded 1203463 rows and 2 columns
Other
probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc05611.agp.other.agp.cov.gz
Loaded 1203463 rows and 2 columns
Ns
probes/Ns/S_lycopersicum_scaffolds.fa_NONE.tab.SL2.40ch12.tab.SL2.40sc05611.tab.cov.gz
Loaded 1203463 rows and 2 columns
Merge
Saved 1203463 rows and 15 columns
Stats
Describe
             Position  K-mer Coverage  K-mer Coverage averaged: 500 bp  \
count  1203463.000000  1203463.000000                   1203463.000000
mean    601731.000000        0.723294                         0.721793
std     347409.987842        0.377425                         0.300210
min          0.000000        0.000000                         0.000000
25%     300865.500000        0.521739                         0.622060
50%     601731.000000        0.956522                         0.851688
75%     902596.500000        1.000000                         0.934132
max    1203462.000000        1.000000                         0.998004

       K-mer Coverage averaged: 2.5 Kbp  K-mer Coverage averaged: 5 Kbp  \
count                    1203463.000000                  1203463.000000
mean                           0.722996                        0.723085
std                            0.252134                        0.225579
min                            0.000000                        0.000000
25%                            0.647289                        0.660703
50%                            0.814283                        0.794441
75%                            0.899640                        0.876877
max                            0.994767                        0.984203

       K-mer Coverage averaged: 50 Kbp  K-mer Coverage averaged: 1 Mbp  \
count                   1203463.000000                  1203463.000000
mean                          0.722214                        0.716346
std                           0.124127                        0.041315
min                           0.000000                        0.000000
25%                           0.662426                        0.703745
50%                           0.759005                        0.725087
75%                           0.809083                        0.733580
max                           0.877406                        0.821217

       K-mer Coverage averaged: 5 Kbp before  \
count                         1203463.000000
mean                                0.723151
std                                 0.225565
min                                 0.000000
25%                                 0.661111
50%                                 0.794493
75%                                 0.876877
max                                 0.984203

       K-mer Coverage averaged: 5 Kbp after      AGP Contig         AGP Gap  \
count                        1203463.000000  1203463.000000  1203463.000000
mean                               0.721766        0.954131        0.100714
std                                0.227992        0.209202        0.300949
min                                0.000000        0.000000        0.000000
25%                                0.659455        1.000000        0.000000
50%                                0.794650        1.000000        0.000000
75%                                0.876955        1.000000        0.000000
max                                0.984203        1.000000        1.000000

       AGP Unknown  AGP Other              Ns  Sequencing Coverage
count      1203463    1203463  1203463.000000       1203463.000000
mean             0          0        0.045900           109.520162
std              0          0        0.209268           248.109755
min              0          0        0.000000             0.000000
25%              0          0        0.000000            48.000000
50%              0          0        0.000000            92.000000
75%              0          0        0.000000           144.000000
max              0          0        1.000000         21485.000000
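The "K-mer Coverage averaged" columns in the Describe output are smoothed versions of the per-position K-mer coverage: the standard deviation falls from 0.377425 for the raw column to 0.041315 at 1 Mbp. The notebook's windowing code is not shown, so the sketch below is only one plausible reading: a centered moving average over one row per base, truncated at the sequence ends. The function name and the edge handling are assumptions.

```python
def centered_moving_average(values, window):
    """Average each position over a centered window, truncated at the edges."""
    # Prefix sums turn every window sum into a constant-time lookup.
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)

    half = window // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)        # first index inside the window
        hi = min(n, i + half + 1)    # one past the last index inside the window
        smoothed.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return smoothed

# Toy coverage track; a 500 bp window on real data would use window=500.
coverage = [0, 1, 0, 1, 0]
smoothed = centered_moving_average(coverage, 3)
print(smoothed)
```

The "5 Kbp before" and "5 Kbp after" columns suggest that trailing and leading windows are computed as well; those would shift `lo` and `hi` to one side of `i` instead of centering them.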
Quantiles
0.010 0.000000
0.025 0.000000
0.050 0.000000
0.100 0.000000
0.200 0.304348
0.300 0.652174
0.400 0.869565
0.500 0.956522
0.600 1.000000
0.700 1.000000
0.800 1.000000
0.900 1.000000
0.950 1.000000
0.975 1.000000
0.990 1.000000
dtype: float64
Percentiles
count 1203463
count == 1 532012
prop == 1 0.442067599918
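The `prop == 1` figure above is the `count == 1` value divided by the total row count. The sketch below reproduces that arithmetic from the logged numbers and shows the same calculation on a toy list. `proportion_equal` is a hypothetical helper name.

```python
def proportion_equal(values, target=1.0):
    """Return (number of values equal to target, their share of all values)."""
    values = list(values)
    hits = sum(1 for v in values if v == target)
    return hits, hits / len(values)

# Figures copied from the log above.
total_rows = 1203463
rows_at_one = 532012
prop_at_one = rows_at_one / total_rows
print(round(prop_at_one, 6))  # 0.442068

# Same calculation on a toy coverage list.
hits, share = proportion_equal([1.0, 0.5, 1.0, 0.0])
print(hits, share)  # 2 0.5
```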
Median
Position 601731.000000
K-mer Coverage 0.956522
K-mer Coverage averaged: 500 bp 0.851688
K-mer Coverage averaged: 2.5 Kbp 0.814283
K-mer Coverage averaged: 5 Kbp 0.794441
K-mer Coverage averaged: 50 Kbp 0.759005
K-mer Coverage averaged: 1 Mbp 0.725087
K-mer Coverage averaged: 5 Kbp before 0.794493
K-mer Coverage averaged: 5 Kbp after 0.794650
AGP Contig 1.000000
AGP Gap 0.000000
AGP Unknown 0.000000
AGP Other 0.000000
Ns 0.000000
Sequencing Coverage 92.000000
dtype: float64
MAD
Position 300865.750000
K-mer Coverage 0.319989
K-mer Coverage averaged: 500 bp 0.238182
K-mer Coverage averaged: 2.5 Kbp 0.191599
K-mer Coverage averaged: 5 Kbp 0.167375
K-mer Coverage averaged: 50 Kbp 0.094501
K-mer Coverage averaged: 1 Mbp 0.025689
K-mer Coverage averaged: 5 Kbp before 0.167307
K-mer Coverage averaged: 5 Kbp after 0.169331
AGP Contig 0.087531
AGP Gap 0.181141
AGP Unknown 0.000000
AGP Other 0.000000
Ns 0.087586
Sequencing Coverage 63.680164
dtype: float64
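The "MAD" values above look like mean absolute deviations (the mean of |x - mean(x)|) rather than median absolute deviations. The `Position` column makes this checkable: it holds the integers 0 to 1203462, and the median absolute deviation of an odd number of integers is itself an integer, whereas the log reports 300865.750000. The sketch below computes the mean absolute deviation in plain Python under that reading. Older pandas versions exposed this as `DataFrame.mad()`, which the notebook may have used, but that is an assumption.

```python
import math

def mean_absolute_deviation(values):
    """Mean of |x - mean(x)| over all values."""
    data = list(values)
    center = math.fsum(data) / len(data)
    return math.fsum(abs(x - center) for x in data) / len(data)

# The Position column holds one row per base: 0, 1, ..., 1203462.
n_rows = 1203463
position_mad = mean_absolute_deviation(range(n_rows))
print(round(position_mad, 2))  # 300865.75
```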
Plot
K-mer Coverage Stats
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Stats.pdf
Sequencing Coverage Stats
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Stats.pdf
K-mer Coverage Distribution
Number of rows: 1018544
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-mer_Coverage_Distribution.pdf
Sequencing Coverage Distribution
Number of rows: 1150567
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Sequencing_Coverage_Distribution.pdf
Gaps Distribution
Number of rows: 121205
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Gaps_Distribution.pdf
Ns Distribution
Number of rows: 55239
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Ns_Distribution.pdf
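The "Number of rows" lines for the distribution plots are lower than the 1203463 rows loaded, which suggests each distribution is drawn from a filtered subset. For the two 0/1 columns the numbers are consistent with keeping only non-zero rows: the row counts divided by the total match the `AGP Gap` and `Ns` means in the Describe output. The filter itself is not shown in the notebook output, so treat this as an inference. `count_nonzero` is a hypothetical helper.

```python
def count_nonzero(values):
    """Number of entries that are not zero."""
    return sum(1 for v in values if v != 0)

# Figures copied from the log.
total_rows = 1203463
gap_rows = 121205   # "Gaps Distribution" row count
ns_rows = 55239     # "Ns Distribution" row count

gap_fraction = round(gap_rows / total_rows, 6)
ns_fraction = round(ns_rows / total_rows, 6)
print(gap_fraction)  # 0.100714, the AGP Gap mean
print(ns_fraction)   # 0.0459, the Ns mean

# Toy example of the counting itself.
print(count_nonzero([0, 1, 1, 0, 1]))  # 3
```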
CSV Output
Original Size 1203463 rows and 15 columns
SAMPLING EVERY 121 ROWS
Saving full data to: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
Saving full data to: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
Saving full data to: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.K-raw_data.full.csv
New Size 9946 rows and 15 columns
Saving data to : reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
Saving data to : reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
Saving data to : reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.raw_data.csv
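The CSV step reduces 1203463 rows to 9946 by "SAMPLING EVERY 121 ROWS", with `Max Rows : 10000` from the header. These numbers are consistent with a step of ceil(rows / max rows) followed by taking every step-th row starting at row 0. The sketch below checks that arithmetic; the exact rule used by the notebook is not shown, so the ceiling is an assumption, and `sampling_step` is a hypothetical name.

```python
import math

def sampling_step(n_rows, max_rows):
    """Smallest step that keeps the sampled row count at or below max_rows."""
    return max(1, math.ceil(n_rows / max_rows))

n_rows, max_rows = 1203463, 10000
step = sampling_step(n_rows, max_rows)
sampled_rows = len(range(0, n_rows, step))  # rows 0, step, 2*step, ...
print(step)          # 121
print(sampled_rows)  # 9946
```

With a pandas DataFrame the equivalent selection would be `df.iloc[::step]`.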
Combined Graph
[['K-mer Coverage averaged: 50 Kbp', [0, 1], 4, {'color': 'orange'}], ['Sequencing Coverage', None, 10, {'color': 'green', 'kind': 'line', 'logy': True}], None, ['Ns', [0.1, 1], 0, {'color': 'grey'}], ['AGP Contig', [0.1, 1], 0, {'color': 'grey'}], ['AGP Gap', [0.1, 1], 0, {'color': 'grey'}], None, None, None, None, None, None, None, None, None]
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc05611_prop.Combined_graph.pdf
%run -i probes_cfg_short_footer.ipynb