In [11]:
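# Report settings: input folder, mapping run, target chromosome and scaffold, output folder.
config = {
    'data_folder': 'probes',
    'RUN'        : 'run_BWA_JKL_23_JU_1_orig',
    'chromosome' : 'SL2.40ch12',
    'scaffold'   : 'SL2.40sc04878',
    'out_folder' : 'reports_short'
}

# Disabled block: optional BAC-specific settings. Change False to True to enable it.
if False:
    config['BAC'            ] = 'JBSC0201'
    config['BAC_coord_start'] = 622710
    config['BAC_coord_end'  ] = 624594
    # Zero-padded 12-digit start and end coordinates joined by a hyphen.
    config['BAC_coord'      ] = '%012d-%012d' % ( config['BAC_coord_start'], config['BAC_coord_end'] )
    config['RUN'            ] = 'run_BWA_JKL_23_JU_1_orig_PROBES'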
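The disabled block builds `BAC_coord` with the `'%012d-%012d'` format string. As a minimal sketch (the variable names `bac_start`, `bac_end` and `bac_coord` are only for illustration), this is what that format yields for the coordinates in the cell above:

```python
# Coordinates copied from the config cell above.
bac_start = 622710
bac_end = 624594

# %012d pads each integer with leading zeros to 12 digits, so the
# resulting strings sort in the same order as the numbers themselves.
bac_coord = '%012d-%012d' % (bac_start, bac_end)
print(bac_coord)  # 000000622710-000000624594
```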
In [12]:
%run -i probes_cfg_short.ipynb
In [13]:
%run -i probes_cfg_short_header.ipynb
Singularity Report
Run: run_BWA_JKL_23_JU_1_orig
Chromosome: SL2.40ch12
Scaffold: SL2.40sc04878
Config
RUN : run_BWA_JKL_23_JU_1_orig
chromosome : SL2.40ch12
data_folder : probes
out_extensions : ['eps', 'png', 'pdf']
out_folder : reports_short
scaffold : SL2.40sc04878
Max Rows : 10000
Column Names
K-mer Coverage averaged: 50 Kbp
Sequencing Coverage
Ns
AGP Contig
AGP Gap
In [15]:
%run -i probes_cfg_short_images.ipynb
has config global
Populating the interactive namespace from numpy and matplotlib
Files
Input Files
KmerCoverageFile : True probes/run_BWA_JKL_23_JU_1_orig/S_lycopersicum_chromosomes.fa_S_lycopersicum_scaffolds.sam.SL2.40ch12.sam.SL2.40sc04878.sam.cov.prop.cov.gz
SequencingCoverageFile: True probes/mapping/out/S_lycopersicum_chromosomes.fa_S_lycopersicum_chromosomes.pos.SL2.40ch12.pos.cov.SL2.40sc04878.cov.gz
AgpContigFile : True probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.contig.agp.cov.gz
AgpGapFile : True probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.gap.agp.cov.gz
AgpOtherFile : True probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.other.agp.cov.gz
AgpUnknownFile : True probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.unknown.agp.cov.gz
NsFile : True probes/Ns/S_lycopersicum_scaffolds.fa_NONE.tab.SL2.40ch12.tab.SL2.40sc04878.tab.cov.gz
all files present
Output Files
Combined graph :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.pdf
Gaps Distribution :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.pdf
K-mer Coverage Distribution :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.pdf
K-mer Coverage Stats :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.pdf
Ns Distribution :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.pdf
Sequencing Coverage Distribution:
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.pdf
Sequencing Coverage Stats :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.eps
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.png
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.pdf
all_data :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
all_data_full :
- reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
- reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
- reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
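The plot paths listed above appear to follow the pattern `out_folder/ext/RUN_chromosome_scaffold_prop.Title.ext`, with spaces in the title replaced by underscores. The sketch below reproduces that pattern; `output_path` is a hypothetical helper, not a function from the notebooks, and the `prop` suffix is copied literally from the listing. The CSV entries do not follow this pattern exactly and are not covered.

```python
import posixpath

# Same keys and values as the config cell at the top of this report.
config = {
    'RUN': 'run_BWA_JKL_23_JU_1_orig',
    'chromosome': 'SL2.40ch12',
    'scaffold': 'SL2.40sc04878',
    'out_folder': 'reports_short',
}

def output_path(config, title, ext):
    """Hypothetical helper: rebuild a plot path from the config and a plot title."""
    # 'prop' is a literal suffix observed in the listed file names.
    prefix = '_'.join([config['RUN'], config['chromosome'], config['scaffold'], 'prop'])
    name = title.replace(' ', '_')
    return posixpath.join(config['out_folder'], ext, '%s.%s.%s' % (prefix, name, ext))

for ext in ['eps', 'png', 'pdf']:
    print(output_path(config, 'Combined graph', ext))
```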
Read Files
K-mer Coverage File
probes/run_BWA_JKL_23_JU_1_orig/S_lycopersicum_chromosomes.fa_S_lycopersicum_scaffolds.sam.SL2.40ch12.sam.SL2.40sc04878.sam.cov.prop.cov.gz
Loaded 5717763 rows and 9 columns
Sequencing Coverage File
probes/mapping/out/S_lycopersicum_chromosomes.fa_S_lycopersicum_chromosomes.pos.SL2.40ch12.pos.cov.SL2.40sc04878.cov.gz
Loaded 5717763 rows and 2 columns
AGP
Contig
probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.contig.agp.cov.gz
Loaded 5717763 rows and 2 columns
Gap
probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.gap.agp.cov.gz
Loaded 5717763 rows and 2 columns
Unknown
probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.unknown.agp.cov.gz
Loaded 5717763 rows and 2 columns
Other
probes/agp/S_lycopersicum_scaffolds_from_contigs.2.40.agp.SL2.40sc04878.agp.other.agp.cov.gz
Loaded 5717763 rows and 2 columns
Ns
probes/Ns/S_lycopersicum_scaffolds.fa_NONE.tab.SL2.40ch12.tab.SL2.40sc04878.tab.cov.gz
Loaded 5717763 rows and 2 columns
Merge
Saved 5717763 rows and 15 columns
Stats
Describe
             Position  K-mer Coverage  K-mer Coverage averaged: 500 bp  \
count  5717763.000000  5717763.000000                   5717763.000000
mean   2858881.000000        0.406522                         0.405711
std    1650576.147944        0.390383                         0.271069
min          0.000000        0.000000                         0.000000
25%    1429440.500000        0.000000                         0.187885
50%    2858881.000000        0.347826                         0.402065
75%    4288321.500000        0.782609                         0.603575
max    5717762.000000        1.000000                         0.998004

       K-mer Coverage averaged: 2.5 Kbp  K-mer Coverage averaged: 5 Kbp  \
count                    5717763.000000                  5717763.000000
mean                           0.406334                        0.406386
std                            0.240735                        0.220669
min                            0.000000                        0.000000
25%                            0.239104                        0.259757
50%                            0.410514                        0.414674
75%                            0.575874                        0.562974
max                            0.986405                        0.980360

       K-mer Coverage averaged: 50 Kbp  K-mer Coverage averaged: 1 Mbp  \
count                   5717763.000000                  5717763.000000
mean                          0.406687                        0.400120
std                           0.127137                        0.069638
min                           0.000000                        0.000000
25%                           0.331733                        0.382051
50%                           0.410566                        0.394978
75%                           0.492965                        0.418949
max                           0.746490                        0.681233

       K-mer Coverage averaged: 5 Kbp before  \
count                         5717763.000000
mean                                0.406200
std                                 0.220758
min                                 0.000000
25%                                 0.259505
50%                                 0.414517
75%                                 0.562800
max                                 0.980360

       K-mer Coverage averaged: 5 Kbp after      AGP Contig         AGP Gap  \
count                        5717763.000000  5717763.000000  5717763.000000
mean                               0.406442        0.952585        0.057403
std                                0.220757        0.212524        0.232611
min                                0.000000        0.000000        0.000000
25%                                0.259774        1.000000        0.000000
50%                                0.414708        1.000000        0.000000
75%                                0.563087        1.000000        0.000000
max                                0.980360        1.000000        1.000000

          AGP Unknown  AGP Other              Ns  Sequencing Coverage
count  5717763.000000    5717763  5717763.000000       5717763.000000
mean         0.000017          0        0.047431           132.764166
std          0.004182          0        0.212559           147.548038
min          0.000000          0        0.000000             0.000000
25%          0.000000          0        0.000000            80.000000
50%          0.000000          0        0.000000           116.000000
75%          0.000000          0        0.000000           157.000000
max          1.000000          0        1.000000         12518.000000
Quantiles
0.010 0.000000
0.025 0.000000
0.050 0.000000
0.100 0.000000
0.200 0.000000
0.300 0.000000
0.400 0.130435
0.500 0.347826
0.600 0.521739
0.700 0.695652
0.800 0.869565
0.900 1.000000
0.950 1.000000
0.975 1.000000
0.990 1.000000
dtype: float64
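The non-trivial quantiles above look like multiples of 1/23. The sketch below checks this by finding the closest small fraction to each reported value. It only shows that the values are consistent with a denominator of 23; what that 23 corresponds to (for example the `23` in the run name) is an assumption the output itself does not confirm.

```python
from fractions import Fraction

# Non-trivial quantile values copied from the output above.
reported = ['0.130435', '0.347826', '0.521739', '0.695652', '0.869565']

# limit_denominator returns the closest fraction whose denominator
# does not exceed the given bound (100 is an arbitrary small bound).
closest = [Fraction(q).limit_denominator(100) for q in reported]

for q, frac in zip(reported, closest):
    print(q, '~', frac)
```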
Percentiles
count 5717763
count == 1 724324
prop == 1 0.126679612289
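The `prop == 1` figure is the share of positions whose K-mer Coverage equals exactly 1. A minimal sketch of that calculation follows; `proportion_equal` is a hypothetical helper shown on a toy list, and the last lines recompute the reported proportion from the two counts printed above.

```python
def proportion_equal(values, target):
    """Return (count, proportion) of entries equal to target."""
    values = list(values)
    hits = sum(1 for v in values if v == target)
    return hits, hits / len(values)

# Toy data, not taken from the report.
toy = [0.0, 1.0, 0.5, 1.0]
print(proportion_equal(toy, 1.0))  # (2, 0.5)

# Counts copied from the output above.
total, at_one = 5717763, 724324
prop = at_one / total
print(prop)
```

With pandas, the same count can be obtained from a boolean comparison on the coverage column followed by `.sum()`.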
Median
Position 2858881.000000
K-mer Coverage 0.347826
K-mer Coverage averaged: 500 bp 0.402065
K-mer Coverage averaged: 2.5 Kbp 0.410514
K-mer Coverage averaged: 5 Kbp 0.414674
K-mer Coverage averaged: 50 Kbp 0.410566
K-mer Coverage averaged: 1 Mbp 0.394978
K-mer Coverage averaged: 5 Kbp before 0.414517
K-mer Coverage averaged: 5 Kbp after 0.414708
AGP Contig 1.000000
AGP Gap 0.000000
AGP Unknown 0.000000
AGP Other 0.000000
Ns 0.000000
Sequencing Coverage 116.000000
dtype: float64
MAD
Position 1429440.750000
K-mer Coverage 0.355904
K-mer Coverage averaged: 500 bp 0.226379
K-mer Coverage averaged: 2.5 Kbp 0.196377
K-mer Coverage averaged: 5 Kbp 0.178040
K-mer Coverage averaged: 50 Kbp 0.098770
K-mer Coverage averaged: 1 Mbp 0.038964
K-mer Coverage averaged: 5 Kbp before 0.178125
K-mer Coverage averaged: 5 Kbp after 0.178113
AGP Contig 0.090333
AGP Gap 0.108216
AGP Unknown 0.000035
AGP Other 0.000000
Ns 0.090363
Sequencing Coverage 62.134356
dtype: float64
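The MAD values above are consistent with the mean absolute deviation (the average of |x - mean|) rather than the median absolute deviation. The sketch below checks this against two reported columns using only numbers printed in this report; `mean_abs_deviation` is a hypothetical helper, and the closed forms are standard identities, not code from the notebooks.

```python
def mean_abs_deviation(values):
    """Average absolute distance of each value from the mean."""
    values = list(values)
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)

# Toy check: 0..10 has mean 5 and absolute deviations summing to 30.
print(mean_abs_deviation(range(11)))

# Position runs over 0..n-1 with n odd, so the mean is m = (n - 1) / 2
# and the absolute deviations sum to m * (m + 1).
n = 5717763
m = (n - 1) // 2
position_mad = m * (m + 1) / n
print(round(position_mad, 6))  # 1429440.75

# A 0/1 column with mean p has mean absolute deviation 2 * p * (1 - p).
p_contig = 0.952585  # reported mean of AGP Contig
contig_mad = 2 * p_contig * (1 - p_contig)
print(round(contig_mad, 4))
```

The median absolute deviation would give 1429441 for Position and 0 for AGP Contig, neither of which matches the output. Note that `DataFrame.mad` was removed in pandas 2.0, so on current pandas versions this statistic has to be computed explicitly, for example as `(df - df.mean()).abs().mean()`.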
Plot
K-mer Coverage Stats
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Stats.pdf
Sequencing Coverage Stats
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Stats.pdf
K-mer Coverage Distribution
Number of rows: 3743704
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-mer_Coverage_Distribution.pdf
Sequencing Coverage Distribution
Number of rows: 5452930
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Sequencing_Coverage_Distribution.pdf
Gaps Distribution
Number of rows: 328216
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Gaps_Distribution.pdf
Ns Distribution
Number of rows: 271199
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Ns_Distribution.pdf
CSV Output
Original Size 5717763 rows and 15 columns
SAMPLING EVERY 572 ROWS
Saving full data to: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
Saving full data to: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
Saving full data to: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.K-raw_data.full.csv
New Size 9997 rows and 15 columns
Saving data to : reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
Saving data to : reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
Saving data to : reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.raw_data.csv
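The sampling step and the reduced row count can be reproduced from `Max Rows : 10000` and the original size. This is a sketch that assumes the notebook keeps every `step`-th row starting from the first one, which matches the reported figures; the variable names are illustrative.

```python
import math

total_rows = 5717763   # "Original Size" above
max_rows = 10000       # "Max Rows" from the report header

# Smallest step that keeps the sample at or below max_rows.
step = math.ceil(total_rows / max_rows)

# Number of rows kept when taking rows 0, step, 2*step, ...
sampled_rows = len(range(0, total_rows, step))

print(step, sampled_rows)  # 572 9997
```

With a pandas DataFrame `df`, the equivalent selection would be `df.iloc[::step]`.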
Combined Graph
[['K-mer Coverage averaged: 50 Kbp', [0, 1], 4, {'color': 'orange'}], ['Sequencing Coverage', None, 10, {'color': 'green', 'kind': 'line', 'logy': True}], None, ['Ns', [0.1, 1], 0, {'color': 'grey'}], ['AGP Contig', [0.1, 1], 0, {'color': 'grey'}], ['AGP Gap', [0.1, 1], 0, {'color': 'grey'}], None, None, None, None, None, None, None, None, None]
Saving Image: reports_short/eps/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.eps
Saving Image: reports_short/png/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.png
Saving Image: reports_short/pdf/run_BWA_JKL_23_JU_1_orig_SL2.40ch12_SL2.40sc04878_prop.Combined_graph.pdf
%run -i probes_cfg_short_footer.ipynb