Read a BAM file and get information about the mapping

Author: Thomas Cokelaer, 2016-2018


In [1]:
%pylab inline


Populating the interactive namespace from numpy and matplotlib

In [2]:
from sequana import BAM, sequana_data

In [3]:
b = BAM(sequana_data('test.bam', 'testing'))

In [4]:
b.plot_bar_mapq()
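
The bar plot above shows how many reads fall at each mapping quality (MAPQ) value. As a reminder of what those values mean, the SAM specification defines MAPQ as -10 * log10 of the probability that the mapping position is wrong, rounded to the nearest integer. The two helpers below are hypothetical (they are not part of sequana) and only sketch that conversion; note that aligners differ in how closely their reported MAPQ follows this definition.

```python
import math


def mapq_to_error_prob(mapq):
    """Convert a MAPQ value to the probability that the mapping is wrong.

    Hypothetical helper, following the SAM definition
    MAPQ = -10 * log10(P(wrong position)).
    """
    return 10 ** (-mapq / 10)


def error_prob_to_mapq(prob):
    """Convert an error probability back to an integer MAPQ value."""
    return round(-10 * math.log10(prob))


for q in (0, 10, 20, 30):
    print(q, mapq_to_error_prob(q))
```

Under this definition, a MAPQ of 0 corresponds to an error probability of 1, which is why reads with MAPQ 0 are usually treated as ambiguously placed.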



In [5]:
b.get_flags_as_df().sum()


Out[5]:
0          0
1       1000
2        484
4          2
8          2
16       499
32       500
64       477
128      523
256       64
512        0
1024       0
2048       0
dtype: int64
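
The row labels in Out[5] appear to be the individual bits of the SAM flag field, so each count is the number of alignments with that bit set (for instance, 4 is "read unmapped" and 256 is "not primary alignment", which agree with the 2 unmapped reads and 64 non-primary alignments reported further down). The sketch below decodes a combined flag into its meanings. The bit definitions come from the SAM specification; the `decode_flag` helper is hypothetical and not part of sequana.

```python
# Bit values and meanings as defined by the SAM format specification.
SAM_FLAGS = {
    1: "read paired",
    2: "read mapped in proper pair",
    4: "read unmapped",
    8: "mate unmapped",
    16: "read on reverse strand",
    32: "mate on reverse strand",
    64: "first in pair",
    128: "second in pair",
    256: "not primary alignment",
    512: "read fails quality checks",
    1024: "PCR or optical duplicate",
    2048: "supplementary alignment",
}


def decode_flag(flag):
    """Return the meaning of every bit set in a SAM flag (hypothetical helper)."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


# 99 = 1 + 2 + 32 + 64
print(decode_flag(99))
```

Because a single alignment usually sets several bits at once, the counts in Out[5] are not expected to add up to the number of reads.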

In [6]:
b.get_full_stats_as_df()


Out[6]:
    description                     count
30  pairs on different chromosomes  0
19  bases duplicated                0
18  bases trimmed                   0
13  reads QC failed                 0
12  reads MQ0                       0
11  reads duplicated                0
1   filtered sequences              0
3   is sorted                       1
27  inward oriented pairs           169
8   reads unmapped                  2
28  outward oriented pairs          229
29  pairs with other orientation    3
24  average quality                 36.9
26  insert size standard deviation  3817.6
4   1st fragments                   451
9   reads properly paired           462
25  insert size average             4775.2
5   last fragments                  485
20  mismatches                      51
14  non-primary alignments          64
17  bases mapped (cigar)            65641
21  error rate                      7.769534e-04
16  bases mapped                    70050
15  total length                    70200
22  average length                  75
23  maximum length                  75
7   reads mapped and paired         932
6   reads mapped                    934
10  reads paired                    936
2   sequences                       936
0   raw total sequences             936
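
The raw counts in Out[6] are often easier to read as proportions. The following is a minimal sketch that copies a few values from the table by hand into a plain dictionary (an assumption made to keep the example self-contained, rather than reading them from the DataFrame) and derives the mapped and properly paired percentages. The `percent` helper is hypothetical.

```python
# Counts copied by hand from Out[6].
stats = {
    "raw total sequences": 936,
    "reads mapped": 934,
    "reads unmapped": 2,
    "reads properly paired": 462,
}


def percent(part, total):
    """Return part as a percentage of total (hypothetical helper)."""
    return 100.0 * part / total


total = stats["raw total sequences"]
mapped_pct = percent(stats["reads mapped"], total)
proper_pct = percent(stats["reads properly paired"], total)

# Sanity check: mapped and unmapped reads should account for every sequence.
consistent = stats["reads mapped"] + stats["reads unmapped"] == total

print(f"mapped: {mapped_pct:.2f}%")
print(f"properly paired: {proper_pct:.2f}%")
print(f"mapped + unmapped == total: {consistent}")
```

In this test file nearly all reads are mapped, while only about half are flagged as properly paired, which is in line with the large number of outward oriented pairs in the table.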
