Objective

create functions and pipeline for running tmap


In [13]:
%%bash
tmap --help


tmap:   torrent mapper
Version: 3.4.1 git:39c56e6cd6471b8d796e3b8af2c00206f1002fae
Web Site: http://github.com/iontorrent/tmap
Contact: http://ioncommunity.lifetechnologies.com

Pre-processing:
         index           creates the packed FASTA, BWT string, and SA files

Server:
         server          creates a mapping server

Mapping:
         map1            mapping procedure #1 (bwa-short variant)
         map2            mapping procedure #2 (bwa-long/BWASW variant)
         map3            mapping procedure #3 (k-mer lookup)
         map4            mapping procedure #4 (bwa fastmap variant)
         mapvsw          mapping procedure vectorized smith waterman
         mapall          multi-mapping procedure

Utilities:
         fasta2pac       creates the packed FASTA file
         pac2bwt         creates the BWT string file from the packed FASTA file
         bwt2sa          creates the SA file from the BWT string file
         sff2fq          converts a SFF file to a FASTQ file
         sff2sam         converts a SFF file to a SAM file
         refinfo         prints information about the reference
         pac2fasta       converts a packed FASTA to a FASTA file
         bwtupdate       updates the bwt hash width
         indexsize       gives the index size in bytes
         sam2fs          pretty print SAM records in flow space
         samtools        samtools (Tools for alignments in the SAM format)
         bcftools        bcftools (Tools for data in the VCF/BCF formats)
         sw              simple smith waterman

In [9]:
%%bash
tmap index -f ../data/RM8375/ref/HGAP_MiSeq.fasta

In [10]:
%%bash
tmap mapall -f ../data/RM8375/ref/HGAP_MiSeq.fasta -r ../data/RM8375/PGM/fastq/SRR1393710.fastq -i fastq -s SRR1393710.sam stage1 map4

In [12]:
%%bash
head SRR1393710.sam


@HD	VN:1.4
@SQ	SN:unitig_0|quiver|quiver|quiver|pilon	LN:4857490
@SQ	SN:unitig_2|quiver|quiver|quiver|pilon	LN:93928
@RG	ID:NOID	PG:tmap	SM:NOSM
@PG	ID:tmap	CL:mapall -f ../data/RM8375/ref/HGAP_MiSeq.fasta -r ../data/RM8375/PGM/fastq/SRR1393710.fastq -i fastq -s SRR1393710.sam stage1 map4	VN:3.4.1
SRR1393710.1	0	unitig_0|quiver|quiver|quiver|pilon	3754311	81	27M3I14M1I28M1I65M1I116M1D37M3S	*	0	0	CTAACAAAATAACGTGCTGTAATTTTTTTAAAAATAATAAGAGATTTACGTCTGGTTGCAAGAGATCATGACAGGGGGGAATTGGTTGAAAATAAATATATCGCCAGCAGCACATGAACAAGTTTCGGAATGTGATCAATTTTAAAAATTTATTGACTTAGGCGGGCAGATACTTTAACCAATATAGGAATACAAGACAGACAAATAAAAATGACAGAGTACACAACATCCATGAACCACATCAGCACCACCACCATACCACCATCACCATTACCACAGGTAACGGTGCGGGCCTG	/5;69>>>1=>9>>@:99@991999999$8888*//(//(////55*6;;;665/64::<8<<?@:::?>>A::::::);8?:>1:19988+5==9?AEAAACD<?ABCCC@@@:::2;F2:::-:>9>896655555/555(5555'55)556;;@A>BB<BBD@DAAA@>:;?)/)/8?6>>B@B2:5:@>>8C/////CBB1/::::*:::????AAAA>=>;A>>>:?...'/)/////;>><<7=>4721,,,,(,61489//)//)//)/488499(//(/889<382..	RG:Z:NOID	PG:Z:tmap	MD:Z:232G17^T37	NM:i:8	AS:i:244	XA:Z:map4-1	XS:i:-2147483647
SRR1393710.2	0	unitig_0|quiver|quiver|quiver|pilon	3754292	96	306M1I10M	*	0	0	CCATGATATTGAAAAAAAACTAACAAAATAACGTGCTGTAATTTTTAAAATAATAAGAGATTACGTCTGGTTGCAAGAGATCATGACAGGGGGAATTGGTTGAAAATAAATATATCGCCAGCAGCACATGAACAAGTTTCGGAATGTGATCAATTTAAAAATTTATTGACTTAGGCGGGCAGATACTTTAACCAATATAGGAATACAAGACAGACAAATAAAAATGACAGAGTACACAACATCCATGAACCGCATCAGCACCACCACCATTACCACCATCACCATTACCACAGGTAACGGTGCGGGCCTGACGCGTA	B>CAA<<591::::::::$6;;7:<::,:;7>>CCCCDEC9??>>->>>1??8?C=CB99;4;:AAAAB=B>BDF<AAC??:::AAA:::::*:;A<D<C<AACA4???7@????@@A8B?CGFCAA@CBB:?>8==>5>A<E?@@@BC?>=.76(7777'77(5=7==CD=CC=DEC:CBDAAABAB9B<A<B;99:?A<F;AABA3:9A@AAA?A499999)999>A@@=>>/,++'+2474>;891<+//;99<?7766;:5;;8===CC=CC9>?=//(.82./(.777489*-415669<14--8-----13	RG:Z:NOID	PG:Z:tmap	MD:Z:316	NM:i:1	AS:i:309	XA:Z:map4-1	XS:i:-2147483647
SRR1393710.3	16	unitig_0|quiver|quiver|quiver|pilon	3754104	86	34M1D8M1D7M1D25M1I2M1I15M1I262M	*	0	0	GGTTTAATGCAACAGACCACAGAATCCGTTGATATGCTGATGATCGCAGTACAGGCGTTACGCGCGCGCCTTTTCACAGCCTGCTAACGACTCATGGAGGCGGCCGATGACCACAAATTAACCGACTGGCTACAACAGCGAATCGGCCTGCTGGGACAGCGAGATACGGTAATGTTGCACCGTTTGGTCCATGATATTGAAAAAAAACTAACAAAATAACGTGCTGTAATTTTTAAAATAATAAGAGATTACGTCTGGTTGCAAGAGATCATGACAGGGGGAATTGGTTGAAAATAAATATATCGCCAGCAGCACATGAACAAGTTTCGGAATGTGATCAATTTAAAAATTTATTG	43(33-3366166634/44485/568@?;?>>;666655;<777<<<<77666/44448995555555/;,<<5/////)////;6:66555555465/;<6:6::667<8>>A:BB<>;>6<:6677;;<;<7<<<:;:46884:4>777<0?5565556;;76,,1,,(,,,1444:6;5*66/646<<?A@772;5$55555555/28/'////4/////::55/9+9995'////2;/)////78>>@B@::3?8?>98<869>:;899;44'4444-3/5/<5<6(6665*55999:?@:4:9CC@A@A>>7838/)//)///296;<555556/5(//&888/(///(//	RG:Z:NOID	PG:Z:tmap	MD:Z:34^T8^A7^T304	NM:i:6	AS:i:311	XA:Z:map4-1	XS:i:-2147483647
SRR1393710.4	16	unitig_0|quiver|quiver|quiver|pilon	3754250	87	53M1I17M2I46M1I82M1I134M	*	0	0	CCTGCTGGGACAGCGAGATACGGTAATGTTGCACCGTTTGGTCCATGATATTGAAAAAAAAACTAACAAAAATAAACGTGCTGTAATTTTTAAAATAATAAGAGATTACGTCTGGTTGCAAAGAGATCATGACAGGGGGAATTGGTTGAAAATAAATATATCGCCAGCAGCACATGAACAAGTTTCGGAATGTGATCAATTTAAAAAATTTATTGACTTAGGCGGGCAGATACTTTAACCAATATAGGAATACAAGACAGACAAATAAAAATGACAGAGTACACAACATCCATGAACCGCATCAGCACCACCACCATTACCACCATCACCATTACCA	/1114-'--.-:4281-----/75/--424111-543?=5=>:744@??9488"8877888=-.'4-&-----+145B?775@@89(4443(5511-3-'-.240-2---798483883(373;<<;7939---&-..-'.'.'=8B?/AA.74A>48444985@677;=?@;?=>5=:3:::CEC=96;:?=9;771:,B:(:::::,@@?;BAA@9E:2::2?=;?<==95,::/:7B<CDFB@861::;:1AC@AAAA:-;:6->?>?@B>??A???BBDF>;::?8????9><@BDBBACBAB=BB<??8B?9??=A@;AAA@@<>:/:9:<6	RG:Z:NOID	PG:Z:tmap	MD:Z:70T261	NM:i:6	AS:i:298	XA:Z:map4-1	XS:i:-2147483647
SRR1393710.5	16	unitig_0|quiver|quiver|quiver|pilon	3754279	94	47M1D39M1I261M	*	0	0	TGCACCGTTTGGTCCATGATATTGAAAAAAAACTAACAAAATAACGTCTGTAATTTTTAAAATAATAAGAGATTACGTCTGGTTGCAAAGAGATCATGACAGGGGGAATTGGTTGAAAATAAATATATCGCCAGCAGCACATGAACAAGTTTCGGAATGTGATCAATTTAAAAATTTATTGACTTAGGCGGGCAGATACTTTAACCAATATAGGAATACAAGACAGACAAATAAAAATGACAGAGTACACAACATCCATGAACCGCATCAGCACCACCACCATTACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAG	87843633=:5;5386832--+33$9999995996:7,;::83::9<<7::24'4444&444-'--'--44--3368346895944)44/,,,,,))*,//+;CC@:@:=6/1:7*9998+999?<??AA<C??????@A:98=882B?8>@>8D@=77755:6/6*=6(6666*;;;38,,,07,(,,)B,//DBD;;6BB>B=B<BB::::@:A:::9A???:9:>4=>>-<<<<?>>?AAAABCFGC>@??@:?::::@:AACEDLMCCC?CC=;;9CD>DA<??<C@@?B;AA<AA=BA@;8962//(//;9+:99><????BB??>=D<CCC=<<5):::::8	RG:Z:NOID	PG:Z:tmap	MD:Z:47^G300	NM:i:2	AS:i:333	XA:Z:map4-1	XS:i:-2147483647

tmap commands


In [1]:
## functions for mapping fastq files with tmap
import sys
import time
import subprocess

In [2]:
def tmap_index_ref(in_ref, log_dir):
    '''tmap index reference'''
    print "Indexing reference with TMAP ..."
    
    # prep files
    log_file = open(log_dir + "/tmap_index_ref" + time.strftime("-%Y-%m-%d-%H-%M-%S.log"),'w')
    stderr_file = open(log_dir + "/tmap_index_ref" + time.strftime("-%Y-%m-%d-%H-%M-%S.stder"),'w')
    
    # run command
    tmap_index_ref_command = ["tmap", "index", "-f",ref]
    subprocess.call(tmap_index_ref_command, stdout=log_file,stderr=stderr_file) 
    log_file.close(); stderr_file.close()

In [3]:
def tmap_map_fq(in_ref, in_fq, out_sam, log_dir):
    '''Mapping fastq with tmap'''
    print "Mapping fastq with TMAP ..."
    
    # prep files
    log_file = open(log_dir + "/tmap_map"+ time.strftime("-%Y-%m-%d-%H-%M-%S.log"),'w')
    stderr_file = open(log_dir + "/tmap_map"+ time.strftime("-%Y-%m-%d-%H-%M-%S.stder"),'w')
    
    # run command
    tmap_map_command = ["tmap", "mapall", "-f", ref, "-r", in_fq, "-i", "fastq", "-s", out_sam, "stage1", "map4"]
    subprocess.call(tmap_map_command, stdout=log_file,stderr=stderr_file) 
    log_file.close(); stderr_file.close()

Testing functions


In [4]:
ref = "../data/RM8375/ref/HGAP_MiSeq.fasta"
fq = "../data/RM8375/PGM/fastq/SRR1393711.fastq"
out_sam="SRR1393711.fastq"
log_dir="../dev"

In [5]:
tmap_index_ref(ref,log_dir)
tmap_map_fq(ref,fq,out_sam,log_dir)


Indexing reference with TMAP ...
Mapping fastq with TMAP ...

In [ ]: