Iterative mapping, first proposed by Imakaev et al. (2012), usually allows a high number of reads to be mapped. However, other, less "brute-force" methodologies can be used to take into account the chimeric nature of Hi-C reads.
A simple alternative is to allow split mapping, just as with RNA-seq data.
Another way consists in pre-truncating (Ay and Noble, 2015) reads that contain a ligation site and mapping only the longest part of the read (Wingett et al., 2015).
Finally, an intermediate, fragment-based approach consists in mapping full-length reads first, and then splitting unmapped reads at the ligation sites (Serra, Baù, Filion and Marti-Renom, 2016).
Note: We use GEM (Marco-Sola, Sammeth, Guigó and Ribeca, 2012); its performance is very similar to that of Bowtie2, perhaps a bit better.
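To make the idea of cutting reads at ligation sites concrete, here is a minimal Python sketch. It is not the implementation used by TADbit or HiCUP, and the helper names `split_at_ligation` and `truncate_read` are hypothetical. It assumes HindIII (site A^AGCTT), whose filled-in and religated ends produce the junction AAGCTAGCTT, and it only considers the first junction found in a read.

```python
# Hypothetical helpers, for illustration only (not part of TADbit).
HINDIII_SITE = "AAGCTT"          # HindIII recognition site, cut as A^AGCTT
HINDIII_JUNCTION = "AAGCTAGCTT"  # junction after fill-in and blunt-end ligation


def split_at_ligation(read, site=HINDIII_SITE, junction=HINDIII_JUNCTION):
    """Split a read at the first ligation junction.

    Each piece gets a complete restriction site back, so that it matches
    the reference genome. Reads without a junction are returned whole.
    """
    pos = read.find(junction)
    if pos == -1:
        return [read]
    left = read[:pos] + site
    right = site + read[pos + len(junction):]
    return [left, right]


def truncate_read(read, site=HINDIII_SITE, junction=HINDIII_JUNCTION):
    """Pre-truncation: keep only the longest piece of the read."""
    return max(split_at_ligation(read, site, junction), key=len)


chimeric = "ACGTACGT" + HINDIII_JUNCTION + "GGCC"
print(split_at_ligation(chimeric))  # both pieces, as in the splitting step
print(truncate_read(chimeric))      # longest piece only, as in pre-truncation
```

In the fragment-based approach, this kind of splitting would be applied only to the reads that failed to map at full length.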
For now TADbit is only compatible with GEM.
In [1]:
from pytadbit.mapping.full_mapper import full_mapping
The full mapping function can be used to perform either iterative or fragment-based mapping, or a combination of both.
Here is an example of its use for iterative mapping:
In [11]:
r_enz = 'HindIII'
In [ ]:
! mkdir -p results/iterativ/$r_enz
! mkdir -p results/iterativ/$r_enz/01_mapping
In [6]:
# for the first side of the reads
full_mapping(gem_index_path='/media/storage/db/reference_genome/Homo_sapiens/hg38/hg38.gem',
out_map_dir='results/iterativ/{0}/01_mapping/mapped_{0}_r1/'.format(r_enz),
fastq_path='/media/storage/FASTQs/K562_%s_1.fastq' % (r_enz),
r_enz=r_enz, frag_map=False, clean=True, nthreads=20,
windows=((1,25),(1,30),(1,35),(1,40),(1,45),(1,50),(1,55),(1,60),(1,65),(1,70),(1,75)),
temp_dir='results/iterativ/{0}/01_mapping/mapped_{0}_r1_tmp/'.format(r_enz))
Out[6]:
And for the second side of the read:
In [8]:
# for the second side of the reads
full_mapping(gem_index_path='/media/storage/db/reference_genome/Homo_sapiens/hg38/hg38.gem',
out_map_dir='results/iterativ/{0}/01_mapping/mapped_{0}_r2/'.format(r_enz),
fastq_path='/media/storage/FASTQs/K562_%s_2.fastq' % (r_enz),
r_enz=r_enz, frag_map=False, clean=True, nthreads=20,
windows=((1,25),(1,30),(1,35),(1,40),(1,45),(1,50),(1,55),(1,60),(1,65),(1,70),(1,75)),
temp_dir='results/iterativ/{0}/01_mapping/mapped_{0}_r2_tmp/'.format(r_enz))
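The `windows` argument in the two calls above is written out by hand. As we understand it, each pair gives the first and last read position used in one mapping round, and reads that could not be mapped are retried with the next, longer window. A small sketch of how the same tuple could be generated (the variable names `first`, `last` and `step` are ours, not TADbit parameters):

```python
# Build the same windows as in the calls above: every window starts at
# position 1 and its end grows from 25 to 75 in steps of 5.
first, last, step = 25, 75, 5
windows = tuple((1, end) for end in range(first, last + 1, step))
print(windows)
```

Generating the tuple makes it easier to adapt the windows to a different read length or step size.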
With fragment-based mapping it would be:
In [12]:
! mkdir -p results/fragment/$r_enz
! mkdir -p results/fragment/$r_enz/01_mapping
In [13]:
# for the first side of the reads
full_mapping(gem_index_path='/media/storage/db/reference_genome/Homo_sapiens/hg38/hg38.gem',
out_map_dir='results/fragment/{0}/01_mapping/mapped_{0}_r1/'.format(r_enz),
fastq_path='/media/storage/FASTQs/K562_%s_1.fastq' % (r_enz),
r_enz=r_enz, frag_map=True, clean=True, nthreads=20,
temp_dir='results/fragment/{0}/01_mapping/mapped_{0}_r1_tmp/'.format(r_enz))
Out[13]:
In [14]:
# for the second side of the reads
full_mapping(gem_index_path='/media/storage/db/reference_genome/Homo_sapiens/hg38/hg38.gem',
out_map_dir='results/fragment/{0}/01_mapping/mapped_{0}_r2/'.format(r_enz),
fastq_path='/media/storage/FASTQs/K562_%s_2.fastq' % (r_enz),
r_enz=r_enz, frag_map=True, clean=True, nthreads=20,
temp_dir='results/fragment/{0}/01_mapping/mapped_{0}_r2_tmp/'.format(r_enz))
Out[14]:
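The four calls above differ only in the mapping strategy and the read side. As a sketch, the shared arguments could be collected by a small helper. `mapping_kwargs` is a hypothetical function, not part of TADbit; the paths and parameter values are simply copied from the cells above.

```python
# Hypothetical helper (not part of TADbit): collects the keyword arguments
# used in the four calls above. Paths are the ones from this notebook.
GEM_INDEX = '/media/storage/db/reference_genome/Homo_sapiens/hg38/hg38.gem'
ITERATIVE_WINDOWS = tuple((1, end) for end in range(25, 76, 5))


def mapping_kwargs(strategy, r_enz, side, nthreads=20):
    """Return full_mapping keyword arguments for one read side.

    strategy: 'iterativ' or 'fragment' (names of the result directories)
    side:     1 or 2, the read end to map
    """
    if strategy not in ('iterativ', 'fragment'):
        raise ValueError('unknown strategy: %s' % strategy)
    base = 'results/{0}/{1}/01_mapping/mapped_{1}_r{2}'.format(
        strategy, r_enz, side)
    kwargs = dict(
        gem_index_path=GEM_INDEX,
        out_map_dir=base + '/',
        fastq_path='/media/storage/FASTQs/K562_%s_%d.fastq' % (r_enz, side),
        r_enz=r_enz, frag_map=(strategy == 'fragment'), clean=True,
        nthreads=nthreads, temp_dir=base + '_tmp/')
    # Only the iterative strategy passes explicit windows in this notebook.
    if strategy == 'iterativ':
        kwargs['windows'] = ITERATIVE_WINDOWS
    return kwargs


for side in (1, 2):
    print(mapping_kwargs('fragment', 'HindIII', side)['out_map_dir'])
```

With TADbit installed, each mapping step would then reduce to `full_mapping(**mapping_kwargs('fragment', r_enz, side))`.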
[^](#ref-1) Imakaev, Maxim V and Fudenberg, Geoffrey and McCord, Rachel Patton and Naumova, Natalia and Goloborodko, Anton and Lajoie, Bryan R and Dekker, Job and Mirny, Leonid A. 2012. Iterative correction of Hi-C data reveals hallmarks of chromosome organization.
[^](#ref-2) Ay, Ferhat and Noble, William Stafford. 2015. Analysis methods for studying the 3D architecture of the genome.
[^](#ref-3) Wingett, Steven and Ewels, Philip and Furlan-Magaril, Mayra and Nagano, Takashi and Schoenfelder, Stefan and Fraser, Peter and Andrews, Simon. 2015. HiCUP: pipeline for mapping and processing Hi-C data.
[^](#ref-4) Serra, François and Baù, Davide and Filion, Guillaume and Marti-Renom, Marc A. 2016. Structural features of the fly chromatin colors revealed by automatic three-dimensional modeling.
[^](#ref-5) Marco-Sola, Santiago and Sammeth, Michael and Guigó, Roderic and Ribeca, Paolo. 2012. The GEM mapper: fast, accurate and versatile alignment by filtration.