Compartment and TAD detection

Here, we detect the compartments in mouse B cells and iPS cells. In this example, we use the GC content (guanine-cytosine content) to decide which bins belong to the A or the B compartment. The percentage of bases that are either guanine or cytosine on a DNA strand correlates positively with gene density, which makes it a good proxy for telling open chromatin from closed chromatin: the GC-rich bins are taken to be the A (open) compartment.
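
To make the notion of per-bin GC content concrete, here is a minimal sketch on a toy sequence. The helper `gc_content_by_bin` and the toy sequence are hypothetical and only illustrate the idea; this is not necessarily how `get_gc_content` computes its values internally.

```python
def gc_content_by_bin(sequence, resolution):
    """Return the GC fraction of each consecutive bin of `resolution` bases."""
    sequence = sequence.upper()
    fractions = []
    for start in range(0, len(sequence), resolution):
        window = sequence[start:start + resolution]
        # Count only unambiguous bases so that runs of N do not dilute the value.
        known = sum(window.count(base) for base in 'ACGT')
        gc = window.count('G') + window.count('C')
        # A bin made only of N has no defined GC content.
        fractions.append(gc / known if known else float('nan'))
    return fractions


# Toy sequence: one GC-rich bin, one AT-rich bin, one unsequenced bin.
toy_sequence = 'GGCCGGCCAT' + 'ATATATATGC' + 'NNNNNNNNNN'
gc_by_bin = gc_content_by_bin(toy_sequence, resolution=10)
print(gc_by_bin)
```

In the real analysis the bins are 200 kb wide rather than 10 bases, but the quantity is the same: one GC fraction per bin.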

Note: Compartments are detected on the full genome matrix.


In [1]:
from pytadbit.parsers.hic_parser import load_hic_data_from_bam
from pytadbit.parsers.genome_parser import get_gc_content, parse_fasta
from pickle import load

In [2]:
reso = 200000
base_path = 'results/fragment/{0}_both/03_filtering/valid_reads12_{0}.bam'
bias_path = 'results/fragment/{0}_both/04_normalizing/biases_{0}_both_{1}kb.biases'
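
The two templates above are filled in later with the cell name and the resolution in kilobases. As a quick check, this sketch expands them for the `mouse_B` sample; the variable names `bam_file` and `bias_file` are only used here for illustration.

```python
reso = 200000
base_path = 'results/fragment/{0}_both/03_filtering/valid_reads12_{0}.bam'
bias_path = 'results/fragment/{0}_both/04_normalizing/biases_{0}_both_{1}kb.biases'

cell = 'mouse_B'
# {0} is the cell name; {1} is the resolution expressed in kb.
bam_file = base_path.format(cell)
bias_file = bias_path.format(cell, reso // 1000)
print(bam_file)
print(bias_file)
```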

In [3]:
rich_in_A = get_gc_content(parse_fasta('genome/Mus_musculus-GRCm38.p6/Mus_musculus-GRCm38.p6.fa'),
                           by_chrom=True, resolution=reso, n_cpus=8)


Loading cached genome

Compartments

Mouse B cells


In [4]:
cell   = 'mouse_B'

In [5]:
hic_data = load_hic_data_from_bam(base_path.format(cell),
                                  resolution=reso,
                                  biases=bias_path.format(cell, reso // 1000),
                                  ncpus=8)


  (Matrix size 13641x13641)                                                    [2020-02-06 12:04:07]

  - Parsing BAM (122 chunks)                                                   [2020-02-06 12:04:08]
     .......... .......... .......... .......... ..........     50/122
     .......... .......... .......... .......... ..........    100/122
     .......... .......... ..                                  122/122

  - Getting matrices                                                           [2020-02-06 12:06:28]
     .......... .......... .......... .......... ..........     50/122
     .......... .......... .......... .......... ..........    100/122
     .......... .......... ..                                  122/122


In [6]:
! mkdir -p results/fragment/$cell\_both/05_segmenting

In [7]:
chrname = 'chr3'
# rich_in_A holds the per-chromosome GC content used to decide which bins are labeled A.
corr = hic_data.find_compartments(
    crms=[chrname], rich_in_A=rich_in_A,
    show=True, show_compartment_labels=True, vmin='auto', vmax='auto',
    savedata='results/fragment/{0}_both/05_segmenting/compartments_{1}_{2}.tsv'.format(cell, chrname, reso),
    savedir='results/fragment/{0}_both/05_segmenting/eigenvectors_{1}_{2}'.format(cell, chrname, reso))
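
The sign of an eigenvector is arbitrary, so an external signal such as GC content is needed to decide which side is compartment A. The sketch below illustrates that idea on made-up numbers. The function `label_compartments` and the toy values are hypothetical, and this is a simplified illustration of the principle, not the code that `find_compartments` runs.

```python
def label_compartments(eigenvector, gc_content):
    """Label each bin 'A' or 'B', orienting the eigenvector so that the
    side with the higher mean GC content becomes compartment A."""
    positive = [gc for ev, gc in zip(eigenvector, gc_content) if ev > 0]
    negative = [gc for ev, gc in zip(eigenvector, gc_content) if ev < 0]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    # Flip the sign when the negative side is the GC-rich one.
    if mean(negative) > mean(positive):
        eigenvector = [-ev for ev in eigenvector]
    return ['A' if ev > 0 else 'B' for ev in eigenvector]


# Toy values: the negative bins are GC-rich, so the vector gets flipped.
toy_eigenvector = [0.4, 0.3, -0.2, -0.5]
toy_gc = [0.38, 0.40, 0.52, 0.49]
labels = label_compartments(toy_eigenvector, toy_gc)
print(labels)
```

With these values the first two bins end up in B and the last two in A, even though the raw eigenvector had them the other way around.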