Here we present the analysis used to detect compartments in mouse B cells and iPS cells. In this example, we use GC content (guanine-cytosine content) to decide which bins belong to the A or the B compartment. The percentage of bases on a DNA strand that are either guanine or cytosine correlates positively with gene density, which makes it a good proxy for distinguishing open from closed chromatin.
Note: Compartments are detected on the full genome matrix.
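To make the GC-content measure concrete, here is a minimal pure-Python sketch that computes the GC fraction of consecutive fixed-size bins of a sequence. The function `gc_content_per_bin` and the toy sequence are hypothetical and only illustrate the idea; ignoring non-ACGT bases such as `N` is an assumption of this sketch, not necessarily what `get_gc_content` does internally.

```python
def gc_content_per_bin(sequence, resolution):
    """Return the GC fraction of each consecutive bin of `resolution` bases.

    Assumption of this sketch: bases other than A, C, G, T (for example N)
    are ignored, and a bin without any informative base gets None.
    """
    fractions = []
    for start in range(0, len(sequence), resolution):
        window = sequence[start:start + resolution].upper()
        gc = window.count('G') + window.count('C')
        at = window.count('A') + window.count('T')
        fractions.append(gc / (gc + at) if gc + at else None)
    return fractions


# Toy 30-base sequence split into three 10-base bins:
# a GC-rich bin, an AT-rich bin, and an unsequenced (all-N) bin.
toy_sequence = 'GGCCGGCCAT' + 'ATATATATGC' + 'NNNNNNNNNN'
toy_gc = gc_content_per_bin(toy_sequence, resolution=10)
print(toy_gc)  # [0.8, 0.2, None]
```

In the real analysis the bins are 200 kb wide (`reso = 200000`) and the sequence comes from the reference genome FASTA file.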
In [1]:
from pytadbit.parsers.hic_parser import load_hic_data_from_bam
from pytadbit.parsers.genome_parser import get_gc_content, parse_fasta
from pickle import load
In [2]:
reso = 200000
base_path = 'results/fragment/{0}_both/03_filtering/valid_reads12_{0}.bam'
bias_path = 'results/fragment/{0}_both/04_normalizing/biases_{0}_both_{1}kb.biases'
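The two path templates are filled in later with the cell name and the resolution in kilobases. As a quick check, assuming the cell label `'mouse_B'` used further down in this notebook, they expand as follows:

```python
reso = 200000
cell = 'mouse_B'  # same label as the one set later in this notebook
base_path = 'results/fragment/{0}_both/03_filtering/valid_reads12_{0}.bam'
bias_path = 'results/fragment/{0}_both/04_normalizing/biases_{0}_both_{1}kb.biases'

# {0} is replaced by the cell name, {1} by the resolution in kb
bam_file = base_path.format(cell)
bias_file = bias_path.format(cell, reso // 1000)
print(bam_file)   # results/fragment/mouse_B_both/03_filtering/valid_reads12_mouse_B.bam
print(bias_file)  # results/fragment/mouse_B_both/04_normalizing/biases_mouse_B_both_200kb.biases
```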
In [3]:
rich_in_A = get_gc_content(parse_fasta('genome/Mus_musculus-GRCm38.p6/Mus_musculus-GRCm38.p6.fa'),
                           by_chrom=True, resolution=reso, n_cpus=8)
In [4]:
cell = 'mouse_B'
In [5]:
hic_data = load_hic_data_from_bam(base_path.format(cell),
                                  resolution=reso,
                                  biases=bias_path.format(cell, reso // 1000),
                                  ncpus=8)
In [6]:
! mkdir -p results/fragment/$cell\_both/05_segmenting
In [7]:
chrname = 'chr3'
corr = hic_data.find_compartments(
    show_compartment_labels=True, show=True, crms=[chrname],
    vmin='auto', vmax='auto', rich_in_A=rich_in_A,
    savedata='results/fragment/{0}_both/05_segmenting/compartments_{1}_{2}.tsv'.format(cell, chrname, reso),
    savedir='results/fragment/{0}_both/05_segmenting/eigenvectors_{1}_{2}'.format(cell, chrname, reso))
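The sign of an eigenvector is arbitrary, so an external signal such as GC content is needed to decide which sign corresponds to the A compartment. The sketch below illustrates that idea on toy values: if the eigenvector is anti-correlated with GC content it is flipped, and bins with a positive oriented value are labelled A. The helpers `pearson` and `label_compartments` and all numbers are hypothetical; this is a simplified illustration, not TADbit's internal implementation of `find_compartments`.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def label_compartments(eigenvector, gc_content):
    """Label each bin 'A' or 'B', orienting the eigenvector with GC content."""
    # Flip the eigenvector when it is anti-correlated with GC content,
    # so that positive values always mark the GC-rich (A) compartment.
    sign = 1 if pearson(eigenvector, gc_content) >= 0 else -1
    return ['A' if sign * value > 0 else 'B' for value in eigenvector]


# Toy values for five bins; here the raw eigenvector is negative where
# GC content is high, so it has to be flipped before labelling.
toy_eigenvector = [-0.4, -0.2, 0.3, 0.5, -0.1]
toy_gc = [0.55, 0.50, 0.38, 0.36, 0.48]
toy_labels = label_compartments(toy_eigenvector, toy_gc)
print(toy_labels)  # ['A', 'A', 'B', 'B', 'A']
```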