Applying GenomeDISCO on Hi-C data

Analysis to produce Figure 1B and Figure 3

Contact: oursu@stanford.edu

3.2 Benchmarking GenomeDISCO on Hi-C datasets

We used more than 80 high-quality Hi-C datasets from Rao et al. (2014), spanning multiple human cell lines (GM12878, HMEC, HUVEC, IMR90, K562, KBM7, NHEK), to benchmark the behavior of our concordance score (Figure 3). Because there is no explicit ground truth about the nature of noise in real datasets, we evaluate the validity of the concordance score by checking that pairs of biological replicates with similar distance-dependence characteristics score higher than pairs of Hi-C datasets from different cell types. We focused our analysis on the subset of experiments performed with in situ Hi-C (see Supplementary Table 2).

Next, we used GenomeDISCO, HiCRep and HiC-Spector to compute concordance scores for all the pairs of biological replicates and pairs of samples from different cell types. Hierarchical clustering of the samples based on the matrix of all pairwise concordance scores revealed that samples from the same cell type cluster together, for all three methods (see Supplementary Figure 5). For each method we defined an empirical threshold for classifying sample pairs into one of two categories labeled high-concordance and low-concordance. The threshold was determined as the highest score across all pairs of samples from different cell types, since we expect concordant biological replicates to be at least as concordant as samples from different cell types. We then analyzed the similarities and differences between the three methods in terms of their classification of the pairs of biological replicates (Figure 3A).
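The clustering step described above can be sketched on a toy score matrix. This is only an illustration: the sample names and scores are made up, and converting concordance to a dissimilarity as 1 - score with average linkage is an assumption, not necessarily the setup used for Supplementary Figure 5.

```r
# Toy symmetric matrix of pairwise concordance scores (made-up values).
samples <- c("A_rep1", "A_rep2", "B_rep1", "B_rep2")
scores <- matrix(c(1.00, 0.90, 0.55, 0.50,
                   0.90, 1.00, 0.52, 0.58,
                   0.55, 0.52, 1.00, 0.88,
                   0.50, 0.58, 0.88, 1.00),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(samples, samples))

# Assumption: use 1 - score as the dissimilarity between two samples.
d <- as.dist(1 - scores)

# Average-linkage hierarchical clustering, then cut the tree into two groups.
hc <- hclust(d, method = "average")
clusters <- cutree(hc, k = 2)
print(clusters)
```

If samples from the same cell type are more concordant with each other than with other cell types, replicates of the same cell type end up with the same cluster label.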

Out of 149 pairs of biological replicates in the test set, we found that the methods agreed across most samples (93/149 biological replicate pairs were classified consistently between GenomeDISCO and HiCRep, and 103/149 between GenomeDISCO and HiC-Spector). For a small subset of replicate pairs, HiCRep and/or HiC-Spector classified them as high-concordance, while GenomeDISCO classified them as low-concordance. For 27/56 of the discrepancies between GenomeDISCO and HiCRep and 19/46 of the discrepancies with HiC-Spector, the comparisons involved samples with large differences in distance dependence curves (a difference higher than 0.005, a value that was found to distinguish pairs of biological replicates in the high-concordance class from those in the low-concordance class). For example, samples HIC070 and HIC072 (biological replicates for the K562 cell line) are classified as low-concordance by GenomeDISCO (score 0.643), but as high-concordance by HiCRep (score 0.911). These samples have a marked difference in their distance dependence curves (the largest such difference among all biological replicate pairs) (Figure 3C). In fact, GenomeDISCO scores in general drop in proportion to the difference in distance dependence curves between the pair of samples being compared (Figure 3B). Finally, we find 16 cases ranked as non-concordant by both HiCRep and HiC-Spector but deemed concordant by GenomeDISCO. For 5/16 of these, the GenomeDISCO score is equal to the threshold concordance of 0.8. Similarly, there are 19 cases deemed concordant only by HiCRep and 7 deemed concordant only by HiC-Spector.

We also found that 18 replicate pairs were deemed low-concordance by all three methods. In particular, eight of these pairs, which were classified as low-concordance by all three methods despite being deeply sequenced (>100 million reads), involved sample HIC014 from the GM12878 cell type (specifically HIC014 vs any of HIC004, HIC006, HIC010, HIC018, HIC022, HIC038, HIC042, HIC048). Upon closer inspection, we found that HIC014 exhibited an unusual pattern of uneven coverage across the genome (Figure 3D), likely explaining the observed results.

Finally, we also used the Hi-C data to check whether GenomeDISCO is able to detect differences in the protocols or restriction enzymes used for each experiment (see Supplementary Figure 4A). We found that GenomeDISCO scores are lower for comparisons between samples prepared with dilution Hi-C versus in situ Hi-C. This observation is expected because dilution Hi-C experiments capture more random ligations between nuclear and mitochondrial DNA than in situ Hi-C (see Rao et al., 2014). We also found that GenomeDISCO scores are higher for experiments performed with the same enzyme than for experiments performed with different enzymes (Supplementary Figure 4B).


In [1]:
require(ggplot2)
require(pheatmap)
require(PRROC)
#shared ggplot theme: no border or grid lines, black axis lines, large axis text
niceggplot=theme(panel.border = element_blank(),
                 panel.grid.major = element_blank(),
                 panel.grid.minor = element_blank(),
                 axis.line = element_line(colour = "black"),
                 axis.text = element_text(size=20),
                 axis.title = element_text(size=20))


Loading required package: ggplot2
Loading required package: pheatmap
Loading required package: PRROC

Fixed parameters used throughout the analysis.


In [2]:
DATA_PATH='/ifs/scratch/oursu/paper_2017-12-20'
RES='50000'
SCORES_PATH=paste(DATA_PATH,'/results/rao/res',RES,'.final/compiled_scores',sep='')
PLOTS_PATH=paste(SCORES_PATH,'/plots',sep='')
#create the output directory (and any missing parents) without shelling out
dir.create(PLOTS_PATH,recursive=TRUE,showWarnings=FALSE)
DISTDEP_FILE=paste(DATA_PATH,'/results/rao/dd/dddiff.real.txt',sep='')

Divide datasets into training and test based on whether their dataset index is odd or even


In [3]:
metadata=read.table(paste(DATA_PATH,'/data/LA_metadata.txt',sep=''),header=TRUE,sep='\t')
rownames(metadata)=metadata[,'library']

train_data=metadata[seq(1,dim(metadata)[1],2),]
test_data=metadata[seq(2,dim(metadata)[1],2),]
print('Number of datasets in training set')
print(dim(train_data)[1])
print('Number of datasets in test set')
print(dim(test_data)[1])


[1] "Number of datasets in training set"
[1] 44
[1] "Number of datasets in test set"
[1] 43

Read in scores, for each time step, t, for GenomeDISCO


In [4]:
read_in_scores=function(f){
    scores=read.table(f)
    colnames(scores)[1:4]=c('chromosome','m1','m2','score')
    #remove comparisons comparing one dataset to itself
    same=which(as.character(scores[,2])==as.character(scores[,3]))
    if (length(same)>0){
        scores=scores[-same,]
    }
    rownames(scores)=paste(as.character(scores[,2]),as.character(scores[,3]))
    return(scores[,-1])
}

disco_by_t=read_in_scores(paste(SCORES_PATH,'/HiC.GenomeDISCO.scores.multiple_t.genomewide.txt.gz',sep=''))
colnames(disco_by_t)=c('m1','m2','1','2','3','4','5')
head(disco_by_t)


                  m1     m2           1         2         3         4         5
HIC048 HIC066 HIC048 HIC066  0.15282609 0.5796957 0.6792174 0.7167826 0.7424783
HIC048 HIC064 HIC048 HIC064  0.04478261 0.5217826 0.6287391 0.6799130 0.7174783
HIC012 HIC026 HIC012 HIC026  0.20791304 0.7697391 0.8634783 0.8895652 0.9016087
HIC048 HIC062 HIC048 HIC062 -0.06617391 0.3366522 0.5795217 0.6675217 0.7153043
HIC048 HIC060 HIC048 HIC060 -0.09117391 0.3093478 0.5642609 0.6568696 0.7071304
HIC048 HIC068 HIC048 HIC068  0.02052174 0.4820870 0.5798696 0.6274348 0.6646087

Finding the optimal value of t (Figure 1B)

Obtain auPRC and auROC curves as a function of the parameter t
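The notebook computes these areas with the PRROC package. As an independent sanity check, auROC can also be obtained in base R from ranks, since it equals the fraction of (replicate, non-replicate) pairs in which the replicate pair scores higher. The helper name `auroc_rank` and the toy values below are hypothetical and are not part of the original analysis.

```r
# Rank-based auROC (equivalent to the normalized Mann-Whitney U statistic).
# scores: numeric concordance scores; labels: 1 = replicate pair, 0 = non-replicate.
auroc_rank <- function(scores, labels) {
  r <- rank(scores)                 # average ranks, so ties count as half
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data (made-up values).
toy_scores <- c(0.9, 0.8, 0.7, 0.6, 0.5)
toy_labels <- c(1, 1, 0, 1, 0)
toy_auc <- auroc_rank(toy_scores, toy_labels)
print(toy_auc)  # 5 of the 6 positive/negative pairs are ordered correctly
```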


In [5]:
get_roc_pr_from_scores=function(scores){
    myroc=roc.curve(scores.class0=scores[,'score'], weights.class0=scores[,'biorep'],curve=TRUE)
    mypr=pr.curve(scores.class0=scores[,'score'], weights.class0=scores[,'biorep'],curve=TRUE)
    toreturn=list()
    toreturn[['roc']]=myroc[['auc']]
    toreturn[['pr']]=mypr[['auc.integral']]
    return(toreturn)
}

add_metadata=function(metadata,scores){
    scores=data.frame(scores,
                      seqdepth=apply(data.frame(metadata[as.character(scores[,'m1']),'totalReads'],metadata[as.character(scores[,'m2']),'totalReads']),1,min), 
                      cell1=metadata[as.character(scores[,'m1']),'celltype'],
                      cell2=metadata[as.character(scores[,'m2']),'celltype'],
                      re1=metadata[as.character(scores[,'m1']),'re'],
                      re2=metadata[as.character(scores[,'m2']),'re'],
                      crosslinking=paste(metadata[as.character(scores[,'m1']),'crosslinking'],metadata[as.character(scores[,'m2']),'crosslinking']),
                      protocol1=metadata[as.character(scores[,'m1']),'protocol'],
                      protocol2=metadata[as.character(scores[,'m2']),'protocol'])
    return(scores)
}

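annotate_biorep_re=function(scores){
    #biorep: 1 if both samples come from the same cell type, 0 otherwise
    #NOTE: the column initialized here is named 're', but the assignment below writes to 'same_re'.
    #As a result 're' stays 0 everywhere, and 'same_re' is 1 for same-enzyme pairs and NA otherwise
    #(this is what the printed tables further down show).
    scores=data.frame(scores,biorep=0,re=0)
    scores[which(scores[,'cell1']==scores[,'cell2']),'biorep']=1
    scores[which(scores[,'re1']==scores[,'re2']),'same_re']=1
    return(scores)
}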

filter_insitu_crosslinked=function(scores){
    #restrict to in situ experiments
    in_situ=intersect(which(as.character(scores[,'protocol1'])=='in situ'),
                     which(as.character(scores[,'protocol2'])=='in situ'))
    
    #remove comparisons against un-crosslinked samples
    non_crosslink=which(grepl("NA",as.character(scores[,'crosslinking'])))

    #filter to keep only in-situ samples that have been crosslinked
    keep=setdiff(in_situ,non_crosslink)
    scores=scores[keep,]
    
    return(scores)
}

filter_data=function(scores,metadata){
    keep=intersect(which(as.character(scores[,'m1']) %in% rownames(metadata)),
                   which(as.character(scores[,'m2']) %in% rownames(metadata)))
    scores=scores[keep,]

    #fill in the entries from the metadata for the pairs
    scores=add_metadata(metadata,scores)
    
    #specify which are bioreps and which experiments are done with the same restriction enzyme
    scores=annotate_biorep_re(scores)
    
    #filter out comparisons that are not in situ or against non-crosslinked samples
    scores=filter_insitu_crosslinked(scores)
    return(scores)
}

auPRC_auROC_curve=function(scores_orig,metadata,out){
    FIG_WIDTH=4
    FIG_HEIGHT=3
    t_values=as.numeric(as.character(colnames(scores_orig)[3:(dim(scores_orig)[2])]))
    
    scores=filter_data(scores_orig,metadata)
    colnames(scores)[1:dim(scores_orig)[2]]=colnames(scores_orig)

    #Performance analysis with auROC and auPRC ================================
    auprc_values=c()
    auroc_values=c()
    for (t in t_values){
        current_scores=scores[,as.character(t)]
        roc_pr=get_roc_pr_from_scores(data.frame(biorep=scores[,'biorep'],score=current_scores))
        auprc_values=c(auprc_values,roc_pr[['pr']])
        auroc_values=c(auroc_values,roc_pr[['roc']])
    }
        
    #plot auPRC
    pdf(out,width=FIG_WIDTH,height=FIG_HEIGHT)
    auprcs=data.frame(t=t_values,auPRC=auprc_values)
    p=ggplot(auprcs,aes(x=t,y=auPRC))+theme_bw()+
    geom_point(size=3)+ylim(0,1)+xlab('Random walk steps t')+ylab('auPRC')+
    niceggplot+geom_line()+ggtitle('auPRC')
    print(p)
    dev.off()
    
    print(auprcs)
    Sys.sleep(3)
    print(p)     
    
    #plot auROC   
    aurocs=data.frame(t=t_values,auROC=auroc_values)
    p=ggplot(aurocs,aes(x=t,y=auROC))+theme_bw()+
    geom_point(size=3)+ylim(0,1)+xlab('Random walk steps t')+ylab('auROC')+
    niceggplot+geom_line()+ggtitle('auROC')
      
    print(aurocs)
    Sys.sleep(3)
    print(p)
}

print('Training set performance =============================')
auPRC_auROC_curve(disco_by_t,train_data,
                 paste(PLOTS_PATH,'/GenomeDISCO.train.auPRC_curve.pdf',sep=''))

print('Test set performance =============================')
auPRC_auROC_curve(disco_by_t,test_data,
                 paste(PLOTS_PATH,'/GenomeDISCO.test.auPRC_curve.pdf',sep=''))


[1] "Training set performance ============================="
  t     auPRC
1 1 0.4575498
2 2 0.8430601
3 3 0.9605889
4 4 0.9634034
5 5 0.9575416
  t     auROC
1 1 0.6542259
2 2 0.8985076
3 3 0.9689963
4 4 0.9718838
5 5 0.9698116
[1] "Test set performance ============================="
  t     auPRC
1 1 0.4116131
2 2 0.8256656
3 3 0.9252585
4 4 0.9349572
5 5 0.9339472
  t     auROC
1 1 0.6224301
2 2 0.8735770
3 3 0.9431442
4 4 0.9499724
5 5 0.9484326

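Based on the above analysis, we used a value of t=3 as optimal: auPRC and auROC rise steeply up to t=3 and then plateau, with gains below 0.01 from additional steps on both the training and test sets. We fix this value for the following analyses.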
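One way to make the choice of t explicit is to take the smallest t whose auPRC is within a small tolerance of the best value. The rule, the helper name `smallest_t_near_best` and the tolerance of 0.01 are assumptions made for illustration and are not necessarily how t was selected here; the auPRC values are copied from the printed output above.

```r
# auPRC values copied from the training and test output printed in this notebook.
auprc <- data.frame(
  t     = 1:5,
  train = c(0.4575498, 0.8430601, 0.9605889, 0.9634034, 0.9575416),
  test  = c(0.4116131, 0.8256656, 0.9252585, 0.9349572, 0.9339472)
)

# Hypothetical rule: smallest t whose value is within `tol` of the maximum.
smallest_t_near_best <- function(t, values, tol = 0.01) {
  min(t[values >= max(values) - tol])
}

t_train <- smallest_t_near_best(auprc$t, auprc$train)
t_test  <- smallest_t_near_best(auprc$t, auprc$test)
print(c(train = t_train, test = t_test))
```

Under this rule both sets point to the same number of random walk steps; a tighter tolerance would favor a larger t.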

We also use the following thresholds of concordance, which are the highest scores obtained by non-replicates in this analysis.


In [6]:
GENOMEDISCO_THRESHOLD=0.8
HICREP_THRESHOLD=0.82
HICSPECTOR_THRESHOLD=0.27
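
The thresholding idea can be sketched on toy data: take the highest score among pairs from different cell types and call a pair high-concordance only if it scores above that value. The scores below are made up, and the strict comparison is an assumption of this sketch; the notebook itself uses `>=` against the fixed constants defined above.

```r
# Toy scores (made-up values); biorep: 1 = same cell type, 0 = different cell types.
toy <- data.frame(score  = c(0.91, 0.85, 0.78, 0.74, 0.62, 0.55),
                  biorep = c(1,    1,    1,    0,    0,    0))

# Empirical threshold: highest score obtained by a non-replicate pair.
threshold <- max(toy$score[toy$biorep == 0])

# Strict comparison so the top-scoring non-replicate pair stays low-concordance.
toy$class <- ifelse(toy$score > threshold, "high-concordance", "low-concordance")
print(threshold)
print(toy)
```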

Comparing scores from GenomeDISCO with those from other methods (Figure 3A)

Read in data


In [7]:
read_in_scores=function(f){
    scores=read.table(f)
    colnames(scores)=c('chromosome','m1','m2','score')
    #remove comparisons comparing one dataset to itself
    same=which(as.character(scores[,2])==as.character(scores[,3]))
    if (length(same)>0){
        scores=scores[-same,]
    }
    rownames(scores)=paste(as.character(scores[,2]),as.character(scores[,3]))
    return(scores[,-1])
}

disco=read_in_scores(paste(SCORES_PATH,'/HiC.GenomeDISCO.scores.genomewide.txt.gz',sep=''))
hicrep=read_in_scores(paste(SCORES_PATH,'/HiC.HiCRep.scores.genomewide.txt.gz',sep=''))
hicspector=read_in_scores(paste(SCORES_PATH,'/HiC.HiC-Spector.scores.genomewide.txt.gz',sep=''))


common=intersect(rownames(disco),intersect(rownames(hicrep),rownames(hicspector)))
allscores=data.frame(m1=disco[common,'m1'],m2=disco[common,'m2'],
                 GenomeDISCO=disco[common,'score'],
                 HiCRep=hicrep[common,'score'],
                 HiCSpector=hicspector[common,'score'])
head(allscores)


    m1     m2 GenomeDISCO    HiCRep HiCSpector
HIC048 HIC066   0.6792174 0.6785652  0.1678696
HIC048 HIC064   0.6287391 0.6990870  0.1682609
HIC012 HIC026   0.8634783 0.6996087  0.2200000
HIC048 HIC062   0.5795217 0.6757391  0.1953913
HIC048 HIC060   0.5642609 0.7022174  0.1937826
HIC048 HIC068   0.5798696 0.6508696  0.1344348

In [26]:
compare_methods=function(scores1,scores2,ddfile,s1name,s2name,metadata,thresh1,thresh2,out,xmin,xmax,ymin,ymax,dist_dep_thresh){
    #======= settings for nice-looking ggplots
    niceggplot=theme(panel.grid.major = element_blank(),panel.grid.minor = element_blank(), axis.line = element_line(colour = "black"))

    niceggplot_fullborders=theme(axis.text=element_text(size=20),
                             axis.title=element_text(size=20))
    #==========================================
    
    #set column and row names
    colnames(scores1)[3]='score'
    colnames(scores2)[3]='score'
    rownames(scores1)=paste(as.character(scores1[,'m1']),as.character(scores1[,'m2']))
    rownames(scores2)=paste(as.character(scores2[,'m1']),as.character(scores2[,'m2']))
    
    #filter data to only keep in situ samples and samples that have been crosslinked
    scores1=filter_data(scores1,metadata)
    scores2=filter_data(scores2,metadata)
    
    #read in differences in distance dependence
    dds=read.table(ddfile)
    rownames(dds)=paste(dds[,1],dds[,2])
    
    #combine scores in one data frame, annotated
    common=intersect(rownames(scores1),rownames(scores2))
    combined=data.frame(m1=scores1[common,'m1'],
                        m2=scores1[common,'m2'],
                        s1=scores1[common,'score'],
                        s2=scores2[common,'score'],
                        dd=dds[common,3],
                        biorep=scores1[common,'biorep'],
                        seqdepth=scores1[common,'seqdepth'])
    combined=combined[order(as.numeric(as.character(combined$dd))),]
    combined=data.frame(combined,dd_binarized=(as.numeric(as.character(combined[,'dd']))>=dist_dep_thresh))
    
    #scatterplot of scores coming from the 2 methods
    FIG_WIDTH=6
    FIG_HEIGHT=10
    MAX_DD_DIFF=0.025
    pdf(paste(out,'.dd.pdf',sep=''),width=FIG_WIDTH,height=FIG_HEIGHT)
    p=ggplot(combined,aes(x=s2,y=s1,col=dd_binarized))+
         geom_point(size=1.5)+xlab(s2name)+ylab(s1name)+
         theme_bw()+facet_wrap(~biorep,nrow=2)+
         scale_colour_manual(values = c('gray','red'))+
         geom_hline(yintercept = thresh2)+geom_vline(xintercept = thresh1)+
         xlim(xmin,xmax)+ylim(ymin,ymax)+niceggplot
    print(p)
    dev.off()
    print(p)  
    
    #======================================================
    #print numbers relevant for the cross-method comparison
    #======================================================
    print(paste('==================','Comparing',s1name,'and',s2name,'==================='))
    print('Total comparisons considered')
    print(dim(combined)[1])
    print('Total biological replicates')
    bioreps=which(as.character(combined$biorep)=='1')
    print(length(bioreps))
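    #split into concordant and nonconcordant
    #note the crossed naming: thresh2 is applied to s1 (scores1) and thresh1 to s2 (scores2),
    #matching the argument order used in the calls below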
    conc1=which(as.numeric(as.character(combined[,'s1']))>=thresh2)
    conc2=which(as.numeric(as.character(combined[,'s2']))>=thresh1)
    nonconc1=which(as.numeric(as.character(combined[,'s1']))<thresh2)
    nonconc2=which(as.numeric(as.character(combined[,'s2']))<thresh1)

    print('Union of comparisons deemed concordant by any of the two methods (bioreps + non-reps)')
    print(length(union(conc1,conc2)))
    print('- restricted to biological replicates')
    print(length(intersect(bioreps,union(conc1,conc2))))
    
    print('Intersection of comparisons deemed concordant by both methods (bioreps + non-reps)')
    print(length(intersect(conc1,conc2)))
    print('- restricted to biological replicates')
    print(length(intersect(bioreps,intersect(conc1,conc2))))
    
    print('Intersection of comparisons deemed non-concordant by both methods (bioreps + non-reps)')
    print(length(intersect(nonconc1,nonconc2)))
    print('- restricted to biological replicates')
    print(length(intersect(bioreps,intersect(nonconc1,nonconc2))))
    
    print('Cases where methods agree in their classification of comparisons')
    print(length(intersect(conc1,conc2))+length(intersect(nonconc1,nonconc2)))
    print('- restricted to biological replicates')
    print(length(intersect(bioreps,intersect(conc1,conc2)))+
          length(intersect(bioreps,intersect(nonconc1,nonconc2))))
    
    print('Cases where methods disagree in their classification of comparisons')
    print(dim(combined)[1]-(length(intersect(conc1,conc2))+length(intersect(nonconc1,nonconc2))))
    print('- restricted to biological replicates')
    print(length(bioreps)-(length(intersect(bioreps,intersect(conc1,conc2)))+
          length(intersect(bioreps,intersect(nonconc1,nonconc2)))))
    
    print('Concordant by other method but not GenomeDISCO')
    print(length(setdiff(conc2,conc1)))
    print('- restricted to biological replicates')
    print(length(intersect(bioreps,setdiff(conc2,conc1))))
    
    print('Concordant by other method but not GenomeDISCO, explained by distance dep')
    dist_dep_diff=which(as.numeric(as.character(combined$dd))>=dist_dep_thresh)
    print(length(intersect(dist_dep_diff,setdiff(conc2,conc1))))
    print('- restricted to biological replicates')
    print(length(intersect(bioreps,intersect(dist_dep_diff,setdiff(conc2,conc1)))))
    
    print('Concordant by GenomeDISCO, not the other method')
    print(length(setdiff(conc1,conc2)))
    print('- restricted to biological replicates')
    print(length(intersect(bioreps,setdiff(conc1,conc2))))
    
}

HICREP_MIN=0
HICREP_MAX=1
GENOMEDISCO_MIN=0.4
GENOMEDISCO_MAX=1
HICSPECTOR_MIN=0
HICSPECTOR_MAX=0.7
DIST_DEP_THRESHOLD=0.005

compare_methods(allscores[,c('m1','m2','GenomeDISCO')],allscores[,c('m1','m2','HiCRep')],
               DISTDEP_FILE,
                'GenomeDISCO score','HiCRep score',
                test_data,
                HICREP_THRESHOLD,GENOMEDISCO_THRESHOLD,
                paste(PLOTS_PATH,'/GenomeDISCO.vs.HiCRep.pdf',sep=''),
                HICREP_MIN,HICREP_MAX,GENOMEDISCO_MIN,GENOMEDISCO_MAX,DIST_DEP_THRESHOLD)

compare_methods(allscores[,c('m1','m2','GenomeDISCO')],allscores[,c('m1','m2','HiCSpector')],
               DISTDEP_FILE,
                'GenomeDISCO score','HiC-Spector score',
                test_data,
                HICSPECTOR_THRESHOLD,GENOMEDISCO_THRESHOLD,
                paste(PLOTS_PATH,'/GenomeDISCO.vs.HiCSpector.pdf',sep=''),
                HICSPECTOR_MIN,HICSPECTOR_MAX,GENOMEDISCO_MIN,GENOMEDISCO_MAX,DIST_DEP_THRESHOLD)


[1] "================== Comparing GenomeDISCO score and HiCRep score ==================="
[1] "Total comparisons considered"
[1] 465
[1] "Total biological replicates"
[1] 149
[1] "Union of comparisons deemed concordant by any of the two methods (bioreps + non-reps)"
[1] 124
[1] "- restricted to biological replicates"
[1] 124
[1] "Intersection of comparisons deemed concordant by both methods (bioreps + non-reps)"
[1] 69
[1] "- restricted to biological replicates"
[1] 69
[1] "Intersection of comparisons deemed non-concordant by both methods (bioreps + non-reps)"
[1] 341
[1] "- restricted to biological replicates"
[1] 25
[1] "Cases where methods agree in their classification of comparisons"
[1] 410
[1] "- restricted to biological replicates"
[1] 94
[1] "Cases where methods disagree in their classification of comparisons"
[1] 55
[1] "- restricted to biological replicates"
[1] 55
[1] "Concordant by other method but not GenomeDISCO"
[1] 34
[1] "- restricted to biological replicates"
[1] 34
[1] "Concordant by other method but not GenomeDISCO, explained by distance dep"
[1] 21
[1] "- restricted to biological replicates"
[1] 21
[1] "Concordant by GenomeDISCO, not the other method"
[1] 21
[1] "- restricted to biological replicates"
[1] 21
[1] "================== Comparing GenomeDISCO score and HiC-Spector score ==================="
[1] "Total comparisons considered"
[1] 465
[1] "Total biological replicates"
[1] 149
[1] "Union of comparisons deemed concordant by any of the two methods (bioreps + non-reps)"
[1] 113
[1] "- restricted to biological replicates"
[1] 113
[1] "Intersection of comparisons deemed concordant by both methods (bioreps + non-reps)"
[1] 66
[1] "- restricted to biological replicates"
[1] 66
[1] "Intersection of comparisons deemed non-concordant by both methods (bioreps + non-reps)"
[1] 352
[1] "- restricted to biological replicates"
[1] 36
[1] "Cases where methods agree in their classification of comparisons"
[1] 418
[1] "- restricted to biological replicates"
[1] 102
[1] "Cases where methods disagree in their classification of comparisons"
[1] 47
[1] "- restricted to biological replicates"
[1] 47
[1] "Concordant by other method but not GenomeDISCO"
[1] 23
[1] "- restricted to biological replicates"
[1] 23
[1] "Concordant by other method but not GenomeDISCO, explained by distance dep"
[1] 12
[1] "- restricted to biological replicates"
[1] 12
[1] "Concordant by GenomeDISCO, not the other method"
[1] 24
[1] "- restricted to biological replicates"
[1] 24
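
The agreement counts printed above amount to a 2x2 cross-tabulation of the two methods' calls. A minimal sketch on toy scores follows; the score values are made up, while the two thresholds are the constants defined earlier in the notebook.

```r
# Thresholds as defined earlier in the notebook.
GENOMEDISCO_THRESHOLD <- 0.8
HICREP_THRESHOLD <- 0.82

# Toy scores for five replicate pairs (made-up values).
toy <- data.frame(GenomeDISCO = c(0.86, 0.64, 0.83, 0.70, 0.79),
                  HiCRep      = c(0.91, 0.91, 0.70, 0.65, 0.88))

# Convert scores to high/low calls with a fixed level order.
to_call <- function(x, thresh) {
  factor(x >= thresh, levels = c(FALSE, TRUE), labels = c("low", "high"))
}
agreement <- table(GenomeDISCO = to_call(toy$GenomeDISCO, GENOMEDISCO_THRESHOLD),
                   HiCRep      = to_call(toy$HiCRep, HICREP_THRESHOLD))
print(agreement)

# Pairs on the diagonal are classified the same way by both methods.
n_agree <- sum(diag(agreement))
n_disagree <- sum(agreement) - n_agree
```

The off-diagonal cells correspond to the "concordant by one method but not the other" counts.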

Computing concordance across all 3 methods
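
As a compact illustration of the three-way classification in this section, each pair can be labeled by the set of methods that call it concordant. The three example rows use scores copied from the tables printed in this notebook; the variable names are hypothetical and this sketch is an alternative to the nested `intersect` calls, not a replacement used in the analysis.

```r
# Thresholds as defined earlier in the notebook.
thresholds <- c(GenomeDISCO = 0.8, HiCRep = 0.82, HiCSpector = 0.27)

# Three replicate pairs with scores copied from the notebook output.
rep_pairs <- data.frame(
  GenomeDISCO = c(0.8634783, 0.6440435, 0.7212174),
  HiCRep      = c(0.6996087, 0.9107826, 0.7743478),
  HiCSpector  = c(0.2200000, 0.1888261, 0.3388261),
  row.names   = c("HIC012 HIC026", "HIC070 HIC072", "HIC016 HIC042")
)

# Logical matrix: does each method call each pair concordant?
calls <- sweep(as.matrix(rep_pairs[, names(thresholds)]), 2, thresholds, ">=")

# Label each pair with the methods that call it concordant ("none" if no method does).
pattern <- apply(calls, 1, function(x) paste(names(thresholds)[x], collapse = "+"))
pattern[pattern == ""] <- "none"
print(pattern)
```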


In [89]:
#keep our set of in-situ, crosslinked bioreps
combo=filter_data(allscores,test_data)
bioreps=which(as.character(combo[,'biorep'])=='1')
dds=read.table(DISTDEP_FILE)
rownames(dds)=paste(dds[,1],dds[,2])
combo=combo[bioreps,]
rownames(combo)=paste(as.character(combo[,'m1']),as.character(combo[,'m2']))
combo=data.frame(combo,dd=dds[rownames(combo),3])

print('Concordant only by GenomeDISCO (bioreps only)')
disco_only=intersect(intersect(which(as.numeric(as.character(combo[,'GenomeDISCO']))>=GENOMEDISCO_THRESHOLD),
                           which(as.numeric(as.character(combo[,'HiCRep']))<HICREP_THRESHOLD)),
                 which(as.numeric(as.character(combo[,'HiCSpector']))<HICSPECTOR_THRESHOLD))
print(length(disco_only))
print(combo[disco_only,])

print('Concordant only by HiCRep (bioreps only)')
hicrep_only=intersect(intersect(which(as.numeric(as.character(combo[,'GenomeDISCO']))<GENOMEDISCO_THRESHOLD),
                           which(as.numeric(as.character(combo[,'HiCRep']))>=HICREP_THRESHOLD)),
                 which(as.numeric(as.character(combo[,'HiCSpector']))<HICSPECTOR_THRESHOLD))
print(length(hicrep_only))
print(combo[hicrep_only,])

print('Concordant only by HiC-Spector (bioreps only)')
hicspector_only=intersect(intersect(which(as.numeric(as.character(combo[,'GenomeDISCO']))<GENOMEDISCO_THRESHOLD),
                           which(as.numeric(as.character(combo[,'HiCRep']))<HICREP_THRESHOLD)),
                 which(as.numeric(as.character(combo[,'HiCSpector']))>=HICSPECTOR_THRESHOLD))
print(length(hicspector_only))
print(combo[hicspector_only,])

print('Non-concordant by all methods (bioreps only)')
non=intersect(intersect(which(as.numeric(as.character(combo[,'GenomeDISCO']))<GENOMEDISCO_THRESHOLD),
                           which(as.numeric(as.character(combo[,'HiCRep']))<HICREP_THRESHOLD)),
                 which(as.numeric(as.character(combo[,'HiCSpector']))<HICSPECTOR_THRESHOLD))
print(length(non))
print(combo[non,])


[1] "Concordant only by GenomeDISCO (bioreps only)"
[1] 18
                  m1     m2 GenomeDISCO    HiCRep HiCSpector  seqdepth   cell1
HIC012 HIC026 HIC012 HIC026   0.8634783 0.6996087  0.2200000 194217179 GM12878
HIC004 HIC040 HIC004 HIC040   0.8033478 0.7378696  0.2425652 125709561 GM12878
HIC006 HIC012 HIC006 HIC012   0.8428261 0.6542609  0.1919565 153771943 GM12878
HIC008 HIC012 HIC008 HIC012   0.8611304 0.6230435  0.1823913 194217179 GM12878
HIC008 HIC014 HIC008 HIC014   0.8129130 0.4343043  0.1430870 212099519 GM12878
HIC010 HIC016 HIC010 HIC016   0.8250870 0.8196957  0.2677391  55813939 GM12878
HIC004 HIC012 HIC004 HIC012   0.8491304 0.6233913  0.1948261 160649365 GM12878
HIC014 HIC026 HIC014 HIC026   0.8400435 0.5303913  0.1573913 309986657 GM12878
HIC014 HIC024 HIC014 HIC024   0.8088261 0.6827826  0.1497391 111656957 GM12878
HIC012 HIC028 HIC012 HIC028   0.8409565 0.7911739  0.1943478  74167365 GM12878
HIC012 HIC020 HIC012 HIC020   0.8450870 0.7898696  0.2330870 194217179 GM12878
HIC012 HIC022 HIC012 HIC022   0.8049565 0.7760870  0.2694348 194217179 GM12878
HIC010 HIC012 HIC010 HIC012   0.8309130 0.6771739  0.1770000  55813939 GM12878
HIC012 HIC018 HIC012 HIC018   0.8442174 0.7696087  0.2103913 137165012 GM12878
HIC014 HIC028 HIC014 HIC028   0.8017391 0.6573478  0.1496087  74167365 GM12878
HIC014 HIC020 HIC014 HIC020   0.8055652 0.6545217  0.1722609 309986657 GM12878
HIC006 HIC040 HIC006 HIC040   0.8087391 0.7645217  0.2468261 125709561 GM12878
HIC012 HIC024 HIC012 HIC024   0.8407826 0.8080870  0.1938696 111656957 GM12878
                cell2  re1   re2                  crosslinking protocol1
HIC012 HIC026 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC004 HIC040 GM12878 MboI DpnII 1% FA 10min RT 1% FA 10min RT   in situ
HIC006 HIC012 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC008 HIC012 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC008 HIC014 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC010 HIC016 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC004 HIC012 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC014 HIC026 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC014 HIC024 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC012 HIC028 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC012 HIC020 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC012 HIC022 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC010 HIC012 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC012 HIC018 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC014 HIC028 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC014 HIC020 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC006 HIC040 GM12878 MboI DpnII 1% FA 10min RT 1% FA 10min RT   in situ
HIC012 HIC024 GM12878 MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
              protocol2 biorep re same_re          dd
HIC012 HIC026   in situ      1  0       1 0.005950547
HIC004 HIC040   in situ      1  0      NA 0.006327920
HIC006 HIC012   in situ      1  0       1 0.002400700
HIC008 HIC012   in situ      1  0       1 0.003233106
HIC008 HIC014   in situ      1  0       1 0.005241182
HIC010 HIC016   in situ      1  0       1 0.001975876
HIC004 HIC012   in situ      1  0       1 0.002807323
HIC014 HIC026   in situ      1  0       1 0.008453534
HIC014 HIC024   in situ      1  0       1 0.006567174
HIC012 HIC028   in situ      1  0       1 0.003057797
HIC012 HIC020   in situ      1  0       1 0.001595254
HIC012 HIC022   in situ      1  0       1 0.001218523
HIC010 HIC012   in situ      1  0       1 0.003517041
HIC012 HIC018   in situ      1  0       1 0.001507090
HIC014 HIC028   in situ      1  0       1 0.004573193
HIC014 HIC020   in situ      1  0       1 0.003020929
HIC006 HIC040   in situ      1  0      NA 0.005464088
HIC012 HIC024   in situ      1  0       1 0.004250290
[1] "Concordant only by HiCRep (bioreps only)"
[1] 18
                  m1     m2 GenomeDISCO    HiCRep HiCSpector  seqdepth
HIC008 HIC038 HIC008 HIC038   0.7663043 0.8858261  0.2204348 101689129
HIC070 HIC074 HIC070 HIC074   0.6765217 0.9254348  0.1997826  80778733
HIC070 HIC072 HIC070 HIC072   0.6440435 0.9107826  0.1888261  79578049
HIC014 HIC040 HIC014 HIC040   0.7266087 0.8223043  0.2116957 125709561
HIC004 HIC038 HIC004 HIC038   0.7808696 0.8866522  0.2429565 101689129
HIC050 HIC052 HIC050 HIC052   0.7759130 0.9453478  0.2577826  26448232
HIC028 HIC040 HIC028 HIC040   0.7881304 0.8977826  0.2421739  74167365
HIC006 HIC038 HIC006 HIC038   0.7878261 0.8990870  0.2473478 101689129
HIC058 HIC062 HIC058 HIC062   0.6487826 0.8905652  0.2258696  19298780
HIC058 HIC060 HIC058 HIC060   0.6336522 0.8836522  0.2055652  16358597
HIC026 HIC038 HIC026 HIC038   0.7150000 0.8916522  0.2477391 101689129
HIC028 HIC038 HIC028 HIC038   0.7565217 0.8920000  0.2402609  74167365
HIC010 HIC038 HIC010 HIC038   0.7574783 0.8921304  0.2118696  55813939
HIC024 HIC038 HIC024 HIC038   0.7218696 0.8698261  0.2174783 101689129
HIC022 HIC024 HIC022 HIC024   0.7874783 0.9477826  0.2577826 111656957
HIC018 HIC038 HIC018 HIC038   0.7850435 0.9041739  0.2513043 101689129
HIC026 HIC040 HIC026 HIC040   0.7514348 0.8382609  0.2623043 125709561
HIC024 HIC040 HIC024 HIC040   0.7632609 0.9187826  0.2386522 111656957
                        cell1           cell2  re1   re2
HIC008 HIC038         GM12878         GM12878 MboI DpnII
HIC070 HIC074  K562 (CCL-243)  K562 (CCL-243) MboI  MboI
HIC070 HIC072  K562 (CCL-243)  K562 (CCL-243) MboI  MboI
HIC014 HIC040         GM12878         GM12878 MboI DpnII
HIC004 HIC038         GM12878         GM12878 MboI DpnII
HIC050 HIC052 IMR90 (CCL-186) IMR90 (CCL-186) MboI  MboI
HIC028 HIC040         GM12878         GM12878 MboI DpnII
HIC006 HIC038         GM12878         GM12878 MboI DpnII
HIC058 HIC062  HMEC (CC-2551)  HMEC (CC-2551) MboI  MboI
HIC058 HIC060  HMEC (CC-2551)  HMEC (CC-2551) MboI  MboI
HIC026 HIC038         GM12878         GM12878 MboI DpnII
HIC028 HIC038         GM12878         GM12878 MboI DpnII
HIC010 HIC038         GM12878         GM12878 MboI DpnII
HIC024 HIC038         GM12878         GM12878 MboI DpnII
HIC022 HIC024         GM12878         GM12878 MboI  MboI
HIC018 HIC038         GM12878         GM12878 MboI DpnII
HIC026 HIC040         GM12878         GM12878 MboI DpnII
HIC024 HIC040         GM12878         GM12878 MboI DpnII
                                   crosslinking protocol1 protocol2 biorep re
HIC008 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC070 HIC074 1.3% FA 10min RT 1.3% FA 10min RT   in situ   in situ      1  0
HIC070 HIC072 1.3% FA 10min RT 1.3% FA 10min RT   in situ   in situ      1  0
HIC014 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC050 HIC052     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC028 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC058 HIC062     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC058 HIC060     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC026 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC028 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC024 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC022 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC018 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC026 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC024 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
              same_re          dd
HIC008 HIC038      NA 0.007636971
HIC070 HIC074       1 0.011797264
HIC070 HIC072       1 0.014917285
HIC014 HIC040      NA 0.003192586
HIC004 HIC038      NA 0.006337539
HIC050 HIC052       1 0.001998440
HIC028 HIC040      NA 0.007696364
HIC006 HIC038      NA 0.005681801
HIC058 HIC062       1 0.005077356
HIC058 HIC060       1 0.005148024
HIC026 HIC038      NA 0.011523362
HIC028 HIC038      NA 0.007996622
HIC010 HIC038      NA 0.007693918
HIC024 HIC038      NA 0.008245069
HIC022 HIC024       1 0.004470784
HIC018 HIC038      NA 0.003870426
HIC026 HIC040      NA 0.011907886
HIC024 HIC040      NA 0.008447667
[1] "Concordant only by HiC-Spector (bioreps only)"
[1] 7
                  m1     m2 GenomeDISCO    HiCRep HiCSpector  seqdepth   cell1
HIC016 HIC048 HIC016 HIC048   0.6498261 0.4516087  0.2775652  97520628 GM12878
HIC016 HIC042 HIC016 HIC042   0.7212174 0.7743478  0.3388261 136425264 GM12878
HIC018 HIC048 HIC018 HIC048   0.7526957 0.6760000  0.3293043  97520628 GM12878
HIC028 HIC048 HIC028 HIC048   0.6993043 0.6630870  0.3039565  74167365 GM12878
HIC010 HIC048 HIC010 HIC048   0.7164348 0.7771739  0.2936957  55813939 GM12878
HIC042 HIC048 HIC042 HIC048   0.7970000 0.7605217  0.3067826  97520628 GM12878
HIC026 HIC048 HIC026 HIC048   0.6705652 0.7840000  0.3391739  97520628 GM12878
                cell2   re1   re2                  crosslinking protocol1
HIC016 HIC048 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC016 HIC042 GM12878  MboI DpnII 1% FA 10min RT 1% FA 10min RT   in situ
HIC018 HIC048 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC028 HIC048 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC010 HIC048 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC042 HIC048 GM12878 DpnII  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC026 HIC048 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
              protocol2 biorep re same_re          dd
HIC016 HIC048   in situ      1  0       1 0.009544160
HIC016 HIC042   in situ      1  0      NA 0.005554117
HIC018 HIC048   in situ      1  0       1 0.003402237
HIC028 HIC048   in situ      1  0       1 0.008303553
HIC010 HIC048   in situ      1  0       1 0.007103102
HIC042 HIC048   in situ      1  0      NA 0.002072915
HIC026 HIC048   in situ      1  0       1 0.011123525
[1] "Non-concordant by all methods (bioreps only)"
[1] 18
                  m1     m2 GenomeDISCO    HiCRep HiCSpector  seqdepth   cell1
HIC014 HIC048 HIC014 HIC048   0.6383478 0.1419130  0.1418696  97520628 GM12878
HIC014 HIC042 HIC014 HIC042   0.7087826 0.5143913  0.1671739 136425264 GM12878
HIC006 HIC014 HIC006 HIC014   0.7894348 0.4692174  0.1505652 153771943 GM12878
HIC004 HIC014 HIC004 HIC014   0.7951739 0.4336522  0.1450435 160649365 GM12878
HIC038 HIC048 HIC038 HIC048   0.7874783 0.6806087  0.2303043  97520628 GM12878
HIC040 HIC048 HIC040 HIC048   0.7511304 0.4479130  0.2350000  97520628 GM12878
HIC012 HIC038 HIC012 HIC038   0.7362609 0.7208261  0.2677826 101689129 GM12878
HIC014 HIC018 HIC014 HIC018   0.7913043 0.6231304  0.1610870 137165012 GM12878
HIC020 HIC048 HIC020 HIC048   0.7306087 0.6101304  0.2504348  97520628 GM12878
HIC010 HIC014 HIC010 HIC014   0.7821304 0.5042174  0.1363913  55813939 GM12878
HIC014 HIC022 HIC014 HIC022   0.7590000 0.6257826  0.1874783 309986657 GM12878
HIC008 HIC040 HIC008 HIC040   0.7953043 0.7587391  0.2372609 125709561 GM12878
HIC022 HIC048 HIC022 HIC048   0.7906522 0.7095217  0.2612174  97520628 GM12878
HIC014 HIC038 HIC014 HIC038   0.6859565 0.5695217  0.1931739 101689129 GM12878
HIC010 HIC040 HIC010 HIC040   0.7937826 0.8040000  0.2165217  55813939 GM12878
HIC012 HIC048 HIC012 HIC048   0.6931739 0.3393043  0.1865652  97520628 GM12878
HIC012 HIC042 HIC012 HIC042   0.7626957 0.6740000  0.2230000 136425264 GM12878
HIC024 HIC048 HIC024 HIC048   0.6776522 0.6016087  0.2698696  97520628 GM12878
                cell2   re1   re2                  crosslinking protocol1
HIC014 HIC048 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC014 HIC042 GM12878  MboI DpnII 1% FA 10min RT 1% FA 10min RT   in situ
HIC006 HIC014 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC004 HIC014 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC038 HIC048 GM12878 DpnII  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC040 HIC048 GM12878 DpnII  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC012 HIC038 GM12878  MboI DpnII 1% FA 10min RT 1% FA 10min RT   in situ
HIC014 HIC018 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC020 HIC048 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC010 HIC014 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC014 HIC022 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC008 HIC040 GM12878  MboI DpnII 1% FA 10min RT 1% FA 10min RT   in situ
HIC022 HIC048 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC014 HIC038 GM12878  MboI DpnII 1% FA 10min RT 1% FA 10min RT   in situ
HIC010 HIC040 GM12878  MboI DpnII 1% FA 10min RT 1% FA 10min RT   in situ
HIC012 HIC048 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
HIC012 HIC042 GM12878  MboI DpnII 1% FA 10min RT 1% FA 10min RT   in situ
HIC024 HIC048 GM12878  MboI  MboI 1% FA 10min RT 1% FA 10min RT   in situ
              protocol2 biorep re same_re          dd
HIC014 HIC048   in situ      1  0       1 0.005192062
HIC014 HIC042   in situ      1  0      NA 0.003511967
HIC006 HIC014   in situ      1  0       1 0.004096121
HIC004 HIC014   in situ      1  0       1 0.004701627
HIC038 HIC048   in situ      1  0      NA 0.001838250
HIC040 HIC048   in situ      1  0      NA 0.002133226
HIC012 HIC038   in situ      1  0      NA 0.003862139
HIC014 HIC018   in situ      1  0       1 0.002700233
HIC020 HIC048   in situ      1  0       1 0.003575643
HIC010 HIC014   in situ      1  0       1 0.005405250
HIC014 HIC022   in situ      1  0       1 0.002105217
HIC008 HIC040   in situ      1  0      NA 0.007594824
HIC022 HIC048   in situ      1  0       1 0.002072427
HIC014 HIC038   in situ      1  0      NA 0.004898064
HIC010 HIC040   in situ      1  0      NA 0.007458615
HIC012 HIC048   in situ      1  0       1 0.004066145
HIC012 HIC042   in situ      1  0      NA 0.002274155
HIC024 HIC048   in situ      1  0       1 0.007402548

Next, print all pairs of biological replicates, this time sorted in decreasing order of the difference between their distance dependence curves (column dd).
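The sort on dd can be illustrated on a small toy table. This is only a sketch: the data frame `toy`, its pair labels, and its values are made up for illustration and are not taken from the analysis. It also assumes that dd may be stored as a factor (for example, after being read in as text), which is a common reason to wrap a column in as.numeric(as.character(...)) before ordering.

```r
# Toy data frame with made-up values (hypothetical, not from the analysis).
# 'dd' is deliberately stored as a factor to mimic a column read in as text.
toy <- data.frame(
  pair = c("A_B", "C_D", "E_F"),
  dd   = factor(c("0.0021", "0.0149", "0.0076"))
)

# as.numeric() applied directly to a factor returns the internal level codes,
# not the numbers the labels represent.
codes <- as.numeric(toy$dd)

# Converting to character first recovers the actual numeric values.
values <- as.numeric(as.character(toy$dd))

# Sort rows by dd, largest difference first.
toy_sorted <- toy[order(values, decreasing = TRUE), ]
print(toy_sorted)
```

If dd is already numeric, the as.character() step is harmless, so the same expression works in either case.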


In [92]:
# Sort replicate pairs by dd (difference in distance dependence curves), largest first
print(combo[order(as.numeric(as.character(combo[,'dd'])),decreasing=TRUE),])


                  m1     m2 GenomeDISCO    HiCRep HiCSpector  seqdepth
HIC070 HIC072 HIC070 HIC072   0.6440435 0.9107826  0.1888261  79578049
HIC026 HIC040 HIC026 HIC040   0.7514348 0.8382609  0.2623043 125709561
HIC070 HIC074 HIC070 HIC074   0.6765217 0.9254348  0.1997826  80778733
HIC026 HIC038 HIC026 HIC038   0.7150000 0.8916522  0.2477391 101689129
HIC026 HIC048 HIC026 HIC048   0.6705652 0.7840000  0.3391739  97520628
HIC016 HIC038 HIC016 HIC038   0.7020435 0.8580000  0.3157826 101689129
HIC016 HIC048 HIC016 HIC048   0.6498261 0.4516087  0.2775652  97520628
HIC016 HIC040 HIC016 HIC040   0.7406522 0.9296957  0.3734348 125709561
HIC014 HIC026 HIC014 HIC026   0.8400435 0.5303913  0.1573913 309986657
HIC024 HIC040 HIC024 HIC040   0.7632609 0.9187826  0.2386522 111656957
HIC028 HIC048 HIC028 HIC048   0.6993043 0.6630870  0.3039565  74167365
HIC024 HIC038 HIC024 HIC038   0.7218696 0.8698261  0.2174783 101689129
HIC028 HIC038 HIC028 HIC038   0.7565217 0.8920000  0.2402609  74167365
HIC028 HIC040 HIC028 HIC040   0.7881304 0.8977826  0.2421739  74167365
HIC010 HIC038 HIC010 HIC038   0.7574783 0.8921304  0.2118696  55813939
HIC008 HIC038 HIC008 HIC038   0.7663043 0.8858261  0.2204348 101689129
HIC008 HIC040 HIC008 HIC040   0.7953043 0.7587391  0.2372609 125709561
HIC010 HIC040 HIC010 HIC040   0.7937826 0.8040000  0.2165217  55813939
HIC024 HIC048 HIC024 HIC048   0.6776522 0.6016087  0.2698696  97520628
HIC010 HIC048 HIC010 HIC048   0.7164348 0.7771739  0.2936957  55813939
HIC008 HIC048 HIC008 HIC048   0.7318696 0.8413913  0.3750435  97520628
HIC026 HIC042 HIC026 HIC042   0.7440000 0.9476522  0.3460870 136425264
HIC022 HIC026 HIC022 HIC026   0.7969565 0.9643913  0.3341739 348393345
HIC014 HIC024 HIC014 HIC024   0.8088261 0.6827826  0.1497391 111656957
HIC004 HIC038 HIC004 HIC038   0.7808696 0.8866522  0.2429565 101689129
HIC004 HIC040 HIC004 HIC040   0.8033478 0.7378696  0.2425652 125709561
HIC060 HIC062 HIC060 HIC062   0.6531304 0.8254348  0.4834348  16358597
HIC012 HIC026 HIC012 HIC026   0.8634783 0.6996087  0.2200000 194217179
HIC004 HIC048 HIC004 HIC048   0.7500435 0.8574348  0.3915652  97520628
HIC006 HIC038 HIC006 HIC038   0.7878261 0.8990870  0.2473478 101689129
HIC016 HIC042 HIC016 HIC042   0.7212174 0.7743478  0.3388261 136425264
HIC006 HIC040 HIC006 HIC040   0.8087391 0.7645217  0.2468261 125709561
HIC010 HIC014 HIC010 HIC014   0.7821304 0.5042174  0.1363913  55813939
HIC008 HIC014 HIC008 HIC014   0.8129130 0.4343043  0.1430870 212099519
HIC014 HIC048 HIC014 HIC048   0.6383478 0.1419130  0.1418696  97520628
HIC058 HIC060 HIC058 HIC060   0.6336522 0.8836522  0.2055652  16358597
HIC058 HIC062 HIC058 HIC062   0.6487826 0.8905652  0.2258696  19298780
HIC014 HIC016 HIC014 HIC016   0.8475652 0.8281304  0.1940870 186171788
HIC006 HIC048 HIC006 HIC048   0.7590870 0.8388261  0.3807826  97520628
HIC014 HIC038 HIC014 HIC038   0.6859565 0.5695217  0.1931739 101689129
HIC016 HIC022 HIC016 HIC022   0.7603478 0.8908696  0.3435217 186171788
HIC004 HIC014 HIC004 HIC014   0.7951739 0.4336522  0.1450435 160649365
HIC014 HIC028 HIC014 HIC028   0.8017391 0.6573478  0.1496087  74167365
HIC024 HIC042 HIC024 HIC042   0.7529565 0.8929130  0.3077391 111656957
HIC022 HIC024 HIC022 HIC024   0.7874783 0.9477826  0.2577826 111656957
HIC028 HIC042 HIC028 HIC042   0.7729565 0.9145652  0.3324783  74167365
HIC012 HIC024 HIC012 HIC024   0.8407826 0.8080870  0.1938696 111656957
HIC020 HIC038 HIC020 HIC038   0.7736087 0.9085217  0.2835652 101689129
HIC010 HIC042 HIC010 HIC042   0.7777391 0.9062174  0.2848696  55813939
HIC018 HIC026 HIC018 HIC026   0.8386087 0.9492609  0.3727826 137165012
HIC006 HIC014 HIC006 HIC014   0.7894348 0.4692174  0.1505652 153771943
HIC012 HIC048 HIC012 HIC048   0.6931739 0.3393043  0.1865652  97520628
HIC020 HIC040 HIC020 HIC040   0.8047826 0.9020435  0.2866522 125709561
HIC008 HIC042 HIC008 HIC042   0.8007826 0.9156522  0.3077391 136425264
HIC018 HIC038 HIC018 HIC038   0.7850435 0.9041739  0.2513043 101689129
HIC012 HIC038 HIC012 HIC038   0.7362609 0.7208261  0.2677826 101689129
HIC022 HIC028 HIC022 HIC028   0.8011304 0.9619130  0.2563478  74167365
HIC010 HIC022 HIC010 HIC022   0.8125652 0.9534783  0.2286087  55813939
HIC008 HIC022 HIC008 HIC022   0.8563043 0.9451304  0.2843478 212099519
HIC020 HIC048 HIC020 HIC048   0.7306087 0.6101304  0.2504348  97520628
HIC012 HIC016 HIC012 HIC016   0.8576957 0.9108696  0.2585217 186171788
HIC010 HIC012 HIC010 HIC012   0.8309130 0.6771739  0.1770000  55813939
HIC014 HIC042 HIC014 HIC042   0.7087826 0.5143913  0.1671739 136425264
HIC018 HIC048 HIC018 HIC048   0.7526957 0.6760000  0.3293043  97520628
HIC018 HIC040 HIC018 HIC040   0.8147391 0.8846522  0.2819130 125709561
HIC008 HIC012 HIC008 HIC012   0.8611304 0.6230435  0.1823913 194217179
HIC014 HIC040 HIC014 HIC040   0.7266087 0.8223043  0.2116957 125709561
HIC004 HIC042 HIC004 HIC042   0.8156087 0.9027826  0.3333478 136425264
HIC012 HIC028 HIC012 HIC028   0.8409565 0.7911739  0.1943478  74167365
HIC014 HIC020 HIC014 HIC020   0.8055652 0.6545217  0.1722609 309986657
HIC016 HIC018 HIC016 HIC018   0.8157826 0.8995652  0.3576087 137165012
HIC020 HIC026 HIC020 HIC026   0.8577826 0.9264783  0.3581304 356956797
HIC012 HIC040 HIC012 HIC040   0.7718696 0.8968696  0.2886522 125709561
HIC004 HIC022 HIC004 HIC022   0.8647391 0.9304783  0.2946957 160649365
HIC004 HIC012 HIC004 HIC012   0.8491304 0.6233913  0.1948261 160649365
HIC006 HIC042 HIC006 HIC042   0.8240435 0.9064348  0.3566522 136425264
HIC014 HIC018 HIC014 HIC018   0.7913043 0.6231304  0.1610870 137165012
HIC018 HIC028 HIC018 HIC028   0.8398696 0.9455652  0.3498261  74167365
HIC018 HIC024 HIC018 HIC024   0.8374348 0.9444783  0.3660435 111656957
HIC006 HIC012 HIC006 HIC012   0.8428261 0.6542609  0.1919565 153771943
HIC024 HIC028 HIC024 HIC028   0.8298261 0.9254348  0.3530000  74167365
HIC052 HIC056 HIC052 HIC056   0.7260000 0.8656522  0.2795217  26448232
HIC040 HIC042 HIC040 HIC042   0.8203913 0.8259565  0.3345652 125709561
HIC006 HIC022 HIC006 HIC022   0.8743478 0.9436957  0.3071304 153771943
HIC012 HIC042 HIC012 HIC042   0.7626957 0.6740000  0.2230000 136425264
HIC016 HIC020 HIC016 HIC020   0.8177826 0.9164783  0.3348261 186171788
HIC010 HIC018 HIC010 HIC018   0.8546522 0.9463478  0.3356087  55813939
HIC040 HIC048 HIC040 HIC048   0.7511304 0.4479130  0.2350000  97520628
HIC038 HIC040 HIC038 HIC040   0.8006087 0.8226957  0.3564783 101689129
HIC020 HIC028 HIC020 HIC028   0.8316087 0.9331304  0.2620435  74167365
HIC014 HIC022 HIC014 HIC022   0.7590000 0.6257826  0.1874783 309986657
HIC006 HIC026 HIC006 HIC026   0.8430870 0.9546087  0.4020435 153771943
HIC026 HIC028 HIC026 HIC028   0.8527391 0.9431739  0.3134783  74167365
HIC010 HIC028 HIC010 HIC028   0.8484348 0.9170435  0.4296522  55813939
HIC042 HIC048 HIC042 HIC048   0.7970000 0.7605217  0.3067826  97520628
HIC022 HIC048 HIC022 HIC048   0.7906522 0.7095217  0.2612174  97520628
HIC016 HIC024 HIC016 HIC024   0.8534783 0.9005652  0.2873478 111656957
HIC050 HIC052 HIC050 HIC052   0.7759130 0.9453478  0.2577826  26448232
HIC010 HIC026 HIC010 HIC026   0.8423478 0.9573913  0.3125217  55813939
HIC010 HIC016 HIC010 HIC016   0.8250870 0.8196957  0.2677391  55813939
HIC008 HIC018 HIC008 HIC018   0.8873043 0.9332609  0.4170435 137165012
HIC052 HIC054 HIC052 HIC054   0.7950870 0.9361304  0.2746087  26448232
HIC038 HIC042 HIC038 HIC042   0.8408261 0.8893478  0.3242609 101689129
HIC022 HIC038 HIC022 HIC038   0.8213043 0.9197391  0.3507826 101689129
HIC038 HIC048 HIC038 HIC048   0.7874783 0.6806087  0.2303043  97520628
HIC020 HIC042 HIC020 HIC042   0.8102609 0.8782609  0.3223913 136425264
HIC006 HIC016 HIC006 HIC016   0.8102174 0.7906087  0.3245652 153771943
HIC006 HIC028 HIC006 HIC028   0.8358696 0.8993913  0.3142609  74167365
HIC018 HIC042 HIC018 HIC042   0.8215652 0.8933043  0.3629565 136425264
HIC004 HIC026 HIC004 HIC026   0.8542174 0.9488696  0.4043043 160649365
HIC022 HIC040 HIC022 HIC040   0.8298261 0.8935217  0.3270435 125709561
HIC004 HIC028 HIC004 HIC028   0.8420000 0.8845217  0.3290435  74167365
HIC010 HIC024 HIC010 HIC024   0.8405217 0.9015652  0.3551739  55813939
HIC012 HIC020 HIC012 HIC020   0.8450870 0.7898696  0.2330870 194217179
HIC020 HIC024 HIC020 HIC024   0.8590435 0.9602174  0.2839130 111656957
HIC004 HIC016 HIC004 HIC016   0.8209130 0.7616522  0.3109130 160649365
HIC010 HIC020 HIC010 HIC020   0.8481739 0.9236957  0.2485217  55813939
HIC016 HIC026 HIC016 HIC026   0.8816957 0.8243043  0.3343043 186171788
HIC012 HIC018 HIC012 HIC018   0.8442174 0.7696087  0.2103913 137165012
HIC016 HIC028 HIC016 HIC028   0.8406957 0.8845652  0.2876522  74167365
HIC004 HIC018 HIC004 HIC018   0.8914348 0.9160435  0.4420000 137165012
HIC008 HIC028 HIC008 HIC028   0.8547391 0.9030000  0.3247826  74167365
HIC020 HIC022 HIC020 HIC022   0.8687391 0.9661739  0.3834783 348393345
HIC054 HIC056 HIC054 HIC056   0.8299565 0.9448261  0.3357826 240438944
HIC024 HIC026 HIC024 HIC026   0.8728261 0.9326957  0.2946522 111656957
HIC072 HIC074 HIC072 HIC074   0.8872174 0.9751739  0.5302609  79578049
HIC006 HIC010 HIC006 HIC010   0.8582609 0.9575652  0.3212174  55813939
HIC006 HIC024 HIC006 HIC024   0.8358261 0.8772609  0.3389130 111656957
HIC004 HIC010 HIC004 HIC010   0.8634348 0.9510870  0.3391304  55813939
HIC018 HIC022 HIC018 HIC022   0.8637826 0.9761739  0.3343478 137165012
HIC012 HIC022 HIC012 HIC022   0.8049565 0.7760870  0.2694348 194217179
HIC008 HIC016 HIC008 HIC016   0.8408261 0.7718261  0.2933913 186171788
HIC008 HIC020 HIC008 HIC020   0.8931304 0.9026522  0.2975217 212099519
HIC006 HIC018 HIC006 HIC018   0.8930435 0.9316087  0.4300000 137165012
HIC008 HIC010 HIC008 HIC010   0.8710435 0.9624783  0.3725217  55813939
HIC004 HIC024 HIC004 HIC024   0.8436522 0.8599565  0.3260870 111656957
HIC008 HIC026 HIC008 HIC026   0.8811739 0.9655217  0.4132609 212099519
HIC008 HIC024 HIC008 HIC024   0.8577391 0.8806522  0.3191304 111656957
HIC012 HIC014 HIC012 HIC014   0.8700435 0.9608696  0.3231304 194217179
HIC022 HIC042 HIC022 HIC042   0.8612609 0.9287826  0.3388261 136425264
HIC004 HIC020 HIC004 HIC020   0.8871739 0.8848696  0.3000870 160649365
HIC050 HIC054 HIC050 HIC054   0.8960000 0.9825652  0.4390435 240438944
HIC006 HIC008 HIC006 HIC008   0.9031304 0.9776522  0.5305652 153771943
HIC076 HIC078 HIC076 HIC078   0.9090870 0.9821304  0.4520435 294486683
HIC018 HIC020 HIC018 HIC020   0.8797826 0.9685652  0.3252174 137165012
HIC004 HIC006 HIC004 HIC006   0.9126087 0.9792174  0.6677826 153771943
HIC004 HIC008 HIC004 HIC008   0.9094783 0.9770870  0.5626522 160649365
HIC006 HIC020 HIC006 HIC020   0.8880435 0.9033043  0.3170000 153771943
HIC050 HIC056 HIC050 HIC056   0.8573913 0.9436957  0.3318696 252725458
                        cell1           cell2   re1   re2
HIC070 HIC072  K562 (CCL-243)  K562 (CCL-243)  MboI  MboI
HIC026 HIC040         GM12878         GM12878  MboI DpnII
HIC070 HIC074  K562 (CCL-243)  K562 (CCL-243)  MboI  MboI
HIC026 HIC038         GM12878         GM12878  MboI DpnII
HIC026 HIC048         GM12878         GM12878  MboI  MboI
HIC016 HIC038         GM12878         GM12878  MboI DpnII
HIC016 HIC048         GM12878         GM12878  MboI  MboI
HIC016 HIC040         GM12878         GM12878  MboI DpnII
HIC014 HIC026         GM12878         GM12878  MboI  MboI
HIC024 HIC040         GM12878         GM12878  MboI DpnII
HIC028 HIC048         GM12878         GM12878  MboI  MboI
HIC024 HIC038         GM12878         GM12878  MboI DpnII
HIC028 HIC038         GM12878         GM12878  MboI DpnII
HIC028 HIC040         GM12878         GM12878  MboI DpnII
HIC010 HIC038         GM12878         GM12878  MboI DpnII
HIC008 HIC038         GM12878         GM12878  MboI DpnII
HIC008 HIC040         GM12878         GM12878  MboI DpnII
HIC010 HIC040         GM12878         GM12878  MboI DpnII
HIC024 HIC048         GM12878         GM12878  MboI  MboI
HIC010 HIC048         GM12878         GM12878  MboI  MboI
HIC008 HIC048         GM12878         GM12878  MboI  MboI
HIC026 HIC042         GM12878         GM12878  MboI DpnII
HIC022 HIC026         GM12878         GM12878  MboI  MboI
HIC014 HIC024         GM12878         GM12878  MboI  MboI
HIC004 HIC038         GM12878         GM12878  MboI DpnII
HIC004 HIC040         GM12878         GM12878  MboI DpnII
HIC060 HIC062  HMEC (CC-2551)  HMEC (CC-2551)  MboI  MboI
HIC012 HIC026         GM12878         GM12878  MboI  MboI
HIC004 HIC048         GM12878         GM12878  MboI  MboI
HIC006 HIC038         GM12878         GM12878  MboI DpnII
HIC016 HIC042         GM12878         GM12878  MboI DpnII
HIC006 HIC040         GM12878         GM12878  MboI DpnII
HIC010 HIC014         GM12878         GM12878  MboI  MboI
HIC008 HIC014         GM12878         GM12878  MboI  MboI
HIC014 HIC048         GM12878         GM12878  MboI  MboI
HIC058 HIC060  HMEC (CC-2551)  HMEC (CC-2551)  MboI  MboI
HIC058 HIC062  HMEC (CC-2551)  HMEC (CC-2551)  MboI  MboI
HIC014 HIC016         GM12878         GM12878  MboI  MboI
HIC006 HIC048         GM12878         GM12878  MboI  MboI
HIC014 HIC038         GM12878         GM12878  MboI DpnII
HIC016 HIC022         GM12878         GM12878  MboI  MboI
HIC004 HIC014         GM12878         GM12878  MboI  MboI
HIC014 HIC028         GM12878         GM12878  MboI  MboI
HIC024 HIC042         GM12878         GM12878  MboI DpnII
HIC022 HIC024         GM12878         GM12878  MboI  MboI
HIC028 HIC042         GM12878         GM12878  MboI DpnII
HIC012 HIC024         GM12878         GM12878  MboI  MboI
HIC020 HIC038         GM12878         GM12878  MboI DpnII
HIC010 HIC042         GM12878         GM12878  MboI DpnII
HIC018 HIC026         GM12878         GM12878  MboI  MboI
HIC006 HIC014         GM12878         GM12878  MboI  MboI
HIC012 HIC048         GM12878         GM12878  MboI  MboI
HIC020 HIC040         GM12878         GM12878  MboI DpnII
HIC008 HIC042         GM12878         GM12878  MboI DpnII
HIC018 HIC038         GM12878         GM12878  MboI DpnII
HIC012 HIC038         GM12878         GM12878  MboI DpnII
HIC022 HIC028         GM12878         GM12878  MboI  MboI
HIC010 HIC022         GM12878         GM12878  MboI  MboI
HIC008 HIC022         GM12878         GM12878  MboI  MboI
HIC020 HIC048         GM12878         GM12878  MboI  MboI
HIC012 HIC016         GM12878         GM12878  MboI  MboI
HIC010 HIC012         GM12878         GM12878  MboI  MboI
HIC014 HIC042         GM12878         GM12878  MboI DpnII
HIC018 HIC048         GM12878         GM12878  MboI  MboI
HIC018 HIC040         GM12878         GM12878  MboI DpnII
HIC008 HIC012         GM12878         GM12878  MboI  MboI
HIC014 HIC040         GM12878         GM12878  MboI DpnII
HIC004 HIC042         GM12878         GM12878  MboI DpnII
HIC012 HIC028         GM12878         GM12878  MboI  MboI
HIC014 HIC020         GM12878         GM12878  MboI  MboI
HIC016 HIC018         GM12878         GM12878  MboI  MboI
HIC020 HIC026         GM12878         GM12878  MboI  MboI
HIC012 HIC040         GM12878         GM12878  MboI DpnII
HIC004 HIC022         GM12878         GM12878  MboI  MboI
HIC004 HIC012         GM12878         GM12878  MboI  MboI
HIC006 HIC042         GM12878         GM12878  MboI DpnII
HIC014 HIC018         GM12878         GM12878  MboI  MboI
HIC018 HIC028         GM12878         GM12878  MboI  MboI
HIC018 HIC024         GM12878         GM12878  MboI  MboI
HIC006 HIC012         GM12878         GM12878  MboI  MboI
HIC024 HIC028         GM12878         GM12878  MboI  MboI
HIC052 HIC056 IMR90 (CCL-186) IMR90 (CCL-186)  MboI  MboI
HIC040 HIC042         GM12878         GM12878 DpnII DpnII
HIC006 HIC022         GM12878         GM12878  MboI  MboI
HIC012 HIC042         GM12878         GM12878  MboI DpnII
HIC016 HIC020         GM12878         GM12878  MboI  MboI
HIC010 HIC018         GM12878         GM12878  MboI  MboI
HIC040 HIC048         GM12878         GM12878 DpnII  MboI
HIC038 HIC040         GM12878         GM12878 DpnII DpnII
HIC020 HIC028         GM12878         GM12878  MboI  MboI
HIC014 HIC022         GM12878         GM12878  MboI  MboI
HIC006 HIC026         GM12878         GM12878  MboI  MboI
HIC026 HIC028         GM12878         GM12878  MboI  MboI
HIC010 HIC028         GM12878         GM12878  MboI  MboI
HIC042 HIC048         GM12878         GM12878 DpnII  MboI
HIC022 HIC048         GM12878         GM12878  MboI  MboI
HIC016 HIC024         GM12878         GM12878  MboI  MboI
HIC050 HIC052 IMR90 (CCL-186) IMR90 (CCL-186)  MboI  MboI
HIC010 HIC026         GM12878         GM12878  MboI  MboI
HIC010 HIC016         GM12878         GM12878  MboI  MboI
HIC008 HIC018         GM12878         GM12878  MboI  MboI
HIC052 HIC054 IMR90 (CCL-186) IMR90 (CCL-186)  MboI  MboI
HIC038 HIC042         GM12878         GM12878 DpnII DpnII
HIC022 HIC038         GM12878         GM12878  MboI DpnII
HIC038 HIC048         GM12878         GM12878 DpnII  MboI
HIC020 HIC042         GM12878         GM12878  MboI DpnII
HIC006 HIC016         GM12878         GM12878  MboI  MboI
HIC006 HIC028         GM12878         GM12878  MboI  MboI
HIC018 HIC042         GM12878         GM12878  MboI DpnII
HIC004 HIC026         GM12878         GM12878  MboI  MboI
HIC022 HIC040         GM12878         GM12878  MboI DpnII
HIC004 HIC028         GM12878         GM12878  MboI  MboI
HIC010 HIC024         GM12878         GM12878  MboI  MboI
HIC012 HIC020         GM12878         GM12878  MboI  MboI
HIC020 HIC024         GM12878         GM12878  MboI  MboI
HIC004 HIC016         GM12878         GM12878  MboI  MboI
HIC010 HIC020         GM12878         GM12878  MboI  MboI
HIC016 HIC026         GM12878         GM12878  MboI  MboI
HIC012 HIC018         GM12878         GM12878  MboI  MboI
HIC016 HIC028         GM12878         GM12878  MboI  MboI
HIC004 HIC018         GM12878         GM12878  MboI  MboI
HIC008 HIC028         GM12878         GM12878  MboI  MboI
HIC020 HIC022         GM12878         GM12878  MboI  MboI
HIC054 HIC056 IMR90 (CCL-186) IMR90 (CCL-186)  MboI  MboI
HIC024 HIC026         GM12878         GM12878  MboI  MboI
HIC072 HIC074  K562 (CCL-243)  K562 (CCL-243)  MboI  MboI
HIC006 HIC010         GM12878         GM12878  MboI  MboI
HIC006 HIC024         GM12878         GM12878  MboI  MboI
HIC004 HIC010         GM12878         GM12878  MboI  MboI
HIC018 HIC022         GM12878         GM12878  MboI  MboI
HIC012 HIC022         GM12878         GM12878  MboI  MboI
HIC008 HIC016         GM12878         GM12878  MboI  MboI
HIC008 HIC020         GM12878         GM12878  MboI  MboI
HIC006 HIC018         GM12878         GM12878  MboI  MboI
HIC008 HIC010         GM12878         GM12878  MboI  MboI
HIC004 HIC024         GM12878         GM12878  MboI  MboI
HIC008 HIC026         GM12878         GM12878  MboI  MboI
HIC008 HIC024         GM12878         GM12878  MboI  MboI
HIC012 HIC014         GM12878         GM12878  MboI  MboI
HIC022 HIC042         GM12878         GM12878  MboI DpnII
HIC004 HIC020         GM12878         GM12878  MboI  MboI
HIC050 HIC054 IMR90 (CCL-186) IMR90 (CCL-186)  MboI  MboI
HIC006 HIC008         GM12878         GM12878  MboI  MboI
HIC076 HIC078            KBM7            KBM7  MboI  MboI
HIC018 HIC020         GM12878         GM12878  MboI  MboI
HIC004 HIC006         GM12878         GM12878  MboI  MboI
HIC004 HIC008         GM12878         GM12878  MboI  MboI
HIC006 HIC020         GM12878         GM12878  MboI  MboI
HIC050 HIC056 IMR90 (CCL-186) IMR90 (CCL-186)  MboI  MboI
                                   crosslinking protocol1 protocol2 biorep re
HIC070 HIC072 1.3% FA 10min RT 1.3% FA 10min RT   in situ   in situ      1  0
HIC026 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC070 HIC074 1.3% FA 10min RT 1.3% FA 10min RT   in situ   in situ      1  0
HIC026 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC026 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC016 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC016 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC016 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC024 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC028 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC024 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC028 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC028 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC024 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC026 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC022 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC060 HIC062     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC016 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC014     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC014     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC058 HIC060     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC058 HIC062     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC016     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC016 HIC022     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC014     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC024 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC022 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC028 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC020 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC018 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC014     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC020 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC018 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC022 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC022     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC022     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC020 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC016     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC012     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC018 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC018 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC012     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC020     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC016 HIC018     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC020 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC022     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC012     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC018     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC018 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC018 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC012     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC024 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC052 HIC056     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC040 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC022     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC016 HIC020     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC018     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC040 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC038 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC020 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC014 HIC022     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC026 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC042 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC022 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC016 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC050 HIC052     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC016     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC018     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC052 HIC054     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC038 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC022 HIC038     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC038 HIC048     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC020 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC016     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC018 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC022 HIC040     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC020     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC020 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC016     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC010 HIC020     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC016 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC018     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC016 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC018     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC028     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC020 HIC022     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC054 HIC056     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC024 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC072 HIC074 1.3% FA 10min RT 1.3% FA 10min RT   in situ   in situ      1  0
HIC006 HIC010     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC010     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC018 HIC022     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC022     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC016     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC020     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC018     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC010     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC026     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC008 HIC024     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC012 HIC014     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC022 HIC042     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC020     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC050 HIC054     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC008     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC076 HIC078     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC018 HIC020     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC006     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC004 HIC008     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC006 HIC020     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
HIC050 HIC056     1% FA 10min RT 1% FA 10min RT   in situ   in situ      1  0
              same_re           dd
HIC070 HIC072       1 0.0149172847
HIC026 HIC040      NA 0.0119078859
HIC070 HIC074       1 0.0117972640
HIC026 HIC038      NA 0.0115233623
HIC026 HIC048       1 0.0111235254
HIC016 HIC038      NA 0.0096533617
HIC016 HIC048       1 0.0095441603
HIC016 HIC040      NA 0.0090629068
HIC014 HIC026       1 0.0084535343
HIC024 HIC040      NA 0.0084476673
HIC028 HIC048       1 0.0083035534
HIC024 HIC038      NA 0.0082450693
HIC028 HIC038      NA 0.0079966220
HIC028 HIC040      NA 0.0076963636
HIC010 HIC038      NA 0.0076939175
HIC008 HIC038      NA 0.0076369712
HIC008 HIC040      NA 0.0075948237
HIC010 HIC040      NA 0.0074586155
HIC024 HIC048       1 0.0074025481
HIC010 HIC048       1 0.0071031017
HIC008 HIC048       1 0.0070798761
HIC026 HIC042      NA 0.0068226510
HIC022 HIC026       1 0.0066698323
HIC014 HIC024       1 0.0065671745
HIC004 HIC038      NA 0.0063375389
HIC004 HIC040      NA 0.0063279199
HIC060 HIC062       1 0.0061102540
HIC012 HIC026       1 0.0059505473
HIC004 HIC048       1 0.0057864486
HIC006 HIC038      NA 0.0056818015
HIC016 HIC042      NA 0.0055541168
HIC006 HIC040      NA 0.0054640880
HIC010 HIC014       1 0.0054052504
HIC008 HIC014       1 0.0052411817
HIC014 HIC048       1 0.0051920623
HIC058 HIC060       1 0.0051480237
HIC058 HIC062       1 0.0050773561
HIC014 HIC016       1 0.0050396602
HIC006 HIC048       1 0.0049990541
HIC014 HIC038      NA 0.0048980637
HIC016 HIC022       1 0.0048301961
HIC004 HIC014       1 0.0047016268
HIC014 HIC028       1 0.0045731933
HIC024 HIC042      NA 0.0045721641
HIC022 HIC024       1 0.0044707837
HIC028 HIC042      NA 0.0043731068
HIC012 HIC024       1 0.0042502903
HIC020 HIC038      NA 0.0041628476
HIC010 HIC042      NA 0.0041279650
HIC018 HIC026       1 0.0041013874
HIC006 HIC014       1 0.0040961212
HIC012 HIC048       1 0.0040661447
HIC020 HIC040      NA 0.0040134732
HIC008 HIC042      NA 0.0039900295
HIC018 HIC038      NA 0.0038704261
HIC012 HIC038      NA 0.0038621387
HIC022 HIC028       1 0.0038468477
HIC010 HIC022       1 0.0037062553
HIC008 HIC022       1 0.0036007652
HIC020 HIC048       1 0.0035756428
HIC012 HIC016       1 0.0035457596
HIC010 HIC012       1 0.0035170414
HIC014 HIC042      NA 0.0035119674
HIC018 HIC048       1 0.0034022371
HIC018 HIC040      NA 0.0033536003
HIC008 HIC012       1 0.0032331059
HIC014 HIC040      NA 0.0031925858
HIC004 HIC042      NA 0.0031115856
HIC012 HIC028       1 0.0030577972
HIC014 HIC020       1 0.0030209295
HIC016 HIC018       1 0.0029627963
HIC020 HIC026       1 0.0029454016
HIC012 HIC040      NA 0.0029432516
HIC004 HIC022       1 0.0028283720
HIC004 HIC012       1 0.0028073229
HIC006 HIC042      NA 0.0027297920
HIC014 HIC018       1 0.0027002325
HIC018 HIC028       1 0.0025438963
HIC018 HIC024       1 0.0024828908
HIC006 HIC012       1 0.0024007004
HIC024 HIC028       1 0.0023957325
HIC052 HIC056       1 0.0023901737
HIC040 HIC042       1 0.0023280735
HIC006 HIC022       1 0.0023182994
HIC012 HIC042      NA 0.0022741554
HIC016 HIC020       1 0.0022698335
HIC010 HIC018       1 0.0022175514
HIC040 HIC048      NA 0.0021332260
HIC038 HIC040       1 0.0021228573
HIC020 HIC028       1 0.0021108602
HIC014 HIC022       1 0.0021052174
HIC006 HIC026       1 0.0020920208
HIC026 HIC028       1 0.0020869586
HIC010 HIC028       1 0.0020802542
HIC042 HIC048      NA 0.0020729152
HIC022 HIC048       1 0.0020724273
HIC016 HIC024       1 0.0020340802
HIC050 HIC052       1 0.0019984402
HIC010 HIC026       1 0.0019936527
HIC010 HIC016       1 0.0019758757
HIC008 HIC018       1 0.0019080923
HIC052 HIC054       1 0.0019064739
HIC038 HIC042       1 0.0018755837
HIC022 HIC038      NA 0.0018481898
HIC038 HIC048      NA 0.0018382495
HIC020 HIC042      NA 0.0017969711
HIC006 HIC016       1 0.0017618971
HIC006 HIC028       1 0.0017488454
HIC018 HIC042      NA 0.0017444291
HIC004 HIC026       1 0.0016414713
HIC022 HIC040      NA 0.0016141167
HIC004 HIC028       1 0.0016044464
HIC010 HIC024       1 0.0015985607
HIC012 HIC020       1 0.0015952542
HIC020 HIC024       1 0.0015854962
HIC004 HIC016       1 0.0015795077
HIC010 HIC020       1 0.0015663446
HIC016 HIC026       1 0.0015180755
HIC012 HIC018       1 0.0015070898
HIC016 HIC028       1 0.0015034674
HIC004 HIC018       1 0.0014601394
HIC008 HIC028       1 0.0014176511
HIC020 HIC022       1 0.0013564212
HIC054 HIC056       1 0.0013525353
HIC024 HIC026       1 0.0013101566
HIC072 HIC074       1 0.0012709316
HIC006 HIC010       1 0.0012651356
HIC006 HIC024       1 0.0012602748
HIC004 HIC010       1 0.0012596346
HIC018 HIC022       1 0.0012331952
HIC012 HIC022       1 0.0012185229
HIC008 HIC016       1 0.0012175154
HIC008 HIC020       1 0.0011986759
HIC006 HIC018       1 0.0011902501
HIC008 HIC010       1 0.0011464670
HIC004 HIC024       1 0.0011305040
HIC008 HIC026       1 0.0011039885
HIC008 HIC024       1 0.0009997339
HIC012 HIC014       1 0.0009616527
HIC022 HIC042      NA 0.0008728769
HIC004 HIC020       1 0.0008654350
HIC050 HIC054       1 0.0008475885
HIC006 HIC008       1 0.0007900051
HIC076 HIC078       1 0.0007752184
HIC018 HIC020       1 0.0007589931
HIC004 HIC006       1 0.0006999928
HIC004 HIC008       1 0.0006861169
HIC006 HIC020       1 0.0006832426
HIC050 HIC056       1 0.0006797000
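
The text describes a difference in distance dependence above 0.005 as the value that separates high-concordance from low-concordance replicate pairs. As a minimal sketch, the snippet below flags pairs above that cutoff, using a few rows copied from the table above. The names `dd_example` and `DD_CUTOFF` are illustrative and do not appear in the original notebook.

```r
# A handful of rows copied from the printed table (sample pair and dd value).
dd_example = data.frame(
  m1 = c("HIC070", "HIC070", "HIC060", "HIC072", "HIC050"),
  m2 = c("HIC072", "HIC074", "HIC062", "HIC074", "HIC056"),
  dd = c(0.0149172847, 0.0117972640, 0.0061102540, 0.0012709316, 0.0006797000),
  stringsAsFactors = FALSE
)

# Cutoff quoted in the text; the variable name is hypothetical.
DD_CUTOFF = 0.005

# TRUE for pairs whose distance dependence curves differ by more than the cutoff.
dd_example$large_dd = dd_example$dd > DD_CUTOFF
flagged = paste(dd_example$m1, dd_example$m2)[dd_example$large_dd]

print(dd_example)
print(flagged)
```

Pairs flagged this way are the ones where a low GenomeDISCO score may reflect a difference in distance dependence rather than some other difference between the replicates.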

Plot scores vs difference in distance dependence


In [100]:
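plot_single_score_vs_covariates=function(scores,ddfile,sname,metadata,thresh,out,mini){
    #scores: data frame with columns m1, m2 and the score of one method
    #ddfile: table of differences in distance dependence (sample names in columns 1-2, value in column 3)
    #sname: y-axis label; thresh: concordance threshold drawn as a horizontal line
    #out: prefix of the output pdf; mini: lower limit of the y axis

    #======= settings for nice-looking ggplots
    niceggplot=theme(panel.grid.major = element_blank(),panel.grid.minor = element_blank(), axis.line = element_line(colour = "black"))

    niceggplot_fullborders=theme(axis.text=element_text(size=20),
                             axis.title=element_text(size=20))
    #==========================================
    
    colnames(scores)[3]='score'
    scores=filter_data(scores,metadata)
    rownames(scores)=paste(as.character(scores[,'m1']),as.character(scores[,'m2']))
    
    dds=read.table(ddfile)
    rownames(dds)=paste(dds[,1],dds[,2])
    
    #pairs are looked up by "m1 m2" name; a pair missing from dds gets dd=NA
    common=rownames(scores)
    
    combined=data.frame(score=scores[common,'score'],
                        dd=dds[common,3],
                        biorep=scores[common,'biorep'],
                        seqdepth=scores[common,'seqdepth'])
    
    #sort so that biological replicates are drawn last, on top of the other points
    combined=combined[order(as.numeric(as.character(combined$biorep)),decreasing=FALSE),]
    pdf(paste(out,'.vs.dd.pdf',sep=''),width=4,height=3)
    #points outside the axis limits below are dropped by ggplot2 with a warning
    p=ggplot(combined,aes(x=dd,y=score,color=biorep))+geom_point(size=0.5)+xlab('Difference in distance dependence')+
         ylab(sname)+
         theme_bw()+scale_colour_gradient(low = "gray", high = "blue")+
         geom_hline(yintercept = thresh)+ylim(mini,1)+xlim(0,0.02)+niceggplot
    print(p)
    dev.off()
    #print again so that the plot also shows up in the notebook
    print(p)   
}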

plot_single_score_vs_covariates(allscores[,c('m1','m2','GenomeDISCO')],
               DISTDEP_FILE,
                'GenomeDISCO score',
                test_data,GENOMEDISCO_THRESHOLD,
               paste(PLOTS_PATH,'/GenomeDISCO',sep=''),GENOMEDISCO_MIN)

plot_single_score_vs_covariates(allscores[,c('m1','m2','HiCRep')],
               DISTDEP_FILE,
                'HiCRep score',
                test_data,HICREP_THRESHOLD,
               paste(PLOTS_PATH,'/HiCRep',sep=''),HICREP_MIN)

plot_single_score_vs_covariates(allscores[,c('m1','m2','HiCSpector')],
               DISTDEP_FILE,
                'HiC-Spector score',
                test_data,HICSPECTOR_THRESHOLD,
               paste(PLOTS_PATH,'/HiC-Spector',sep=''),0)


Warning message:
“Removed 10 rows containing missing values (geom_point).”
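
ggplot2 emits this warning when points have missing values or fall outside the limits set with `xlim()` and `ylim()`, which in the function above are 0 to 0.02 on the x axis and `mini` to 1 on the y axis. The sketch below counts such rows with base R before plotting. The function name `count_dropped_points` and the toy values are assumptions made for illustration.

```r
# Count rows that a scatter plot with fixed axis limits would not draw:
# rows with NA in either coordinate, or with a coordinate outside its limits.
count_dropped_points = function(df, xcol, ycol, xlim, ylim) {
  x = df[[xcol]]
  y = df[[ycol]]
  outside = is.na(x) | is.na(y) |
            x < xlim[1] | x > xlim[2] |
            y < ylim[1] | y > ylim[2]
  sum(outside)
}

# Toy data: one point in range, one with dd too large,
# one with missing dd, one with a score below the lower y limit.
toy = data.frame(dd = c(0.001, 0.03, NA, 0.004),
                 score = c(0.9, 0.8, 0.7, 0.2))

n_dropped = count_dropped_points(toy, "dd", "score",
                                 xlim = c(0, 0.02), ylim = c(0.5, 1))
print(n_dropped)
```

Running a check like this on `combined` would show whether the removed rows come from pairs missing in the distance dependence file or from values beyond the plotted range.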

Supplementary Figure about clustering of Rao samples


In [110]:
require(mclust,lib.loc='~/R/x86_64-pc-linux-gnu-library/3.4')
require(cba,lib.loc='~/R/x86_64-pc-linux-gnu-library/3.4')
require(ggdendro,lib.loc='~/R/x86_64-pc-linux-gnu-library/3.4')


Loading required package: cba

In [109]:
optimal_ordering=function(m,meth){
  if (dim(m)[1]<=2){
  return (m)
  }
  require(cba)
  d=dist(as.matrix(m),method=meth)
  hc=hclust(d)
  co=order.optimal(d, hc$merge)
  m.optimalRows=as.matrix(m)[co$order,]
  return(m.optimalRows)
}

get_score_heatmap=function(scores_orig,metadata,title,out){
    colnames(scores_orig)[3]='score'
    SCORECOL=3
    scores=filter_data(scores_orig,metadata)
    rownames(scores)=paste(as.character(scores[,'m1']),as.character(scores[,'m2']))
    
    #mapping from sample to cell type
    sample2cell=data.frame(sample=c(as.character(scores[,'m1']),as.character(scores[,'m2'])),
                                    cell=c(as.character(scores[,'cell1']),as.character(scores[,'cell2'])))
    dups=which(duplicated(sample2cell))
    if (length(dups)>0){
        sample2cell=sample2cell[-dups,]
    }
    rownames(sample2cell)=paste(as.character(sample2cell[,'cell']),as.character(sample2cell[,'sample']))
    
    #put the scores in a nice heatmap
    scores_short=scores
    scores_short[,1]=paste(as.character(scores_short[,'cell1']),as.character(scores_short[,'m1']))
    scores_short[,2]=paste(as.character(scores_short[,'cell2']),as.character(scores_short[,'m2']))
    samples=unique(c(scores_short[,1],scores_short[,2]))
    samples=samples[order(samples)]
    m=as.matrix(array(1,dim=c(length(samples),length(samples))))
    rownames(m)=colnames(m)=samples
    for (i in c(1:(dim(scores_short)[1]))){
        n1=as.character(scores_short[i,1])
        n2=as.character(scores_short[i,2])
        v=scores_short[i,SCORECOL]
        m[n1,n2]=v
        m[n2,n1]=v
    } 
    
    #plot heatmap
    pdf(paste(out,'.',title,'.heatmap.pdf',sep=''))
    print(pheatmap(m,display_numbers = TRUE, number_format = "%.2f",
            cellwidth=10,cellheight=10,fontsize=4))
    dev.off()
    print(pheatmap(m,display_numbers = TRUE, number_format = "%.2f",
            cellwidth=10,cellheight=10,fontsize=4))
    
    #plot dendrogram
    hc = hclust(dist(m))
    pdf(paste(out,'.',title,'.dendrogram.pdf',sep=''))
    p=ggdendrogram(hc, rotate = TRUE, size = 4)+ggtitle(title)
    print(p)
    dev.off()
    print(p)
    
    #compute arand scores
    ks=c(1:length(samples))
    arands=c()
    #mis=c()
    for (k in ks){
        thecut=cutree(hc,k=k)
        arands=c(arands,adjustedRandIndex(thecut[rownames(sample2cell)],sample2cell[,'cell']))
     } 
    return(arands)
}

gdisco_arand=get_score_heatmap(allscores[,c('m1','m2','GenomeDISCO')],test_data,'GenomeDISCO',
                               paste(PLOTS_PATH,'/analysis',sep=''))

hicrep_arand=get_score_heatmap(allscores[,c('m1','m2','HiCRep')],test_data,'HiCRep',
                               paste(PLOTS_PATH,'/analysis',sep=''))

spector_arand=get_score_heatmap(allscores[,c('m1','m2','HiCSpector')],test_data,'HiC-Spector',
                               paste(PLOTS_PATH,'/analysis',sep=''))


#one row per (method, k); each method contributes its own vector of scores
arands=data.frame(k=c(seq_along(gdisco_arand),seq_along(hicrep_arand),seq_along(spector_arand)),
                  ARAND=c(gdisco_arand,hicrep_arand,spector_arand),
                  method=c(rep('GenomeDISCO',times=length(gdisco_arand)),
                           rep('HiCRep',times=length(hicrep_arand)),
                           rep('HiC-Spector',times=length(spector_arand))))

pdf(paste(PLOTS_PATH,'/arand.pdf',sep=''),width=7,height=5)
p=ggplot(arands,aes(x=k,y=ARAND,col=method))+geom_line()+geom_point()+theme_bw()+niceggplot+
     xlab('\nNumber of clusters')+ylab('Adjusted Rand index\n')
print(p)
dev.off()
print(p)


[Output: six printed pheatmap result objects, two per method (GenomeDISCO, HiCRep, HiC-Spector). Each reports row and column trees from complete-linkage hierarchical clustering on Euclidean distance over 31 samples, $kmeans set to NA, and a 5 x 6 gtable layout with 6 grobs (col_tree, row_tree, matrix, col_names, row_names, legend).]

png: 2
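
The loop at the end of `get_score_heatmap` cuts the dendrogram at every possible number of clusters and compares the resulting labels with the cell types. The toy version below illustrates that step under stated assumptions: the score matrix is made up, and `ari_sketch` is a hypothetical base-R stand-in for `adjustedRandIndex` from mclust, written out so the example has no package dependencies.

```r
# Adjusted Rand index from the contingency table of two labelings.
# Hypothetical helper; the notebook itself uses mclust::adjustedRandIndex.
ari_sketch = function(x, y) {
  tab = unclass(table(x, y))
  n = sum(tab)
  sum_ij = sum(choose(tab, 2))            # pairs together in both labelings
  sum_a = sum(choose(rowSums(tab), 2))    # pairs together in x
  sum_b = sum(choose(colSums(tab), 2))    # pairs together in y
  expected = sum_a * sum_b / choose(n, 2)
  max_index = (sum_a + sum_b) / 2
  if (max_index == expected) return(1)    # both labelings trivial and identical
  (sum_ij - expected) / (max_index - expected)
}

# Made-up concordance scores: high within a cell type, low between cell types.
samples = c("A 1", "A 2", "A 3", "B 1", "B 2", "B 3")
cell = c("A", "A", "A", "B", "B", "B")
m = matrix(0.5, nrow = 6, ncol = 6, dimnames = list(samples, samples))
m[1:3, 1:3] = 0.9
m[4:6, 4:6] = 0.9
diag(m) = 1

# Same recipe as in the notebook: cluster rows of the score matrix, cut at each k.
hc = hclust(dist(m))
arands_toy = sapply(seq_along(samples),
                    function(k) ari_sketch(cutree(hc, k = k), cell))
print(round(arands_toy, 3))
```

In this toy case the index is 0 for a single cluster, reaches 1 when the cut yields exactly the two cell types, and returns to 0 when every sample is its own cluster.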

Supplementary Figure about differences in protocols etc.

  • different experiment protocols (dilution Hi-C vs. in situ Hi-C)
  • different restriction enzymes (HindIII vs. NcoI vs. MboI)

different experiment protocols


In [111]:
different_stuff=function(scores_orig,metadata,title,out){
    MINIMUM_COUNT=3
    WIDTH=10
    HEIGHT=15
    
    #fill in the entries from the metadata for the pairs
    scores=add_metadata(metadata,scores_orig)
    
    #specify which are bioreps and which experiments are done with the same restriction enzyme
    scores=annotate_biorep_re(scores)
    
    #keep only bioreps
    bioreps=which(as.character(scores[,'biorep'])=='1')
    scores=scores[bioreps,]
    
    colnames(scores)[3]='score'
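    no_crosslink=which(grepl('NA',scores[,'crosslinking']))
    
    #remove no-crosslinking samples
    #(guard against an empty index: scores[-integer(0),] would drop every row)
    if (length(no_crosslink)>0){
        scores=scores[-no_crosslink,]
    }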
    
    ###############################
    ##### dilution vs in-situ #####
    ###############################
    scores=data.frame(scores,
                      protocol=paste(as.character(scores[,'protocol1']),'|',as.character(scores[,'protocol2'])),
                      stringsAsFactors=FALSE)
    for (i in c(1:(dim(scores)[1]))){
        protocols=c(as.character(scores[i,'protocol1']),as.character(scores[i,'protocol2']))
        protocols=protocols[order(protocols)]
        scores[i,'protocol']=paste(protocols[1],'|',protocols[2])
    }
    counts=table(scores[,'protocol'])
    keep=names(counts)[which(counts>=MINIMUM_COUNT)]
    protocol_scores=scores[which(as.character(scores[,'protocol']) %in% keep),]
    p=ggplot(protocol_scores,aes(y=score,x=protocol))+
    geom_boxplot()+geom_jitter(alpha=0.3)+theme_bw()+niceggplot+ #ylim(0,1)+
    coord_flip()+ylim(0,1)
    print(p)
    pdf(paste(out,'.',title,'.protocol.pdf',sep=''),width=8,height=5)
    print(p)
    dev.off()
    
    ###############################
    ##### restriction enzyme #####
    ###############################
    #for this, keep only in situ data
    in_situ=intersect(which(as.character(scores[,'protocol1'])=='in situ'),
                     which(as.character(scores[,'protocol2'])=='in situ'))
    scores=data.frame(scores,
                      re=paste(as.character(scores[,'re1']),'|',as.character(scores[,'re2'])),
                      stringsAsFactors=FALSE)
    scores=scores[in_situ,]
    for (i in c(1:(dim(scores)[1]))){
        res=c(as.character(scores[i,'re1']),as.character(scores[i,'re2']))
        res=res[order(res)]
        scores[i,'re']=paste(res[1],'|',res[2])
    }
    counts=table(scores[,'re'])
    keep=names(counts)[which(counts>=MINIMUM_COUNT)]
    re_scores=scores[which(as.character(scores[,'re']) %in% keep),]
    p=ggplot(re_scores,aes(y=score,x=re))+
    geom_boxplot()+geom_jitter(alpha=0.3)+theme_bw()+niceggplot+ #ylim(0,1)+
    coord_flip()+ylim(0,1)
    print(p)
    pdf(paste(out,'.',title,'.re.pdf',sep=''),width=6,height=4)
    print(p)
    dev.off()  

}

different_stuff(allscores[,c('m1','m2','GenomeDISCO')],test_data,'GenomeDISCO',
               paste(PLOTS_PATH,'/GenomeDISCO',sep=''))
different_stuff(allscores[,c('m1','m2','HiCRep')],test_data,'HiCRep',
               paste(PLOTS_PATH,'/HiCRep',sep=''))
different_stuff(allscores[,c('m1','m2','HiCSpector')],test_data,'HiC-Spector',
               paste(PLOTS_PATH,'/HiC-Spector',sep=''))


png: 2
png: 2
png: 2
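
Both halves of `different_stuff` build an order-independent label for each pair by sorting the two values, then keep only the labels that occur at least `MINIMUM_COUNT` times. As a sketch, the same two steps can be written without the row loop by using `pmin()` and `pmax()`, which also accept character vectors. The toy data frame below is made up for illustration.

```r
MINIMUM_COUNT = 3

# Made-up replicate pairs with the protocol of each sample and a score.
toy = data.frame(
  protocol1 = c("in situ", "dilution", "in situ", "in situ", "dilution", "dilution"),
  protocol2 = c("dilution", "in situ", "in situ", "in situ", "in situ", "dilution"),
  score = c(0.70, 0.72, 0.90, 0.88, 0.69, 0.80),
  stringsAsFactors = FALSE
)

# Element-wise smaller and larger label, so "a | b" and "b | a" get the same name.
first = pmin(toy$protocol1, toy$protocol2)
second = pmax(toy$protocol1, toy$protocol2)
toy$protocol = paste(first, '|', second)

# Keep only categories with enough pairs to draw a meaningful boxplot.
counts = table(toy$protocol)
keep = names(counts)[counts >= MINIMUM_COUNT]
kept = toy[toy$protocol %in% keep, ]
print(kept)
```

Only the mixed "dilution | in situ" category has three pairs here, so it is the only one that survives the filter.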

Make Supp Table 2


In [115]:
metadata_write=metadata[c(1:83),]

#add geo, data type, resolution
metadata_write=data.frame(GEO="GSE63525",
                          resolution=50000,
                          datatype="in situ Hi-C",
                            metadata_write)
write.table(metadata_write,
            file=paste(DATA_PATH,'/data/Supp_Table2.txt',sep=''),
           quote=FALSE,sep='\t',row.names=FALSE,col.names=TRUE)
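
One way to check a table written like this is a round trip through a temporary file, which confirms that the constant columns were prepended and that the tab-separated layout reads back cleanly. This is a sketch with a made-up two-row metadata frame. The column names `sample` and `cell` and their values are placeholders, not the real metadata columns.

```r
# Placeholder metadata; the real table has one row per Hi-C sample.
toy_metadata = data.frame(sample = c("S1", "S2"),
                          cell = c("cellA", "cellB"),
                          stringsAsFactors = FALSE)

# Prepend the constant columns, as in the cell above.
toy_write = data.frame(GEO = "GSE63525",
                       resolution = 50000,
                       datatype = "in situ Hi-C",
                       toy_metadata,
                       stringsAsFactors = FALSE)

# Write to a temporary file with the same options, then read it back.
tmp = tempfile(fileext = ".txt")
write.table(toy_write, file = tmp,
            quote = FALSE, sep = '\t', row.names = FALSE, col.names = TRUE)
back = read.table(tmp, header = TRUE, sep = '\t', stringsAsFactors = FALSE)
unlink(tmp)

print(back)
```

Reading with `sep='\t'` matters because the `datatype` value contains spaces and the file is written without quotes.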
