In [3]:
#Set directories for output
#Define data folders
out_dir="/input_dir/"
In [5]:
#Subsample a local file while stripping the malformed header prefix on line 1
read_num=1000000
in_fq_1="/lab/solexa_public/Youn/solexa_public/170505_WIGTC-HISEQ2A_CALADANXX/QualityScores"
#A FASTQ record spans 4 lines, so keep read_num*4 lines to retain read_num reads
zcat "input_fq.gz" | sed '1 s/.*@/@/' | head -n $((read_num * 4)) | \
gzip > "$out_dir/input_down_fq.gz"
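The subsampling step above can be wrapped in a small reusable function. This is only a sketch: `subsample_fastq` and the synthetic demo file are hypothetical and not part of Euplotid, and it assumes the read count is meant in reads, so it keeps four lines per read. It uses `gzip -dc` rather than `zcat` for portability.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Euplotid): keep the first N reads of a
# gzipped FASTQ, dropping anything that precedes the '@' on line 1.
subsample_fastq() {
    local in_fq="$1" out_fq="$2" n_reads="$3"
    # A FASTQ record is 4 lines: header, sequence, '+', qualities.
    local n_lines=$(( n_reads * 4 ))
    gzip -dc "$in_fq" | sed '1 s/.*@/@/' | head -n "$n_lines" | gzip > "$out_fq"
}

# Demo on a synthetic 5-read file whose first line carries a junk prefix.
demo_dir="$(mktemp -d)"
{
    printf 'JUNK@read1\nACGT\n+\nIIII\n'
    for i in 2 3 4 5; do
        printf '@read%s\nACGT\n+\nIIII\n' "$i"
    done
} | gzip > "$demo_dir/demo.fq.gz"

# Keep the first 2 reads and print them.
subsample_fastq "$demo_dir/demo.fq.gz" "$demo_dir/demo_down.fq.gz" 2
gzip -dc "$demo_dir/demo_down.fq.gz"
```

On a real run the call would look like `subsample_fastq input_fq.gz "$out_dir/input_down_fq.gz" "$read_num"`.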
Make sure to quality-check, trim, and filter your reads using this pipeline before running any of the downstream pipelines.
In [4]:
#ChIP-seq/Chromatin Accessibility
#CTCF mesc
#/root/Euplotid/src/SRA2fq SRR524848 $out_dir CTCF_mesc 1000000
#Input mesc
/root/Euplotid/src/SRA2fq SRR524849 $out_dir input_mesc 5000000
#H3k27Ac mesc
/root/Euplotid/src/SRA2fq SRR066766 $out_dir h3k27ac_mesc 5000000
#ATAC mesc
#/root/Euplotid/src/SRA2fq SRR2927023 $out_dir ATAC_mesc 5000000
#CTCF hesc
#/root/Euplotid/src/SRA2fq SRR2056018 $out_dir CTCF_hesc 1000000
#Input hesc
#/root/Euplotid/src/SRA2fq SRR2056020 $out_dir input_hesc 1000000
#H3K27Ac hesc
#/root/Euplotid/src/SRA2fq SRR2056016 $out_dir h3k27Ac_hesc 1000000
#ATAC h7 hesc
#/root/Euplotid/src/SRA2fq SRR3689760 $out_dir ATAC_h7 1000000
See this pipeline to call peaks using HOMER and/or MACS2 and to position nucleosomes using NucleoATAC.
In [2]:
#ChIA-PET
#mesc
/root/Euplotid/src/SRA2fq SRR1296617 $out_dir ChiA_SMC1_mesc 1000000
#hesc
/root/Euplotid/src/SRA2fq SRR2054933 $out_dir ChiA_SMC1_hesc 1000000
See this pipeline to prep and align reads, call and normalize pairwise interactions using ChIA-PET2 and/or Origami, and dump them into cooler format.
In [3]:
#Hi-C
#mesc
/root/Euplotid/src/SRA2fq SRR443883 $out_dir HiC_mesc 1000000
#hesc
/root/Euplotid/src/SRA2fq SRR400260 $out_dir HiC_hesc 1000000
See this pipeline to go from fastq reads through alignment and normalization to cooler format using HiC-Pro.
In [19]:
#HiChIP
#mesc
/root/Euplotid/src/SRA2fq SRR3467183 $out_dir HiChip_mesc 1000000
#GM12878 (human lymphoblastoid line, not hESC, despite the HiChip_hesc output label)
/root/Euplotid/src/SRA2fq SRR3467176 $out_dir HiChip_hesc 1000000
See this custom pipeline to go from fastq reads through HiC-Pro plus scripts that normalize and dump into cooler format.
In [20]:
#DNase Hi-C
#mesc patski cells
/root/Euplotid/src/SRA2fq SRR2033066 $out_dir dnaseHiC_patski 1000000
#hesc
/root/Euplotid/src/SRA2fq SRR1248175 $out_dir dnaseHiC_hesc 1000000
See this pipeline to go from fastq reads through alignment and normalization to cooler format using HiC-Pro.
In [ ]:
#RNA-Seq
#mesc 4cell
/root/Euplotid/src/SRA2fq SRR1840518 $out_dir rnaseq_mesc 1000000
#hesc mesoderm
/root/Euplotid/src/SRA2fq SRR3439456 $out_dir rnaseq_hesc 1000000
See this pipeline to align RNA-Seq reads and quantify and normalize expression values (FPKM) using RSEM.
In [ ]:
#GRO-seq
#mesc
/root/Euplotid/src/SRA2fq SRR935093 $out_dir groseq_mesc 1000000
#h1 hesc (our data!)
/root/Euplotid/src/SRA2fq SRR574826 $out_dir groseq_hesc 1000000
See this pipeline to find nascent transcripts using FStitch and miRNA promoters using mirSTP.
In [ ]:
#4C
#mesc poised enhancers = viewpoints
/root/Euplotid/src/SRA2fq SRR4451724 $out_dir 4c_poiEnh_mesc 1000000
#hesc MT2A
/root/Euplotid/src/SRA2fq SRR1409666 $out_dir 4c_MT2A_hesc 1000000
See this pipeline to get a wiggle file from fastq reads using HiC-Pro and/or a custom pipeline.
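Every download cell above repeats the same call, so the accessions could instead be kept in one table and looped over. The sketch below is an illustration only: `fetch_all`, `samples`, and the `DRY_RUN` switch are hypothetical names, the call shape (`SRA2fq <accession> <output dir> <sample name> <read count>`) is inferred from the cells above rather than from SRA2fq documentation, and only four of the samples are listed. By default it prints the commands without running them.

```shell
#!/usr/bin/env bash
# Assumed call shape, inferred from the cells above:
#   SRA2fq <SRA accession> <output dir> <sample name> <number of reads>
sra2fq="/root/Euplotid/src/SRA2fq"
out_dir="/input_dir/"

# accession  sample_name  read_count (copied from cells above; extend as needed)
samples="SRR524849 input_mesc 5000000
SRR066766 h3k27ac_mesc 5000000
SRR1296617 ChiA_SMC1_mesc 1000000
SRR2054933 ChiA_SMC1_hesc 1000000"

# Hypothetical wrapper: print each command; execute it only when DRY_RUN=0.
fetch_all() {
    printf '%s\n' "$samples" | while read -r acc name n; do
        # Skip blank lines in the table.
        [ -z "$acc" ] && continue
        echo "$sra2fq $acc $out_dir $name $n"
        if [ "${DRY_RUN:-1}" = "0" ]; then
            # Redirect stdin so the command cannot swallow the sample table.
            "$sra2fq" "$acc" "$out_dir" "$name" "$n" < /dev/null
        fi
    done
}

fetch_all
```

Setting `DRY_RUN=0` before calling `fetch_all` would run the downloads for real; keeping the table in one place makes it easier to see which accession maps to which sample name.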