ChIP-seq II - Prelab

Table of Contents

  1. ChIP-seq for RNA binding
  2. Chromatin accessibility methods

1. ChIP-seq for RNA binding

Peak calling (finding "hot spots" where reads pile up) is useful beyond ChIP-seq analysis: the same idea applies to other large-scale sequencing methods. In this prelab, we give a high-level overview of some of these other applications of peak calling algorithms and some considerations for the analyses.

As we saw in the first module prelab, ChIP-seq is used to detect hot spots of DNA:protein interactions, where the protein is pulled down by an antibody (so it can be a transcription factor, or a histone with a certain modification). Because of how the experiment is prepared (formaldehyde crosslinking, followed by sequencing of the recovered DNA), this approach detects interactions with DNA, but not RNA. However, there are many contexts in which it is useful to determine how a protein binds to RNA. Examples include post-transcriptional regulation mediated by proteins that bind to pre-mRNA and translational regulation mediated by proteins that bind to mRNA, among many others. These contexts motivated the development of methods related to ChIP-seq that identify RNA:protein interactions instead of DNA:protein interactions.

CLIP-seq / HITS-CLIP

Crosslinking and Immunoprecipitation Sequencing (CLIP-seq), also known as HITS-CLIP, is a commonly used method to detect these RNA:protein interactions. Whereas ChIP-seq uses formaldehyde crosslinking, CLIP-seq uses ultraviolet light to crosslink RNA to any proteins that are bound to it. The protein of interest is then immunoprecipitated, and the RNA recovered with it is analyzed by high-throughput sequencing.

The principles of analyzing CLIP-seq data are very similar to those for ChIP-seq. At its simplest, the analysis involves aligning the reads (which may require a different alignment program, since the reads represent RNA instead of DNA) and then applying a very similar peak calling approach.
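To make the idea of "reads piling up" concrete, here is a minimal sketch of threshold-based peak calling on already-aligned reads. It assumes reads are given as half-open (start, end) intervals on a single chromosome, and the function names `coverage` and `call_peaks` are hypothetical. Real peak callers also model background signal and statistical significance, which this toy example does not attempt.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def coverage(reads: List[Interval], length: int) -> List[int]:
    """Per-base read depth for a chromosome of the given length.

    Assumes 0 <= start < end <= length for every read.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    depth = 0
    cov = []
    for change in diff[:length]:
        depth += change
        cov.append(depth)
    return cov


def call_peaks(cov: List[int], min_depth: int) -> List[Interval]:
    """Contiguous regions whose depth is at least min_depth (toy threshold)."""
    peaks = []
    start = None
    for pos, depth in enumerate(cov):
        if depth >= min_depth and start is None:
            start = pos                    # a peak opens here
        elif depth < min_depth and start is not None:
            peaks.append((start, pos))     # the peak closed just before pos
            start = None
    if start is not None:                  # peak runs to the chromosome end
        peaks.append((start, len(cov)))
    return peaks


# Made-up reads: three overlap near positions 4-9, one sits alone.
reads = [(2, 8), (4, 10), (5, 9), (20, 25)]
cov = coverage(reads, 30)
print(call_peaks(cov, 2))  # [(4, 9)]
```

The lone read at positions 20-25 never reaches the depth threshold, so it is not reported as a peak.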

There are several variations on this idea that can be used in different contexts, including PAR-CLIP, which is used in cultured cells, and iCLIP, which uses random barcodes to identify PCR duplicates and maps binding sites at higher resolution. See this Wikipedia article for a decent overview of the different approaches: https://en.wikipedia.org/wiki/CLIP
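One common use of random barcodes in sequencing protocols is to tell PCR duplicates apart from reads that come from independent molecules. The sketch below illustrates that use under simple assumptions: each read is summarized as a hypothetical (chromosome, position, strand, barcode) tuple, and two reads count as duplicates only if all four values match exactly. Real pipelines typically also tolerate sequencing errors in the barcode, which is omitted here.

```python
from collections import Counter
from typing import Dict, List, Tuple

Read = Tuple[str, int, str, str]   # (chromosome, position, strand, barcode)
Site = Tuple[str, int, str]        # (chromosome, position, strand)


def collapse_duplicates(reads: List[Read]) -> List[Read]:
    """Keep one read per (chromosome, position, strand, barcode) combination."""
    seen = set()
    unique = []
    for read in reads:
        if read not in seen:       # first time we see this molecule
            seen.add(read)
            unique.append(read)
    return unique


def count_per_site(reads: List[Read]) -> Dict[Site, int]:
    """Number of distinct barcoded molecules observed at each site."""
    counts = Counter()
    for chrom, pos, strand, _barcode in collapse_duplicates(reads):
        counts[(chrom, pos, strand)] += 1
    return dict(counts)


# Made-up reads: the first two are identical, so they count as one molecule.
reads = [
    ("chr1", 100, "+", "ACGT"),
    ("chr1", 100, "+", "ACGT"),
    ("chr1", 100, "+", "TTGA"),
    ("chr1", 250, "-", "ACGT"),
]
print(count_per_site(reads))
```

Counting distinct molecules instead of raw reads keeps a site from looking like a strong peak only because one fragment was amplified many times.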

2. Chromatin accessibility methods

Peak calling can also be applied to large-scale sequencing methods that identify genomic regions with open, or accessible, chromatin. In these regions, nucleosomes have been repositioned so that they are less tightly spaced, or have been evicted completely, which gives the transcriptional machinery and regulatory proteins such as transcription factors access to the genomic DNA.

There are many such methods. For the rest of this prelab, please read part of the following review paper, which introduces the methods themselves and discusses some of the analysis considerations: http://epigeneticsandchromatin.biomedcentral.com/articles/10.1186/1756-8935-7-33. Read the following sections: "MNase-seq: an indirect chromatin accessibility assay", "Direct chromatin accessibility assays", and "Detection of enriched regions". If you are interested, the "Stage 3 analysis" and "Stage 4 analysis" sections also have some useful information.

Test your understanding

1) For which of these would CLIP-seq not be suitable?

a) Figuring out what RNA molecules a protein known to inhibit translation binds to.

b) Figuring out what RNAs may undergo splicing, by determining binding status to a protein known to be involved with splicing.

c) Figuring out what RNA molecules regulate expression of a gene of interest through RNA-binding to its regulatory region.

2) True or False: We expect open regions of the genome to be less transcriptionally active, because the transcriptional machinery is more likely to fall off the DNA without the tightly coiled DNA holding it in place.

3) Describe MNase-seq in your own words. What is the basic experimental setup? What do you learn from it?