This data set comes from an early publication on chromatin immunoprecipitation (ChIP) experiments that set out to determine which transcription factors bind to which genes in yeast (Lee et al., 2002).
In [1]:
import pods
import matplotlib.pyplot as plt  # pyplot is preferred over the legacy pylab interface
%matplotlib inline
In [2]:
data = pods.datasets.lee_yeast_ChIP()
The data consists of $p$-values for the hypothesized relationships between the transcription factors and the genes. There are 113 transcription factors, listed in data['transcription_factors'].
In [3]:
print(data['transcription_factors'])
The 6270 gene names and their annotations are given in data['annotations'].
A pandas data frame containing all the $p$-values for the binding between genes and transcription factors is available in data['Y'].
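Because the entries are $p$-values, a common first step is to threshold them to obtain a binary "bound / not bound" call for each gene and transcription factor pair. The following is a minimal sketch of that idea, not part of the original notebook. It uses a small toy data frame in place of data['Y'], and it assumes that rows are genes, columns are transcription factors, and that 0.001 is a reasonable cut-off. The names Y, threshold, bound and genes_per_tf are all hypothetical.

```python
import numpy as np
import pandas as pd

# Toy stand-in for data['Y'] (hypothetical names and values, NOT real data).
# Assumption: rows are genes and columns are transcription factors.
Y = pd.DataFrame(
    {
        "TF_A": [0.0004, 0.20, 0.0009, np.nan],
        "TF_B": [0.50, 0.0001, 0.30, 0.80],
    },
    index=["gene1", "gene2", "gene3", "gene4"],
)

threshold = 1e-3  # assumed cut-off; adjust to suit the analysis

# Boolean matrix: True where the p-value falls below the threshold.
# Comparisons against NaN evaluate to False, so missing entries
# are treated as "not bound" here.
bound = Y < threshold

# Number of genes called as bound for each transcription factor.
genes_per_tf = bound.sum(axis=0)
print(genes_per_tf)
```

With the real data, replacing the toy frame with data['Y'] should give one count per transcription factor, provided the orientation assumption above holds. If the frame turns out to be transposed, summing over axis=1 instead would give the same per-factor counts.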
In [4]:
data['Y'].describe()
Out[4]: