Analysis of Gene Expression Data via Arrays using Bioconductor/R - II

Copy the code from questions Q4, Q7, Q8, and Q9 of the previous in-class set. These correspond to the following steps:

(i) loading the R libraries,
(ii) loading the file that maps phenotype to CEL files (i.e., the .csv file),
(iii) storing the list of CEL files available,
(iv) reading the CEL files, and
(v) making the box plot. (Remember to include only the CEL files of interest here.)



In [0]:



In [0]:



In [0]:

Create a list of your treated/untreated samples and store it in a variable called "group". Print it to the screen.
Create the design matrix using the "group" variable and store it in a variable called "design". Print it to the screen.
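
As a rough illustration of these two steps, the sketch below builds a "group" factor and a design matrix with base R's model.matrix(). The sample labels (three untreated, three treated) and the choice of an intercept-style design are assumptions made for the example; your own labels must follow the order in which your CEL files were read.

```r
# Hypothetical sample labels (assumption): three untreated and three treated
# arrays, listed in the same order as the CEL files.
group <- factor(c("untreated", "untreated", "untreated",
                  "treated", "treated", "treated"),
                levels = c("untreated", "treated"))
print(group)

# Design matrix with an intercept (the untreated baseline) and one column
# for the treated-vs-untreated difference.
design <- model.matrix(~ group)
print(design)
```

With this layout the second column of "design" marks the treated samples, so the treated-vs-untreated effect is the second coefficient in any model fitted with it.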



In [0]:

Analyze the data using the R function lmFit(). Use your "genenorm" and "design" variables as arguments. Store the result in a new variable called "fit".
Apply the empirical Bayes correction using the R function eBayes(). Store the result in a variable called "efit".
Obtain the top 100 results from the "efit" variable using the R function topTable(). Sort them by p-value and store the result in a variable called "tt".
Output the top 5 results to the screen.
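
The four steps above might look roughly like the following sketch. It assumes the limma package is installed and replaces the real "genenorm" and "design" with small simulated stand-ins so that it runs on its own. The use of coef = 2 is also an assumption, tied to an intercept-style design in which the second column is the treated-vs-untreated effect.

```r
library(limma)

# Toy stand-ins (assumptions): in the notebook, "genenorm" and "design"
# come from the earlier steps instead of being simulated here.
set.seed(42)
genenorm <- matrix(rnorm(200 * 6, mean = 8), nrow = 200,
                   dimnames = list(paste0("probe_", 1:200),
                                   paste0("sample_", 1:6)))
group <- factor(rep(c("untreated", "treated"), each = 3),
                levels = c("untreated", "treated"))
design <- model.matrix(~ group)

# Fit one linear model per probe, then moderate the variances.
fit <- lmFit(genenorm, design)
efit <- eBayes(fit)

# Top 100 probes for the treated-vs-untreated coefficient, sorted by p-value.
tt <- topTable(efit, coef = 2, number = 100, sort.by = "P")
print(head(tt, 5))
```

Because the toy data are pure noise, the resulting p-values carry no biological meaning; only the sequence of calls is of interest.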



In [0]:

Write your top 100 results (the variable "tt") to a file named "mytop100results.txt". Exclude quotation marks, separate the entries with commas, and include row and column names in the output table.
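
One way to meet these requirements is base R's write.table(). The sketch below uses a tiny made-up "tt" data frame (an assumption, so that the example runs on its own) and writes it to the current working directory.

```r
# Tiny stand-in for the real "tt" (assumption): two probes, two columns.
tt <- data.frame(logFC = c(1.5, -2),
                 P.Value = c(0.001, 0.02),
                 row.names = c("probe_1", "probe_2"))

# quote = FALSE drops quotation marks, sep = "," separates entries by comma,
# and row.names/col.names = TRUE keep both sets of names.
write.table(tt, file = "mytop100results.txt",
            quote = FALSE, sep = ",",
            row.names = TRUE, col.names = TRUE)
```

Note that with row.names = TRUE and col.names = TRUE the header line has one field fewer than the data lines, because the row-name column has no header. Passing col.names = NA instead writes an empty leading header field, which some spreadsheet programs handle more gracefully.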



In [0]:



In [0]:



In [0]:

Create a table with all of the results from the "efit" object and store it in a new object called "allTT".
TIP: Think through (in human terms) the specific tasks/function calls you will need to perform the task.
Hint: there are three specific tasks/steps, and two functions that will be helpful to you.
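
One possible reading of the hint, offered as a sketch rather than the intended answer: count the probes in "efit", pass that count to topTable(), and store the result. The toy data, the use of nrow(), and coef = 2 are assumptions made so the example is self-contained; limma must be installed.

```r
library(limma)

# Toy stand-in for the real "efit" (assumption): 50 probes, 4 arrays.
set.seed(1)
expr <- matrix(rnorm(50 * 4), nrow = 50,
               dimnames = list(paste0("probe_", 1:50), paste0("s", 1:4)))
group <- factor(c("untreated", "untreated", "treated", "treated"),
                levels = c("untreated", "treated"))
efit <- eBayes(lmFit(expr, model.matrix(~ group)))

# Step 1: find out how many probes the fit contains.
n_probes <- nrow(efit)

# Steps 2 and 3: ask topTable() for that many rows and store the result.
allTT <- topTable(efit, coef = 2, number = n_probes, sort.by = "P")
print(dim(allTT))
```

topTable() also accepts number = Inf, which returns every row without counting first.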



In [0]:

Run the R function sessionInfo() to record the R version and the package versions used for your analysis.
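
A minimal sketch of this last step is shown below. Storing the result in a variable named "info" is only an illustrative choice; calling sessionInfo() on its own line in a notebook cell prints the same report.

```r
# Capture the R version, platform, and attached package versions.
info <- sessionInfo()
print(info)
```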



In [0]: