In [1]:
%matplotlib inline
import pylab as pl
from math import sqrt
import sys
# importing platemate
sys.path.insert(0, '../src')
import platemate
You first need to map each column in your plate to a colony or a control.
In [2]:
ColumnNames = {
'C' : "Dev1",
'D' : "Dev2",
'E' : "Dev3"
}
controlNames = {
'A' : "LB",
'B' : "LB+Cam",
'F' : "+control",
'G' : "-control1",
'H' : "-control2"
}
Then, we create an instance of PlateMate:
In [19]:
reload(platemate)
pm = platemate.PlateMate( colonyNames = ColumnNames, controlNames = controlNames )
The variable pm above is what you will use to read and parse your data, plot it, and analyze it. It uses the plate mapping you defined through colonyNames and controlNames. For instance, you can retrieve this information with the two functions below:
In [20]:
print pm.getColonyNames()
print pm.getControlNames()
Let's get started by importing our data. During this process, PlateMate will look for all files that follow a pattern (in our case, each file name starts with "medida"). Then, it will read and parse each of those files. This is usually a fast process, even for larger data sets, and should take only a fraction of a second to complete.
In [21]:
pm.findFiles("medida")
pm.readFluorescence()
pm.readOpticalDensity()
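As a rough illustration of prefix-based file discovery, the sketch below collects the files in a directory whose names start with a given prefix. It only shows the general idea and is not platemate's actual implementation; `find_by_prefix` and its `directory` parameter are hypothetical names introduced here.

```python
import os

def find_by_prefix(prefix, directory="."):
    """Return sorted paths of regular files in `directory` whose names start with `prefix`."""
    # Hypothetical helper: platemate's own findFiles() may work differently.
    names = sorted(n for n in os.listdir(directory) if n.startswith(prefix))
    return [os.path.join(directory, n) for n in names
            if os.path.isfile(os.path.join(directory, n))]

# List candidate data files in the current directory.
print(find_by_prefix("medida"))
```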
Now our instance pm has all information about the plate readings. We can get a summary from one of the well sets by using the function summary(). As an example, let us check the data from LB wells.
In [22]:
pm.summary("LB")
Out[22]:
Each row displayed above represents a different measurement in time (in our case, measurements were taken 1 hour apart). Thus, we're looking at the data from the first 3 hours. If you look back at our map, all wells with LB were in column A. The number following A refers to a row on the actual plate, i.e., A04 represents column A, row 4 in the original plate.
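The well-label convention described above can be sketched as a small lookup. The helper `population_of` and the combined `plate_map` dictionary are hypothetical and not part of platemate; the sketch assumes labels made of one column letter followed by a row number.

```python
# Hypothetical helper, not part of platemate: map a well label such as "A04"
# to the population assigned to its column letter.
plate_map = {
    'A': "LB", 'B': "LB+Cam",
    'C': "Dev1", 'D': "Dev2", 'E': "Dev3",
    'F': "+control", 'G': "-control1", 'H': "-control2",
}

def population_of(well, mapping=plate_map):
    """Return (population, row) for a label like 'A04'."""
    column, row = well[0].upper(), int(well[1:])
    return mapping[column], row

print(population_of("A04"))  # ('LB', 4)
```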
You can always retrieve the full fluorescence data for a population by using getFluorescence().
In [23]:
pm.getFluorescence("LB")
Out[23]:
Similarly, you can check the optical density associated with that particular population.
In [24]:
pm.getOpticalDensity("LB")
Out[24]:
Because the file also contained the temperature at the time of each reading, platemate also stores it for any later analysis.
In [29]:
# retrieving temperature
Temperature = pm.getTemperature()
# printing mean, min and max.
print "Average temperature: %4.1f'C" % ( Temperature.mean() )
print "Temperature range: %4.1f'C - %4.1f'C" % (Temperature.min(), Temperature.max())
pm.plotTemperature()
pl.show()
In [26]:
print "Plotting all wells for each population"
pl.figure(figsize=(6,4.5))
pm.plotIt(["Dev1","Dev2","Dev3"])
pl.show()
print "Plotting averages for each population"
pl.figure(figsize=(6,4.5))
pm.plotMean(["Dev1","Dev2","Dev3"])
pl.show()
print "Plotting averages and 1-std intervals for each population"
pl.figure(figsize=(6,4.5))
pm.plotFuzzyMean(["Dev1","Dev2","Dev3"])
pl.show()
In [27]:
pm.compareFluorescence("LB","Dev1")
Out[27]:
In [28]:
pl.figure(figsize=(7,4))
pm.plotBars(["Dev1","Dev2","Dev3","-control1"], 5)
pl.show()
An important part of comparing the expression of different devices/genes is the use of appropriate statistical tests. For instance, suppose that you want to find out whether the response of Dev1 is significantly stronger than the response of negative control 1. This can be addressed with a simple statistical test such as the Mann-Whitney $U$ test. Below, we show the $U$ statistic and the p-value obtained when comparing Dev1, Dev2 and Dev3 with -control1.
In [380]:
print "Device 1 vs -control1:"
print pm.compareFluorescence("Dev1","-control1")
print "Device 2 vs -control1:"
print pm.compareFluorescence("Dev2","-control1")
print "Device 3 vs -control1:"
print pm.compareFluorescence("Dev3","-control1")
The result above shows that we cannot reject the null hypothesis for Dev3, i.e., its response is not significantly larger than that of the negative control. This agrees with the bar plots comparing all devices and the negative control. In other words, no significant expression was observed in our device 3.
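For readers unfamiliar with the $U$ statistic, here is a minimal plain-Python sketch of how it can be computed, together with a one-sided p-value from the normal approximation. This is a textbook formulation and not necessarily what compareFluorescence() does internally; the function names are hypothetical, the readings are made up, and the approximation omits tie and continuity corrections, so it is rough for small samples.

```python
from math import sqrt, erfc

def mann_whitney_u(x, y):
    """U statistic for x: number of pairs (xi, yj) with xi > yj, ties counted as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

def one_sided_p(u, n1, n2):
    """Normal approximation to P(U >= u) under the null (no tie or continuity correction)."""
    mean = n1 * n2 / 2.0
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return 0.5 * erfc((u - mean) / (sd * sqrt(2.0)))

# Made-up readings, for illustration only.
device = [5.1, 6.3, 5.8, 6.0]
control = [1.2, 1.9, 1.5, 2.1]
u = mann_whitney_u(device, control)
print(u, one_sided_p(u, len(device), len(control)))
```

A large $U$ relative to its maximum, n1 * n2, means that most values in the first sample exceed those in the second.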
Is the expression of device 3 at least stronger than LB?
In [359]:
print "Device 3 vs LB:"
print pm.compareFluorescence("Dev3","LB")
print "-control vs LB:"
print pm.compareFluorescence("-control1","LB")
This shows that it is, although the negative control also shows a response significantly larger than the LB medium.
Testing every pairwise combination is not the proper way to study multiple populations. The Kruskal-Wallis test, followed by Dunn's post-hoc test, extends the $U$ test to compare multiple populations at once.
Because ANOVA is a very popular test, platemate can perform ANOVA followed by a post-hoc test using Tukey's HSD.
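For intuition about the test itself, the F statistic of a one-way ANOVA can be sketched in plain Python. This is the textbook formula (between-group mean square divided by within-group mean square), not platemate's implementation; `one_way_anova_f` is a hypothetical name and the numbers are made up.

```python
def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / float(n_total)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        mean = sum(g) / float(len(g))
        # Spread of the group mean around the grand mean, weighted by group size.
        ss_between += len(g) * (mean - grand_mean) ** 2
        # Spread of the individual values around their own group mean.
        ss_within += sum((v - mean) ** 2 for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Made-up values, for illustration only.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

Under the null hypothesis, F follows an F distribution with k - 1 and N - k degrees of freedom, which is where the p-value comes from.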
In [360]:
pm.ANOVA(["Dev1","Dev2","Dev3","-control1"])
Out[360]:
In [361]:
pm.TukeyHSD(["Dev1","Dev2","Dev3","-control1"])
Out[361]:
This analysis shows that the data pass the ANOVA test (rejecting the null hypothesis that all populations have the same mean), and Tukey's test shows that Dev3 and -control1 form the only pair that does not show a significant difference in means.
It is important to keep in mind, however, that ANOVA assumes that ...
If hypothesis 2 does not hold, then Kruskal-Wallis should be used instead.
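As a sketch of the rank-based alternative, the Kruskal-Wallis $H$ statistic can be computed in plain Python as follows. The helper names are hypothetical, the data are made up, and the tie correction factor is omitted, so this is an illustration of the formula rather than a replacement for a statistics library.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)

# Made-up values, for illustration only.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

For reasonably sized groups, $H$ is usually compared against a chi-squared distribution with k - 1 degrees of freedom, where k is the number of groups.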