In [ ]:
%matplotlib inline
import numpy as np
import scipy.stats
from matplotlib import pyplot as plt
In [ ]:
tuple, array or list?
In [ ]:
| scientific | common |
|---|---|
| X. laevis | frog |
| M. musculus | mouse |
| H. sapiens | human |
In [ ]:
for loop to print out every 3rd value between -97 and 33.
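One way to read this task is sketched below. It assumes counting starts at -97 and that 33 would be included if the step landed on it; `values` is a name introduced here only for illustration.

```python
# range(start, stop, step) excludes stop, so a stop of 34 lets 33 itself qualify.
values = list(range(-97, 34, 3))

for value in values:
    print(value)
```

Because 33 is not a whole number of steps away from -97, the last value printed under these assumptions is 32.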
In [ ]:
time.time() function, which spits out the time in seconds since January 1, 1970 at 00:00 UTC (the Unix epoch). Capture the output using a numpy array (no lists allowed) and determine the mean.
In [ ]:
import time
for i in range(1000):
    instrument_spew = time.time()
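A minimal sketch of one way to capture these readings without a list: preallocate a numpy array and fill it by index. The names `n_samples`, `readings`, and `mean_reading` are illustrative and not part of the original exercise.

```python
import time

import numpy as np

n_samples = 1000

# Preallocate a float array so no Python list is ever built.
readings = np.zeros(n_samples, dtype=float)

for i in range(n_samples):
    # Store each instrument reading in its own slot.
    readings[i] = time.time()

mean_reading = np.mean(readings)
print(mean_reading)
```

Preallocating works here because the number of readings is known in advance; growing an array with `np.append` inside the loop would also avoid lists, but it copies the array on every call.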
Reference material is here.
In [ ]:
plt.show()).
In [ ]:
In [ ]:
You place 100 coins heads up in a row and number them by position, with the coin all the way on the left No. 1 and the one on the rightmost edge No. 100. Next, for every number N, from 1 to 100, you flip over every coin whose position is a multiple of N. For example, first you’ll flip over all the coins, because every number is a multiple of 1. Then you’ll flip over all the even-numbered coins, because they’re multiples of 2. Then you’ll flip coins No. 3, 6, 9, 12 … And so on.
What do the coins look like when you’re done? Specifically, which coins are heads down?
Source: fivethirtyeight
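The puzzle can also be checked by brute force. The sketch below follows the rules as stated, with a boolean numpy array standing in for the row of coins; the variable names are illustrative, and index 0 is left unused so that array indices match coin numbers.

```python
import numpy as np

n_coins = 100

# True means heads up. Index 0 is a dummy so coin k lives at index k.
heads_up = np.ones(n_coins + 1, dtype=bool)

for n in range(1, n_coins + 1):
    # The slice n::n selects positions n, 2n, 3n, ... (the multiples of n).
    heads_up[n::n] = ~heads_up[n::n]

# Collect the positions of coins that finish heads down.
heads_down = [pos for pos in range(1, n_coins + 1) if not heads_up[pos]]
print(heads_down)
```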
In [ ]:
You are doing a classic blue/white lacZ mutant screen, where bacterial mutants of interest have white colonies rather than blue colonies. You screen a library containing 10,000 mutants. You expect that 1/1,000 mutants will be white. Assuming no bias in the library, how many colonies do you need to look at to have a < 2% chance of missing a mutant? (You can report your sampling to within a factor of ten.)
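A rough way to frame the calculation, under two assumptions that the problem does not state outright: each colony is an independent draw with a 1/1,000 chance of being white, and "missing a mutant" means seeing no white colonies at all. Then the chance of a miss after $n$ colonies is $(1 - p)^n$, and the sketch below solves for the smallest $n$ that pushes it under 2%. All names are illustrative.

```python
import math

p_white = 1 / 1000   # expected fraction of white colonies
max_miss = 0.02      # acceptable chance of seeing no white colony

# Solve (1 - p_white)**n < max_miss for the smallest integer n.
n_needed = math.ceil(math.log(max_miss) / math.log(1 - p_white))

# Confirm the miss probability at that sample size.
p_miss = (1 - p_white) ** n_needed
print(n_needed, p_miss)
```

If colonies were instead drawn without replacement from the 10,000-member library, a hypergeometric model would be the closer fit.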
In [ ]:
You read a paper that makes a big deal out of the following result.
The authors claim that the difference between treatment 1 and 2 is significant and important. You are skeptical and want to test the claim. Fortunately, these scientists published their response data for each treatment condition in the supplement. These are copied below.
In [ ]:
treat_1 = [57.26977195, 46.18382224, 49.53778012, 41.48839620, 60.208242,
50.52545917, 46.35328597, 45.74836944, 48.44702572, 52.524908,
55.10329891, 46.61524479, 52.13253421, 54.72779465, 42.324008,
50.33964928, 52.18085508, 53.24086389, 43.14439906, 45.148827]
treat_2 = [60.41763564, 48.83220035, 51.12384165, 42.96237314, 62.606467,
51.96334172, 50.68015860, 48.75041835, 51.08492900, 55.163020,
58.30618134, 51.06279668, 54.75658646, 57.74810245, 46.318017,
51.32863816, 54.85243237, 55.94919523, 46.42182621, 48.367620]
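One possible way to probe a claim like this is to compare an unpaired and a paired t-test. The text does not say whether the two treatments were measured on the same samples, so pairing is an assumption that has to be justified before relying on it. The sketch below uses synthetic data rather than the published values, and `compare_treatments` is a hypothetical helper; it could be called on `treat_1` and `treat_2` in the same way.

```python
import numpy as np
import scipy.stats


def compare_treatments(a, b):
    """Return (unpaired_p, paired_p) for two equal-length samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Unpaired test: treats the two samples as independent groups.
    unpaired_p = scipy.stats.ttest_ind(a, b).pvalue
    # Paired test: only valid if a[i] and b[i] come from the same subject.
    paired_p = scipy.stats.ttest_rel(a, b).pvalue
    return unpaired_p, paired_p


# Synthetic illustration only: a small shift on top of large spread between subjects.
rng = np.random.default_rng(0)
baseline = rng.normal(50.0, 5.0, size=20)
shifted = baseline + 3.0 + rng.normal(0.0, 1.0, size=20)

unpaired_p, paired_p = compare_treatments(baseline, shifted)
print(unpaired_p, paired_p)
```

The two tests can disagree sharply when variation between subjects is large compared with the treatment effect. That is why the experimental design matters as much as the p-value, and statistical significance on its own says nothing about whether an effect is important.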
In [ ]:
In [ ]:
In [ ]:
You are studying the growth of bacteria from a small inoculum to saturation in a flask. The following information will help you answer the questions below.
Given your growth conditions, the number of bacteria at time $t$ will grow according to:
$$N(t) = \frac{N_{c}}{1 + \exp(\lambda - kt)}$$
where $N_{c}$, $\lambda$, and $k$ are constants. $N_{c}$ is the maximum number of bacteria that can be supported by the environment, $\lambda$ captures how long it takes for the bacteria to start dividing post inoculation, and $k$ is the instantaneous growth rate. (We're ignoring the fact that bacteria eventually start to die after they run out of food.) For your wildtype bacteria, these constants are: $N_{c} = 1 \times 10^{10}\ \text{cells} \cdot \text{mL}^{-1}$, $\lambda = 12$, and $k = 0.05\ \text{min}^{-1}$.
You can measure bacterial growth by following the turbidity of your cultures using a spectrophotometer. By careful calibration, you know that the observed $OD_{600}$ is related to the number of cells by: $$OD_{600} = \frac{gN}{N + K}$$
where $g = 3.5$ and $K = 2 \times 10^{9}\ \text{cells} \cdot \text{mL}^{-1}$.
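The two relationships above translate directly into code. The sketch below uses the wildtype constants given in the text; the function and constant names (`cells_per_ml`, `od600`, and so on) are illustrative choices, not part of the original problem.

```python
import numpy as np

N_C = 1e10        # carrying capacity, cells per mL
LAMBDA = 12.0     # lag parameter (dimensionless)
K_GROWTH = 0.05   # instantaneous growth rate, per minute
G = 3.5           # maximum observable OD600
K_HALF = 2e9      # cell density giving half-maximal OD600, cells per mL


def cells_per_ml(t_min):
    """Cell density at time t_min (minutes) from the logistic growth model."""
    return N_C / (1.0 + np.exp(LAMBDA - K_GROWTH * t_min))


def od600(n_cells):
    """Observed OD600 for a given cell density (saturating calibration)."""
    return G * n_cells / (n_cells + K_HALF)


t = np.linspace(0.0, 600.0, 601)   # one point per minute for 10 hours
density = cells_per_ml(t)
turbidity = od600(density)

# At t = lambda / k = 240 min the exponent is zero, so N = N_C / 2.
print(cells_per_ml(240.0), od600(cells_per_ml(240.0)))
```

At the midpoint $t = \lambda / k = 240$ min the density is $5 \times 10^{9}$ cells per mL, which corresponds to an $OD_{600}$ of 2.5. Because the calibration saturates, turbidity flattens out earlier than cell number does.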
In [ ]:
In [ ]:
In [ ]:
In [ ]:
You have a tube containing 10 molecules of DNA drawn from a population containing four different species $A$, $B$, $C$, and $D$. You want to estimate the frequencies of $A$, $B$, $C$, and $D$ in the original population. If you were omniscient, you would know that $A$, $B$, $C$, and $D$ have the following actual frequencies:
| species | frequency |
|---|---|
| A | 0.5 |
| B | 0.2 |
| C | 0.2 |
| D | 0.1 |
Since you aren't omniscient, you make a measurement. You use a Polymerase Chain Reaction (PCR) to amplify those 10 molecules, then use high-throughput sequencing to measure the relative frequencies of A-D in the final pool.
What are the mean and standard deviation of your estimates of each frequency ($\hat{f}_{A}$, $\hat{f}_{B}$, $\hat{f}_{C}$, $\hat{f}_{D}$)?
Some information about the experiment:
In [ ]:
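As a starting point, the sketch below simulates only the first source of noise: drawing 10 molecules from the population. It assumes the draw is multinomial and that PCR and sequencing then report the tube's contents perfectly, which the experiment details may well contradict. The trial count and variable names are illustrative.

```python
import numpy as np

true_freqs = np.array([0.5, 0.2, 0.2, 0.1])   # species A, B, C, D
n_molecules = 10
n_trials = 100_000

rng = np.random.default_rng(0)

# Each row is one simulated tube: counts of A, B, C, D among 10 molecules.
counts = rng.multinomial(n_molecules, true_freqs, size=n_trials)
estimates = counts / n_molecules

sim_mean = estimates.mean(axis=0)
sim_std = estimates.std(axis=0)

# Analytic standard deviation for a single multinomial category.
expected_std = np.sqrt(true_freqs * (1 - true_freqs) / n_molecules)

print(sim_mean, sim_std, expected_std)
```

Under these assumptions the estimates are unbiased, but with only 10 molecules their spread is large relative to the rarer frequencies. Any amplification or sequencing noise would add to this baseline.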