You are trying to measure a difference in the $K_{D}$ of two proteins binding to a ligand. From previous experiments, you know that the values of replicate measurements of $K_{D}$ follow a normal distribution with $\sigma = 2\ \mu M$. How many measurements would you need to make to confidently tell the difference between two proteins with $K_{D} = 10 \mu M$ and $K_{D} = 12 \mu M$?



In [ ]:

Goals

Know how to use basic numpy.random functions to sample from distributions
Begin to understand how to write a simulation to probe possible experimental outcomes

Create a new notebook with this cell at the top



In [ ]:

    
%matplotlib inline
import numpy as np
from matplotlib import pyplot as plt

Figure out how to use `np.random.choice` to simulate 1,000 tosses of a fair coin



In [ ]:

np.random uses a "pseudorandom" number generator to simulate choices

String of numbers that has the same statistical properties as random numbers
Numbers are actually generated deterministically

Numbers look random...



In [30]:

    
numbers = np.random.random(100000)
plt.hist(numbers)









    Out[30]:





(array([ 9980., 10096.,  9971., 10020., 10017.,  9930.,  9997.,  9981.,
         9818., 10190.]),
 array([1.25214565e-05, 1.00009200e-01, 2.00005879e-01, 3.00002559e-01,
        3.99999238e-01, 4.99995917e-01, 5.99992596e-01, 6.99989275e-01,
        7.99985954e-01, 8.99982633e-01, 9.99979312e-01]),
 <a list of 10 Patch objects>)

But numbers are actually deterministic...



In [31]:

    
def simple_psuedo_random(current_value,
                         multiplier=13110243,
                         divisor=13132):

    return current_value*multiplier % divisor



In [33]:

    
seed = 10218888

out = []
current = seed
for i in range(1000):
    current = simple_psuedo_random(current)
    out.append(current)

plt.hist(out)









    Out[33]:





(array([105., 105., 107.,  91.,  93.,  90.,  91., 105., 107., 106.]),
 array([  160. ,  1441.2,  2722.4,  4003.6,  5284.8,  6566. ,  7847.2,
         9128.4, 10409.6, 11690.8, 12972. ]),
 <a list of 10 Patch objects>)

python uses the Mersenne Twister to generate pseudorandom numbers

What does the `seed` do?



In [ ]:

    
seed = 1021888

out = []
current = seed
for i in range(1000):
    current = simple_psuedo_random(current)
    out.append(current)

What will we see if I run this cell twice in a row?



In [37]:

    
s1 = np.random.random(10)
print(s1)









    



[0.91054531 0.87130892 0.43747984 0.14454913 0.37558088 0.47869707
 0.05645585 0.59729579 0.02436505 0.67063894]

What will we see if I run this cell twice in a row?



In [54]:

    
np.random.seed(5235412)
s1 = np.random.random(10)
print(s1)









    



[0.44616627 0.29423635 0.44965624 0.40469388 0.35328769 0.76140219
 0.15757785 0.69331684 0.31833615 0.94897036]

A `seed` lets you specify which pseudo-random numbers you will use.

If you use the same seed, you will get identical samples.
If you use a different seed, you will get wildly different samples.

matplotlib.pyplot.hist



In [56]:

    
numbers = np.random.normal(size=10000)
counts, bins, junk = plt.hist(numbers,
                              range(-10,10))

Basic histogram plotting syntax

COUNTS, BIN_EDGES, GRAPHICS_BIT = plt.hist(ARRAY_TO_BIN,BINS_TO_USE)

Figure out how the function works and report back to the class

What the function does
Arguments normal people would care about
What it returns



In [ ]:

    
np.random.normal
np.random.binomial
np.random.uniform
np.random.poisson
np.random.choice
np.random.shuffle

Calculate:

1000 random samples from a normal distribution with a mean of 5 and a standard deviation of 2.
Create a histogram with a bin size of 1.



In [ ]:

Goals

Create a new notebook with this cell at the top

Figure out how to use np.random.choice to simulate 1,000 tosses of a fair coin

np.random uses a "pseudorandom" number generator to simulate choices

Numbers look random...

But numbers are actually deterministic...

python uses the Mersenne Twister to generate pseudorandom numbers

What does the seed do?

What will we see if I run this cell twice in a row?

What will we see if I run this cell twice in a row?

A seed lets you specify which pseudo-random numbers you will use.

matplotlib.pyplot.hist

Basic histogram plotting syntax

Figure out how the function works and report back to the class

Calculate:

Figure out how to use `np.random.choice` to simulate 1,000 tosses of a fair coin

What does the `seed` do?

A `seed` lets you specify which pseudo-random numbers you will use.