Unit 3: Simulation

Lesson 18: Non-uniform distributions

Notebook Authors

(fill in your two names here)

Facilitator: (fill in name)
Spokesperson: (fill in name)
Process Analyst: (fill in name)
Quality Control: (fill in name)

If there are only three people in your group, have one person serve as both spokesperson and process analyst for the rest of this activity.

At the end of this Lesson, you will be asked to record how long each Model required for your team. The Facilitator should keep track of time for your team.

Computational Focus: Non-uniform distributions

As we saw in the previous lesson, a uniform random number distribution can be easily generated and shifted appropriately using Python. Some applications, however, may require a series of random numbers that are not distributed uniformly. For example, the probability for a system to be at a particular energy level as a function of temperature is given by an exponential distribution, the velocities of an ideal gas at a particular temperature follow a Gaussian (normal) distribution, and many environmental, behavioral, and genetic data sets are modeled using these and other non-uniform distributions. This lesson will use two different procedures for creating any desired distribution of random numbers using a uniform random-number generator.

Model 1: Random seed

A random number function is really an equation that takes a number as input (referred to as the seed) and produces a new number as output. When you do not specify the seed (as we have not so far), Python sets it for you, using the current system time or, where the operating system provides one, a built-in source of randomness.

Let's look at this briefly. Run the code below.


In [ ]:
import time

In [ ]:
time.time()

1. Describe the results.

Now run time.time() again below.


In [ ]:

2. Describe the results and compare them to the first time.time() call.

Read the info on the time module here: http://www.tutorialspoint.com/python3/python_date_time.htm

3. Explain the output of the time.time() function calls (what is that number, how and why are the results of the two calls different, etc.).

Just in case you ever want to know what time it is, computers can give you a more human-readable format (and if you're ever really interested, Python also has the datetime module, which has a lot of super useful tools). Run the code below.


In [ ]:
## gets the time, still not very human readable
time.localtime()

In [ ]:
## formats the time nicely
time.asctime(time.localtime())
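
For comparison, here is a minimal sketch of the same idea using the datetime module mentioned above. The variable names and the format string are illustrative choices, not part of the lesson.

```python
from datetime import datetime

# current local date and time as a datetime object
now = datetime.now()

# format it as "YYYY-MM-DD HH:MM:SS"
stamp = now.strftime("%Y-%m-%d %H:%M:%S")
print(stamp)
```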

But we digress. Back to random numbers...

When you do not set a seed yourself, the starting seed for Python's random number generator is set by the computer (based on the computer time). Then, with each subsequent call for a new random number, the result of the previous call is used as the starting point for the next number in the sequence.
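
The idea of each number feeding the next can be illustrated with a toy generator. This is only a sketch: the function name toy_random_series and the constants are illustrative choices, and Python's own random module uses a different, more elaborate algorithm (the Mersenne Twister).

```python
# Toy linear congruential generator, for illustration only.
# This is NOT the algorithm that Python's random module uses.
A = 1664525        # multiplier (a commonly quoted textbook constant)
C = 1013904223     # increment
M = 2**32          # modulus

def toy_random_series(seed, n):
    """Return n pseudo-random floats in [0, 1), starting from seed."""
    state = seed
    series = []
    for _ in range(n):
        # the previous state is the input that produces the next state
        state = (A * state + C) % M
        series.append(state / M)
    return series

# compare the three printed lists
print(toy_random_series(1, 3))
print(toy_random_series(1, 3))
print(toy_random_series(2, 3))
```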

4. If the next random number is generated using the one before it, why isn't the series the same every time? (answer the question, then write and run some code to demonstrate what you mean)


In [ ]:
## series of random numbers doesn't repeat

Often this is perfectly fine. But sometimes we want to be able to reproduce our random sample so that we can repeat our analysis, show someone else our code and have the results be identical, etc.

If you want a reproducible sequence of random numbers, you can specifically set the seed for the random number sequence using the seed() function.

Type each of the following lines in its own cell (import random first if you have not already done so in this notebook):

random.seed(100)
random.random()
random.random()
random.seed(100)
random.random()

5. Explain the difference between the 1st and 2nd call to random.random().

6. Explain the difference between the 3rd call to random.random() and the 1st two calls to random.random().

7. Describe one advantage to a "scientific programmer" of setting the same seed each time some code is run.

8. Describe one reason that you think a "scientific programmer" would not want to set the same seed each time some code is run.
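
The seeding behavior explored in this Model can be summarized in a short sketch. The seed value 42 and the variable names are arbitrary choices for illustration.

```python
import random

random.seed(42)      # fix the starting point of the sequence
first_run = [random.random() for _ in range(3)]

random.seed(42)      # reset to the same starting point
second_run = [random.random() for _ in range(3)]

print(first_run)
print(second_run)
print(first_run == second_run)   # True
```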

Model 2: Transforming distributions

preliminaries

So far we have made uniform random number distributions and shifted them. Before we start transforming them, use the code cells below (add more as necessary) to write and run code that:

  • makes 1000 uniform random numbers
  • plots a histogram of those numbers
  • shifts those numbers to a different interval
  • plots a histogram of the shifted numbers

Label the axes on your graphs and provide either a title or a caption.
Also make sure to comment your code heavily to explain what it does!


In [ ]:


In [ ]:

9. Explain the thought process that you used to program the interval shift.
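
One possible way to shift an interval is sketched below, using only the standard library. The target interval (5 to 15), the number of bins, and all variable names are hypothetical choices; the bin counts are printed as text rather than plotted.

```python
import random

low, high = 5.0, 15.0    # hypothetical target interval
n_bins = 5               # hypothetical number of histogram bins

# 1000 uniform random numbers in [0, 1)
uniform = [random.random() for _ in range(1000)]

# stretch by the interval width, then slide by the lower bound
shifted = [low + (high - low) * x for x in uniform]

# count how many shifted values fall into each equal-width bin
width = (high - low) / n_bins
counts = [0] * n_bins
for value in shifted:
    index = min(int((value - low) / width), n_bins - 1)
    counts[index] += 1

for i, count in enumerate(counts):
    left = low + i * width
    print(f"{left:5.1f} to {left + width:5.1f}: {count}")
```

The shifted list can also be passed to a plotting function (for example a histogram function from a plotting library) to produce the labeled graphs requested above.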

transformations

If we want to simulate a system that is described by exponential decay (bottom right graph in intro) such as the half-life of a radioactive isotope or the half-life of a drug in an organism, we have to transform our uniform distribution to an exponential one.
One simple way to do this is to use the transformation function below:
$$ y = -\ln(x) $$
where $x$ is our random number between 0 and 1.
(Note: we'll see other ways to get the same result soon)
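
A short derivation shows why this transformation gives an exponential distribution, assuming $x$ is uniformly distributed between 0 and 1. For any $t \ge 0$,
$$ P(y \le t) = P(-\ln(x) \le t) = P(x \ge e^{-t}) = 1 - e^{-t} $$
which is the cumulative distribution of an exponential distribution with rate 1. Differentiating with respect to $t$ gives the probability density
$$ p(t) = e^{-t} $$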

Write and run code below that:

  • makes 1000 uniform random numbers
  • plots a histogram of those numbers
  • transforms those numbers to an exponential distribution
  • plots a histogram of the transformed numbers
  • makes a second histogram with more numbers and more bins

Label the axes on your graphs and provide either a title or a caption.
Also make sure to comment your code heavily to explain what it does!


In [ ]:


In [ ]:

10. Explain the thought process that you used to program the transformation.
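
A minimal sketch of the transformation is shown below, using only the standard library. The seed, sample size, and variable names are arbitrary choices. One detail is an assumption about how to handle the edge case: random.random() can return exactly 0, so the sketch uses 1 - random.random() to avoid taking the log of 0.

```python
import math
import random

random.seed(0)    # arbitrary seed so the sketch is repeatable

# 1 - random.random() lies in (0, 1], which avoids math.log(0)
uniform = [1.0 - random.random() for _ in range(10000)]

# apply y = -ln(x) to every uniform number
exponential = [-math.log(x) for x in uniform]

# for this transformation the sample mean should be close to 1
sample_mean = sum(exponential) / len(exponential)
print(sample_mean)
```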

Model 3: Built-in distribution methods

Don't get mad, but Python has several very useful collections of random number generating and distribution functions either built in or in libraries! Here is a list for reference:

Write and run code below that:

  • for 2 different normal (Gaussian) distributions (one "standard" normal and one with a different $\mu$ and $\sigma$)
    • makes 1000 random numbers using that distribution
    • plots a histogram of those numbers
    • makes at least 10000 random numbers using that distribution
    • plots a histogram of the increased number of numbers
    • plots a histogram of the previous with more bins
  • for 2 different, other distributions
    • same list as above

Also, include a markdown cell for each distribution that explains it and includes at least one reference. (You should have 3 explanations: 1 for the normal distribution and 2 of your choice.)

Note: to make a histogram, make sure that you generate actual random values from anything fancy that you use (especially from scipy.stats, for example with the .rvs() method), not the pdf. If you don't know what we mean, that's OK; just make sure you get values that you can use for a histogram, not a function that plots as a line.

Label the axes on your graphs and provide either a title or a caption.
Also make sure to comment your code heavily to explain what it does!
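
As a starting point, here is a minimal sketch that draws samples using functions from Python's standard random module. The seed, the sample size, and the parameter values ($\mu = 10$, $\sigma = 2.5$, rate 1) are arbitrary choices for illustration; other libraries offer many more distributions.

```python
import random
import statistics

random.seed(1)    # arbitrary seed so the sketch is repeatable
n = 10000

# "standard" normal: mu = 0, sigma = 1
standard_normal = [random.gauss(0.0, 1.0) for _ in range(n)]

# normal with a different (hypothetical) mu and sigma
shifted_normal = [random.gauss(10.0, 2.5) for _ in range(n)]

# exponential with rate parameter 1
exponential = [random.expovariate(1.0) for _ in range(n)]

samples = {
    "standard normal": standard_normal,
    "shifted normal": shifted_normal,
    "exponential": exponential,
}
for name, sample in samples.items():
    # the sample mean and standard deviation should sit near the chosen parameters
    print(name, statistics.mean(sample), statistics.stdev(sample))
```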


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Temporal Analysis Report

How much time did it require for your team to complete each Model?

Model 1:

Model 2:

Model 3: