Homework 6

CHE 116: Numerical Methods and Statistics

Version 1.0 (2/17/2016)


1. Misc Distribution Problems (10 Points)

Answer symbolically first, indicating what equations your Python program is using, and then compute the answer in Python. If not specified, say which distribution you're assuming.

  1. [1] The time between traffic tickets is exponentially distributed. Based on past experience, you receive a traffic ticket about every 3 years. What's the probability of having one traffic ticket within 12 months? For two bonus points, what about having two traffic tickets within 12 months? Use scipy stats.

  2. [2] You see two deer per day on average. How many days must pass before you have a 99% chance of having seen a deer? Answer in days, hours, and minutes.

  3. [1] The expected score on a test is 90% with a standard deviation of 15%. You cannot receive more than 100% on this test. What's the probability of failing (< 60%)?

  4. [2] Using the above parameters, what's the probability of getting an A (93%-100%)?

  5. [4] Using the definition of expected value, write a for loop that computes the expected value of a binomial distribution with $N = 10$ and $p = 0.3$. Do not use scipy stats. Compare with the formula $E[x] = pN$ for the binomial distribution.
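
As an illustration of the approach in problem 5, the following is a minimal sketch, not a reference solution. It assumes the definition $E[x] = \sum_x x\,P(x)$ with the binomial probability mass function, and the function name `binomial_expected_value` is made up for this example. It uses only the standard library (`math.comb`, which requires Python 3.8 or later).

```python
from math import comb


def binomial_expected_value(N, p):
    """Sum x * P(x) over x = 0..N, where P(x) is the binomial pmf."""
    expected = 0.0
    for x in range(N + 1):
        # Binomial pmf: (N choose x) * p^x * (1 - p)^(N - x)
        prob = comb(N, x) * p**x * (1 - p) ** (N - x)
        expected += x * prob
    return expected


N, p = 10, 0.3
# Print the loop result next to the closed-form value p * N for comparison
print(binomial_expected_value(N, p), p * N)
```

The two printed numbers should agree up to floating-point rounding.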

2. CLT Theory (4 Points)

Indicate if the CLT applies with yes or no. If no, state why.

  1. You measure the density of a solution 50 times and take the average
  2. You sum the number of students who attended the last 20 lectures
  3. Flip a coin 25 times and consider a heads 0 and a tails 1 and take the average
  4. Your final grade, which is the average of 25 homeworks, tests, and a project

3. Confidence Intervals (12 Points)

Report the given confidence interval for error in the mean using the data in the next cell, and describe in words what the confidence interval is for each example.

  1. 80% Double
  2. 99% Upper (a value such that the mean lies above that value 99% of the time)
  3. 95% Double
  4. Redo part 3 with a known standard deviation of 2

In [4]:
data_3_1 = [93.14, 94.66, 102.1, 79.98, 96.85, 106.79, 101.92, 91.99, 97.22, 99.1, 88.7, 123.66, 99.7, 115.03, 99.28, 114.59, 102.25, 88.4, 111.06, 75.19, 107.32, 81.21, 100.49, 109.04, 105.09, 96.17, 78.13, 98.37, 104.47, 95.41]
data_3_2 = [2.24, 3.86, 2.19, 1.5, 2.34, 2.55, 1.8, 3.99, 2.64, 3.8]
data_3_3 = [53.43, 50.49, 52.55, 51.73]
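
For the case where the standard deviation is known (part 4), one possible sketch of a double-sided interval based on the standard normal distribution is shown here. It assumes that part 4 refers to `data_3_3`, and the helper name `z_interval` is hypothetical. It uses only the standard library (`statistics.NormalDist`). When the standard deviation is instead estimated from a small sample, a $t$-distribution quantile (for example from `scipy.stats.t.ppf`) would normally replace the normal quantile.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_interval(data, sigma, confidence=0.95):
    """Double-sided interval for the mean with a known standard deviation."""
    n = len(data)
    center = mean(data)
    # Quantile that leaves (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return center - half_width, center + half_width


# Assumption: part 4 reuses the part 3 data set
data_3_3 = [53.43, 50.49, 52.55, 51.73]
low, high = z_interval(data_3_3, sigma=2, confidence=0.95)
print(f"{low:.2f} to {high:.2f}")
```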

4. Sample Statistics (17 Points)

Answer the following questions using the data given in the next cell.

  1. [1] What is the sample correlation between $X$ and $Y$?
  2. [5] Write your own method to compute sample covariance using a for loop. Use the code in the second cell below as a starting point. Do not use any numpy methods except to check your answer. Hint: You will need to use two loops.
  3. [5 + 1] What is the median of $Y$? Use Python and NumPy. Be careful if you use the sort method, since it will permanently alter your $Y$ list. Use y2 = Y[:] to copy the list.

In [2]:
X = [1.6, 0.4, -1.05, -0.08, 0.99, -1.89, 0.29, 0.71, -0.47, 1.15]
Y = [3.59, 1.49, -2.57, -0.0, 2.0, -3.48, 0.14, 1.38, -1.48, 2.6]

In [3]:
for xi, yi in zip(X, Y): 
    print(xi, yi)


1.6 3.59
0.4 1.49
-1.05 -2.57
-0.08 -0.0
0.99 2.0
-1.89 -3.48
0.29 0.14
0.71 1.38
-0.47 -1.48
1.15 2.6
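
Building on the `zip` loop above, one way the two-loop hint for sample covariance could be realized is sketched here. It is an illustration under assumptions, not the official solution: it uses the $N - 1$ normalization for sample covariance, and the function names `sample_mean` and `sample_covariance` are invented for this example. The first kind of loop computes the means; the second accumulates the products of deviations.

```python
def sample_mean(values):
    """First loop: accumulate the sum, then divide by the count."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def sample_covariance(xs, ys):
    """Second loop: sum (x_i - x_bar) * (y_i - y_bar), divide by N - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar = sample_mean(xs)
    y_bar = sample_mean(ys)
    total = 0.0
    for xi, yi in zip(xs, ys):
        total += (xi - x_bar) * (yi - y_bar)
    return total / (len(xs) - 1)


X = [1.6, 0.4, -1.05, -0.08, 0.99, -1.89, 0.29, 0.71, -0.47, 1.15]
Y = [3.59, 1.49, -2.57, -0.0, 2.0, -3.48, 0.14, 1.38, -1.48, 2.6]
print(sample_covariance(X, Y))
```

The result can be checked against the off-diagonal entry of `numpy.cov(X, Y)`, which also uses the $N - 1$ normalization by default.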