1. Identifying Distributions (5 Points)

State whether each distribution is continuous or discrete, give its support, and give an example of a random variable that follows the distribution. 1 bonus point per problem for identifying a random variable related to chemical engineering.

  1. Geometric distribution
  2. Poisson Distribution
  3. Binomial Distribution
  4. Exponential Distribution
  5. Bernoulli Distribution


1.1 Discrete, $\{1, 2, 3, \ldots\}$, the number of collisions until two molecules react

1.2 Discrete, $\{0, 1, \ldots, N\}$ or $\{0, 1, 2, \ldots\}$, the number of accidents at a refinery

1.3 Discrete, $\{0, 1, \ldots, N\}$, the number of failures of a metal at a certain temperature

1.4 Continuous, $(0,\infty)$, the amount of time between incoming crude oil shipments

1.5 Discrete, $\{0,1\}$, the open/close state of a valve
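
As a quick numerical check on the discrete supports above, the sketch below sums the geometric and Poisson probability mass functions over a truncated range of their supports; each total should be very close to 1. The parameter values p = 0.3 and mu = 2.0 and the truncation points are arbitrary choices for illustration, not part of the problem.

```python
import math

p = 0.3    # assumed success probability (illustrative only)
mu = 2.0   # assumed Poisson rate (illustrative only)

# Geometric: the support is the integers 1, 2, 3, ...
geo_total = sum((1 - p) ** (n - 1) * p for n in range(1, 201))

# Poisson: the support is the integers 0, 1, 2, ...
poisson_total = 0.0
term = math.exp(-mu)          # P(0)
for n in range(0, 101):
    poisson_total += term
    term *= mu / (n + 1)      # P(n + 1) = P(n) * mu / (n + 1)

print(geo_total)
print(poisson_total)
```

Starting the geometric sum at 0 instead of 1 would add a spurious term, which is one way to see why its support begins at 1.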

2. Slicing Lists (6 Points)

Using this sentence: "The quick brown fox jumps over the lazy dog", create slices of the string to answer the following questions. Answer in Python.

  1. What is the first element?
  2. What is the sentence without the last element?
  3. What are the first 5 characters?
  4. What is the first half of the sentence?
  5. What is the second half?
  6. What is every 3rd character, starting from the 4th?

2.1 Answer

In [2]:
string='The quick brown fox jumps over the lazy dog'
print 'The first element is "{}"'.format(string[0])

The first element is "T"

2.2 Answer

In [3]:
print 'The sentence without the last element is: ', string[:-1]

The sentence without the last element is:  The quick brown fox jumps over the lazy do

2.3 Answer

In [4]:
print string[0:5]

The q

2.4 Answer

In [5]:
# length of the sentence; integer division gives the midpoint index
s = len(string)
print '{:s}'.format(string[0:(s // 2)])

The quick brown fox j

2.5 Answer

In [6]:
# length of the sentence; integer division gives the midpoint index
s = len(string)
print string[(s // 2):]

umps over the lazy dog

2.6 Answer

In [4]:
print string[3::3]

 i o xusv ea g
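
The slicing answers above can be collected into one short sketch. It uses print() as a function and // for integer division so that it also runs on Python 3; the variable names mid, first_half and second_half are introduced here for illustration.

```python
string = 'The quick brown fox jumps over the lazy dog'
mid = len(string) // 2      # 43 characters, so mid is 21

first_half = string[:mid]
second_half = string[mid:]

# The two halves always rebuild the original string
print(first_half + second_half == string)

# Negative indices count from the end of the string
print(string[-1])           # last character
print(string[-3:])          # last three characters

# A slice is start:stop:step; every 3rd character from the 4th
print(string[3::3])
```

Because the stop index is excluded and the start index is included, string[:mid] and string[mid:] never overlap and never drop a character.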

3. Numpy Arrays and Arithmetic (12 Points)

  1. Create a numpy array containing a set of points between 0 and 30 spaced apart by 0.02. This should be done with a numpy function.
  2. Using a for loop, sum all the elements in the array and print the sum. Check your answer using the numpy sum function.
  3. Using the array from Question 3.1, create an array representing a $\lambda = 3$ exponential distribution on the interval $[0,30)$. A factorial function which works on arrays can be imported by calling from scipy.special import factorial. Note: lambda is a reserved word in Python, meaning you cannot use it as a variable name. You can tell that a word is reserved because it appears in a different color/font than other variables in IPython. Also, when applying mathematical functions to numpy arrays, you must use the numpy versions, for example numpy.cos instead of math.cos.
  4. Using a for loop and the array from Question 3.3, compute the expected value of the exponential distribution. To evaluate the integral, use $\int f(x)\,dx \approx \sum_i f(x_i) \Delta x$, where $\Delta x$ is the spacing between your points and $f(x_i)$ is the function evaluated at point $x_i$. As you know, it should be $1/3$.
  5. Using both the arrays from Question 3.1 and 3.3 and the sum function from numpy, compute the expected value without a for loop.
  6. Using your equation from HW 2, problem 5 (see key), compute the expected value and variance of the sum of two dice.

3.1 Answer

In [8]:
import numpy as np
x = np.arange(0,30, 0.02)
print x

[  0.00000000e+00   2.00000000e-02   4.00000000e-02 ...,   2.99400000e+01
   2.99600000e+01   2.99800000e+01]

3.2 Answer

In [9]:
xsum = 0
for element in x:
    xsum += element
print xsum, np.sum(x)

22485.0 22485.0

3.3 Answer

In [10]:
from scipy.special import factorial
lamb = 3.0
y = np.exp(-lamb * x) * lamb
print y

[  3.00000000e+00   2.82529360e+00   2.66076131e+00 ...,   2.94300426e-39
   2.77161703e-39   2.61021062e-39]

3.4 Answer

$$E[x] = \int_0^{30} \underbrace{x P(x)}_{f(x)} \,dx$$

$$P(x) = \lambda e^{-\lambda x}$$

$$E[x] \approx \sum_i^N \lambda e^{-\lambda x_i} x_i \Delta x$$

In [11]:
expected_value = 0
for element in x:
    expected_value += element * np.exp(-lamb * element) * lamb * 0.02
print expected_value, 1 / 3.

0.333233351331 0.333333333333

3.5 Answer

In [12]:
expected_value = np.sum(x * y * 0.02)
print expected_value, 1. / 3

0.333233351331 0.333333333333
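
The computed value 0.333233 differs from 1/3 in the fourth decimal place. That gap comes from approximating the integral with a finite sum, and it shrinks as the spacing gets smaller. The sketch below illustrates this; the helper name riemann_expected_value and the finer spacing of 0.002 are choices made here for illustration.

```python
import numpy as np

def riemann_expected_value(lamb, dx, upper=30.0):
    """Left Riemann sum for E[x] of an exponential distribution on [0, upper)."""
    x = np.arange(0, upper, dx)
    return np.sum(x * lamb * np.exp(-lamb * x) * dx)

coarse = riemann_expected_value(3.0, 0.02)    # spacing used in the answers
fine = riemann_expected_value(3.0, 0.002)     # ten times finer (assumed)

print(coarse)
print(fine)
print(abs(fine - 1 / 3.0) < abs(coarse - 1 / 3.0))
```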

3.6 Answer

In [16]:
#create the sample space
Q = np.arange(2,13)
#probability of a roll
P_n = (6. - abs(Q - 7)) / 36
#sum of n * P_n to get E[n]
expected_value = np.sum(Q * P_n)
#sum of n^2 * P_n to get E[n^2]
expected_value_2 = np.sum(Q**2 * P_n)
#variance via E[n^2] - E[n]^2
variance = expected_value_2 - expected_value ** 2
print 'E[x] = {}, Var(x) = {}'.format(expected_value, variance)

E[x] = 7.0, Var(x) = 5.83333333333
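
As an independent check that does not rely on the formula for P_n, the 36 equally likely outcomes of two dice can be enumerated directly. This is a brute-force sketch using only the standard library, not the method asked for in the problem.

```python
from itertools import product

# All 36 equally likely (die 1, die 2) outcomes, reduced to their sums
sums = [a + b for a, b in product(range(1, 7), repeat=2)]

mean = sum(sums) / 36.0
variance = sum((s - mean) ** 2 for s in sums) / 36.0

print(mean)
print(variance)
```

Both routes should agree: the mean is 7 and the variance is 35/6.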

4. Plotting (4 Points)

  1. Create a numpy array of points and evaluate the exponential distribution on it. Make sure your interval is within the support of the distribution. Plot the distribution.
  2. Do the same for the geometric distribution. Think carefully about whether it should be a line or a point plot and what the sample space of the distribution is.
  3. Plot the binomial distribution with $N=10$, $p=0.25$.
  4. Plot the binomial distribution with $N=150$, $p=0.5$.

4.1 Answer

In [9]:
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn
x = np.linspace(0, 4, 250)
# exponential density with lambda = 2
y = 2.0 * np.exp(-2.0 * x)
plt.plot(x, y)

4.2 Answer

In [10]:
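n = np.arange(1, 9)
p = 0.5  # assumed success probability; the original value is not shown
pgeo = (1 - p)**(n - 1) * p  # geometric pmf
plt.plot(n, pgeo, 'yo-')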

4.3 Answer

In [11]:
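from scipy.special import comb
p = 0.25
N = 10
h = np.arange(0, N + 1)  # possible numbers of successes
L = comb(N, h)
pbi = L * p**h * (1 - p)**(N - h)  # binomial pmf
plt.plot(h, pbi, 'ro-')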

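4.4 Answer

In [15]:
N = 150
p = 0.5
h = np.arange(0, N + 1)  # possible numbers of successes
L = comb(N, h)
pbi = L * p**h * (1 - p)**(N - h)  # binomial pmf
plt.plot(h, pbi, 'go-')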
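
Before trusting a plot of a probability mass function, it can help to confirm that the values sum to 1 and that the mean equals Np. The sketch below does this for both binomial cases using only the standard library; the helper name binomial_pmf is introduced here and is not part of the original answers.

```python
from math import factorial

def binomial_pmf(k, N, p):
    """P(k successes in N trials), using an exact integer binomial coefficient."""
    coeff = factorial(N) // (factorial(k) * factorial(N - k))
    return coeff * p ** k * (1 - p) ** (N - k)

for N, p in [(10, 0.25), (150, 0.5)]:
    pmf = [binomial_pmf(k, N, p) for k in range(N + 1)]
    total = sum(pmf)                                 # should be close to 1
    mean = sum(k * P for k, P in enumerate(pmf))     # should be close to N * p
    print((N, p, total, mean))
```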
