Day 5: Normal Distribution I

https://www.hackerrank.com/challenges/s10-normal-distribution-1

Objective

In this challenge, we learn about normal distributions. Check out the Tutorial tab for learning materials!

Task

In a certain plant, the time taken to assemble a car is a random variable, X, having a normal distribution with a mean of 20 hours and a standard deviation of 2 hours. What is the probability that a car can be assembled at this plant in:

Question 1:

Less than 19.5 hours?

Question 2:

Between 20 and 22 hours?

Input Format

There are 3 lines of input (shown below):

20 2
19.5
20 22

The first line contains 2 space-separated values denoting the respective mean and standard deviation for X. The second line contains the number associated with question 1. The third line contains 2 space-separated values describing the respective lower and upper range boundaries for question 2.

If you do not wish to read this information from stdin, you can hard-code it into your program.
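As a minimal sketch of the stdin route, the three lines described above could be parsed as follows. The helper name `parse_input` and its return layout are assumptions made for illustration; they are not part of the challenge.

```python
def parse_input(text):
    """Parse the three input lines into (mean, std_dev, q1_limit, (lower, upper))."""
    lines = text.strip().splitlines()
    # Line 1: mean and standard deviation of X
    mean, std_dev = map(float, lines[0].split())
    # Line 2: the upper limit for question 1
    q1_limit = float(lines[1])
    # Line 3: lower and upper boundaries for question 2
    lower, upper = map(float, lines[2].split())
    return mean, std_dev, q1_limit, (lower, upper)


# Hard-coded copy of the sample input; in a submission this string
# would typically come from sys.stdin.read() instead.
sample = "20 2\n19.5\n20 22"
print(parse_input(sample))
```

Keeping the parser separate from the probability code makes it easy to switch between stdin and hard-coded values.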


In [7]:
import math
import matplotlib.pyplot as plt
%matplotlib inline

In [8]:
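def pdf(x, m, variance):
    """Probability density function of a normal distribution with mean m."""
    sigma = math.sqrt(variance)
    # The exponent is -(x - m)^2 / (2 * sigma^2), and sigma^2 is the variance itself.
    return 1 / (sigma * math.sqrt(2 * math.pi)) * math.exp(-((x - m) ** 2) / (2 * variance))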

In [26]:
pdf(20, 20, 4)


Out[26]:
0.19947114020071635
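One way to sanity-check a density function is to confirm that the area under it is close to 1. The sketch below is self-contained: `normal_pdf` and `integrate` are illustrative names, and the trapezoid rule over 10 standard deviations either side of the mean is an assumed, simple choice of numerical method.

```python
import math


def normal_pdf(x, mean, variance):
    """Normal density, parameterised by mean and variance."""
    sigma = math.sqrt(variance)
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / (sigma * math.sqrt(2 * math.pi))


def integrate(f, a, b, steps=10000):
    """Approximate the integral of f over [a, b] with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h


# Mean 20 and variance 4 (standard deviation 2), integrated from 0 to 40.
area = integrate(lambda x: normal_pdf(x, 20, 4), 0, 40)
print(area)  # should be very close to 1
```

A check at the mean alone cannot reveal a mistake in the exponent, because the exponent is zero there; integrating over the whole range can.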

Example


In [28]:
N = range(50)
mean = 15
variance = 3
plt.plot(N, [pdf(i, mean, variance) for i in N])


Out[28]:
[<matplotlib.lines.Line2D at 0x105bc3dd8>]

In [79]:
def cumulative(x, m, stdv):
    return 0.5 * (1 + math.erf((x-m)/ (stdv * math.sqrt(2)) ))
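For reference, the function above follows the standard closed form of the normal cumulative distribution function in terms of the error function. The symbols $\mu$ and $\sigma$ are introduced here and correspond to the arguments `m` and `stdv`:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}},
\qquad
F(x) = \int_{-\infty}^{x} f(t)\,dt
     = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right],
\qquad
\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_{0}^{z} e^{-t^{2}}\,dt .
```

The probability of falling between two values $a < b$ is then $F(b) - F(a)$.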

In [56]:
[(i, cumulative(i, 20.0, 1.5)) for i in range(1, 30)]


Out[56]:
[(1, 0.0),
 (2, 0.0),
 (3, 0.0),
 (4, 0.0),
 (5, 0.0),
 (6, 0.0),
 (7, 0.0),
 (8, 6.106226635438361e-16),
 (9, 1.1224354778960333e-13),
 (10, 1.3083922834056239e-11),
 (11, 9.865876449133282e-10),
 (12, 4.8213033676525185e-08),
 (13, 1.5306267365233772e-06),
 (14, 3.167124183311998e-05),
 (15, 0.0004290603331968401),
 (16, 0.003830380567589775),
 (17, 0.022750131948179264),
 (18, 0.09121121972586788),
 (19, 0.2524925375469229),
 (20, 0.5),
 (21, 0.7475074624530771),
 (22, 0.9087887802741321),
 (23, 0.9772498680518207),
 (24, 0.9961696194324102),
 (25, 0.9995709396668031),
 (26, 0.9999683287581669),
 (27, 0.9999984693732635),
 (28, 0.9999999517869663),
 (29, 0.9999999990134123)]

In [58]:
mean = 20
variance = 4  # standard deviation of 2 hours, squared

Question 1

Less than 19.5 hours?


In [80]:
x = 19.5
q1_answer = cumulative(x, mean, math.sqrt(variance))
q1_answer


Out[80]:
0.4012936743170763

Question 2

Between 20 and 22 hours?


In [81]:
lower_hour, upper_hour = 20.0, 22.0
q2_answer = cumulative(upper_hour, mean, math.sqrt(variance)) - cumulative(lower_hour, mean, math.sqrt(variance))
q2_answer


Out[81]:
0.34134474606854304
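Both answers can be cross-checked against the standard library, as a sketch assuming Python 3.8 or later, where `statistics.NormalDist` is available. Printing to three decimal places is an assumption about the expected output format; it is not stated in the task text above.

```python
from statistics import NormalDist

# Assembly time: mean 20 hours, standard deviation 2 hours.
assembly_time = NormalDist(mu=20, sigma=2)

# Question 1: P(X < 19.5)
q1 = assembly_time.cdf(19.5)

# Question 2: P(20 <= X <= 22) as a difference of two CDF values
q2 = assembly_time.cdf(22) - assembly_time.cdf(20)

print(f"{q1:.3f}")  # 0.401
print(f"{q2:.3f}")  # 0.341
```

The values agree with `q1_answer` and `q2_answer` computed from the `erf`-based `cumulative` function.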
