Name | Abbreviation | Mathematical Expression | input | output |
---|---|---|---|---|
Probability Mass Function | PMF | $P(q)$ | element in sample space | probability |
? | | $p(q)$ | ? | must be integrated, outputs probability of interval |
? | ? | $P(x < y)$ | upper bound of interval ($y$) | ? |
? | PPF | $y$ such that ? | probability of interval ($p$) | ? |
[2 points] What are the inputs and outputs of a random variable?
[2 points] If a problem statement gives you $P(X = 3 | Y = 4)$ and $P(Y = 4)$, what equation gives you $P(X = 3, Y = 4)$?
[4 points] If you flip a coin twelve times, how many permutations are possible and how many combinations of heads/tails are possible?
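To make the distinction concrete, here is a minimal sketch that enumerates a smaller case (three flips rather than twelve). It assumes "permutation" means an ordered sequence of flips and "combination" means an unordered count of heads and tails; the variable names are illustrative only.

```python
from itertools import product

n_flips = 3  # deliberately smaller than the twelve flips in the question

# Every ordered sequence of heads (H) and tails (T)
sequences = list(product("HT", repeat=n_flips))

# Ignoring order, an outcome is fully described by its number of heads
head_counts = sorted({seq.count("H") for seq in sequences})

print(len(sequences))  # number of ordered sequences
print(head_counts)     # distinct head counts, order ignored
```

Changing `n_flips` shows how quickly the ordered count grows compared with the unordered one.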
[4 points] Load the `sunspot.month` dataset and histogram the last column (number of sunspots per month). Use this code snippet: `sunspot_data = pydataset.data('sunspot.month').values`. If you get an error when loading `pydataset` that says `No module named 'pydataset'`, then execute this code in a new cell once: `!pip install --user pydataset`.
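If it is unclear whether the package is already installed, a check along the following lines may help before running the install command. This is only a sketch; the helper name `is_installed` is hypothetical and not part of `pydataset`.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None

if is_installed("pydataset"):
    print("pydataset is available; the loading snippet should work")
else:
    print("pydataset is missing; run the pip command in a new cell once")
```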
[4 points] Using numpy, compute the sample mean and sample standard deviation.
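One detail worth checking here: `np.std` divides by $N$ by default, whereas the sample standard deviation divides by $N - 1$, which numpy exposes through the `ddof` argument. The sketch below uses made-up values, not the sunspot data, to show the difference.

```python
import numpy as np

toy = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # made-up values

mean = np.mean(toy)
pop_std = np.std(toy)             # default ddof=0: divides by N
sample_std = np.std(toy, ddof=1)  # ddof=1: divides by N - 1

print(mean, pop_std, sample_std)
```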
[6 points] Convert your data into a list (e.g., `l = list(sunspot_data[:,1])`) and write a `for` loop to compute the sample mean without using numpy.
[4 points] Use the `sort` method to find the median.
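As a reminder of how `sort` behaves, the sketch below uses a short made-up list of odd length. Note that `list.sort()` modifies the list in place and returns `None`, so its result should not be assigned to a variable.

```python
toy = [7, 1, 5, 3, 9]  # made-up values, odd length

toy.sort()  # sorts in place; the return value is None
middle = len(toy) // 2
median = toy[middle]  # middle element of the sorted list

print(toy, median)
```

For a list of even length there is no single middle element, so the two central values would need to be averaged instead.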
[4 points] Load the sleep data. It contains the change in reaction times, in milliseconds, of participants as a function of their sleep deprivation. Create a boxplot of the data where the x-axis is days of sleep deprivation.
[2 points] Based on your plot, does reaction time change as you get less sleep?
[6 points] In homework 1, I asked you to begin collecting data on two random processes. Please list again what your two processes were and compute their mean, median, and standard deviation. If your data does not have numerical values (e.g., colors, clothes), choose a random variable based on your data and compute its mean.