Relevant lectures: 8
In today's discussion, you'll get practice with inference concepts and dive deeper into the work we did in lecture 8.
This discussion will not be turned in. In fact, there is no code for you to write in this discussion; all your answers will be written in the text cells below.
The purpose of this exercise is to think about and communicate your point of view, so please work through these problems together in groups of 2 or 3.
Recall from lecture: J commutes by car every day to UC Berkeley from Beaumont Ave. in Oakland.
He wants to know which lane is best to take.
Specifically, he wants to know: is Lane 4 (the rightmost lane) better than Lane 1 (the leftmost lane)?
Our dataset contains the flow in each lane near Beaumont Ave. for every workday, measured over the 60-minute interval from 7 to 8am.
Here's a plot of the flows from 7-8am over the time period in our data:
And here are the distributions of the flows:
Write your answer here, replacing this text.
Write your answer here, replacing this text.
Write your answer here, replacing this text.
To tell whether the difference between the lanes is significant, we bootstrapped the mean difference in flow between Lanes 1 and 4.
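You do not need any code to answer the questions here, but as a reminder of the mechanics, a bootstrap of the mean difference could be sketched roughly as follows. This is only an illustration under stated assumptions: the flow values are made-up placeholders (not the lecture data), the difference is taken as Lane 1 minus Lane 4, and days are resampled as pairs. The function name `bootstrap_mean_differences` is hypothetical and is not the code used in lecture.

```python
import random
import statistics


def bootstrap_mean_differences(lane1, lane4, n_resamples=5000, seed=0):
    """Resample workdays with replacement and return the mean
    (Lane 1 - Lane 4) flow difference for each resample."""
    if len(lane1) != len(lane4):
        raise ValueError("expected one flow per lane for each workday")
    rng = random.Random(seed)
    # Assumption: the two lanes are paired by day, so we resample daily differences.
    daily_diffs = [a - b for a, b in zip(lane1, lane4)]
    n = len(daily_diffs)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(daily_diffs, k=n)  # draw n days with replacement
        means.append(statistics.fmean(resample))
    return means


# Made-up placeholder flows for 8 workdays (NOT the lecture dataset).
lane1_flows = [1510, 1495, 1532, 1478, 1520, 1501, 1489, 1515]
lane4_flows = [1488, 1502, 1506, 1465, 1497, 1510, 1470, 1492]

boot_diffs = bootstrap_mean_differences(lane1_flows, lane4_flows)
print(len(boot_diffs), "bootstrap mean differences")
```

A histogram of `boot_diffs` would play the same role as the distribution discussed in this section.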
This is the distribution we got:
According to this distribution, estimate the probability that we get a flow difference of 0 or more extreme if the lane flows fluctuate only by chance.
Why can we look at this distribution and find a probability?
Is our probability a p-value?
Finally, why did we look at the probability of getting 0 or more extreme rather than getting 20 (our previously computed mean difference) or more extreme?
Write your answer here, replacing this text.
Write your answer here, replacing this text.
Write your answer here, replacing this text.
One good way to check whether you understand something is to tweak the problem and see if you can still figure it out. Let's do that!
Suppose we didn't bootstrap the differences and instead bootstrapped the mean flow for Lane 1 and Lane 4 separately. Can we still answer our original question? If so, how? If not, explain why not.
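To make the tweaked setup concrete, here is a minimal sketch of bootstrapping each lane's mean on its own. It only shows the mechanics and leaves the interpretation to you. The flows are again made-up placeholders, the 95% level is an assumption, and `bootstrap_means` and `percentile_interval` are hypothetical helper names using a simple nearest-rank percentile.

```python
import random
import statistics


def bootstrap_means(values, n_resamples=5000, seed=0):
    """Return the mean of each bootstrap resample of `values`."""
    rng = random.Random(seed)
    n = len(values)
    return [statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)]


def percentile_interval(samples, lower_pct=2.5, upper_pct=97.5):
    """Nearest-rank percentile interval of a list of numbers."""
    ordered = sorted(samples)

    def at(pct):
        # Index of the value sitting at the requested percentile.
        return ordered[round(pct / 100 * (len(ordered) - 1))]

    return at(lower_pct), at(upper_pct)


# Made-up placeholder flows (NOT the lecture dataset).
lane1_flows = [1510, 1495, 1532, 1478, 1520, 1501, 1489, 1515]
lane4_flows = [1488, 1502, 1506, 1465, 1497, 1510, 1470, 1492]

# Each lane is resampled independently, so the day-by-day pairing is not used.
lane1_interval = percentile_interval(bootstrap_means(lane1_flows, seed=1))
lane4_interval = percentile_interval(bootstrap_means(lane4_flows, seed=2))
print("Lane 1 mean flow interval:", lane1_interval)
print("Lane 4 mean flow interval:", lane4_interval)
```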
Write your answer here, replacing this text.
Rephrase the question and the null and alternative hypotheses so that you would construct a one-sided confidence interval instead of the two-sided one above.
Then, use the plot above to estimate your one-sided confidence interval. How do the bounds of this interval compare with your previous bounds?
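If it helps to see how the two kinds of bounds are read off a bootstrap distribution, the sketch below computes both from a list of bootstrap mean differences. Everything numeric here is an assumption for illustration: the bootstrap values are simulated stand-ins (centered at 20 with a made-up spread), the confidence level is taken to be 95%, and the one-sided interval is written with a lower bound only. Which side you bound depends on how you phrase your alternative hypothesis.

```python
import random


def percentile(samples, pct):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(samples)
    return ordered[round(pct / 100 * (len(ordered) - 1))]


def two_sided_interval(samples, level=0.95):
    """Cut an equal share of the leftover probability from each tail."""
    tail = (1 - level) / 2 * 100
    return percentile(samples, tail), percentile(samples, 100 - tail)


def one_sided_lower_bound(samples, level=0.95):
    """Cut all of the leftover probability from the lower tail only."""
    return percentile(samples, (1 - level) * 100)


# Simulated stand-in for a bootstrap distribution (NOT the lecture result).
rng = random.Random(0)
boot_diffs = [rng.gauss(20, 8) for _ in range(5000)]

two_sided = two_sided_interval(boot_diffs)
lower_only = one_sided_lower_bound(boot_diffs)
print("two-sided 95% interval:", two_sided)
print("one-sided 95% interval: [", lower_only, ", infinity )")
```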
Write your answer here, replacing this text.
Write your answer here, replacing this text.
Write your answer here, replacing this text.