Note that our population consists of 4 values (1 to 4), and we are taking samples of size 2 from it.
We have the following cases
If we take the mean of each such sample, the distribution of those sample means is the sampling distribution, and the mean of that distribution equals the population mean.
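To make the cases concrete, here is a minimal sketch that lists every sample of size 2 from the values 1 to 4 and averages the sample means. It assumes sampling with replacement where order matters (like rolling a four-sided die twice), which the notes above do not state explicitly.

```python
from itertools import product

population = [1, 2, 3, 4]

# Assumption: sampling with replacement, so (1, 1) is a valid sample
# and (1, 2) is counted separately from (2, 1): 4 * 4 = 16 samples.
samples = list(product(population, repeat=2))
sample_means = [sum(sample) / len(sample) for sample in samples]

population_mean = sum(population) / len(population)
mean_of_sample_means = sum(sample_means) / len(sample_means)

print(len(samples))            # number of possible samples
print(population_mean)         # mean of the population
print(mean_of_sample_means)    # mean of the sampling distribution
```

Both means come out the same, which is the point of the comparison.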
For any population distribution, if we draw many samples and plot the distribution of their means, the result will be approximately normal, provided the sample size is large enough (this is the central limit theorem).
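A quick simulation can illustrate this. The sketch below uses a made-up, deliberately skewed population (not from these notes), a sample size of 30, and 2000 samples; all three numbers are assumptions chosen only for illustration. It checks roughly how many sample means fall within one SD of the center, which would be about 68% for a normal distribution.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population, chosen only for illustration.
population = [1, 1, 1, 2, 2, 3, 10]
sample_size = 30
num_samples = 2000

# Draw many samples with replacement and record each sample's mean.
sample_means = [
    statistics.mean(random.choices(population, k=sample_size))
    for _ in range(num_samples)
]

center = statistics.mean(sample_means)
spread = statistics.pstdev(sample_means)

# Fraction of sample means within one SD of the center.
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / num_samples

print(round(center, 3), round(statistics.mean(population), 3))
print(round(within_one_sd, 3))
```

Even though the population itself is far from bell-shaped, the sample means cluster around the population mean in a roughly normal way.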
Let's find the standard deviation of the sampling distribution created by rolling 2 dice and taking their mean. It equals the population standard deviation divided by the square root of the sample size.
The population standard deviation can be found using the STDEVP function in Google Sheets.
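If a spreadsheet is not at hand, Python's standard library offers a comparable function. This is a small sketch, assuming the population is the six faces of a die:

```python
import statistics

population = list(range(1, 7))  # faces of a six-sided die

# pstdev divides by N (population SD), comparable to STDEVP.
population_sd = statistics.pstdev(population)

# stdev divides by N - 1 (sample SD), comparable to STDEV.
sample_sd = statistics.stdev(population)

print(population_sd)
print(sample_sd)
```

The population version is the one needed here, because the six faces are the whole population rather than a sample from it.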
In [9]:
import math

def mean_standard_deviation(population):
    # Population mean and population standard deviation (divides by N).
    mean = sum(population) / len(population)
    differences = [element - mean for element in population]
    squared_differences = [diff ** 2 for diff in differences]
    mean_squared_differences = sum(squared_differences) / len(squared_differences)
    SD = math.sqrt(mean_squared_differences)
    return mean, SD

def standard_deviation_sample(population, sample_size):
    # SD of the sampling distribution of the mean: population SD / sqrt(n).
    mean_population, sd_population = mean_standard_deviation(population)
    return sd_population / math.sqrt(sample_size)

population = list(range(1, 7))  # faces of a six-sided die
print(standard_deviation_sample(population, 2))
print(standard_deviation_sample(population, 5))
As the sample size increases, the SD of the sampling distribution decreases, so the distribution becomes narrower (skinnier).
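The shrinking can be checked by brute force. The sketch below enumerates every possible outcome for small sample sizes, computes the SD of the resulting means directly, and compares it with the population SD divided by the square root of the sample size. The helper name `sd_of_sample_means` is made up for this example, and sampling with replacement (independent die rolls) is assumed.

```python
import math
import statistics
from itertools import product

population = list(range(1, 7))  # faces of a six-sided die
population_sd = statistics.pstdev(population)

def sd_of_sample_means(population, sample_size):
    """SD of the mean over every equally likely sample (with replacement)."""
    means = [
        sum(sample) / sample_size
        for sample in product(population, repeat=sample_size)
    ]
    return statistics.pstdev(means)

results = []
for n in (1, 2, 3, 4):
    brute_force = sd_of_sample_means(population, n)
    formula = population_sd / math.sqrt(n)
    results.append((n, brute_force, formula))
    print(n, round(brute_force, 4), round(formula, 4))
```

The two columns agree for each sample size, and both get smaller as the sample size grows.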
A sampling distribution applet is available at http://onlinestatbook.com/stat_sim/sampling_dist/index.html
It helps us find where a given sample lies on the sampling distribution. That tells us whether our sample is typical or whether something special is going on compared to other possible samples.
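One common way to express where a sample lies is a z-score: how many standard deviations of the sampling distribution the sample mean sits from the population mean. The sketch below uses a made-up sample of five die rolls (the values are hypothetical, not from these notes).

```python
import math
import statistics

population = list(range(1, 7))  # faces of a six-sided die
population_mean = statistics.mean(population)
population_sd = statistics.pstdev(population)

# Hypothetical sample of five rolls, invented for illustration.
sample = [6, 5, 6, 4, 5]
sample_mean = statistics.mean(sample)

# SD of the sampling distribution for samples of this size.
standard_error = population_sd / math.sqrt(len(sample))

# Distance of the sample mean from the population mean, in SD units.
z_score = (sample_mean - population_mean) / standard_error

print(round(z_score, 2))
```

As a rough rule of thumb, a z-score beyond about 2 in either direction suggests the sample is unusual relative to other possible samples, though with only five rolls the normal approximation is itself rough.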