In [1]:
pnorm(0.95,mean=0.9,sd=0.0212,lower.tail=F)
Out[1]:
In [2]:
sum(dbinom(190:200,200,0.90))
Out[2]:
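The standard deviation 0.0212 passed to pnorm above appears to be the standard error of a sample proportion with n = 200 and p = 0.90, the same values used in the dbinom call. A minimal sketch, under that assumption, that derives the value and places the normal approximation next to the exact binomial sum (the variable names are illustrative):

```r
#Assumption: 0.0212 is sqrt(p*(1-p)/n) for n = 200, p = 0.90
n = 200
p = 0.90
SE = sqrt(p*(1-p)/n)
round(SE, digits=4)

#Normal approximation of P(p_hat > 0.95)
approx_tail = pnorm(0.95, mean=p, sd=SE, lower.tail=FALSE)

#Exact binomial probability of 190 or more successes out of 200
exact_tail = sum(dbinom(190:n, n, p))

c(normal=approx_tail, binomial=exact_tail)
```

The two numbers need not agree exactly, since the binomial sum is discrete and includes 190 itself.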
In [15]:
#Confidence interval for a single proportion (one categorical variable with two levels; p is the observed proportion of successes)
p = 0.85
n = 670
CL = 0.95
SE = sqrt(p*(1-p)/n)
z_star = round(qnorm((1-CL)/2,lower.tail=F),digits=2)
ME = z_star * SE
c(p-ME, p+ME)
Out[15]:
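So based on this data, we can interpret the confidence interval as: we are 95% confident that the true population proportion of successes lies between roughly 0.82 and 0.88.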
In [4]:
#Required sample size for a proportion, given a desired margin of error (ME)
p = 0.85
z_star = 1.96
ME = 0.01
z_star^2*p*(1-p)/ME^2
Out[4]:
In [31]:
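#Required sample size for a proportion, given a desired margin of error (ME)
#p = 0.5 maximizes p*(1-p), so it gives the most conservative (largest) sample size
p = 0.5
z_star = 1.96
ME = 0.01
z_star^2*p*(1-p)/ME^2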
Out[31]:
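The two sample-size cells above return a fractional number, while a sample size has to be a whole number at least that large. A minimal sketch that wraps the same formula in a helper and rounds up; `required_n` is a hypothetical name, not part of any package, and the default z_star = 1.96 is taken from the cells above:

```r
#Hypothetical helper: smallest whole n whose margin of error is at most ME
required_n = function(p, ME, z_star=1.96) {
  ceiling(z_star^2*p*(1-p)/ME^2)
}

n_prior = required_n(p=0.85, ME=0.01)        #uses a prior estimate of p
n_conservative = required_n(p=0.5, ME=0.01)  #no prior estimate available
c(n_prior, n_conservative)
```

Using p = 0.5 always asks for at least as many observations as any other value of p, which is why it serves as the fallback when no earlier estimate exists.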
In [6]:
#Confidence interval for the difference between two proportions
#(success proportion of one two-level categorical variable, compared across two groups)
#1 = Coursera, 2 = US
n_1 = 83
p_1 = 0.71
n_2 = 1028
p_2 = 0.25
CL = 0.95
SE = sqrt( (p_1*(1-p_1)/n_1)+(p_2*(1-p_2)/n_2) )
z_star = round(qnorm((1-CL)/2, lower.tail=F),digits=2)
ME = z_star*SE
c((p_1-p_2)-ME, (p_1-p_2)+ME)
Out[6]:
In [5]:
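#Hypothesis test for a single proportion, given the null value (p)
p = 0.5
p_hat = 0.6
n = 1983
SL = 0.05
SE = sqrt(p*(1-p)/n)
#Two-sided critical value for significance level SL (not used in the p-value below)
z_star = round(qnorm(SL/2,lower.tail=F),digits=2)
#One-sided p-value: the tail in the direction of the observed p_hat
pnorm(p_hat,mean=p,sd=SE,lower.tail=p_hat < p)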
Out[5]:
Question: do a majority of Americans believe in evolution? The data provide convincing evidence that a majority of all Americans believe in evolution.
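The conclusion rests on how far p_hat sits from the null value in standard-error units. A minimal sketch that computes the z statistic for the same numbers and checks the success-failure condition under the null; the threshold of 10 expected successes and failures is an assumed rule of thumb, not something stated in the cell:

```r
p = 0.5
p_hat = 0.6
n = 1983

#Success-failure condition under the null (assumed threshold: at least 10 each)
c(n*p, n*(1-p)) >= 10

SE = sqrt(p*(1-p)/n)
z = (p_hat - p)/SE

#Same upper tail as pnorm(p_hat, mean=p, sd=SE, lower.tail=FALSE)
p_value = pnorm(z, lower.tail=FALSE)
c(z=z, p_value=p_value)
```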
In [27]:
#Confidence interval for the difference between two proportions
#(success proportion of one two-level categorical variable, compared across two groups)
#1 = Coursera, 2 = US
n_1 = 144
p_1 = 71/144
n_2 = 389
p_2 = 224/389
CL = 0.95
SE = sqrt( (p_1*(1-p_1)/n_1)+(p_2*(1-p_2)/n_2) )
z_star = round(qnorm((1-CL)/2, lower.tail=F),digits=2)
ME = z_star*SE
c((p_1-p_2)-ME, (p_1-p_2)+ME)
Out[27]:
In [30]:
SE
Out[30]:
We are 95% confident that the proportion of Courserians who believe there should be a law banning gun possession is 36 to 56 percentage points higher than the proportion of the US population.
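The two-proportion interval is computed the same way in more than one cell, with only the inputs changing. A minimal sketch that folds the repeated steps into one function; `prop_diff_ci` is a hypothetical name, and it keeps the notebook's choice of rounding z_star to two digits:

```r
#Hypothetical helper: confidence interval for p_1 - p_2
prop_diff_ci = function(p_1, n_1, p_2, n_2, CL=0.95) {
  SE = sqrt(p_1*(1-p_1)/n_1 + p_2*(1-p_2)/n_2)
  z_star = round(qnorm((1-CL)/2, lower.tail=FALSE), digits=2)
  ME = z_star*SE
  c(lower=(p_1-p_2)-ME, upper=(p_1-p_2)+ME)
}

#Inputs copied from the cells in this notebook
ci_a = prop_diff_ci(0.71, 83, 0.25, 1028)
ci_b = prop_diff_ci(71/144, 144, 224/389, 389)
ci_a
ci_b
```

An interval that lies entirely above zero points to a higher proportion in group 1, while an interval that contains zero is consistent with no difference.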
In [13]:
#Hypothesis test for the difference between two proportions, null value zero
#(success proportion of one two-level categorical variable, compared across two groups)
#1 = Male, 2 = Female
n_1 = 90
np_1 = 34
p_1 = round(np_1/n_1,digits=2)
n_2 = 122
np_2 = 61
p_2 = round(np_2/n_2,digits=2)
p_pool = round((np_1+np_2)/(n_1+n_2),digits=2)
null = 0
SE = sqrt((p_pool*(1-p_pool)/n_1) + (p_pool*(1-p_pool)/n_2))
pe = p_1 - p_2
pnorm(pe,mean=null,sd=SE, lower.tail=pe < null) * 2
Out[13]:
The data do not provide convincing evidence of a difference between males and females in the likelihood of reporting that their kids are being bullied.
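The pooled two-proportion test can be cross-checked against base R's prop.test. A minimal sketch using the counts from the cell above; because the hand calculation rounds each proportion to two digits, the two p-values are expected to be close rather than identical:

```r
#Counts from the cell above: 34 of 90 males, 61 of 122 females
successes = c(34, 61)
totals = c(90, 122)

#correct=FALSE turns off the continuity correction, matching a plain pooled z-test
check = prop.test(successes, totals, correct=FALSE)
check$p.value

#The chi-square statistic is the square of the pooled z statistic
sqrt(check$statistic)
```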
In [14]:
source("http://bit.ly/dasi_inference")
In [16]:
paul = factor(c(rep('yes',8),rep('no',0)), levels=c('yes','no'))
inference(paul,est='proportion',type='ht',method='simulation',success='yes',null=0.5,alternative='greater')
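The `inference` function comes from the external script sourced above, so its internals are not shown here. As a rough idea of what a simulation-based test of 8 successes out of 8 against a null of 0.5 involves, here is a minimal base R sketch; it is an illustration under that assumption, not the implementation of `inference`, and the seed and number of simulations are arbitrary:

```r
set.seed(1)      #arbitrary seed, only for reproducibility
n_sims = 10000   #arbitrary number of simulated samples

#Each simulation: number of successes out of 8 when each outcome is a fair coin flip
sim_correct = rbinom(n_sims, size=8, prob=0.5)

#Simulated p-value: share of simulations at least as extreme as 8 out of 8
sim_p_value = mean(sim_correct >= 8)

#Exact probability under the null, for comparison
exact_p_value = 0.5^8

c(simulated=sim_p_value, exact=exact_p_value)
```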
Screenshot taken from Coursera 04:57
Screenshot taken from Coursera 11:12
In [17]:
chi_square = 22.63
dof = 4
pchisq(chi_square,dof,lower.tail = F)
Out[17]:
The data provide convincing evidence that the observed distribution of juror race/ethnicity does not follow the population distribution.
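The same decision can be reached by comparing the statistic with a critical value instead of reading a p-value. A minimal sketch, assuming a 5% significance level (the cell above does not state one) and the goodness-of-fit relation dof = k - 1:

```r
chi_square = 22.63
dof = 4
SL = 0.05  #assumed significance level

#For a goodness-of-fit test dof = k - 1, so dof = 4 implies k = 5 categories
k = dof + 1

#Reject the null when the statistic exceeds the upper-tail critical value
critical_value = qchisq(1-SL, dof)
critical_value
chi_square > critical_value
```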
In [18]:
chi_square = 22.63
dof = 4
pchisq(chi_square,dof,lower.tail = F)
Out[18]:
The data provide convincing evidence that obesity and relationship status are associated.
In [21]:
male = 6+15+3
nostop = 4+3
grandtotal = 6+16+4+6+15+3
(male+nostop)/grandtotal
Out[21]:
In [26]:
7/50
Out[26]:
In [24]:
0.14*male
Out[24]:
In [25]:
0.14*24
Out[25]:
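The last few cells multiply an overall rate (7/50) by a group total (24), which is the expected count for one cell of a two-way table under independence. A minimal sketch that writes the calculation in one step, reusing the totals defined above; `expected` is an illustrative name:

```r
male = 6+15+3               #one margin total
nostop = 4+3                #the other margin total
grandtotal = 6+16+4+6+15+3

#Expected count under independence: (row total * column total) / grand total
expected = male*nostop/grandtotal
expected

#Same value as the overall rate times the group total
all.equal(expected, (nostop/grandtotal)*male)
```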