(This section covers what is referred to as the frequentist method of hypothesis testing.)
We would like to know if the effects we see in the sample (observed data) are likely to occur in the population.
The way classical hypothesis testing works is by conducting a statistical test to answer the following question:
Given the sample and an effect, what is the probability of seeing that effect just by chance?
Here are the steps on how we would do this
If the p-value is very low (more often than not, below 0.05), the effect is considered statistically significant. That means the effect is unlikely to have occurred by chance. The inference? The effect is likely to be seen in the population too.
This process is very similar to the proof by contradiction paradigm. We first assume that the effect is not real. That's the null hypothesis. The next step is to compute the probability of seeing an effect at least as large as the observed one if the null hypothesis were true (the p-value). If the p-value is very low (<0.05 as a rule of thumb), we reject the null hypothesis.
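The reasoning above can be sketched with a permutation test, which estimates the "just by chance" probability directly by shuffling group labels. This is only an illustration: the two groups below are made-up samples (not the cars data), and the names `group_a`, `group_b` and `n_shuffles` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical samples, not taken from the cars data
group_a = rng.normal(loc=20.0, scale=3.0, size=200)
group_b = rng.normal(loc=18.0, scale=3.0, size=200)

# The effect we observed in the sample
observed_effect = group_a.mean() - group_b.mean()

# Null hypothesis: the group labels do not matter.
# If that were true, shuffling the labels should produce differences
# as large as the observed one fairly often.
pooled = np.concatenate([group_a, group_b])
n_a = len(group_a)
n_shuffles = 5000
count_extreme = 0
for _ in range(n_shuffles):
    shuffled = rng.permutation(pooled)
    diff = shuffled[:n_a].mean() - shuffled[n_a:].mean()
    if abs(diff) >= abs(observed_effect):
        count_extreme += 1

# Fraction of shuffles at least as extreme as the observed effect
p_value = count_extreme / n_shuffles
print("Observed effect:", observed_effect)
print("p-value:", p_value)
print("Reject null at 0.05:", p_value < 0.05)
```

The t-test used later in this section answers the same question analytically instead of by shuffling.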
In [1]:
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib as mpl
%matplotlib inline
In [2]:
import seaborn as sns
sns.set(color_codes=True)
In [26]:
cars = pd.read_csv("cars_v1.csv", encoding="ISO-8859-1")
In [27]:
mileage_is_null = cars.Mileage.isnull()
In [ ]:
mileage_is_null
In [29]:
#.ix has been removed from pandas; .loc accepts the same boolean mask
cars = cars.loc[~mileage_is_null]
In [30]:
cars.head()
Out[30]:
In [31]:
cars_Automatic = cars[cars.GearType==' Automatic'].copy().reset_index()
cars_Manual = cars[cars.GearType==' Manual'].copy().reset_index()
In [32]:
print("Average Mileage of Automatic cars:", cars_Automatic.Mileage.mean())
print("Average Mileage of Manual cars:", cars_Manual.Mileage.mean())
In [33]:
#Mean and standard deviation of Manual Cars
print("Mean of Manual Cars:", cars_Manual.Mileage.mean())
print("Standard Deviation of Manual Cars:", cars_Manual.Mileage.std())
Exercise
In [10]:
#Mean and standard deviation of Automatic Cars
In [ ]:
In [34]:
#Confidence interval on the mean of manual cars
stats.norm.interval(0.95, loc=cars_Manual.Mileage.mean(), scale = cars_Manual.Mileage.std()/np.sqrt(len(cars_Manual)))
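To make the formula behind that call explicit, here is a small sketch that computes the same kind of interval by hand as mean ± z × (standard deviation / √n) and compares it with `stats.norm.interval`. The data are made up (the `sample` array is hypothetical, not the cars data). Note that pandas' `.std()` uses `ddof=1` by default, so the NumPy call below passes `ddof=1` to match.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical mileage-like values, not taken from the cars data
sample = rng.normal(loc=18.0, scale=4.0, size=400)

mean = sample.mean()
# Standard error of the mean; ddof=1 matches pandas' default .std()
std_err = sample.std(ddof=1) / np.sqrt(len(sample))

# z value that leaves 2.5% in each tail of the standard normal
z = stats.norm.ppf(0.975)
manual_ci = (mean - z * std_err, mean + z * std_err)

# Same interval from scipy
scipy_ci = stats.norm.interval(0.95, loc=mean, scale=std_err)

print("By hand:", manual_ci)
print("scipy:  ", scipy_ci)
```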
Out[34]:
Exercise
In [12]:
#Confidence interval on the mean of automatic cars
Out[12]:
In [ ]:
Effect Size
In [35]:
print("Effect size:", cars_Manual.Mileage.mean() - cars_Automatic.Mileage.mean())
Null Hypothesis: The mean mileage of manual and automatic cars is the same
Perform t-test and determine the p-value.
In [36]:
stats.ttest_ind(cars_Manual.Mileage, cars_Automatic.Mileage, equal_var=True)
Out[36]:
The p-value is the probability of seeing an effect size at least this large by chance alone, that is, if the null hypothesis were true. And here, the p-value is almost 0.
Conclusion: The mileage difference is statistically significant.
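As a rough sketch of how the result of `stats.ttest_ind` can be unpacked and compared with the 0.05 threshold, the example below uses made-up samples (the arrays `manual` and `automatic` are hypothetical stand-ins, not the cars data). It also runs the `equal_var=False` variant (Welch's t-test), which does not assume the two groups have equal variance.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical samples, not taken from the cars data
manual = rng.normal(loc=18.0, scale=3.0, size=300)
automatic = rng.normal(loc=16.5, scale=3.0, size=150)

# Student's t-test: assumes both groups share the same variance
t_stat, p_value = stats.ttest_ind(manual, automatic, equal_var=True)

# Welch's t-test: drops the equal-variance assumption
t_welch, p_welch = stats.ttest_ind(manual, automatic, equal_var=False)

alpha = 0.05  # rule-of-thumb threshold
print("t statistic:", t_stat, "p-value:", p_value)
print("Welch t statistic:", t_welch, "p-value:", p_welch)
print("Reject null hypothesis:", p_value < alpha)
```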
One assumption of the t-test is that the data came from a normal distribution.
The Shapiro-Wilk test checks for normality. Its null hypothesis is that the data are normally distributed, so if the p-value is less than 0.05, it is unlikely that the data came from a normal distribution.
In [37]:
stats.shapiro(cars_Manual.Mileage)
Out[37]:
In [38]:
stats.shapiro(cars_Automatic.Mileage)
Out[38]:
Errrrrr.....
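To get a feel for how the Shapiro-Wilk test behaves, it can help to run it on data whose shape is known in advance. The sketch below uses two made-up samples (hypothetical, not the cars data): one drawn from a normal distribution and one from a strongly skewed exponential distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical samples with known shapes
normal_sample = rng.normal(loc=0.0, scale=1.0, size=500)
skewed_sample = rng.exponential(scale=1.0, size=500)

# Null hypothesis of Shapiro-Wilk: the sample is normally distributed
stat_normal, p_normal = stats.shapiro(normal_sample)
stat_skewed, p_skewed = stats.shapiro(skewed_sample)

print("Normal sample p-value:", p_normal)
print("Skewed sample p-value:", p_skewed)
print("Reject normality for skewed sample:", p_skewed < 0.05)
```

A low p-value for real data, as in the cells above, suggests the normality assumption of the t-test may not hold, so the t-test result should be read with some caution.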
A/B testing: comparing two versions to check which one performs better. E.g., show people two variants of the same webpage that they want to see and find which one provides the better conversion rate (or the relevant metric). (wiki)
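A two-variant comparison like this can be sketched as a test on a 2x2 table of counts. The numbers below are invented for illustration, and using `stats.chi2_contingency` here is one possible choice, not necessarily the method intended by this notebook.

```python
import numpy as np
from scipy import stats

# Hypothetical visitor counts.
# Rows: variant A, variant B. Columns: converted, did not convert.
table = np.array([[120, 880],
                  [160, 840]])

# Conversion rate of each variant
rate_a = table[0, 0] / table[0].sum()
rate_b = table[1, 0] / table[1].sum()

# Null hypothesis: conversion does not depend on the variant shown
chi2, p_value, dof, expected = stats.chi2_contingency(table)

print("Conversion rate A:", rate_a)
print("Conversion rate B:", rate_b)
print("p-value:", p_value)
print("Difference significant at 0.05:", p_value < 0.05)
```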
Higher variance
Answer:
Chi-square tests are used when the data are frequencies (counts), rather than numerical values such as a score or price.
The following two tests make use of the chi-square statistic
The chi-square test is a non-parametric test: it does not require assumptions about population parameters, and it does not test hypotheses about population parameters.
In [ ]:
stats.chisquare(Observed, Expected)
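`Observed` and `Expected` above are placeholders. A minimal sketch with made-up frequencies might look like the following: 120 hypothetical die rolls are compared against the counts expected from a fair die. The arrays `observed` and `expected` are illustrative only.

```python
import numpy as np
from scipy import stats

# Hypothetical counts of each face in 120 die rolls
observed = np.array([18, 22, 16, 25, 19, 20])

# A fair die would give each face the same expected count
expected = np.full(6, observed.sum() / 6)

# Null hypothesis: observed frequencies follow the expected ones
chi2_stat, p_value = stats.chisquare(observed, expected)

print("Chi-square statistic:", chi2_stat)
print("p-value:", p_value)
print("Reject null at 0.05:", p_value < 0.05)
```

The observed and expected counts need to add up to the same total for `stats.chisquare` to make sense.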
In [ ]: