What is the true normal human body temperature?

Background

The mean normal body temperature was held to be 37$^{\circ}$C or 98.6$^{\circ}$F for more than 120 years since it was first conceptualized and reported by Carl Wunderlich in a famous 1868 book. In 1992, this value was revised to 36.8$^{\circ}$C or 98.2$^{\circ}$F.

Exercise

In this exercise, you will analyze a dataset of human body temperatures and employ the concepts of hypothesis testing, confidence intervals, and statistical significance.

Answer the following questions in this notebook and submit it to your GitHub account.

  1. Is the distribution of body temperatures normal?
    • Remember that this is a condition for the CLT, and hence the statistical tests we are using, to apply.
  2. Is the true population mean really 98.6 degrees F?
    • Bring out the one-sample hypothesis test! In this situation, is it appropriate to apply a z-test or a t-test? How will the result be different?
  3. At what temperature should we consider someone's temperature to be "abnormal"?
    • Start by computing the margin of error and confidence interval.
  4. Is there a significant difference between males and females in normal temperature?
    • Set up and solve a two-sample hypothesis test.



In [1]:
import pandas as pd

In [2]:
%matplotlib inline

In [3]:
df = pd.read_csv('data/human_body_temperature.csv')

Is the distribution of body temperatures normal?

  • The data are slightly right-skewed but nearly normal. Because the sample size (n = 130) is well above 30, we can rely on the CLT for hypothesis testing.

In [4]:
df.hist()


Out[4]:
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7f6d511668d0>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f6d50ec0748>]], dtype=object)

In [5]:
df.describe()


Out[5]:
       temperature  heart_rate
count   130.000000  130.000000
mean     98.249231   73.761538
std       0.733183    7.062077
min      96.300000   57.000000
25%      97.800000   69.000000
50%      98.300000   74.000000
75%      98.700000   79.000000
max     100.800000   89.000000
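
The histogram gives only a visual impression of normality. As a minimal sketch, a formal check could use `scipy.stats.normaltest` (the D'Agostino-Pearson test) together with the sample skewness. The helper name `check_normality` and the synthetic sample are assumptions for illustration; the synthetic values stand in for `df['temperature']` and are not the real dataset.

```python
import numpy as np
from scipy import stats


def check_normality(values, alpha=0.05):
    """Run the D'Agostino-Pearson normality test and report sample skewness.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    values = np.asarray(values, dtype=float)
    statistic, p_value = stats.normaltest(values)
    return {
        "statistic": float(statistic),
        "p_value": float(p_value),
        "skewness": float(stats.skew(values)),
        # True means the test found no evidence against normality at level alpha
        "consistent_with_normal": bool(p_value > alpha),
    }


# Synthetic stand-in for df['temperature'] (NOT the real measurements)
rng = np.random.default_rng(0)
synthetic = rng.normal(loc=98.25, scale=0.73, size=130)
result = check_normality(synthetic)
print(result)
```

In the notebook itself, the call would be `check_normality(df['temperature'])`. A large p-value does not prove normality; it only means the test found no strong evidence against it.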

Is the true population mean really 98.6 degrees F?

Bring out the one-sample hypothesis test! In this situation, is it appropriate to apply a z-test or a t-test? How will the result be different?
z-test vs. t-test:
  • A t-test is needed when the sample is small (n < 30) and the population standard deviation is unknown. Here n = 130, so the t distribution is very close to the normal distribution and a z-test is a reasonable choice. The two results will be almost the same because the data are not extremely skewed and the sample is large enough.

Sample mean = 98.249231, sample standard deviation = 0.733183, n = 130

One-sample hypothesis test:

Step 1:

- Null Hypothesis : Mean = 98.6
- Alternative Hypothesis : Mean != 98.6

Step 2:

- Point estimate: sample mean = 98.249231
- Calculate Standard Error (SE)

Step 3:

- Check conditions
    -- Independence ==> True
    -- Sample size > 30 if the sample is skewed ==> True

Step 4:

- Calculate the z-score and p-value

Step 5:

- Based on the p-value, check whether the null hypothesis can be rejected.
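
As a worked sketch of Steps 2 and 4, the summary statistics quoted above can be plugged into the standard one-sample formulas. The symbols are introduced here for illustration: $s$ is the sample standard deviation, $\bar{x}$ the sample mean, $\mu_0$ the hypothesized mean, and $SE$ the standard error. The values are rounded.

```latex
SE = \frac{s}{\sqrt{n}} = \frac{0.733183}{\sqrt{130}} \approx 0.064304
\qquad
z = \frac{\bar{x} - \mu_0}{SE} = \frac{98.249231 - 98.6}{0.064304} \approx -5.455
```

A z-score this far from zero corresponds to a very small two-sided p-value.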

In [16]:
import scipy.special

In [19]:
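n = df['temperature'].count()        # sample size
sigma = df['temperature'].std()      # sample standard deviation (estimate of sigma)
x_bar = df['temperature'].mean()     # sample mean
standard_error = sigma / n**0.5
z_score = (x_bar - 98.6) / standard_error
# Two-sided p-value; -abs() keeps it valid whether z_score is negative or positive
p_values = 2 * scipy.special.ndtr(-abs(z_score))
p_values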


Out[19]:
4.9021570141133797e-08

Since the p-value is far below 5%, we can reject the null hypothesis that the true population mean is 98.6 degrees Fahrenheit.

  • Repeating the test with a t-test

In [22]:
import scipy.stats as stats

In [23]:
stats.ttest_1samp(df.temperature,98.6)


Out[23]:
(-5.4548232923645195, 2.4106320415561276e-07)
For the t-test, the p-value differs slightly from the z-test p-value, but the evidence is still strong enough to reject the null hypothesis.
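
To see why the two p-values differ, here is a minimal sketch that computes both from the summary statistics reported earlier rather than from the data file. The test statistic is identical in both cases; only the reference distribution changes (standard normal versus Student's t with n - 1 degrees of freedom). The variable names are chosen for illustration.

```python
from math import sqrt

from scipy import stats

# Summary statistics reported earlier in this notebook
x_bar, s, n = 98.249231, 0.733183, 130
mu_0 = 98.6  # hypothesized population mean

standard_error = s / sqrt(n)
statistic = (x_bar - mu_0) / standard_error  # same value for the z-test and the t-test

# Two-sided p-values under each reference distribution
p_z = 2 * stats.norm.sf(abs(statistic))
p_t = 2 * stats.t.sf(abs(statistic), df=n - 1)

print(f"statistic      = {statistic:.4f}")
print(f"z-test p-value = {p_z:.3e}")
print(f"t-test p-value = {p_t:.3e}")
```

The t distribution has heavier tails than the normal distribution, so its p-value is somewhat larger, though both are far below 5%.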

==================================================================================================================

  • At what temperature should we consider someone's temperature to be "abnormal"?

A 95% confidence interval can be considered good enough for this assessment. Margin of error (M.E.) = critical value * standard error

Critical value for a 95% confidence interval = 1.96

Confidence interval = (Mean - Margin of Error, Mean + Margin of Error)


In [24]:
margin_of_error = 1.96*standard_error

In [25]:
confidence_interval = [x_bar - margin_of_error, x_bar + margin_of_error]
confidence_interval


Out[25]:
[98.123194112228518, 98.375267426233037]

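If a temperature falls outside the above range, it might be considered abnormal. Note that this interval describes the uncertainty in the mean temperature; individual readings vary more widely than the mean does.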
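
As a minimal sketch, the interval can be recomputed from the summary statistics reported earlier, looking up the critical value with the standard library's `statistics.NormalDist` instead of hard-coding 1.96. The sketch also computes a range for individual readings based on the sample standard deviation. That second range is an added assumption, not part of the original analysis, and it presumes readings are roughly normally distributed.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics reported earlier in this notebook
x_bar, s, n = 98.249231, 0.733183, 130
confidence = 0.95

# Two-sided critical value from the standard normal distribution (about 1.96)
critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Confidence interval for the population MEAN (uses the standard error)
margin_of_error = critical * s / sqrt(n)
mean_interval = (x_bar - margin_of_error, x_bar + margin_of_error)

# Assumed alternative: range covering about 95% of INDIVIDUAL readings
# (uses the standard deviation itself, assuming roughly normal data)
individual_range = (x_bar - critical * s, x_bar + critical * s)

print("interval for the mean:        ", mean_interval)
print("range for individual readings:", individual_range)
```

The second range is much wider because it reflects person-to-person variation rather than uncertainty about the average.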

=======================================================================================================================

  • Is there a significant difference between males and females in normal temperature?

In [26]:
import numpy as np

In [27]:
female_temprature = np.array(df.temperature[df.gender=='F'])
len(female_temprature)


Out[27]:
65

In [28]:
male_temprature = np.array(df.temperature[df.gender=='M'])
len(male_temprature)


Out[28]:
65

Again, the sample size is large enough to use a z-test.


In [36]:
from statsmodels.stats.weightstats import ztest
# Null hypothesis: mean temperatures of females and males are equal
zstat, p_val = ztest(female_temprature, male_temprature)
p_val_percent = p_val * 100
if p_val_percent < 5:
    print("p-value is less than 5% so null hypothesis should be rejected.\n"
          "There is a significant difference between males and females in normal temperature")


p-value is less than 5% so null hypothesis should be rejected.
There is a significant difference between males and females in normal temperature
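
For readers who want to see what the two-sample test computes, here is a minimal standard-library sketch. The function name `two_sample_ztest` is hypothetical, it uses unpooled sample variances (which may not match the defaults of `statsmodels`' `ztest` when group sizes differ), and the two groups below are synthetic stand-ins, not the real male and female readings.

```python
from math import erfc, sqrt
from random import Random
from statistics import mean, variance


def two_sample_ztest(a, b):
    """Two-sided z-test for a difference in means (unpooled sample variances).

    Hypothetical helper for illustration; returns (z, p_value).
    """
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / standard_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Synthetic groups of 65 readings each (NOT the real dataset)
rng = Random(0)
group_a = [rng.gauss(98.4, 0.7) for _ in range(65)]
group_b = [rng.gauss(98.1, 0.7) for _ in range(65)]

z, p_value = two_sample_ztest(group_a, group_b)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

In the notebook, the same function could be called as `two_sample_ztest(list(female_temprature), list(male_temprature))`.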
