Here I will use the combined dataframe of CO2 and Pitch over a given time frame to perform statistical tests for correlation.
1) Load the dataframe
2) Look for normal distribution by ----
3) Perform a Spearman's rank correlation test to find the p-value for statistical significance
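A minimal sketch of steps 2 and 3, run on synthetic numbers generated only so the code executes. The column names `co2` and `pitch` are hypothetical, and since the normality method in step 2 is still undecided, the Shapiro-Wilk test shown here is just one common option, not a choice made by this notebook.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in data (NOT real measurements); column names are hypothetical
rng = np.random.default_rng(0)
co2 = np.linspace(400, 1200, 50)
pitch = 0.01 * co2 + rng.normal(0, 1.0, size=co2.size)
df = pd.DataFrame({"co2": co2, "pitch": pitch})

# Step 2 (assumed method): Shapiro-Wilk test for normality on each column.
# A small p-value suggests the column is not normally distributed.
for col in df.columns:
    w_stat, p_norm = stats.shapiro(df[col])
    print(f"{col}: W = {w_stat:.3f}, p = {p_norm:.3f}")

# Step 3: Spearman's rank correlation, which does not assume normality
rho, p_value = stats.spearmanr(df["co2"], df["pitch"])
print(f"Spearman rho = {rho:.3f}, p = {p_value:.3g}")
```

Spearman's test works on ranks, so it stays valid even if the normality check fails, which is presumably why it is the planned test here.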
In [3]:
# I import useful libraries (and their functions) so I can visualize my data
# I use Pandas because this dataset has word/string column titles, and I like the readable commands and finished visual products that Pandas offers
import pandas as pd
import matplotlib.pyplot as plt
import re
import numpy as np
%matplotlib inline
In [4]:
# Read in choir_division.csv and split it into lines
#lines = open('[COMBINED DATAFRAME NAME HERE]', 'r').read().strip().split('\n')
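Since the rest of the notebook works with a dataframe, one possible alternative to splitting the file into lines by hand is `pd.read_csv`. The sketch below reads from an in-memory string so that it runs without the real file; the column names `time`, `co2` and `pitch` and the values are made up for illustration only.

```python
import io
import pandas as pd

# Hypothetical stand-in for the combined CSV file (values are invented)
csv_text = """time,co2,pitch
0,410,440.0
1,450,439.5
2,520,438.9
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))
print(df.head())
print(df.dtypes)
```

With the real file, the `io.StringIO(...)` argument would be replaced by the path to the combined CSV.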
In [ ]:
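import statsmodels.api as sm

# Template: `mydat` is a placeholder for the combined dataframe, with columns named x and y
my_model = sm.formula.ols('y ~ x', data=mydat).fit()
my_model.summary()
my_model.rsquared
my_model.pvalues
# First think about the table you want created with the stats values
statsdf = pd.DataFrame({'formula': ['y~x', 'y~x', 'y~x'],  # values to put in the three columns of the DF
                        'test_stat': ['rsquared', 'pval1', 'pval2'],
                        'val': [my_model.rsquared, my_model.pvalues.iloc[0], my_model.pvalues.iloc[1]]})
# One-column table of the p-values, indexed by model term (Intercept, x)
pvals_df = pd.DataFrame({'pvals': my_model.pvalues})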
In [ ]:
# STATS: print (lm.summary())
In [6]:
#Have gif in presentation
