Ebola Outbreak 2014

Data retrieval


In [1]:
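import pandas as pd

# Load the per-country time series, using the first column as a parsed date index
df = pd.read_csv('https://raw.githubusercontent.com/cmrivers/ebola/master/country_timeseries.csv',
                 index_col=0, parse_dates=True)
df = df.sort_index()
# Carry the last reported value forward over dates with no report
df = df.ffill()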
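
The forward fill in the cell above carries each column's last reported value into dates where that column has no entry, which only makes sense once the rows are sorted by date. A minimal pure-Python sketch of the same idea, using a hypothetical helper `forward_fill` and made-up numbers (not outbreak data):

```python
def forward_fill(values):
    """Replace each None with the most recent non-None value before it."""
    filled = []
    last_seen = None  # nothing to carry forward until the first real value
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Made-up cumulative counts for one column, already in date order
reported = [None, 3, None, None, 7, None]
filled = forward_fill(reported)
print(filled)
```

Gaps at the start of the series stay empty because there is no earlier value to carry forward; pandas forward filling treats leading missing values the same way.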

Cases per country


In [2]:
import matplotlib.pyplot as plt

# Select every column whose name mentions cases
titles = [k for k in df.columns if 'cases' in k.lower()]
df.plot(y=titles, kind='area')
plt.legend()


Out[2]:
<matplotlib.legend.Legend at 0x7fedd7c33590>

Deaths per Country


In [3]:
import matplotlib.pyplot as plt

# Select every column whose name mentions deaths
titles = [k for k in df.columns if 'Deaths' in k]
df.plot(y=titles, kind='area')
plt.legend()


Out[3]:
<matplotlib.legend.Legend at 0xb9eee50>

Total Deaths


In [5]:
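import matplotlib.pyplot as plt

# Sum the per-country death columns selected above into one series
df['total deaths'] = df[titles].sum(axis=1)
df.plot(y='total deaths',
        title='Total Deaths in \n 2014 Ebola Outbreak')
plt.ylabel('Total Deaths')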


Out[5]:
<matplotlib.text.Text at 0xc13b390>

Exponential Fit

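Plot log10(total deaths) against day with a linear fit. A straight line on this scale corresponds to exponential growth in the raw counts.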


In [6]:
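import numpy as np
import seaborn as sn

df['log total deaths'] = np.log10(df['total deaths'].values)
# lmplot draws the scatter plus a linear regression line on the log scale
sn.lmplot(x='Day', y='log total deaths', data=df)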


Out[6]:
<seaborn.axisgrid.FacetGrid at 0xc144bd0>
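
If log10(deaths) grows linearly with day at slope s, the deaths themselves grow as 10^(s * day), so the doubling time is log10(2) / s. The sketch below illustrates that conversion in plain Python. It assumes a synthetic series that doubles every 30 days (not outbreak data), and `fit_line` is a hypothetical helper, not part of the notebook.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, illustrative series: starts at 50 and doubles every 30 days
days = list(range(0, 200, 10))
deaths = [50 * 2 ** (d / 30) for d in days]

# Fit a straight line to log10(deaths) versus day
slope, intercept = fit_line(days, [math.log10(v) for v in deaths])

doubling_time = math.log10(2) / slope  # days for the fitted curve to double
initial_level = 10 ** intercept        # fitted value at day 0
print(round(doubling_time, 3), round(initial_level, 3))
```

The same conversion applies to any slope estimated on the log10 scale, provided the regression is log10(deaths) on day with an intercept.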

Fit statistics


In [7]:
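import statsmodels.api as sm

# OLS takes the response first and the regressors second, so this call regresses
# Day on log10(total deaths); no constant column is added, so the fit has no intercept
ols = sm.OLS(df['Day'].values, df['log total deaths'].values)
ols.fit().summary()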


Out[7]:
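                      OLS Regression Results
==================================================================
Dep. Variable:                y   R-squared:                 0.864
Model:                      OLS   Adj. R-squared:            0.862
Method:           Least Squares   F-statistic:               520.9
Date:          Fri, 17 Oct 2014   Prob (F-statistic):    2.83e-37
Time:                  12:07:40   Log-Likelihood:          -430.28
No. Observations:            83   AIC:                       862.6
Df Residuals:                82   BIC:                       865.0
Df Model:                     1
==================================================================
          coef    std err          t      P>|t|   [95.0% Conf. Int.]
------------------------------------------------------------------
x1     39.8578      1.746     22.823      0.000    36.384   43.332
==================================================================
Omnibus:                 66.167   Durbin-Watson:             0.003
Prob(Omnibus):            0.000   Jarque-Bera (JB):          6.733
Skew:                     0.003   Prob(JB):                 0.0345
Kurtosis:                 1.605   Cond. No.                   1.00
==================================================================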
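
In the call above, `Day` is passed as the response and the log deaths as the only regressor, and Out[7] reports Df Model 1 with 82 residual degrees of freedom from 83 observations, so no intercept was estimated. The sketch below, in plain Python on synthetic data (not outbreak data), contrasts that through-origin fit with the more conventional regression of log10(deaths) on day with an intercept. The helper names are hypothetical.

```python
import math


def slope_through_origin(xs, ys):
    """Least-squares slope for y = slope * x, with no intercept term."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def slope_and_intercept(xs, ys):
    """Least-squares fit for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic, illustrative series: starts at 50 and doubles every 30 days
days = list(range(0, 200, 10))
log_deaths = [math.log10(50 * 2 ** (d / 30)) for d in days]

# Same orientation as the cell above: Day as response, log deaths as sole regressor
days_per_log_unit = slope_through_origin(log_deaths, days)

# Conventional orientation: log deaths as response, Day as regressor, with intercept
slope, intercept = slope_and_intercept(days, log_deaths)

print(days_per_log_unit, 1 / slope, intercept)
```

On this synthetic series the through-origin coefficient differs markedly from 1 / slope, because the line relating the two variables does not pass through the origin. With statsmodels, the conventional orientation would usually be written as `sm.OLS(y, sm.add_constant(x))`, with `y` the log deaths and `x` the day.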
