In October 2012, the US government's Centers for Medicare & Medicaid Services (CMS) began reducing Medicare payments to Inpatient Prospective Payment System hospitals with excess readmissions. Excess readmissions are measured by a ratio: a hospital’s number of “predicted” 30-day readmissions for heart attack, heart failure, and pneumonia divided by the number that would be “expected,” based on an average hospital with similar patients. A ratio greater than 1 indicates excess readmissions.
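As a minimal sketch of the ratio described above, the snippet below divides predicted by expected readmissions and flags a ratio above 1. The function names and the numbers are hypothetical and chosen only for illustration; CMS derives the predicted and expected counts from its own risk-adjustment models, which are not reproduced here.

```python
def excess_readmission_ratio(predicted, expected):
    """Return predicted / expected 30-day readmissions for one hospital."""
    if expected <= 0:
        raise ValueError("expected readmissions must be positive")
    # float() keeps the division floating-point under Python 2 as well
    return predicted / float(expected)


def has_excess_readmissions(predicted, expected):
    """A ratio above 1 means more readmissions than an average hospital
    with similar patients would be expected to have."""
    return excess_readmission_ratio(predicted, expected) > 1.0


# Hypothetical counts, for illustration only
print(round(excess_readmission_ratio(23.0, 20.0), 3))  # 1.15
print(has_excess_readmissions(23.0, 20.0))             # True
print(has_excess_readmissions(18.0, 20.0))             # False
```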
In this exercise, you will:
More instructions are provided below. Include your work in this notebook and submit it to your GitHub account.
In [32]:
%matplotlib inline
import pandas as pd
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
import bokeh.plotting as bkp
from mpl_toolkits.axes_grid1 import make_axes_locatable
In [2]:
# read in readmissions data provided
hospital_read_df = pd.read_csv('data/cms_hospital_readmissions.csv')
In [3]:
# drop rows where the discharge count is missing, then convert the column to integers
# .copy() avoids pandas' SettingWithCopyWarning when the column is reassigned on a filtered frame
clean_hospital_read_df = hospital_read_df[hospital_read_df['Number of Discharges'] != 'Not Available'].copy()
clean_hospital_read_df['Number of Discharges'] = clean_hospital_read_df['Number of Discharges'].astype(int)
clean_hospital_read_df = clean_hospital_read_df.sort_values('Number of Discharges')
In [4]:
# generate a scatterplot of number of discharges vs. excess readmission ratio
# [81:-3] skips the first 81 and last 3 rows of the frame sorted by discharges
# lists work better with the matplotlib scatter function
x = list(clean_hospital_read_df['Number of Discharges'][81:-3])
y = list(clean_hospital_read_df['Excess Readmission Ratio'][81:-3])
fig, ax = plt.subplots(figsize=(8,5))
ax.scatter(x, y,alpha=0.2)
ax.fill_between([0,350], 1.15, 2, facecolor='red', alpha = .15, interpolate=True)
ax.fill_between([800,2500], .5, .95, facecolor='green', alpha = .15, interpolate=True)
ax.set_xlim([0, max(x)])
ax.set_xlabel('Number of discharges', fontsize=12)
ax.set_ylabel('Excess rate of readmissions', fontsize=12)
ax.set_title('Scatterplot of number of discharges vs. excess rate of readmissions', fontsize=14)
ax.grid(True)
fig.tight_layout()
Read the following results/report. While you are reading it, think about if the conclusions are correct, incorrect, misleading or unfounded. Think about what you would change or what additional analyses you would perform.
A. Initial observations based on the plot above
B. Statistics
C. Conclusions
D. Regulatory policy recommendations
In [28]:
# A. Do you agree with the above analysis and recommendations? Why or why not?
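import seaborn as sns
# same row slice as the scatterplot above
relevant_columns = clean_hospital_read_df[['Excess Readmission Ratio', 'Number of Discharges']][81:-3]
# keyword arguments work across seaborn versions; recent releases reject positional x and y
sns.regplot(x='Number of Discharges', y='Excess Readmission Ratio', data=relevant_columns)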
Out[28]:
Include your work on the following in this notebook and submit it to your GitHub account.
A. Do you agree with the above analysis and recommendations? Why or why not?
B. Provide support for your arguments and your own recommendations with a statistically sound analysis:
You can compose in notebook cells using Markdown:
Overall, rate of readmissions is trending down with increasing number of discharges
With lower number of discharges, there is a greater incidence of excess rate of readmissions (area shaded red)
With higher number of discharges, there is a greater incidence of lower rates of readmissions (area shaded green)
In [43]:
rv = relevant_columns
small = rv[rv['Number of Discharges'] < 100]['Excess Readmission Ratio']
large = rv[rv['Number of Discharges'] > 1000]['Excess Readmission Ratio']
# mean ratio and share of hospitals with ratio > 1 in each group
# (multiplying by 100.0 keeps the division floating-point and yields a true percentage)
for label, group in [('< 100', small), ('> 1000', large)]:
    print('Discharges {}: mean excess readmission ratio = {:.3f}'.format(label, group.mean()))
    print('Percent of subset with excess readmission ratio > 1: {:.1f}'.format(100.0 * (group > 1).sum() / len(group)))
In hospitals/facilities with number of discharges < 100, mean excess readmission rate is 1.023 and 63% have excess readmission rate greater than 1
In hospitals/facilities with number of discharges > 1000, mean excess readmission rate is 0.978 and 44% have excess readmission rate greater than 1
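The two group means above are reported without any measure of uncertainty. One way to check whether such a difference is larger than sampling noise is Welch's t statistic, sketched below in plain Python. The helper name `welch_t` and the toy ratios are hypothetical; in the notebook, the two samples would be the excess readmission ratios of the small and large hospital groups with missing values removed.

```python
import math


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / float(n_a)
    mean_b = sum(sample_b) / float(n_b)
    # unbiased sample variances
    var_a = sum((v - mean_a) ** 2 for v in sample_a) / float(n_a - 1)
    var_b = sum((v - mean_b) ** 2 for v in sample_b) / float(n_b - 1)
    # squared standard error contributed by each group
    se2_a, se2_b = var_a / n_a, var_b / n_b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, dof


# Toy ratios, for illustration only (not taken from the CMS data)
small_hospitals = [1.05, 1.02, 0.99, 1.08, 1.03]
large_hospitals = [0.97, 0.95, 1.01, 0.96, 0.98]
t_stat, dof = welch_t(small_hospitals, large_hospitals)
print('t = {:.3f}, dof = {:.1f}'.format(t_stat, dof))
```

The statistic is then compared against a t distribution with `dof` degrees of freedom to obtain a p-value. A significant difference in means would still say nothing about how much of the variation in the ratio is explained by hospital size.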
In [47]:
np.corrcoef(rv['Number of Discharges'], rv['Excess Readmission Ratio'])
Out[47]:
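A correlation coefficient on its own does not show whether the relationship could have arisen by chance. The sketch below pairs Pearson's r with a permutation test, using only the standard library. The function names, the number of permutations, and the toy data are assumptions made for illustration; in the notebook, the inputs would be the two columns of `rv` after dropping missing values.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided p-value: how often does shuffling ys give |r| at least
    as large as the observed |r|?"""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # add-one correction keeps the estimate strictly above zero
    return (hits + 1) / float(n_permutations + 1)


# Toy data, for illustration only (not taken from the CMS data)
discharges = [50, 120, 200, 350, 500, 800, 1200, 1500, 2000, 2500]
ratios = [1.06, 1.01, 1.04, 0.99, 1.02, 0.98, 1.00, 0.96, 0.97, 0.95]
print('r = {:.3f}'.format(pearson_r(discharges, ratios)))
print('p = {:.4f}'.format(permutation_p_value(discharges, ratios)))
```

A small p-value indicates that the association is unlikely to be chance, while the magnitude of r (and r squared) indicates how strong it is. With thousands of hospitals, even a weak correlation can be statistically significant, so both numbers matter when judging the policy recommendations.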