
This is a dataset of Assisted Living, Nursing and Residential Care facilities in Oregon that were open as of September 2016. Data were munged here.

For each facility, we have:

facility_id: Unique ID used to join to complaints
fac_ccmunumber: Unique ID used to join to ownership history
facility_type: NF - Nursing Facility; RCF - Residential Care Facility; ALF - Assisted Living Facility
fac_capacity: Number of beds the facility is licensed to have, not necessarily the number of beds it actually has.
facility_name: Facility name at the time of the September extract.
offline: Created in the munging notebook. A count of complaints that DO NOT appear when the facility is searched on the state's complaint search website.
online: Created in the munging notebook. A count of complaints that DO appear when the facility is searched on the state's complaint search website.



In [2]:

    
import pandas as pd
import numpy as np
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))



In [3]:

    
df = pd.read_csv('../../data/processed/facilities-3-29-scrape.csv')

How many facilities are there?



In [4]:

    
df.count().iloc[0]









    Out[4]:





642
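
Note that `df.count()` returns the number of non-null values in each column, so taking its first entry matches the row count only when the first column has no missing values. A minimal sketch on a made-up frame (the values are hypothetical, not from the dataset) shows the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the first column has one missing value.
toy = pd.DataFrame({
    "facility_id": ["A1", None, "C3"],
    "online": [2.0, np.nan, 5.0],
})

non_null_first_col = toy.count().iloc[0]  # non-null values in the first column
row_count = len(toy)                      # rows, regardless of missing values

print(non_null_first_col, row_count)
```

If the ID column is fully populated, as a unique join key presumably is, the two numbers agree.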

How many facilities have accurate records online?

Those that have no offline records.



In [10]:

    
df[df['offline'].isnull()].count().iloc[0]









    Out[10]:





57

How many facilities have inaccurate records online?

Those that have offline records.



In [11]:

    
df[df['offline'].notnull()].count().iloc[0]









    Out[11]:





585

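How many facilities had more than double the number of complaints shown online?

Those whose offline complaints outnumber their online complaints, so the combined total is more than twice what the website shows.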



In [12]:

    
df[(df['offline'] > df['online']) & (df['online'].notnull())].count().iloc[0]









    Out[12]:





357
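
A small sketch with made-up counts, assuming a facility's full complaint total is `online + offline`, suggests why comparing the two columns is the same as testing for "more than double":

```python
import pandas as pd

# Hypothetical complaint counts, not taken from the dataset.
toy = pd.DataFrame({
    "online": [4, 4, 4],
    "offline": [3, 4, 5],
})

total = toy["online"] + toy["offline"]
more_than_double = total > 2 * toy["online"]
offline_exceeds_online = toy["offline"] > toy["online"]

# Both masks flag only the last row (total of 9 vs. 4 shown online).
print(more_than_double.equals(offline_exceeds_online))
```

A facility with exactly as many offline as online complaints has exactly double, so it is not counted.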

How many facilities show zero complaints online but have complaints offline?



In [13]:

    
df[(df['online'].isnull()) & (df['offline'].notnull())].count().iloc[0]









    Out[13]:





59

How many facilities have complaints and are accurate online?



In [14]:

    
df[(df['online'].notnull()) & (df['offline'].isnull())].count().iloc[0]









    Out[14]:





14

How many facilities have complaints?



In [15]:

    
df[(df['online'].notnull()) | (df['offline'].notnull())].count().iloc[0]









    Out[15]:





599
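
The last few questions each pick one combination of "has online complaints" and "has offline complaints". As an alternative sketch, `pd.crosstab` can tabulate all four combinations at once. The frame below is hypothetical and assumes, as the filters above do, that a null means no complaints of that kind:

```python
import numpy as np
import pandas as pd

# Hypothetical facilities; NaN means no complaints of that kind were found.
toy = pd.DataFrame({
    "online": [3.0, np.nan, 1.0, np.nan],
    "offline": [2.0, 4.0, np.nan, np.nan],
})

# Rows: whether any complaints show online. Columns: whether any are offline.
breakdown = pd.crosstab(
    toy["online"].notnull().rename("has_online"),
    toy["offline"].notnull().rename("has_offline"),
)
print(breakdown)
```

On the real data, the same call with `df` in place of `toy` would give the four cell counts in one table.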

What percent of facilities have accurate records online?



In [16]:

    
df[df['offline'].isnull()].count().iloc[0] / df.count().iloc[0] * 100









    Out[16]:





8.8785046728971952
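
With the counts reported above, this is 57 / 642 × 100 ≈ 8.88. An equivalent way to get a percentage, sketched here on a hypothetical four-row frame, is to take the mean of the boolean mask, since `True` counts as 1 and `False` as 0:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one of four facilities has no offline complaints.
toy = pd.DataFrame({"offline": [np.nan, 2.0, 1.0, 7.0]})

# Share of rows where `offline` is null, expressed as a percentage.
pct_accurate = toy["offline"].isnull().mean() * 100
print(pct_accurate)
```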

What is the total capacity of all facilities with inaccurate records?



In [17]:

    
df.loc[df['offline'].notnull(), 'fac_capacity'].sum()









    Out[17]:





35238.0

How many facilities appear to have no complaints, whether or not they do?



In [22]:

    
df[df['online'].isnull()].count().iloc[0]









    Out[22]:





102
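
As a sanity check, the counts reported in the outputs above can be tested against each other. This sketch uses only those reported numbers and assumes that a null `online` or `offline` value means no complaints of that kind; the variable names are illustrative:

```python
# Counts reported in the notebook outputs.
total = 642            # Out[4]
no_offline = 57        # Out[10]
has_offline = 585      # Out[11]
offline_only = 59      # Out[13]
online_only = 14       # Out[14]
any_complaints = 599   # Out[15]
no_online = 102        # Out[22]

# Facilities with neither online nor offline complaints.
no_complaints = total - any_complaints

assert no_offline + has_offline == total
assert online_only + no_complaints == no_offline
assert offline_only + no_complaints == no_online
print(no_complaints)  # → 43
```

All three checks hold, so the figures are internally consistent: 43 facilities have no complaints at all, and the other 59 that appear complaint-free online do have offline complaints.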