This is a dataset of Assisted Living, Nursing and Residential Care facilities in Oregon that were open as of September 2016. Data were munged here.

For each facility, we have:

  1. facility_id: Unique ID used to join to complaints
  2. fac_ccmunumber: Unique ID used to join to ownership history
  3. facility_type: NF - Nursing Facility; RCF - Residential Care Facility; ALF - Assisted Living Facility
  4. fac_capacity: Number of beds the facility is licensed to have, not necessarily the number of beds it actually has.
  5. facility_name: Facility name at time of September extract.
  6. offline: Created in the munging notebook. A count of complaints that DO NOT appear when the facility is searched on the state's complaint search website. Null if there are none.
  7. online: Created in the munging notebook. A count of complaints that DO appear when the facility is searched on the state's complaint search website. Null if there are none.
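
The join keys in the list above could be used roughly as follows. This is a minimal sketch, assuming a hypothetical complaints table keyed by facility_id; the frames and values are made up for illustration and are not taken from the dataset, and this is not necessarily how the munging notebook built the counts.

```python
import pandas as pd

# Hypothetical toy data; not from the real extract.
facilities = pd.DataFrame({
    "facility_id": ["A1", "B2", "C3"],
    "facility_type": ["NF", "RCF", "ALF"],
})
complaints = pd.DataFrame({
    "facility_id": ["A1", "A1", "C3"],
    "complaint": ["x", "y", "z"],
})

# Count complaints per facility, then left-join so that facilities
# without any complaints are kept (their count becomes NaN, not 0).
counts = (
    complaints.groupby("facility_id")
    .size()
    .rename("n_complaints")
    .reset_index()
)
merged = facilities.merge(counts, on="facility_id", how="left")
print(merged)
```

A left join like this is one way a missing count can end up as null rather than 0, which is why the queries below test with isnull() and notnull() instead of comparing to zero.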

In [2]:
import pandas as pd
import numpy as np
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))



In [3]:
df = pd.read_csv('../../data/processed/facilities-3-29-scrape.csv')

How many facilities are there?


In [4]:
# One row per facility, so the row count is the number of facilities.
len(df)


Out[4]:
642

How many facilities have accurate records online?

Those that have no offline records.


In [10]:
# Count rows where the offline complaint count is missing.
df['offline'].isnull().sum()


Out[10]:
57

How many facilities have inaccurate records online?

Those that have offline records.


In [11]:
# Count rows with at least one complaint missing from the website.
df['offline'].notnull().sum()


Out[11]:
585

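How many facilities had more than double the number of complaints shown online?

Those whose offline complaints outnumber their online complaints, so the total is more than twice what the website shows.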


In [12]:
# offline > online means online + offline > 2 * online.
((df['offline'] > df['online']) & df['online'].notnull()).sum()


Out[12]:
357
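
The filter above relies on a small piece of arithmetic: a facility's total is online + offline, and that total exceeds twice the online count exactly when offline is greater than online. A quick sketch on made-up numbers (the toy frame is hypothetical, not from the dataset) checks that the two conditions agree, including when online is null:

```python
import pandas as pd

# Hypothetical toy counts; None stands for "no complaints of that kind".
toy = pd.DataFrame({
    "online": [2, 5, 1, None],
    "offline": [3, 5, 4, 2],
})

total = toy["online"] + toy["offline"]

# Direct statement of the question: total is more than double the online count.
more_than_double = total > 2 * toy["online"]

# The condition used in the notebook cell.
same_as_filter = (toy["offline"] > toy["online"]) & toy["online"].notnull()

print(more_than_double.tolist())
print(same_as_filter.tolist())
```

Comparisons against NaN evaluate to False in pandas, so rows with a null online count drop out of both versions.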

How many facilities show zero complaints online but have complaints offline?


In [13]:
# Nothing shown online, but at least one complaint offline.
(df['online'].isnull() & df['offline'].notnull()).sum()


Out[13]:
59

How many facilities have complaints and are accurate online?


In [14]:
# Complaints shown online and none missing from the website.
(df['online'].notnull() & df['offline'].isnull()).sum()


Out[14]:
14

How many facilities have complaints?


In [15]:
# At least one complaint of either kind.
(df['online'].notnull() | df['offline'].notnull()).sum()


Out[15]:
599

What percent of facilities have accurate records online?


In [16]:
# The mean of a boolean mask is the share of True values.
df['offline'].isnull().mean() * 100


Out[16]:
8.8785046728971952
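
As a sanity check on the percentage, the share can be computed either as a ratio of counts or as the mean of a boolean mask. A minimal sketch on a made-up column (the toy values are hypothetical), followed by the same ratio using the counts reported in Out[10] and Out[4]:

```python
import pandas as pd

# Hypothetical toy column: one of four facilities has no offline complaints.
toy = pd.DataFrame({"offline": [None, 3, 1, 2]})
mask = toy["offline"].isnull()

pct_ratio = mask.sum() / len(toy) * 100   # count of True / number of rows
pct_mean = mask.mean() * 100              # mean of booleans = share of True

# Same calculation using the counts reported in the notebook outputs.
pct_reported = 57 / 642 * 100

print(pct_ratio, pct_mean, round(pct_reported, 1))
```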

What is the total capacity of all facilities with inaccurate records?


In [17]:
# Select the capacity column first so only that numeric column is summed.
df.loc[df['offline'].notnull(), 'fac_capacity'].sum()


Out[17]:
35238.0

How many facilities appear to have no complaints, whether or not they do?


In [22]:
# Facilities for which the website shows no complaints at all.
df['online'].isnull().sum()


Out[22]:
102
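
The counts reported in the outputs above can be cross-checked against each other. The sketch below uses only the printed numbers; the variable names are ours, and the two derived figures (facilities with no complaints at all, and facilities with both online and offline complaints) follow by subtraction rather than from a separate query.

```python
total = 642                 # Out[4]: all facilities
no_offline = 57             # Out[10]: accurate records online
has_offline = 585           # Out[11]: inaccurate records online
offline_only = 59           # Out[13]: nothing online, complaints offline
online_only = 14            # Out[14]: complaints online, none offline
any_complaints = 599        # Out[15]: complaints of either kind
online_null = 102           # Out[22]: appear to have no complaints

# Derived by subtraction from the reported counts.
no_complaints = total - any_complaints       # no complaints of either kind
both = has_offline - offline_only            # complaints both online and offline

# The reported figures should partition the facilities consistently.
assert no_offline + has_offline == total
assert no_offline == online_only + no_complaints
assert online_null == no_complaints + offline_only
assert no_complaints + online_only + offline_only + both == total

print(no_complaints, both)
```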