This dataset covers assisted living, nursing and residential care facilities in Oregon that were open as of January 2017. For each facility, we have:

  1. facility_id: Unique ID used to join to complaints
  2. fac_ccmunumber: Unique ID used to join to ownership history
  3. facility_type: NF - Nursing Facility; RCF - Residential Care Facility; ALF - Assisted Living Facility
  4. fac_capacity: Number of beds the facility is licensed to have, which is not necessarily the number of beds it actually has.
  5. offline: Created in the munging notebook. A count of complaints that DO NOT appear when the facility is searched on the state's complaint search website (https://apps.state.or.us/cf2/spd/facility_complaints/).
  6. online: Created in the munging notebook. A count of complaints that DO appear when the facility is searched on the state's complaint search website (https://apps.state.or.us/cf2/spd/facility_complaints/).
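
The munging notebook itself is not shown here, so the following is only a minimal sketch of how per-facility `offline` and `online` counts could be derived from a complaint-level table. The toy data and the `is_online` flag are hypothetical and are not taken from the source data.

```python
import pandas as pd

# Hypothetical complaint-level table: one row per complaint.
# 'is_online' is an assumed flag marking whether the complaint
# appears on the state's search website.
complaints = pd.DataFrame({
    "facility_id": ["A", "A", "A", "B", "C"],
    "is_online": [True, False, False, True, False],
})

# Hypothetical facility table; facility D has no complaints at all.
facilities = pd.DataFrame({"facility_id": ["A", "B", "C", "D"]})

# Count complaints per facility, separately for each kind.
online = (complaints[complaints["is_online"]]
          .groupby("facility_id").size().rename("online"))
offline = (complaints[~complaints["is_online"]]
           .groupby("facility_id").size().rename("offline"))

# Join on facility_id; facilities with no complaints of a kind get NaN.
toy = facilities.join(offline, on="facility_id").join(online, on="facility_id")
print(toy)
```

In this sketch a facility with no complaints of a given kind ends up with NaN rather than 0, which is consistent with the `isnull()` / `notnull()` tests used in the cells below.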

In [1]:
import pandas as pd
import numpy as np
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))



In [2]:
df = pd.read_csv('/Users/fzarkhin/OneDrive - Advance Central Services, Inc/fproj/github/database-story/data/processed/facilities.csv')

How many facilities have accurate records online?

Those with no offline complaints, i.e. a null value in the offline column.


In [3]:
# Count rows where offline is null (no complaints hidden from the website)
df['offline'].isnull().sum()


Out[3]:
57

How many facilities have inaccurate records online?

Those that have offline records.


In [4]:
# Count rows with at least one offline complaint
df['offline'].notnull().sum()


Out[4]:
585

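How many facilities had more than double the number of complaints shown online?

A facility's total is its online plus offline complaints, so the total is more than double the online count exactly when offline exceeds online. Facilities showing no complaints online are excluded here and counted in the next question.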


In [5]:
((df['offline'] > df['online']) & df['online'].notnull()).sum()


Out[5]:
357
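
As a quick check of the reasoning behind this comparison, the sketch below uses a small made-up frame (not the real data) to confirm that testing `offline > online` selects the same rows as testing whether the total exceeds twice the online count. Treating a null as zero complaints when computing the total is an assumption.

```python
import pandas as pd

# Made-up counts; None becomes NaN and stands for "no complaints of this kind".
toy = pd.DataFrame({
    "online": [1.0, 3.0, 2.0, None],
    "offline": [2.0, 3.0, None, 4.0],
})

# Total complaints per facility, treating NaN as zero (assumption).
total = toy["offline"].fillna(0) + toy["online"].fillna(0)

# Direct statement of the question: total is more than double the online count.
more_than_double = (total > 2 * toy["online"]) & toy["online"].notnull()

# The comparison used in the notebook cell.
notebook_mask = (toy["offline"] > toy["online"]) & toy["online"].notnull()

print((more_than_double == notebook_mask).all())
print(notebook_mask.sum())
```

Comparisons against NaN evaluate to False in pandas, so rows with a missing count on either side drop out of both masks.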

How many facilities show zero complaints online but have complaints offline?


In [6]:
(df['online'].isnull() & df['offline'].notnull()).sum()


Out[6]:
59

How many facilities have complaints, all of which appear online?


In [7]:
(df['online'].notnull() & df['offline'].isnull()).sum()


Out[7]:
14

How many facilities have complaints?


In [8]:
(df['online'].notnull() | df['offline'].notnull()).sum()


Out[8]:
599

What percent of facilities have accurate records online?


In [9]:
# Share of all facilities with no offline complaints, as a percentage
df['offline'].isnull().sum() / len(df) * 100


Out[9]:
8.8785046728971952
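
The counts reported so far can be cross-checked against each other with plain arithmetic. The sketch below only re-uses the outputs printed above; the variable names are illustrative.

```python
# Outputs copied from the cells above.
accurate = 57                   # Out[3]: no offline complaints
inaccurate = 585                # Out[4]: at least one offline complaint
accurate_with_complaints = 14   # Out[7]: online complaints only
with_complaints = 599           # Out[8]: any complaints at all

# Every facility is either accurate or inaccurate.
total_facilities = accurate + inaccurate

# Facilities with complaints are either inaccurate or accurate-with-complaints.
assert with_complaints == inaccurate + accurate_with_complaints

# Accurate facilities either have no complaints or only online ones.
no_complaints = total_facilities - with_complaints
assert accurate == no_complaints + accurate_with_complaints

pct_accurate = accurate / total_facilities * 100
print(total_facilities, no_complaints, round(pct_accurate, 2))
```

The percentage matches Out[9], and the implied number of facilities with no complaints at all is the total minus Out[8].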

What is the total capacity of all facilities with inaccurate records?


In [10]:
# Select the capacity column before summing so text columns are not summed
df.loc[df['offline'].notnull(), 'fac_capacity'].sum()


Out[10]:
35238.0

In [11]:
# Check for facilities with a missing capacity; an empty result means none are missing
df[df['fac_capacity'].isnull()]


Out[11]:
facility_id fac_ccmunumber facility_type fac_capacity facility_name offline online
