Read Washington Medicaid Fee Schedules

The Washington state Health Care Authority website for fee schedules is here.

  • Fee schedules come in Excel format
  • Fee schedules are usually biannual (January and July)
  • Publicly available fee schedules go back to January 2011

However, Washington's Medicaid fee schedules are a pain in the ass. They are publicly available as Microsoft Excel files but...

  • File names are not systematic
  • They do not read directly into R nicely (using either the readxl or xlsx packages)
  • Data lines start at different rows

All these issues makes codifying difficult. As a workaround, the following steps were taken.

  1. Excel files are saved locally
  2. Excel files are converted to CSV
  3. CSV files are version controlled in this repository (since they are not large)
  4. CSV files are read into R

The first 3 steps were done manually. The SHA for the commit of the CSV files is 5bde7f3e33e0c83bdace0ed0cf04553a41a8efb1 (5/5/2016). Step 4 is below.


In [1]:
files <- paste("Data", list.files(file.path(getwd(), "Data")), sep="/")
files


Out[1]:
  1. "Data/HCA_PREOH_January_1_2013.csv"
  2. "Data/physician_010114.csv"
  3. "Data/physician_010115.csv"
  4. "Data/physician_010116.csv"
  5. "Data/physician_040115.csv"
  6. "Data/physician_040116.csv"
  7. "Data/physician_070114.csv"
  8. "Data/physician_070115.csv"
  9. "Data/physician_100115.csv"
  10. "Data/preoh_010112.csv"
  11. "Data/preoh_01012011.csv"
  12. "Data/preoh_070112.csv"
  13. "Data/preoh_070113.csv"
  14. "Data/preoh_07012011.csv"

In [2]:
library(data.table)
readFS <- function (f, skip) {
    require(data.table, quietly=TRUE)
    for (i in 11:16) {if (grepl(sprintf("%d\\.csv", i), f)) {year <- as.numeric(sprintf("20%d", i))}}
    for (i in 1:12) {
        monname <- format(as.Date(sprintf("%d-%d-01", year, i)), format="%B")
        if (grepl(sprintf("_%02d", i), f) | grepl(tolower(monname), f, ignore.case=TRUE)) {
            mm <- i
        }
    }
    colClasses <- rep("character", 9)
    D <- data.table(read.csv(f, header=FALSE, colClasses=colClasses, skip=skip, na.strings=c(""), strip.white=TRUE))
    old <- names(D)
    keep <- c("code_status_indicator",
              "code",
              "mod",
              "nfs_maximum_allowable",
              "fs_maximum_allowable",
              "pa_required",
              "global_days",
              "comments")
    if (length(old) > length(keep)) {new <- c(keep, old[(length(keep) + 1):length(old)])}
    else {new <- keep}
    setnames(D, old, new)
    D <- D[, effective_date := as.Date(sprintf("%d-%d-01", year, mm))]
    D[, c(keep, "effective_date"), with=FALSE]
}

In [3]:
fs <- rbindlist(list(readFS(file.path(getwd(), "Data/HCA_PREOH_January_1_2013.csv"), 9),
                     readFS(file.path(getwd(), "Data/physician_010114.csv"), 9),
                     readFS(file.path(getwd(), "Data/physician_010115.csv"), 9),
                     readFS(file.path(getwd(), "Data/physician_010116.csv"), 10),
                     readFS(file.path(getwd(), "Data/physician_040115.csv"), 9),
                     readFS(file.path(getwd(), "Data/physician_040116.csv"), 10),
                     readFS(file.path(getwd(), "Data/physician_070114.csv"), 9),
                     readFS(file.path(getwd(), "Data/physician_070115.csv"), 10),
                     readFS(file.path(getwd(), "Data/physician_100115.csv"), 10),
                     readFS(file.path(getwd(), "Data/preoh_010112.csv"), 6),
                     readFS(file.path(getwd(), "Data/preoh_01012011.csv"), 6),
                     readFS(file.path(getwd(), "Data/preoh_070112.csv"), 9),
                     readFS(file.path(getwd(), "Data/preoh_070113.csv"), 9),
                     readFS(file.path(getwd(), "Data/preoh_07012011.csv"), 6)))
str(fs)


Classes 'data.table' and 'data.frame':	164912 obs. of  9 variables:
 $ code_status_indicator: chr  NA NA NA NA ...
 $ code                 : chr  "00100" "00102" "00103" "00104" ...
 $ mod                  : chr  NA NA NA NA ...
 $ nfs_maximum_allowable: chr  "5 Base" "6 Base" "5 Base" "4 Base" ...
 $ fs_maximum_allowable : chr  "5 Base" "6 Base" "5 Base" "4 Base" ...
 $ pa_required          : chr  NA NA NA NA ...
 $ global_days          : chr  "000" "000" "000" "000" ...
 $ comments             : chr  "N" "N" "N" "N" ...
 $ effective_date       : Date, format: "2013-01-01" "2013-01-01" ...
 - attr(*, ".internal.selfref")=<externalptr> 

In [4]:
fs[, .N, effective_date][order(effective_date)]


Out[4]:
effective_dateN
12011-01-0111697
22011-07-0111518
32012-01-0111768
42012-07-0111606
52013-01-0111832
62013-07-0111656
72014-01-0111803
82014-07-0111743
92015-01-0112027
102015-04-0111835
112015-07-0111837
122015-10-0111841
132016-01-0111958
142016-04-0111791

In [5]:
head(fs)
tail(fs)


Out[5]:
code_status_indicatorcodemodnfs_maximum_allowablefs_maximum_allowablepa_requiredglobal_dayscommentseffective_date
1NA00100NA5 Base5 BaseNA000N2013-01-01
2NA00102NA6 Base6 BaseNA000N2013-01-01
3NA00103NA5 Base5 BaseNA000N2013-01-01
4NA00104NA4 Base4 BaseNA000N2013-01-01
5NA00120NA5 Base5 BaseNA000N2013-01-01
6NA00124NA4 Base4 BaseNA000N2013-01-01
Out[5]:
code_status_indicatorcodemodnfs_maximum_allowablefs_maximum_allowablepa_requiredglobal_dayscommentseffective_date
1NAV2631NA342.42342.42NA000N2011-07-01
2NAV2632NA342.42342.42NA000N2011-07-01
3NAV2785NA##NA000N2011-07-01
4NAV2787NA##NA000N2011-07-01
5NAV2788NA##NA000N2011-07-01
6NAV2799NA##NA000N2011-07-01