Arterial line study

This notebook reproduces the arterial line study in MIMIC-III. The following is an outline of the notebook:

  1. Generate the necessary materialized views in SQL
  2. Combine the materialized views into a single dataframe
  3. Write the data to file for use in R

The R code then evaluates whether an arterial line is associated with mortality after propensity matching.

Note that the original arterial line study used a genetic algorithm to select the covariates in the propensity score. We omit the genetic algorithm step, and instead use the final set of covariates described by the authors. For more detail, see:

Hsu DJ, Feng M, Kothari R, Zhou H, Chen KP, Celi LA. The association between indwelling arterial catheters and mortality in hemodynamically stable patients with respiratory failure: a propensity score analysis. Chest. 2015 Dec 1;148(6):1470-6.


In [ ]:
# Install Python dependencies. This only needs to be run once for each new notebook instance.
!pip install PyAthena

In [1]:
# __future__ imports must come before any other statement in the cell
from __future__ import print_function

from pyathena import connect
from pyathena.util import as_pandas

# Import libraries
import datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import boto3
from botocore.client import ClientError
# below is used to print out pretty pandas dataframes
from IPython.display import display, HTML
%matplotlib inline


s3 = boto3.resource('s3')
client = boto3.client("sts")
account_id = client.get_caller_identity()["Account"]
my_session = boto3.session.Session()
region = my_session.region_name
athena_query_results_bucket = 'aws-athena-query-results-'+account_id+'-'+region

try:
    s3.meta.client.head_bucket(Bucket=athena_query_results_bucket)
except ClientError:
    bucket = s3.create_bucket(Bucket=athena_query_results_bucket)
    print('Creating bucket '+athena_query_results_bucket)
cursor = connect(s3_staging_dir='s3://'+athena_query_results_bucket+'/athena/temp').cursor()


# The Glue database name of your MIMIC-III parquet data
gluedatabase="mimiciii"

# location of the queries to generate aline specific materialized views
aline_path = './'

# location of the queries to generate materialized views from the MIMIC code repository
concepts_path = './concepts/'

1 - Generate materialized views

Before generating the aline cohort, we require the following materialized views to be already generated:

  • angus_sepsis - from angus-awsathena.sql
  • heightweight - from HeightWeightQuery-awsathena.sql
  • aline_vaso_flag - from aline_vaso_flag-awsathena.sql
  • ventsettings - from ventilation-settings-awsathena.sql
  • ventdurations - from ventilation-durations-awsathena.sql

You can generate the above by executing the code block below. If you haven't changed the directory structure, it should work as-is; otherwise you may need to modify the concepts_path variable above.


In [2]:
# Load in the query from file
query='DROP TABLE IF EXISTS DATABASE.angus_sepsis;'
cursor.execute(query.replace("DATABASE", gluedatabase))
f = os.path.join(concepts_path,'sepsis/angus-awsathena.sql')
with open(f) as fp:
    query = ''.join(fp.readlines())
    
# Execute the query
print('Generating table \'angus_sepsis\' using {} ...'.format(f),end=' ')
cursor.execute(query.replace("DATABASE", gluedatabase))
print('done.')

# Load in the query from file
query='DROP TABLE IF EXISTS DATABASE.heightweight;'
cursor.execute(query.replace("DATABASE", gluedatabase))
f = os.path.join(concepts_path,'demographics/HeightWeightQuery-awsathena.sql')
with open(f) as fp:
    query = ''.join(fp.readlines())
    
# Execute the query
print('Generating table \'heightweight\' using {} ...'.format(f),end=' ')
cursor.execute(query.replace("DATABASE", gluedatabase))
print('done.')


# Load in the query from file
query='DROP TABLE IF EXISTS DATABASE.aline_vaso_flag;'
cursor.execute(query.replace("DATABASE", gluedatabase))
f = os.path.join(aline_path,'aline_vaso_flag-awsathena.sql')
with open(f) as fp:
    query = ''.join(fp.readlines())
    
# Execute the query
print('Generating table \'aline_vaso_flag\' using {} ...'.format(f),end=' ')
cursor.execute(query.replace("DATABASE", gluedatabase))
print('done.')


# Load in the query from file
query='DROP TABLE IF EXISTS DATABASE.ventsettings;'
cursor.execute(query.replace("DATABASE", gluedatabase))
f = os.path.join(concepts_path,'durations/ventilation-settings-awsathena.sql')
with open(f) as fp:
    query = ''.join(fp.readlines())
    
# Execute the query
print('Generating table \'vent_settings\' using {} ...'.format(f),end=' ')
cursor.execute(query.replace("DATABASE", gluedatabase))
print('done.')


# Load in the query from file
query='DROP TABLE IF EXISTS DATABASE.ventdurations;'
cursor.execute(query.replace("DATABASE", gluedatabase))
f = os.path.join(concepts_path,'durations/ventilation-durations-awsathena.sql')
with open(f) as fp:
    query = ''.join(fp.readlines())
    
# Execute the query
print('Generating table \'vent_durations\' using {} ...'.format(f),end=' ')
cursor.execute(query.replace("DATABASE", gluedatabase))
print('done.')


Generating table 'angus_sepsis' using ./concepts/sepsis/angus-awsathena.sql ... done.
Generating table 'heightweight' using ./concepts/demographics/HeightWeightQuery-awsathena.sql ... done.
Generating table 'aline_vaso_flag' using ./aline_vaso_flag-awsathena.sql ... done.
Generating table 'vent_settings' using ./concepts/durations/ventilation-settings-awsathena.sql ... done.
Generating table 'vent_durations' using ./concepts/durations/ventilation-durations-awsathena.sql ... done.

Now we generate the aline_cohort_all table using aline_cohort-awsathena.sql, and then the final aline_cohort table using aline_final_cohort-awsathena.sql.

Afterwards, we can generate the remaining 6 materialized views in any order, as they each depend only on aline_cohort and raw MIMIC-III data.


In [3]:
# Load in the query from file
query='DROP TABLE IF EXISTS DATABASE.aline_cohort_all;'
cursor.execute(query.replace("DATABASE", gluedatabase))
f = os.path.join(aline_path,'aline_cohort-awsathena.sql')
with open(f) as fp:
    query = ''.join(fp.readlines())
    
# Execute the query
print('Generating table \'aline_cohort_all\' using {} ...'.format(f),end=' ')
cursor.execute(query.replace("DATABASE", gluedatabase))
print('done.')


# Load in the query from file
query='DROP TABLE IF EXISTS DATABASE.aline_cohort;'
cursor.execute(query.replace("DATABASE", gluedatabase))
f = os.path.join(aline_path,'aline_final_cohort-awsathena.sql')
with open(f) as fp:
    query = ''.join(fp.readlines())
    
# Execute the query
print('Generating table \'aline_cohort\' using {} ...'.format(f),end=' ')
cursor.execute(query.replace("DATABASE", gluedatabase))
print('done.')


Generating table 'aline_cohort_all' using ./aline_cohort-awsathena.sql ... done.
Generating table 'aline_cohort' using ./aline_final_cohort-awsathena.sql ... done.

In [4]:
query = """
select
icustay_id
, exclusion_readmission
, exclusion_shortstay
, exclusion_vasopressors
, exclusion_septic
, exclusion_aline_before_admission
, exclusion_not_ventilated_first24hr
, exclusion_service_surgical
from DATABASE.aline_cohort_all
"""

cursor.execute(query.replace("DATABASE", gluedatabase))
# Load the result of the query into a dataframe
df = as_pandas(cursor)

# print out exclusions
idxRem = df['icustay_id'].isnull()
for c in df.columns:
    if 'exclusion_' in c:
        print('{:5d} - {}'.format(df[c].sum(), c))
        idxRem[df[c]==1] = True   
        
# total number of patients excluded by any criterion
print('Will remove {} of {} patients.'.format(np.sum(idxRem), df.shape[0]))


print('')
print('')
print('Reproducing the flow of the flowchart from Chest paper.')

# first stay
idxRem = (df['exclusion_readmission']==1) | (df['exclusion_shortstay']==1)
print('{:5d} - removing {:5d} ({:2.2f}%) patients - short stay // readmission.'.format(
        df.shape[0], np.sum(idxRem), 100.0*np.mean(idxRem)))
df = df.loc[~idxRem,:]

idxRem = df['exclusion_not_ventilated_first24hr']==1
print('{:5d} - removing {:5d} ({:2.2f}%) patients - not ventilated in first 24 hours.'.format(
        df.shape[0], np.sum(idxRem), 100.0*np.mean(idxRem)))

df = df.loc[df['exclusion_not_ventilated_first24hr']==0,:]

print('{:5d}'.format(df.shape[0]))
idxRem = df['icustay_id'].isnull()
for c in ['exclusion_septic', 'exclusion_vasopressors',
            'exclusion_aline_before_admission', 'exclusion_service_surgical']:
    print('{:5s} - removing {:5d} ({:2.2f}%) patients - additional {:5d} {:2.2f}% - {}'.format(
            '', df[c].sum(), 100.0*df[c].mean(),
            np.sum((idxRem==0)&(df[c]==1)), 100.0*np.mean((idxRem==0)&(df[c]==1)),
            c))
    idxRem = idxRem | (df[c]==1)

df = df.loc[~idxRem,:]
print('{} - final cohort.'.format(df.shape[0]))


14024 - exclusion_readmission
 8948 - exclusion_shortstay
17399 - exclusion_vasopressors
17007 - exclusion_septic
12961 - exclusion_aline_before_admission
29449 - exclusion_not_ventilated_first24hr
15499 - exclusion_service_surgical
Will remove 49907 of 52430 patients.


Reproducing the flow of the flowchart from Chest paper.
52430 - removing 20816 (39.70%) patients - short stay // readmission.
31614 - removing 15738 (49.78%) patients - not ventilated in first 24 hours.
15876
      - removing  5717 (36.01%) patients - additional  5717 36.01% - exclusion_septic
      - removing  9110 (57.38%) patients - additional  5751 36.22% - exclusion_vasopressors
      - removing  7601 (47.88%) patients - additional  1598 10.07% - exclusion_aline_before_admission
      - removing  6747 (42.50%) patients - additional   287 1.81% - exclusion_service_surgical
2523 - final cohort.
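The accounting above distinguishes each flag's total count from the patients it *additionally* removes once earlier flags have been applied. A minimal sketch of that cumulative logic on synthetic flags (toy data, not MIMIC):

```python
import pandas as pd

# Synthetic exclusion flags for four ICU stays (toy data, not MIMIC)
flags = pd.DataFrame({
    'excl_a': [1, 0, 0, 1],
    'excl_b': [0, 1, 0, 1],
})

removed = pd.Series(False, index=flags.index)
for c in flags.columns:
    total = int(flags[c].sum())                             # stays with this flag set
    additional = int(((~removed) & (flags[c] == 1)).sum())  # newly removed by this flag
    removed = removed | (flags[c] == 1)
    print('{}: total {}, additional {}'.format(c, total, additional))

print('final cohort size:', int((~removed).sum()))  # 1
```

Here excl_b has a total of 2 but only removes 1 additional stay, because one of its stays was already excluded by excl_a.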

The following code block loads the SQL from each file in the aline subfolder and executes it to generate the corresponding materialized view. We specifically exclude the cohort and vasopressor-flag queries, as they were already executed above. Again, the order of query execution does not matter here. Note also that, for convenience, each filename matches the name of the materialized view it creates.
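Each table name is recovered from its filename by splitting on the first hyphen, which is the pattern the loop below relies on:

```python
# Filenames follow the pattern <table>-awsathena.sql;
# everything before the first hyphen is the table name.
filename = 'aline_bmi-awsathena.sql'
table = filename.split('-')[0]
print(table)  # aline_bmi
```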


In [5]:
# get a list of all files in the subfolder
aline_queries = [f for f in os.listdir(aline_path)
                 # only keep the filename if it is actually a file (and not a directory)
                 if os.path.isfile(os.path.join(aline_path, f))
                 # and only keep the filename if it is an SQL file
                 and f.endswith('.sql')
                 # and we do *not* want the cohort/vaso queries - they were generated above
                 and f not in ('aline_cohort-awsathena.sql',
                               'aline_final_cohort-awsathena.sql',
                               'aline_vaso_flag-awsathena.sql')]



for f in aline_queries:
    # Load in the query from file
    table=f.split('-')
    query='DROP TABLE IF EXISTS DATABASE.{};'.format(table[0])
    cursor.execute(query.replace("DATABASE", gluedatabase))
    print('Executing {} ...'.format(f), end=' ')
    
    with open(os.path.join(aline_path,f)) as fp:
        query = ''.join(fp.readlines())
    cursor.execute(query.replace("DATABASE", gluedatabase))
    print('done.')


Executing aline_bmi-awsathena.sql ... done.
Executing aline_sofa-awsathena.sql ... done.
Executing aline_labs-awsathena.sql ... done.
Executing aline_sedatives-awsathena.sql ... done.
Executing aline_icd-awsathena.sql ... done.
Executing aline_vitals-awsathena.sql ... done.

2 - Extract all covariates and outcome measures

Having summarized the cohort exclusions, we now aggregate all the data from the various views into a single dataframe.
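The SQL below assembles the dataframe with a series of left joins onto the cohort table. The same pattern in pandas terms, on toy data with illustrative columns:

```python
import pandas as pd

cohort = pd.DataFrame({'icustay_id': [1, 2, 3], 'aline_flag': [1, 0, 1]})
labs = pd.DataFrame({'icustay_id': [1, 3], 'wbc_first': [11.3, 8.0]})

# LEFT JOIN keeps every cohort row; stays without a labs row get NaN
merged = cohort.merge(labs, on='icustay_id', how='left')
print(len(merged), int(merged['wbc_first'].isnull().sum()))  # 3 1
```

This is why the covariate columns in the final dataframe can have fewer non-null values than the cohort has rows.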


In [6]:
# Load in the query from file
query = """
--FINAL QUERY
select
  co.subject_id, co.hadm_id, co.icustay_id

  -- static variables from patient tracking tables
  , co.age
  , co.gender
  -- , co.gender_num -- gender, 0=F, 1=M
  , co.intime as icustay_intime
  , co.day_icu_intime -- day of week, text
  --, co.day_icu_intime_num -- day of week, numeric (0=Sun, 6=Sat)
  , co.hour_icu_intime -- hour of ICU admission (24 hour clock)
  , case 
      when co.hour_icu_intime >= 7
       and co.hour_icu_intime < 19
         then 1
      else 0
    end as icu_hour_flag
  , co.outtime as icustay_outtime

  -- outcome variables
  , co.icu_los_day
  , co.hospital_los_day
  , co.hosp_exp_flag -- 1/0 patient died within current hospital stay
  , co.icu_exp_flag -- 1/0 patient died within current ICU stay
  , co.mort_day -- days from ICU admission to mortality, if they died
  , co.day_28_flag -- 1/0 whether the patient died 28 days after *ICU* admission
  , co.mort_day_censored -- days until patient died *or* 150 days (150 days is our censor time)
  , co.censor_flag -- 1/0 did this patient have 150 imputed in mort_day_censored

  -- aline flags
  -- , co.initial_aline_flag -- always 0, we remove patients admitted w/ aline
  , co.aline_flag -- 1/0 did the patient receive an aline
  , co.aline_time_day -- if the patient received aline, fractional days until aline put in

  -- demographics extracted using regex + echos
  , bmi.weight as weight_first
  , bmi.height as height_first
  , bmi.bmi

  -- service patient was admitted to the ICU under
  , co.service_unit

  -- severity of illness just before ventilation
  , so.sofa as sofa_first

  -- vital sign value just preceding ventilation
  , vi.map as map_first
  , vi.heartrate as hr_first
  , vi.temperature as temp_first
  , vi.spo2 as spo2_first

  -- labs!
  , labs.bun_first
  , labs.creatinine_first
  , labs.chloride_first
  , labs.hgb_first
  , labs.platelet_first
  , labs.potassium_first
  , labs.sodium_first
  , labs.tco2_first
  , labs.wbc_first

  -- comorbidities extracted using ICD-9 codes
  , icd.chf as chf_flag
  , icd.afib as afib_flag
  , icd.renal as renal_flag
  , icd.liver as liver_flag
  , icd.copd as copd_flag
  , icd.cad as cad_flag
  , icd.stroke as stroke_flag
  , icd.malignancy as malignancy_flag
  , icd.respfail as respfail_flag
  , icd.endocarditis as endocarditis_flag
  , icd.ards as ards_flag
  , icd.pneumonia as pneumonia_flag

  -- sedative use
  , sed.sedative_flag
  , sed.midazolam_flag
  , sed.fentanyl_flag
  , sed.propofol_flag
  
from DATABASE.aline_cohort co
-- The following tables are generated by code within this repository
left join DATABASE.aline_sofa so
on co.icustay_id = so.icustay_id
left join DATABASE.aline_bmi bmi
  on co.icustay_id = bmi.icustay_id
left join DATABASE.aline_icd icd
  on co.hadm_id = icd.hadm_id
left join DATABASE.aline_vitals vi
  on co.icustay_id = vi.icustay_id
left join DATABASE.aline_labs labs
  on co.icustay_id = labs.icustay_id
left join DATABASE.aline_sedatives sed
  on co.icustay_id = sed.icustay_id
order by co.icustay_id
"""

cursor.execute(query.replace("DATABASE", gluedatabase))
# Load the result of the query into a dataframe
df = as_pandas(cursor)
df.describe().T


Out[6]:
count mean std min 25% 50% 75% max
subject_id 2523.0 41200.547364 29712.636902 22.000000 15641.000000 31269.000000 66535.000000 99881.000000
hadm_id 2523.0 149881.509314 29258.187301 100016.000000 124222.000000 150164.000000 175300.000000 199962.000000
icustay_id 2523.0 250829.827190 28926.610595 200019.000000 226843.000000 251681.000000 275897.000000 299995.000000
age 2523.0 64.544657 50.086406 16.203184 42.349852 57.126036 73.741161 300.052602
hour_icu_intime 2523.0 12.770908 7.533683 0.000000 5.000000 14.000000 19.000000 23.000000
icu_hour_flag 2523.0 0.404677 0.490927 0.000000 0.000000 0.000000 1.000000 1.000000
icu_los_day 2523.0 3.657932 3.403448 1.000579 1.709288 2.525197 4.355093 37.304780
hospital_los_day 2523.0 8.445863 7.819610 0.038194 3.807292 6.450000 10.527083 123.687500
hosp_exp_flag 2523.0 0.131193 0.337678 0.000000 0.000000 0.000000 0.000000 1.000000
icu_exp_flag 2523.0 0.090765 0.287332 0.000000 0.000000 0.000000 0.000000 1.000000
mort_day 910.0 429.589134 690.777814 0.000694 5.094097 62.694097 558.330556 3731.972222
day_28_flag 2523.0 0.155767 0.362706 0.000000 0.000000 0.000000 0.000000 1.000000
mort_day_censored 2523.0 250.842692 435.912143 0.000694 150.000000 150.000000 150.000000 3731.972222
censor_flag 2523.0 0.639318 0.480294 0.000000 0.000000 1.000000 1.000000 1.000000
aline_flag 2523.0 0.514863 0.499878 0.000000 0.000000 1.000000 1.000000 1.000000
aline_time_day 1299.0 0.286686 0.710809 0.000694 0.018056 0.066667 0.246528 11.843750
weight_first 2446.0 80.868029 26.924396 1.000000 65.000000 77.300000 91.100000 710.000000
height_first 898.0 169.866036 17.067155 15.240000 162.560000 170.180000 177.800000 444.500000
bmi 898.0 0.003432 0.014497 0.000815 0.002325 0.002683 0.003150 0.434431
sofa_first 2523.0 4.365834 1.996958 0.000000 3.000000 4.000000 5.000000 15.000000
map_first 2523.0 84.808427 16.846600 -6.000000 73.000000 83.666702 95.000000 174.000000
hr_first 2523.0 87.056282 19.476617 0.000000 74.000000 85.000000 99.000000 174.000000
temp_first 2513.0 36.807150 0.908248 32.222222 36.222221 36.777778 37.388889 40.444446
spo2_first 2520.0 98.641270 3.020225 47.000000 98.000000 100.000000 100.000000 100.000000
bun_first 2474.0 20.187146 15.136237 1.000000 12.000000 16.000000 23.000000 139.000000
creatinine_first 2474.0 1.140097 1.168660 0.100000 0.700000 0.900000 1.100000 18.800000
chloride_first 2481.0 104.074970 5.933029 56.000000 101.000000 104.000000 107.000000 129.000000
hgb_first 2479.0 12.220976 2.235188 4.100000 10.700000 12.300000 13.800000 19.800000
platelet_first 2471.0 240.770943 101.345871 6.000000 177.000000 231.000000 294.000000 1313.000000
potassium_first 2485.0 4.065634 0.681857 1.800000 3.600000 4.000000 4.400000 9.200000
sodium_first 2481.0 139.494156 4.874810 74.000000 137.000000 140.000000 142.000000 165.000000
tco2_first 2497.0 25.123348 5.209551 3.000000 22.000000 25.000000 28.000000 53.000000
wbc_first 2470.0 12.232874 6.163138 0.200000 8.000000 11.300000 15.200000 94.000000
chf_flag 2523.0 0.070947 0.256788 0.000000 0.000000 0.000000 0.000000 1.000000
afib_flag 2523.0 0.138327 0.345312 0.000000 0.000000 0.000000 0.000000 1.000000
renal_flag 2523.0 0.065002 0.246578 0.000000 0.000000 0.000000 0.000000 1.000000
liver_flag 2523.0 0.057868 0.233539 0.000000 0.000000 0.000000 0.000000 1.000000
copd_flag 2523.0 0.015458 0.123389 0.000000 0.000000 0.000000 0.000000 1.000000
cad_flag 2523.0 0.093539 0.291245 0.000000 0.000000 0.000000 0.000000 1.000000
stroke_flag 2523.0 0.158938 0.365691 0.000000 0.000000 0.000000 0.000000 1.000000
malignancy_flag 2523.0 0.156163 0.363082 0.000000 0.000000 0.000000 0.000000 1.000000
respfail_flag 2523.0 0.365042 0.481537 0.000000 0.000000 0.000000 1.000000 1.000000
endocarditis_flag 2523.0 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
ards_flag 2523.0 0.017043 0.129458 0.000000 0.000000 0.000000 0.000000 1.000000
pneumonia_flag 2523.0 0.177963 0.382557 0.000000 0.000000 0.000000 0.000000 1.000000
sedative_flag 2523.0 0.211256 0.408281 0.000000 0.000000 0.000000 0.000000 1.000000
midazolam_flag 2523.0 0.024970 0.156065 0.000000 0.000000 0.000000 0.000000 1.000000
fentanyl_flag 2523.0 0.040824 0.197922 0.000000 0.000000 0.000000 0.000000 1.000000
propofol_flag 2523.0 0.187872 0.390687 0.000000 0.000000 0.000000 0.000000 1.000000

Now we need to remove obvious outliers and correct the anonymized ages: MIMIC-III shifts the ages of patients older than 89 to around 300, so we replace ages > 89 with 91.4, the median age of that group.
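As a toy illustration of this correction (synthetic ages; 91.4 per the convention described above):

```python
import pandas as pd

# Ages > 89 are shifted to ~300 by MIMIC-III de-identification;
# map them back to 91.4, the median age of the > 89 group.
age = pd.Series([45.0, 67.2, 300.1, 88.5, 302.5])
age.loc[age > 89] = 91.4
print(age.tolist())  # [45.0, 67.2, 91.4, 88.5, 91.4]
```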


In [7]:
# plot the distribution of each numeric column
for col in df.columns:
    if df.dtypes[col] in ('int64','float64'):
        plt.figure(figsize=[12,6])
        plt.hist(df[col].dropna(), bins=50, density=True)
        plt.xlabel(col,fontsize=24)
        plt.show()

In [8]:
# apply corrections
df.loc[df['age']>89, 'age'] = 91.4

3 - Write to file


In [9]:
df.to_csv('aline_data.csv',index=False)

4 - Create a propensity score using this data

We will create the propensity score using R in the Jupyter Notebook file aline_propensity_score.ipynb.
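The matching itself happens in the R notebook; as a rough sketch of what 1:1 greedy nearest-neighbor matching on propensity scores looks like (synthetic scores and a deliberately simple algorithm, not the paper's genetic-algorithm-derived model):

```python
import numpy as np

# Hypothetical propensity scores: P(receiving an aline)
treated_scores = np.array([0.30, 0.62, 0.85])
control_scores = np.array([0.10, 0.33, 0.58, 0.80, 0.90])

# Greedy 1:1 nearest-neighbor matching without replacement:
# each treated stay is paired with the closest still-available control.
available = np.ones(len(control_scores), dtype=bool)
matches = []
for i, s in enumerate(treated_scores):
    dist = np.abs(control_scores - s)
    dist[~available] = np.inf       # already-matched controls are ineligible
    j = int(np.argmin(dist))
    available[j] = False
    matches.append((i, j))

print(matches)  # [(0, 1), (1, 2), (2, 3)]
```

In practice one would first fit a model (e.g. logistic regression) to estimate the scores from the covariates extracted above, then check covariate balance in the matched sample.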