medication

The medications table reflects the active medication orders for patients. These are orders but do not necessarily reflect administration to the patient. For example, while existence of data in the infusionDrug table confirms a patient received a continuous infusion, existence of the same data in this table only indicates that the infusion was ordered for the patient. Most orders are fulfilled, but not all. Furthermore, many orders are done pro re nata, or PRN, which means "when needed". Administration of these orders is difficult to quantify.

In the US, all orders must be reviewed by a pharmacist. The majority of hospitals have an HL7 medication interface system in place which automatically synchronizes the orders with eCareManager (the source of this database) as they are verified by the pharmacist in the source pharmacy system. For hospitals without a medication interface, the eICU staff may enter a selection of medications to facilitate population management and completeness for reporting purposes.



In [1]:

    
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import psycopg2
import getpass
import pdvega

# for configuring connection 
from configobj import ConfigObj
import os

%matplotlib inline









    



/home/alistairewj/.local/lib/python3.5/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
  """)



In [2]:

    
# Create a database connection using settings from config file
config='../db/config.ini'

# connection info
conn_info = dict()
if os.path.isfile(config):
    config = ConfigObj(config)
    conn_info["sqluser"] = config['username']
    conn_info["sqlpass"] = config['password']
    conn_info["sqlhost"] = config['host']
    conn_info["sqlport"] = config['port']
    conn_info["dbname"] = config['dbname']
    conn_info["schema_name"] = config['schema_name']
else:
    conn_info["sqluser"] = 'postgres'
    conn_info["sqlpass"] = ''
    conn_info["sqlhost"] = 'localhost'
    conn_info["sqlport"] = 5432
    conn_info["dbname"] = 'eicu'
    conn_info["schema_name"] = 'public,eicu_crd'
    
# Connect to the eICU database
print('Database: {}'.format(conn_info['dbname']))
print('Username: {}'.format(conn_info["sqluser"]))
if conn_info["sqlpass"] == '':
    # try connecting without password, i.e. peer or OS authentication
    try:
        if (conn_info["sqlhost"] == 'localhost') & (conn_info["sqlport"]=='5432'):
            con = psycopg2.connect(dbname=conn_info["dbname"],
                                   user=conn_info["sqluser"])            
        else:
            con = psycopg2.connect(dbname=conn_info["dbname"],
                                   host=conn_info["sqlhost"],
                                   port=conn_info["sqlport"],
                                   user=conn_info["sqluser"])
    except:
        conn_info["sqlpass"] = getpass.getpass('Password: ')

        con = psycopg2.connect(dbname=conn_info["dbname"],
                               host=conn_info["sqlhost"],
                               port=conn_info["sqlport"],
                               user=conn_info["sqluser"],
                               password=conn_info["sqlpass"])
query_schema = 'set search_path to ' + conn_info['schema_name'] + ';'









    



Database: eicu
Username: alistairewj

Examine a single patient



In [3]:

    
patientunitstayid = 237395



In [4]:

    
query = query_schema + """
select *
from medication
where patientunitstayid = {}
order by drugorderoffset
""".format(patientunitstayid)

df = pd.read_sql_query(query, con)
df.head()









    Out[4]:







  
    
      
      medicationid
      patientunitstayid
      drugorderyear
      drugordertime24
      drugordertime
      drugorderoffset
      drugstartyear
      drugstarttime24
      drugstarttime
      drugstartoffset
      ...
      drughiclseqno
      dosage
      routeadmin
      loadingdose
      prn
      drugstopyear
      drugstoptime24
      drugstoptime
      drugstopoffset
      gtc
    
  
  
    
      0
      7461175
      237395
      2015
      22:56:26
      midnight
      -12
      2015
      21:37:00
      night
      -91
      ...
      10093.0
      1,500 3
      IV
      
      No
      2015
      21:36:54
      night
      -92
      19
    
    
      1
      10812193
      237395
      2015
      22:56:26
      midnight
      -12
      2015
      21:23:00
      night
      -105
      ...
      4053.0
      2000 MG
      IV
      
      No
      2015
      23:45:00
      midnight
      37
      19
    
    
      2
      11612134
      237395
      2015
      22:56:26
      midnight
      -12
      2015
      22:07:00
      midnight
      -61
      ...
      4034.0
      420 MG
      IV
      
      No
      2015
      03:15:00
      morning
      247
      19
    
    
      3
      11206166
      237395
      2015
      22:56:26
      midnight
      -12
      2015
      21:23:00
      night
      -105
      ...
      10093.0
      1,500 3
      IV
      
      No
      2015
      02:06:00
      morning
      178
      19
    
    
      4
      8246015
      237395
      2015
      23:38:17
      midnight
      30
      2015
      05:00:00
      morning
      352
      ...
      4053.0
      2000 MG
      IV
      
      No
      2015
      15:45:32
      evening
      3877
      19
    
  

5 rows × 23 columns



In [5]:

    
df.columns









    Out[5]:





Index(['medicationid', 'patientunitstayid', 'drugorderyear', 'drugordertime24',
       'drugordertime', 'drugorderoffset', 'drugstartyear', 'drugstarttime24',
       'drugstarttime', 'drugstartoffset', 'drugivadmixture',
       'drugordercancelled', 'drugname', 'drughiclseqno', 'dosage',
       'routeadmin', 'loadingdose', 'prn', 'drugstopyear', 'drugstoptime24',
       'drugstoptime', 'drugstopoffset', 'gtc'],
      dtype='object')



In [6]:

    
# Look at a subset of columns
cols = ['medicationid','patientunitstayid',
        'drugorderoffset','drugorderoffset', 'drugstopoffset',
        'drugivadmixture', 'drugordercancelled', 'drugname','drughiclseqno', 'gtc',
       'dosage','routeadmin','loadingdose', 'prn']
df[cols].head().T









    Out[6]:







  
    
      
      0
      1
      2
      3
      4
    
  
  
    
      medicationid
      7461175
      10812193
      11612134
      11206166
      8246015
    
    
      patientunitstayid
      237395
      237395
      237395
      237395
      237395
    
    
      drugorderoffset
      -12
      -12
      -12
      -12
      30
    
    
      drugorderoffset
      -12
      -12
      -12
      -12
      30
    
    
      drugstopoffset
      -92
      37
      247
      178
      3877
    
    
      drugivadmixture
      No
      Yes
      Yes
      No
      Yes
    
    
      drugordercancelled
      No
      No
      No
      No
      No
    
    
      drugname
      VANCOMYCIN 1.5 GM IN NS 250 ML IVPB (REPACKAGE)
      AZTREONAM 2 G IJ SOLR
      TOBRAMYCIN SULFATE 80 MG/2ML IJ SOLN
      VANCOMYCIN 1.5 GM IN NS 250 ML IVPB (REPACKAGE)
      AZTREONAM 2 G IJ SOLR
    
    
      drughiclseqno
      10093
      4053
      4034
      10093
      4053
    
    
      gtc
      19
      19
      19
      19
      19
    
    
      dosage
      1,500 3
      2000 MG
      420 MG
      1,500 3
      2000 MG
    
    
      routeadmin
      IV
      IV
      IV
      IV
      IV
    
    
      loadingdose
      
      
      
      
      
    
    
      prn
      No
      No
      No
      No
      No

Here we can see that, roughly on ICU admission, the patient had an order for vancomycin, aztreonam, and tobramycin.

Identifying patients admitted on a single drug

Let's look for patients who have an order for vancomycin using exact text matching.



In [7]:

    
drug = 'VANCOMYCIN'
query = query_schema + """
select 
  distinct patientunitstayid
from medication
where drugname like '%{}%'
""".format(drug)

df_drug = pd.read_sql_query(query, con)
print('{} unit stays with {}.'.format(df_drug.shape[0], drug))









    



29737 unit stays with VANCOMYCIN.

Exact text matching is fairly weak, as there's no systematic reason to prefer upper case or lower case. Let's relax the case matching.



In [8]:

    
drug = 'VANCOMYCIN'
query = query_schema + """
select 
  distinct patientunitstayid
from medication
where drugname ilike '%{}%'
""".format(drug)

df_drug = pd.read_sql_query(query, con)
print('{} unit stays with {}.'.format(df_drug.shape[0], drug))









    



41867 unit stays with VANCOMYCIN.

HICL codes are used to group together drugs which have the same underlying ingredient (i.e. most frequently this is used to group brand name drugs with the generic name drugs). We can see above the HICL for vancomycin is 10093, so let's try grabbing that.



In [9]:

    
hicl = 10093
query = query_schema + """
select 
  distinct patientunitstayid
from medication
where drughiclseqno = {}
""".format(hicl)

df_hicl = pd.read_sql_query(query, con)
print('{} unit stays with HICL = {}.'.format(df_hicl.shape[0], hicl))









    



9716 unit stays with HICL = 10093.

No luck! I wonder what we missed? Let's go back to the original query, this time retaining HICL and the name of the drug.



In [10]:

    
drug = 'VANCOMYCIN'
query = query_schema + """
select 
  drugname, drughiclseqno, count(*) as n
from medication
where drugname ilike '%{}%'
group by drugname, drughiclseqno
order by n desc
""".format(drug)

df_drug = pd.read_sql_query(query, con)
df_drug.head()









    Out[10]:







  
    
      
      drugname
      drughiclseqno
      n
    
  
  
    
      0
      VANCOMYCIN HCL 1000 MG IV SOLR
      4042.0
      9661
    
    
      1
      vancomycin
      4042.0
      7826
    
    
      2
      VANCOmycin 1 GM in NS 250 mL IVPB
      10093.0
      3977
    
    
      3
      VANCOMYCIN HCL 10 G IV SOLR
      4042.0
      3891
    
    
      4
      VANCOMYCIN HCL
      4042.0
      3064

It appears there are more than one HICL - we can group by HICL in this query to get an idea.



In [11]:

    
df_drug['drughiclseqno'].value_counts()









    Out[11]:





4042.0     333
10093.0    206
37442.0    119
8466.0      53
920.0       15
24859.0      8
3976.0       7
1694.0       3
4285.0       3
4283.0       3
8259.0       3
18084.0      2
6312.0       2
1403.0       1
Name: drughiclseqno, dtype: int64

Unfortunately, we can't be sure that these HICLs always identify only vancomycin. For example, let's look at drugnames for HICL = 1403.



In [12]:

    
hicl = 1403
query = query_schema + """
select 
  drugname, count(*) as n
from medication
where drughiclseqno = {}
group by drugname
order by n desc
""".format(hicl)

df_hicl = pd.read_sql_query(query, con)
df_hicl.head()









    Out[12]:







  
    
      
      drugname
      n
    
  
  
    
      0
      compounded cream
      9
    
    
      1
      Rx compound and Mix
      8
    
    
      2
      butt butter compound
      7
    
    
      3
      TESTOSTERONE 1 % (25 MG/2.5 G) TD GLPK
      7
    
    
      4
      zinc/aquaphor/nystatin
      5

This HICL seems more focused on the use of creams than on vancomycin. Let's instead inspect the top 3.



In [13]:

    
for hicl in [4042, 10093, 37442]:
    query = query_schema + """
    select 
      drugname, count(*) as n
    from medication
    where drughiclseqno = {}
    group by drugname
    order by n desc
    """.format(hicl)

    df_hicl = pd.read_sql_query(query, con)
    print('HICL {}'.format(hicl))
    print('Number of rows: {}'.format(df_hicl['n'].sum()))
    print('Top 5 rows by frequency:')
    print(df_hicl.head())
    print()









    



HICL 4042
Number of rows: 68951
Top 5 rows by frequency:
                         drugname     n
0  VANCOMYCIN HCL 1000 MG IV SOLR  9661
1                      vancomycin  7826
2                        VANCOCIN  4613
3     VANCOMYCIN HCL 10 G IV SOLR  3891
4                  VANCOMYCIN HCL  3064

HICL 10093
Number of rows: 18553
Top 5 rows by frequency:
                                           drugname     n
0                 VANCOmycin 1 GM in NS 250 mL IVPB  3977
1  VANCOMYCIN 1.25 GM IN NS 250 ML IVPB (REPACKAGE)  2037
2               VANCOmycin 1.25 GM in NS 500ML IVPB  1970
3                VANCOmycin 1.5 GM in NS 500ML IVPB  1944
4   VANCOMYCIN 1.5 GM IN NS 250 ML IVPB (REPACKAGE)  1475

HICL 37442
Number of rows: 3977
Top 5 rows by frequency:
                                          drugname     n
0         VANCOMYCIN 1 G MINI-BAG PLUS (VIAL MATE)  1856
1  VANCOMYCIN HCL IN DEXTROSE 750 MG/150ML IV SOLN   822
2                        VANCOMYCIN 5gm VIAL (MDV)   525
3                                       vancomycin   131
4                          Vancomycin Per Pharmacy    93

This is fairly convincing that these only refer to vancomycin. An alternative approach is to acquire the code book for HICL codes and look up vancomycin there.

Hospitals with data available



In [14]:

    
query = query_schema + """
with t as
(
select distinct patientunitstayid
from medication
)
select 
  pt.hospitalid
  , count(distinct pt.patientunitstayid) as number_of_patients
  , count(distinct t.patientunitstayid) as number_of_patients_with_tbl
from patient pt
left join t
  on pt.patientunitstayid = t.patientunitstayid
group by pt.hospitalid
""".format(patientunitstayid)

df = pd.read_sql_query(query, con)
df['data completion'] = df['number_of_patients_with_tbl'] / df['number_of_patients'] * 100.0
df.sort_values('number_of_patients_with_tbl', ascending=False, inplace=True)
df.head(n=10)









    Out[14]:







  
    
      
      hospitalid
      number_of_patients
      number_of_patients_with_tbl
      data completion
    
  
  
    
      11
      73
      7059
      6836
      96.840912
    
    
      54
      167
      6092
      5825
      95.617203
    
    
      106
      264
      5237
      5111
      97.594042
    
    
      184
      420
      4679
      4618
      98.696303
    
    
      134
      338
      4277
      4241
      99.158289
    
    
      58
      176
      4328
      4186
      96.719039
    
    
      90
      243
      4243
      4172
      98.326656
    
    
      71
      199
      4240
      4116
      97.075472
    
    
      206
      458
      3701
      3624
      97.919481
    
    
      200
      443
      3656
      3612
      98.796499



In [15]:

    
df[['data completion']].vgplot.hist(bins=10,
                                    var_name='Number of hospitals',
                                    value_name='Percent of patients with data')

Here we can see there are a few hospitals with no interface, and thus 0 patients, though the majority have >90% data completion.

	medicationid	patientunitstayid	drugorderyear	drugordertime24	drugordertime	drugorderoffset	drugstartyear	drugstarttime24	drugstarttime	drugstartoffset	...	drughiclseqno	dosage	routeadmin	prn	drugstopyear	drugstoptime24	drugstoptime	drugstopoffset	gtc
0	7461175	237395	2015	22:56:26	midnight	-12	2015	21:37:00	night	-91	...	10093.0	1,500 3	IV	No	2015	21:36:54	night	-92	19
1	10812193	237395	2015	22:56:26	midnight	-12	2015	21:23:00	night	-105	...	4053.0	2000 MG	IV	No	2015	23:45:00	midnight	37	19
2	11612134	237395	2015	22:56:26	midnight	-12	2015	22:07:00	midnight	-61	...	4034.0	420 MG	IV	No	2015	03:15:00	morning	247	19
3	11206166	237395	2015	22:56:26	midnight	-12	2015	21:23:00	night	-105	...	10093.0	1,500 3	IV	No	2015	02:06:00	morning	178	19
4	8246015	237395	2015	23:38:17	midnight	30	2015	05:00:00	morning	352	...	4053.0	2000 MG	IV	No	2015	15:45:32	evening	3877	19

	0	1	2	3	4
medicationid	7461175	10812193	11612134	11206166	8246015
patientunitstayid	237395	237395	237395	237395	237395
drugorderoffset	-12	-12	-12	-12	30
drugorderoffset	-12	-12	-12	-12	30
drugstopoffset	-92	37	247	178	3877
drugivadmixture	No	Yes	Yes	No	Yes
drugordercancelled	No	No	No	No	No
drugname	VANCOMYCIN 1.5 GM IN NS 250 ML IVPB (REPACKAGE)	AZTREONAM 2 G IJ SOLR	TOBRAMYCIN SULFATE 80 MG/2ML IJ SOLN	VANCOMYCIN 1.5 GM IN NS 250 ML IVPB (REPACKAGE)	AZTREONAM 2 G IJ SOLR
drughiclseqno	10093	4053	4034	10093	4053
gtc	19	19	19	19	19
dosage	1,500 3	2000 MG	420 MG	1,500 3	2000 MG
routeadmin	IV	IV	IV	IV	IV
loadingdose
prn	No	No	No	No	No

	drugname	drughiclseqno	n
0	VANCOMYCIN HCL 1000 MG IV SOLR	4042.0	9661
1	vancomycin	4042.0	7826
2	VANCOmycin 1 GM in NS 250 mL IVPB	10093.0	3977
3	VANCOMYCIN HCL 10 G IV SOLR	4042.0	3891
4	VANCOMYCIN HCL	4042.0	3064

	drugname	n
0	compounded cream	9
1	Rx compound and Mix	8
2	butt butter compound	7
3	TESTOSTERONE 1 % (25 MG/2.5 G) TD GLPK	7
4	zinc/aquaphor/nystatin	5

	hospitalid	number_of_patients	number_of_patients_with_tbl	data completion
11	73	7059	6836	96.840912
54	167	6092	5825	95.617203
106	264	5237	5111	97.594042
184	420	4679	4618	98.696303
134	338	4277	4241	99.158289
58	176	4328	4186	96.719039
90	243	4243	4172	98.326656
71	199	4240	4116	97.075472
206	458	3701	3624	97.919481
200	443	3656	3612	98.796499