In [1]:
import requests
import json
import numpy as np
import pandas as pd
from pandas import DataFrame, Series
URIBASE = 'http://java.epa.gov/chemview/'

Can we get chemical use classification data?

i.e., lists of chemicals classified by use.

First, get the controlled vocabulary of uses.


In [2]:
uri = URIBASE + 'uses'
r = requests.get(uri, headers = {'Accept': 'application/json, */*'})
j = json.loads(r.text)

In [3]:
print(len(j))


49

In [4]:
DataFrame(j)


Out[4]:
id useName
0 3982369 Abrasive
1 3999991 Adhesive and sealants
2 3353384 Adsorbent and absorbent
3 3978578 Agricultural chemicals (non-pesticidal)
4 3374957 Anti-erosion agent
5 3354035 Bleaching agent
6 3209634 Chelating agent
7 4449226 Children's Products
8 128090 Cleaning agent
9 4449119 Commercial
10 4449118 Consumer
11 3344552 Corrosion inhibitor and anti-scaling agent
12 3209893 Defoamer
13 130484 Developer
14 79489 Dyes/pigment
15 3209682 Enzyme and enzyme stabilizer
16 3355321 Filler
17 3340880 Finishing agent
18 124470 Flame retardant
19 3200140 Fragrance
20 3351545 Fuel and fuel additive
21 3405480 Hydraulic fluid
22 3342446 Intermediate
23 3976954 Ion exchange agent
24 3341380 Laboratory chemical
25 3976566 Lubricant and lubricant additives
26 3505275 Measuring device
27 129029 Metalworking fluid
28 3978683 Odor agents
29 3210221 Oxidizing/reducing agent
30 123564 Paint additive and coating additive
31 3978577 Paint additive and coating additives not descr...
32 138434 Pesticide
33 3352302 Photosensitive chemical
34 3353422 Plasticizer
35 3978345 Plating agent and surfactant
36 81462 Polymer
37 80596 Preservative and antioxidant
38 3352593 Process regulator
39 80557 Processing aid
40 3359653 Propellant and blowing agent
41 3505028 Refrigerant
42 3978647 Solids separation agent
43 80819 Solvent
44 3211067 Specialized industrial chemical
45 82391 Surfactant
46 3505276 Switches
47 3342149 Testing reagent
48 3361470 Viscosity adjustor
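
Rather than hard-coding an ID in later requests, the vocabulary can be turned into a name-to-ID lookup. This is a minimal sketch, assuming the response is a list of records with `id` and `useName` keys as shown in Out[4]. The three records are copied from that table, and `use_id` is a hypothetical helper name, not part of ChemView.

```python
# A few records copied from Out[4]; in the notebook this would be the full list `j`.
uses = [
    {'id': 124470, 'useName': 'Flame retardant'},
    {'id': 3353422, 'useName': 'Plasticizer'},
    {'id': 80819, 'useName': 'Solvent'},
]

# Map lower-cased use names to their IDs for case-insensitive lookup.
use_ids = {u['useName'].lower(): u['id'] for u in uses}


def use_id(name):
    """Return the ID for a use name, or None if it is not in the vocabulary."""
    return use_ids.get(name.strip().lower())


print(use_id('Flame retardant'))
```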

Getting the "details" on a use... not so useful


In [5]:
uri = URIBASE + 'uses/124470' # "Flame retardant"
r = requests.get(uri, headers = {'Accept': 'application/json, */*'})
j = json.loads(r.text)
j


Out[5]:
{'id': 124470, 'useName': 'Flame retardant'}

Can it return a list of chemicals classified with a specific use ID?

Unfortunately, the monster URI that the documentation provides for this (item 3, p. 3) doesn't really do much, or I am not using it correctly.


In [6]:
uri = URIBASE + 'chemicals/datatable?isTemplateFilter=false&chemicalIds=&snurUseIds=&useIds=124470&groupIds=&categoryIds=&endpointKeys=&synonymIds=&sourceIds='
# &sEcho=4&iColumns=6&sColumns=&iDisplayStart=0&iDisplayLength=10&mDataProp_0=0&mDataProp_1=1\
# &mDataProp_2=2&mDataProp_3=3&mDataProp_4=4&mDataProp_5=5&sSearch=&bRegex=false&sSearch_0=\
# &bRegex_0=false&bSearchable_0=true&sSearch_1=&bRegex_1=false&bSearchable_1=true&sSearch_2=\
# &bRegex_2=false&bSearchable_2=true&sSearch_3=&bRegex_3=false&bSearchable_3=true&sSearch_4=\
# &bRegex_4=false&bSearchable_4=true&sSearch_5=&bRegex_5=false&bSearchable_5=true&iSortCol_0=0\
# &sSortDir_0=asc&iSortingCols=1&bSortable_0=false&bSortable_1=true&bSortable_2=false\
# &bSortable_3=false&bSortable_4=false&bSortable_5=false'
print(uri)
r = requests.get(uri, headers = {'Accept': 'application/json, */*'})
j = json.loads(r.text)
j


http://java.epa.gov/chemview/chemicals/datatable?isTemplateFilter=false&chemicalIds=&snurUseIds=&useIds=124470&groupIds=&categoryIds=&endpointKeys=&synonymIds=&sourceIds=
Out[6]:
{}
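
Hand-concatenated query strings like the one above are easy to get wrong, so it may help to build them from a dictionary. This is a sketch only: the parameter names are the ones in the URI printed above, `datatable_uri` is a hypothetical helper, and nothing here changes what the server returns.

```python
from urllib.parse import urlencode

URIBASE = 'http://java.epa.gov/chemview/'


def datatable_uri(**filters):
    """Build the chemicals/datatable URI, leaving unspecified filters blank."""
    params = {
        'isTemplateFilter': 'false',
        'chemicalIds': '', 'snurUseIds': '', 'useIds': '',
        'groupIds': '', 'categoryIds': '', 'endpointKeys': '',
        'synonymIds': '', 'sourceIds': '',
    }
    # Reject misspelled filter names instead of silently sending them.
    unknown = set(filters) - set(params)
    if unknown:
        raise ValueError('unknown filter(s): %s' % sorted(unknown))
    params.update({k: str(v) for k, v in filters.items()})
    return URIBASE + 'chemicals/datatable?' + urlencode(params)


print(datatable_uri(useIds=124470))
```

The printed URI should be identical to the one used in In [6], which makes it easy to vary one filter at a time while experimenting.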

Trying something different: learn from the URIs that ChemView generates when you do a search and export the results.

  • Searched for chemicals matching the use "Flame retardant" (from the drop-down menu) in all sources.
  • Replaced mediaType=xls with mediaType=json in the resulting URI.

...This doesn't work either.


In [7]:
uri = URIBASE + 'datatable?mediaType=json&useIds=124470&sourceIds=2-5-6-7-3-10-9-8-1-16-4-11-1981377'
r = requests.get(uri, headers = {'Accept': 'application/json, */*'})
r.text


Out[7]:
'<html><head><title>Apache Tomcat/7.0.59 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 404 - </h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u></u></p><p><b>description</b> <u>The requested resource is not available.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.59</h3></body></html>'
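
Because this endpoint answered with an HTML error page, passing `r.text` straight to `json.loads` would have raised an exception. A small guard can make such failures easier to read. This is a sketch: `parse_json_body` is a hypothetical helper, and the content-type string in the example is an assumption about what the server sends. With `requests` it would be called as `parse_json_body(r.status_code, r.headers.get('Content-Type'), r.text)`.

```python
import json


def parse_json_body(status_code, content_type, text):
    """Return parsed JSON, or raise ValueError describing why there is none."""
    if status_code != 200:
        raise ValueError('HTTP status %d' % status_code)
    if 'json' not in (content_type or '').lower():
        raise ValueError('unexpected content type: %r' % content_type)
    return json.loads(text)


# Stand-ins for r.status_code, r.headers.get('Content-Type'), and r.text.
try:
    parse_json_body(404, 'text/html;charset=utf-8', '<html>...</html>')
except ValueError as err:
    print('request failed:', err)
```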

Looking at 'sources': can we get SNUR info?


In [8]:
uri = URIBASE + 'sources'
r = requests.get(uri, headers = {'Accept': 'application/json, */*'})
j = json.loads(r.text)

In [9]:
sources_df = DataFrame(j)

In [10]:
sources_df


Out[10]:
chemicals endpointCategories externalFileUrl id inputMode sortId sourceDesc sourceId sourceLink sourceName sourceType templateType
0 [{'id': 3978785, 'identifier': None, 'template... [] http://java.epa.gov/oppt_chemical_search/downl... 2 ETL 101 Chemical Test Rule Data 2 http://www.epa.gov/opptintr/chemtest/pubs/view... Chemical Test Rule Data Data Submitted to EPA Endpoint
1 [{'id': 168984, 'identifier': '8EHQ-0990-1066'... [] http://java.epa.gov/oppt_chemical_search/downl... 5 ETL 102 Substantial Risk Reports 5 None 8E Data Submitted to EPA Form
2 [{'id': 171405, 'identifier': '86920000890', '... [] http://java.epa.gov/oppt_chemical_search/downl... 6 ETL 103 Health and Safety Studies 6 None 8D Data Submitted to EPA Form
3 [{'id': 117590, 'identifier': None, 'templateI... [] http://java.epa.gov/oppt_chemical_search/downl... 7 External 104 High Production Volume Information System 7 None HPVIS Data Submitted to EPA External
4 [{'id': 5833944, 'identifier': None, 'template... [] http://java.epa.gov/oppt_chemical_search/downl... 3 ETL 201 Hazard Characterizations 3 http://iaspub.epa.gov/oppthpv/hpv_hc_character... HC EPA Assessments Endpoint
5 [{'id': 90420, 'identifier': None, 'templateId... [] http://java.epa.gov/oppt_chemical_search/downl... 10 External 203 Integrated Risk Information System 10 None IRIS EPA Assessments External
6 [] [] http://java.epa.gov/oppt_chemical_search/downl... 13 ETL 204 Screening Work Plan Chemicals 13 None SWPC EPA Assessments Form
7 [{'id': 175658, 'identifier': None, 'templateI... [] http://java.epa.gov/oppt_chemical_search/downl... 9 ETL 205 Design for the Environment Alternative Assessm... 9 http://www.epa.gov/dfe/alternative_assessments... DFE AA EPA Assessments Form
8 [{'id': 3210975, 'identifier': 'Processing Aid... [] http://java.epa.gov/oppt_chemical_search/downl... 8 ETL 206 Design for the Environment: Safer Chemical Ing... 8 None DFE SCIL EPA Assessments External
9 [{'id': 3496613, 'identifier': None, 'template... [] http://java.epa.gov/oppt_chemical_search/downl... 1 ETL 301 Significant New Use Rules 1 http://www.epa.gov/opptintr/existingchemicals/... SNUR EPA Actions Form
10 [] [] http://java.epa.gov/oppt_chemical_search/downl... 14 ETL 302 Limitations on Manufacturing, Processing & Use 14 None LMPU EPA Actions Form
11 [] [] http://java.epa.gov/oppt_chemical_search/downl... 15 ETL 303 Pre-manufacture Notification Review Results 15 None PNRR EPA Actions Form
12 [{'id': 5991449, 'identifier': None, 'template... [] http://java.epa.gov/oppt_chemical_search/downl... 16 ETL 304 Consent Orders 16 None CO EPA Actions Form
13 [{'id': 5198350, 'identifier': None, 'template... [] http://java.epa.gov/oppt_chemical_search/downl... 4 ETL 401 Chemical Data Reporting 4 http://epa.gov/cdr/ CDR Manufacturing, Processing, Use, and Release Da... Form
14 [{'id': 85568, 'identifier': None, 'templateId... [] http://java.epa.gov/oppt_chemical_search/downl... 11 External 402 Toxics Release Inventory 11 None TRI Manufacturing, Processing, Use, and Release Da... External
15 [] [] http://java.epa.gov/oppt_chemical_search/downl... 17 ETL 403 Production, Use, Exposure Information 17 None PUEI Manufacturing, Processing, Use, and Release Da... Form
16 [{'id': 2053635, 'identifier': None, 'template... [] http://www.epa.gov/enviro/facts/tri/p2.html 1981377 External 404 TRI Pollution Prevention 1981377 None P2 Manufacturing, Processing, Use, and Release Da... External
17 [] [] http://java.epa.gov/oppt_chemical_search/downl... 1981354 Other 501 United Nations WHO (IARC) 1981354 http://www.iarc.fr/index.php UN IARC International Other
18 [] [] http://java.epa.gov/oppt_chemical_search/downl... 1981355 Other 502 Domestic Substance List of Canada 1981355 http://ec.gc.ca/lcpe-cepa/default.asp?lang=En&... CAN DSL International Other
19 [] [] http://java.epa.gov/oppt_chemical_search/downl... 1981356 Other 503 OECD: eChem Portal 1981356 http://www.echemportal.org/echemportal/index?p... OECD ECHEM International Other
20 [] [] http://www.epa.gov/enviro/facts/tri/p2.html 1981357 Other 601 USEPA - TRI Pollution Prevention 1981357 http://ofmpub.epa.gov/enviro/P2_EF_Query.maste... USA P2 U.S. - Government Other
21 [] [] http://actor.epa.gov/dashboard 1981358 Other 602 USEPA - iCSS Dashboard 1981358 http://actor.epa.gov/dashboard/data-service/ch... USA ICSS U.S. - Government Other
22 [] [] http://java.epa.gov/oppt_chemical_search/downl... 1981359 Other 701 California Prop 65 1981359 http://oehha.ca.gov/prop65.html CA P65 U.S. - State Other
23 [] [] http://java.epa.gov/oppt_chemical_search/downl... 1981360 Other 702 New Jersey EHS 1981360 http://www.state.nj.us/dep/opppc/ NJ EHS U.S. - State Other
24 [] [] http://java.epa.gov/oppt_chemical_search/downl... 1981361 Other 703 Maine CHC 1981361 http://www.maine.gov/dep/safechem/highconcern/ ME CHC U.S. - State Other
25 [] [] http://java.epa.gov/oppt_chemical_search/downl... 1981362 Other 801 Environmental Defense Fund 1981362 http://www.edf.org/ EDF Non-Governmental Entities Other
26 [] [] http://java.epa.gov/oppt_chemical_search/downl... 1981363 Other 802 Syracuse Research Corp. Datalog Search 1981363 http://esc.syrres.com/scripts/CASdlcgi.exe SRC Non-Governmental Entities Other
27 [] [] http://java.epa.gov/oppt_chemical_search/downl... 1981364 Other 803 Environmental Working Group 1981364 http://www.ewg.org/ EWG Non-Governmental Entities Other

In [11]:
# Calculate the number of items in the 'chemicals' field for each source.
sources_df['num_chems'] = sources_df['chemicals'].apply(len)
sources_df[['sourceId', 'sourceDesc', 'num_chems']]


Out[11]:
sourceId sourceDesc num_chems
0 2 Chemical Test Rule Data 2
1 5 Substantial Risk Reports 2
2 6 Health and Safety Studies 2
3 7 High Production Volume Information System 2
4 3 Hazard Characterizations 2
5 10 Integrated Risk Information System 2
6 13 Screening Work Plan Chemicals 0
7 9 Design for the Environment Alternative Assessm... 2
8 8 Design for the Environment: Safer Chemical Ing... 2
9 1 Significant New Use Rules 2
10 14 Limitations on Manufacturing, Processing & Use 0
11 15 Pre-manufacture Notification Review Results 0
12 16 Consent Orders 2
13 4 Chemical Data Reporting 2
14 11 Toxics Release Inventory 2
15 17 Production, Use, Exposure Information 0
16 1981377 TRI Pollution Prevention 2
17 1981354 United Nations WHO (IARC) 0
18 1981355 Domestic Substance List of Canada 0
19 1981356 OECD: eChem Portal 0
20 1981357 USEPA - TRI Pollution Prevention 0
21 1981358 USEPA - iCSS Dashboard 0
22 1981359 California Prop 65 0
23 1981360 New Jersey EHS 0
24 1981361 Maine CHC 0
25 1981362 Environmental Defense Fund 0
26 1981363 Syracuse Research Corp. Datalog Search 0
27 1981364 Environmental Working Group 0

In [12]:
# .ix has been removed from pandas; .loc selects the same row by label.
sources_df.loc[9, :]


Out[12]:
chemicals             [{'id': 3496613, 'identifier': None, 'template...
endpointCategories                                                   []
externalFileUrl       http://java.epa.gov/oppt_chemical_search/downl...
id                                                                    1
inputMode                                                           ETL
sortId                                                              301
sourceDesc                                    Significant New Use Rules
sourceId                                                              1
sourceLink            http://www.epa.gov/opptintr/existingchemicals/...
sourceName                                                         SNUR
sourceType                                                  EPA Actions
templateType                                                       Form
num_chems                                                             2
Name: 9, dtype: object

In [13]:
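# Expand the nested 'chemicals' list of the SNUR row (.ix has been removed from pandas).
DataFrame(sources_df.loc[9, 'chemicals'])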


Out[13]:
endpoints externalLink id identifier synonyms templateId
0 [] http://java.epa.gov/oppt_chemical_search/downl... 3496613 None [{'id': 3496612, 'isUnregistered': False, 'sor... 3493162
1 [] http://java.epa.gov/oppt_chemical_search/downl... 3497805 None [{'id': 3497804, 'isUnregistered': False, 'sor... 3493230

This tells us that if you ask ChemView for information from SNUR sources, you will get information about... just two chemicals?


In [18]:
uri = URIBASE + 'chemicals/datatable?isTemplateFilter=false&sourceIds=1' #&chemicalIds=&snurUseIds=&useIds=&groupIds=&categoryIds=&endpointKeys=&synonymIds='
# &sEcho=4&iColumns=6&sColumns=&iDisplayStart=0&iDisplayLength=10&mDataProp_0=0&mDataProp_1=1\
# &mDataProp_2=2&mDataProp_3=3&mDataProp_4=4&mDataProp_5=5&sSearch=&bRegex=false&sSearch_0=\
# &bRegex_0=false&bSearchable_0=true&sSearch_1=&bRegex_1=false&bSearchable_1=true&sSearch_2=\
# &bRegex_2=false&bSearchable_2=true&sSearch_3=&bRegex_3=false&bSearchable_3=true&sSearch_4=\
# &bRegex_4=false&bSearchable_4=true&sSearch_5=&bRegex_5=false&bSearchable_5=true&iSortCol_0=0\
# &sSortDir_0=asc&iSortingCols=1&bSortable_0=false&bSortable_1=true&bSortable_2=false\
# &bSortable_3=false&bSortable_4=false&bSortable_5=false'
print(uri)
r = requests.get(uri, headers = {'Accept': 'application/json, */*'})
j = json.loads(r.text)
j


http://java.epa.gov/chemview/chemicals/datatable?isTemplateFilter=false&sourceIds=1
Out[18]:
{}

Try to get SNUR information for a known ID

What if we look up info about one of these chemicals, specifying SNURs as the source?


In [14]:
uri = URIBASE + 'chemicals/3554283?sourceIds=1'
print(uri)
r = requests.get(uri, headers = {'Accept': 'application/json, */*'})
j = json.loads(r.text)
j


http://java.epa.gov/chemview/chemicals/3554283?sourceIds=1
Out[14]:
{}

That returned nothing.

OK, we also know that chemical ID 3565112 corresponds to PMN Number P-11-0607 and that ChemView has a record of the SNURs linked to this substance...


In [15]:
uri = URIBASE + 'chemicals/3565112?sourceIds=1&synonymIds='
print(uri)
r = requests.get(uri, headers = {'Accept': 'application/json, */*'})
j = json.loads(r.text)
j


http://java.epa.gov/chemview/chemicals/3565112?sourceIds=1&synonymIds=
Out[15]:
{'accessionNo': None,
 'casNo': '',
 'epaId': None,
 'id': 3565112,
 'pmnNo': 'P-11-0607',
 'sourceTypes': ['EPA Actions'],
 'sources': [{'chemicals': [{'endpoints': [],
     'externalLink': 'http://java.epa.gov/oppt_chemical_search/download?filename=77_fr_66149_november_2_2012.pdf',
     'id': 3565115,
     'identifier': None,
     'synonyms': [{'chemicalName': 'Polyaromatic Organophosphorus Compound (generic)',
       'id': 3565113,
       'isIupac': False,
       'isRegistry': False,
       'isSystematic': False,
       'isTscaInv': False,
       'isUnregistered': False,
       'isWorkPlan': False,
       'sortOrder': 5},
      {'chemicalName': 'Polyaromatic organophosphorus compound (generic)',
       'id': 3565114,
       'isIupac': True,
       'isRegistry': False,
       'isSystematic': False,
       'isTscaInv': False,
       'isUnregistered': False,
       'isWorkPlan': False,
       'sortOrder': 1}],
     'templateId': 3564983},
    {'endpoints': [],
     'externalLink': 'http://java.epa.gov/oppt_chemical_search/download?filename=77_fr_66149_november_2_2012.pdf',
     'id': 3565256,
     'identifier': None,
     'synonyms': [{'chemicalName': 'Polyaromatic Organophosphorus Compound (generic)',
       'id': 3565113,
       'isIupac': False,
       'isRegistry': False,
       'isSystematic': False,
       'isTscaInv': False,
       'isUnregistered': False,
       'isWorkPlan': False,
       'sortOrder': 5},
      {'chemicalName': 'Polyaromatic organophosphorus compound (generic)',
       'id': 3565114,
       'isIupac': True,
       'isRegistry': False,
       'isSystematic': False,
       'isTscaInv': False,
       'isUnregistered': False,
       'isWorkPlan': False,
       'sortOrder': 1}],
     'templateId': 3564993}],
   'endpointCategories': [],
   'externalFileUrl': 'http://java.epa.gov/oppt_chemical_search/download?filename=',
   'id': 1,
   'inputMode': 'ETL',
   'sortId': 301,
   'sourceDesc': 'Significant New Use Rules',
   'sourceId': 1,
   'sourceLink': 'http://www.epa.gov/opptintr/existingchemicals/pubs/sect5a2.html',
   'sourceName': 'SNUR',
   'sourceType': 'EPA Actions',
   'templateType': 'Form'}],
 'synonyms': [{'chemicalName': 'Polyaromatic organophosphorus compound (generic)',
   'id': 3565114,
   'isIupac': True,
   'isRegistry': False,
   'isSystematic': False,
   'isTscaInv': False,
   'isUnregistered': False,
   'isWorkPlan': False,
   'sortOrder': 1},
  {'chemicalName': 'Polyaromatic Organophosphorus Compound (generic)',
   'id': 3565113,
   'isIupac': False,
   'isRegistry': False,
   'isSystematic': False,
   'isTscaInv': False,
   'isUnregistered': False,
   'isWorkPlan': False,
   'sortOrder': 5}]}

In [16]:
print(j['sources'][0]['chemicals'][0]['externalLink'])


http://java.epa.gov/oppt_chemical_search/download?filename=77_fr_66149_november_2_2012.pdf
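
The response nests chemical entries inside sources, so a small generator can flatten it into one row per entry. This is a sketch, assuming the structure seen in Out[15]. The `record` below is a trimmed copy of that output, and `snur_links` is a hypothetical helper name.

```python
def snur_links(record):
    """Yield (sourceName, entry id, templateId, externalLink) for each entry."""
    for source in record.get('sources', []):
        for chem in source.get('chemicals', []):
            yield (source.get('sourceName'), chem.get('id'),
                   chem.get('templateId'), chem.get('externalLink'))


# Trimmed copy of Out[15], keeping only the fields used above.
link = ('http://java.epa.gov/oppt_chemical_search/download'
        '?filename=77_fr_66149_november_2_2012.pdf')
record = {
    'id': 3565112,
    'pmnNo': 'P-11-0607',
    'sources': [{
        'sourceName': 'SNUR',
        'chemicals': [
            {'id': 3565115, 'templateId': 3564983, 'externalLink': link},
            {'id': 3565256, 'templateId': 3564993, 'externalLink': link},
        ],
    }],
}

rows = list(snur_links(record))
for row in rows:
    print(row)
```

In the notebook, `snur_links(j)` would do the same for the full response. The two entries differ in `templateId` but share the same external link.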

That did return some actual information. The external links about the specific chemicals both point to a PDF of the SNURs published in the Federal Register. We already know that this is not the extent of EPA's public data on these SNURs, so where is it in ChemView?

I navigated to the ChemView record for PMN number P-09-0248 and clicked on it to get a summary of the SNUR.

Below, I copied the link that it gives you when you click "E-mail Url", but added &mediaType=json.


In [17]:
uri = 'http://java.epa.gov/chemview?tf=0&ch=P-09-0248&su=2-5-6-7&as=3-10-9-8&ac=1-16&ma=4-11-1981377&tds=0&tdl=10&tas1=1&tas2=asc&tas3=undefined&tss=&modal=template&modalId=3517608&modalSrc=1&modalDetailId=3517610&mediaType=json'
r = requests.get(uri, headers = {'Accept': 'application/json, */*'})
j = json.loads(r.text)
j


Out[17]:
{}
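
To see what the "E-mail Url" link encodes, its query string can be decoded with the standard library. This is a sketch: the URI is the one from In [17], and reading the hyphen-separated values of `su`, `as`, `ac`, and `ma` as lists of source IDs is an assumption, based only on their resemblance to the `sourceId` values in Out[10].

```python
from urllib.parse import urlsplit, parse_qs

uri = ('http://java.epa.gov/chemview?tf=0&ch=P-09-0248&su=2-5-6-7&as=3-10-9-8'
       '&ac=1-16&ma=4-11-1981377&tds=0&tdl=10&tas1=1&tas2=asc&tas3=undefined'
       '&tss=&modal=template&modalId=3517608&modalSrc=1'
       '&modalDetailId=3517610&mediaType=json')

# keep_blank_values=True preserves empty parameters such as `tss=`.
params = {k: v[0] for k, v in
          parse_qs(urlsplit(uri).query, keep_blank_values=True).items()}

# Hyphen-separated groups look like lists of source IDs (an assumption).
source_groups = {k: [int(x) for x in params[k].split('-')]
                 for k in ('su', 'as', 'ac', 'ma')}

print(params['ch'], params['modalId'], params['modalDetailId'])
print(source_groups)
```

If that reading is right, `modalId` and `modalDetailId` may be the identifiers worth trying against the `chemicals/` endpoint, though nothing in the output so far confirms it.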

Apparently these data are not yet available through the API.

Trying something else by tweaking the URL from a different search...


In [22]:
uri = 'http://java.epa.gov/chemview?tf=1&su=2-5-6-7&as=3-10-9-8&ac=1-16&ma=4-11-1981377&tds=0&tdl=10&tas1=1&tas2=asc&tas3=undefined&tss=&modal=template&modalId=103298&modalSrc=3&modalDetailId=5636434&modalVae=0-0-1-0-0&mediaType=json'
r = requests.get(uri, headers = {'Accept': 'application/json, */*'})
j = json.loads(r.text)
j


Out[22]:
{}