In [1]:

    
%matplotlib inline
import pandas as pd
import nivapy3 as nivapy
import matplotlib.pyplot as plt

plt.style.use('ggplot')

Explore "core" ICPW data

Prior to updating the "core" ICPW datasets in RESA, I need to get an overview of what's already in the database and what isn't.



In [2]:

    
# Connect to db
eng = nivapy.da.connect()









    



Connection successful.

1. Query ICPW projects

The are 18 projects (one for each country) currently in RESA. We also have data for some countries that do not yet have a project defined (e.g. the Netherlands).



In [3]:

    
# Query projects
prj_grid = nivapy.da.select_resa_projects(eng)
prj_grid









    



581 projects in the RESA database.



In [4]:

    
prj_df = prj_grid.get_selected_df()
print(len(prj_df))
prj_df









    



18






    Out[4]:







  
    
      
      project_id
      project_number
      project_name
      contact_person
      project_description
    
  
  
    
      383
      1679
      None
      ICPWaters US
      None
      None
    
    
      424
      2120
      None
      ICPWaters NO
      None
      None
    
    
      425
      2140
      None
      ICPWaters CA
      None
      None
    
    
      467
      2923
      None
      ICPWaters UK
      None
      None
    
    
      468
      2943
      None
      ICPWaters FI
      None
      None
    
    
      469
      2944
      None
      ICPWaters SE
      None
      None
    
    
      473
      2985
      None
      ICPWaters BY
      None
      None
    
    
      474
      2986
      None
      ICPWaters CZ
      None
      None
    
    
      475
      2987
      None
      ICPWaters IT
      None
      None
    
    
      476
      2988
      None
      ICPWaters PL
      None
      None
    
    
      477
      2989
      None
      ICPWaters CH
      None
      None
    
    
      478
      2990
      None
      ICPWaters LV
      None
      None
    
    
      479
      2991
      None
      ICPWaters EE
      None
      None
    
    
      517
      3425
      None
      ICPWaters ES
      None
      None
    
    
      518
      3445
      None
      ICPWaters IE
      None
      None
    
    
      536
      3745
      None
      ICPWaters ME
      None
      None
    
    
      537
      3746
      None
      ICPWaters AM
      None
      None
    
    
      380
      1660
      None
      ICPWaters DE
      None
      None

2. Get station list

There are 262 stations currently associated with the projects in RESA.



In [5]:

    
# Get stations
stn_df = nivapy.da.select_resa_project_stations(prj_df, eng)
print(len(stn_df))
stn_df.head()









    



262






    Out[5]:







  
    
      
      station_id
      station_code
      station_name
      latitude
      longitude
      altitude
    
  
  
    
      0
      221
      BIE01
      Birkenes
      58.385426
      8.241721
      190.0
    
    
      1
      12080
      STE01
      Storgama v. dam
      59.052333
      8.653606
      587.0
    
    
      2
      23505
      DE01
      Schwarzwald, Dürreychbach
      48.750167
      8.443167
      730.0
    
    
      3
      23506
      DE02
      Fichtelgebirge, Eger
      50.085000
      11.824500
      664.0
    
    
      4
      23507
      DE03
      Rothaargebirge, Elberndorfer Bach
      50.987833
      8.201333
      540.0



In [6]:

    
# Map
nivapy.spatial.quickmap(stn_df, popup='station_code')









    Out[6]:

3. Get parameters

Get a list of parameters available at these stations. I assume that all data submissions to ICPW will report pH, so extracting pH data should be a good way to get an indication of which stations actually have data.



In [7]:

    
# Select parameters
par_grid = nivapy.da.select_resa_station_parameters(stn_df,
                                                    '1970-01-01',
                                                    '2019-01-01',
                                                    eng)
par_grid









    



110 parameters available for the selected stations and dates.



In [9]:

    
# Get selected pars
par_df = par_grid.get_selected_df()
par_df









    Out[9]:







  
    
      
      parameter_id
      parameter_name
      unit
    
  
  
    
      108
      1
      pH
      None

4. Get chemistry data



In [10]:

    
# Get data
wc_df, dup_df = nivapy.da.select_resa_water_chemistry(stn_df,
                                                      par_df,
                                                      '1970-01-01',
                                                      '2019-01-01',
                                                      eng,
                                                      lod_flags=False,
                                                      drop_dups=True)

wc_df.head()









    Out[10]:







  
    
      
      station_id
      station_code
      station_name
      sample_date
      depth1
      depth2
      pH_
    
    
      
      
      
      
      
      
      
      
    
  
  
    
      0
      108
      LAE01
      Langtjern, utlÃ¸p
      1972-09-01 00:00:00
      0.0
      0.0
      5.04
    
    
      1
      108
      LAE01
      Langtjern, utlÃ¸p
      1972-12-11 00:00:00
      0.0
      0.0
      4.92
    
    
      2
      108
      LAE01
      Langtjern, utlÃ¸p
      1973-04-04 15:15:00
      0.0
      0.0
      4.80
    
    
      3
      108
      LAE01
      Langtjern, utlÃ¸p
      1973-04-05 17:40:00
      0.0
      0.0
      4.69
    
    
      4
      108
      LAE01
      Langtjern, utlÃ¸p
      1973-04-06 12:00:00
      0.0
      0.0
      4.67



In [11]:

    
# How many stations have pH data
len(wc_df['station_code'].unique())









    Out[11]:





238



In [12]:

    
# Which stations do not have pH data?
all_stns = set(stn_df['station_code'].unique())
no_ph = list(all_stns - set(wc_df['station_code'].unique()))
no_ph_stns = stn_df.query('station_code in @no_ph').reset_index()
print(len(no_ph_stns))
no_ph_stns









    



24






    Out[12]:







  
    
      
      index
      station_id
      station_code
      station_name
      latitude
      longitude
      altitude
    
  
  
    
      0
      219
      36451
      ES08
      CRAM (High Mountain Research Centre)
      NaN
      NaN
      NaN
    
    
      1
      220
      36452
      ES09
      Lac Redon weather station
      NaN
      NaN
      NaN
    
    
      2
      221
      36453
      ES10
      Aigüestortes weather station
      NaN
      NaN
      NaN
    
    
      3
      227
      36459
      SE20
      Stormyrbäcken
      NaN
      NaN
      NaN
    
    
      4
      228
      36460
      SE21
      Pipbäcken Nedre
      NaN
      NaN
      NaN
    
    
      5
      229
      36461
      SE22
      Muddusälven
      NaN
      NaN
      NaN
    
    
      6
      230
      36462
      SE23
      Lommabäcken Nedre
      NaN
      NaN
      NaN
    
    
      7
      231
      36463
      SE24
      Lill-Fämtan
      NaN
      NaN
      NaN
    
    
      8
      232
      36464
      SE25
      Laxtjärnsbäcken
      NaN
      NaN
      NaN
    
    
      9
      234
      36465
      ME 01
      Black lake
      43.143300
      19.087500
      NaN
    
    
      10
      235
      36466
      AM 1
      Pambak v.Shirakamut
      40.850000
      44.233333
      NaN
    
    
      11
      236
      36467
      AM 2
      Pambak v.Vanadzor
      40.800000
      44.500000
      NaN
    
    
      12
      237
      36468
      AM 3
      Debed v.Airum
      41.217000
      44.883333
      NaN
    
    
      13
      238
      36469
      AM 4
      Sevjur v.Ranchpar
      40.016667
      44.366667
      NaN
    
    
      14
      239
      36470
      AM 5
      Hrazdan t. Masis
      40.016667
      44.433333
      NaN
    
    
      15
      240
      36471
      AM 6
      Arpa v. Areni
      39.733333
      45.200000
      NaN
    
    
      16
      241
      36472
      AM 7
      Voghchi c.Kapan
      39.200000
      46.400000
      NaN
    
    
      17
      242
      36473
      AM 8
      Meghriget c.Meghri
      38.916667
      46.233333
      NaN
    
    
      18
      243
      36474
      AM 9
      Gorisget c. Goris
      39.516667
      46.333333
      NaN
    
    
      19
      244
      36475
      AM 10
      Agstev c. Ijevan
      40.866667
      45.150000
      NaN
    
    
      20
      245
      36476
      AM 11
      Agstev  v.Fioletovo
      40.716667
      44.733333
      NaN
    
    
      21
      246
      36477
      AM 12
      Marmarik v.Aghavnadzor
      40.566667
      44.683333
      NaN
    
    
      22
      247
      36478
      AM 13
      Vorotan v. Vorotan
      39.466667
      46.150000
      NaN
    
    
      23
      248
      37765
      US127
      Barnes Lake, Adirondacks
      43.566700
      -75.227800
      393.0



In [13]:

    
# What data do these stations have?
par_grid2 = nivapy.da.select_resa_station_parameters(no_ph_stns,
                                                     '1970-01-01',
                                                     '2019-01-01',
                                                     eng)
par_grid2









    



0 parameters available for the selected stations and dates.

So, there are 262 stations within the "core" ICPW projects, but 24 of these have no data whatsoever associated with them (listed above).

5. Date for last sample by country

The code below gets the most recent pH sample in the database for each country.



In [14]:

    
# Most recent datab
for idx, row in prj_df.iterrows():
    # Get stations
    cnt_stns = nivapy.da.select_resa_project_stations([row['project_id'],], eng)
    
    # Get pH data
    wc, dups = nivapy.da.select_resa_water_chemistry(cnt_stns,
                                                     [1,],          # pH
                                                     '1970-01-01',
                                                     '2019-01-01',
                                                     eng,
                                                     lod_flags=False,
                                                     drop_dups=True)
    
    # Print results
    print(row['project_name'], '\t', len(cnt_stns), '\t', wc['sample_date'].max())









    



ICPWaters US 	 95 	 2012-12-31 00:00:00
ICPWaters NO 	 5 	 2018-11-12 00:00:00
ICPWaters CA 	 18 	 2012-12-18 00:00:00
ICPWaters UK 	 6 	 2012-03-07 00:00:00
ICPWaters FI 	 10 	 2012-12-12 11:00:00
ICPWaters SE 	 22 	 2013-10-23 00:00:00
ICPWaters BY 	 1 	 2010-12-09 00:00:00
ICPWaters CZ 	 8 	 2016-01-04 00:00:00
ICPWaters IT 	 6 	 2015-12-14 00:00:00
ICPWaters PL 	 4 	 2012-12-20 00:00:00
ICPWaters CH 	 9 	 2015-12-09 00:00:00
ICPWaters LV 	 8 	 2015-12-16 11:30:00
ICPWaters EE 	 1 	 2013-12-03 00:00:00
ICPWaters ES 	 10 	 2011-12-22 00:00:00
ICPWaters IE 	 10 	 2010-12-08 00:00:00
ICPWaters ME 	 1 	 nan
ICPWaters AM 	 13 	 nan
ICPWaters DE 	 35 	 2012-12-20 00:00:00

These results are largely as expected:

US. Complete up to 2012. John sent an entirely new dataset for the 2018 trends work, which can be used to replace the existing data series for the "core" stations as well
Norway. Transferred automatically
Canada. Partly processed for the trends work
UK. Partly processed for the trends work
Finland. Jussi sent an entirely new dataset (with changes suggested to site selections) in 2018
Sweden. Update using data collected via API. Also need to modify site selections
Belarus. Data up to 2010 are already in the database. Completed templates are available for 2012 to 2014 (2011 data are missing). No response to calls for data since 2014
Czech Republic. 2016 data needs to be uploaded
Italy. Data submitted during 2017 and 2018 need adding. Several new sites need creating
Poland. Recent data submissions seem very complex! Need to check through e-mails
Switzerland. Templates need combining and adding
Latvia. Data for 2016 and 2017 need combining and tidying
Estonia. No data for 2012 or 2014. Data from 2013, 2015, 2016 and 2017 needs merging and adding
Ireland. An entirely new set of stations and codes have been proposed. Some overlap with old stations, but need checking
Montenegro. Data from 2006 to 2009 are available in Excel. Need transferring to template and adding to database. Nothing since 2009
Armenia. Data from 2004 to 2008 are in Excel. Nothing since and no replies to annual Calls for Data
Germany. Excel templates cover 2015 to 2017. Nothing for 2013 or 2014? Some site codes have changes and some are no longer monitored. Also some old errors to fix - see e-mails?
Austria. Data up to 2012 are in Excel. No data since then
Moldova. Data from 2014 to 2017 in Excel. No project or stations in database
Netherlands. Excel data supplied in 2016. No idea what to make of these spreadsheets! No stations or project in RESA
Russia. Data provided from 2009 to 2014 in Excel. No stations or projects in database
Slovakia. Not currently in the "core" ICPW project, but data supplied by Jiri as part of the trends work. Fits well with Polish data

	project_id	project_number	project_name	contact_person	project_description
383	1679	None	ICPWaters US	None	None
424	2120	None	ICPWaters NO	None	None
425	2140	None	ICPWaters CA	None	None
467	2923	None	ICPWaters UK	None	None
468	2943	None	ICPWaters FI	None	None
469	2944	None	ICPWaters SE	None	None
473	2985	None	ICPWaters BY	None	None
474	2986	None	ICPWaters CZ	None	None
475	2987	None	ICPWaters IT	None	None
476	2988	None	ICPWaters PL	None	None
477	2989	None	ICPWaters CH	None	None
478	2990	None	ICPWaters LV	None	None
479	2991	None	ICPWaters EE	None	None
517	3425	None	ICPWaters ES	None	None
518	3445	None	ICPWaters IE	None	None
536	3745	None	ICPWaters ME	None	None
537	3746	None	ICPWaters AM	None	None
380	1660	None	ICPWaters DE	None	None

	station_id	station_code	station_name	latitude	longitude	altitude
0	221	BIE01	Birkenes	58.385426	8.241721	190.0
1	12080	STE01	Storgama v. dam	59.052333	8.653606	587.0
2	23505	DE01	Schwarzwald, Dürreychbach	48.750167	8.443167	730.0
3	23506	DE02	Fichtelgebirge, Eger	50.085000	11.824500	664.0
4	23507	DE03	Rothaargebirge, Elberndorfer Bach	50.987833	8.201333	540.0

	station_id	station_code	station_name	sample_date	depth1	depth2	pH_

0	108	LAE01	Langtjern, utlÃ¸p	1972-09-01 00:00:00	0.0	0.0	5.04
1	108	LAE01	Langtjern, utlÃ¸p	1972-12-11 00:00:00	0.0	0.0	4.92
2	108	LAE01	Langtjern, utlÃ¸p	1973-04-04 15:15:00	0.0	0.0	4.80
3	108	LAE01	Langtjern, utlÃ¸p	1973-04-05 17:40:00	0.0	0.0	4.69
4	108	LAE01	Langtjern, utlÃ¸p	1973-04-06 12:00:00	0.0	0.0	4.67

	index	station_id	station_code	station_name	latitude	longitude	altitude
0	219	36451	ES08	CRAM (High Mountain Research Centre)	NaN	NaN	NaN
1	220	36452	ES09	Lac Redon weather station	NaN	NaN	NaN
2	221	36453	ES10	Aigüestortes weather station	NaN	NaN	NaN
3	227	36459	SE20	Stormyrbäcken	NaN	NaN	NaN
4	228	36460	SE21	Pipbäcken Nedre	NaN	NaN	NaN
5	229	36461	SE22	Muddusälven	NaN	NaN	NaN
6	230	36462	SE23	Lommabäcken Nedre	NaN	NaN	NaN
7	231	36463	SE24	Lill-Fämtan	NaN	NaN	NaN
8	232	36464	SE25	Laxtjärnsbäcken	NaN	NaN	NaN
9	234	36465	ME 01	Black lake	43.143300	19.087500	NaN
10	235	36466	AM 1	Pambak v.Shirakamut	40.850000	44.233333	NaN
11	236	36467	AM 2	Pambak v.Vanadzor	40.800000	44.500000	NaN
12	237	36468	AM 3	Debed v.Airum	41.217000	44.883333	NaN
13	238	36469	AM 4	Sevjur v.Ranchpar	40.016667	44.366667	NaN
14	239	36470	AM 5	Hrazdan t. Masis	40.016667	44.433333	NaN
15	240	36471	AM 6	Arpa v. Areni	39.733333	45.200000	NaN
16	241	36472	AM 7	Voghchi c.Kapan	39.200000	46.400000	NaN
17	242	36473	AM 8	Meghriget c.Meghri	38.916667	46.233333	NaN
18	243	36474	AM 9	Gorisget c. Goris	39.516667	46.333333	NaN
19	244	36475	AM 10	Agstev c. Ijevan	40.866667	45.150000	NaN
20	245	36476	AM 11	Agstev v.Fioletovo	40.716667	44.733333	NaN
21	246	36477	AM 12	Marmarik v.Aghavnadzor	40.566667	44.683333	NaN
22	247	36478	AM 13	Vorotan v. Vorotan	39.466667	46.150000	NaN
23	248	37765	US127	Barnes Lake, Adirondacks	43.566700	-75.227800	393.0