Example of DOV search methods for groundwater samples (grondwatermonsters)

Use cases:

  • Get groundwater samples in a bounding box
  • Get groundwater samples with specific properties
  • Get the coordinates of all groundwater samples in Ghent
  • Get groundwater samples based on a combination of specific properties
  • Get groundwater samples based on a selection of screens (filters)

In [1]:
%matplotlib inline
import inspect, sys

In [2]:
# check pydov path
import pydov

Get information about the datatype 'GrondwaterMonster'


In [3]:
from pydov.search.grondwatermonster import GrondwaterMonsterSearch
gwmonster = GrondwaterMonsterSearch()

A description is provided for the 'GrondwaterMonster' datatype:


In [4]:
print(gwmonster.get_description())


In de Databank Ondergrond Vlaanderen zijn verschillende grondwatermeetnetten opgenomen. Deze meetnetten staan in functie van uitgebreide monitoringprogramma’s met de bedoeling een goed beeld te krijgen van de beschikbare grondwaterkwantiteit en grondwaterkwaliteit van de watervoerende lagen in Vlaanderen.

Deze kaartlaag toont alle watermonsters die in de meetnetten opgenomen zijn.

The different fields that are available for objects of the 'GrondwaterMonster' datatype can be requested with the get_fields() method:


In [5]:
fields = gwmonster.get_fields()

# print available fields
for f in fields.values():
    print(f['name'])


gw_id
filternummer
pkey_filter
grondwatermonsternummer
pkey_grondwatermonster
pkey_grondwaterlocatie
x
y
start_grondwaterlocatie_mtaw
gemeente
datum_monstername
kationen
anionen
zware_metalen
pesticiden_actieve_stoffen
pesticiden_relevante_metabolieten
niet_relevante_metabolieten_van_pesticiden
fysico_chemische_parameters
organische_verbindingen
andere_parameters
opdrachten
parametergroep
parameter
detectie
waarde
eenheid
veld_labo

You can get more information of a field by requesting it from the fields dictionary:

  • name: name of the field
  • definition: definition of this field
  • cost: currently this is either 1 or 10, depending on the datasource of the field. It is an indication of the expected time it will take to retrieve this field in the output dataframe.
  • notnull: whether the field is mandatory or not
  • type: datatype of the values of this field

In [6]:
# print information for a certain field
fields['waarde']


Out[6]:
{'name': 'waarde',
 'type': 'float',
 'definition': 'waarde (numeriek) van de parameter',
 'notnull': False,
 'query': False,
 'cost': 10}

Optionally, if the values of the field have a specific domain the possible values are listed as values:


In [9]:
# if an attribute can have several values, these are listed under 'values', e.g. for 'parameter':
list(fields['parameter']['values'].items())[0:10]


Out[9]:
[('Minolie', 'Minerale olien'),
 ('volumemassa nat', 'Volumemassa nat'),
 ('Humusgehalte', 'Humusgehalte'),
 ('pH', 'Zuurtegraad'),
 ('Al', 'Aluminium'),
 ('vegetatie', 'Vegetatie'),
 ('Vochtspanning', 'Vochtspanning'),
 ('Volumetrisch vochtgehalte', 'Volumetrisch vochtgehalte'),
 ('Doorlatendheid', 'Doorlatendheidsproef - algemeen'),
 ('Fractie > 500 µm', 'Fractie > 500 µm')]

In [10]:
fields['parameter']['values']['NH4']


Out[10]:
'Ammonium'

Example use cases

Get groundwater samples in a bounding box

Get data for all the groundwater samples that are geographically located within the bounds of the specified box.

The coordinates are in the Belgian Lambert72 (EPSG:31370) coordinate system and are given in the order of lower left x, lower left y, upper right x, upper right y.


In [11]:
from pydov.util.location import Within, Box

df = gwmonster.search(location=Within(Box(93378, 168009, 94246, 169873)))
df.head()


[000/050] ..................................................
Out[11]:
pkey_grondwatermonster grondwatermonsternummer pkey_grondwaterlocatie gw_id pkey_filter filternummer x y start_grondwaterlocatie_mtaw gemeente datum_monstername parametergroep parameter detectie waarde eenheid veld_labo
0 https://www.dov.vlaanderen.be/data/watermonste... SWPP006-05547 https://www.dov.vlaanderen.be/data/put/2018-00... SWPP006 https://www.dov.vlaanderen.be/data/filter/1999... 1 94147.000000 169582.0 9.40 Wortegem-Petegem 1999-05-18 Onbekend %AfwijkBalans NaN 3.21 % LABO
1 https://www.dov.vlaanderen.be/data/watermonste... SWPP018-05531 https://www.dov.vlaanderen.be/data/put/2018-00... SWPP018 https://www.dov.vlaanderen.be/data/filter/1999... 1 93789.000000 169132.0 12.38 Wortegem-Petegem 1999-11-15 Onbekend %AfwijkBalans NaN 1.94 % LABO
2 https://www.dov.vlaanderen.be/data/watermonste... SWPP007-05551 https://www.dov.vlaanderen.be/data/put/2018-00... SWPP007 https://www.dov.vlaanderen.be/data/filter/1999... 1 94040.000000 169657.0 10.43 Wortegem-Petegem 2000-05-23 Onbekend %AfwijkBalans NaN 5.44 % LABO
3 https://www.dov.vlaanderen.be/data/watermonste... SWPP017-05529 https://www.dov.vlaanderen.be/data/put/2017-00... SWPP017 https://www.dov.vlaanderen.be/data/filter/1999... 1 93818.601562 169034.0 10.52 Wortegem-Petegem 2000-05-23 Onbekend %AfwijkBalans NaN 3.32 % LABO
4 https://www.dov.vlaanderen.be/data/watermonste... SWPP023-36842 https://www.dov.vlaanderen.be/data/put/2018-00... SWPP023 https://www.dov.vlaanderen.be/data/filter/2007... 1 94053.000000 169638.0 10.09 Wortegem-Petegem 2011-12-21 Onbekend %AfwijkBalans NaN 1.55 % LABO

Using the pkey attributes one can request the details of the corresponding grondwatermonster in a webbrowser (only showing the first unique records):


In [12]:
for pkey_grondwatermonster in df.pkey_grondwatermonster.unique()[0:5]:
    print(pkey_grondwatermonster)


https://www.dov.vlaanderen.be/data/watermonster/1999-246645
https://www.dov.vlaanderen.be/data/watermonster/1999-246670
https://www.dov.vlaanderen.be/data/watermonster/2000-246650
https://www.dov.vlaanderen.be/data/watermonster/2000-246669
https://www.dov.vlaanderen.be/data/watermonster/2011-246677

Get groundwater samples with specific properties

Next to querying groundwater samples based on their geographic location within a bounding box, we can also search for groundwater samples matching a specific set of properties. For this we can build a query using a combination of the 'GrondwaterMonster' fields and operators provided by the WFS protocol.

A list of possible operators can be found below:


In [13]:
[i for i,j in inspect.getmembers(sys.modules['owslib.fes'], inspect.isclass) if 'Property' in i]


Out[13]:
['PropertyIsBetween',
 'PropertyIsEqualTo',
 'PropertyIsGreaterThan',
 'PropertyIsGreaterThanOrEqualTo',
 'PropertyIsLessThan',
 'PropertyIsLessThanOrEqualTo',
 'PropertyIsLike',
 'PropertyIsNotEqualTo',
 'PropertyIsNull',
 'SortProperty']

In this example we build a query using the PropertyIsEqualTo operator to find all groundwater samples that are within the community (gemeente) of 'Leuven':


In [14]:
from owslib.fes import PropertyIsEqualTo

query = PropertyIsEqualTo(
            propertyname='gemeente',
            literal='Leuven')

df = gwmonster.search(query=query)
df.head()


[000/177] ..................................................
[050/177] ..................................................
[100/177] ..................................................
[150/177] ...........................
Out[14]:
pkey_grondwatermonster grondwatermonsternummer pkey_grondwaterlocatie gw_id pkey_filter filternummer x y start_grondwaterlocatie_mtaw gemeente datum_monstername parametergroep parameter detectie waarde eenheid veld_labo
0 https://www.dov.vlaanderen.be/data/watermonste... 2-0005/M2009 https://www.dov.vlaanderen.be/data/put/2017-00... 2-0005 https://www.dov.vlaanderen.be/data/filter/1974... 1 171548.64 172680.85 26.21 Leuven 2009-12-14 Fysico-chemische parameters TOC NaN 37.29 mg/l LABO
1 https://www.dov.vlaanderen.be/data/watermonste... 2-0005/M2009 https://www.dov.vlaanderen.be/data/put/2017-00... 2-0005 https://www.dov.vlaanderen.be/data/filter/1974... 1 171548.64 172680.85 26.21 Leuven 2009-12-14 Zware metalen Al < 0.02 mg/l LABO
2 https://www.dov.vlaanderen.be/data/watermonste... 2-0005/M2009 https://www.dov.vlaanderen.be/data/put/2017-00... 2-0005 https://www.dov.vlaanderen.be/data/filter/1974... 1 171548.64 172680.85 26.21 Leuven 2009-12-14 Zware metalen Ni < 5.00 µg/l LABO
3 https://www.dov.vlaanderen.be/data/watermonste... 2-0005/M2009 https://www.dov.vlaanderen.be/data/put/2017-00... 2-0005 https://www.dov.vlaanderen.be/data/filter/1974... 1 171548.64 172680.85 26.21 Leuven 2009-12-14 Fysico-chemische parameters EC(Lab.) NaN 650.00 µS/cm(20°C) LABO
4 https://www.dov.vlaanderen.be/data/watermonste... 2-0005/M2009 https://www.dov.vlaanderen.be/data/put/2017-00... 2-0005 https://www.dov.vlaanderen.be/data/filter/1974... 1 171548.64 172680.85 26.21 Leuven 2009-12-14 Kationen NH4 NaN 0.29 mg/l LABO

Once again we can use the pkey_grondwatermonster as a permanent link to the information of the groundwater samples:


In [15]:
for pkey_grondwatermonster in df.pkey_grondwatermonster.unique()[0:5]:
    print(pkey_grondwatermonster)


https://www.dov.vlaanderen.be/data/watermonster/2009-001087
https://www.dov.vlaanderen.be/data/watermonster/2006-060958
https://www.dov.vlaanderen.be/data/watermonster/2006-060964
https://www.dov.vlaanderen.be/data/watermonster/2008-060966
https://www.dov.vlaanderen.be/data/watermonster/2004-060970

We can add the descriptions of the parameter values as an extra column 'parameter_label':


In [16]:
df['parameter_label'] = df['parameter'].map(fields['parameter']['values'])
df[['pkey_grondwatermonster', 'datum_monstername', 'parameter', 'parameter_label', 'waarde', 'eenheid']].head()


Out[16]:
pkey_grondwatermonster datum_monstername parameter parameter_label waarde eenheid
0 https://www.dov.vlaanderen.be/data/watermonste... 2009-12-14 TOC Totaal organische koolstof 37.29 mg/l
1 https://www.dov.vlaanderen.be/data/watermonste... 2009-12-14 Al Aluminium 0.02 mg/l
2 https://www.dov.vlaanderen.be/data/watermonste... 2009-12-14 Ni Nikkel 5.00 µg/l
3 https://www.dov.vlaanderen.be/data/watermonste... 2009-12-14 EC(Lab.) Geleidbaarheid in het labo 650.00 µS/cm(20°C)
4 https://www.dov.vlaanderen.be/data/watermonste... 2009-12-14 NH4 Ammonium 0.29 mg/l

Get groundwater screens based on a combination of specific properties

Get all groundwater screens in Hamme that have measurements for cations (kationen). And filter to get only Sodium values after fetching all records.


In [17]:
from owslib.fes import Or, Not, PropertyIsNull, PropertyIsLessThanOrEqualTo, And, PropertyIsLike

query = And([PropertyIsEqualTo(propertyname='gemeente',
                               literal='Hamme'),
             PropertyIsEqualTo(propertyname='kationen',
                               literal='true')
             ])
df_hamme = gwmonster.search(query=query,
                     return_fields=('pkey_grondwatermonster', 'parameter', 'parametergroep', 'waarde', 'eenheid','datum_monstername'))
df_hamme.head()


[000/278] ..................................................
[050/278] ..................................................
[100/278] ..................................................
[150/278] ..................................................
[200/278] ..................................................
[250/278] ............................
Out[17]:
pkey_grondwatermonster parameter parametergroep waarde eenheid datum_monstername
0 https://www.dov.vlaanderen.be/data/watermonste... CO3 Anionen 1.000 mg/l 2011-03-28
1 https://www.dov.vlaanderen.be/data/watermonste... SO4 Anionen 290.100 mg/l 2011-03-28
2 https://www.dov.vlaanderen.be/data/watermonste... Cd Zware metalen 0.519 µg/l 2011-03-28
3 https://www.dov.vlaanderen.be/data/watermonste... pH Fysico-chemische parameters 6.700 Sörensen 2011-03-28
4 https://www.dov.vlaanderen.be/data/watermonste... Mg Kationen 45.200 mg/l 2011-03-28

You should note that this initial dataframe contains all parameters (not just the cations). The filter will only make sure that only samples where any cation was analysed are in the list. If we want to filter more, we should do so in the resulting dataframe.


In [18]:
df_hamme = df_hamme[df_hamme.parameter=='Na']
df_hamme.head()


Out[18]:
pkey_grondwatermonster parameter parametergroep waarde eenheid datum_monstername
28 https://www.dov.vlaanderen.be/data/watermonste... Na Kationen 66.2 mg/l 2011-03-28
38 https://www.dov.vlaanderen.be/data/watermonste... Na Kationen 70.5 mg/l 2009-03-13
90 https://www.dov.vlaanderen.be/data/watermonste... Na Kationen 350.0 mg/l 2014-02-25
103 https://www.dov.vlaanderen.be/data/watermonste... Na Kationen 61.2 mg/l 2009-12-23
132 https://www.dov.vlaanderen.be/data/watermonste... Na Kationen 364.4 mg/l 2009-03-13

Working with water samples

For further analysis and visualisation of the time series data, we can use the data analysis library pandas and visualisation library matplotlib.


In [19]:
import pandas as pd
import matplotlib.pyplot as plt

Query the data of a specific filter using its pkey:


In [20]:
query = PropertyIsEqualTo(
            propertyname='pkey_filter',
            literal='https://www.dov.vlaanderen.be/data/filter/1991-001040')

df = gwmonster.search(query=query)
df.head()


[000/012] cccccccccccc
Out[20]:
pkey_grondwatermonster grondwatermonsternummer pkey_grondwaterlocatie gw_id pkey_filter filternummer x y start_grondwaterlocatie_mtaw gemeente datum_monstername parametergroep parameter detectie waarde eenheid veld_labo
0 https://www.dov.vlaanderen.be/data/watermonste... 4-0076/M2010 https://www.dov.vlaanderen.be/data/put/2017-00... 4-0076 https://www.dov.vlaanderen.be/data/filter/1991... 1 133977.0 199147.0 4.65 Hamme 2011-03-28 Anionen CO3 < 1.000 mg/l LABO
1 https://www.dov.vlaanderen.be/data/watermonste... 4-0076/M2010 https://www.dov.vlaanderen.be/data/put/2017-00... 4-0076 https://www.dov.vlaanderen.be/data/filter/1991... 1 133977.0 199147.0 4.65 Hamme 2011-03-28 Anionen SO4 NaN 290.100 mg/l LABO
2 https://www.dov.vlaanderen.be/data/watermonste... 4-0076/M2010 https://www.dov.vlaanderen.be/data/put/2017-00... 4-0076 https://www.dov.vlaanderen.be/data/filter/1991... 1 133977.0 199147.0 4.65 Hamme 2011-03-28 Zware metalen Cd NaN 0.519 µg/l LABO
3 https://www.dov.vlaanderen.be/data/watermonste... 4-0076/M2010 https://www.dov.vlaanderen.be/data/put/2017-00... 4-0076 https://www.dov.vlaanderen.be/data/filter/1991... 1 133977.0 199147.0 4.65 Hamme 2011-03-28 Fysico-chemische parameters pH NaN 6.700 Sörensen LABO
4 https://www.dov.vlaanderen.be/data/watermonste... 4-0076/M2010 https://www.dov.vlaanderen.be/data/put/2017-00... 4-0076 https://www.dov.vlaanderen.be/data/filter/1991... 1 133977.0 199147.0 4.65 Hamme 2011-03-28 Kationen Mg NaN 45.200 mg/l LABO

The date is still stored as a string type. Transforming to a data type using the available pandas function to_datetime and using these dates as row index:


In [21]:
df['datum_monstername'] = pd.to_datetime(df['datum_monstername'])

For many usecases, it is useful to create a pivoted table, showing the value per parameter


In [22]:
pivot = df.pivot_table(columns=df.parameter, values='waarde', index='datum_monstername')
pivot


Out[22]:
parameter %AfwijkBalans Al As B CO3 Ca Cd Cl Co Cr ... PO4(Tot.) Pb SO4 SomAN SomKAT T TOC Zn pH pH(Lab.)
datum_monstername
2006-05-24 NaN 0.0500 5.000 NaN NaN 399.0000 0.500 108.0000 NaN 2.1 ... 0.1 5.0 488.0000 NaN NaN 13.90 58.000 16.0000 6.700 NaN
2009-03-13 NaN NaN 5.000 NaN 1.0 394.7000 0.500 96.0000 NaN 10.0 ... NaN 10.0 431.0000 NaN NaN 12.70 8.585 10.0000 6.760 6.80
2009-12-23 NaN NaN 12.467 NaN 1.0 447.2000 0.500 105.0000 NaN 5.0 ... NaN 5.0 433.6000 NaN NaN 10.80 8.902 24.9390 7.040 6.90
2011-03-28 NaN NaN 12.998 NaN 1.0 331.1000 0.519 107.0000 NaN 5.0 ... NaN 5.0 290.1000 NaN NaN 12.00 9.984 12.9790 6.700 6.90
2011-11-04 NaN NaN 8.400 112.0 1.0 345.0000 0.500 99.0000 5.00 5.0 ... NaN 5.0 385.4900 NaN NaN 12.10 8.337 10.9000 6.730 7.40
2012-11-26 NaN NaN NaN NaN 1.0 360.0000 NaN 95.0000 NaN NaN ... NaN NaN 313.0000 NaN NaN 12.70 7.784 NaN 6.610 7.18
2014-02-25 NaN NaN 2.000 100.0 1.0 330.0000 0.030 91.0000 0.57 1.0 ... NaN 2.0 190.0000 NaN NaN 12.40 7.500 5.0000 6.820 7.10
2014-09-30 NaN 0.0200 2.300 100.0 1.0 280.0000 0.030 87.7459 0.65 1.0 ... NaN 2.0 143.2035 NaN NaN 12.40 5.800 5.0000 6.820 6.90
2015-09-30 -6.0485 0.0486 5.000 115.5 1.0 299.4059 0.500 88.8800 5.00 5.0 ... NaN 5.0 171.1500 22.3459 19.7969 11.99 9.188 47.2927 6.680 6.80
2016-07-12 NaN 0.0200 5.000 138.7 1.0 335.5201 0.400 91.5800 3.00 1.0 ... NaN 5.0 224.0200 22.5263 22.4309 12.20 8.750 16.4651 7.413 6.79
2017-03-27 -7.3053 0.0974 5.000 97.3 1.0 289.1559 0.400 76.4800 3.00 1.0 ... NaN 5.0 218.4100 23.0207 19.8862 12.30 7.995 10.0000 6.757 6.77
2018-09-03 0.8345 0.0200 5.000 134.9 1.0 354.3010 0.400 88.6000 3.00 1.0 ... NaN 5.0 222.8400 23.1974 23.5878 12.30 9.138 10.0000 6.870 6.88

12 rows × 39 columns

Plotting

The default plotting functionality of Pandas can be used:


In [35]:
parameters = ['NO3', 'NO2', 'NH4']
ax = pivot[parameters].plot.line(style='.-', figsize=(12, 5))
ax.set_xlabel('');
ax.set_ylabel('concentration (mg/l)');
ax.set_title('Concentration nitrite, nitrate and ammonium for filter id 1991-001040');


Combine search in filters and groundwater samples

For this example, we will first search filters, and later search all samples for this selection. We will select filters in the primary network located in Kalmthout.


In [73]:
from pydov.search.grondwaterfilter import GrondwaterFilterSearch
from pydov.util.query import Join

gfs = GrondwaterFilterSearch()

gemeente = 'Kalmthout'
filter_query = And([PropertyIsLike(propertyname='meetnet',
                       literal='meetnet 1 %'),
                    PropertyIsEqualTo(propertyname='gemeente',
                       literal=gemeente)])

filters = gfs.search(query=filter_query, return_fields=['pkey_filter'])

monsters = gwmonster.search(query=Join(filters, 'pkey_filter'))
monsters.head()


[000/181] cccccccccccccccccccccccccccccccccccccccccccccccccc
[050/181] cccccccccccccccccccccccccccccccccccccccccccccccccc
[100/181] cccccccccccccccccccccccccccccccccccccccccccccccccc
[150/181] ccccccccccccccccccccccccccccccc
Out[73]:
pkey_grondwatermonster grondwatermonsternummer pkey_grondwaterlocatie gw_id pkey_filter filternummer x y start_grondwaterlocatie_mtaw gemeente datum_monstername parametergroep parameter detectie waarde eenheid veld_labo
0 https://www.dov.vlaanderen.be/data/watermonste... 1-0409/M2006 https://www.dov.vlaanderen.be/data/put/2017-00... 1-0409 https://www.dov.vlaanderen.be/data/filter/1981... 1 154510.0 228040.0 21.29 Kalmthout 2006-05-05 Fysico-chemische parameters O2 NaN 1.91 mg/l LABO
1 https://www.dov.vlaanderen.be/data/watermonste... 1-0409/M2006 https://www.dov.vlaanderen.be/data/put/2017-00... 1-0409 https://www.dov.vlaanderen.be/data/filter/1981... 1 154510.0 228040.0 21.29 Kalmthout 2006-05-05 Kationen Mn NaN 0.23 mg/l LABO
2 https://www.dov.vlaanderen.be/data/watermonste... 1-0409/M2006 https://www.dov.vlaanderen.be/data/put/2017-00... 1-0409 https://www.dov.vlaanderen.be/data/filter/1981... 1 154510.0 228040.0 21.29 Kalmthout 2006-05-05 Fysico-chemische parameters EC NaN 608.00 µS/cm LABO
3 https://www.dov.vlaanderen.be/data/watermonste... 1-0409/M2006 https://www.dov.vlaanderen.be/data/put/2017-00... 1-0409 https://www.dov.vlaanderen.be/data/filter/1981... 1 154510.0 228040.0 21.29 Kalmthout 2006-05-05 Zware metalen Cu < 3.00 µg/l LABO
4 https://www.dov.vlaanderen.be/data/watermonste... 1-0409/M2006 https://www.dov.vlaanderen.be/data/put/2017-00... 1-0409 https://www.dov.vlaanderen.be/data/filter/1981... 1 154510.0 228040.0 21.29 Kalmthout 2006-05-05 Zware metalen Cd < 0.50 µg/l LABO

We will filter out some parameters, and show trends per location.


In [131]:
parameter = 'NH4'
trends_sel = monsters[(monsters.parameter==parameter) & (monsters.veld_labo=='LABO')]
trends_sel = trends_sel.set_index('datum_monstername')
trends_sel['label'] = trends_sel['gw_id'] + ' F' + trends_sel['filternummer'] 

# By pivoting, we get each location in a different column
trends_sel_pivot = trends_sel.pivot_table(columns='label', values='waarde', index='datum_monstername')
trends_sel_pivot.index = pd.to_datetime(trends_sel_pivot.index)

# resample to yearly values and plot data
ax = trends_sel_pivot.resample('A').median().plot.line(style='.-', figsize=(12, 5))
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))

ax.set_title(f'Long term evolution of {parameter} in {gemeente}');
ax.set_xlabel('year');
ax.set_ylabel('concentration (mg/l)');