In [1]:
import pandas as pd
from lizard_connector import Client
cli = Client()
cli.endpoints


Out[1]:
('annotations',
 'assetgroups',
 'bridges',
 'buildings',
 'colormaps',
 'contactgroups',
 'contacts',
 'culverts',
 'datasources',
 'domains',
 'events',
 'eventseries',
 'favourites',
 'filters',
 'fixeddrainagelevelareas',
 'groundwaterstations',
 'inbox',
 'leveecrosssections',
 'leveereferencepoints',
 'leveerings',
 'levees',
 'leveesections',
 'leveezones',
 'locations',
 'manholes',
 'measuringstations',
 'messages',
 'monitoringwells',
 'nodes',
 'observationtypes',
 'opticalfibers',
 'organisations',
 'orifices',
 'outlets',
 'overflows',
 'parcels',
 'pi',
 'pipes',
 'polders',
 'pressurepipes',
 'pumpeddrainageareas',
 'pumps',
 'pumpstations',
 'raster_aggregates',
 'rasteralarms',
 'rasters',
 'regions',
 'roads',
 'scenario_results',
 'scenarios',
 'search',
 'sluices',
 'timeseries',
 'timeseriesalarms',
 'timeseriestypes',
 'users',
 'wastewatertreatmentplants',
 'weirs',
 'wmslayers')

The connection with Lizard is made and all available endpoints are listed above; each of them is accessible as an attribute on the client. Now we collect the metadata for a first timeseries, with UUID 867b166a-fa39-457d-a9e9-4bcb2ff04f61:


In [2]:
result = cli.timeseries.get(uuid="867b166a-fa39-457d-a9e9-4bcb2ff04f61")
result.metadata


Out[2]:
access_modifier code datasource end id interval last_value location__code location__geometry location__name ... observation_type__unit observation_type__url start supplier supplier_code supply_frequency timeseries_type url uuid value_type
0 Publiek WNS1400.1h::second::1::3600 None 2018-04-12 20:00:00 1419 3600 None 06348 None Cabauw ... mm https://demo.lizard.net/api/v3/observationtype... 946681200000 None 3600 None https://demo.lizard.net/api/v3/timeseries/867b... 867b166a-fa39-457d-a9e9-4bcb2ff04f61 float

1 rows × 40 columns
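
The start value in the metadata, like the start and end query parameters used in the next request, is an epoch timestamp in milliseconds. The short check below (an illustrative sketch using pandas, not part of the original notebook) converts both values to readable dates:

# The start/end values are epoch timestamps in milliseconds; convert them
# to readable (UTC) dates to verify the requested period.
for name, ms in [("start", 946681200000), ("end", 1518631200000)]:
    print(name, pd.to_datetime(ms, unit="ms", utc=True))
# start 1999-12-31 23:00:00+00:00
# end 2018-02-14 18:00:00+00:00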

Next we download the timeseries with UUID 867b166a-fa39-457d-a9e9-4bcb2ff04f61, from December 31st 1999 until February 14th 2018. This is the metadata:


In [3]:
queryparams = {
    "end": 1518631200000,   # epoch timestamp in milliseconds
    "start": 946681200000,  # epoch timestamp in milliseconds
    "window": "month"       # aggregate the events per month
}
result = cli.timeseries.get(uuid="867b166a-fa39-457d-a9e9-4bcb2ff04f61", **queryparams)
result.metadata


Out[3]:
access_modifier code datasource end id interval last_value location__code location__geometry location__name ... observation_type__url percentiles start supplier supplier_code supply_frequency timeseries_type url uuid value_type
0 Publiek WNS1400.1h::second::1::3600 None 2018-04-12 20:00:00 1419 3600 None 06348 None Cabauw ... https://demo.lizard.net/api/v3/observationtype... [] 946681200000 None 3600 None https://demo.lizard.net/api/v3/timeseries/867b... 867b166a-fa39-457d-a9e9-4bcb2ff04f61 float

1 rows × 41 columns

And this is the data:


In [4]:
result.data[0]


Out[4]:
max min timestamp
0 2.9 0.0 2000-01-01
1 3.4 0.0 2000-02-01
2 13.2 0.0 2000-03-01
3 2.3 0.0 2000-04-01
4 6.6 0.0 2000-05-01
5 9.0 0.0 2000-06-01
6 21.9 0.0 2000-07-01
7 7.5 0.0 2000-08-01
8 7.0 0.0 2000-09-01
9 7.4 0.0 2000-10-01
10 3.3 0.0 2000-11-01
11 3.0 0.0 2000-12-01
12 4.0 0.0 2001-01-01
13 2.9 0.0 2001-02-01
14 4.6 0.0 2001-03-01
15 2.9 0.0 2001-04-01
16 3.2 0.0 2001-05-01
17 15.4 0.0 2001-06-01
18 6.2 0.0 2001-07-01
19 15.2 0.0 2001-08-01
20 6.9 0.0 2001-09-01
21 3.0 0.0 2001-10-01
22 3.3 0.0 2001-11-01
23 3.7 0.0 2001-12-01
24 6.4 0.0 2002-01-01
25 3.7 0.0 2002-02-01
26 3.7 0.0 2002-03-01
27 2.8 0.0 2002-04-01
28 2.0 0.0 2002-05-01
29 8.8 0.0 2002-06-01
... ... ... ...
188 5.9 0.0 2015-09-01
189 1.5 0.0 2015-10-01
190 3.2 0.0 2015-11-01
191 4.7 0.0 2015-12-01
192 3.4 0.0 2016-01-01
193 2.4 0.0 2016-02-01
194 2.1 0.0 2016-03-01
195 3.7 0.0 2016-04-01
196 4.7 0.0 2016-05-01
197 13.3 0.0 2016-06-01
198 6.2 0.0 2016-07-01
199 21.1 0.0 2016-08-01
200 3.2 0.0 2016-09-01
201 5.2 0.0 2016-10-01
202 4.3 0.0 2016-11-01
203 1.9 0.0 2016-12-01
204 3.8 0.0 2017-01-01
205 4.3 0.0 2017-02-01
206 2.5 0.0 2017-03-01
207 3.9 0.0 2017-04-01
208 2.7 0.0 2017-05-01
209 11.1 0.0 2017-06-01
210 14.9 0.0 2017-07-01
211 5.4 0.0 2017-08-01
212 7.9 0.0 2017-09-01
213 5.1 0.0 2017-10-01
214 4.2 0.0 2017-11-01
215 3.2 0.0 2017-12-01
216 3.5 0.0 2018-01-01
217 2.0 0.0 2018-02-01

218 rows × 3 columns
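
Because of the window aggregation these events are already monthly maxima and minima, which makes them easy to plot. The sketch below is purely illustrative and assumes matplotlib is installed; it is not part of the original notebook:

# Illustrative only: plot the monthly precipitation maxima.
ax = result.data[0].set_index('timestamp')['max'].plot(figsize=(10, 3))
ax.set_ylabel('max precipitation (mm)')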

Now we can search for other timeseries based on this metadata. We are going to look at the correlation between weather variables such as precipitation, evaporation and wind speed. First the observation type metadata:


In [5]:
location__uuid = result.metadata['location__uuid'][0]
metadata_multiple, events_multiple = cli.timeseries.get(location__uuid=location__uuid, **queryparams)
columns = [x for x in metadata_multiple.columns if "observation_type" in x or "uuid" in x]
metadata_multiple[columns]


Out[5]:
location__uuid node__uuid observation_type__code observation_type__compartment observation_type__description observation_type__domain_values observation_type__parameter observation_type__reference_frame observation_type__scale observation_type__unit observation_type__url uuid
0 908dc271-91e8-471c-8f67-b491e48d7c99 6be7b0dd-b65b-4e33-adcc-80ef1e28b4b5 WNS1400.1h NT None Neerslag ratio mm https://demo.lizard.net/api/v3/observationtype... 867b166a-fa39-457d-a9e9-4bcb2ff04f61
1 908dc271-91e8-471c-8f67-b491e48d7c99 6be7b0dd-b65b-4e33-adcc-80ef1e28b4b5 WNS3832 LT None Luchttemperatuur interval oC https://demo.lizard.net/api/v3/observationtype... 1acf8b8a-45d6-43fb-88c6-c9786f7d9663
2 908dc271-91e8-471c-8f67-b491e48d7c99 6be7b0dd-b65b-4e33-adcc-80ef1e28b4b5 WNS8874 NT None Windsnelheid interval m/s https://demo.lizard.net/api/v3/observationtype... 4e769b01-488a-42e4-b7bb-cdcb8c91e4cb
3 908dc271-91e8-471c-8f67-b491e48d7c99 6be7b0dd-b65b-4e33-adcc-80ef1e28b4b5 WNS928 NT None Druk interval bar https://demo.lizard.net/api/v3/observationtype... 9f36f0b2-ada8-4286-a386-b8f206096776
4 908dc271-91e8-471c-8f67-b491e48d7c99 6be7b0dd-b65b-4e33-adcc-80ef1e28b4b5 WNS9027 LT None Verdamping None ratio mm https://demo.lizard.net/api/v3/observationtype... fb30d529-1bc5-4a18-a02f-a48e0d88aa52
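
The observation type parameters are returned in Dutch: Neerslag is precipitation, Verdamping is evaporation, Windsnelheid is wind speed, Luchttemperatuur is air temperature and Druk is pressure. If you prefer English labels, a small mapping does the job (an illustrative sketch, not part of the original notebook):

# Illustrative: English labels for the Dutch parameter names shown above.
dutch_to_english = {
    'Neerslag': 'precipitation',
    'Luchttemperatuur': 'air temperature',
    'Windsnelheid': 'wind speed',
    'Druk': 'pressure',
    'Verdamping': 'evaporation',
}
metadata_multiple['parameter_en'] = metadata_multiple['observation_type__parameter'].map(dutch_to_english)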

The timeseries have different lengths, so we cannot correlate them directly: we first slice them to the period they have in common so that every series has the same length. Because one of our query parameters was {"window": "month"}, the data has already been resampled to monthly values.


In [6]:
# Index every timeseries by its timestamp, then clip all of them to the
# period they have in common (latest start, earliest end).
indexed_events = [e.set_index('timestamp') for e in events_multiple if 'timestamp' in e.columns]
first = max([indexed.index[0] for indexed in indexed_events])
last = min([indexed.index[-1] for indexed in indexed_events])
print(first, last)
indexed_events_ranged = [e[first:last] for e in indexed_events]
[e.shape for e in indexed_events_ranged]


2014-12-01 00:00:00 2016-05-01 00:00:00
Out[6]:
[(18, 2), (18, 2), (18, 2), (18, 2), (18, 2)]
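
An alternative to slicing by the first and last shared timestamp is an inner join on the timestamp index; for this monthly data it keeps the same 18 shared months. A sketch, not part of the original notebook:

# Illustrative alternative: keep only timestamps present in every series.
aligned = pd.concat(
    {i: e['max'] for i, e in enumerate(indexed_events)},
    axis=1,
    join='inner',
)
aligned.shape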

All series now have the same length. Next we select the monthly maximum for each timeseries:


In [7]:
observation_types = metadata_multiple['observation_type__parameter'].tolist()
max_weerdata = pd.DataFrame({
    observation_type: events['max']
    for observation_type, events in zip(observation_types, indexed_events_ranged)
})
max_weerdata


Out[7]:
Druk Luchttemperatuur Neerslag Verdamping Windsnelheid
timestamp
2014-12-01 1037.8 NaN 3.6 0.5 NaN
2015-01-01 1036.1 12.5 6.1 0.6 19.0
2015-02-01 1041.2 10.4 2.9 1.2 12.0
2015-03-01 1038.1 16.7 4.0 1.9 19.0
2015-04-01 1037.5 21.9 1.9 3.5 13.0
2015-05-01 1027.5 24.7 5.6 4.5 17.0
2015-06-01 1032.2 31.1 2.9 5.4 13.0
2015-07-01 1025.1 32.7 24.0 5.8 17.0
2015-08-01 1024.0 27.9 13.5 4.6 9.0
2015-09-01 1038.7 20.1 5.9 3.0 11.0
2015-10-01 1033.5 17.7 1.5 2.2 8.0
2015-11-01 1030.8 18.1 3.2 1.1 18.0
2015-12-01 1036.0 14.6 4.7 0.6 14.0
2016-01-01 1033.0 13.6 3.4 0.8 14.0
2016-02-01 1037.3 12.1 2.4 1.4 17.0
2016-03-01 1036.2 14.2 2.1 2.3 16.0
2016-04-01 1032.8 20.0 3.7 3.4 12.0
2016-05-01 1030.0 25.7 4.7 4.9 11.0
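
Note the NaN values in the first row: Luchttemperatuur and Windsnelheid have no value for 2014-12-01. DataFrame.corr() excludes missing values pairwise, so this is not a problem, but you can also drop incomplete rows explicitly (an illustrative one-liner, not part of the original notebook):

# Illustrative: drop rows with missing values before correlating.
max_weerdata.dropna().corr()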

Finally, the correlation is easily calculated with pandas:


In [8]:
max_weerdata.corr()


Out[8]:
Druk Luchttemperatuur Neerslag Verdamping Windsnelheid
Druk 1.000000 -0.750063 -0.649189 -0.666295 0.092358
Luchttemperatuur -0.750063 1.000000 0.599179 0.942602 -0.218572
Neerslag -0.649189 0.599179 1.000000 0.516933 0.087513
Verdamping -0.666295 0.942602 0.516933 1.000000 -0.280535
Windsnelheid 0.092358 -0.218572 0.087513 -0.280535 1.000000
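
A heatmap makes the correlation matrix easier to read at a glance. The sketch below assumes matplotlib is available; it is not used in the original notebook:

# Illustrative: visualise the correlation matrix as a heatmap.
import matplotlib.pyplot as plt

corr = max_weerdata.corr()
fig, ax = plt.subplots()
im = ax.imshow(corr, vmin=-1, vmax=1, cmap='RdBu_r')
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha='right')
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(im, ax=ax)
plt.show()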
