In [1]:
import pandas as pd
from lizard_connector import Client
cli = Client()
cli.endpoints
Out[1]:
The connection with Lizard is made. All available endpoints are shown above.
Now we collect the metadata for a first timeseries, with uuid 867b166a-fa39-457d-a9e9-4bcb2ff04f61:
In [2]:
result = cli.timeseries.get(uuid="867b166a-fa39-457d-a9e9-4bcb2ff04f61")
result.metadata
Out[2]:
Next we download the timeseries with uuid 867b166a-fa39-457d-a9e9-4bcb2ff04f61
from December 31st, 1999 until February 14th, 2018.
This is the metadata:
In [3]:
queryparams = {
    "end": 1518631200000,
    "start": 946681200000,
    "window": "month"
}
result = cli.timeseries.get(uuid="867b166a-fa39-457d-a9e9-4bcb2ff04f61", **queryparams)
result.metadata
Out[3]:
And this is the data:
In [4]:
result.data[0]
Out[4]:
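The `start` and `end` values in the query parameters above look like Unix timestamps in milliseconds. As a minimal sketch of how such values can be produced from readable dates (the helper `to_epoch_ms` is a hypothetical name, not part of lizard_connector, and interpreting the dates in UTC is an assumption):

```python
from datetime import datetime, timezone


def to_epoch_ms(dt):
    """Convert a timezone-aware datetime to integer milliseconds since the Unix epoch."""
    return int(dt.timestamp() * 1000)


# Assumption: the dates are interpreted in UTC.
start = to_epoch_ms(datetime(1999, 12, 31, 23, 0, tzinfo=timezone.utc))
end = to_epoch_ms(datetime(2018, 2, 14, 18, 0, tzinfo=timezone.utc))
print(start, end)  # 946681200000 1518631200000
```

These two numbers match the `start` and `end` used in the query above, which is consistent with the period from December 31st, 1999 until February 14th, 2018.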
Now we can search for other timeseries based on the metadata. We are going to look at the correlation between precipitation, evaporation and wind speed. First we fetch all timeseries at the same location and inspect their observation type metadata:
In [5]:
location__uuid = result.metadata['location__uuid'][0]
metadata_multiple, events_multiple = cli.timeseries.get(location__uuid=location__uuid, **queryparams)
columns = [x for x in metadata_multiple.columns if "observation_type" in x or "uuid" in x]
metadata_multiple[columns]
Out[5]:
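The column selection in the cell above keeps every column whose name contains a given substring. A small self-contained sketch of the same idea, using a made-up metadata table (the column names and values here are illustrative assumptions, not actual Lizard output):

```python
import pandas as pd

# Hypothetical metadata table; the real one comes from cli.timeseries.get(...).
metadata = pd.DataFrame({
    "uuid": ["a1", "b2"],
    "location__uuid": ["loc-1", "loc-1"],
    "observation_type__parameter": ["precipitation", "wind speed"],
    "name": ["first", "second"],
})

# Keep only the columns whose name contains one of the keywords.
keywords = ("observation_type", "uuid")
columns = [c for c in metadata.columns if any(k in c for k in keywords)]
subset = metadata[columns]
print(subset.columns.tolist())
# ['uuid', 'location__uuid', 'observation_type__parameter']
```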
The timeseries have different lengths, so we cannot correlate them until we slice them to a common time range. Because one of our query parameters was {"window":"month"}, the data is already resampled. Here we make sure all series cover the same period.
In [6]:
indexed_events = [e.set_index('timestamp') for e in events_multiple if 'timestamp' in e.columns]
first = max([indexed.index[0] for indexed in indexed_events])
last = min([indexed.index[-1] for indexed in indexed_events])
print(first, last)
indexed_events_ranged = [e[first:last] for e in indexed_events]
[e.shape for e in indexed_events_ranged]
Out[6]:
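The slicing step can be tried without a Lizard connection. The sketch below uses two synthetic monthly series (made-up data, assumed to have a `timestamp` and a `max` column like the events above) and trims them to their overlapping period: the latest start and the earliest end.

```python
import pandas as pd

# Two synthetic monthly series with different start and end dates (illustrative data only).
a = pd.DataFrame({"timestamp": pd.date_range("2000-01-01", periods=12, freq="MS"),
                  "max": range(12)})
b = pd.DataFrame({"timestamp": pd.date_range("2000-03-01", periods=12, freq="MS"),
                  "max": range(12)})

indexed = [df.set_index("timestamp") for df in (a, b)]

# The common range runs from the latest first timestamp to the earliest last timestamp.
first = max(df.index[0] for df in indexed)
last = min(df.index[-1] for df in indexed)

# Label-based slicing on a DatetimeIndex includes both endpoints.
ranged = [df.loc[first:last] for df in indexed]
print([df.shape for df in ranged])  # [(10, 1), (10, 1)]
```

Note that this only guarantees equal lengths when the series share the same regular time step inside the overlap, as is assumed here for monthly windows.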
We can see that the series now have the same length. Next we select the max column of each timeseries:
In [7]:
observation_types = metadata_multiple['observation_type__parameter'].tolist()
max_weerdata = pd.DataFrame({observation_type: events['max'] for observation_type, events in zip(observation_types, indexed_events_ranged)})
max_weerdata
Out[7]:
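Building one table from several series relies on pandas aligning the columns on their shared index. A minimal sketch with made-up values (the names and numbers are assumptions for illustration only):

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=3, freq="MS")

# Two hypothetical event tables that share the same timestamps.
frames = [
    pd.DataFrame({"min": [0.0, 1.0, 2.0], "max": [5.0, 6.0, 7.0]}, index=index),
    pd.DataFrame({"min": [1.0, 1.0, 1.0], "max": [9.0, 8.0, 7.0]}, index=index),
]
names = ["precipitation", "wind speed"]

# One column per series, taken from each table's "max" column.
combined = pd.DataFrame({name: frame["max"] for name, frame in zip(names, frames)})
print(combined)
```

Because the column names are dictionary keys, two series with the same observation type would overwrite each other, so it may be worth checking that the names are unique first.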
Finally, pandas makes it easy to calculate the correlation:
In [8]:
max_weerdata.corr()
Out[8]:
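By default `DataFrame.corr()` returns the pairwise Pearson correlation between all columns. The toy example below (made-up numbers, not weather data) shows what the resulting matrix looks like and cross-checks one entry against `numpy.corrcoef`:

```python
import numpy as np
import pandas as pd

# y rises linearly with x, z falls linearly with x.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [4.0, 3.0, 2.0, 1.0],
})

corr = df.corr()  # Pearson correlation is the default method
print(corr)

# Cross-check a single pair with NumPy.
manual = np.corrcoef(df["x"], df["z"])[0, 1]
print(manual)
```

Here `x` and `y` correlate perfectly (coefficient 1), while `x` and `z` are perfectly anti-correlated (coefficient -1), up to floating-point rounding.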