CIE 5703 Week 6 - Assignment workflow

Requirements:

We suggest using Python with the following libraries:

  • pandas (for data reading and handling)
  • numpy
  • matplotlib (for plotting)
  • scipy (for fitting the generalized extreme value (GEV) distribution)

MATLAB should offer similar functions for reading, converting, resampling and accumulating the data.

The steps below refer to Python functions only.

Detailed instructions for the data handling:

  1. Read in the dataset using pandas read_csv

  2. Convert the timestamps to a machine-readable format (datetime) with pandas to_datetime and set the resulting datetime column as the index

  3. Replace invalid data with NaN

  4. Create new dataframes that aggregate the rainfall data to hourly/daily/monthly timescales using pandas resample, followed by sum (accumulated rainfall) and mean (mean rainfall per original time step)

  5. Create new dataframes with the accumulated hourly data for the summer and winter periods by building a boolean mask on the datetime index (pandas loc and index)
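
The five steps above can be sketched as follows. This is a minimal example, not the reference solution: the column names (timestamp, rain_mm), the invalid-data flag -999 and the choice of June to August as summer and December to February as winter are all assumptions made for illustration, and the small inline CSV stands in for the real dataset.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical 15-minute record; column names and the -999 flag are assumptions.
csv_text = """timestamp,rain_mm
2015-01-01 00:00,0.0
2015-01-01 00:15,0.2
2015-01-01 00:30,-999
2015-01-01 00:45,0.4
2015-01-01 01:00,1.0
2015-07-01 12:00,2.5
2015-07-01 12:15,3.0
"""

# Step 1: read the dataset (pass a file path instead of StringIO for real data)
df = pd.read_csv(io.StringIO(csv_text))

# Step 2: parse the timestamps and use them as the index
df["timestamp"] = pd.to_datetime(df["timestamp"])
df = df.set_index("timestamp")

# Step 3: replace the invalid-data flag with NaN
df["rain_mm"] = df["rain_mm"].replace(-999, np.nan)

# Step 4: accumulate (sum) or average (mean) on coarser timescales
hourly = df.resample("h").sum()
daily = df.resample("D").sum()
monthly = df.resample("MS").sum()      # "MS" labels each month by its first day
hourly_mean = df.resample("h").mean()  # mean rainfall per 15-minute step

# Step 5: boolean masks on the index select the seasons (month choice is an assumption)
summer = hourly.loc[hourly.index.month.isin([6, 7, 8])]
winter = hourly.loc[hourly.index.month.isin([12, 1, 2])]

print(hourly.head(2))
print(summer["rain_mm"].sum(), winter["rain_mm"].sum())
```

Note that sum skips NaN values, so an interval containing only invalid data is reported as 0. Passing min_count=1 to sum keeps such intervals as NaN instead.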

Answering the assignments:

1.

  • Compute mean, std and skew over the 15 min (Charlotte) or 10 min (Rotterdam) dataset with pandas mean, std, skew
  • Plot the histogram using pandas hist (use a logarithmic scale for better visibility), once with and once without zeros
  • Repeat with the 1 h and 24 h datasets, expressed in mm/15 min (Charlotte) or mm/10 min (Rotterdam), and compare the results
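
A possible way to organize this comparison is sketched below. The rainfall series is synthetic (random, 15-minute resolution) and only stands in for the real data; the helper name moments is hypothetical. Taking the resample mean keeps the 1 h and 24 h values in mm/15 min, which is one way to make the three timescales comparable.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic 15-minute rainfall (30 days): mostly dry, exponential depths when wet
rng = np.random.default_rng(0)
idx = pd.date_range("2015-01-01", periods=4 * 24 * 30, freq="15min")
is_wet = rng.random(idx.size) < 0.1
rain = pd.Series(np.where(is_wet, rng.exponential(0.5, idx.size), 0.0),
                 index=idx, name="rain_mm")

def moments(series):
    """Mean, standard deviation and skewness of a series (hypothetical helper)."""
    return pd.Series({"mean": series.mean(),
                      "std": series.std(),
                      "skew": series.skew()})

# Same unit (mm/15 min) on all timescales by averaging instead of summing
stats = pd.DataFrame({
    "15min": moments(rain),
    "1h": moments(rain.resample("h").mean()),
    "24h": moments(rain.resample("D").mean()),
})
print(stats)

# Histograms with and without zeros, on a logarithmic count axis
nonzero = rain[rain > 0]
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
rain.hist(bins=30, ax=axes[0])
nonzero.hist(bins=30, ax=axes[1])
axes[0].set_title("with zeros")
axes[1].set_title("without zeros")
for ax in axes:
    ax.set_yscale("log")
    ax.set_xlabel("rainfall [mm/15 min]")
fig.savefig("histograms.png")
```

With complete intervals the mean is unchanged by averaging, while std and skew typically shrink as the timescale grows, which is the effect to look for in the comparison.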

2.

  • Add year and month columns to the monthly accumulated datasets, derived from the datetime index (pandas index)
  • Plot boxplots of the accumulated rainfall column grouped by month / year with pandas boxplot
  • Repeat on the hourly scale (for diurnal cycles): boxplot by hour and observe. Repeat while neglecting data <1 mm/h, <3 mm/h, ... and observe how the pattern changes
  • Do the same for the summer / winter periods
  • Select hourly events of >10 mm/h and describe them
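
The grouping and filtering described above might look like the sketch below. The hourly series is synthetic, the column name rain_mm is an assumption, and summer is again taken as June to August for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic hourly rainfall for three years (stand-in for the accumulated data)
rng = np.random.default_rng(1)
idx = pd.date_range("2012-01-01", "2014-12-31 23:00", freq="h")
is_wet = rng.random(idx.size) < 0.1
hourly = pd.DataFrame(
    {"rain_mm": np.where(is_wet, rng.exponential(2.0, idx.size), 0.0)},
    index=idx,
)

# Monthly totals with year and month columns taken from the index
monthly = hourly.resample("MS").sum()
monthly["year"] = monthly.index.year
monthly["month"] = monthly.index.month
monthly.boxplot(column="rain_mm", by="month")  # seasonal cycle
monthly.boxplot(column="rain_mm", by="year")   # year-to-year variability

# Diurnal cycle: boxplot by hour, neglecting data below each threshold
hourly["hour"] = hourly.index.hour
for threshold in (0.0, 1.0, 3.0):
    subset = hourly[hourly["rain_mm"] >= threshold]
    ax = subset.boxplot(column="rain_mm", by="hour")
    ax.set_title(f"rain >= {threshold} mm/h")

# Same plot restricted to the summer months (month choice is an assumption)
summer = hourly[hourly.index.month.isin([6, 7, 8])]
summer.boxplot(column="rain_mm", by="hour")

# Heavy hourly events (> 10 mm/h) and a quick summary of when they occur
heavy = hourly[hourly["rain_mm"] > 10.0]
print(heavy["rain_mm"].describe())
print(heavy.groupby(heavy.index.month).size())

plt.close("all")  # release the figures created by boxplot
```

Each boxplot call with by opens a new figure, so in a notebook the plots appear one after another. Grouping the heavy events by month or hour is one simple way to describe when they tend to occur.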

3.

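  • Create a numpy linspace over the desired range
  • Fit the distribution with scipy genextreme.fit on the POT (peaks-over-threshold) values
  • Plot scipy genextreme.pdf with the fitted parameters
  • Add a normalized histogram of the data for visual inspection
  • Use genextreme.ppf with the fitted parameters to obtain the rainfall for the required return period, after converting the return period to a non-exceedance probability (ppf expects a probability, not a return period)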
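
A minimal sketch of the fitting steps follows. The sample of extremes is synthetic and stands in for the values selected in the assignment. The conversion p = 1 - 1/T assumes on average one extreme value per year (as for annual maxima); for a POT series with a different number of events per year, say lam, the probability would be 1 - 1/(lam * T) instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import genextreme

# Synthetic extremes in mm (assumption: replace with the selected rainfall extremes)
extremes = genextreme.rvs(c=-0.1, loc=20.0, scale=5.0, size=200, random_state=0)

# Fit the GEV; scipy's shape parameter c has the opposite sign of the usual xi
shape, loc, scale = genextreme.fit(extremes)

# Evaluate the fitted density on a regular grid
x = np.linspace(extremes.min(), extremes.max(), 200)
pdf = genextreme.pdf(x, shape, loc=loc, scale=scale)

# Fitted pdf over a normalized histogram for visual inspection
fig, ax = plt.subplots()
ax.hist(extremes, bins=20, density=True, alpha=0.5, label="data")
ax.plot(x, pdf, label="fitted GEV")
ax.set_xlabel("rainfall [mm]")
ax.set_ylabel("density")
ax.legend()
fig.savefig("gev_fit.png")

# Return levels: convert return period T (years) to non-exceedance probability
return_periods = np.array([2, 10, 50, 100])
probabilities = 1.0 - 1.0 / return_periods  # assumes one value per year
return_levels = genextreme.ppf(probabilities, shape, loc=loc, scale=scale)

for T, level in zip(return_periods, return_levels):
    print(f"T = {T:>3d} years: {level:.1f} mm")
```

If the fitted curve deviates clearly from the histogram, it is worth checking the selection of extremes before trusting the return levels.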
