Convert CNR Wave Data to NetCDF DSG (CF-1.6)

From Davide Bonaldo at CNR-ISMAR: here's a time series of wave data from Jesolo.

  • Columns 1 to 6: date (y m d h m s)
  • Column 7: Significant wave height (m)
  • Column 8: Mean period (s)
  • Column 9: Mean direction (deg)
  • Column 10: Sea surface elevation (m)
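
For reference, here is a minimal sketch of the column layout using the short names assigned to the dataframe later in this notebook (the first six names below are just placeholders for the date fields, not names from the original data):

# column layout of ONDE_Jesolo.txt (whitespace-delimited, no header row)
col_names = ['year', 'month', 'day', 'hour', 'minute', 'second',
             'Hsig',     # significant wave height (m)
             'Twave',    # mean period (s)
             'Dwave',    # mean direction (deg)
             'Wlevel']   # sea surface elevation (m)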

In [1]:
import numpy as np
import urllib
%matplotlib inline

In [2]:
url='https://www.dropbox.com/s/0epy3vsjgl1h8ld/ONDE_Jesolo.txt?dl=1'
local_file = '/usgs/data2/notebook/data/ONDE_Jesolo.txt'

In [3]:
urllib.urlretrieve(url, local_file)


Out[3]:
('/usgs/data2/notebook/data/ONDE_Jesolo.txt',
 <httplib.HTTPMessage instance at 0x7fb90cbd1fc8>)

In [4]:
from datetime import datetime
import pandas as pd

def date_parser(year, month, day, hour, minute, second):
    # each field may arrive as a string like '2013.0', so coerce via float before int
    parts = [int(float(x)) for x in (year, month, day, hour, minute, second)]
    return datetime(*parts)

df = pd.read_csv(local_file, header=None,
                 delim_whitespace=True, index_col='datetime',
                 parse_dates={'datetime': [0, 1, 2, 3, 4, 5]},
                 date_parser=date_parser)

In [5]:
df.columns = ['Hsig', 'Twave', 'Dwave', 'Wlevel']

In [6]:
df[['Hsig','Wlevel']].plot()


Out[6]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb906e5e410>

In [7]:
import calendar
times = [calendar.timegm(x.timetuple()) for x in df.index]
times = np.asarray(times, dtype=np.int64)

In [8]:
def pd_to_secs(df):
    """
    Convert a pandas DatetimeIndex to seconds since 1970 (the Unix epoch).
    """
    import calendar
    return np.asarray([calendar.timegm(x.timetuple()) for x in df.index], dtype=np.int64)

In [9]:
secs = pd_to_secs(df)

In [18]:
# z is positive down; the create tool writes a profile file (e.g. ADCP) if the z values vary,
# and a simple time series otherwise.
# Here we assume the wave height is measured at the sea surface, so z = 0 everywhere.
z = 0. * np.ones_like(secs)

In [19]:
values = df['Hsig'].values

Now we use some tools from Kyle Wilcox (https://github.com/kwilcox/pytools) to write a single time series to a netCDF file. Right now the tool is set up to handle only one variable, so it's not a complete solution for dataframes with more than one column of time series data. Each variable could, however, be written to a separate file and the files then merged together using NCO (or NcML); a sketch of that approach follows the create_timeseries_file call below.

To use the time series creation tool, we pass in either a list of tuples data=[(time, vertical, value)] or three NumPy arrays via times=, verticals=, and values=, where time is seconds since the epoch and the verticals are the sensor heights. The code was written for tide gauge data, so it writes the vertical coordinate as positive down, relative to the sea surface; it would need to be modified for land applications.
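
For illustration, the list-of-tuples form mentioned above can be assembled from the arrays computed earlier; the data= keyword is taken from the description above rather than verified against the current pytools API, so treat this as a sketch:

# sketch: the [(time, vertical, value), ...] form described above,
# built from arrays already defined in this notebook
data = list(zip(secs, z, values))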


In [20]:
from pytools.netcdf.sensors import create, ncml, merge, crawl

In [21]:
sensor_urn='urn:it.cnr.ismar.ve:sensor:wave_height'
station_urn='urn:it.cnr.ismar.ve:station:piattaforma_aqua_alta'

In [22]:
attributes={'units':'m'}

In [23]:
create.create_timeseries_file(output_directory='/usgs/data2/notebook/data',
                              latitude=0.29,
                              longitude=36.9,
                              full_station_urn=station_urn,
                              full_sensor_urn=sensor_urn,
                              sensor_vertical_datum='MSL',
                              times=secs,
                              verticals=z, 
                              values=values,
                              attributes=attributes,
                              global_attributes={},
                              output_filename='wave_data.nc')
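
Since the tool writes one variable per file, a loop like the following could write each dataframe column to its own netCDF file, to be merged afterwards with NCO or NcML. This is a hedged sketch: the per-column sensor URNs and output file names are illustrative, and the units simply follow the column descriptions at the top of this notebook.

# sketch: one CF-1.6 DSG file per column (merge later with NCO or NcML)
units = {'Hsig': 'm', 'Twave': 's', 'Dwave': 'degree', 'Wlevel': 'm'}
for col in df.columns:
    create.create_timeseries_file(output_directory='/usgs/data2/notebook/data',
                                  latitude=0.29,
                                  longitude=36.9,
                                  full_station_urn=station_urn,
                                  full_sensor_urn='urn:it.cnr.ismar.ve:sensor:%s' % col.lower(),
                                  sensor_vertical_datum='MSL',
                                  times=secs,
                                  verticals=z,
                                  values=df[col].values,
                                  attributes={'units': units[col]},
                                  global_attributes={},
                                  output_filename='wave_data_%s.nc' % col)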

The generated NetCDF CF-1.6 DSG file is served on TDS here:

http://geoport.whoi.edu/thredds/catalog/usgs/data2/notebook/data/catalog.html?dataset=usgs/data2/notebook/data/wave_data.nc

It appears that the SOS service is working, which is one of the reasons to write CF-1.6 DSG!
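
As a quick sanity check, the served file can also be opened remotely over OPeNDAP; the dodsC endpoint below is assumed from the standard THREDDS URL layout rather than copied from the catalog:

# sketch: open the dataset via the (assumed) OPeNDAP endpoint and list its variables
from netCDF4 import Dataset
nc = Dataset('http://geoport.whoi.edu/thredds/dodsC/usgs/data2/notebook/data/wave_data.nc')
print(nc.variables.keys())
nc.close()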


In [ ]: