The diagram below is a graphical representation of how to think about this data. For details on how multidimensional data are represented in xray, see: http://xray.readthedocs.org/en/latest/
In [1]:
from IPython.display import Image
Image(url='http://xray.readthedocs.org/en/latest/_images/dataset-diagram.png', embed=True, width=950, height=300)
Out[1]:
In [2]:
import numpy as np
import pandas as pd
import xray
This is an example of what our soil moisture data from the radio tower installation will look like. Each site has a latitude, longitude, and elevation, and at each site we will record rainfall as well as soil temperature and soil moisture at two depths. The data are therefore recorded along up to three dimensions: site, depth, and time.
In [3]:
temp = 15 + 8 * np.random.randn(2, 2, 3)
VW = 15 + 10 * abs(np.random.randn(2, 2, 3))
precip = 10 * np.random.rand(2, 3)
depths = [5, 20]
lons = [-99.83, -99.79]
lats = [42.63, 42.59]
elevations = [1600, 1650]
ds = xray.Dataset({'temperature': (['site', 'depth', 'time'], temp, {'units': 'C'}),
                   'soil_moisture': (['site', 'depth', 'time'], VW, {'units': 'percent'}),
                   'precipitation': (['site', 'time'], precip, {'units': 'mm'})},
                  coords={'lon': (['site'], lons, {'units': 'degrees east'}),
                          'lat': (['site'], lats, {'units': 'degrees north'}),
                          'elevation': (['site'], elevations, {'units': 'm'}),
                          'site': ['Acacia', 'Riverine'],
                          'depth': (['depth'], depths, {'units': 'cm'}),
                          'time': pd.date_range('2015-05-19', periods=3)})
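Each variable above pairs a list of dimension names with an array, and the array's shape has to agree with the lengths of those dimensions. The sketch below illustrates that bookkeeping with plain NumPy. The helper `check_shapes` and the `dim_sizes` dictionary are hypothetical names introduced for this illustration only; they are not part of xray.

```python
import numpy as np

# Length of each dimension, taken from the coordinates in the example above.
dim_sizes = {'site': 2, 'depth': 2, 'time': 3}

# Placeholder arrays with the same shapes as temp, VW, and precip.
variables = {
    'temperature': (['site', 'depth', 'time'], np.zeros((2, 2, 3))),
    'soil_moisture': (['site', 'depth', 'time'], np.zeros((2, 2, 3))),
    'precipitation': (['site', 'time'], np.zeros((2, 3))),
}


def check_shapes(variables, dim_sizes):
    """Return the names of variables whose shape disagrees with their dims.

    Hypothetical helper for illustration; not an xray function.
    """
    bad = []
    for name, (dims, data) in variables.items():
        expected = tuple(dim_sizes[d] for d in dims)
        if data.shape != expected:
            bad.append(name)
    return bad


mismatched = check_shapes(variables, dim_sizes)
print(mismatched)
```

Precipitation has no depth axis, which is why its array is two-dimensional while the two soil variables are three-dimensional.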
In [4]:
ds
Out[4]:
To select the data for a specific site we just write:
In [5]:
ds.sel(site='Acacia')
Out[5]:
Now if we are only interested in soil moisture at the upper depth at a specific time, we can pull out just that one data point:
In [6]:
print(ds.soil_moisture.sel(site='Acacia', time='2015-05-19', depth=5).values)
Precipitation has no depth dimension, so a single data point can be pulled out by selecting on time and site alone:
In [7]:
print(ds.precipitation.sel(site='Acacia', time='2015-05-19').values)
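To make the label-based selections above more concrete, here is a minimal sketch of what selecting by label amounts to: each label is translated into an integer position along its dimension, and the underlying array is then indexed by position. It uses only NumPy and pandas and is an illustration of the idea, not xray's actual implementation. The stand-in data use a fixed seed, which is an assumption made here for reproducibility.

```python
import numpy as np
import pandas as pd

# Coordinate labels, matching the example dataset above.
sites = ['Acacia', 'Riverine']
depths = [5, 20]
times = pd.date_range('2015-05-19', periods=3)

# Stand-in soil moisture data with dimensions (site, depth, time).
rng = np.random.RandomState(0)
VW = 15 + 10 * np.abs(rng.randn(2, 2, 3))

# Translate each label into its integer position along its dimension.
i_site = sites.index('Acacia')
i_depth = depths.index(5)
i_time = times.get_loc(pd.Timestamp('2015-05-19'))

# Positional indexing then returns the single requested value.
value = VW[i_site, i_depth, i_time]
print(value)
```

Selecting by label avoids having to remember which axis is which and which position a given site, depth, or date occupies.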
In [8]:
ds.to_dataframe()
Out[8]:
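The conversion above flattens the dataset into a table. As a rough sketch of the resulting layout, assuming (as the `to_dataframe` documentation describes) that rows are indexed by the Cartesian product of the dimensions, the same structure can be built by hand with pandas. Variables that lack a dimension, such as precipitation, are repeated along it. The fixed seed and the variable `precip_3d` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

sites = ['Acacia', 'Riverine']
depths = [5, 20]
times = pd.date_range('2015-05-19', periods=3)

rng = np.random.RandomState(0)
VW = 15 + 10 * np.abs(rng.randn(2, 2, 3))  # dimensions (site, depth, time)
precip = 10 * rng.rand(2, 3)               # dimensions (site, time)

# One row per (site, depth, time) combination; the last level varies fastest,
# which matches the C-order flattening of a (site, depth, time) array.
index = pd.MultiIndex.from_product([sites, depths, times],
                                   names=['site', 'depth', 'time'])

# Repeat precipitation along a new depth axis so it lines up with VW.
precip_3d = np.broadcast_to(precip[:, np.newaxis, :], VW.shape)

df = pd.DataFrame({'soil_moisture': VW.ravel(),
                   'precipitation': precip_3d.ravel()},
                  index=index)
print(df)
```

With 2 sites, 2 depths, and 3 times, the table has 12 rows, and each precipitation value appears once per depth.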
In [9]:
ds.to_netcdf('test.nc')
In [10]:
sites = ['MainTower'] # can be replaced if there are more specific sites
lons = [36.8701] # degrees east
lats = [0.4856] # degrees north
elevations = [1610] # m above sea level
coords = {'site': (['site'], sites),
          'lon': (['site'], lons, dict(units='degrees east')),
          'lat': (['site'], lats, dict(units='degrees north')),
          'elevation': (['site'], elevations, dict(units='m')),
          'time': pd.date_range('2015-05-19', periods=3)}
precip = 10 * np.random.rand(1, 3)
ds = xray.Dataset({'precipitation': (['site', 'time'], precip, {'units': 'mm'})},
                  coords=coords)
ds
Out[10]:
In [11]:
df = ds.to_dataframe()
df
Out[11]:
In [12]:
df.index
Out[12]:
In [13]:
from __init__ import *
from TOA5_to_netcdf import *
In [14]:
lons = [36.8701] # degrees east
lats = [0.4856] # degrees north
elevations = [1610] # m above sea level
coords = {'lon': (['site'], lons, dict(units='degrees east')),
          'lat': (['site'], lats, dict(units='degrees north')),
          'elevation': (['site'], elevations, dict(units='m'))}
In [15]:
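import os  # explicit import; it may already be provided by the star imports above

# Build the data directory path with forward slashes, also on Windows.
path = os.getcwd().replace('\\', '/') + '/current_data/'
input_file = path + 'CR3000_SN4709_flux.dat'
input_dict = {'has_header': True,
              'header_file': input_file,
              'datafile': 'soil',
              'path': path,
              'filename': 'CR3000_SN4709_flux.dat'}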
In [16]:
df = createDF(input_file, input_dict, attrs)[0]
attrs, local_attrs = get_attrs(input_dict['header_file'], attrs)
ds = createDS(df, input_dict, attrs, local_attrs, site, coords_vals)
ds.to_netcdf(path='test2.nc', format='NETCDF3_64BIT')
xray.open_dataset('test2.nc')
Out[16]: