In [ ]:
#include <string>
#include <iostream>
#include "xtensor/xrandom.hpp"
#include "xtensor/xmath.hpp"
#include "xframe/xio.hpp"
#include "xframe/xvariable.hpp"
#include "xframe/xvariable_view.hpp"
#include "xframe/xvariable_masked_view.hpp"
#include "xframe/xreindex_view.hpp"
Let's first define some type aliases to reduce the amount of typing.
In [ ]:
using coordinate_type = xf::xcoordinate<xf::fstring>;
using variable_type = xf::xvariable<double, coordinate_type>;
using data_type = variable_type::data_type;
In the following we define a 2D variable called `dry_temperature`. A variable in `xframe` is the composition of tensor data and a coordinate system; it is the equivalent of `DataArray` from xarray. The tensor data can be any valid `xtensor` expression whose `value_type` is `xoptional`. Common types are `xarray_optional`, `xtensor_optional` and `xoptional_assembly`, the last of which creates an optional expression from existing regular tensor expressions.
In [ ]:
data_type dry_temperature_data = xt::eval(xt::random::rand({6, 3}, 15., 25.));
dry_temperature_data(0, 0).has_value() = false;
dry_temperature_data(2, 1).has_value() = false;
In [ ]:
dry_temperature_data
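To make the missing-value semantics concrete, here is a minimal sketch in standard C++ that does not use xtensor. `OptionalGrid`, `add` and `count_missing` are hypothetical names introduced for illustration only; `std::optional` stands in for `xoptional`, and the sketch assumes that a missing operand makes the result of an arithmetic operation missing.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for an optional tensor: a row-major 2D grid whose
// elements may be missing. This is NOT how xtensor stores optional data.
struct OptionalGrid
{
    std::size_t rows;
    std::size_t cols;
    std::vector<std::optional<double>> values;

    OptionalGrid(std::size_t r, std::size_t c, double fill)
        : rows(r), cols(c), values(r * c, std::optional<double>(fill))
    {
    }

    // Element access by integer position.
    std::optional<double>& at(std::size_t i, std::size_t j)
    {
        assert(i < rows && j < cols);
        return values[i * cols + j];
    }
};

// Adds two optional values; the result is missing if either operand is
// missing (an assumption of this sketch).
inline std::optional<double> add(const std::optional<double>& a,
                                 const std::optional<double>& b)
{
    if (a.has_value() && b.has_value())
    {
        return *a + *b;
    }
    return std::nullopt;
}

// Counts the elements flagged as missing.
inline std::size_t count_missing(const OptionalGrid& grid)
{
    std::size_t n = 0;
    for (const auto& v : grid.values)
    {
        if (!v.has_value())
        {
            ++n;
        }
    }
    return n;
}
```

Setting `grid.at(0, 0) = std::nullopt;` plays the role of `has_value() = false` in the cell above.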
Once the data is defined, we can define the coordinate system. A coordinate system is a mapping from dimension names to label axes. It is possible to create an axis from a vector of labels, then the coordinate system from a map of dimension names to axes, and finally the variable from this coordinate system and the previously created data. However, `xframe` supports the initializer-list syntax, so everything can be created in place in a very expressive way:
In [ ]:
auto time_axis = xf::axis({"2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04", "2018-01-05", "2018-01-06"});
In [ ]:
auto dry_temperature = variable_type(
dry_temperature_data,
{
{"date", time_axis},
{"city", xf::axis({"London", "Paris", "Brussels"})}
}
);
In [ ]:
dry_temperature
Like xarray, `xframe` supports four different kinds of indexing, described below:
Dimension lookup: Positional - Index lookup: By integer
In [ ]:
dry_temperature(3, 0)
Dimension lookup: Positional - Index lookup: By label
In [ ]:
dry_temperature.locate("2018-01-04", "London")
Dimension lookup: By name - Index lookup: By integer
In [ ]:
dry_temperature.iselect({{"date", 3}, {"city", 0}})
Dimension lookup: By name - Index lookup: By label
In [ ]:
dry_temperature.select({{"date", "2018-01-04"}, {"city", "London"}})
Unlike in xarray, these methods return a single value; they cannot create views of the variable by selecting many data points. `xframe` does support this through the free-function counterparts of the methods described above, which are covered in a later section.
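The four lookup modes differ only in how a dimension and an index are identified. The following sketch illustrates that idea in standard C++ on a small 2D table; `LabelAxis` and `LabelledTable` are hypothetical types written for this example, and although the method names mirror those used above, this is not the xframe implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical label axis: an ordered list of labels, one per position.
struct LabelAxis
{
    std::vector<std::string> labels;

    // Returns the position of a label, or throws if it is absent.
    std::size_t position(const std::string& label) const
    {
        for (std::size_t i = 0; i < labels.size(); ++i)
        {
            if (labels[i] == label)
            {
                return i;
            }
        }
        throw std::out_of_range("label not found: " + label);
    }
};

// Hypothetical 2D labelled table stored in row-major order.
struct LabelledTable
{
    std::vector<std::string> dims;  // dimension names, in storage order
    std::vector<LabelAxis> axes;    // exactly two axes, one per dimension
    std::vector<double> data;       // rows * columns values

    // Positional dimensions, integer indices.
    double at(std::size_t i, std::size_t j) const
    {
        assert(dims.size() == 2 && axes.size() == 2);
        assert(j < axes[1].labels.size());
        return data.at(i * axes[1].labels.size() + j);
    }

    // Positional dimensions, labels.
    double locate(const std::string& a, const std::string& b) const
    {
        return at(axes[0].position(a), axes[1].position(b));
    }

    // Named dimensions, integer indices.
    double iselect(const std::map<std::string, std::size_t>& index) const
    {
        return at(index.at(dims[0]), index.at(dims[1]));
    }

    // Named dimensions, labels.
    double select(const std::map<std::string, std::string>& index) const
    {
        return locate(index.at(dims[0]), index.at(dims[1]));
    }
};
```

Because the by-name variants take a map keyed by dimension name, the order in which the dimensions are listed at the call site does not matter.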
Variables support all the common mathematical operations and functions; as in xtensor, these operations are lazy and return expressions. `xframe` supports operations on variables with different dimensions and labels thanks to broadcasting, which is performed according to dimension names rather than dimension positions, as shown below.
Let's first define a variable containing the relative humidity for cities:
In [ ]:
data_type relative_humidity_data = xt::eval(xt::random::rand({3}, 50.0, 70.0));
auto relative_humidity = variable_type(
relative_humidity_data,
{
{"city", xf::axis({"Paris", "London", "Brussels"})}
}
);
relative_humidity
We will use it and the previously defined `dry_temperature` variable (shown again below) to compute the `water_vapour_pressure`:
In [ ]:
dry_temperature
In [ ]:
auto water_vapour_pressure = 0.01 * relative_humidity * 6.1 * xt::exp((17.27 * dry_temperature) / (237.7 + dry_temperature));
In [ ]:
water_vapour_pressure
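Broadcasting by dimension name means that operands are aligned on labels, not on storage positions. The sketch below shows this for a 1D per-city variable multiplied with a 2D (date, city) variable whose cities are listed in a different order. `CityValues`, `DateCityValues` and `multiply` are hypothetical names, and unlike xframe expressions the computation here is eager.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical 1D variable indexed by a "city" dimension.
struct CityValues
{
    std::vector<std::string> cities;
    std::vector<double> values;

    // Looks a value up by its city label.
    double value_for(const std::string& city) const
    {
        for (std::size_t i = 0; i < cities.size(); ++i)
        {
            if (cities[i] == city)
            {
                return values[i];
            }
        }
        throw std::out_of_range("city not found: " + city);
    }
};

// Hypothetical 2D variable with dimensions ("date", "city"), row-major.
struct DateCityValues
{
    std::size_t n_dates;
    std::vector<std::string> cities;
    std::vector<double> values;
};

// Multiplies every (date, city) entry by the per-city value carrying the
// same label, so the 1D operand is repeated along the "date" dimension.
inline DateCityValues multiply(const CityValues& lhs, const DateCityValues& rhs)
{
    const std::size_t n_cities = rhs.cities.size();
    assert(rhs.values.size() == rhs.n_dates * n_cities);
    DateCityValues result = rhs;
    for (std::size_t d = 0; d < rhs.n_dates; ++d)
    {
        for (std::size_t c = 0; c < n_cities; ++c)
        {
            result.values[d * n_cities + c] *= lhs.value_for(rhs.cities[c]);
        }
    }
    return result;
}
```

The sketch assumes both operands carry the same set of city labels; a label missing from the 1D operand raises an exception.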
The relative humidity has been broadcast, so its values are repeated for each date. When the labels of the variables involved in an operation are not the same, the result contains the intersection of the label sets:
In [ ]:
data_type coeff_data = xt::eval(xt::random::rand({6, 3}, 0.7, 0.9));
// Flag two coefficients as missing.
coeff_data(0, 0).has_value() = false;
coeff_data(2, 1).has_value() = false;
auto coeff = variable_type(
coeff_data,
{
{"date", time_axis},
{"city", xf::axis({"London", "New York", "Brussels"})}
}
);
coeff
In [ ]:
auto res = coeff * dry_temperature;
res
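The label-intersection rule can be sketched for two 1D labelled vectors in standard C++. `LabelledVector` and `multiply_on_common_labels` are hypothetical names, and keeping the label order of the first operand is an assumption of this sketch, not a statement about xframe.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical 1D labelled vector: labels[i] names values[i].
struct LabelledVector
{
    std::vector<std::string> labels;
    std::vector<double> values;
};

// Multiplies the entries of two labelled vectors that share a label.
// Labels present in only one operand are dropped from the result.
inline LabelledVector multiply_on_common_labels(const LabelledVector& a,
                                                const LabelledVector& b)
{
    assert(a.labels.size() == a.values.size());
    assert(b.labels.size() == b.values.size());
    LabelledVector result;
    for (std::size_t i = 0; i < a.labels.size(); ++i)
    {
        for (std::size_t j = 0; j < b.labels.size(); ++j)
        {
            if (a.labels[i] == b.labels[j])
            {
                result.labels.push_back(a.labels[i]);
                result.values.push_back(a.values[i] * b.values[j]);
                break;
            }
        }
    }
    return result;
}
```

With the cities used above, an operand labelled London, Paris, Brussels and another labelled London, New York, Brussels yield a result labelled London, Brussels.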
In [ ]:
data_type pressure_data = {{{ 1., 2., 3. },
{ 4., 5., 6. },
{ 7., 8., 9. }},
{{ 1.3, 1.5, 1.},
{ 2., 2.3, 2.4},
{ 3.1, 3.8, 3.}},
{{ 8.5, 8.2, 8.6},
{ 7.5, 8.6, 9.7},
{ 4.5, 4.4, 4.3}}};
In [ ]:
auto pressure = variable_type(
pressure_data,
{
{"x", xf::axis(3)},
{"y", xf::axis(3, 6, 1)},
{"z", xf::axis(3)},
}
);
In [ ]:
pressure
In [ ]:
dry_temperature
Dimension lookup: Positional - Index lookup: By integer
In [ ]:
auto v1 = ilocate(dry_temperature, xf::irange(0, 5, 2), xf::irange(1, 3));
v1
Dimension lookup: Positional - Index lookup: By label
In [ ]:
auto v2 = locate(dry_temperature, xf::range("2018-01-01", "2018-01-06", 2), xf::range("Paris", "Brussels"));
v2
Dimension lookup: By name - Index lookup: By integer
In [ ]:
auto v3 = iselect(dry_temperature, {{"city", xf::irange(1, 3)}, {"date", xf::irange(0, 5, 2)}});
v3
Dimension lookup: By name - Index lookup: By label
In [ ]:
auto v4 = select(dry_temperature,
{{"city", xf::range("Paris", "Brussels")},
{"date", xf::range("2018-01-01", "2018-01-06", 2)}});
v4
Dimension lookup: Positional - Index lookup: By integer
In [ ]:
auto v5 = ilocate(dry_temperature, xf::ikeep(0, 2, 4), xf::idrop(0));
v5
Dimension lookup: Positional - Index lookup: By label
In [ ]:
auto v6 = locate(dry_temperature, xf::keep("2018-01-01", "2018-01-03", "2018-01-05"), xf::drop("London"));
v6
Dimension lookup: By name - Index lookup: By integer
In [ ]:
auto v7 = iselect(dry_temperature, {{"city", xf::idrop(0)}, {"date", xf::ikeep(0, 2, 4)}});
v7
Dimension lookup: By name - Index lookup: By label
In [ ]:
auto v8 = select(dry_temperature,
{{"city", xf::drop("London")},
{"date", xf::keep("2018-01-01", "2018-01-03", "2018-01-05")}});
v8
In [ ]:
pressure
In [ ]:
auto masked_pressure = xf::where(
pressure,
not_equal(pressure.axis<int>("x"), 2) && pressure.axis<int>("z") < 2
);
In [ ]:
masked_pressure
When assigning to a masking view, masked values are not changed. Like other views, a masking view is a proxy on its underlying expression: no copy is made, so changing an unmasked value actually changes the corresponding value in the underlying expression.
In [ ]:
masked_pressure = 1.;
masked_pressure
In [ ]:
pressure
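The proxy behaviour of a masking view can be sketched in standard C++ as a reference to the underlying data paired with a boolean mask. `MaskedView` and its `assign` method are hypothetical names for illustration; in this sketch `keep[i] == true` marks an unmasked (visible) element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical masking view: refers to existing data instead of copying it.
struct MaskedView
{
    std::vector<double>& data;  // underlying values, shared with the owner
    std::vector<bool> keep;     // true where the element is NOT masked

    // Writes value into every unmasked element; masked elements are skipped.
    void assign(double value)
    {
        assert(data.size() == keep.size());
        for (std::size_t i = 0; i < data.size(); ++i)
        {
            if (keep[i])
            {
                data[i] = value;
            }
        }
    }
};
```

Because the view stores a reference, the writes made by `assign` are visible through the original vector, and the masked entries keep their previous values.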
Reindexing views give a variable a new set of coordinates for the corresponding dimensions. Like other views, no copy is involved. Asking for values at new labels that are not found in the original set of coordinates returns missing values. In the next example, we reindex the `city` dimension.
In [ ]:
dry_temperature
In [ ]:
auto temp = reindex(dry_temperature, {{"city", xf::axis({"London", "New York", "Brussels"})}});
temp
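A reindexing view can be sketched in standard C++ as a new list of labels plus a reference to the original data, with lookups resolved on access. `LabelledSeries` and `ReindexView` are hypothetical names, and `std::nullopt` stands in for a missing value.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical 1D labelled series: labels[i] names values[i].
struct LabelledSeries
{
    std::vector<std::string> labels;
    std::vector<double> values;
};

// Hypothetical reindexing view: exposes the source under new labels
// without copying the source values.
struct ReindexView
{
    const LabelledSeries& source;
    std::vector<std::string> new_labels;

    // Returns the source value carrying the i-th new label, or a missing
    // value when the source has no such label.
    std::optional<double> at(std::size_t i) const
    {
        assert(i < new_labels.size());
        for (std::size_t j = 0; j < source.labels.size(); ++j)
        {
            if (source.labels[j] == new_labels[i])
            {
                return source.values[j];
            }
        }
        return std::nullopt;
    }
};
```

Reindexing London, Paris, Brussels onto London, New York, Brussels therefore yields a missing value in the New York position and drops Paris from the view.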
`reindex_like` is a shortcut that reindexes a variable using the set of coordinates of another variable:
In [ ]:
auto dry_temp2 = variable_type(
dry_temperature_data,
{
{"date", time_axis},
{"city", xf::axis({"London", "New York", "Brussels"})}
}
);
auto temp2 = reindex_like(dry_temperature, dry_temp2);
temp2