Computed Fields in Blaze/DyND

This demo shows a mechanism for creating computed fields in DyND, using 'fullname' and 'age' fields as examples.

To start, we import the DyND library.


In [1]:
from dynd import nd, ndt

Creating a Table

Let's create a table with name and birthday fields.


In [3]:
a = nd.array([('Smith', 'John', '1979-01-22'),
                ('Katz', 'Barbara', '1990-12-03'),
                ('Barker', 'Henry', '1979-06-12')],
            dtype='{lastname: string(32); firstname: string(32); birthday: date}')

In [4]:
a


Out[4]:
nd.array([["Smith", "John", 1979-01-22], ["Katz", "Barbara", 1990-12-03], ["Barker", "Henry", 1979-06-12]], strided_dim<cstruct<string<32> lastname, string<32> firstname, date birthday>>)

Because we used fixed-buffer strings, we can reassign their values and convert the array to a NumPy array.


In [5]:
a[2].firstname = 'George'

In [6]:
nd.as_numpy(a, allow_copy=True)


Out[6]:
array([(u'Smith', u'John', datetime.date(1979, 1, 22)),
       (u'Katz', u'Barbara', datetime.date(1990, 12, 3)),
       (u'Barker', u'George', datetime.date(1979, 6, 12))], 
      dtype={'names':['lastname','firstname','birthday'], 'formats':['<U32','<U32','<M8[D]'], 'offsets':[0,128,256], 'itemsize':264, 'aligned':True})

Adding Computed Fields

Now let's add our two computed fields.

The first one, 'fullname', is simple: we concatenate the first and last name strings with a space between them. The second one, 'age', is a little trickier. There doesn't seem to be a simple one-liner for this, and the code here will crash if it is run on February 29th, because replacing today's year with a non-leap birth year produces an invalid date. Hopefully it still demonstrates the idea nicely.


In [7]:
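b = nd.add_computed_fields(a,
    # Each entry is (field name, field type, expression over the existing fields).
    [('fullname', ndt.string,
        'firstname + " " + lastname'),
     ('age', ndt.int32,
        # Subtract one year, then add it back if this year's birthday has been reached.
        'date.today().year - birthday.year - 1 + (date.today().replace(year=birthday.year) >= birthday)')])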
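The February 29th problem comes from `date.today().replace(year=birthday.year)`, which raises `ValueError` when today is a leap day and the birth year is not a leap year. As a minimal plain-Python sketch (not a DyND expression), comparing (month, day) tuples avoids constructing a date at all. The helper name `age_on` and the fixed "today" of 2013-07-01 are assumptions chosen for illustration; whether the same expression works inside the string passed to `nd.add_computed_fields` is not verified here.

```python
from datetime import date

def age_on(birthday, today):
    """Return the number of whole years from birthday to today.

    Hypothetical helper for illustration; it is not part of DyND.
    """
    # True (counts as 1) when this year's birthday has not been reached yet.
    before_birthday = (today.month, today.day) < (birthday.month, birthday.day)
    return today.year - birthday.year - before_birthday

# Assumed evaluation date, picked so the results can be checked by hand.
today = date(2013, 7, 1)
print(age_on(date(1979, 1, 22), today))   # 34
print(age_on(date(1990, 12, 3), today))   # 22

# Running on a leap day no longer raises an exception.
print(age_on(date(1979, 6, 12), date(2016, 2, 29)))   # 36
```

Someone born on February 29th is treated here as having their birthday on March 1st in non-leap years; other conventions would need a different comparison.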

Now 'b' is a deferred-evaluation object with a fairly complicated DyND type. You don't need to understand the type printout, but here it is for reference.


In [9]:
nd.type_of(b)


Out[9]:
ndt.type('strided_dim<expr<cstruct<string<32> lastname, string<32> firstname, date birthday, string fullname, int32 age>, op0=cstruct<string<32> lastname, string<32> firstname, date birthday>, expr=computed_field_expr(op0)>>')

We can evaluate back to an object with no deferred expression using the 'eval' method.


In [10]:
b.eval()


Out[10]:
nd.array([["Smith", "John", 1979-01-22, "John Smith", 34], ["Katz", "Barbara", 1990-12-03, "Barbara Katz", 22], ["Barker", "George", 1979-06-12, "George Barker", 34]], strided_dim<cstruct<string<32> lastname, string<32> firstname, date birthday, string fullname, int32 age>>)

Testing the Deferred Evaluation

Finally, let's modify values in 'a' and see how that affects 'b'.


In [11]:
print(nd.as_py(b[1]))
print(nd.as_py(b.age))


{u'lastname': u'Katz', u'fullname': u'Barbara Katz', u'birthday': datetime.date(1990, 12, 3), u'age': 22, u'firstname': u'Barbara'}
[34, 22, 34]

In [12]:
a[1] = ['Ford', 'Carol', '1967-05-12']

In [13]:
print(nd.as_py(b[1]))
print(nd.as_py(b.age))


{u'lastname': u'Ford', u'fullname': u'Carol Ford', u'birthday': datetime.date(1967, 5, 12), u'age': 46, u'firstname': u'Carol'}
[34, 46, 34]
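The behaviour above, where 'b' reflects a later change to 'a', can be mimicked in plain Python with a view object that holds a reference to the source rows and recomputes the derived fields on every read. This is only an analogy under that assumption, not how DyND implements its expr type; the class name `ComputedView` and its structure are hypothetical.

```python
class ComputedView(object):
    """Read-only view that derives extra fields from source rows on access.

    A plain-Python analogy for illustration; not DyND's implementation.
    """
    def __init__(self, rows, computed):
        self._rows = rows          # shared reference to the source, not a copy
        self._computed = computed  # {field name: function(row) -> value}

    def __getitem__(self, index):
        # Copy the source row, then evaluate each computed field against it.
        row = dict(self._rows[index])
        for name, func in self._computed.items():
            row[name] = func(row)
        return row

rows = [{'lastname': 'Katz', 'firstname': 'Barbara'}]
view = ComputedView(
    rows, {'fullname': lambda r: r['firstname'] + ' ' + r['lastname']})

print(view[0]['fullname'])   # Barbara Katz

# Changing the source is visible through the view, because nothing was cached.
rows[0] = {'lastname': 'Ford', 'firstname': 'Carol'}
print(view[0]['fullname'])   # Carol Ford
```

Because nothing is stored, every read pays the cost of recomputation; calling 'eval' in the demo corresponds to materializing the results once.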