Kiva-Tiny Example


Starting a Blaze Server

Before running this notebook, start the example Blaze server in the [blaze_web]/example directory.

    ~/blaze_web/example $ python start_server.py
    Starting Blaze Server

Viewing an array from the browser

If the server started properly, it is now serving data from localhost:8080. The data it serves corresponds to the arrays in the example/arrays subdirectory, which you can browse to see the included example data. To see the tiny subset of a Kiva data snapshot, navigate to

http://localhost:8080/kiva_tiny/loans

What you should see is an annotated version of the Blaze datashape for the raw dataset. At the beginning, you should see

type BlazeDataShape = 23, {

which tells you that the data is an array of 23 elements, and each element is a record. In this case, the data is from samples/server/arrays/kiva_tiny/loans.json.

At the top of the record datashape, there is an id field. Clicking on it accesses that field and leads to the following URL:

http://localhost:8080/kiva_tiny/loans.id

where we can see that accessing fields is like attribute access in Python. You can see a JSON version of the data for the array by clicking on the JSON link near the top. This takes you to

http://localhost:8080/kiva_tiny/loans.id?r=data.json

Similarly, you can get the Blaze datashape in its raw form by clicking on the 'BlazeDataShape' link.

http://localhost:8080/kiva_tiny/loans.id?r=datashape
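
The URLs above follow a simple pattern: the array path, optionally followed by an `r` query parameter that selects the view. As a minimal sketch, they can be assembled in Python as shown below. The helper name `view_url` is our own invention for illustration and is not part of Blaze, and the base URL assumes the example server from the first section.

```python
from urllib.parse import urlencode

# Assumed address of the example server started earlier.
BASE_URL = "http://localhost:8080"


def view_url(array_path, view=None, base_url=BASE_URL):
    """Build the URL for an array path, optionally selecting a view.

    Hypothetical helper (not a Blaze API). `view` is the value of the
    'r' query parameter seen above, e.g. 'data.json' or 'datashape'.
    """
    url = "%s/%s" % (base_url.rstrip("/"), array_path.lstrip("/"))
    if view is not None:
        url += "?" + urlencode({"r": view})
    return url


print(view_url("kiva_tiny/loans.id"))
print(view_url("kiva_tiny/loans.id", "data.json"))
print(view_url("kiva_tiny/loans.id", "datashape"))
```

With the server running, any of these URLs could then be fetched with a standard HTTP client; only the URL construction is sketched here.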

Click back to the kiva_tiny/loans array, which is

http://localhost:8080/kiva_tiny/loans

This can be treated as an array of loans. We can use Python-style indexing to access a single element:

http://localhost:8080/kiva_tiny/loans[-1]/

This accesses the last item, and the datashape you get starts like

type BlazeDataShape = {

indicating that it's a single record.

Python slicing syntax is supported here, for example

http://localhost:8080/kiva_tiny/loans[1:15]/

From here, you can immediately view the data at a deeper level of the record hierarchy. If you scroll near the bottom to the 'payments' field, and click on the 'JSON' link beside the field 'amount', it will take you to the link

http://localhost:8080/kiva_tiny/loans[1:15].payments.amount?r=data.json
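
To make the correspondence between Python syntax and these URLs concrete, here is a small sketch that turns indexing and attribute access into the URL suffixes shown above. The class `RemotePath` is hypothetical and exists only for illustration; it builds strings and does not contact the server. Slices with a step are rejected because this walkthrough only shows `start:stop` slices.

```python
class RemotePath:
    """Hypothetical helper (not a Blaze API) that mirrors Python indexing
    and attribute access as URL suffixes."""

    def __init__(self, url):
        self.url = url

    def __getitem__(self, index):
        if isinstance(index, slice):
            if index.step is not None:
                raise ValueError("only start:stop slices are sketched here")
            start = "" if index.start is None else str(index.start)
            stop = "" if index.stop is None else str(index.stop)
            text = "%s:%s" % (start, stop)
        else:
            text = str(index)
        return RemotePath("%s[%s]" % (self.url, text))

    def __getattr__(self, name):
        # Only called for names that are not real attributes, so record
        # field names become '.name' suffixes.
        if name.startswith("_"):
            raise AttributeError(name)
        return RemotePath("%s.%s" % (self.url, name))


loans = RemotePath("http://localhost:8080/kiva_tiny/loans")
print(loans[-1].url)
print(loans[1:15].payments.amount.url + "?r=data.json")
```

The second printed URL matches the JSON link for the 'amount' field above.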

Viewing an array from Python

In addition to directly accessing the data from a web browser, there is a remote array Python object that can be used to get at the same data. This support is a work in progress, and will move into its proper home, blaze-core, relatively soon.


In [1]:
import blaze
from blaze.datadescriptor import RemoteDataDescriptor
from IPython.core.display import HTML

In [2]:
r = blaze.array(RemoteDataDescriptor('http://localhost:8080/kiva_tiny/loans'))

Let's look at just the payments for a smaller slice of the array we just viewed in the web browser.


In [3]:
rpart = r[1:3].payments

The repr of a remote array gives a little bit of information about where the data is from, and what its datashape is.


In [4]:
rpart


Out[4]:
array(RemoteDataDescriptor('http://localhost:8080/kiva_tiny/loans[1:3].payments'),
      dshape='2, var, { amount : float64; local_amount : float64; processed_date : json; settlement_date : json; rounded_local_amount : float64; currency_exchange_loss_amount : float64; payment_id : int64; comment : json }')

The data can be retrieved locally using the 'blaze.eval' function. This downloads the data into a local in-memory array.


In [5]:
blaze.eval(rpart)


Out[5]:
array(
[[],
 [ {'amount': 1.0, 'processed_date': '"2010-09-30T07:00:00Z"', 'payment_id': 172244766, 'settlement_date': '"2010-10-16T08:33:19Z"', 'rounded_local_amount': 1.0, 'local_amount': 1.0, 'currency_exchange_loss_amount': 0.0, 'comment': 'null'}]],
      dshape='2, var, { amount : float64; local_amount : float64; processed_date : json; settlement_date : json; rounded_local_amount : float64; currency_exchange_loss_amount : float64; payment_id : int64; comment : json }')
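
Notice that the fields typed `json` in the datashape (processed_date, settlement_date and comment) come back as raw JSON text, so the dates carry an extra pair of quotes and the missing comment is the string 'null'. As a sketch, assuming a record behaves like the Python dict shown in the output, the standard `json` module can decode them. The helper `decode_json_fields` is our own name, not a Blaze function.

```python
import json

# One record copied from the Out[5] listing above.
payment = {
    'amount': 1.0,
    'processed_date': '"2010-09-30T07:00:00Z"',
    'payment_id': 172244766,
    'settlement_date': '"2010-10-16T08:33:19Z"',
    'rounded_local_amount': 1.0,
    'local_amount': 1.0,
    'currency_exchange_loss_amount': 0.0,
    'comment': 'null',
}

# Fields whose type is 'json' in the dshape string above.
JSON_FIELDS = ('processed_date', 'settlement_date', 'comment')


def decode_json_fields(record, json_fields=JSON_FIELDS):
    """Return a copy of `record` with its raw-JSON fields decoded.

    Hypothetical helper for illustration only.
    """
    decoded = dict(record)
    for name in json_fields:
        decoded[name] = json.loads(record[name])
    return decoded


decoded = decode_json_fields(payment)
print(decoded['processed_date'])  # 2010-09-30T07:00:00Z
print(decoded['comment'])         # None
```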

Compute Sessions and Computed Fields

NOTE: The following does not work at present; it is being redesigned.

Now let's take a brief look at a way to do remote computations on a Blaze server, by creating computed fields. Let's say we want to have a field which sums all the payment amounts, so at a glance we can see the total payments. Each loan has a variable-sized array of payments associated with it. For the first loan, it looks like this:


In [6]:
p = r[0].payments
p


Out[6]:
array(RemoteDataDescriptor('http://localhost:8080/kiva_tiny/loans[0].payments'),
      dshape='9, { amount : float64; local_amount : float64; processed_date : json; settlement_date : json; rounded_local_amount : float64; currency_exchange_loss_amount : float64; payment_id : int64; comment : json }')

In [7]:
p.amount


Out[7]:
array(RemoteDataDescriptor('http://localhost:8080/kiva_tiny/loans[0].payments.amount'),
      dshape='9, float64')

We'd like a field which gives us the following sum:


In [ ]:
import numpy as np
np.sum(p.amount)
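
Since the remote version is not working at the moment, here is a local sketch of what such a field would contain: one total per loan, computed over that loan's variable-length list of payment amounts. The input uses the amounts visible in the Out[5] listing for loans[1:3], and the function name `payment_totals` is ours. Plain Python `sum` stands in for `np.sum` to keep the sketch dependency-free.

```python
# Payment amounts per loan for loans[1:3], taken from the Out[5] listing:
# the first loan has no payments, the second has one payment of 1.0.
amounts_per_loan = [[], [1.0]]


def payment_totals(amounts_per_loan):
    """Sum each loan's payment amounts; a loan with no payments totals 0.0.

    Hypothetical local stand-in for the remote computed field.
    """
    return [float(sum(amounts)) for amounts in amounts_per_loan]


print(payment_totals(amounts_per_loan))  # [0.0, 1.0]
```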

To create this computed field, first we need to start a compute session on the server, as follows.


In [ ]:
from blaze.client.session import session
s = session(r.url)

Now we can use the add_computed_fields function provided on 's' to get an array with the sum.


In [ ]:
# ndt provides the DyND type objects used for the field type below.
from dynd import ndt
r2 = s.add_computed_fields(r, [('payment_total_amount', ndt.float64,
                               'sum(as_numpy(payments.amount))')])
HTML('<a href="%s" target="_blank">%s</a>' % (r2.url, r2.url))

Click on the link, scroll down to the bottom, and you should see a new field called 'payment_total_amount'. If you click on the 'JSON' link beside it, it should give you the sums. Alternatively, we can get it with Python.


In [ ]:
r2.payment_total_amount.get_dynd()

It may be interesting to compare this total amount paid with the original loan amount and the separate paid_amount field. These values are here:


In [ ]:
r2.terms.loan_amount.get_dynd()

In [ ]:
r2.paid_amount.get_dynd()

This last field is still preserved in JSON, instead of being treated as a native DyND type, because DyND doesn't yet support option types for the missing 'null' values.

To see the percentage paid, let's add another computed field.


In [ ]:
r3 = s.add_computed_fields(r2, [('percentage_paid', ndt.float64,
                                 '100.0 * payment_total_amount / as_py(terms.loan_amount)')])
HTML('<a href="%s" target="_blank">%s</a>' % (r3.url, r3.url))

In [ ]:
r3.percentage_paid.get_dynd()
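
The expression passed for 'percentage_paid' is ordinary arithmetic, so it can be checked locally. The sketch below applies the same formula to placeholder numbers. Both the function name and the input values are hypothetical and are not taken from the Kiva data.

```python
def percentage_paid(payment_total_amount, loan_amount):
    """Same formula as the computed field: 100.0 * total / loan amount."""
    return 100.0 * payment_total_amount / loan_amount


# Hypothetical values for illustration only (not from the dataset).
print(percentage_paid(250.0, 1000.0))  # 25.0
```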
