Treasure Data has a Python client, which means pandas/Python users can connect directly from their IPython notebooks.
All you need is a Treasure Data account.
In [2]:
import tdclient
import pandas as pd
import numpy as np
%matplotlib inline
You need your Treasure Data API key. There are two ways to fetch it after you sign up for Treasure Data.
If you are a td command user, running the following command shows your API key:
td apikey:show
In [3]:
apikey = 'Your API key here' # Setting your API key
In [4]:
client = tdclient.Client(apikey) # instantiating the client
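Pasting the key into a notebook cell makes it easy to leak when the notebook is shared. A minimal sketch of one alternative is to read the key from an environment variable. The helper name load_apikey and the variable name TD_API_KEY are assumptions made for this illustration, not something this notebook defines.

```python
import os


def load_apikey(env_var="TD_API_KEY"):
    """Return the API key stored in the given environment variable.

    The helper and the default variable name are illustrative assumptions.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a readable message instead of sending an empty key.
        raise RuntimeError("environment variable %s is not set" % env_var)
    return key
```

With the variable exported in the shell that starts the notebook, the client cell could then read client = tdclient.Client(load_apikey()).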
As you can see below, running queries is easy. Just use the query method, which here takes three arguments: the database name, the query string, and the query type.
We pass type='presto' here to use Presto and not Hive.
In [5]:
job = client.query('sample_datasets',
                   "SELECT TD_TIME_FORMAT(time, 'yyyy') AS t, SUM(volume) "
                   "FROM nasdaq "
                   "WHERE symbol='AMZN' "
                   "GROUP BY TD_TIME_FORMAT(time, 'yyyy') "
                   "ORDER BY t", type='presto')
In [6]:
[job.status(), job.finished()]
Out[6]:
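The status check above is a single snapshot, so the job may still be running when the next cell asks for results. A minimal sketch of a polling loop follows, assuming only that the job object has the finished() method used above. The helper wait_until_finished and the FakeJob stand-in are hypothetical names introduced for this example.

```python
import time


def wait_until_finished(job, poll_interval=2.0, timeout=300.0):
    """Poll job.finished() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while not job.finished():
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish within %.0f seconds" % timeout)
        time.sleep(poll_interval)
    return job


class FakeJob:
    """Stand-in for a real job; reports finished after a set number of polls."""

    def __init__(self, polls_needed):
        self.polls_left = polls_needed

    def finished(self):
        self.polls_left -= 1
        return self.polls_left <= 0
```

Calling wait_until_finished(job) before reading job.result() would make the notebook safe to run top to bottom. The tdclient job object may also offer its own wait() method; check the client documentation before relying on it.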
In [7]:
results = list(job.result())  # materialize the result rows
In [8]:
results_df = pd.DataFrame.from_records(results, columns=('year', 'AMZN trade volume'))
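To see what from_records does with rows of this shape without running a query, here is a small sketch on made-up placeholder rows. The numbers are invented for illustration and are not real trade volumes. It also converts the year column to integers, on the assumption that TD_TIME_FORMAT returns the year as a string.

```python
import pandas as pd

# Made-up placeholder rows in the (year, volume) shape the query returns.
sample_rows = [["2010", 100], ["2011", 250], ["2012", 175]]

sample_df = pd.DataFrame.from_records(
    sample_rows, columns=["year", "AMZN trade volume"]
)

# Assumption: the year arrives as a string, so cast it to a numeric axis value.
sample_df["year"] = sample_df["year"].astype(int)
```

The same astype(int) call could be applied to results_df before plotting if the x axis should be numeric.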
In [9]:
results_df.plot(x='year')
Out[9]: