This notebook demonstrates how to turn an annotated notebook into an HTTP API using the Jupyter Kernel Gateway. The result is a simple scotch recommendation engine.
The original scotch data is from https://www.mathstat.strath.ac.uk/outreach/nessie/nessie_whisky.html.
In [1]:
import pandas as pd
import pickle
import requests
import json
In [2]:
features_uri = 'https://ibm.box.com/shared/static/2vntdqbozf9lzmukkeoq1lfi2pcb00j1.dataframe'
sim_uri = 'https://ibm.box.com/shared/static/54kzs5zquv0vjycemjckjbh0n00e7m5t.dataframe'
In [3]:
resp = requests.get(features_uri)
resp.raise_for_status()
features_df = pickle.loads(resp.content)
In [4]:
resp = requests.get(sim_uri)
resp.raise_for_status()
sim_df = pickle.loads(resp.content)
Drop the cluster column; it isn't needed here.
In [5]:
features_df = features_df.drop('cluster', axis=1)
In [6]:
REQUEST = json.dumps({
    'path' : {},
    'args' : {}
})
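The placeholder above is empty, so every endpoint below falls back to its defaults. Here is a rough sketch of what a populated request might look like and how to read it. It assumes the gateway passes query parameters as lists of strings, which is worth checking against the kernel gateway docs. `SAMPLE_REQUEST` and `first_arg` are made-up names for illustration only.

```python
import json

# Hypothetical request for GET /scotches/Talisker/similar?count=3&include_features=false
# Assumption: each query parameter arrives as a list of strings.
SAMPLE_REQUEST = json.dumps({
    'path': {'scotch': 'Talisker'},
    'args': {'count': ['3'], 'include_features': ['false']},
})

def first_arg(args, key, default):
    """Return the first value given for a query parameter, or the default."""
    values = args.get(key)
    return values[0] if values else default

request = json.loads(SAMPLE_REQUEST)
name = request['path'].get('scotch', 'Talisker')
# Query values are strings, so convert them before using them as numbers or flags.
count = int(first_arg(request['args'], 'count', 5))
inc_features = str(first_arg(request['args'], 'include_features', 'true')).lower() == 'true'
print(name, count, inc_features)
```

With the empty placeholder, the same code yields the defaults: `Talisker`, `5`, and `True`.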
Provide a way to get the names of all the scotches known by the model.
In [7]:
# GET /scotches
names = sim_df.columns.tolist()
print(json.dumps(dict(names=names)))
Let clients query for features about a specific scotch given its name.
In [ ]:
# GET /scotches/:scotch
request = json.loads(REQUEST)
name = request['path'].get('scotch', 'Talisker')
features = features_df.loc[name]
# can't use to_dict because it retains numpy types which blow up when we json.dumps
print('{"features":%s}' % features.to_json())
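A small standalone sketch of the `to_json` trick used above, on a toy frame. The distillery names and scores here are made up, not taken from the real data. Whether `to_dict` actually keeps numpy types depends on the pandas version, so the sketch only shows the `to_json` route.

```python
import json
import pandas as pd

# Made-up feature scores for illustration; not the real scotch data.
toy_features = pd.DataFrame(
    {'Body': [4, 2], 'Sweetness': [2, 3]},
    index=pd.Index(['Distillery A', 'Distillery B'], name='Distillery'),
)

row = toy_features.loc['Distillery A']
# Series.to_json serializes numpy integers as plain JSON numbers.
payload = '{"features":%s}' % row.to_json()
print(payload)

# Round-trip through json.loads to confirm the payload is valid JSON.
decoded = json.loads(payload)
```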
Let clients request a set of scotches similar to the one named. Clients can specify how many results they wish to receive (count) and whether they want all of the raw feature data included in the result (include_features).
In [ ]:
# GET /scotches/:scotch/similar
request = json.loads(REQUEST)
name = request['path'].get('scotch', 'Talisker')
count = request['args'].get('count', 5)
inc_features = request['args'].get('include_features', True)
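# Series.order and .ix were removed from pandas; use sort_values and iloc instead
similar = sim_df[name].sort_values(ascending=False)
similar.name = 'Similarity'
# position 0 is the scotch itself, so skip it and take the next `count` rows
df = pd.DataFrame(similar).iloc[1:count+1]
if inc_features:
    df = df.join(features_df)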
df = df.reset_index().rename(columns={'Distillery': 'Name'})
result = {
    'recommendations' : [row[1].to_dict() for row in df.iterrows()],
    'for': name
}
print(json.dumps(result))
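To see the ranking logic without downloading anything, here is a minimal sketch on a tiny similarity matrix. The names and similarity values are invented for illustration, and `similar_to` is a hypothetical helper, not part of the notebook. It also builds the records with `to_json(orient='records')` rather than `iterrows`, which is a swapped-in choice to keep numpy types out of `json.dumps`.

```python
import json
import pandas as pd

# Invented similarity scores; each scotch is most similar to itself (1.0).
toy_names = ['Alpha', 'Bravo', 'Charlie', 'Delta']
toy_sim = pd.DataFrame(
    [[1.0, 0.9, 0.2, 0.5],
     [0.9, 1.0, 0.3, 0.4],
     [0.2, 0.3, 1.0, 0.8],
     [0.5, 0.4, 0.8, 1.0]],
    index=pd.Index(toy_names, name='Distillery'),
    columns=toy_names,
)

def similar_to(name, count):
    """Return the `count` most similar scotches to `name`, excluding itself."""
    ranked = toy_sim[name].sort_values(ascending=False)
    ranked.name = 'Similarity'
    # Row 0 is the scotch itself, so slice positions 1..count.
    top = pd.DataFrame(ranked).iloc[1:count + 1]
    top = top.reset_index().rename(columns={'Distillery': 'Name'})
    # to_json emits plain JSON numbers, so json.dumps on the result is safe.
    return {'recommendations': json.loads(top.to_json(orient='records')), 'for': name}

toy_result = similar_to('Alpha', 2)
print(json.dumps(toy_result))
```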