In [1]:
import json
import pandas as pd
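# json_normalize is a top-level pandas function since pandas 1.0;
# on older versions, use: from pandas.io.json import json_normalize
from pandas import json_normalize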

The data is a JSON file containing ratings and reviews for a coffee shop.


In [9]:
with open("blue-bottle-coffee-san-francisco.json", "r") as f:
    data = json.load(f)

Let's extract the contents of reviewList (i.e. ratings and content) into a Pandas dataframe.


In [3]:
df = json_normalize(data, "reviewList")
df.head(3)


Out[3]:
content ratings
0 I don't think they're the friendliest here but... 4.0
1 I freaking love blue bottle coffee. It's so ri... 5.0
2 I love this location. It's in the cutest garag... 5.0
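
To make the role of the second argument clearer, here is a minimal sketch on a made-up, in-memory record. The sample dictionary and its "name" key are hypothetical (the real file's other keys are not shown in this notebook), and the sketch assumes pandas 1.0 or later, where json_normalize is available as pd.json_normalize. Passing "reviewList" as the record path turns each element of that list into one row.

```python
import pandas as pd

# Hypothetical miniature of the review file; only "reviewList" with
# "content" and "ratings" is known from the notebook itself.
sample = {
    "name": "example shop",
    "reviewList": [
        {"content": "Great espresso.", "ratings": "5.0"},
        {"content": "Long line.", "ratings": "3.0"},
    ],
}

# Each element of sample["reviewList"] becomes one row of the dataframe.
sample_df = pd.json_normalize(sample, record_path="reviewList")
print(sample_df)
```

Note that the ratings are still strings at this point, just as in the real data.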

The ratings were loaded as strings (dtype object), so let's convert them to numeric values before running any analyses on them.


In [4]:
df.dtypes


Out[4]:
content    object
ratings    object
dtype: object

In [5]:
df["ratings"] = pd.to_numeric(df["ratings"])
df.dtypes


Out[5]:
content     object
ratings    float64
dtype: object
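
The conversion above works because every rating parses as a number. If a file contained entries that do not, pd.to_numeric would raise an error by default. The sketch below, on made-up values, shows one possible way to handle that with errors="coerce", which turns unparseable entries into NaN so they can be counted or dropped. Whether the real data needs this is an assumption.

```python
import pandas as pd

# Made-up ratings, including one value that is not a number.
raw = pd.Series(["4.0", "5.0", "n/a"], name="ratings")

# errors="coerce" replaces unparseable entries with NaN instead of raising.
clean = pd.to_numeric(raw, errors="coerce")

print(clean.dtype)
print(clean.isna().sum())  # number of entries that could not be parsed
```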

In [7]:
df["ratings"].describe()


Out[7]:
count    760.000000
mean       4.373684
std        0.926746
min        1.000000
25%        4.000000
50%        5.000000
75%        5.000000
max        5.000000
Name: ratings, dtype: float64

Now that our Pandas dataframe is in the correct format, let's write it to BigQuery. The df.to_gbq() call below creates a dataset named mydataset and a table named mytable whose schema is derived from df.dtypes. Because if_exists="replace" is set, an existing table with that name is overwritten. You can check that the dataset is present in the BigQuery UI.


In [6]:
project_id = "your-project-ID"
df.to_gbq("mydataset.mytable", project_id=project_id, verbose=True, if_exists="replace")




Streaming Insert is 100% Complete
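
To give a rough idea of how a schema can be derived from df.dtypes, the sketch below maps each column's dtype kind to a BigQuery column type. The mapping and the guess_bq_schema helper are illustrative assumptions modeled on the default behaviour as I understand it, not the library's actual code, so check the resulting table schema in the BigQuery UI rather than relying on it.

```python
import pandas as pd

# Assumed mapping from NumPy dtype kind to BigQuery column type.
KIND_TO_BQ = {
    "i": "INTEGER",
    "f": "FLOAT",
    "b": "BOOLEAN",
    "O": "STRING",
    "M": "TIMESTAMP",
}

def guess_bq_schema(frame):
    """Hypothetical helper: return a list of {"name", "type"} dicts for frame."""
    return [
        {"name": col, "type": KIND_TO_BQ.get(dtype.kind, "STRING")}
        for col, dtype in frame.dtypes.items()
    ]

# A one-row stand-in for the reviews dataframe after the numeric conversion.
example = pd.DataFrame({"content": ["Great espresso."], "ratings": [5.0]})
print(guess_bq_schema(example))
```

Under this assumed mapping, content would be stored as a string column and ratings as a floating-point column.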


You can also query this table from within Pandas: pd.read_gbq() runs the query and returns the results as a dataframe.


In [8]:
query = "SELECT * FROM mydataset.mytable LIMIT 5"
pd.read_gbq(query=query, dialect="standard", project_id=project_id)


Requesting query... ok.
Job ID: job_LNTKgGB5jRLL6RSbk6yGHgm_dKjo
Query running...
Query done.
Processed: 0.0 B
Standard price: $0.00 USD

Retrieving results...
Got 5 rows.

Total time taken 3.37 s.
Finished at 2017-08-18 18:22:07.
Out[8]:
content ratings
0 GREAT COFFEE. 5 stars for the coffee, minus 4 ... 1.0
1 After hearing so much about Blue Bottle Coffee... 4.0
2 I've died and gone to coffee heaven. Just anot... 5.0
3 The BEST COFFEE and I am a picky biyatch when ... 5.0
4 Good coffee, but really bitter.  I don't eat m... 3.0