JSON documents are very popular, especially for API responses and financial data. They provide a convenient, dictionary-like interface to data, letting you work with keys rather than indices only. Python therefore has built-in support for JSON documents, with ready-made functions for the common tasks. To access those functions, import the json library, which ships with Python as part of the standard library.
In [3]:
import json
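Let's create a sample JSON document and save it, as a string, to a variable called input. (This name shadows Python's built-in input() function, which is acceptable in a short demo but best avoided in real code.)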
In [4]:
input = '''[
{ "id" : "01",
"status" : "Instructor",
"name" : "Hrant"
} ,
{ "id" : "02",
"status" : "Student",
"name" : "Jimmy"
}
]'''
As you can see, our JSON document is nothing more than a list of two dictionaries with three keys each (and a value for each key). To parse it into a regular Python object (a list in this case), use the loads() function from the json package.
In [5]:
# parse/load string
data = json.loads(input)
# data is a usual list
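The parsing step described above can be sketched a little further to show which Python type each kind of JSON value becomes, and what happens when the input is not valid JSON. The sample string and the variable names sample and record are made up for illustration and are not part of the original data.

```python
import json

# Hypothetical sample that contains every kind of JSON value.
sample = '{"name": "Hrant", "id": 1, "score": 9.5, "active": true, "tags": ["a", "b"], "manager": null}'

record = json.loads(sample)

# JSON objects become dicts, arrays become lists, true/false become bools,
# null becomes None, and numbers become int or float.
for key, value in record.items():
    print(key, type(value).__name__)

# Malformed input raises json.JSONDecodeError (a subclass of ValueError).
try:
    json.loads('{"name": "Hrant",}')  # a trailing comma is not valid JSON
except json.JSONDecodeError as err:
    print("Could not parse:", err.msg)
```

Catching the exception is usually worthwhile when the JSON comes from an API response, since the body may be truncated or may not be JSON at all.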
In [6]:
type(data)
Out[6]:
In [7]:
print(data)
In [8]:
from pprint import pprint
In [9]:
pprint(data)
In [10]:
print('User count:', len(data), "\n")
In [11]:
data[0]['name']
Out[11]:
In [12]:
for element in data:
    print('Name: ', element['name'])
    print('Id: ', element['id'])
    print('Status: ', element['status'], "\n")
In [13]:
import pandas as pd
In [14]:
# raw string, so the backslashes in the Windows path are not treated as escapes
address = r"C:\Data_scraping\JSON\sample_data.json"
In [15]:
my_json_data = pd.read_json(address)
In [16]:
my_json_data.head()
Out[16]:
In [17]:
import json
In [18]:
with open(address, "r") as file:
    local_json = json.load(file)
In [19]:
print(local_json)
In [20]:
type(local_json)
Out[20]:
In [21]:
pprint(local_json)
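The two parsing functions used so far differ only in where they read from: json.loads() takes a string, while json.load() takes an open file (or any file-like object). The sketch below illustrates this with an in-memory io.StringIO object standing in for a real file, so it does not depend on any path existing on disk. The names text, from_text and from_file are illustrative only.

```python
import io
import json

text = '[{"id": "01", "status": "Instructor", "name": "Hrant"}]'

# json.loads parses the text itself ...
from_text = json.loads(text)

# ... while json.load reads from a file-like object.
# io.StringIO is used here as a stand-in for an opened file.
from_file = json.load(io.StringIO(text))

print(from_text == from_file)
print(from_file[0]["name"])
```

Both calls should produce the same list, so which one to use depends only on whether the JSON is already in memory or still in a file.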
In [22]:
with open('our_json_w.json', 'w') as output:
    json.dump(local_json, output)
Yet, as you may have already noticed, the saved JSON file is not human-readable. To make it more user-friendly, we can sort the keys and use 4-space indentation.
In [23]:
with open('our_json_w.json', 'w') as output:
    json.dump(local_json, output, sort_keys=True, indent=4)
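To see the effect of sort_keys and indent without opening the saved file, the same options can be passed to json.dumps(), which returns a string instead of writing to a file. This is a minimal sketch on a single made-up record; the names record, compact and pretty are illustrative.

```python
import json

record = {"name": "Hrant", "id": "01", "status": "Instructor"}

# Default output: one line, keys in insertion order.
compact = json.dumps(record)

# Sorted keys and 4-space indentation, as in the cell above.
pretty = json.dumps(record, sort_keys=True, indent=4)

print(compact)
print(pretty)
```

The formatting affects only how the text looks. Parsing either string with json.loads() gives back the same dictionary.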
In [25]:
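import csv, json
# raw string, so the backslashes in the Windows path are not treated as escapes
address = r"C:\Data_scraping\JSON\sample_data.json"
with open(address, "r") as file:
    local_json = json.load(file)
# newline="" stops the csv module from writing blank rows on Windows
with open("from_json.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["ID", "Name", "Status"])
    for item in local_json:
        writer.writerow([item['id'], item['name'], item['status']])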
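As an alternative to picking the values out by hand, the conversion above can be sketched with csv.DictWriter, which takes each dictionary directly and orders the columns according to fieldnames. The example writes to an in-memory buffer rather than a file and reuses the two sample records from earlier. The names text, records, buffer and csv_text are illustrative, and the header row here uses the JSON keys themselves rather than the capitalized labels.

```python
import csv
import io
import json

text = '''[
  {"id": "01", "status": "Instructor", "name": "Hrant"},
  {"id": "02", "status": "Student", "name": "Jimmy"}
]'''
records = json.loads(text)

# Write to an in-memory buffer; a real file opened with newline="" works the same way.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "name", "status"], lineterminator="\n")
writer.writeheader()       # header row taken from fieldnames
writer.writerows(records)  # one row per dictionary, columns in fieldnames order

csv_text = buffer.getvalue()
print(csv_text)
```

One design consideration: DictWriter raises a ValueError by default if a record contains a key that is not listed in fieldnames, which can help catch unexpected fields early.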