Sets and Dictionaries in Python: JSON (Learner Version)

Objectives

Correctly define "JSON" and give simple examples of valid JSON structures.
Describe JSON's strengths and weaknesses as a storage format.
Write code to read and write JSON-formatted data files using standard libraries.

Lesson

Custom data formats require custom code, which requires debugging and maintenance
Use JavaScript Object Notation (JSON) instead
Represent any mix of numbers, strings, Booleans, None, lists, and dictionaries
Can be written and read by many languages



In [1]:

    
import json
birthdays = {'Curie' : 1867, 'Hopper' : 1906, 'Franklin' : 1920}
as_string = json.dumps(birthdays)
print as_string
print type(as_string)









    



{"Curie": 1867, "Hopper": 1906, "Franklin": 1920}
<type 'str'>



In [2]:

    
writer = open('/tmp/example.json', 'w')
json.dump(birthdays, writer)
writer.close()

reader = open('/tmp/example.json', 'r')
duplicate = json.load(reader)
reader.close()

print 'original:', birthdays









    



original: {'Curie': 1867, 'Hopper': 1906, 'Franklin': 1920}



In [5]:

    
print 'duplicate:', duplicate









    



duplicate: {u'Curie': 1867, u'Hopper': 1906, u'Franklin': 1920}

Note: strings are Unicode

Data read in is the same:



In [6]:

    
print 'original == duplicate:', birthdays == duplicate









    



original == duplicate: True

But it's a new copy:



In [7]:

    
print 'original is duplicate:', birthdays is duplicate









    



original is duplicate: False



In [8]:

    
!cat /tmp/example.json









    



{"Curie": 1867, "Hopper": 1906, "Franklin": 1920}

Inventory data file is now:



In [3]:

    
!cat inventory-03.json









    



{"He" : 1, "H" : 4, "O" : 3}

Formulas file is now:



In [4]:

    
!cat formulas-03.json









    



{"helium"   : {"He" : 1},
 "water"    : {"H" : 2, "O" : 1},
 "hydrogen" : {"H" : 2}}



In [9]:

    
def main(inventory_file, formula_file):
    with open(inventory_file, 'r') as reader:
        inventory = json.load(reader)
    with open(formula_file, 'r') as reader:
        formulas = json.load(reader)
    counts = calculate_counts(inventory, formulas)
    show_counts(counts)

The read_inventory and read_formulas functions no longer need to be written (debugged, documented, maintained, ...)
Programs in other languages can read our data files
But JSON doesn't support comments
And doesn't handle aliasing

Key Points

The JSON data format can represent arbitrarily-nested lists and dictionaries containing strings, numbers, Booleans, and None.
Using JSON reduces the code we have to write ourselves and improves interoperability with other programming languages.
JSON doesn't allow for comments, and doesn't handle aliasing.