Sets and Dictionaries in Python: JSON (Learner Version)

Objectives

  • Correctly define "JSON" and give simple examples of valid JSON structures.
  • Describe JSON's strengths and weaknesses as a storage format.
  • Write code to read and write JSON-formatted data files using standard libraries.

Lesson

  • Custom data formats require custom code, which requires debugging and maintenance
  • Use JavaScript Object Notation (JSON) instead
  • Represent any mix of numbers, strings, Booleans, None, lists, and dictionaries
  • Can be written and read by many languages

In [1]:
import json
birthdays = {'Curie' : 1867, 'Hopper' : 1906, 'Franklin' : 1920}
as_string = json.dumps(birthdays)
print as_string
print type(as_string)


{"Curie": 1867, "Hopper": 1906, "Franklin": 1920}
<type 'str'>

In [2]:
writer = open('/tmp/example.json', 'w')
json.dump(birthdays, writer)
writer.close()

reader = open('/tmp/example.json', 'r')
duplicate = json.load(reader)
reader.close()

print 'original:', birthdays


original: {'Curie': 1867, 'Hopper': 1906, 'Franklin': 1920}

In [5]:
print 'duplicate:', duplicate


duplicate: {u'Curie': 1867, u'Hopper': 1906, u'Franklin': 1920}
  • Note: strings are Unicode
  • Data read in is the same:

In [6]:
print 'original == duplicate:', birthdays == duplicate


original == duplicate: True
  • But it's a new copy:

In [7]:
print 'original is duplicate:', birthdays is duplicate


original is duplicate: False

In [8]:
!cat /tmp/example.json


{"Curie": 1867, "Hopper": 1906, "Franklin": 1920}
  • Inventory data file is now:

In [3]:
!cat inventory-03.json


{"He" : 1, "H" : 4, "O" : 3}
  • Formulas file is now:

In [4]:
!cat formulas-03.json


{"helium"   : {"He" : 1},
 "water"    : {"H" : 2, "O" : 1},
 "hydrogen" : {"H" : 2}}

In [9]:
def main(inventory_file, formula_file):
    with open(inventory_file, 'r') as reader:
        inventory = json.load(reader)
    with open(formula_file, 'r') as reader:
        formulas = json.load(reader)
    counts = calculate_counts(inventory, formulas)
    show_counts(counts)
  • The read_inventory and read_formulas functions no longer need to be written (debugged, documented, maintained, ...)
  • Programs in other languages can read our data files
  • But JSON doesn't support comments
  • And doesn't handle aliasing

Key Points

  • The JSON data format can represent arbitrarily-nested lists and dictionaries containing strings, numbers, Booleans, and None.
  • Using JSON reduces the code we have to write ourselves and improves interoperability with other programming languages.
  • JSON doesn't allow for comments, and doesn't handle aliasing.