In-Class Coding Lab: Understanding The Foundations of Web APIs

Overview

This lab covers the foundations of what is necessary to properly consume HTTP web service APIs with Python. Here's what we will cover.

  1. Understanding requests and responses
  2. Proper error handling
  3. Parameter handling
  4. Refactoring as a function

In [ ]:
# Run this to make sure you have the pre-requisites!
!pip install -q requests

Part 1: Understanding Requests and Responses

In this part we learn about the Python requests module. http://docs.python-requests.org/en/master/user/quickstart/

This module makes it easy to write code to send HTTP requests over the internet and handle the responses. It will be the cornerstone of our API consumption in this course. While there are other modules which accomplish the same thing, requests is the most straightforward and easiest to use.

We'll begin by importing the modules we will need. We do this here so we won't need to include these lines in the other code we write in this lab.


In [ ]:
# start by importing the modules we will need
import requests
import json

The request

As you learned in class and your assigned readings, the HTTP protocol has verbs which constitute the type of request you will send to the remote resource, or url. Based on the url and request type, you will get a response.

The following line of code makes a get request (that's the HTTP verb) to the OpenStreetMap Nominatim geocoding service. This service attempts to convert the address (in this case Hinds Hall at Syracuse University) into a set of global coordinates (latitude and longitude), so that the location can be plotted on a map.


In [ ]:
url = 'https://nominatim.openstreetmap.org/search?q=Hinds+Hall+Syracuse+University&format=json'
response = requests.get(url)

The response

The get() method returns a Response object variable. I called it response in this example but it could be called anything.

The HTTP response consists of a status code and body. The status code lets you know if the request worked, while the body of the response contains the actual data.


In [ ]:
response.ok # did the request work?

In [ ]:
response.text  # what's in the body of the response, as a raw string
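
To see how the status code drives response.ok without touching the network, here is a minimal sketch. It builds Response objects by hand purely for illustration (in real code requests.get() creates them for you), and the helper name fake_response is made up for this example.

```python
import requests

def fake_response(status_code):
    """Build a bare Response with only a status code set (illustration only)."""
    r = requests.Response()
    r.status_code = status_code
    return r

# Codes below 400 count as "ok"; 4xx and 5xx codes do not.
for code in (200, 204, 404, 500):
    r = fake_response(code)
    print(code, "ok?", r.ok)
```

Note that ok is not limited to 200: any status code below 400 is treated as a success.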

Converting responses into Python object variables

In the case of web site urls the response body is HTML. This should be rendered in a web browser. But we're dealing with web service APIs so...

In the case of web API urls the response body could be in a variety of formats, from plain text to XML or JSON. In this course we will only focus on JSON format because, as we've seen, it translates easily into Python object variables.

Let's convert the response to a Python object variable. In this case it will be a Python list of dictionaries.


In [ ]:
geodata = response.json()  # try to decode the response from JSON format
geodata                    # this is now a Python object variable

With our Python object, we can now walk the list of dictionaries to retrieve the latitude and longitude.


In [ ]:
lat = geodata[0]['lat']
lon = geodata[0]['lon']
print(lat, lon)

In the code above we "walked" the Python list of dictionaries to get to the location:

  • geodata is a list
  • geodata[0] is the first item in that list, a dictionary
  • geodata[0]['lat'] is the value under the dictionary key 'lat', which represents the latitude
  • geodata[0]['lon'] is the value under the dictionary key 'lon', which represents the longitude

It should be noted that this process will vary for each API you call, so it's important to get accustomed to performing this task. You'll be doing it quite often.
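
Since every API shapes its data differently, it helps to inspect the structure before indexing into it. The following is a minimal sketch that works offline: sample_text is made-up, abbreviated data shaped like a geocoding result, not a real response from the service.

```python
import json

# Made-up, abbreviated text shaped like a geocoding result: a JSON list of objects.
sample_text = '[{"lat": "43.0", "lon": "-76.1", "display_name": "Example Hall, Example City"}]'

sample = json.loads(sample_text)   # JSON array becomes a Python list
first = sample[0]                  # first item in the list is a dictionary

print(type(sample).__name__, type(first).__name__)
print(sorted(first.keys()))        # look at the available keys before indexing
print(first['display_name'])
```

Checking the type at each level and listing the keys tells you which index or key to use next.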

One final thing to address. What is the type of lat and lon?


In [ ]:
type(lat), type(lon)

Bummer, they are strings. We want them to be floats, so we will need to parse the strings with the float() function:


In [ ]:
lat = float(geodata[0]['lat'])
lon = float(geodata[0]['lon'])
print("Latitude: %f, Longitude: %f" % (lat, lon))

Now You Try It!

Walk the geodata object variable and retrieve the values under the keys display_name and boundingbox.


In [ ]:
# todo:
# retrieve the display_name and put it in a variable
# retrieve the boundingbox and put it in a variable
# print both of them out

Part 2: Parameter Handling

In the example above we hard-coded "Hinds Hall Syracuse University" into the request:

url = 'https://nominatim.openstreetmap.org/search?q=Hinds+Hall+Syracuse+University&format=json'

A better way to write this code is to allow for the input of any location and supply that to the service. To make this work we need to send parameters into the request as a dictionary. This way we can geolocate any address!

You'll notice that on the url we are passing key-value pairs: one key is q and its value is Hinds+Hall+Syracuse+University. The other key is format and its value is json. Hey, Python dictionaries are also key-value pairs so:


In [ ]:
url = 'https://nominatim.openstreetmap.org/search'  # base URL without parameters after the "?"
search = 'Hinds Hall Syracuse University'
options = { 'q' : search, 'format' : 'json'}
response = requests.get(url, params = options)            
geodata = response.json()
coords = { 'lat' : float(geodata[0]['lat']), 'lng' : float(geodata[0]['lon']) }
print("Search for:", search)
print("Coordinates:", coords)
print("%s is located at (%f,%f)" %(search, coords['lat'], coords['lng']))
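
To see what requests does with the params dictionary, you can prepare a request without sending it and look at the url it would use. This is a minimal sketch that needs no network connection; the variable name prepared is just for illustration.

```python
import requests

url = 'https://nominatim.openstreetmap.org/search'
options = {'q': 'Hinds Hall Syracuse University', 'format': 'json'}

# Build the request but do not send it, then inspect the final url.
prepared = requests.Request('GET', url, params=options).prepare()
print(prepared.url)
```

The spaces in the search text are encoded for you, so the prepared url carries the same key-value pairs as the hard-coded url from Part 1.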

Looking up any address

RECALL: For requests.get(url, params = options) the part that says params = options is called a named argument. We pass the value by the name of the parameter, which is how optional function arguments are usually supplied in Python.

With our parameter now outside the url, we can easily rewrite this code to work for any location! Go ahead and execute the code and input Queens, NY. This should retrieve coordinates close to (40.728224,-73.794852).


In [ ]:
url = 'https://nominatim.openstreetmap.org/search'  # base URL without parameters after the "?"
search = input("Enter a location to Geocode: ")
options = { 'q' : search, 'format' : 'json'}
response = requests.get(url, params = options)            
geodata = response.json()
coords = { 'lat' : float(geodata[0]['lat']), 'lng' : float(geodata[0]['lon']) }
print("Search for:", search)
print("Coordinates:", coords)
print("%s is located at (%f,%f)" %(search, coords['lat'], coords['lng']))

So useful, it should be a function!

One thing you'll come to realize quickly is that your API calls should be wrapped in functions. This promotes readability and code re-use. For example:


In [ ]:
def get_coordinates(search):
    url = 'https://nominatim.openstreetmap.org/search'  # base URL without parameters after the "?"
    options = { 'q' : search, 'format' : 'json'}
    response = requests.get(url, params = options)            
    geodata = response.json()
    coords = { 'lat' : float(geodata[0]['lat']), 'lng' : float(geodata[0]['lon']) }
    return coords

# main program here:
location = input("Enter a location: ")
coords = get_coordinates(location)
print("%s is located at (%f,%f)" %(location, coords['lat'], coords['lng']))
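
One thing get_coordinates() does not handle is a search with no matches. Assuming the service answers such a search with an empty list (treat that as an assumption to verify), geodata[0] would raise an IndexError. The sketch below pulls the parsing step into a hypothetical helper, extract_coords(), so it can be tried offline with made-up data.

```python
def extract_coords(geodata):
    """Return {'lat': float, 'lng': float} from a list of results, or None if the list is empty."""
    if not geodata:          # empty list: nothing matched the search
        return None
    first = geodata[0]
    return {'lat': float(first['lat']), 'lng': float(first['lon'])}

# Made-up sample data standing in for response.json()
print(extract_coords([{'lat': '43.0', 'lon': '-76.1'}]))
print(extract_coords([]))
```

The caller can then check for None before printing, instead of crashing on an empty result.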

Other request methods

Not every API we call uses the get() method. Some use post() because the amount of data you provide is too large to place on the url.

An example of this is the Text-Processing.com sentiment analysis service. http://text-processing.com/docs/sentiment.html This service will detect the sentiment or mood of text. You give the service some text, and it tells you whether that text is positive, negative or neutral.


In [ ]:
# 'you suck' == 'negative'
url = 'http://text-processing.com/api/sentiment/'
options = { 'text' : 'you suck'}
response = requests.post(url, data = options)
sentiment = response.json()
sentiment

In [ ]:
# 'I love cheese' == 'positive'
url = 'http://text-processing.com/api/sentiment/'
options = { 'text' : 'I love cheese'}
response = requests.post(url, data = options)
sentiment = response.json()
sentiment

In the examples provided we used the post() method instead of the get() method. The post() method has a named argument data which takes a dictionary of data. The key required by text-processing.com is text, which holds the text you would like to process for sentiment.
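
To see where the data dictionary ends up, you can prepare a post request without sending it. This is a minimal sketch that runs offline; the variable name prepared is just for illustration.

```python
import requests

url = 'http://text-processing.com/api/sentiment/'
options = {'text': 'I love cheese'}

# Build the request but do not send it, then inspect its parts.
prepared = requests.Request('POST', url, data=options).prepare()
print("URL: ", prepared.url)                      # unchanged, no "?" added
print("Body:", prepared.body)                     # the data travels here instead
print("Type:", prepared.headers['Content-Type'])
```

Unlike params, the data dictionary is encoded into the body of the request, which is why its length is not constrained by the url.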

We use a post in the event the text we wish to process is very long. Case in point:


In [ ]:
tweet = "Arnold Schwarzenegger isn't voluntarily leaving the Apprentice, he was fired by his bad (pathetic) ratings, not by me. Sad end to a great show"
url = 'http://text-processing.com/api/sentiment/'
options = { 'text' : tweet }
response = requests.post(url, data = options)
sentiment = response.json()
sentiment

Now You Try It!

Use the above example to write a program which will input any text and print the sentiment using this API!


In [ ]:
# todo write code here

Part 3: Proper Error Handling (In 3 Simple Rules)

When you write code that depends on other people's code from around the Internet, there's a lot that can go wrong. Therefore we prescribe the following advice:

Assume anything that CAN go wrong WILL go wrong

Rule 1: Don't assume the internet 'always works'

The first rule of programming over a network is to NEVER assume the network is available. You need to assume the worst. No WiFi, user types in a bad url, the remote website is down, etc.

We handle this in the requests module by catching the requests.exceptions.RequestException. Here's an example:


In [ ]:
url = "http://this is not a website"
try:

    response = requests.get(url)  # throws an exception when it cannot connect

# internet is broken
except requests.exceptions.RequestException as e:
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)
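
Catching RequestException works for many different failures because it is the base class of the more specific exceptions in the requests module. The following minimal sketch checks that relationship for a few of them without making any request; the variable names base and specific are made up for this example.

```python
import requests

base = requests.exceptions.RequestException
specific = [
    requests.exceptions.ConnectionError,  # could not reach the server
    requests.exceptions.Timeout,          # the server took too long
    requests.exceptions.InvalidURL,       # the url is malformed
    requests.exceptions.HTTPError,        # raised by raise_for_status()
]

for exc in specific:
    print(exc.__name__, "is a RequestException:", issubclass(exc, base))
```

A Timeout can only occur if you ask for one, for example requests.get(url, timeout=5). Without the timeout argument a request may wait indefinitely for a slow server.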

Rule 2: Don't assume the response you get back is valid

Assuming the internet is not broken (Rule 1), you should now check the HTTP status code. A 200 means the url responded successfully. Other codes such as 404 or 501 indicate an error occurred, which means you should not keep processing the response.

Here's one way to do it:


In [ ]:
url = 'http://www.syr.edu/mikeisawesum'  # this should 404
try:
    
    response = requests.get(url)
    
    if response.ok:  # True when status_code is below 400 (for example 200)
        data = response.text
    else: # an error status code (400 or above)
        print("There was an Error requesting:", url, " HTTP Response Code: ", response.status_code)

# internet is broken
except requests.exceptions.RequestException as e: 
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)

Rule 2a: Use exceptions instead of if else in this case

Personally I don't like to use if ... else to handle an error. Instead, I prefer to instruct requests to throw an exception of requests.exceptions.HTTPError whenever the response is not ok. This makes the code you write a little cleaner.

Errors are rare occurrences, and so I don't like error handling cluttering up my code.


In [ ]:
url = 'http://www.syr.edu/mikeisawesum'  # this should 404
try:
    
    response = requests.get(url)  # throws an exception when it cannot connect
    response.raise_for_status()   # throws an exception when not 'ok'
    data = response.text

# response not ok
except requests.exceptions.HTTPError as e:
    print("ERROR: Response from ", url, 'was not ok.')
    print("DETAILS:", e)
        
# internet is broken
except requests.exceptions.RequestException as e: 
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)
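
The order of the except clauses matters here: HTTPError is a kind of RequestException, so the more specific handler must come first or it will never run. The sketch below demonstrates this offline with hand-built Response objects (for illustration only); the function name describe is made up for this example.

```python
import requests

def describe(response):
    """Return a short label for how raise_for_status() behaves on this response."""
    try:
        response.raise_for_status()
        return "ok"
    except requests.exceptions.HTTPError:         # specific handler first
        return "http error"
    except requests.exceptions.RequestException:  # general handler last
        return "other request error"

# Hand-built responses standing in for what requests.get() would return
fine = requests.Response()
fine.status_code = 200
not_found = requests.Response()
not_found.status_code = 404

print(describe(fine), "/", describe(not_found))
```

If the two except clauses were swapped, the 404 would be reported as "other request error" because the general handler would catch it first.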

Rule 3: Don't assume the data you get back is the data you expect.

And finally, do not assume the data arriving in the response is the data you expected. Specifically, when you try to decode the JSON, don't assume that will go smoothly. Catch the json.decoder.JSONDecodeError.


In [ ]:
url = 'http://www.syr.edu' # this is HTML, not JSON
try:

    response = requests.get(url)  # throws an exception when it cannot connect
    response.raise_for_status()   # throws an exception when not 'ok'
    data = response.json()        # throws an exception when cannot decode json
    
# cannot decode json
except json.decoder.JSONDecodeError as e: 
    print("ERROR: Cannot decode the response into json")
    print("DETAILS:", e)

# response not ok
except requests.exceptions.HTTPError as e:
    print("ERROR: Response from ", url, 'was not ok.')
    print("DETAILS:", e)
        
# internet is broken
except requests.exceptions.RequestException as e: 
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)
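
The decoding failure can also be reproduced without a network connection by using the json module directly on some text. This is a minimal sketch; the helper name decode_json is made up for this example.

```python
import json

def decode_json(text):
    """Return the decoded Python object, or None when the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        print("ERROR: Cannot decode text as JSON:", e)
        return None

print(decode_json('<html><body>Hello</body></html>'))  # HTML, not JSON
print(decode_json('[]'))                               # valid JSON, but an empty list
```

The second call is a reminder that valid JSON is not the same as useful data: an empty list decodes without error, yet indexing it with [0] would still fail.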

Now You Try It!

Using the last example above, write a program to input a location, call the get_coordinates() function, then print the coordinates. Make sure to handle all three types of exceptions!


In [ ]:
# todo write code here to input a location, look up coordinates, and print
# it should handle errors!!!

Metacognition

Please answer the following questions. This should be a personal narrative, in your own voice. Answer the questions by double clicking on the question and placing your answer next to the Answer: prompt.

Questions

  1. Record any questions you have about this lab that you would like to ask in recitation. It is expected you will have questions if you did not complete the code sections correctly. Learning how to articulate what you do not understand is an important skill of critical thinking.

Answer:

  2. What was the most difficult aspect of completing this lab? Least difficult?

Answer:

  3. What aspects of this lab do you find most valuable? Least valuable?

Answer:

  4. Rate your comfort level with this week's material so far.

1 ==> I can do this on my own and explain how to do it.
2 ==> I can do this on my own without any help.
3 ==> I can do this with help or guidance from others. If you choose this level please list those who helped you.
4 ==> I don't understand this at all yet and need extra help. If you choose this please try to articulate that which you do not understand.

Answer:


In [ ]:
# SAVE YOUR WORK FIRST! CTRL+S
# RUN THIS CODE CELL TO TURN IN YOUR WORK!
from ist256.submission import Submission
Submission().submit()