This lab covers the foundations of consuming HTTP web service APIs with Python. Here's what we will cover.
In [ ]:
# Run this to make sure you have the pre-requisites!
!pip install -q requests
In this part we learn about the Python requests module. http://docs.python-requests.org/en/master/user/quickstart/
This module makes it easy to write code that sends HTTP requests over the internet and handles the responses. It will be the cornerstone of our API consumption in this course. While other modules accomplish the same thing, requests is the most straightforward and easiest to use.
We'll begin by importing the modules we will need. We do this here so we won't need to include these lines in the other code we write in this lab.
In [ ]:
# start by importing the modules we will need
import requests
import json
As you learned in class and in your assigned readings, the HTTP protocol has verbs which constitute the type of request you send to the remote resource, or url. Based on the url and request type, you will get a response.
The following line of code makes a get request (that's the HTTP verb) to OpenStreetMap's Nominatim geocoding service. This service attempts to convert the address (in this case Hinds Hall, Syracuse University) into a set of global coordinates (latitude and longitude), so that the location can be plotted on a map.
In [ ]:
url = 'https://nominatim.openstreetmap.org/search?q=Hinds+Hall+Syracuse+University&format=json'
response = requests.get(url)
In [ ]:
response.ok # did the request work?
In [ ]:
response.text # what's in the body of the response, as a raw string
In the case of web site urls, the response body is HTML, which should be rendered in a web browser. But we're dealing with web service APIs, so...
In the case of web API urls, the response body could be in a variety of formats, from plain text to XML or JSON. In this course we will focus only on JSON because, as we've seen, it translates easily into Python object variables.
Let's convert the response to a Python object variable. In this case it will be a Python list of dictionaries.
In [ ]:
geodata = response.json() # try to decode the response from JSON format
geodata # this is now a Python object variable
With our Python object, we can now walk the list of dictionaries to retrieve the latitude and longitude.
In [ ]:
lat = geodata[0]['lat']
lon = geodata[0]['lon']
print(lat, lon)
In the code above we "walked" the Python list of dictionaries to get to the location:
geodata is a list
geodata[0] is the first item in that list, a dictionary
geodata[0]['lat'] is the value stored under the dictionary key 'lat', which represents the latitude
geodata[0]['lon'] is the value stored under the dictionary key 'lon', which represents the longitude
It should be noted that this process will vary for each API you call, so it's important to get accustomed to performing this task. You'll be doing it quite often.
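To practice the walk without calling the service, here is a minimal sketch that decodes a hard-coded JSON string shaped like the response above. The sample text and its values are made up for illustration and are not real output from the service; only the lat and lon key names come from the code in this lab.

```python
import json

# Hypothetical sample shaped like the geocoding response: a JSON array of objects.
# The values below are invented for illustration only.
sample_text = '[{"lat": "43.0", "lon": "-76.1"}, {"lat": "40.7", "lon": "-73.8"}]'

sample = json.loads(sample_text)   # decode the JSON text into Python objects

first = sample[0]                  # first item in the list, a dictionary
lat = first['lat']                 # value stored under the 'lat' key
lon = first['lon']                 # value stored under the 'lon' key

print(type(sample).__name__)       # the outer object is a list
print(type(first).__name__)        # each item is a dict
print(lat, lon)
```

Working on a small sample like this can make it easier to figure out the right sequence of indexes and keys before you apply it to a live response.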
One final thing to address. What is the type of lat and lon?
In [ ]:
type(lat), type(lon)
Bummer, they are strings. We want them to be floats, so we will need to parse the strings with the float() function:
In [ ]:
lat = float(geodata[0]['lat'])
lon = float(geodata[0]['lon'])
print("Latitude: %f, Longitude: %f" % (lat, lon))
In [ ]:
# todo:
# retrieve the place_id and put it in a variable
# retrieve the display_name (the formatted address) and put it in a variable
# print both of them out
In the example above we hard-coded "Hinds Hall Syracuse University" into the request:
url = 'https://nominatim.openstreetmap.org/search?q=Hinds+Hall+Syracuse+University&format=json'
A better way to write this code is to allow for the input of any location and supply that to the service. To make this work we need to send parameters into the request as a dictionary. This way we can geolocate any address!
You'll notice that on the url we are passing key-value pairs: one key is q and its value is Hinds+Hall+Syracuse+University. The other key is format and its value is json. Hey, Python dictionaries are also key-value pairs, so:
In [ ]:
url = 'https://nominatim.openstreetmap.org/search' # base URL without parameters after the "?"
search = 'Hinds Hall Syracuse University'
options = { 'q' : search, 'format' : 'json'}
response = requests.get(url, params = options)
geodata = response.json()
coords = { 'lat' : float(geodata[0]['lat']), 'lng' : float(geodata[0]['lon']) }
print("Search for:", search)
print("Coordinates:", coords)
print("%s is located at (%f,%f)" %(search, coords['lat'], coords['lng']))
RECALL: For requests.get(url, params = options) the part that says params = options is called a named argument, which is Python's way of specifying an optional function argument.
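As a rough illustration of how a dictionary of parameters turns back into the key-value pairs on the url, the sketch below builds the query string by hand. It uses urlencode from the standard library purely to show the idea; requests does its own encoding internally, and the details may differ.

```python
from urllib.parse import urlencode

base_url = 'https://nominatim.openstreetmap.org/search'
options = {'q': 'Hinds Hall Syracuse University', 'format': 'json'}

# Each key-value pair becomes key=value, pairs are joined with "&",
# and the spaces in the search text are encoded as "+".
query_string = urlencode(options)
full_url = base_url + '?' + query_string

print(query_string)
print(full_url)
```

The printed url should match the hard-coded one from earlier in the lab, which is why the dictionary version can replace it.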
With our parameter now outside the url, we can easily rewrite this code to work for any location! Go ahead and execute the code and input Queens, NY. This should retrieve coordinates close to (40.728224,-73.794852).
In [ ]:
url = 'https://nominatim.openstreetmap.org/search' # base URL without parameters after the "?"
search = input("Enter a location to Geocode: ")
options = { 'q' : search, 'format' : 'json'}
response = requests.get(url, params = options)
geodata = response.json()
coords = { 'lat' : float(geodata[0]['lat']), 'lng' : float(geodata[0]['lon']) }
print("Search for:", search)
print("Coordinates:", coords)
print("%s is located at (%f,%f)" %(search, coords['lat'], coords['lng']))
In [ ]:
def get_coordinates(search):
    url = 'https://nominatim.openstreetmap.org/search' # base URL without parameters after the "?"
    options = { 'q' : search, 'format' : 'json'}
    response = requests.get(url, params = options)
    geodata = response.json()
    coords = { 'lat' : float(geodata[0]['lat']), 'lng' : float(geodata[0]['lon']) }
    return coords
# main program here:
location = input("Enter a location: ")
coords = get_coordinates(location)
print("%s is located at (%f,%f)" %(location, coords['lat'], coords['lng']))
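One thing to watch for in the function above: geodata[0] raises an IndexError when the list is empty. The sketch below assumes the service returns an empty list when a search has no matches (worth confirming against the service's documentation). The helper name extract_coordinates is hypothetical and not part of this lab; it takes already-decoded data, so it can be tried without a network connection.

```python
def extract_coordinates(geodata):
    """Return {'lat': float, 'lng': float} for the first match, or None when there are no matches."""
    if not geodata:          # an empty list means nothing was found
        return None
    first = geodata[0]       # first match, a dictionary
    return {'lat': float(first['lat']), 'lng': float(first['lon'])}

# Invented sample data, used only to exercise both paths
no_match = extract_coordinates([])
one_match = extract_coordinates([{'lat': '40.5', 'lon': '-73.25'}])

print(no_match)
print(one_match)
```

Returning None lets the main program decide what to tell the user instead of crashing inside the function.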
Not every API we call uses the get() method. Some use post() because the amount of data you provide is too large to place on the url.
An example of this is the Text-Processing.com sentiment analysis service. http://text-processing.com/docs/sentiment.html This service will detect the sentiment or mood of text. You give the service some text, and it tells you whether that text is positive, negative or neutral.
In [ ]:
# 'you suck' == 'negative'
url = 'http://text-processing.com/api/sentiment/'
options = { 'text' : 'you suck'}
response = requests.post(url, data = options)
sentiment = response.json()
sentiment
In [ ]:
# 'I love cheese' == 'positive'
url = 'http://text-processing.com/api/sentiment/'
options = { 'text' : 'I love cheese'}
response = requests.post(url, data = options)
sentiment = response.json()
sentiment
In the examples provided we used the post() method instead of the get() method. The post() method has a named argument data which takes a dictionary of data. The key required by text-processing.com is text, which holds the text you would like to process for sentiment.
We use a post in the event the text we wish to process is very long. Case in point:
In [ ]:
tweet = "Arnold Schwarzenegger isn't voluntarily leaving the Apprentice, he was fired by his bad (pathetic) ratings, not by me. Sad end to a great show"
url = 'http://text-processing.com/api/sentiment/'
options = { 'text' : tweet }
response = requests.post(url, data = options)
sentiment = response.json()
sentiment
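Once the response is decoded, you will usually want to pull one value out of it rather than display the whole dictionary. The sketch below assumes the decoded response is a dictionary with a 'label' key whose value is 'pos', 'neg' or 'neutral'; compare that against what the cells above actually display before relying on it. The function name describe_sentiment and the sample dictionary are hypothetical.

```python
def describe_sentiment(sentiment):
    """Translate the assumed short label into a readable word."""
    words = {'pos': 'positive', 'neg': 'negative', 'neutral': 'neutral'}
    label = sentiment.get('label')       # None when the key is missing
    return words.get(label, 'unknown')   # fall back for unexpected labels

# Invented sample standing in for a decoded response
sample = {'label': 'neg'}
negative_result = describe_sentiment(sample)
missing_result = describe_sentiment({})

print(negative_result)
print(missing_result)
```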
In [ ]:
# todo write code here
The first rule of programming over a network is to NEVER assume the network is available. You need to assume the worst. No WiFi, user types in a bad url, the remote website is down, etc.
We handle this in the requests module by catching the requests.exceptions.RequestException. Here's an example:
In [ ]:
url = "http://this is not a website"
try:
    response = requests.get(url) # throws an exception when it cannot connect
# internet is broken
except requests.exceptions.RequestException as e:
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)
Assuming the internet is not broken (Rule 1), you should now check for HTTP response 200, which means the url responded successfully. Other responses like 404 or 501 indicate an error occurred, and that means you should not keep processing the response.
Here's one way to do it:
In [ ]:
url = 'http://www.syr.edu/mikeisawesum' # this should 404
try:
    response = requests.get(url)
    if response.ok: # True when the status code is below 400, e.g. 200
        data = response.text
    else: # an error response code such as 404 or 501
        print("There was an Error requesting:", url, " HTTP Response Code: ", response.status_code)
# internet is broken
except requests.exceptions.RequestException as e:
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)
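If the numeric response codes are hard to remember, the standard library can translate them. This is a small sketch using http.HTTPStatus; the describe_status helper is hypothetical and only groups codes into the broad ranges defined by HTTP (2xx success, 4xx client error, 5xx server error).

```python
from http import HTTPStatus

def describe_status(code):
    """Return a readable description of an HTTP status code."""
    status = HTTPStatus(code)            # raises ValueError for unknown codes
    if 200 <= code < 300:
        kind = 'success'
    elif 400 <= code < 500:
        kind = 'client error'
    elif 500 <= code < 600:
        kind = 'server error'
    else:
        kind = 'other'
    return "%d %s (%s)" % (code, status.phrase, kind)

for code in (200, 404, 501):
    print(describe_status(code))
```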
Personally, I don't like to use if ... else to handle an error. Instead, I prefer to instruct requests to throw an exception of requests.exceptions.HTTPError whenever the response is not ok. This makes the code you write a little cleaner.
Errors are rare occurrences, and so I don't like error handling cluttering up my code.
In [ ]:
url = 'http://www.syr.edu/mikeisawesum' # this should 404
try:
    response = requests.get(url) # throws an exception when it cannot connect
    response.raise_for_status() # throws an exception when not 'ok'
    data = response.text
# response not ok
except requests.exceptions.HTTPError as e:
    print("ERROR: Response from ", url, 'was not ok.')
    print("DETAILS:", e)
# internet is broken
except requests.exceptions.RequestException as e:
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)
In [ ]:
url = 'http://www.syr.edu' # this is HTML, not JSON
try:
    response = requests.get(url) # throws an exception when it cannot connect
    response.raise_for_status() # throws an exception when not 'ok'
    data = response.json() # throws an exception when cannot decode json
# cannot decode json
except json.decoder.JSONDecodeError as e:
    print("ERROR: Cannot decode the response into json")
    print("DETAILS:", e)
# response not ok
except requests.exceptions.HTTPError as e:
    print("ERROR: Response from ", url, 'was not ok.')
    print("DETAILS:", e)
# internet is broken
except requests.exceptions.RequestException as e:
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)
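To see the decoding failure on its own, without depending on the network, the sketch below feeds a piece of HTML and a piece of JSON to json.loads from the standard library. The try_decode helper and both sample strings are made up for illustration.

```python
import json

def try_decode(text):
    """Return the decoded Python object, or None when the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        print("ERROR: Cannot decode the text as JSON")
        print("DETAILS:", e)
        return None

# Invented samples: one HTML fragment, one JSON object
html_result = try_decode('<html><body>Hello</body></html>')
json_result = try_decode('{"status": "ok"}')

print(html_result)
print(json_result)
```

HTML fails right at the first character because "<" cannot start a JSON value, which is the same reason a web page url cannot be decoded in the cell above.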
In [ ]:
# todo write code here to input a location, look up coordinates, and print
# it should handle errors!!!
Please answer the following questions. This should be a personal narrative, in your own voice. Answer the questions by double clicking on the question and placing your answer next to the Answer: prompt.
Answer:
Answer:
Answer:
1 ==> I can do this on my own and explain how to do it.
2 ==> I can do this on my own without any help.
3 ==> I can do this with help or guidance from others. If you choose this level please list those who helped you.
4 ==> I don't understand this at all yet and need extra help. If you choose this please try to articulate that which you do not understand.
Answer:
In [ ]:
# SAVE YOUR WORK FIRST! CTRL+S
# RUN THIS CODE CELL TO TURN IN YOUR WORK!
from ist256.submission import Submission
Submission().submit()