Using the Google Geocoding API to geolocate METAR data

What is Geocoding?

Geocoding converts a human-readable address into geographic coordinates (latitude and longitude).

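What is METAR?

METAR is a standard format for reporting routine surface weather observations from airports and other weather stations.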

Import useful libraries that (other) people have written

We need libraries to fetch URLs (urllib) and to parse XML documents (xml.etree.ElementTree). To make some nice plots of our data, we also need a plotting library (matplotlib).


In [ ]:
import urllib
import xml.etree.ElementTree as ET
from matplotlib import pyplot

Set the base URL for the Google Geocoding API, as given in its web documentation


In [ ]:
gcBaseURL='http://maps.googleapis.com/maps/api/geocode/xml?'

Populate an address in human readable form


In [ ]:
address = '3300 Mitchell Lane, Boulder, CO'
#address = 'South Pole'
#address = '1817 Pineapple Ave., Melbourne, FL'
#address = '708 N. Harvard Ave., Ventnor Heights, NJ'
print address

Our address needs to look more like what the Google API expects

  • No spaces in the address
  • Simple string manipulation

In [ ]:
address = address.replace(' ','+')
print address
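Replacing spaces is enough for this address, but other characters (such as '&' or '#') would still break a query string. As a minimal sketch, assuming the standard library's quote_plus is an acceptable substitute for the manual replace, the encoding could look like this. The try/except import is only there so the snippet runs on both Python 2 (used in this notebook) and Python 3.

```python
try:
    from urllib import quote_plus        # Python 2
except ImportError:
    from urllib.parse import quote_plus  # Python 3

address = '3300 Mitchell Lane, Boulder, CO'

# quote_plus turns spaces into '+' and percent-encodes other reserved characters
encoded = quote_plus(address)
print(encoded)
```

Note that quote_plus also encodes the commas as %2C, so the result differs slightly from the output of the simple replace.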

Now let's construct the URL

  • Set up the sensor component
  • Assemble the URL
  • Do a sanity check

In [ ]:
gcParameters='address=' + address + '&sensor=false'
gcURL = gcBaseURL + gcParameters
print gcURL
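Printing the URL is one sanity check. Another, sketched here under the assumption that the URL was assembled exactly as in the cell above, is to parse it back into its parts and confirm that the address survives the round trip. The import fallback covers both Python 2 and Python 3.

```python
try:
    from urlparse import urlparse, parse_qs      # Python 2
except ImportError:
    from urllib.parse import urlparse, parse_qs  # Python 3

# Same URL that the cell above builds for the Boulder address
gcURL = ('http://maps.googleapis.com/maps/api/geocode/xml?'
         'address=3300+Mitchell+Lane,+Boulder,+CO&sensor=false')

parts = urlparse(gcURL)
params = parse_qs(parts.query)  # decodes '+' back into spaces

print(parts.netloc)
print(params['address'][0])
print(params['sensor'][0])
```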

What was that? An XML document. Let's fetch it and extract the latitude and longitude


In [ ]:
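tree = ET.parse(urllib.urlopen(gcURL))
root = tree.getroot()

# Start with no coordinates so that a failed lookup is detectable
lat = lon = None
for result in root.findall('result'):
    for geometry in result.findall('geometry'):
        for location in geometry.findall('location'):
            lat = location.find('lat').text
            lon = location.find('lng').text

if lat is None:
    print 'No geocoding result was returned for this address'
else:
    print '[lat, lon] = [' + lat + ', ' + lon + ']'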
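The nested loops walk result, then geometry, then location. The same traversal can be tried offline on a small XML string. This is only a sketch: the sample document below is hypothetical, contains just the elements the loops rely on, and uses made-up coordinates rather than real API output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down response with illustrative coordinates
sample_xml = """<GeocodeResponse>
  <result>
    <geometry>
      <location>
        <lat>40.0</lat>
        <lng>-105.0</lng>
      </location>
    </geometry>
  </result>
</GeocodeResponse>"""

root = ET.fromstring(sample_xml)

# A path expression reaches the first matching location in one step
location = root.find('result/geometry/location')
if location is None:
    coords = None  # no result element was present
else:
    coords = (float(location.find('lat').text),
              float(location.find('lng').text))
print(coords)
```

Unlike the loops, which keep the last match, find() returns the first one, so the two approaches differ when several results come back.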

Now we need to construct a URL that mimics the cdmrf (CDM Remote Feature) form submission


In [ ]:
varkeys = ['req', 'variables', 'var','var','var','var', 'latitude', 'longitude', 'spatial', 'temporal', 'time', 'accept']
print varkeys

In [ ]:
varvals = ['data', 'some', 'air_pressure_at_sea_level','air_temperature','wind_from_direction','wind_speed',lat,lon,'point','point','2013-07-16T01%3A55%3A00Z','csv']
print varvals

In [ ]:
varreqs = []
for i in range(len(varkeys)):
    varreqs.append(varkeys[i] + '=' + varvals[i])
print varreqs
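Looping over indices works, but it depends on both lists having the same length. A more compact pairing can be sketched with zip; the shortened key and value lists here are a reduced stand-in for the full ones above.

```python
# Reduced stand-ins for the varkeys and varvals lists
varkeys = ['req', 'var', 'var', 'accept']
varvals = ['data', 'air_temperature', 'wind_speed', 'csv']

# zip silently stops at the shorter list, so check the lengths first
assert len(varkeys) == len(varvals)

varreqs = [key + '=' + val for key, val in zip(varkeys, varvals)]
print(varreqs)
```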

Let's put all of the form requests together

  • We need an '&' between each argument
  • We need to get rid of the trailing '&'

In [ ]:
querystr = ''
for varreq in varreqs:
    querystr += varreq + '&'
print querystr

In [ ]:
if querystr[-1] == '&':
    querystr = querystr[0:-1]
print querystr

Negative indexing is neat, but str.join is a better way to combine a list with a separator between the items


In [ ]:
querystr = "&".join(varreqs)
print querystr
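Joining by hand assumes every value is already URL-safe, which is why the time value above is written with %3A in place of ':'. As an alternative sketch, the standard library's urlencode accepts a list of (key, value) pairs, keeps repeated keys such as var, and does the percent-encoding itself. The shortened parameter list is an illustration, not the full request.

```python
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3

# A list of pairs (not a dict) so that the repeated 'var' key is preserved
pairs = [
    ('req', 'data'),
    ('var', 'air_temperature'),
    ('var', 'wind_speed'),
    ('time', '2013-07-16T01:55:00Z'),  # plain ':' here; urlencode escapes it
    ('accept', 'csv'),
]

querystr = urlencode(pairs)
print(querystr)
```

Values that are already percent-encoded should not be passed through urlencode, or the '%' itself gets encoded a second time.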

Construct our request to the THREDDS Data Server (TDS) using the expected base URL and our query string


In [ ]:
baseurl = 'http://thredds.ucar.edu/thredds/cdmrfeature/nws/metar/ncdecoded/Metar_Station_Data_fc.cdmr?'
url = baseurl + querystr
print url

Hit the URL and parse the data!


In [ ]:
output = urllib.urlopen(url)
lines = output.readlines()
if len(lines) < 2:
    print 'Only one line in output suggests no data was returned!'
else:
    names = lines[0].split(',')
    vals = lines[1].split(',')
    for i in range(0,len(names)):
        print names[i] + ' : ' + vals[i]

Can we fix that funny formatting issue?

  • Newlines are often a problem when parsing data
  • We need to strip the trailing newline, if it is there

In [ ]:
output = urllib.urlopen(url)
lines = output.readlines()
if len(lines) < 2:
    print 'Only one line in output suggests no data was returned!'
else:
    # Strip the trailing newline before splitting so that no field carries it
    names = lines[0].rstrip('\n').split(',')
    vals = lines[1].rstrip('\n').split(',')
    for i in range(0,len(names)):
        print names[i] + ' : ' + vals[i]
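Splitting on commas and stripping newlines by hand is fine for simple output. A possible alternative, sketched here with the standard library's csv module, takes care of line endings and quoted fields. The two sample lines are hypothetical stand-ins for the server response; the column names are made up for illustration.

```python
import csv

# Hypothetical header and data line, each ending in a newline as readlines() returns them
sample_lines = [
    'time,station,air_temperature\n',
    '2013-07-16T01:55:00Z,BJC,25.0\n',
]

# csv.reader accepts any iterable of lines and drops the line terminators
rows = list(csv.reader(sample_lines))
names, vals = rows[0], rows[1]

for name, val in zip(names, vals):
    print(name + ' : ' + val)
```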

Now for the obligatory time series...


In [ ]:
url = 'http://thredds.ucar.edu/thredds/cdmrfeature/nws/metar/ncdecoded/Metar_Station_Data_fc.cdmr?req=data&variables=some&var=air_temperature&north=&west=&east=&south=&latitude=&longitude=&spatial=stns&stn=BJC&temporal=all&time_start=2013-05-10T00%3A00%3A00Z&time_end=2013-07-10T00%3A00%3A00Z&time=2013-05-10T00%3A00%3A00Z&accept=csv'
output = urllib.urlopen(url)
t_air = []
for line in output.readlines():
    vals = line.split(",")
    # Careful: the first line is the CSV header, so float() fails on it
    t_air.append(float(vals[4]))
print t_air

In [ ]:
url = 'http://thredds.ucar.edu/thredds/cdmrfeature/nws/metar/ncdecoded/Metar_Station_Data_fc.cdmr?req=data&variables=some&var=air_temperature&north=&west=&east=&south=&latitude=&longitude=&spatial=stns&stn=BJC&temporal=all&time_start=2013-05-10T00%3A00%3A00Z&time_end=2013-07-10T00%3A00%3A00Z&time=2013-05-10T00%3A00%3A00Z&accept=csv'
output = urllib.urlopen(url)
t_air = []
lines = output.readlines()
line_iterator = iter(lines)
next(line_iterator) # extract and discard the first line
for line in line_iterator:
    vals = line.split(",")
    t_air.append(float(vals[4]))
print t_air
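Skipping the header fixes the first line, but a later row with a missing or non-numeric temperature would still stop the loop. A defensive variant is sketched below. The helper name parse_column is hypothetical, and the sample lines are made-up data that only assume the temperature sits in column 4, as in the cell above.

```python
def parse_column(lines, column):
    """Return the floats found in one CSV column, skipping the header row."""
    values = []
    for line in lines[1:]:  # lines[0] is the header
        fields = line.rstrip('\n').split(',')
        try:
            values.append(float(fields[column]))
        except (IndexError, ValueError):
            # Row is too short or the field is empty or non-numeric: skip it
            continue
    return values

# Made-up sample data; the second data row has no temperature
sample_lines = [
    'time,station,latitude,longitude,air_temperature\n',
    '2013-05-10T00:00:00Z,BJC,39.9,-105.1,12.0\n',
    '2013-05-10T01:00:00Z,BJC,39.9,-105.1,\n',
    '2013-05-10T02:00:00Z,BJC,39.9,-105.1,10.5\n',
]

t_air = parse_column(sample_lines, 4)
print(t_air)
```

Dropping bad rows keeps the plot simple, but it also hides gaps in the record, so the number of skipped rows may be worth counting in real use.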

In [ ]:
pyplot.plot(t_air)