What is the likelihood of a gap?

Our goal is to develop a statistical model the tells us, for a given vessel, the chance that the satellites will fail to receive a message beacuse the density of vessels in the surrounding area is too high. We will then look for the very unlikely events, and classify these as events that were likely due to the AIS being shut off.

A few challenges:

  • The odds will change depending on the number of satellites in the sky, and how well the satellites are operating. Thus we'll have to modify the model depending on the day.
  • Satelites don't cover the globe uniformly. They will spend more time at different latitudes, meaning that there is likely going to be a function of latitude in this equation.
  • Some vessels appear to have weak AIS signals. It would be useful to identify the poorly operating devices.
  • AIS devices broadcast more frequently if a vessel is moving more quickly or changing its course. This has to be accounted for somehow.

Initial Assumptions about AIS Interference

If we assume that at a given time T, the satellite can hear S signals, each vessel broadcasts one signal, and there are N other vessels in the area. Then the chance that your vessels' signal will interfere with another signal, and thus be potentially lost, in this time T, is (1/C)^N. That is, if there are 500 channels, and there is only one other vessel around, the odds are about 1/500 that you'll both accidentally broadcast at the same time (AIS is, I think, smart about not broadcasting at the same time when the vessels are in line of sight with each other -- about 100km radius -- but the satellites see an area of hte earth ~1500km radius, so we can approximate that the vessels can't see each other.)

So, the odds that you'll be heard in a given time T are proportional to (1-1/C)^N.

If the satellite stayed direclty overhead, you'd see that positions/day = K (1-1/C)^N, where K is some constant, N is the number of vessels in the footprint of the satellite. But, the satellites move. And you can't just average the number of vess -- you have to average this function.

Sometimes the vessel you care about will be at the corner of the satellite's radius, and sometimes the satellite will be directly above the vessel. The total number of vessels seen by the satellite (N) could be very different in those two cases, and you can't just average them. So, we need to calculate the number of vessels the satellite sees at every position of its orbit, and frequently the satellite is in each position, and which positions the satellite can see a given vessel. For simplicity, let's say the satellite visits every lat and lon equally (later we can correct for latitude).

So, for each lat and lon, we calculate N, the number of vessels within the satellite radius. Then, for each lat and lon, we calculate (1-1/C)^n, and average this across the satellite radius (to get every position of the satellite that will see the vessels.

For vessels in a given location, then, we expect their number of positions in a day to be a binomial distribution.

The Frequency of AIS Brodcast

The movement of the vessel affects how frequently it will try to broadcast its location. Based on http://www.milltechmarine.com/faq.htm:

Class A: Ships Dynamic Conditions | Dual Channel Receiver | Single Channel Receiver
Ship at anchor or moored | 3 min | 6 min
SOG 0-14 knots | 10 sec | 20 sec
SOG 0-14 knots, changing course | 3.3 sec | 6.6 sec
SOG 14-23 knots | 6 sec | 12 sec
SOG 14-23 knots, changing course| 2 sec | 4 sec
SOG >23 knots | 2 sec | 4 sec
Ship Static Information | 6 min | 12 min

Class B:
Ships Dynamic Conditions | Dual Channel Receiver | Single Channel Receiver SOG < 2 knots | 3 min | 6 min
SOG > 2 knots | 10 sec | 20 sec
Ship Static Information | 6 min | 12 min

Message Types Class A: 1,2,3 Class B: 18,19

For a first cut, we will analyze the positions of a vessel that never goes slower than 2 knots if it is Class B, or is never 0 knots if it is Class A, as these messages are supposed to be ~20 to ~60 times less frequent.


In [2]:
import time
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from matplotlib import colors,colorbar
import matplotlib
%matplotlib inline
import csv 
import math
from math import radians, cos, sin, asin, sqrt
from scipy import stats
import math
import cPickle

In [ ]:
import bq  
client = bq.Client.Get()

In [182]:
def Query(q):
    t0 = time.time()
    answer = client.ReadTableRows(client.Query(q)['configuration']['query']['destinationTable'])
    print 'Query time: ' + str(time.time() - t0) + ' seconds.'
    return answer

In [138]:
def make_map(raster,title=None, colorbar_label=None, maximum=100, 
             minimum=1, scale_type ="log",cutoff=0,savefig=0, 
             png="temp.png", continent_color = '#111111' ):
    
    plt.rcParams["figure.figsize"] = [12,7]
    firstlat = 90-cutoff
    lastlat = -90+cutoff
    firstlon = -180
    lastlon = 180
    scale = cellsize
    vessel_days_truncated = raster[cutoff/cellsize:(180/cellsize)-cutoff/cellsize][:]

    numlats = int((firstlat-lastlat)/scale+.5)
    numlons = int((lastlon-firstlon)/scale+.5)
    lat_boxes = np.linspace(lastlat,firstlat,num=numlats,endpoint=False)
    lon_boxes = np.linspace(firstlon,lastlon,num=numlons,endpoint=False)
    fig = plt.figure()
    m = Basemap(llcrnrlat=lastlat, urcrnrlat=firstlat,
              llcrnrlon=lastlon, urcrnrlon=firstlon, lat_ts=0, projection='robin',resolution="h", lon_0=0)
    m.drawmapboundary(fill_color='#111111')
    # m.drawcoastlines(linewidth=.2)
    m.fillcontinents(continent_color,lake_color=continent_color)#, lake_color, ax, zorder, alpha)
    x = np.linspace(-180, 180, 360/cellsize)
    y = np.linspace(lastlat, firstlat, (firstlat-lastlat)/cellsize)
    x, y = np.meshgrid(x, y)
    converted_x, converted_y = m(x, y)
    from matplotlib import colors,colorbar
    if scale_type == 'log':
        norm = colors.LogNorm(vmin=minimum, vmax=maximum)
    else:
        norm = colors.Normalize(vmin=minimum, vmax=maximum)

    m.pcolormesh(converted_x, converted_y, vessel_days_truncated, norm=norm, vmin=minimum, vmax=maximum, cmap = plt.get_cmap('viridis'))
    if title == None:
        title = "A Raster of Some Sorts"
    plt.title(title, color = "#000000", fontsize=18)

    ax = fig.add_axes([0.2, 0.1, 0.4, 0.02]) 
    if scale_type == "log":
        lvls = np.logspace(np.log10(minimum),np.log10(maximum),num=8)
        cb = colorbar.ColorbarBase(ax,norm = norm, orientation='horizontal', ticks=lvls, cmap = plt.get_cmap('viridis'))
        the_labels = []
        for l in lvls:
            if l>=1:
                l = int(l)
            the_labels.append(l)
        cb.ax.set_xticklabels(the_labels, fontsize=10, color = "#000000")
    else:
        cb = colorbar.ColorbarBase(ax,norm = norm, orientation='horizontal', cmap = plt.get_cmap('viridis'))
        
    cb.set_label(colorbar_label,labelpad=-40, y=0.45, color = "#000000")
    ax.text(1.7, -0.5, 'Data Source: Orbcomm\nMap by Global Fishing Watch',
            verticalalignment='bottom', horizontalalignment='right',
            transform=ax.transAxes, color='#000000', fontsize=6)
    if savefig:
        plt.savefig(png,bbox_inches='tight',dpi=300,transparent=True,pad_inches=.1, facecolor="#000000")
    plt.show()


  File "<ipython-input-138-58039819f29a>", line 52
    cb.set_label(colorbar_label,labelpad=-40, y=0.45, color = "#000000")
     ^
IndentationError: expected an indented block

In [223]:
q = '''
SELECT
  integer(FLOOR(first_lat*10)) lat_bin,
  integer(FLOOR(first_lon*10)) lon_bin,
  integer(FLOOR(avg_lat*10)) lat_bin_avg,
  integer(FLOOR(avg_lon*10)) lon_bin_avg,
  satellite_positions sat_positions,
  terrestrial_positions terrestrial_positions,
  positions_weighted,
  avg_speed,
  slow_pings
FROM
(SELECT
  mmsi,
  SUM( CASE WHEN speed = 0 OR (speed<=2 AND type IN (18, 19)) THEN 180 
     WHEN (speed > 0 AND speed <14 AND type IN (1,2,3)AND turn = 0 )
         OR (speed>2 AND type IN (18,19)) THEN 10 
      when speed>0 and speed<14 and type in (1,2,3) and turn !=0 then 3.3
      when speed>=14 and speed<23 and type in (1,2,3) and turn = 0 then 6
      when type in (1,2,3) and (speed>=23 or (speed>=14 and turn !=0)) then 2
      END) positions_weighted,
   first(lat) first_lat,
   first(lon) first_lon,
   avg(lat) avg_lat,
   avg(lon) avg_lon,
   max(lat) max_lat,
   min(lat) min_lat,
   max(lon) max_lon,
   min(lon) min_lon,
   avg(speed) avg_speed,
   sum(if( (speed=0 and type in (1,2,3)) or (speed<2 and type in (18,19)),1,0 )) slow_pings,
   sum( if(REGEXP_REPLACE(tagblock_station, 'u', '') IN ('rORBCOMM000',
        'rORBCOMM01',
        'rORBCOMM008',
        'rORBCOMM009',
        'rORBCOMM010'),1,0)) terrestrial_positions,
   sum( if(REGEXP_REPLACE(tagblock_station, 'u', '') not IN ('rORBCOMM000',
        'rORBCOMM01',
        'rORBCOMM008',
        'rORBCOMM009',
        'rORBCOMM010'),1,0)) satellite_positions,       
FROM
  [pipeline_normalize.20150101]
WHERE
  type IN (1,2,3,18,19) and lat is not null and lon is not null and speed is not null and turn is not null
group by mmsi
)
  where
  max_lat - min_lat <5
  AND (max_lon - min_lon < 10
    OR first_lon > 170
    OR first_lon < -170)
  AND mmsi IN (select mmsi from
[scratch_david_gapanalysis.good_mmsi_2015_1000pings])
'''

positions = Query(q)


Waiting on bqjob_r7cb929094e60fce9_0000015343908a16_17 ... (11s) Current status: DONE   
Query time: 54.2345890999 seconds.

In [224]:
cPickle.dump(positions, open('../../data/density/20150101_v2_vessels.p', 'wb'))

In [3]:
positions = cPickle.load(open('../../data/density/20150101_v2_vessels.p', 'rb'))

In [4]:
len(positions)


Out[4]:
115560

In [6]:
# first calculate a raster of vessel locations from the query
cellsize = 2
num_lons = 360/cellsize
num_lats = 180/cellsize

vessels = np.zeros(shape=(num_lats,num_lons)) 
for row in positions:
    lat = int(row[0])
    lon = int(row[1])
    if lat<900 and lat>-900 and lon>-1800 and lon<1800:
        lat_index = (lat+900)/(cellsize*10)
        lon_index = (lon+1800)/(cellsize*10)
        vessels[lat_index][lon_index] += 1 # one vessel

In [7]:
vessels.max()


Out[7]:
6774.0

In [ ]:
make_map(vessels,title="Number of Vessels, Jan 1 2015",
             colorbar_label="Number of Vessels in each 2 by 2 degreee cell", maximum=vessels.max(), minimum=1)

In [8]:
# map this density

plt.rcParams["figure.figsize"] = [12,7]

cutoff = 0 # 5 degress away from the pole
firstlat = 90-cutoff
lastlat = -90+cutoff
firstlon = -180
lastlon = 180
scale = cellsize

vessel_days_truncated = vessels[cutoff/cellsize:(180/cellsize)-cutoff/cellsize][:]

numlats = int((firstlat-lastlat)/scale+.5)
numlons = int((lastlon-firstlon)/scale+.5)
    
lat_boxes = np.linspace(lastlat,firstlat,num=numlats,endpoint=False)
lon_boxes = np.linspace(firstlon,lastlon,num=numlons,endpoint=False)

fig = plt.figure()
m = Basemap(llcrnrlat=lastlat, urcrnrlat=firstlat,
          llcrnrlon=lastlon, urcrnrlon=firstlon, lat_ts=0, projection='robin',resolution="h", lon_0=0)

m.drawmapboundary(fill_color='#111111')
# m.drawcoastlines(linewidth=.2)
m.fillcontinents('#111111',lake_color='#111111')#, lake_color, ax, zorder, alpha)

x = np.linspace(-180, 180, 360/cellsize)
y = np.linspace(lastlat, firstlat, (firstlat-lastlat)/cellsize)
x, y = np.meshgrid(x, y)
converted_x, converted_y = m(x, y)
from matplotlib import colors,colorbar

maximum = 1000
minimum = .0001

norm = colors.LogNorm(vmin=minimum, vmax=maximum)
# norm = colors.Normalize(vmin=0, vmax=1000)

m.pcolormesh(converted_x, converted_y, vessel_days_truncated, norm=norm, vmin=minimum, vmax=maximum, cmap = plt.get_cmap('viridis'))

t = "Density of Vessels with AIS"
plt.title(t, color = "#000000", fontsize=18)

ax = fig.add_axes([0.2, 0.1, 0.4, 0.02]) #x coordinate , 
norm = colors.LogNorm(vmin=minimum, vmax=maximum)
# norm = colors.Normalize(vmin=0, vmax=1000)
lvls = np.logspace(np.log10(minimum),np.log10(maximum),num=8)
cb = colorbar.ColorbarBase(ax,norm = norm, orientation='horizontal', ticks=lvls, cmap = plt.get_cmap('viridis'))

the_labels = []
for l in lvls:
    if l>=1:
        l = int(l)
    the_labels.append(l)

#cb.ax.set_xticklabels(["0" ,round(m3**.5,1), m3, round(m3**1.5,1), m3*m3,round(m3**2.5,1), str(round(m3**3,1))+"+"], fontsize=10)
cb.ax.set_xticklabels(the_labels, fontsize=10, color = "#000000")
cb.set_label('Number of Vessels by 2x2 degree grid, Jan 1 2015',labelpad=-40, y=0.45, color = "#000000")

ax.text(1.7, -0.5, 'Data Source: Orbcomm\nMap by Global Fishing Watch',
        verticalalignment='bottom', horizontalalignment='right',
        transform=ax.transAxes,
        color='#000000', fontsize=6)

# plt.savefig("vessel_density_2015.png",bbox_inches='tight',dpi=300,transparent=True,pad_inches=.1, facecolor="#000000")
plt.show()



In [9]:
# Calculate the Number of Vessels Seen by the Satellite at Every Lat and Lon

grid_for_average = np.load('../../data/density/grid_for_average_2degree.npy')
sat_footprints = np.zeros(shape=(num_lats,num_lons)) # 2 by 2 grid

for i in range(num_lats):
    for j in range(num_lons):
        for item in grid_for_average[i][j]:
            if vessels[item[0]][item[1]] and item[2]>.5:
                sat_footprints[i][j] += vessels[item[0]][item[1]]

In [105]:
def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians 
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) 
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r

def distance_between_boxes(lat_box_0,lon_box_0,lat_box_1,lon_box_1):
    """
    Convert our lat and lon index to the lat and lon of the middle of
    a cell
    """
    lat1 = lat_box_0*cellsize-90+cellsize/2.
    lat2 = lat_box_1*cellsize-90+cellsize/2.
    lon1 = lon_box_0*cellsize-180+cellsize/2.
    lon2 = lon_box_1*cellsize-180+cellsize/2.
    return haversine(lon1, lat1, lon2, lat2)

In [122]:
# Now we have the problem of averaging on a sphere. There are geospatial
# functions to do this, but I'll just write my own here
# grid_for_average[lat_box][lon_box] is a list of indices representing
# the satellite's footprint when it is at an index lat_box, lon_box
# on the grid.

satellite_radius = 2800 # measured in km, ~1500 km
grid_for_average = [ [[] for i in range(num_lons)] for k in range(num_lats)]
for lat_box in range(num_lats):
    for lon_box in range(num_lons):
        # lat_above and lat_below are so that we don't have to 
        # loop through the entire grid
        lat_above = int(lat_box + satellite_radius/110. + 1)
        if lat_above > num_lats: lat_above = num_lats
        lat_below = int(lat_box - satellite_radius/110. - 1)
        if lat_below < 0: lat_below = 0
        for i in range(lat_below,lat_above):
            for k in range (num_lons):
                dist = distance_between_boxes(i,k,lat_box,lon_box)
                if dist < satellite_radius:
                    grid_for_average[lat_box][lon_box].append([i,k])

In [130]:
# Now we add count the number of vessels that are in the satellite's footprint
# when the satellite is at a given lat and lon
sat_footprints = np.zeros(shape=(num_lats,num_lons)) 
for i in range(num_lats):
    for j in range(num_lons):
        for item in grid_for_average[i][j]:
            sat_footprints[i][j] += vessels[item[0]][item[1]]

In [135]:
sat_footprints[0][0]


Out[135]:
16.0

In [136]:
sat_footprints.max()


Out[136]:
51931.0

In [139]:
make_map(sat_footprints,title="Number of Vessels Seen by Satellite \nat each Lat and Lon, Jan 1 2015",
             colorbar_label="Number of Vessels", maximum=sat_footprints.max(), minimum=1)



In [116]:
#now, let's map this


plt.rcParams["figure.figsize"] = [12,7]
cutoff = 0 # 5 degress away from the pole
firstlat = 90-cutoff
lastlat = -90+cutoff
firstlon = -180
lastlon = 180
scale = cellsize
vessel_days_truncated = sat_footprints[cutoff/cellsize:(180/cellsize)-cutoff/cellsize][:]
numlats = int((firstlat-lastlat)/scale+.5)
numlons = int((lastlon-firstlon)/scale+.5)
lat_boxes = np.linspace(lastlat,firstlat,num=numlats,endpoint=False)
lon_boxes = np.linspace(firstlon,lastlon,num=numlons,endpoint=False)
fig = plt.figure()
m = Basemap(llcrnrlat=lastlat, urcrnrlat=firstlat,
          llcrnrlon=lastlon, urcrnrlon=firstlon, lat_ts=0, projection='robin',resolution="h", lon_0=0)

m.drawmapboundary(fill_color='#111111')
# m.drawcoastlines(linewidth=.2)
m.fillcontinents('#111111',lake_color='#111111')#, lake_color, ax, zorder, alpha)

x = np.linspace(-180, 180, 360/cellsize)
y = np.linspace(lastlat, firstlat, (firstlat-lastlat)/cellsize)
x, y = np.meshgrid(x, y)
converted_x, converted_y = m(x, y)
from matplotlib import colors,colorbar

maximum = 100000
minimum = .01

norm = colors.LogNorm(vmin=minimum, vmax=maximum)
# norm = colors.Normalize(vmin=0, vmax=1000)

m.pcolormesh(converted_x, converted_y, vessel_days_truncated, norm=norm, vmin=minimum, vmax=maximum, cmap = plt.get_cmap('viridis'))

t = "Number of Vessels Seen by Satellite at each Lat and Lon"
plt.title(t, color = "#000000", fontsize=18)

ax = fig.add_axes([0.2, 0.1, 0.4, 0.02]) #x coordinate , 
norm = colors.LogNorm(vmin=minimum, vmax=maximum)
# norm = colors.Normalize(vmin=0, vmax=1000)
lvls = np.logspace(np.log10(minimum),np.log10(maximum),num=8)
cb = colorbar.ColorbarBase(ax,norm = norm, orientation='horizontal', ticks=lvls, cmap = plt.get_cmap('viridis'))

the_labels = []
for l in lvls:
    if l>=1:
        l = int(l)
    the_labels.append(l)

#cb.ax.set_xticklabels(["0" ,round(m3**.5,1), m3, round(m3**1.5,1), m3*m3,round(m3**2.5,1), str(round(m3**3,1))+"+"], fontsize=10)
cb.ax.set_xticklabels(the_labels, fontsize=10, color = "#000000")
cb.set_label('Number of Vessels Seen by Satellite at each Lat and Lon, Jan 1 2015',labelpad=-40, y=0.45, color = "#000000")

ax.text(1.7, -0.5, 'Data Source: Orbcomm\nMap by Global Fishing Watch',
        verticalalignment='bottom', horizontalalignment='right',
        transform=ax.transAxes,
        color='#000000', fontsize=6)

# plt.savefig("vessel_density_2015.png",bbox_inches='tight',dpi=300,transparent=True,pad_inches=.1, facecolor="#000000")
plt.show()



In [77]:
#estimate the pings effect for each point
sat_pings_footprint = np.zeros(shape=(num_lats,num_lons))
for i in range(num_lats):
    for j in range(num_lons):
        sat_pings_footprint[i][j] = (1-1/2500.)**sat_footprints[i][j]

In [79]:
#now, let's map the predicted positions

plt.rcParams["figure.figsize"] = [12,7]
cutoff = 0 # 5 degress away from the pole
firstlat = 90-cutoff
lastlat = -90+cutoff
firstlon = -180
lastlon = 180
scale = cellsize
vessel_days_truncated = sat_pings_footprint[cutoff/cellsize:(180/cellsize)-cutoff/cellsize][:] 
numlats = int((firstlat-lastlat)/scale+.5)
numlons = int((lastlon-firstlon)/scale+.5)
lat_boxes = np.linspace(lastlat,firstlat,num=numlats,endpoint=False)
lon_boxes = np.linspace(firstlon,lastlon,num=numlons,endpoint=False)
fig = plt.figure()
m = Basemap(llcrnrlat=lastlat, urcrnrlat=firstlat,
          llcrnrlon=lastlon, urcrnrlon=firstlon, lat_ts=0, projection='robin',resolution="h", lon_0=0)

m.drawmapboundary(fill_color='#111111')
# m.drawcoastlines(linewidth=.2)
m.fillcontinents('#111111',lake_color='#111111')#, lake_color, ax, zorder, alpha)

x = np.linspace(-180, 180, 360/cellsize)
y = np.linspace(lastlat, firstlat, (firstlat-lastlat)/cellsize)
x, y = np.meshgrid(x, y)
converted_x, converted_y = m(x, y)
from matplotlib import colors,colorbar

maximum = 1
minimum = 0

# norm = colors.LogNorm(vmin=minimum, vmax=maximum)
norm = colors.Normalize(vmin=minimum, vmax=maximum)

m.pcolormesh(converted_x, converted_y, vessel_days_truncated, norm=norm, vmin=minimum, vmax=maximum, cmap = plt.get_cmap('viridis'))

t = "Density of Vessels with AIS"
plt.title(t, color = "#000000", fontsize=18)

ax = fig.add_axes([0.2, 0.1, 0.4, 0.02]) #x coordinate , 
# norm = colors.LogNorm(vmin=minimum, vmax=maximum)
norm = colors.Normalize(vmin=minimum, vmax=maximum)
# lvls = np.logspace(np.log10(minimum),np.log10(maximum),num=8)
cb = colorbar.ColorbarBase(ax,norm = norm, orientation='horizontal', cmap = plt.get_cmap('viridis'))#, ticks=lvls,)

the_labels = []
for l in lvls:
    if l>=1:
        l = int(l)
    the_labels.append(l)

#cb.ax.set_xticklabels(["0" ,round(m3**.5,1), m3, round(m3**1.5,1), m3*m3,round(m3**2.5,1), str(round(m3**3,1))+"+"], fontsize=10)
# cb.ax.set_xticklabels(the_labels, fontsize=10, color = "#000000")
cb.set_label('Number of Vessels Seen by Satellite at each Lat and Lon, Jan 1 2015',labelpad=-40, y=0.45, color = "#000000")

ax.text(1.7, -0.5, 'Data Source: Orbcomm\nMap by Global Fishing Watch',
        verticalalignment='bottom', horizontalalignment='right',
        transform=ax.transAxes,
        color='#000000', fontsize=6)

# plt.savefig("vessel_density_2015.png",bbox_inches='tight',dpi=300,transparent=True,pad_inches=.1, facecolor="#000000")
plt.show()



In [78]:
#now, average again
estimated_positions = np.zeros(shape=(num_lats,num_lons))
density_averaged = np.zeros(shape=(num_lats,num_lons))
for i in range(num_lats):
    for j in range(num_lons):
        a = 0
        for item in grid_for_average[i][j]:
            if item[2]>.5:
                # crap, have to adjust by area 
                latitude = (i-45)*2+1 # i goes from 0 to 90, map to -91 to 89, every two degrees
                area = math.cos(latitude*3.1416/180.)
                estimated_positions[i][j] += sat_pings_footprint[item[0]][item[1]]*area
                density_averaged[i][j] +=  sat_footprints[item[0]][item[1]]/area
                a += area
        estimated_positions[i][j] = (500./a)*estimated_positions[i][j]

In [85]:
#now, let's map the predicted positions

plt.rcParams["figure.figsize"] = [12,7]
cutoff = 0 # 5 degress away from the pole
firstlat = 90-cutoff
lastlat = -90+cutoff
firstlon = -180
lastlon = 180
scale = cellsize
vessel_days_truncated = estimated_positions[cutoff/cellsize:(180/cellsize)-cutoff/cellsize][:]/2. #note this scale factor
numlats = int((firstlat-lastlat)/scale+.5)
numlons = int((lastlon-firstlon)/scale+.5)
lat_boxes = np.linspace(lastlat,firstlat,num=numlats,endpoint=False)
lon_boxes = np.linspace(firstlon,lastlon,num=numlons,endpoint=False)
fig = plt.figure()
m = Basemap(llcrnrlat=lastlat, urcrnrlat=firstlat,
          llcrnrlon=lastlon, urcrnrlon=firstlon, lat_ts=0, projection='robin',resolution="h", lon_0=0)

m.drawmapboundary(fill_color='#111111')
# m.drawcoastlines(linewidth=.2)
m.fillcontinents('#111111',lake_color='#111111')#, lake_color, ax, zorder, alpha)

x = np.linspace(-180, 180, 360/cellsize)
y = np.linspace(lastlat, firstlat, (firstlat-lastlat)/cellsize)
x, y = np.meshgrid(x, y)
converted_x, converted_y = m(x, y)
from matplotlib import colors,colorbar

maximum = 200
minimum = 0

# norm = colors.LogNorm(vmin=minimum, vmax=maximum)
norm = colors.Normalize(vmin=minimum, vmax=maximum)

m.pcolormesh(converted_x, converted_y, vessel_days_truncated, norm=norm, vmin=minimum, vmax=maximum, cmap = plt.get_cmap('viridis'))

t = "Predicted Positions per Day"
plt.title(t, color = "#000000", fontsize=18)

ax = fig.add_axes([0.2, 0.1, 0.4, 0.02]) #x coordinate , 
# norm = colors.LogNorm(vmin=minimum, vmax=maximum)
norm = colors.Normalize(vmin=minimum, vmax=maximum)
# lvls = np.logspace(np.log10(minimum),np.log10(maximum),num=8)
cb = colorbar.ColorbarBase(ax,norm = norm, orientation='horizontal', cmap = plt.get_cmap('viridis'))#, ticks=lvls,)

the_labels = []
for l in lvls:
    if l>=1:
        l = int(l)
    the_labels.append(l)

#cb.ax.set_xticklabels(["0" ,round(m3**.5,1), m3, round(m3**1.5,1), m3*m3,round(m3**2.5,1), str(round(m3**3,1))+"+"], fontsize=10)
# cb.ax.set_xticklabels(the_labels, fontsize=10, color = "#000000")
cb.set_label('Predicted Positions per Day',labelpad=-40, y=0.45, color = "#000000")

ax.text(1.7, -0.5, 'Data Source: Orbcomm\nMap by Global Fishing Watch',
        verticalalignment='bottom', horizontalalignment='right',
        transform=ax.transAxes,
        color='#000000', fontsize=6)

# plt.savefig("vessel_density_2015.png",bbox_inches='tight',dpi=300,transparent=True,pad_inches=.1, facecolor="#000000")
plt.show()



In [81]:
# what are the actual positions per day?
actual_positions = np.zeros(shape=(num_lats,num_lons))
number_of_vessels = np.zeros(shape=(num_lats,num_lons))

for row in positions:
    lat = int(row[2])
    lon_ave = int(row[3])
    lon_f = int(row[1])
    if abs(lon_f - lon_ave > 50): # use average, except near the dateline
        lon = lon_f
    else:
        lon = lon_ave
    sat_pos = int(row[4])
    all_pos = int(row[5])+int(row[4])
    speed = float(row[7])
    slow_pos = int(row[8])
    # must only be satellite positions
    if sat_pos == all_pos and lat<900 and lat>-900 and lon>-1800 and lon<1800 and speed>2 and slow_pos == 0:# and abs(lat)<300:
        lat_index = (lat+900)/(cellsize*10)
        lon_index = (lon+1800)/(cellsize*10)
        number_of_vessels[lat_index][lon_index] +=1
        actual_positions[lat_index][lon_index] += sat_pos
        
for i in range(num_lats):
    for j in range(num_lons):
        if number_of_vessels[i][j]:
            actual_positions[i][j] = float(actual_positions[i][j])/number_of_vessels[i][j]

In [104]:
#now, let's map the predicted positions

plt.rcParams["figure.figsize"] = [12,7]
cutoff = 0 # 5 degress away from the pole
firstlat = 90-cutoff
lastlat = -90+cutoff
firstlon = -180
lastlon = 180
scale = cellsize
vessel_days_truncated = actual_positions[cutoff/cellsize:(180/cellsize)-cutoff/cellsize][:] #note this scale factor
vessel_days_truncated[vessel_days_truncated==0] = np.nan
numlats = int((firstlat-lastlat)/scale+.5)
numlons = int((lastlon-firstlon)/scale+.5)
lat_boxes = np.linspace(lastlat,firstlat,num=numlats,endpoint=False)
lon_boxes = np.linspace(firstlon,lastlon,num=numlons,endpoint=False)
fig = plt.figure()
m = Basemap(llcrnrlat=lastlat, urcrnrlat=firstlat,
          llcrnrlon=lastlon, urcrnrlon=firstlon, lat_ts=0, projection='robin',resolution="h", lon_0=0)

m.drawmapboundary(fill_color='#111111')
# m.drawcoastlines(linewidth=.2)
m.fillcontinents('#111111',lake_color='#111111')#, lake_color, ax, zorder, alpha)

x = np.linspace(-180, 180, 360/cellsize)
y = np.linspace(lastlat, firstlat, (firstlat-lastlat)/cellsize)
x, y = np.meshgrid(x, y)
converted_x, converted_y = m(x, y)
from matplotlib import colors,colorbar

maximum = 300
minimum = 0

# norm = colors.LogNorm(vmin=minimum, vmax=maximum)
norm = colors.Normalize(vmin=minimum, vmax=maximum)

m.pcolormesh(converted_x, converted_y, vessel_days_truncated, norm=norm, vmin=minimum, vmax=maximum, cmap = plt.get_cmap('viridis'))

t = "Predicted Positions per Day"
plt.title(t, color = "#000000", fontsize=18)

ax = fig.add_axes([0.2, 0.1, 0.4, 0.02]) #x coordinate , 
# norm = colors.LogNorm(vmin=minimum, vmax=maximum)
norm = colors.Normalize(vmin=minimum, vmax=maximum)
# lvls = np.logspace(np.log10(minimum),np.log10(maximum),num=8)
cb = colorbar.ColorbarBase(ax,norm = norm, orientation='horizontal', cmap = plt.get_cmap('viridis'))#, ticks=lvls,)

the_labels = []
for l in lvls:
    if l>=1:
        l = int(l)
    the_labels.append(l)

#cb.ax.set_xticklabels(["0" ,round(m3**.5,1), m3, round(m3**1.5,1), m3*m3,round(m3**2.5,1), str(round(m3**3,1))+"+"], fontsize=10)
# cb.ax.set_xticklabels(the_labels, fontsize=10, color = "#000000")
cb.set_label('Predicted Positions per Day',labelpad=-40, y=0.45, color = "#000000")

ax.text(1.7, -0.5, 'Data Source: Orbcomm\nMap by Global Fishing Watch',
        verticalalignment='bottom', horizontalalignment='right',
        transform=ax.transAxes,
        color='#000000', fontsize=6)

# plt.savefig("vessel_density_2015.png",bbox_inches='tight',dpi=300,transparent=True,pad_inches=.1, facecolor="#000000")
plt.show()



In [97]:
predicted_pos = []
pos = []
for i in range(num_lats):
    for j in range(num_lons):
        if actual_positions[i][j] and number_of_vessels[i][j]>3:
            predicted_pos.append(estimated_positions[i][j]/2)
            pos.append(actual_positions[i][j])
plt.scatter(predicted_pos, pos,alpha = .2)

slope, intercept, r_value, p_value, std_err = stats.linregress(predicted_pos,pos)
print r_value, std_err, intercept
y = [slope*x + intercept for x in predicted_pos]

plt.plot(predicted_pos,y, "r")
plt.show()


0.784463191119 0.0159529802447 1.42078685304

In [103]:
#now... is there a relation???

den = []
predicted_pos = []
pos = []
lats = []
pos_weighted = []
count = 0

for row in positions:
    lat = int(row[2])
    lon_ave = int(row[3])
    lon_f = int(row[1])
    if abs(lon_f - lon_ave > 50): # use average, except near the dateline
        lon = lon_f
    else:
        lon = lon_ave
    sat_pos = int(row[4])
    all_pos = int(row[5])+int(row[4])
    speed = float(row[7])
    slow_pos = int(row[8])
    # must only be satellite positions
    if sat_pos == all_pos and ((lat>=0 and lat<450) or (lat<0 and lat>-450)) and lon>-1800 and lon<1800 and speed>2 and slow_pos == 0:# and abs(lat)<300:
        lat_index = (lat+900)/(cellsize*10)
        lon_index = (lon+1800)/(cellsize*10)
        if estimated_positions[lat_index][lon_index]/2 < 150:
            den.append(sat_footprints[lat_index][lon_index])
            predicted_pos.append(estimated_positions[lat_index][lon_index]/2)
            pos_weighted.append(float(row[6]))
            pos.append(sat_pos)
            lats.append(lat)


plt.rcParams["figure.figsize"] = [12,7]
plt.scatter(predicted_pos, pos,  alpha=.05)

slope, intercept, r_value, p_value, std_err = stats.linregress(predicted_pos,pos)
print r_value, std_err, intercept
y = [slope*x + intercept for x in predicted_pos]

plt.plot(predicted_pos,y, "r")

z = np.polyfit(predicted_pos,pos, 2)
p = np.poly1d(z)
plt.scatter(predicted_pos, p(predicted_pos))

plt.title("Satellite positions per day versus Predicted")
plt.xlabel('Predicted Positions)')
plt.ylabel('Actual Positions)')
plt.xlim([0,250])
plt.ylim([0,250])
plt.show()

# thetext = "slope: "+ str(round(slope,2))+"\nr_squared: "+str(round(r_value**2,2))+"\nstandard error: "+str(round(std_err,3))
# plt.scatter(x, ys, color = 'red')


0.751586537692 0.00699227127084 -4.33882084612

In [90]:
plt.scatter(predicted_pos, pos_weighted,  alpha=.05)
plt.title("Satellite positions per day versus Predicted")
plt.xlabel('Predicted Positions)')
plt.ylabel('Weighted Positions)')
slope, intercept, r_value, p_value, std_err = stats.linregress(predicted_pos, pos_weighted)
print r_value, std_err, intercept

y = [slope*x + intercept for x in predicted_pos]

plt.plot(predicted_pos,y, "r")
plt.xlim([0,250])
plt.ylim([0,2500])
plt.show()


0.733176304757 0.0517012749114 -15.474988504

In [ ]:
# create the density that is seen by satellites

def create_average_raster(filename):
    # load the file that helps with the satellite averaging
    grid_for_average = np.load(filename)
    avgs = np.zeros(shape=(num_lats,num_lons)) # 2 by 2 grid

    for i in range(num_lats):
        for j in range(num_lons):
            count = len(grid_for_average[i][j])
            total = 0
            for item in grid_for_average[i][j]:
                if vessels[item[0]][item[1]]:
                    total+= vessels[item[0]][item[1]]*(item[2]**2)
                    #total += np.log10(float(vessels[item[0]][item[1]]))#*item[2]
            avgs[i][j] = total/4.
            # divide by 4 because the satellite sees only 1/4 of the area that we're 
            #averaging over
            #avgs[i][j]=10**(float(total)/len(grid_for_average[i][j])) 

    return avgs

In [12]:
infile = '../../data/density/grid_for_average_2degree.npy'
averages = create_average_raster(infile)

In [ ]:
# make a chart of pings versus the average density seen by the satellite


den = []
pos = []
lats = []
pos_weighted = []
count = 0

for row in positions:
    lat = int(row[2])
    lon_ave = int(row[3])
    lon_f = int(row[1])
    if abs(lon_f - lon_ave > 50): # use average, except near the dateline
        lon = lon_f
    else:
        lon = lon_ave
    sat_pos = int(row[4])
    all_pos = int(row[5])+int(row[4])
    speed = float(row[7])
    slow_pos = int(row[8])
    # must only be satellite positions
    if sat_pos == all_pos and lat<900 and lat>-900 and lon>-1800 and lon<1800 and speed>2 and slow_pos == 0:# and abs(lat)<300:
        lat_index = (lat+900)/(cellsize*10)
        lon_index = (lon+1800)/(cellsize*10)
        if (density_averaged[lat_index][lon_index])<1000:
            den.append(density_averaged[lat_index][lon_index])
            pos.append(sat_pos)
            lats.append(lat)


plt.rcParams["figure.figsize"] = [12,7]

x = []
y = []
for d, p, l in zip(den, pos, lats):
    if p and d:# and d<10**.1:#abs(l)>0 and p and d:# and d>10**1:
        x.append(d)
        y.append(p)

fig = plt.figure()
ax = fig.add_subplot(1,1,1)

#     plt.scatter(np.log10(x), np.log10(y), alpha=.02
plt.scatter(x, np.log10(y), alpha=.02
           )#, color = color)

logA = x
logB = np.log10(y)
coefficients = np.polyfit(logA, logB, 1)
polynomial = np.poly1d(coefficients)
ys = polynomial(x)
slope, intercept, r_value, p_value, std_err = stats.linregress(logA, logB)

thetext = "slope: "+ str(round(slope,2))+"\nr_squared: "+str(round(r_value**2,2))+"\nstandard error: "+str(round(std_err,3))
plt.scatter(x, ys, color = 'red')

ax.text(2, 3.5, thetext, fontsize=10)

# plt.xlim([0,4.5])
plt.ylim([-.5,4])

# print slope
# print coefficients
# print r_value
# print p_value
#print slope,std_err,p_value, r_value**2

plt.title("Satellite positions per day versus density")
plt.xlabel('Log(vessels seen by satellite)')
plt.ylabel('Log(satellite positions)')

plt.show()

In [28]:
make_averages(averages,2700)



In [29]:
make_averages(averages_2000,2000)



In [30]:
make_averages(averages_1500,1500)



In [31]:
make_averages(averages_1000,1000)



In [32]:
make_averages(averages_500,500)



In [539]:
# map the vessel density seen by satellite

firstlat = 90
lastlat = -90
firstlon = -180
lastlon = 180
scale = cellsize

numlats = int((firstlat-lastlat)/scale+.5)
numlons = int((lastlon-firstlon)/scale+.5)
    
lat_boxes = np.linspace(lastlat,firstlat,num=numlats,endpoint=False)
lon_boxes = np.linspace(firstlon,lastlon,num=numlons,endpoint=False)

fig = plt.figure()
plt.clf()

m = Basemap(llcrnrlat=lastlat, urcrnrlat=firstlat,
          llcrnrlon=firstlon, urcrnrlon=lastlon, lat_ts=0, projection='mill',resolution="h")

m.drawmapboundary()
m.drawcoastlines(linewidth=.2)
#m.fillcontinents('#555555')# , lake_color, ax, zorder, alpha)

x = np.linspace(-180, 180, 360/cellsize )
y = np.linspace(lastlat, firstlat, (firstlat-lastlat)/cellsize)
x, y = np.meshgrid(x, y)
converted_x, converted_y = m(x, y)

from matplotlib import colors,colorbar

norm = colors.LogNorm(vmin=1, vmax=1000)

m3 = int(averages.max()**.3333)

m.pcolormesh(converted_x, converted_y, averages, norm=norm, vmin=1, vmax=m3**3)

t = "Vessels per Day Averaged over Satellite Footprint, Jan 1, 2015"
plt.title(t)

ax = fig.add_axes([0.15, 0.1, 0.4, 0.02]) 
norm = colors.Normalize(vmin=0, vmax=1000)
norm = colors.LogNorm(vmin=1, vmax=1000)
lvls = np.logspace(0,3,7)

cb = colorbar.ColorbarBase(ax,norm = norm, orientation='horizontal', ticks=lvls)

# cb.ax.set_xticklabels(["<1" ,int(m3**.5), m3, int(m3**1.5), m3*m3,int(m3**2.5), str(int(m3**3))+"+"], fontsize=10)


plt.savefig("satellite_footprint2by2_500b.png",bbox_inches='tight',dpi=450,transparent=True,pad_inches=0)

plt.rcParams["figure.figsize"] = [12,8]

plt.show()



In [310]:
minimum = 0
maximum = 100
plt.rcParams["figure.figsize"] = [6,3]

for i in range(18):

    x = []

    plt.clf()
    if i > 0:
        minimum = maximum
    maximum += 100*i
    
    less_than_5 = 0
    count = 0
    for p, d in zip (pos, den):
        if d>minimum and d<maximum and p<500:
            x.append(p)
            if p<10:
                less_than_5 += 1
            count+=1
    plt.title("Density between "+str(minimum)+" and "+str(maximum))     
    plt.xlabel("positions per day")
    plt.ylabel("number of vessels")
    plt.hist(x, bins=50)
    plt.xlim([0,200])
    plt.show()
    print "Density between "+str(minimum)+" and "+str(maximum) + " : ", int(100*less_than_5/count)


Density between 0 and 100 :  64
Density between 100 and 200 : 
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-310-f35dd4444196> in <module>()
     26     plt.xlim([0,200])
     27     plt.show()
---> 28     print "Density between "+str(minimum)+" and "+str(maximum) + " : ", int(100*less_than_5/count)

ZeroDivisionError: integer division or modulo by zero

In [13]:
# okay, let's look at only fishing vessels

q = '''
SELECT
  integer(FLOOR(a_first_lat*10)) lat_bin,
  integer(FLOOR(a_first_lon*10)) lon_bin,
  integer(FLOOR(a_avg_lat*10)) lat_bin_avg,
  integer(FLOOR(a_avg_lon*10)) lon_bin_avg,
  a_satellite_positions sat_positions,
  a_positions positions
FROM
  [scratch_david_gapanalysis.ave_locations_2015_with_density_v2]
WHERE
  a_date = "2015-01-01"
  AND a_max_lat - a_min_lat <5
  AND (a_max_lon - a_min_lon < 10
    OR a_first_lon > 170
    OR a_first_lon < -170)
  AND a_mmsi IN (select mmsi from
[scratch_david_mmsi_lists.Combinedfishing_2014]
  )
'''

positions_fishing_vessels = Query(q)


Waiting on bqjob_r6a16a99088a29707_000001530ad38aee_2 ... (7s) Current status: DONE   
Query time: 16.1007111073 seconds.

In [14]:
# now make a chart of pings versus position

den = []
pos = []
count = 0

for row in positions_fishing_vessels:
    lat = int(row[2])
    lon_ave = int(row[3])
    lon_f = int(row[1])
    if abs(lon_f - lon_ave > 50): # use average, except near the dateline
        lon = lon_f
    else:
        lon = lon_ave
    sat_pos = row[4]
    all_pos = row[5]
    count += 1
    # must only be satellite positions
    if sat_pos == all_pos and lat<900 and lat>-900 and lon>-1800 and lon<1800:
        lat_index = (lat+900)/(cellsize*10)
        lon_index = (lon+1800)/(cellsize*10)
        den.append(averages[lat_index][lon_index])
        pos.append(int(sat_pos))

In [15]:
x = den
y = pos

fig = plt.figure()
ax = fig.add_subplot(1,1,1)
# ax.set_yscale('log')
# ax.set_xscale('log')

plt.scatter(np.log10(x), np.log10(y), alpha=.1)#, color = color)
# plt.scatter(x1, y1, alpha=1)#, color = color)

logA = np.log10(x)
logB = np.log10(y)
coefficients = np.polyfit(logA, logB, 1)
polynomial = np.poly1d(coefficients)
ys = polynomial(np.log10(x))
slope, intercept, r_value, p_value, std_err = stats.linregress(logA, logB)

plt.scatter(np.log10(x), ys, color = 'red')


# print slope
# print coefficients
# print r_value
# print p_value
print slope,std_err,p_value

plt.title("Satellite positions per day Versus Density FISHING VESSELS")
plt.xlabel('Log(vessels seen by satellite)')
plt.ylabel('Log(satellite positions)')

plt.show()


-0.585306674036 0.0293862270318 3.75524846468e-79

In [16]:
minimum = 0
maximum = 100

for i in range(18):

    x = []

    plt.clf()
    if i > 1:
        minimum = maximum
    maximum += 100*i

    for p, d in zip (pos, den):
        if d>minimum and d<maximum and p<400:
            x.append(p)
    plt.title("Density between "+str(minimum)+" and "+str(maximum)+", FISHING VESSELS")     
    plt.xlabel("positions per day")
    plt.ylabel("number of vessels")
    plt.hist(x, bins=50)
    plt.show()


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-3feb559eade3> in <module>()
     17     plt.xlabel("positions per day")
     18     plt.ylabel("number of vessels")
---> 19     plt.hist(x, bins=50)
     20     plt.show()

/Library/Python/2.7/site-packages/matplotlib-override/matplotlib/pyplot.pyc in hist(x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, hold, **kwargs)
   2894                       histtype=histtype, align=align, orientation=orientation,
   2895                       rwidth=rwidth, log=log, color=color, label=label,
-> 2896                       stacked=stacked, **kwargs)
   2897         draw_if_interactive()
   2898     finally:

/Library/Python/2.7/site-packages/matplotlib-override/matplotlib/axes/_axes.pyc in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
   5595         flat = np.ravel(x)
   5596         if len(flat) == 0:
-> 5597             raise ValueError("x must have at least one data point")
   5598         elif len(flat) == 1 and not binsgiven:
   5599             raise ValueError(

ValueError: x must have at least one data point

In [ ]:
# okay, let's look at averages

q = '''
SELECT
  integer(FLOOR(a_avg_lat*10)) lat_bin_avg,
  integer(FLOOR(a_avg_lon*10)) lon_bin_avg,
  avg(a_satellite_positions) sat_positions,
  count(*)
  FROM
  [scratch_david_gapanalysis.ave_locations_2015_with_density_v2]
WHERE
  a_positions = a_satellite_positions
  and a_date = "2015-01-01"
  AND a_max_lat - a_min_lat <5
  AND (a_max_lon - a_min_lon < 10)
  AND a_mmsi IN (select mmsi from
[scratch_david_mmsi_lists.Combinedfishing_2014])
GROUP BY lat_bin_avg, lon_bin_avg
'''

positions_fishing_vessels_avgs = Query(q)

In [ ]:
# now make a chart of pings versus position

den = []
pos = []

for row in positions_fishing_vessels_avgs:
    lat = int(row[0])
    lon = int(row[1])
    sat_pos = row[2]
    number = int(row[3])
    # must only be satellite positions
    if lat<900 and lat>-900 and lon>-1800 and lon<1800:
        lat_index = (lat+900)/(cellsize*10)
        lon_index = (lon+1800)/(cellsize*10)
        den.append(averages[lat_index][lon_index])
        pos.append(float(sat_pos))

In [ ]:
x = den
y = pos

fig = plt.figure()
ax = fig.add_subplot(1,1,1)
# ax.set_yscale('log')
# ax.set_xscale('log')

plt.scatter(np.log10(x), np.log10(y), alpha=.1)#, color = color)
# plt.scatter(x1, y1, alpha=1)#, color = color)

logA = np.log10(x)
logB = np.log10(y)
coefficients = np.polyfit(logA, logB, 1)
polynomial = np.poly1d(coefficients)
ys = polynomial(np.log10(x))
slope, intercept, r_value, p_value, std_err = stats.linregress(logA, logB)

plt.scatter(np.log10(x), ys, color = 'red')


# print slope
# print coefficients
# print r_value
# print p_value
print slope,std_err,p_value

plt.title("AVERAGE Satellite positions per day Versus Density FISHING VESSELS")
plt.xlabel('Log(vessels seen by satellite)')
plt.ylabel('Log(satellite positions)')

plt.show()

In [26]:
math.cos(-89*3.14/180.)


Out[26]:
0.018239759727164722

In [114]:
def hello(a,b=None):
    if b:
        print "what"
    else:
        print a


hello(a="yes")


yes

In [ ]: