Using the dataset available here, complete the following:
Background: The United States Environmental Protection Agency (EPA) reports Entero counts as colonies (or cells) per 100 mL of water. The federal standard for unacceptable water quality is a single sample value greater than 110 Enterococcus/100mL, or five or more samples with a geometric mean (the nth root of the product of the n values) greater than 30 Enterococcus/100mL.
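As a quick illustration of the two thresholds (a minimal sketch with made-up readings, not values from the dataset):
from scipy.stats.mstats import gmean
# Hypothetical Entero readings from one site (Enterococcus/100mL)
samples = [20, 25, 35, 40, 50]
print max(samples) > 110                          # single-sample rule: not violated here
print len(samples) >= 5 and gmean(samples) > 30   # geometric mean is ~32.3, so this rule is violated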
In [190]:
# Import modules for analysis and visualization
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats.mstats import gmean
import numpy as np
from numpy import genfromtxt
import os.path
from datetime import datetime
In [198]:
# Clean up and import (pandas)
data_url = "https://raw.githubusercontent.com/jlaurito/CUNY_IS608/master/lecture4/data/riverkeeper_data_2013.csv"
raw_data = pd.read_csv(data_url)
# info like what you get with str() in R
raw_data.dtypes
Out[198]:
In [199]:
# look at the first 25 rows
# remember the first row is the colnames
raw_data.head(25)
Out[199]:
In [200]:
# Adjust the data types to make the data easier to work with
raw_data["Site"] = raw_data["Site"].astype('category') # like facets
raw_data["Date"] = pd.to_datetime(raw_data["Date"]) # dates as dates
# We need to remove the greater than and less than symbols and treat "EnteroCount" as an integer -- 'int64'
raw_data["EnteroCount"] = raw_data["EnteroCount"].str.lstrip('><')
raw_data["EnteroCount"] = raw_data["EnteroCount"].astype('int64')
In [201]:
# Check to make sure the changes are correct
raw_data.dtypes
Out[201]:
In [202]:
raw_data.head(20) # there are no more < or > symbols
Out[202]:
In [7]:
# Create a column showing whether or not the water quality is acceptable at a given site
# Unacceptable means either:
# a single sample value > 110 Enterococcus/100mL, OR
# five or more samples with a geometric mean > 30 Enterococcus/100mL
# The prompt is vague about how to apply the geometric mean condition, so I used gmean() on the
# five highest readings per site. Combined with the single-sample > 110 rule, this flags
# nearly every site as having unacceptable water quality at some point
raw_data['swim'] = np.where(
(raw_data.groupby('Site').EnteroCount.transform(max) > 110)
| (raw_data.groupby('Site').EnteroCount.transform(lambda group: gmean(group.nlargest(5))) > 30),
'unacceptable', 'acceptable')
raw_data.head(10)
Out[7]:
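A quick check of that claim (a small sketch reusing the swim column just created):
# Count how many sites are flagged acceptable vs. unacceptable
print raw_data.groupby('Site')['swim'].first().value_counts()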
In [18]:
# Since the above criteria flag nearly every site, they aren't useful for ranking. Instead,
# I grouped the data by site and calculated the mean EnteroCount for each site
mean_entero = raw_data.groupby(['Site'])['EnteroCount'].mean().sort_values()
In [19]:
# Ten best places to swim (lowest mean EnteroCount)
print "Ten best places to swim"
print "+++++++++++++++++++++++"
mean_entero.head(10)
Out[19]:
In [20]:
# Ten worst places to swim (highest mean EnteroCount)
print "Ten worst places to swim"
print "++++++++++++++++++++++++"
mean_entero.tail(10).sort_values(ascending=False)
Out[20]:
In [21]:
# Write the results to a csv and read them back in for more flexible use (now and later)
mean_entero.to_csv('mean_entero.csv', header=True)  # keep the header so read_csv doesn't consume the first site as column names
In [22]:
mean_entero1 = pd.read_csv('mean_entero.csv')
mean_entero1.columns = ['Site','Average_Entero_Count']
mean_entero1.head() #make sure it looks right
Out[22]:
In [164]:
# Plot the results
%matplotlib inline
sns.set_style("whitegrid")
sns.set_style("ticks")
plt.figure(figsize=(10, 20))
ax = sns.barplot(x ="Average_Entero_Count", y="Site", data=mean_entero1, palette="Blues_d")
ax.set(xlabel='Average Entero Count', ylabel='Site')
ax.set_title('Waterways ranked by Average Entero Count')
sns.despine()
# Don't go swimming in the Gowanus Canal
In [68]:
# Are we looking for consistency or frequency in testing?
# Note to self - learn more about plotting time series data
# Determine the most recent reading for each site (might be useful?)
maxDate = raw_data.groupby(by=["Site"])["Date"].max()
maxDate.sort_values(ascending=False, inplace=True)
# what are our most FREQUENTLY tested Sites?
test_Site_counts = raw_data.groupby(by=["Site"])["Date"].count()
test_Site_counts.sort_values(ascending=False, inplace=True)
# Print out results
print "Most recently tested site dates"
print "++++++++++++++++++++++++"
print maxDate.head(15)
print "\n"
print "Most frequently tested sites"
print "++++++++++++++++++++++++"
print test_Site_counts.head(15)
In [94]:
# Consistency is more important
# Figure out how many days elapsed between readings and calculate average by group.
# This gives us a general sense of the regularity of testing
# After lots of cursing and fiddling!
lag_date_data = (raw_data.groupby('Site')['Date']
                 .apply(lambda x: x.diff().mean())
                 .reset_index()
                 .rename(columns={'Date': 'Mean_lag_days'}))
lag_date_data.head()
Out[94]:
In [96]:
# write it to a csv to make it more flexible
lag_date_data.to_csv( 'lag_data.csv' )
In [113]:
# Clean up the csv for plotting: the mean lag comes back from the csv as a Timedelta string,
# so split it on whitespace and keep only the leading day count
lag_data = pd.read_csv('lag_data.csv')
lag_data.columns = ['index','Site','mean_test_lag']
lag_data_num_col = pd.DataFrame(lag_data['mean_test_lag'].str.split().tolist(), columns=['mean_lag_days','junk','more_junk'])
mean_lag_data = pd.concat([lag_data, lag_data_num_col], axis=1)
del mean_lag_data['index']
del mean_lag_data['mean_test_lag']
del mean_lag_data['junk']
del mean_lag_data['more_junk']
# convert mean column to numeric type
mean_lag_data["mean_lag_days"] = pd.to_numeric(mean_lag_data["mean_lag_days"])
# dates run newest to oldest within each site, so the mean diff is negative -- take the absolute value
mean_lag_data['mean_lag_days'] = mean_lag_data['mean_lag_days'].abs()
mean_lag_data.head(10)
Out[113]:
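As an aside, the CSV round-trip and string splitting above could be skipped by converting the Timedelta column directly (a sketch, assuming lag_date_data from In [94] is still in memory; alt_lag is just an illustrative name):
# Convert the per-site mean Timedelta straight to positive fractional days
alt_lag = lag_date_data.copy()
alt_lag['mean_lag_days'] = alt_lag['Mean_lag_days'].abs().dt.total_seconds() / (24 * 3600)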
In [114]:
print mean_lag_data.dtypes
In [123]:
# Get a general sense of the spread - on average how often are the sites tested
%matplotlib inline
plt.figure(figsize=(4.5, 10))
sns.violinplot(mean_lag_data, palette="colorblind")
sns.despine()
In [162]:
# Subset a random sample of sites to compare and plot
# Random ~10% of the sites, sampled without replacement so no site appears twice
sub_lag_data = mean_lag_data.sample(frac=0.1, replace=False)
%matplotlib inline
plt.figure(figsize=(10, 8))
ax = sns.barplot(x ="mean_lag_days", y="Site", data=sub_lag_data, palette="Blues_d")
ax.set(xlabel='Mean Lag Time', ylabel='Site')
ax.set_title('Mean Lag Time Between Tests in Days')
sns.despine()
# From the plot we can see there is a huge range in average lag between tests!
In [203]:
# Cleanup so we're using just the entero count and rain totals
del raw_data['Site']
del raw_data['Date']
del raw_data['SampleCount']
raw_data.head()
Out[203]:
In [253]:
%matplotlib inline
# Scatterplot to show relationship between rainfall and Entero Count
plt.figure(figsize=(7.5, 6.5))
sns.regplot('FourDayRainTotal', 'EnteroCount', data=raw_data, fit_reg=False, x_jitter=1)
sns.despine()
plt.ylim(0)
plt.xlim(0)
# There seems to be a noticeable relationship between rainfall and water quality once
# rainfall goes beyond ~3 inches: past that point there are far fewer high EnteroCount readings
Out[253]:
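To put a rough number on that visual impression (a minimal sketch, assuming raw_data now holds only the EnteroCount and FourDayRainTotal columns as above):
# Rank correlation between four-day rain totals and Entero counts
from scipy.stats import spearmanr
rho, p = spearmanr(raw_data['FourDayRainTotal'], raw_data['EnteroCount'])
print rho, p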
In [227]:
# Compare any two locations with the following cells
# Re-read and clean up the data (Site, Date and SampleCount were deleted from raw_data above)
data_url = "https://raw.githubusercontent.com/jlaurito/CUNY_IS608/master/lecture4/data/riverkeeper_data_2013.csv"
site_rain_data = pd.read_csv(data_url)
site_rain_data["Site"] = site_rain_data["Site"].astype('category') # like facets
site_rain_data["Date"] = pd.to_datetime(site_rain_data["Date"]) # dates as dates
site_rain_data["EnteroCount"] = site_rain_data["EnteroCount"].str.lstrip('><')
site_rain_data["EnteroCount"] = site_rain_data["EnteroCount"].astype('int64')
site_rain_data.head()
Out[227]:
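Since this repeats the cleanup from the top of the notebook, those steps could be wrapped in a small helper (a sketch; load_riverkeeper is a hypothetical name, not something defined elsewhere in the notebook):
def load_riverkeeper(url):
    # Read the raw csv and apply the same type fixes used above
    df = pd.read_csv(url)
    df["Site"] = df["Site"].astype('category')
    df["Date"] = pd.to_datetime(df["Date"])
    df["EnteroCount"] = df["EnteroCount"].str.lstrip('><').astype('int64')
    return df

# site_rain_data = load_riverkeeper(data_url)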
In [283]:
site1 = raw_input('enter your first site --> ') # Gowanus Canal <- worst
In [284]:
site2 = raw_input('enter your second site --> ') # Croton Point Beach <- best water
In [285]:
# Gowanus is dirtiest, Croton Point Beach is cleanest
x = site_rain_data.loc[site_rain_data['Site'] == site1]
y = site_rain_data.loc[site_rain_data['Site'] == site2]
frames = [x, y]
result = pd.concat(frames)
del result['Date']
del result['SampleCount']
result.head()
Out[285]:
In [287]:
# lmplot manages its own FacetGrid figure, so an explicit plt.figure() call is not needed here
sns.lmplot('FourDayRainTotal', 'EnteroCount', data=result, hue='Site', palette="Set2",
           fit_reg=False, scatter_kws={"s": 50}, legend=False)
plt.legend(loc='upper right')
sns.despine()
plt.ylim(0)
plt.xlim(0)
Out[287]: