Homework 11: PARKING TICKETS

1. I want to make sure my Plate ID is a string. Can't lose the leading zeroes!

2. I don't think anyone's car was built in 0AD. Discard the '0's as NaN.

3. I want the dates to be dates! Read the read_csv documentation to find out how to make pandas automatically parse dates.



In [1]:

    
import pandas as pd
dates=['Issue Date', 'Vehicle Expiration Date'] #Importing dates as datetime
col_types={'Plate ID': 'str','Date First Observed':'str'} #Importing Plate ID and the Date First Observed as a string, because it has to be made into a time by a function.



In [2]:

    
df=pd.read_csv("small-violations.csv",dtype=col_types,parse_dates=dates,na_values={'Date First Observed'==0,'Vehicle Expiration Date'==88888888,88888888.0,88880088},infer_datetime_format=True)









    



c:\users\harsha devulapalli\appdata\local\programs\python\python35\lib\site-packages\IPython\core\interactiveshell.py:2723: DtypeWarning: Columns (17,18,20,21,22,23,29,30,31,32,36,38,39) have mixed types. Specify dtype option on import or set low_memory=False.
  interactivity=interactivity, compiler=compiler, result=result)



In [3]:

    
df.dtypes #Finding out if our datatype import has worked or not.









    Out[3]:





Summons Number                                int64
Plate ID                                     object
Registration State                           object
Plate Type                                   object
Issue Date                           datetime64[ns]
Violation Code                              float64
Vehicle Body Type                            object
Vehicle Make                                 object
Issuing Agency                               object
Street Code1                                float64
Street Code2                                float64
Street Code3                                float64
Vehicle Expiration Date                      object
Violation Location                          float64
Violation Precinct                          float64
Issuer Precinct                             float64
Issuer Code                                 float64
Issuer Command                               object
Issuer Squad                                 object
Violation Time                               object
Time First Observed                          object
Violation County                             object
Violation In Front Of Or Opposite            object
House Number                                 object
Street Name                                  object
Intersecting Street                          object
Date First Observed                          object
Law Section                                 float64
Sub Division                                 object
Violation Legal Code                         object
Days Parking In Effect                       object
From Hours In Effect                         object
To Hours In Effect                           object
Vehicle Color                                object
Unregistered Vehicle?                       float64
Vehicle Year                                float64
Meter Number                                 object
Feet From Curb                              float64
Violation Post Code                          object
Violation Description                        object
No Standing or Stopping Violation           float64
Hydrant Violation                           float64
Double Parking Violation                    float64
dtype: object



In [4]:

    
df.columns #Identifying all the columns of the dataframe.









    Out[4]:





Index(['Summons Number', 'Plate ID', 'Registration State', 'Plate Type',
       'Issue Date', 'Violation Code', 'Vehicle Body Type', 'Vehicle Make',
       'Issuing Agency', 'Street Code1', 'Street Code2', 'Street Code3',
       'Vehicle Expiration Date', 'Violation Location', 'Violation Precinct',
       'Issuer Precinct', 'Issuer Code', 'Issuer Command', 'Issuer Squad',
       'Violation Time', 'Time First Observed', 'Violation County',
       'Violation In Front Of Or Opposite', 'House Number', 'Street Name',
       'Intersecting Street', 'Date First Observed', 'Law Section',
       'Sub Division', 'Violation Legal Code', 'Days Parking In Effect    ',
       'From Hours In Effect', 'To Hours In Effect', 'Vehicle Color',
       'Unregistered Vehicle?', 'Vehicle Year', 'Meter Number',
       'Feet From Curb', 'Violation Post Code', 'Violation Description',
       'No Standing or Stopping Violation', 'Hydrant Violation',
       'Double Parking Violation'],
      dtype='object')

4. "Date first observed" is a pretty weird column, but it seems like it has a date hiding inside. Using a function with .apply, transform the string (e.g. "20140324") into a Python date. Make the 0's show up as NaN.



In [5]:

    
## WRITING A FUNCTION TO CONVERT STRING INTO A DATE

def string_to_date(string):
    from dateutil import parser
    if pd.isnull(string):
        return None
    else:
        dt = parser.parse(string)
        return dt.date()



In [6]:

    
string_to_date('20160808') #Testing it out.









    Out[6]:





datetime.date(2016, 8, 8)



In [7]:

    
df['Date First Observed']= df['Date First Observed'].apply(string_to_date) #Applying it to the dataframe column

5. "Violation time" is... not a time. Make it a time.



In [8]:

    
def string_to_time(string): #CONVERTING THE VIOLATION TIME STRING INTO TIME
    from dateutil import parser
    import re
    if pd.isnull(string):
        return None
    if string =='Nan' or string =='nan' or string =='0':
        return None
    if string[0:4].isnumeric()==True:
        if int(string[0:2]) <=12 and int(string[2:4])<=59:
            regex=re.search(r"\d\d\d\d[AaPp]",string)
            if regex:
                time=string[0:2]+":"+string[2:4]+" "+string[4]+"M"
                return parser.parse(time).time()
            else:
                return None



In [9]:

    
def float_to_int(float): #THE VIOLATION CODE COLUMN IS ACTUALLY A FLOAT, SO WE'RE CONVERTING INTO AN INTEGER FIRST. 
    if pd.isnull(float):
        return None
    else:
         return int(float)



In [10]:

    
df['Violation Code']=df['Violation Code'].apply(float_to_int)



In [11]:

    
df['Violation Time']=df['Violation Time'].apply(string_to_time)



In [12]:

    
df['Violation Time'].head() #TESTING IF IT WORKED









    Out[12]:





0    07:52:00
1    12:40:00
2    12:43:00
3    14:32:00
4    12:39:00
Name: Violation Time, dtype: object

6, There sure are a lot of colors of cars, too bad so many of them are the same. Make "BLK" and "BLACK", "WT" and "WHITE", and any other combinations that you notice.



In [13]:

    
df['Vehicle Color'].value_counts() #LOOKING AT COLORS IN THE ORIGINAL DATASET









    Out[13]:





WHITE    149076
WH       132446
GY       123169
BK        96899
BLACK     71865
BL        46399
GREY      46237
BLUE      35237
SILVE     34335
BROWN     31801
RED       31768
RD        28920
GR        21184
GREEN     16673
TN        13939
BLK        9342
BR         8893
OTHER      8751
TAN        8421
YW         8144
GRAY       7620
GOLD       6546
WHT        5989
GL         5758
YELLO      5179
GRY        3521
MR         3114
WT         2286
BRN        2146
ORANG      1652
          ...  
NY            1
WITH          1
GR/BL         1
FLORI         1
GOLDT         1
DK-TN         1
PEAR          1
BURQU         1
GLW           1
BLUGR         1
LABLU         1
RUBY          1
BRWB          1
BRNZ          1
BLAKE         1
SW            1
GR/WH         1
BULE          1
CNY           1
NAVYB         1
METBL         1
NNHE          1
IL            1
TEN           1
B/R           1
CHAPI         1
GOL/S         1
SIK           1
FL            1
WHIOT         1
Name: Vehicle Color, dtype: int64



In [14]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace(['WHT', 'WH','WT','WHI'], 'WHITE')



In [15]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace(['GRAY', 'GY','GRY'], 'GREY')



In [16]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace(['BLK', 'BK'], 'BLACK')



In [17]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace('BL', 'BLUE')



In [18]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace(['BR', 'BN','BRWN'], 'BROWN')



In [19]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace('RD', 'RED')



In [20]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace(['GR', 'GN','GRN'], 'GREEN')



In [21]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace('TN', 'TAN')



In [22]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace('GL', 'GOLD')



In [23]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace('BRN', 'BROWN')



In [24]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace(['YW', 'YELLO','YELL'], 'YELLOW')



In [25]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace(['BL', 'BLU'], 'BLUE')



In [26]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace('MR', 'MAROON')



In [27]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace(['SIL', 'SILVR','SILVE','SILV'], 'SILVER')



In [28]:

    
df['Vehicle Color'] = df['Vehicle Color'].replace(['OR', 'ORANG'], 'ORANGE')



In [29]:

    
df['Vehicle Color'].value_counts() #LOOKING AT THE COLORS AFTER CLEANING UP, MUCH BETTER!









    Out[29]:





WHITE     290458
GREY      180547
BLACK     178106
BLUE       82534
RED        60688
BROWN      43890
GREEN      39536
SILVER     36460
TAN        22360
YELLOW     13520
GOLD       12304
OTHER       8751
MAROON      3114
ORANGE      2873
LTG         1325
SL          1214
PURPL        963
LT/          634
DKG          622
DK/          537
PR           476
DKB          358
BEIGE        272
BLG          257
W            255
LTGY         233
BRW          207
LAVEN        206
MAROO        191
NOC          166
           ...  
CHRYS          1
BLU/G          1
BKACK          1
BT             1
GRY/G          1
NY             1
GR/BL          1
FLORI          1
GOLDT          1
PEAR           1
NNHE           1
FL             1
SIK            1
GOL/S          1
LTW            1
RUBY           1
BRNZ           1
BLAKE          1
SW             1
GR/WH          1
BULE           1
CNY            1
NAVYB          1
BRWB           1
METBL          1
IL             1
TEN            1
B/R            1
CHAPI          1
WHIOT          1
Name: Vehicle Color, dtype: int64

7. Join the data with the Parking Violations Code dataset from the NYC Open Data site.



In [30]:

    
#READING PARKING VIOLATION CODES FROM NYC DATASET
codesdf=pd.read_csv("DOF_Parking_Violation_Codes.csv",dtype={'Manhattan\xa0 96th St. & below':int, 'All Other Areas': int})



In [31]:

    
codesdf.columns









    Out[31]:





Index(['CODE', 'DEFINITION', 'Manhattan  96th St. & below', 'All Other Areas'], dtype='object')



In [32]:

    
codesdf['Manhattan\xa0 96th St. & below'].value_counts() #CHECKING IF THE IMPORT WORKED WELL









    Out[32]:





65     36
115    29
95     11
60      2
100     1
180     1
50      1
165     1
265     1
0       1
Name: Manhattan  96th St. & below, dtype: int64



In [33]:

    
newdf=df.join(codesdf, on='Violation Code', how='left') #MERGING THE TWO DATAFRAMES INTO A NEW DATAFRAME Unimaginatively CALLED NEWDF



In [34]:

    
newdf.head() #DID IT WORK? SEEMS TO HAVE, THE NANS IN THE END OF THIS FRAME ARE THERE IN THE ORIGINAL ONE TOO.









    Out[34]:






  
    
      
      Summons Number
      Plate ID
      Registration State
      Plate Type
      Issue Date
      Violation Code
      Vehicle Body Type
      Vehicle Make
      Issuing Agency
      Street Code1
      ...
      Feet From Curb
      Violation Post Code
      Violation Description
      No Standing or Stopping Violation
      Hydrant Violation
      Double Parking Violation
      CODE
      DEFINITION
      Manhattan  96th St. & below
      All Other Areas
    
  
  
    
      0
      1283294138
      GBB9093
      NY
      PAS
      2013-08-04
      46.0
      SUBN
      AUDI
      P
      37250.0
      ...
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      62
      Standing or parking a vehicle beyond markings ...
      65.0
      45.0
    
    
      1
      1283294151
      62416MB
      NY
      COM
      2013-08-04
      46.0
      VAN
      FORD
      P
      37290.0
      ...
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      62
      Standing or parking a vehicle beyond markings ...
      65.0
      45.0
    
    
      2
      1283294163
      78755JZ
      NY
      COM
      2013-08-05
      46.0
      P-U
      CHEVR
      P
      37030.0
      ...
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      62
      Standing or parking a vehicle beyond markings ...
      65.0
      45.0
    
    
      3
      1283294175
      63009MA
      NY
      COM
      2013-08-05
      46.0
      VAN
      FORD
      P
      37270.0
      ...
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      62
      Standing or parking a vehicle beyond markings ...
      65.0
      45.0
    
    
      4
      1283294187
      91648MC
      NY
      COM
      2013-08-08
      41.0
      TRLR
      GMC
      P
      37240.0
      ...
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      57
      Parking a vehicle within the area designated a...
      65.0
      65.0
    
  

5 rows × 47 columns



In [35]:

    
newdf.columns #THE NEW COLUMNS SEEM TO HAVE BEEN ADDED









    Out[35]:





Index(['Summons Number', 'Plate ID', 'Registration State', 'Plate Type',
       'Issue Date', 'Violation Code', 'Vehicle Body Type', 'Vehicle Make',
       'Issuing Agency', 'Street Code1', 'Street Code2', 'Street Code3',
       'Vehicle Expiration Date', 'Violation Location', 'Violation Precinct',
       'Issuer Precinct', 'Issuer Code', 'Issuer Command', 'Issuer Squad',
       'Violation Time', 'Time First Observed', 'Violation County',
       'Violation In Front Of Or Opposite', 'House Number', 'Street Name',
       'Intersecting Street', 'Date First Observed', 'Law Section',
       'Sub Division', 'Violation Legal Code', 'Days Parking In Effect    ',
       'From Hours In Effect', 'To Hours In Effect', 'Vehicle Color',
       'Unregistered Vehicle?', 'Vehicle Year', 'Meter Number',
       'Feet From Curb', 'Violation Post Code', 'Violation Description',
       'No Standing or Stopping Violation', 'Hydrant Violation',
       'Double Parking Violation', 'CODE', 'DEFINITION',
       'Manhattan  96th St. & below', 'All Other Areas'],
      dtype='object')

8. How much money did NYC make off of parking violations?



In [36]:

    
newdf['Violation County'].value_counts() #LET US SEE THE VIOLATION COUNTY DATA, THIS NEEDS TO BE CLEANED UP A BIT









    Out[36]:





NY       392873
K        216562
Q        193612
BX        94041
R          9454
KINGS         1
RICH          1
BRONX         1
Name: Violation County, dtype: int64



In [37]:

    
newdf['Violation County'] = newdf['Violation County'].replace('BX', 'BRONX')



In [38]:

    
newdf['Violation County'] = newdf['Violation County'].replace(['R', 'RICH'], 'RICHMOND')



In [39]:

    
newdf['Violation County'] = newdf['Violation County'].replace('K', 'KINGS')



In [40]:

    
newdf['Violation County'].value_counts() #ALL CLEANED









    Out[40]:





NY          392873
KINGS       216563
Q           193612
BRONX        94042
RICHMOND      9455
Name: Violation County, dtype: int64

Now, we know the locations of the violation and from the merged dataset we know that there the fine rates vary upon the location of the violation, with certain rates for everything under 96th street in Manhattan and other rates for elsewhere. So using the Violation County numbers, and the assumption that NY county is everything under 96th - we calculate.



In [41]:

    
manhattanviolations=newdf[newdf['Violation County']=='NY']



In [42]:

    
#FINES COLLECTED IN MANHATTAN
manhattanviolations['Manhattan\xa0 96th St. & below'].sum()









    Out[42]:





38633190.0



In [43]:

    
allotherviolations=newdf[newdf['Violation County']!= 'NY']



In [44]:

    
#FINES NOT COLLECTED IN MANHATTAN

allotherviolations['All Other Areas'].sum()









    Out[44]:





48133905.0



In [45]:

    
#FINES COLLECTED IN ALL OF NYC
manhattanviolations['Manhattan\xa0 96th St. & below'].sum()+allotherviolations['All Other Areas'].sum()









    Out[45]:





86767095.0

9. What's the most lucrative kind of parking violation? The most frequent?



In [46]:

    
newdf['Manhattan\xa0 96th St. & below'].value_counts()
#IT IS EVIDENT THAT THE 265 DOLLARS IS THE MOST LUCRATIVE FINE in terms of monetary amount, but 115 DOlLARS IS THE MOST COMMON









    Out[46]:





115.0    482727
65.0     407620
95.0      59637
180.0     24322
165.0      6976
0.0        1857
60.0       1482
265.0      1227
100.0       533
50.0         37
Name: Manhattan  96th St. & below, dtype: int64



In [47]:

    
lucrative=newdf[newdf['Manhattan\xa0 96th St. & below']==265]



In [48]:

    
lucrative['DEFINITION'].value_counts()
#TRACTOR TRAILER PARKING SEEMS TO MAKE MONIES









    Out[48]:





Parking a tractor-trailer on a residential street between 9PM and 5AM.    1227
Name: DEFINITION, dtype: int64



In [49]:

    
frequent=newdf['DEFINITION'].value_counts()
#IDENTIFYING THE MOST FREQUENT SETS OF VIOLATIONS



In [50]:

    
frequent.head() #THE TOP FIVE SETS OF VIOLATIONS









    Out[50]:





Expired Meter -- parking in a metered space where the meter works and the time has ended. Drivers get a 5-minute grace period past the expired time on Alternate Side Parking signs and any other parking spaces with specific times listed (i.e.. 8:30am - 9:30am). During the 5-minute grace period, parking tickets cannot be issued.    140329
Standing or parking in a safety zone, between a safety zone and the nearest curb, or within 30 feet of points on the curb immediately opposite the ends of a safety zone.                                                                                                                                                                   125716
Standing at a commuter van stop, other than temporarily for the purpose of quickly picking up or dropping off passengers.                                                                                                                                                                                                                    93579
Stopping, standing or parking within an intersection.                                                                                                                                                                                                                                                                                        75484
Bus Lane: Standing or parking where standing is not allowed by sign, street marking or; traffic control device.                                                                                                                                                                                                                              66747
Name: DEFINITION, dtype: int64

10. New Jersey has bad drivers, but does it have bad parkers, too? How much money does NYC make off of all non-New York vehicles?



In [51]:

    
nonnyviolations=newdf[newdf['Registration State']!= 'NY']



In [52]:

    
nonnymanhattanviolations=nonnyviolations[nonnyviolations['Violation County']=='NY']



In [53]:

    
nonnymanhattanviolations['Manhattan\xa0 96th St. & below'].sum()
#MONEY MADE BY NON NY VEHICLES IN MANHATTAN









    Out[53]:





11523570.0



In [54]:

    
nonnymanhattanviolations=nonnyviolations[nonnyviolations['Violation County']!='NY']



In [55]:

    
nonnymanhattanviolations['All Other Areas'].sum()
#MONEY MADE BY NON NY VEHICLES OUTSIDE MANHATTAN









    Out[55]:





10324830.0



In [56]:

    
nonnymanhattanviolations['Manhattan\xa0 96th St. & below'].sum()+nonnymanhattanviolations['All Other Areas'].sum()
#TOTAL MONEY MADE BY NON NY VEHICLES









    Out[56]:





22184410.0

11. Make a chart of the top few.



In [57]:

    
import matplotlib.pyplot as plt

%matplotlib inline



In [58]:

    
plt.style.use('ggplot')
toptenviolations=nonnyviolations['Violation Code'].value_counts()
toptenviolations.head(10).plot(kind='bar')
#I USE THE VIOLATION CODE INSTEAD OF ACTUAL VIOLATION NAME BECAUSE THE NAMES OF SOME OF THEM ARE TOO LONG.









    Out[58]:





<matplotlib.axes._subplots.AxesSubplot at 0x18e04067b38>

We can then inspect what 21,38,14 violation codes are.

12. What time of day do people usually get their tickets? You can break the day up into several blocks - for example 12am-6am, 6am-12pm, 12pm-6pm, 6pm-12am.



In [59]:

    
newdf['Violation Time'].value_counts().plot(figsize=(16, 8))









    Out[59]:





<matplotlib.axes._subplots.AxesSubplot at 0x18e0404ada0>

From the graph above, it is very evident that most people get tickets in the 6 AM to 12 PM ie the morning slot. More than any other segments of the day.

13. What's the average ticket cost in NYC?



In [60]:

    
newdf['Manhattan\xa0 96th St. & below'].describe()
#AVERAGE COST OF TICKETS IN MANHATTAN is 95 DOLALRS









    



c:\users\harsha devulapalli\appdata\local\programs\python\python35\lib\site-packages\numpy\lib\function_base.py:3834: RuntimeWarning: Invalid value encountered in percentile
  RuntimeWarning)






    Out[60]:





count    986418.000000
mean         94.962425
std          28.902005
min           0.000000
25%                NaN
50%                NaN
75%                NaN
max         265.000000
Name: Manhattan  96th St. & below, dtype: float64



In [61]:

    
newdf['All Other Areas'].describe()
#AVERAGE COST OF TICKET ELSEWHERE IS 85 DOLLARS









    



c:\users\harsha devulapalli\appdata\local\programs\python\python35\lib\site-packages\numpy\lib\function_base.py:3834: RuntimeWarning: Invalid value encountered in percentile
  RuntimeWarning)






    Out[61]:





count    986418.000000
mean         84.716996
std          40.356772
min           0.000000
25%                NaN
50%                NaN
75%                NaN
max         265.000000
Name: All Other Areas, dtype: float64



In [62]:

    
import numpy
a=[84.716996,94.962425]
numpy.mean(a)
#SO THE AVERAGE TICKET COST IN NEW YORK CITY IS ABOUT 90 DOLLARS









    Out[62]:





89.839710499999995

14. Make a graph of the number of tickets per day.



In [63]:

    
ticketsperday=newdf['Issue Date'].value_counts()



In [64]:

    
ticketsperday.head(100).plot(figsize=(16, 8))









    Out[64]:





<matplotlib.axes._subplots.AxesSubplot at 0x18e000e08d0>

Looks like, the month of August is when the Police seem to have gone on a fining spree.

15. Make a graph of the amount of revenue collected per day.

<< Not too sure on how to get this done>>

16. Manually construct a dataframe out of https://dmv.ny.gov/statistic/2015licinforce-web.pdf (only NYC boroughts - bronx, queens, manhattan, staten island, brooklyn), having columns for borough name, abbreviation, and number of licensed drivers.



In [84]:

    
driversdf=pd.read_csv("drivers.csv")



In [85]:

    
driversdf.head(6)









    Out[85]:






  
    
      
      County
      Male
      Female
      Total
      Borough Abbreviation
    
  
  
    
      0
      BRONX
      259845
      197002
      456847
      BRONX
    
    
      1
      KINGS
      555917
      427852
      983769
      KINGS
    
    
      2
      NEW YORK
      399606
      348977
      748583
      NEW YORK
    
    
      3
      QUEENS
      631912
      502152
      1134064
      QUEENS
    
    
      4
      RICHMOND
      156339
      147911
      304250
      RICHMOND
    
    
      5
      NYC
      2003619
      1623894
      3627513
      NYC

17. What's the parking-ticket in dollars-per-licensed-driver in each borough of NYC? Do this with pandas and the dataframe you just made, not with your head!



In [91]:

    
#MAKING DATAFRAMES FOR EACH OF THESE BOROUGHS/COUNTIES
bronxviolations=newdf[newdf['Violation County']=='BRONX']
brooklynviolations=newdf[newdf['Violation County']=='KINGS']
manhattanviolations=newdf[newdf['Violation County']=='NY']
queensviolations=newdf[newdf['Violation County']=='Q']
statenislandviolations=newdf[newdf['Violation County']=='RICHMOND']



In [97]:

    
#GETTING THE TOTAL NUMBER OF DRIVERS FROM THE NEW DATA FRAME
numberofdriversinbronx=driversdf.iloc[0]['Total']
numberofdriversinbrooklyn=driversdf.iloc[1]['Total']
numberofdriversinmanhattan=driversdf.iloc[2]['Total']
numberofdriversinqueens=driversdf.iloc[3]['Total']
numberofdriversinstatenisland=driversdf.iloc[4]['Total']



In [98]:

    
bronxviolations['All Other Areas'].sum()/numberofdriversinbronx #AVERAGE PARKING TICKET PER DRIVER IN BRONX









    Out[98]:





14.477724489818254



In [99]:

    
brooklynviolations['All Other Areas'].sum()/numberofdriversinbrooklyn #AVERAGE PARKING TICKET PER DRIVER IN BROOKLYN









    Out[99]:





16.182579447004326



In [100]:

    
manhattanviolations['Manhattan\xa0 96th St. & below'].sum()/numberofdriversinmanhattan #AVERAGE PARKING TICKET PER DRIVER IN MANHATTAN









    Out[100]:





51.608425518613167



In [101]:

    
queensviolations['All Other Areas'].sum()/numberofdriversinqueens #AVERAGE PARKING TICKET PER DRIVER IN QUEENS









    Out[101]:





13.185256740360332



In [102]:

    
statenislandviolations['All Other Areas'].sum()/numberofdriversinstatenisland #AVERAGE PARKING TICKET PER DRIVER IN STATEN ISLAND









    Out[102]:





2.2674281018898932

	Summons Number	Plate ID	Registration State	Plate Type	Issue Date	Violation Code	Vehicle Body Type	Vehicle Make	Issuing Agency	Street Code1	...	Feet From Curb	Violation Post Code	Violation Description	No Standing or Stopping Violation	Hydrant Violation	Double Parking Violation	CODE	DEFINITION	Manhattan 96th St. & below	All Other Areas
0	1283294138	GBB9093	NY	PAS	2013-08-04	46.0	SUBN	AUDI	P	37250.0	...	NaN	NaN	NaN	NaN	NaN	NaN	62	Standing or parking a vehicle beyond markings ...	65.0	45.0
1	1283294151	62416MB	NY	COM	2013-08-04	46.0	VAN	FORD	P	37290.0	...	NaN	NaN	NaN	NaN	NaN	NaN	62	Standing or parking a vehicle beyond markings ...	65.0	45.0
2	1283294163	78755JZ	NY	COM	2013-08-05	46.0	P-U	CHEVR	P	37030.0	...	NaN	NaN	NaN	NaN	NaN	NaN	62	Standing or parking a vehicle beyond markings ...	65.0	45.0
3	1283294175	63009MA	NY	COM	2013-08-05	46.0	VAN	FORD	P	37270.0	...	NaN	NaN	NaN	NaN	NaN	NaN	62	Standing or parking a vehicle beyond markings ...	65.0	45.0
4	1283294187	91648MC	NY	COM	2013-08-08	41.0	TRLR	GMC	P	37240.0	...	NaN	NaN	NaN	NaN	NaN	NaN	57	Parking a vehicle within the area designated a...	65.0	65.0

	County	Male	Female	Total	Borough Abbreviation
0	BRONX	259845	197002	456847	BRONX
1	KINGS	555917	427852	983769	KINGS
2	NEW YORK	399606	348977	748583	NEW YORK
3	QUEENS	631912	502152	1134064	QUEENS
4	RICHMOND	156339	147911	304250	RICHMOND
5	NYC	2003619	1623894	3627513	NYC