A few months ago, the City of New York posted its Parking Violations data.
The data set has over six million observations, with fields such as vehicle
details (plate ID, plate type, registration state), when and where each
ticket was issued, and the violation county and precinct. Because
I live in Brooklyn, I was naturally curious to see how parking violations compared
across boroughs and where Brooklyn stood in particular. On a side note, I was
able to find an entry for the parking ticket that I got in September 2013 :)
The data was available as a 1.3 GB CSV file. I tried reading the file directly
into R, but for some reason it could not read beyond 450,000 rows. I first
processed the file separately to retain only the columns of interest, after
which I was able to import the whole file. The Issue
Date on this data ran from 1970 to way beyond 2014, but I have only analyzed
2013 data here, which was about 63% of the total. A few entries had incorrect
borough IDs, and those have been excluded here (keeping only BX, K, NY, Q, and
R). Since there were around 99 different violation codes, I have grouped
similar ones (e.g. bike lane, crosswalk, and sidewalk violations are grouped
together, as are all registration-related offences) to show the broad
violation categories. I have omitted violations with a negligible number of
observations.
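
The cleaning steps above were done in R. As a rough illustration only, here
is a minimal Python sketch (standard library only) of the same idea: stream
the CSV row by row, keep a few columns, keep only 2013 tickets with a valid
borough ID, and map violation codes to broad categories. The column names,
the MM/DD/YYYY date format, the violation codes, and the sample rows are all
assumptions made for the example, not values taken from the real file.

```python
import csv
import io
from collections import Counter

# Assumed column names; check them against the header of the actual file.
KEEP = ["Plate ID", "Registration State", "Issue Date",
        "Violation Code", "Violation County"]
VALID_BOROUGHS = {"BX", "K", "NY", "Q", "R"}  # borough IDs kept in the post

# Placeholder mapping from violation code to broad category.
# These codes are made up for illustration; they are not the real NYC codes.
CATEGORY_BY_CODE = {
    "901": "Bike lane / crosswalk / sidewalk",
    "902": "Bike lane / crosswalk / sidewalk",
    "903": "Registration",
}


def clean_rows(lines, year="2013"):
    """Yield trimmed rows for one year and for valid boroughs only.

    `lines` is any iterable of CSV text lines (an open file works), so the
    input is streamed instead of being loaded into memory all at once.
    Assumes Issue Date is formatted as MM/DD/YYYY.
    """
    for row in csv.DictReader(lines):
        if not row["Issue Date"].endswith("/" + year):
            continue  # outside the year being analyzed
        if row["Violation County"] not in VALID_BOROUGHS:
            continue  # incorrect borough ID
        slim = {col: row[col] for col in KEEP}
        slim["Category"] = CATEGORY_BY_CODE.get(row["Violation Code"], "Other")
        yield slim


def count_by_borough_and_category(rows):
    """Count tickets per (borough, category) pair."""
    return Counter((r["Violation County"], r["Category"]) for r in rows)


# Tiny made-up sample standing in for the 1.3 GB file.
SAMPLE = """Plate ID,Registration State,Plate Type,Issue Date,Violation Code,Violation County
AAA111,NY,PAS,09/15/2013,901,K
BBB222,NJ,PAS,09/16/2013,902,K
CCC333,NY,COM,01/02/2013,903,Q
DDD444,NY,PAS,07/04/1970,901,K
EEE555,NY,PAS,03/03/2013,903,ZZ
"""

counts = count_by_borough_and_category(clean_rows(io.StringIO(SAMPLE)))
for (borough, category), n in sorted(counts.items()):
    print(borough, category, n)
```

With a real file, `clean_rows(open(path, newline=""))` would take the place of
the in-memory sample, and the trimmed rows could be written back out with
`csv.DictWriter` before loading them into R.
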
To assess the total amount paid in fines ($) at the borough level, I obtained
the parking fine amount for each violation code from the NYC Department of
Finance website. The fines are categorized by location as “Manhattan below 96”
(areas below 96th St in Manhattan) and “All other areas”. Because there is no
significant difference between these categories, I have used the “All other
areas” amounts for this analysis. This data was matched to the parking
violation data using the violation code.
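
The matching step can be sketched as a simple lookup by violation code. This
is a hypothetical Python illustration, not the R code used for the analysis:
the codes, the dollar amounts, and the ticket list are invented placeholders,
and only the two location column names come from the fine schedule described
above.

```python
from collections import defaultdict

# Made-up fine schedule: placeholder codes and dollar amounts.
FINE_SCHEDULE = [
    {"code": "901", "Manhattan below 96": 100, "All other areas": 100},
    {"code": "902", "Manhattan below 96": 60, "All other areas": 50},
]

# Use the "All other areas" amount for every borough, as in the post.
fine_by_code = {row["code"]: row["All other areas"] for row in FINE_SCHEDULE}

# Made-up tickets as (borough, violation code) pairs.
TICKETS = [("K", "901"), ("K", "902"), ("NY", "902"), ("Q", "999")]


def total_fines_by_borough(tickets, fine_by_code):
    """Sum fine amounts per borough; count tickets whose code has no fine."""
    totals = defaultdict(int)
    unmatched = 0
    for borough, code in tickets:
        fine = fine_by_code.get(code)
        if fine is None:
            unmatched += 1  # code missing from the schedule; skip it
            continue
        totals[borough] += fine
    return dict(totals), unmatched


totals, unmatched = total_fines_by_borough(TICKETS, fine_by_code)
print(totals, unmatched)
```

Counting the unmatched tickets separately makes it easy to see how much of
the data falls outside the fine schedule instead of silently dropping it.
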
There are many interesting elements in this data. As next steps, for example,
I plan to study seasonality, if any, and the effect of weather on parking
violations by looking at temperature, precipitation, etc. Violations by type of
vehicle (agricultural, tractor, motorcycle, etc.) would also make for an
interesting analysis.