Share the Insight

There are two main insights we want to communicate.

  • Bangalore is the largest market for Onion Arrivals.
  • Onion price variation has increased in recent years.

Let us explore how we can communicate these insights visually.

Preprocessing to get the data


In [2]:
# Import the libraries we need: dplyr and ggplot2
library(dplyr)
library(ggplot2)

In [3]:
# Read the CSV file of month-wise quantity and price data
df <- read.csv('MonthWiseMarketArrivals_Clean.csv')

In [4]:
str(df)


'data.frame':	10320 obs. of  10 variables:
 $ market  : Factor w/ 122 levels "ABOHAR(PB)","AGRA(UP)",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ month   : Factor w/ 12 levels "April","August",..: 5 5 5 5 5 5 5 5 4 4 ...
 $ year    : int  2005 2006 2010 2011 2012 2013 2014 2015 2005 2006 ...
 $ quantity: int  2350 900 790 245 1035 675 440 1305 1400 1800 ...
 $ priceMin: int  404 487 1283 3067 523 1327 1025 1309 286 343 ...
 $ priceMax: int  493 638 1592 3750 686 1900 1481 1858 365 411 ...
 $ priceMod: int  446 563 1460 3433 605 1605 1256 1613 324 380 ...
 $ city    : Factor w/ 119 levels "ABOHAR","AGRA",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ state   : Factor w/ 22 levels "AP","ASM","BHR",..: 17 17 17 17 17 17 17 17 17 17 ...
 $ date    : Factor w/ 243 levels "1996-01-01","1996-02-01",..: 109 121 169 181 193 205 217 229 110 122 ...

In [5]:
# Fix the date
df$date = as.Date(as.character(df$date), "%Y-%m-%d")

In [6]:
# Get the data for year 2015 and sort
df2015City <- df %>% 
          filter(year == 2015) %>%
          group_by(city) %>%
          summarize(quantity_year = sum(quantity)) %>%
          arrange(desc(quantity_year))

In [7]:
head(df2015City)


Out[7]:
        city quantity_year
1  BANGALORE       8267060
2     MAHUVA       5113510
3    SOLAPUR       4162041
4       PUNE       3591209
5  LASALGAON       3581359
6 PIMPALGAON       3455265
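
The pipeline above follows a common dplyr pattern: filter rows, group, aggregate, then sort. Here is a minimal sketch of the same pattern on made-up numbers. The `toy` data frame and its values are invented for illustration and are not taken from the onion dataset.

```r
library(dplyr)

# Toy data: values are invented purely for illustration
toy <- data.frame(
  city     = c("A", "A", "B", "B", "C"),
  year     = c(2015, 2015, 2015, 2014, 2015),
  quantity = c(10, 20, 50, 99, 5),
  stringsAsFactors = FALSE
)

toy2015 <- toy %>%
  filter(year == 2015) %>%                      # keep only the 2015 rows
  group_by(city) %>%                            # one group per city
  summarize(quantity_year = sum(quantity)) %>%  # total quantity per city
  arrange(desc(quantity_year))                  # largest total first

print(toy2015)
```

The 2014 row for city B is dropped by the filter before the totals are computed, so it does not contribute to B's sum.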

In [8]:
df2015City$city <- as.character(df2015City$city)
str(df2015City)


Classes 'tbl_df', 'tbl' and 'data.frame':	111 obs. of  2 variables:
 $ city         : chr  "BANGALORE" "MAHUVA" "SOLAPUR" "PUNE" ...
 $ quantity_year: int  8267060 5113510 4162041 3591209 3581359 3455265 3272139 2971205 2444020 1882161 ...

Let us plot the Cities on a Geographic Map

To get the geocode (latitude and longitude) for each city, we will run the city names through the Google Maps API. We will use the ggmap library to do so.

For example, searching for 'Bangalore' in Google Maps gives a URL that contains the latitude and longitude (12.9538477, 77.3507442), followed by the zoom level (10z):

https://www.google.co.in/maps/place/Bengaluru,+Karnataka+560001/@12.9538477,77.3507442,10z/data=!3m1!4b1!4m2!3m1!1s0x3bae1670c9b44e6d:0xf8dfc3e8517e4fe0
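
As a small illustration, the numbers can be pulled out of that URL with a regular expression. This sketch assumes the `@lat,lon,zoomz` layout seen in this particular URL; that layout is an observation, not a documented format, and the values describe the centre of the map view rather than a geocoded point.

```r
# URL copied from the text above
url <- "https://www.google.co.in/maps/place/Bengaluru,+Karnataka+560001/@12.9538477,77.3507442,10z/data=!3m1!4b1!4m2!3m1!1s0x3bae1670c9b44e6d:0xf8dfc3e8517e4fe0"

# Capture the three numbers in "@<lat>,<lon>,<zoom>z"
pattern <- "@(-?[0-9.]+),(-?[0-9.]+),([0-9.]+)z"
parts <- regmatches(url, regexec(pattern, url))[[1]]

# parts[1] is the full match; the capture groups follow it
coords <- c(lat  = as.numeric(parts[2]),
            lon  = as.numeric(parts[3]),
            zoom = as.numeric(parts[4]))
print(coords)
```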


In [ ]:
install.packages('ggmap',repos="http://ftp.iitm.ac.in/cran")

In [9]:
library(ggmap)

In [10]:
geocode('Bangalore')


Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Bangalore&sensor=false
Out[10]:
       lon     lat
1 77.59456 12.9716

In [11]:
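# Build "city, state" strings, one row per distinct city/state pair
# (.keep_all = TRUE keeps the city and state columns in current dplyr versions)
dfGeo <- df %>%
         mutate(city_state = paste(city, state, sep=", ")) %>%
         select(city, state, city_state) %>%
         distinct(city_state, .keep_all = TRUE)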

In [13]:
head(dfGeo)


Out[13]:
        city state     city_state
1     ABOHAR    PB     ABOHAR, PB
2       AGRA    UP       AGRA, UP
3  AHMEDABAD   GUJ AHMEDABAD, GUJ
4 AHMEDNAGAR    MS AHMEDNAGAR, MS
5      AJMER   RAJ     AJMER, RAJ
6    ALIGARH    UP    ALIGARH, UP

In [12]:
str(dfGeo)


'data.frame':	119 obs. of  3 variables:
 $ city      : Factor w/ 119 levels "ABOHAR","AGRA",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ state     : Factor w/ 22 levels "AP","ASM","BHR",..: 17 21 5 15 18 21 18 17 21 12 ...
 $ city_state: chr  "ABOHAR, PB" "AGRA, UP" "AHMEDABAD, GUJ" "AHMEDNAGAR, MS" ...

In [14]:
dfGeo$city <- as.character(dfGeo$city)

In [15]:
dim(dfGeo)


Out[15]:
  1. 119
  2. 3

In [16]:
dfGeo$city


Out[16]:
  1. 'ABOHAR'
  2. 'AGRA'
  3. 'AHMEDABAD'
  4. 'AHMEDNAGAR'
  5. 'AJMER'
  6. 'ALIGARH'
  7. 'ALWAR'
  8. 'AMRITSAR'
  9. 'BALLIA'
  10. 'BANGALORE'
  11. 'BAREILLY'
  12. 'BELGAUM'
  13. 'BHATINDA'
  14. 'BHAVNAGAR'
  15. 'BHOPAL'
  16. 'BHUBNESWER'
  17. 'BIHARSHARIF'
  18. 'BIJAPUR'
  19. 'BIKANER'
  20. 'BOMBORI'
  21. 'BURDWAN'
  22. 'CHAKAN'
  23. 'CHALLAKERE'
  24. 'CHANDIGARH'
  25. 'CHANDVAD'
  26. 'CHENNAI'
  27. 'CHICKBALLAPUR'
  28. 'COIMBATORE'
  29. 'DEESA'
  30. 'DEHRADOON'
  31. 'DELHI'
  32. 'DEORIA'
  33. 'DEVALA'
  34. 'DEWAS'
  35. 'DHAVANGERE'
  36. 'DHULIA'
  37. 'DINDIGUL'
  38. 'DINDORI'
  39. 'ETAWAH'
  40. 'FARUKHABAD'
  41. 'GONDAL'
  42. 'GORAKHPUR'
  43. 'GUWAHATI'
  44. 'HALDWANI'
  45. 'HASSAN'
  46. 'HOSHIARPUR'
  47. 'HUBLI'
  48. 'HYDERABAD'
  49. 'INDORE'
  50. 'JAIPUR'
  51. 'JALANDHAR'
  52. 'JALGAON'
  53. 'JAMMU'
  54. 'JAMNAGAR'
  55. 'JHANSI'
  56. 'JODHPUR'
  57. 'JUNNAR'
  58. 'KALVAN'
  59. 'KANPUR'
  60. 'KARNAL'
  61. 'KHANNA'
  62. 'KOLAR'
  63. 'KOLHAPUR'
  64. 'KOLKATA'
  65. 'KOPERGAON'
  66. 'KOTA'
  67. 'KURNOOL'
  68. 'LASALGAON'
  69. 'LONAND'
  70. 'LUCKNOW'
  71. 'LUDHIANA'
  72. 'MADURAI'
  73. 'MAHUVA'
  74. 'MALEGAON'
  75. 'MANDSOUR'
  76. 'MANMAD'
  77. 'MEERUT'
  78. 'MIDNAPUR'
  79. 'MUMBAI'
  80. 'NAGPUR'
  81. 'NANDGAON'
  82. 'NASIK'
  83. 'NEEMUCH'
  84. 'NEWASA'
  85. 'NIPHAD'
  86. 'PALAYAM'
  87. 'PATIALA'
  88. 'PATNA'
  89. 'PHALTAN '
  90. 'PIMPALGAON'
  91. 'PUNE'
  92. 'PURULIA'
  93. 'RAHATA'
  94. 'RAHURI'
  95. 'RAICHUR'
  96. 'RAIPUR'
  97. 'RAJAHMUNDRY'
  98. 'RAJKOT'
  99. 'RANCHI'
  100. 'SAGAR'
  101. 'SAIKHEDA'
  102. 'SANGALI'
  103. 'SANGAMNER'
  104. 'SATANA'
  105. 'SHEROAPHULY'
  106. 'SHIMLA'
  107. 'SHRIRAMPUR'
  108. 'SINNAR'
  109. 'SOLAPUR'
  110. 'SRIGANGANAGAR'
  111. 'SRINAGAR'
  112. 'SRIRAMPUR'
  113. 'SURAT'
  114. 'TRIVENDRUM'
  115. 'UDAIPUR'
  116. 'UJJAIN'
  117. 'VANI'
  118. 'VARANASI'
  119. 'YEOLA'

In [17]:
geos <- geocode(dfGeo$city)


Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=ABOHAR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=AGRA&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=AHMEDABAD&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=AHMEDNAGAR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=AJMER&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=ALIGARH&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=ALWAR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=AMRITSAR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BALLIA&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BANGALORE&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BAREILLY&sensor=false
Warning message:
In readLines(connect, warn = FALSE): unable to connect to 'maps.googleapis.com' on port 80.
Warning message:
In FUN(X[[i]], ...):   geocoding failed for "BAREILLY".
  if accompanied by 500 Internal Server Error with using dsk, try google.
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BELGAUM&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BHATINDA&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BHAVNAGAR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BHOPAL&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BHUBNESWER&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BIHARSHARIF&sensor=false
Warning message:
: geocode failed with status OVER_QUERY_LIMIT, location = "BIHARSHARIF"
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BIJAPUR&sensor=false
Warning message:
: geocode failed with status OVER_QUERY_LIMIT, location = "BIJAPUR"
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BIKANER&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BOMBORI&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=BURDWAN&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=CHAKAN&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=CHALLAKERE&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=CHANDIGARH&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=CHANDVAD&sensor=false
Warning message:
: geocode failed with status OVER_QUERY_LIMIT, location = "CHANDVAD"
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=CHENNAI&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=CHICKBALLAPUR&sensor=false
Warning message:
: geocode failed with status OVER_QUERY_LIMIT, location = "CHICKBALLAPUR"
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=COIMBATORE&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=DEESA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=DEHRADOON&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=DELHI&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=DEORIA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=DEVALA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=DEWAS&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=DHAVANGERE&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=DHULIA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=DINDIGUL&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=DINDORI&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=ETAWAH&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=FARUKHABAD&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=GONDAL&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=GORAKHPUR&sensor=false
Warning message:
: geocode failed with status OVER_QUERY_LIMIT, location = "GORAKHPUR"
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=GUWAHATI&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=HALDWANI&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=HASSAN&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=HOSHIARPUR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=HUBLI&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=HYDERABAD&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=INDORE&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=JAIPUR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=JALANDHAR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=JALGAON&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=JAMMU&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=JAMNAGAR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=JHANSI&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=JODHPUR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=JUNNAR&sensor=false
Warning message:
: geocode failed with status OVER_QUERY_LIMIT, location = "JUNNAR"
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=KALVAN&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=KANPUR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=KARNAL&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=KHANNA&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=KOLAR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=KOLHAPUR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=KOLKATA&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=KOPERGAON&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=KOTA&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=KURNOOL&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=LASALGAON&sensor=false
Warning message:
: geocode failed with status OVER_QUERY_LIMIT, location = "LASALGAON"
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=LONAND&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=LUCKNOW&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=LUDHIANA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=MADURAI&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=MAHUVA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=MALEGAON&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=MANDSOUR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=MANMAD&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=MEERUT&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=MIDNAPUR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=MUMBAI&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=NAGPUR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=NANDGAON&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=NASIK&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=NEEMUCH&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=NEWASA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=NIPHAD&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=PALAYAM&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=PATIALA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=PATNA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=PHALTAN%20&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=PIMPALGAON&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=PUNE&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=PURULIA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=RAHATA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=RAHURI&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=RAICHUR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=RAIPUR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=RAJAHMUNDRY&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=RAJKOT&sensor=false
Warning message:
: geocode failed with status OVER_QUERY_LIMIT, location = "RAJKOT"
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=RANCHI&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SAGAR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SAIKHEDA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SANGALI&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SANGAMNER&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SATANA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SHEROAPHULY&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SHIMLA&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SHRIRAMPUR&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SINNAR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SOLAPUR&sensor=false
Warning message:
: geocode failed with status OVER_QUERY_LIMIT, location = "SOLAPUR"
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SRIGANGANAGAR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SRINAGAR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SRIRAMPUR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=SURAT&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=TRIVENDRUM&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=UDAIPUR&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=UJJAIN&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=VANI&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=VARANASI&sensor=false
.Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=YEOLA&sensor=false

In [18]:
head(geos)


Out[18]:
       lon      lat
1  74.1993 30.14529
2 78.00807 27.17667
3 72.57136  23.0225
4 74.74959 19.09521
5 74.63992  26.4499
6 78.08801 27.89739

In [19]:
dfGeo <- bind_cols(dfGeo, geos)

In [20]:
head(dfGeo)


Out[20]:
        city state     city_state      lon      lat
1     ABOHAR    PB     ABOHAR, PB  74.1993 30.14529
2       AGRA    UP       AGRA, UP 78.00807 27.17667
3  AHMEDABAD   GUJ AHMEDABAD, GUJ 72.57136  23.0225
4 AHMEDNAGAR    MS AHMEDNAGAR, MS 74.74959 19.09521
5      AJMER   RAJ     AJMER, RAJ 74.63992  26.4499
6    ALIGARH    UP    ALIGARH, UP 78.08801 27.89739

In [21]:
# Check for consistency
ggplot(dfGeo) + aes(lon, lat) + geom_point()


Warning message:
: Removed 10 rows containing missing values (geom_point).

In [22]:
# Check for consistency
dfGeo %>% filter(lon <65)


Out[22]:
     city state  city_state      lon      lat
1 BOMBORI    MS BOMBORI, MS 10.42254 12.87311
2  KALVAN    MS  KALVAN, MS 49.35262 34.65133
3    VANI    MS    VANI, MS 43.37298 38.50121
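
Filtering on `lon < 65` catches these three cities. A slightly more general sanity check is a bounding box on both coordinates. The sketch below assumes a rough box around India (about 68 to 98 degrees east and 6 to 38 degrees north); these limits are approximate and chosen only for illustration. It is applied to a few rows copied from the outputs above.

```r
# A few rows copied from the outputs above
cities <- data.frame(
  city = c("ABOHAR", "AGRA", "BOMBORI", "KALVAN", "VANI"),
  lon  = c(74.1993, 78.00807, 10.42254, 49.35262, 43.37298),
  lat  = c(30.14529, 27.17667, 12.87311, 34.65133, 38.50121),
  stringsAsFactors = FALSE
)

# Approximate bounding box for India (an assumption for this sketch)
in_box <- with(cities, lon >= 68 & lon <= 98 & lat >= 6 & lat <= 38)

# Rows that fall outside the box are candidates for re-geocoding
outliers <- cities[!in_box, ]
print(outliers)
```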

In [23]:
# Re-geocode the three misplaced cities (BOMBORI, KALVAN, VANI) using alternative search strings
newGeo <- geocode(c("Bombay, Maharashtra", "Kalyan", "Vani,Maharashtra"))


Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Bombay,%20Maharashtra&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Kalyan&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Vani,Maharashtra&sensor=false

In [24]:
newGeo


Out[24]:
       lon      lat
1 72.87766 19.07598
2 73.13054 19.24033
3 73.89189 20.33749

In [25]:
dfGeo$lat[dfGeo$city == "BOMBORI"] <- newGeo$lat[1]
dfGeo$lon[dfGeo$city == "BOMBORI"] <- newGeo$lon[1]

In [26]:
dfGeo$lat[dfGeo$city == "KALVAN"] <- newGeo$lat[2]
dfGeo$lon[dfGeo$city == "KALVAN"] <- newGeo$lon[2]

In [27]:
dfGeo$lat[dfGeo$city == "VANI"] <- newGeo$lat[3]
dfGeo$lon[dfGeo$city == "VANI"] <- newGeo$lon[3]
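
The three pairs of assignments above can also be written as one vectorised update using `match()`. This is a sketch on a small stand-in data frame; `geo` and `fixes` are illustrative names, and the coordinates are copied from the outputs above.

```r
# Small stand-in for dfGeo, including the three misplaced rows
geo <- data.frame(
  city = c("AGRA", "BOMBORI", "KALVAN", "VANI"),
  lon  = c(78.00807, 10.42254, 49.35262, 43.37298),
  lat  = c(27.17667, 12.87311, 34.65133, 38.50121),
  stringsAsFactors = FALSE
)

# Corrected coordinates, as returned in newGeo above
fixes <- data.frame(
  city = c("BOMBORI", "KALVAN", "VANI"),
  lon  = c(72.87766, 73.13054, 73.89189),
  lat  = c(19.07598, 19.24033, 20.33749),
  stringsAsFactors = FALSE
)

# Row positions in geo of the cities we want to overwrite
idx <- match(fixes$city, geo$city)
geo$lon[idx] <- fixes$lon
geo$lat[idx] <- fixes$lat
print(geo)
```

Matching by city name rather than by position keeps the update correct even if the row order of the two data frames differs.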

In [28]:
# Check for consistency
dfGeo %>% filter(lon <65)


Out[28]:
city state city_state lon lat

In [29]:
# Check for consistency
ggplot(dfGeo) + aes(lon, lat) + geom_point()


Warning message:
: Removed 10 rows containing missing values (geom_point).

In [38]:
is.vector(is.na(dfGeo$lon))


Out[38]:
TRUE

In [39]:
dfGeo[is.na(dfGeo$lon),]


Out[39]:
            city state         city_state lon lat
1       BAREILLY    UP       BAREILLY, UP  NA  NA
2    BIHARSHARIF   BHR   BIHARSHARIF, BHR  NA  NA
3        BIJAPUR   KNT       BIJAPUR, KNT  NA  NA
4       CHANDVAD    MS       CHANDVAD, MS  NA  NA
5  CHICKBALLAPUR   KNT CHICKBALLAPUR, KNT  NA  NA
6      GORAKHPUR    UP      GORAKHPUR, UP  NA  NA
7         JUNNAR    MS         JUNNAR, MS  NA  NA
8      LASALGAON    MS      LASALGAON, MS  NA  NA
9         RAJKOT   GUJ        RAJKOT, GUJ  NA  NA
10       SOLAPUR    MS        SOLAPUR, MS  NA  NA
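
These ten rows are the lookups that failed during geocoding (a connection error and several OVER_QUERY_LIMIT responses). One way to fill them in is to retry only the missing rows. The sketch below is a hypothetical helper, not part of ggmap: `retry_failed` and `make_flaky_geocoder` are invented names, and `geocode_fn` stands for any function that takes place names and returns a data frame with `lon` and `lat` columns. A fake geocoder is used so the example runs without network access.

```r
# Hypothetical helper: re-run a geocoder for rows whose coordinates are NA
retry_failed <- function(places, coords, geocode_fn, max_tries = 3, pause = 0) {
  for (attempt in seq_len(max_tries)) {
    missing <- which(is.na(coords$lon) | is.na(coords$lat))
    if (length(missing) == 0) break      # nothing left to retry
    Sys.sleep(pause)                     # wait before hitting the service again
    retry <- geocode_fn(places[missing])
    coords$lon[missing] <- retry$lon
    coords$lat[missing] <- retry$lat
  }
  coords
}

# Fake geocoder for the demo: fails on its first call, succeeds afterwards.
# The coordinates it returns are dummy values, not real locations.
make_flaky_geocoder <- function() {
  calls <- 0
  function(places) {
    calls <<- calls + 1
    ok <- rep(calls > 1, length(places))
    data.frame(lon = ifelse(ok, 75, NA_real_),
               lat = ifelse(ok, 20, NA_real_))
  }
}

places <- c("A", "B", "C")
coords <- data.frame(lon = c(74.2, NA, NA), lat = c(30.1, NA, NA))
filled <- retry_failed(places, coords, make_flaky_geocoder())
print(filled)
```

Rows that already have coordinates are left untouched, so repeated calls do not spend quota on places that were resolved earlier.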

PRINCIPLE: Joining two data frames

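There will be many cases in which your data is spread across two different data frames and you would like to merge them into one. Let us look at one example of this, called a left join: it keeps every row of the left data frame and adds the matching columns from the right one, filling in NA where there is no match.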
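
To make the behaviour concrete, here is a minimal sketch comparing `left_join` with `inner_join` on two tiny data frames. The numbers are copied from outputs in this notebook, but the row selection is chosen for illustration: SOLAPUR is deliberately left out of `coords`, and PUNE is left out of `quantities`.

```r
library(dplyr)

quantities <- data.frame(
  city          = c("BANGALORE", "MAHUVA", "SOLAPUR"),
  quantity_year = c(8267060, 5113510, 4162041),
  stringsAsFactors = FALSE
)

coords <- data.frame(
  city = c("BANGALORE", "MAHUVA", "PUNE"),
  lon  = c(77.59456, 71.75632, 73.85674),
  lat  = c(12.9716, 21.09022, 18.52043),
  stringsAsFactors = FALSE
)

# Left join: every row of quantities is kept; SOLAPUR gets NA coordinates
left <- left_join(quantities, coords, by = "city")

# Inner join: only cities present in both data frames survive
inner <- inner_join(quantities, coords, by = "city")

print(left)
print(inner)
```

PUNE appears in neither result, because it exists only in the right-hand data frame.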


In [41]:
df2015CityGeo = left_join(df2015City, dfGeo, by='city')

In [42]:
head(df2015CityGeo)


Out[42]:
        city quantity_year state      city_state      lon      lat
1  BANGALORE       8267060   KNT  BANGALORE, KNT 77.59456  12.9716
2     MAHUVA       5113510   GUJ     MAHUVA, GUJ 71.75632 21.09022
3    SOLAPUR       4162041    MS     SOLAPUR, MS       NA       NA
4       PUNE       3591209    MS        PUNE, MS 73.85674 18.52043
5  LASALGAON       3581359    MS   LASALGAON, MS       NA       NA
6 PIMPALGAON       3455265    MS  PIMPALGAON, MS 73.98738 20.16997

In [43]:
str(df2015CityGeo)


Classes 'tbl_df', 'tbl' and 'data.frame':	111 obs. of  6 variables:
 $ city         : chr  "BANGALORE" "MAHUVA" "SOLAPUR" "PUNE" ...
 $ quantity_year: int  8267060 5113510 4162041 3591209 3581359 3455265 3272139 2971205 2444020 1882161 ...
 $ state        : Factor w/ 22 levels "AP","ASM","BHR",..: 12 5 15 15 15 15 4 13 15 14 ...
 $ city_state   : chr  "BANGALORE, KNT" "MAHUVA, GUJ" "SOLAPUR, MS" "PUNE, MS" ...
 $ lon          : num  77.6 71.8 NA 73.9 NA ...
 $ lat          : num  13 21.1 NA 18.5 NA ...

In [44]:
ggplot(df2015CityGeo) + aes(lon, lat) + geom_point()


Warning message:
: Removed 9 rows containing missing values (geom_point).

In [45]:
ggplot(df2015CityGeo) + aes(lon, lat) + geom_point() + coord_map()


Warning message:
: Removed 9 rows containing missing values (geom_point).

In [57]:
ggplot(df2015CityGeo) + aes(lon, lat, size = quantity_year) + geom_point(alpha = 0.5) + coord_map() + 
  scale_size_continuous(range=c(0,20))


Warning message:
: Removed 9 rows containing missing values (geom_point).
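
A possible variation on the bubble plot above uses `scale_size_area()`, which maps a value of zero to a point of zero area so that bubble area stays proportional to quantity. This is a sketch on four rows copied from the joined output; `coord_quickmap()` is used here as an approximation that avoids needing the extra package behind `coord_map()`.

```r
library(ggplot2)

# Four rows copied from head(df2015CityGeo), skipping the rows with NA coordinates
top <- data.frame(
  city          = c("BANGALORE", "MAHUVA", "PUNE", "PIMPALGAON"),
  quantity_year = c(8267060, 5113510, 3591209, 3455265),
  lon           = c(77.59456, 71.75632, 73.85674, 73.98738),
  lat           = c(12.9716, 21.09022, 18.52043, 20.16997),
  stringsAsFactors = FALSE
)

p <- ggplot(top, aes(lon, lat, size = quantity_year)) +
  geom_point(alpha = 0.5) +            # semi-transparent so overlaps stay visible
  scale_size_area(max_size = 20) +     # bubble area proportional to quantity
  coord_quickmap()                     # approximate map aspect ratio

print(p)
```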

In [79]:
ggplot(df2015CityGeo) + aes(lon, lat, size = quantity_year/1000, colour = state) + geom_point(alpha = 0.5) + coord_map() + scale_size_continuous(range=c(0,20))


Warning message:
: Incompatible methods ("+.gg", "Ops.data.frame") for "+"
Error in p + o: non-numeric argument to binary operator

Exercise - Can you plot all the states by quantity on a geographic map?


In [48]:
dfState <- read.csv('state_geocode.csv')

In [49]:
head(dfState)


Out[49]:
  state           name      lon      lat
1    MS    Maharashtra 75.71389 19.75148
2   GUJ        Gujarat 71.19238 22.25865
3    MP Madhya pradesh 78.65689 22.97342
4    TN     Tamil Nadu 78.65689 11.12712
5   KNT      Karnataka 75.71389 15.31728
6   DEL          Delhi 77.20902 28.61394
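
One possible starting point for the exercise is to total the quantities by state and then join the state coordinates. The sketch below uses only four cities and three states copied from the outputs above, so the totals are illustrative and not the real state totals; `city_qty`, `state_geo` and `state_qty` are names chosen for this example.

```r
library(dplyr)

# Four cities copied from the city-level output above
city_qty <- data.frame(
  city          = c("BANGALORE", "MAHUVA", "PUNE", "PIMPALGAON"),
  state         = c("KNT", "GUJ", "MS", "MS"),
  quantity_year = c(8267060, 5113510, 3591209, 3455265),
  stringsAsFactors = FALSE
)

# Three rows copied from head(dfState)
state_geo <- data.frame(
  state = c("MS", "GUJ", "KNT"),
  name  = c("Maharashtra", "Gujarat", "Karnataka"),
  lon   = c(75.71389, 71.19238, 75.71389),
  lat   = c(19.75148, 22.25865, 15.31728),
  stringsAsFactors = FALSE
)

state_qty <- city_qty %>%
  group_by(state) %>%
  summarize(quantity_state = sum(quantity_year)) %>%  # total per state
  left_join(state_geo, by = "state") %>%              # attach state coordinates
  arrange(desc(quantity_state))

print(state_qty)
```

The resulting data frame has one row per state with `lon`, `lat` and a total, which is the shape needed for a bubble layer like the city-level plots.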

In [68]:
map <- get_map("India", zoom = 5)


Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=India&zoom=5&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=India&sensor=false

In [69]:
ggmap(map)



In [87]:
map1 <- get_map("India", maptype = "watercolor", source = "stamen", zoom = 5)


Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=India&zoom=5&size=640x640&scale=2&maptype=terrain&sensor=false
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=India&sensor=false
Map from URL : http://tile.stamen.com/watercolor/5/21/12.jpg
Map from URL : http://tile.stamen.com/watercolor/5/22/12.jpg
Map from URL : http://tile.stamen.com/watercolor/5/23/12.jpg
Map from URL : http://tile.stamen.com/watercolor/5/24/12.jpg
Map from URL : http://tile.stamen.com/watercolor/5/21/13.jpg
Map from URL : http://tile.stamen.com/watercolor/5/22/13.jpg
Map from URL : http://tile.stamen.com/watercolor/5/23/13.jpg
Map from URL : http://tile.stamen.com/watercolor/5/24/13.jpg
Map from URL : http://tile.stamen.com/watercolor/5/21/14.jpg
Map from URL : http://tile.stamen.com/watercolor/5/22/14.jpg
Map from URL : http://tile.stamen.com/watercolor/5/23/14.jpg
Map from URL : http://tile.stamen.com/watercolor/5/24/14.jpg
Map from URL : http://tile.stamen.com/watercolor/5/21/15.jpg
Map from URL : http://tile.stamen.com/watercolor/5/22/15.jpg
Map from URL : http://tile.stamen.com/watercolor/5/23/15.jpg
Map from URL : http://tile.stamen.com/watercolor/5/24/15.jpg

In [88]:
ggmap(map1)



In [90]:
ggmap(map1) + geom_point(data = df2015CityGeo,aes(lon,lat,size=quantity_year/1000,color=state))


Warning message:
: Removed 10 rows containing missing values (geom_point).
