Share the Insight

There are two main insights we want to communicate.

  • Bangalore is the largest market for Onion Arrivals.
  • Onion price variation has increased in recent years.

Let us explore how we can communicate these insights visually.

Preprocessing to get the data


In [ ]:
# Import the libraries we need: pandas, NumPy, Matplotlib and seaborn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

In [ ]:
# Set some parameters to get good visuals - style to ggplot and size to 15,10
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = (15, 10)

In [ ]:
# Read the cleaned CSV file of month-wise quantity and price data
df = pd.read_csv('MonthWiseMarketArrivals_clean.csv')

In [ ]:
# Change the index to the date column
df.index = pd.PeriodIndex(df.date, freq='M')
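As a minimal sketch of what a monthly PeriodIndex gives us, here is a made-up toy frame (not the arrivals data). It assumes the date column holds strings such as '2015-01'; the names toy and quantity are only illustrative.

```python
import pandas as pd

# Hypothetical toy data, not the real arrivals file
toy = pd.DataFrame({
    "date": ["2015-03", "2015-01", "2015-02"],
    "quantity": [30, 10, 20],
})

# Use monthly periods as the index
toy.index = pd.PeriodIndex(toy["date"], freq="M")

# A period index sorts chronologically
toy = toy.sort_index()

# Each index entry is a Period with year and month attributes
first = toy.index[0]
print(first, first.year, first.month)
```

Once the index holds periods, rows can be ordered and selected by month without parsing the date strings again.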

In [ ]:
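# Sort the data frame by date (the index now holds the monthly periods).
# Sorting by the "date" label can be ambiguous in recent pandas versions,
# because the index and a column may both carry that name.
df = df.sort_index()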

In [ ]:
# Get the data for year 2015
df2015 = df[df.year == 2015]

In [ ]:
# Groupby on City to get the sum of quantity
df2015City = df2015.groupby(['city'], as_index=False)['quantity'].sum()

In [ ]:
df2015City = df2015City.sort_values(by = "quantity", ascending = False)

In [ ]:
df2015City.head()
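The group, sum and sort steps above can be sketched on a small made-up frame. The city names and quantities here are hypothetical and only show the shape of the result.

```python
import pandas as pd

# Hypothetical toy data: several rows per city
toy = pd.DataFrame({
    "city": ["A", "B", "A", "C", "B"],
    "quantity": [5, 7, 10, 3, 1],
})

# as_index=False keeps 'city' as a regular column instead of the index
totals = toy.groupby("city", as_index=False)["quantity"].sum()

# Largest total first
totals = totals.sort_values(by="quantity", ascending=False)

# The first row is the largest market in the toy data
largest = totals.iloc[0]
print(largest["city"], largest["quantity"])
```

Keeping city as a column (rather than as the index) is convenient for the join that follows, since the merge key is then an ordinary column.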

Let us plot the cities on a geographic map


In [ ]:
# Load the geocode file
dfGeo = pd.read_csv('city_geocode.csv')

In [ ]:
dfGeo.head()

PRINCIPLE: Joining two data frames

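There will be many cases in which your data sits in two different dataframes and you would like to merge them into one. Let us look at one example of this, called a left join: every row of the left dataframe is kept, and the matching columns from the right dataframe are added where the key matches.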


In [ ]:
# Left join on the shared 'city' column: keep every city from df2015City
dfCityGeo = pd.merge(df2015City, dfGeo, how='left', on='city')

In [ ]:
dfCityGeo.head()
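To see what a left join does with rows that have no match, here is a small sketch on made-up data. The frames quantities and geo and their values are hypothetical. The indicator=True argument of pd.merge adds a _merge column that records where each row came from.

```python
import pandas as pd

# Hypothetical toy data: city "C" has no geocode entry
quantities = pd.DataFrame({"city": ["A", "B", "C"], "quantity": [15, 8, 3]})
geo = pd.DataFrame({"city": ["A", "B"], "lon": [77.6, 72.9], "lat": [13.0, 19.1]})

# Left join: every row of 'quantities' is kept
joined = pd.merge(quantities, geo, how="left", on="city", indicator=True)

# Rows found only in the left frame get NaN for lon and lat
missing = joined.loc[joined["_merge"] == "left_only", "city"].tolist()
print(missing)
```

A check like this may be worth running on dfCityGeo as well, since cities without coordinates would silently drop out of the scatter plot.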

In [ ]:
dfCityGeo.plot(kind = 'scatter', x = 'lon', y = 'lat', s = 100)

We can do a crude aspect ratio adjustment to make the Cartesian coordinate system appear like a Mercator map


In [ ]:
dfCityGeo.plot(kind = 'scatter', x = 'lon', y = 'lat', s = 100, figsize = [10,11])
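As a rough sketch of where such an adjustment can come from: one degree of longitude covers less ground than one degree of latitude by a factor of about cos(latitude). Stretching the y axis by 1/cos(latitude) at a reference latitude approximates a map near that latitude. This is a simple local approximation, not a true Mercator projection. The helper name lat_lon_aspect and the reference latitude of 20 degrees are illustrative assumptions.

```python
import math

def lat_lon_aspect(ref_lat_deg):
    """Ratio of y-unit to x-unit length for a lon/lat plot near ref_lat_deg."""
    return 1.0 / math.cos(math.radians(ref_lat_deg))

# Hypothetical reference latitude; in practice one might use the mean of the 'lat' column
aspect = lat_lon_aspect(20.0)
print(round(aspect, 3))
```

The value can be applied to the Axes object returned by the plot call, for example ax = dfCityGeo.plot(kind='scatter', x='lon', y='lat', s=100) followed by ax.set_aspect(aspect), instead of tuning figsize by hand.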

In [ ]:
# Let us add quantity as the size of the bubble
dfCityGeo.plot(kind = 'scatter', x = 'lon', y = 'lat', s = dfCityGeo.quantity, figsize = [10,11])

In [ ]:
# Let us scale down the quantity variable
dfCityGeo.plot(kind = 'scatter', x = 'lon', y = 'lat', s = dfCityGeo.quantity/1000, figsize = [10,11])

In [ ]:
# Reduce the opacity of the color, so that we can see overlapping values
dfCityGeo.plot(kind = 'scatter', x = 'lon', y = 'lat', s = dfCityGeo.quantity/1000, alpha = 0.5, figsize = [10,11])
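Dividing by 1000 works for this data set, but the divisor has to be found by trial and error. A possible alternative, sketched below, is to scale relative to the largest value so the biggest bubble always gets a chosen area. In Matplotlib the s argument is the marker area in points squared. The helper bubble_sizes, the max_area default and the sample numbers are hypothetical.

```python
import pandas as pd

def bubble_sizes(values, max_area=2000.0):
    """Scale values so the largest one maps to max_area (points squared)."""
    return values / values.max() * max_area

# Hypothetical quantities, not the real 2015 totals
quantities = pd.Series([4000000, 1000000, 250000])
sizes = bubble_sizes(quantities)
print(sizes.tolist())
```

With the real data this would be passed as s = bubble_sizes(dfCityGeo.quantity) in the plot call, keeping alpha = 0.5 so overlapping bubbles stay visible.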

Exercise - Can you plot all the states by quantity on a (pseudo) geographic map?


In [ ]:

