The datasets to be used in this screen:

  1. Age-Group
  2. First-time members
  3. Occupation
  4. Per State we represent
  5. Party-Wise composition

Move to the Senate directory


In [1]:
import os
#os.chdir("/home/archimedeas/wrkspc/anaconda/the-visual-verdict/visualizations/1_the_senate/datasets")
os.getcwd()


Out[1]:
'C:\\Users\\user\\Desktop\\major_1\\ipython_notebooks'

In [2]:
os.chdir('..')
os.getcwd()
os.chdir('datasets')

Now we read the CSV file into a pandas DataFrame


In [ ]:
import pandas as pd
df = pd.read_csv("1_age_group_5yr_span_trial.csv")
df
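A minimal sketch of the same read step that runs without the CSV file on disk. The column names and numbers below are hypothetical placeholders, not the contents of `1_age_group_5yr_span_trial.csv`; the text is fed to `pd.read_csv` through an in-memory buffer.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file: an age-group column and a count column.
csv_text = """age_group,members
25-30,4
31-35,9
36-40,15
"""

# read_csv accepts any file-like object, so a StringIO buffer works like a path.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)    # number of rows and columns
print(df.dtypes)   # column types inferred by pandas
```

Checking `df.shape` and `df.dtypes` right after loading is a cheap way to confirm that the header row and the numeric column were parsed as expected.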

Converting this dataset to a suitable format

  1. The first column is read as strings, to be used as labels in matplotlib
  2. The second column is read into a plain list of values to plot

In [ ]:
labels = []
values = []
for i in range(13):
    labels.append(str(df.iat[i,0]))
    values.append(df.iat[i,1])
    
print(values, "\n\n", labels)
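The loop above assumes the frame has exactly 13 rows. A sketch of the same conversion that follows the length of the frame instead, shown on a small hypothetical DataFrame (the column names and numbers are made up for illustration):

```python
import pandas as pd

# Hypothetical frame standing in for the age-group dataset.
df = pd.DataFrame({
    "age_group": ["25-30", "31-35", "36-40"],
    "members": [4, 9, 15],
})

# First column as string labels, second column as plain Python numbers.
labels = df.iloc[:, 0].astype(str).tolist()
values = df.iloc[:, 1].tolist()

print(values, "\n\n", labels)
```

Selecting by position with `iloc` keeps the sketch independent of the actual column names in the CSV.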

Now we draw a Matplotlib plot using these values and labels

First we set the parameters for using Matplotlib in this notebook


In [ ]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('ggplot')
plt.rcParams["figure.figsize"] = (15,10)

A sample conversion to Bokeh


In [ ]:
plt.plot(values)

In [3]:
plt.plot(values)

from bokeh import mpl
from bokeh.io import show, output_notebook

output_notebook()
show(mpl.to_bokeh())


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-3-d7f9a66da746> in <module>()
----> 1 plt.plot(values)
      2 
      3 from bokeh import mpl
      4 from bokeh.io import show, output_notebook
      5 

NameError: name 'plt' is not defined

Returning to the Matplotlib plots, we represent the data in several different ways

Bar Plot


In [4]:
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

ind = np.arange(13)    
width = 0.35      

p1 = plt.bar(ind, values, width)

ax.set_xticklabels(labels)

plt.show()


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-4-441612f9c6f8> in <module>()
      7 width = 0.35
      8 
----> 9 p1 = plt.bar(ind, values, width)
     10 
     11 ax.set_xticklabels(labels)

NameError: name 'values' is not defined
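The NameError above means that `values` did not exist in the kernel session when the cell ran. A self-contained sketch of the bar plot that does not depend on earlier cells is given below. The labels and values are hypothetical placeholders. It also calls `set_xticks` before `set_xticklabels`, so that each label is placed at the position of its own bar rather than at the default tick positions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical placeholder data; in the notebook these come from the CSV.
labels = ["25-30", "31-35", "36-40"]
values = [4, 9, 15]

fig, ax = plt.subplots()

ind = np.arange(len(values))   # one x position per bar
width = 0.35                   # bar width in x units

bars = ax.bar(ind, values, width)

# Put one tick at each bar, then attach the labels to those ticks.
ax.set_xticks(ind)
ax.set_xticklabels(labels)

# With %matplotlib inline the figure renders at the end of the cell;
# outside a notebook, call plt.show() here.
```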

Scatter Plot


In [5]:
import numpy as np
import matplotlib.pyplot as plt


N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = np.pi * (15 * np.random.rand(N))**2 

plt.scatter(x, y, s=area, c=colors, alpha=0.5)
plt.show()


C:\Anaconda3\lib\site-packages\matplotlib\collections.py:590: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  if self._edgecolors == str('face'):

How to add subplots


In [ ]:
plt.plot(range(12))

In [ ]:
plt.subplots(3)
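`plt.subplots(3)` returns the figure together with an array of three axes stacked vertically. A short sketch of drawing on each of them; the plotted lines are arbitrary illustration data:

```python
import matplotlib.pyplot as plt

# Three rows, one column: axes is an array holding three Axes objects.
fig, axes = plt.subplots(3)

for i, ax in enumerate(axes):
    # Draw a line with a different slope on each subplot.
    ax.plot(range(12), [v * (i + 1) for v in range(12)])
    ax.set_title("subplot %d" % (i + 1))

# Add spacing so the titles do not overlap the neighbouring axes.
fig.tight_layout()
```

For a grid, `plt.subplots(2, 2)` returns a two-dimensional array of axes that can be indexed as `axes[row, col]`.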

In [6]:
from collections import OrderedDict

import pandas as pd

from bokeh._legacy_charts import Donut, show, output_file
from bokeh.sampledata.olympics2014 import data

# throw the data into a pandas data frame
df = pd.io.json.json_normalize(data['data'])

# keep only countries with more than 8 medals in total, then sort
df = df[df['medals.total'] > 8]
df = df.sort("medals.total", ascending=False)

# get the countries and group the data by medal type
countries = df.abbr.values.tolist()
gold = df['medals.gold'].astype(float).values
silver = df['medals.silver'].astype(float).values
bronze = df['medals.bronze'].astype(float).values

# build a dict containing the grouped data
medals = OrderedDict()
medals['bronze'] = bronze
medals['silver'] = silver
medals['gold'] = gold

# any of the following commented are also valid Donut inputs
#medals = list(medals.values())
#medals = np.array(list(medals.values()))
#medals = pd.DataFrame(medals)

output_file("donut.html")

donut = Donut(medals, countries)

show(donut)


C:\Anaconda3\lib\site-packages\IPython\kernel\__main__.py:13: FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....)
C:\Anaconda3\lib\site-packages\bokeh\_legacy_charts\_chart.py:92: UserWarning: Instantiating a Legacy Chart from bokeh._legacy_charts
  warn("Instantiating a Legacy Chart from bokeh._legacy_charts")

In [7]:
from bokeh._legacy_charts import Bar
from bokeh.io import output_notebook, show

# the states to compare, followed by the raw numbers for each state
states = ['delhi', 'assam']


delhi = [ [56,46],
         [23,77],
         [45,55],
         [60,40],
         [35,15,25,25]   
]

assam = [ [46,56],
         [33,67],
         [75,25],
         [50,50],
         [75,5,10,10]   
]

output_notebook()

bar = Bar([delhi[0],assam[0]], states, title="Stacked bars", stacked=True)
bar2 = Bar([delhi[0],assam[0]], states, title="Stacked bars")

show(bar)


BokehJS successfully loaded.
C:\Anaconda3\lib\site-packages\bokeh\_legacy_charts\_chart.py:92: UserWarning: Instantiating a Legacy Chart from bokeh._legacy_charts
  warn("Instantiating a Legacy Chart from bokeh._legacy_charts")
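The warning above shows that `Bar` comes from a legacy chart module. A rough sketch of the same stacked bar chart using only `bokeh.plotting`, assuming a Bokeh release that provides `figure.vbar_stack`. The series names `series_0` and `series_1` are hypothetical, and treating the two inner lists as two series over the states is my reading of the `Bar` call above, not something the notebook states.

```python
from bokeh.plotting import figure

states = ["delhi", "assam"]

# Two series over the states, taken from delhi[0] and assam[0] in the cell above.
# The key names "series_0" and "series_1" are made up for this sketch.
data = {
    "states": states,
    "series_0": [56, 46],
    "series_1": [46, 56],
}

# A categorical x axis is created by passing the category names as x_range.
p = figure(x_range=states, title="Stacked bars")

# vbar_stack draws one bar segment per series and stacks them at each state.
renderers = p.vbar_stack(
    ["series_0", "series_1"],
    x="states",
    width=0.5,
    color=["#1f77b4", "#ff7f0e"],
    source=data,
)

# To display inside a notebook:
# from bokeh.io import output_notebook, show
# output_notebook()
# show(p)
```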

In [66]:
from bokeh.plotting import figure, vplot, hplot, output_notebook, output_file
from bokeh.models import *

In [80]:
fig = figure(width=400, height=400)
fig.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], line_width = 50, line_color= "red")

fig1 = figure(width=400, height=400)

fig1.circle(1,3,size = 10)

#output_file("test-plot")
BoxAnnotation()

output_notebook()
show(hplot(fig,fig1))


BokehJS successfully loaded.

In [89]:
from bokeh.plotting import figure, output_file, show

plot = figure()

plot.annulus(x=[1, 2, 3], y=[1, 2, 3], color="#7FC97F",
             inner_radius=0.2, outer_radius=0.5, fill_alpha = 0.5)

plot.grid.grid_line_color = "red"
#plot.axis.axis_line_color = "purple"



plot.axis.major_tick_line_color = "green"


show(plot)


