Designing Effective Visualizations

For any data set, there are many possible ways to map data components to graphical attributes.
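As a minimal sketch of such a mapping, the cell below assigns four components of a made-up data set to four graphical attributes of a scatter plot: x position, y position, marker size, and colour. The random data, the component names `a` to `d`, and the size scaling are assumptions chosen for illustration only; they are not taken from the source text.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data set: 50 records with four components each
rng = np.random.default_rng(0)
a, b, c, d = rng.random((4, 50))

fig, ax = plt.subplots()

# One possible mapping (many others are equally valid):
#   a -> x position, b -> y position, c -> marker size, d -> colour
points = ax.scatter(a, b, s=20 + 200 * c, c=d, cmap="viridis")

# A colour bar and axis labels document the mapping for the reader
fig.colorbar(points, ax=ax, label="component d")
ax.set_xlabel("component a")
ax.set_ylabel("component b")
```

Swapping the roles of the components (for example, putting `c` on the x axis and `a` on marker size) gives a different picture of the same data, which is why the choice of mapping matters.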

Adapted from:

Ward, Matthew O. et al.: Interactive Data Visualization: Foundations, Techniques, and Applications, 2nd Edition. CRC Press, 05/2015. VitalBook file.

See also the following article:

Rougier NP, Droettboom M, Bourne PE (2014) Ten Simple Rules for Better Figures. PLoS Comput Biol 10(9): e1003833. doi:10.1371/journal.pcbi.1003833; http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003833

A good plotting tutorial: http://www.labri.fr/perso/nrougier/teaching/matplotlib/matplotlib.html (material was adapted for this notebook)

Ineffective visualizations

  • Too confusing
  • Too complex
  • Data is distorted
  • Unappealing
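One common way a plot can distort data is a truncated value axis. The sketch below is an illustration only, with made-up values and category labels that do not come from the source. It draws the same two bars twice: once with the y axis cut off just below the smaller value, and once with the y axis starting at zero.

```python
import matplotlib.pyplot as plt

# Hypothetical values that differ by only about 2 %
labels = ["A", "B"]
values = [98.0, 100.0]

fig, (ax_bad, ax_good) = plt.subplots(1, 2, figsize=(8, 3))

# Truncated axis: the visible bar heights are 1 and 3 units,
# so B looks three times as large as A
ax_bad.bar(labels, values)
ax_bad.set_ylim(97, 100.5)
ax_bad.set_title("Truncated y axis (distorted)")

# Axis starting at zero: bar heights are proportional to the values
ax_good.bar(labels, values)
ax_good.set_ylim(0, 105)
ax_good.set_title("y axis from zero")
```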

Designing Visualizations

Keys, Labels, and Legends


In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

In [2]:
# Sample 256 points on [-pi, pi] and evaluate sine and cosine
x = np.linspace(-np.pi, np.pi, 256, endpoint=True)
y, z = np.sin(x), np.cos(x)

plt.plot(x, y)  # sine
plt.plot(x, z)  # cosine

plt.show()


Improving the range of the plot


In [3]:
# Plot sine in blue with a continuous line of width 2.5 (points)
plt.plot(x, y, color="blue", linewidth=2.5, linestyle="-")

# Plot cosine in red with a continuous line of width 2.5 (points)
plt.plot(x, z, color="red", linewidth=2.5, linestyle="-")

# Extend the axis limits by 10 % so the curves do not touch the frame
plt.xlim(x.min()*1.1, x.max()*1.1)
plt.ylim(y.min()*1.1, y.max()*1.1)

plt.xticks( [-np.pi, -np.pi/2, 0, np.pi/2, np.pi])
plt.yticks([-1, 0, +1])

plt.grid()

plt.show()



In [4]:
# Plot sine in blue with a continuous line of width 2.5 (points)
plt.plot(x, y, color="blue", linewidth=2.5, linestyle="-", label="sine")

# Plot cosine in red with a continuous line of width 2.5 (points)
plt.plot(x, z, color="red", linewidth=2.5, linestyle="-", label="cosine")

plt.xlim(x.min()*1.1, x.max()*1.1)
plt.ylim(y.min()*1.1, y.max()*1.1)

plt.xticks([-np.pi, -np.pi/2, 0, np.pi/2, np.pi],
       [r'$-\pi$', r'$-\pi/2$', r'$0$', r'$+\pi/2$', r'$+\pi$'])

plt.yticks([-1, 0, +1],
       [r'$-1$', r'$0$', r'$+1$'])

plt.grid()

plt.legend(loc='upper left', frameon=False)

plt.show()



In [5]:
# Plot sine in blue with a continuous line of width 2.5 (points)
plt.plot(x, y, color="blue", linewidth=2.5, linestyle="-", label="sine")

# Plot cosine in red with a continuous line of width 2.5 (points)
plt.plot(x, z, color="red", linewidth=2.5, linestyle="-", label="cosine")

plt.xlim(x.min()*1.1, x.max()*1.1)
plt.ylim(y.min()*1.1, y.max()*1.1)

plt.xticks([-np.pi, -np.pi/2, 0, np.pi/2, np.pi],
       [r'$-\pi$', r'$-\pi/2$', r'$0$', r'$+\pi/2$', r'$+\pi$'])

plt.yticks([-1, 0, +1],
       [r'$-1$', r'$0$', r'$+1$'])

ax = plt.gca()  # get the current axes
# Hide the right and top spines
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
# Move the bottom and left spines so they cross at the data origin
ax.xaxis.set_ticks_position('bottom')
ax.spines['bottom'].set_position(('data', 0))
ax.yaxis.set_ticks_position('left')
ax.spines['left'].set_position(('data', 0))

plt.legend(loc='upper left', frameon=False)

plt.show()


Intuitive Mapping from Data to Visualization

Plotting the distance of the planets from the sun

http://www.physlink.com/Reference/AstroPhysical.cfm


In [6]:
import pandas as pd

# Planet names serve as the shared row index for both columns
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter',
           'Saturn', 'Uranus', 'Neptune', 'Pluto']

planetData = pd.DataFrame(
    {'Distance from sun [m]': [5.79e10, 1.08e11, 1.496e11, 2.28e11, 7.78e11,
                               1.43e12, 2.87e12, 4.5e12, 5.91e12],
     'Orbital period [s]': [7.6e6, 1.94e7, 3.156e7, 5.94e7, 3.74e8,
                            9.35e8, 2.64e9, 5.22e9, 7.82e9]},
    index=planets)
planetData


Out[6]:
         Distance from sun [m]  Orbital period [s]
Mercury           5.790000e+10             7600000
Venus             1.080000e+11            19400000
Earth             1.496000e+11            31560000
Mars              2.280000e+11            59400000
Jupiter           7.780000e+11           374000000
Saturn            1.430000e+12           935000000
Uranus            2.870000e+12          2640000000
Neptune           4.500000e+12          5220000000
Pluto             5.910000e+12          7820000000

In [7]:
fig = plt.figure()
ax = fig.add_subplot(111)
planetData.plot(x="Distance from sun [m]", y="Orbital period [s]", ax=ax, logy=True, style='o')
ax.annotate('Earth', (1.e11+1.5e11, 31560000), xycoords='data')
plt.grid()


(Image from http://www.physlink.com)
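A further possible mapping for the same data, sketched here as an assumption rather than as part of the original notebook, is to use logarithmic scales on both axes and to label every planet. By Kepler's third law the orbital period grows as the distance to the power 3/2, so the points should fall close to a straight line on log-log axes. The cell repeats the values from the table above so that it runs on its own; the variable names and label offsets are illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np

names = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter',
         'Saturn', 'Uranus', 'Neptune', 'Pluto']
distance = np.array([5.79e10, 1.08e11, 1.496e11, 2.28e11, 7.78e11,
                     1.43e12, 2.87e12, 4.5e12, 5.91e12])   # [m]
period = np.array([7.6e6, 1.94e7, 3.156e7, 5.94e7, 3.74e8,
                   9.35e8, 2.64e9, 5.22e9, 7.82e9])        # [s]

# Fit a straight line in log-log space; its slope is the power-law exponent
slope, intercept = np.polyfit(np.log10(distance), np.log10(period), 1)

fig, ax = plt.subplots()
ax.loglog(distance, period, 'o')

# Label each point, offset slightly so the text does not cover the marker
for name, d, t in zip(names, distance, period):
    ax.annotate(name, (d, t), xytext=(5, -10), textcoords='offset points')

ax.set_xlabel('Distance from sun [m]')
ax.set_ylabel('Orbital period [s]')
ax.grid(True, which='both')

print(f"fitted exponent: {slope:.2f}")
```

The fitted exponent can be compared with the value 1.5 expected from Kepler's third law.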

