Lesson 02 - Design Principles

Lesson Overview

  • The next goal in this course is to re-create a visualization in order to improve it
  • Learn design principles that help
    • to enhance a graphic
    • to prototype
  • Refine a design
  • Metrics to measure the effectiveness of a design
  • The various chart types that can be used
  • Theory about how to choose among them

Visual Encodings

  • Visual encodings are the building blocks from which we create graphics

Chart Types

Visualizations in Data Science

  • A data artist or a journalist may be interested in a complex graphic that conveys all the information
  • A data scientist's first priority is a simple solution that solves the problem
  • Inspect the data (its types and variations) and choose the right tool

Chart types

  • Chart types are just visual encodings applied to data types and to some relationships between those data
  • In a scatter plot, the x values are treated as independent of each other. If they depend on each other and have an order, a line chart is probably the better choice
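
The rule of thumb above can be written as a tiny helper. This is only a sketch: `suggest_chart` and its `x_is_ordered` flag are hypothetical names introduced here for illustration, not part of any plotting library.

```python
def suggest_chart(x_is_ordered: bool) -> str:
    """Suggest a chart type for paired (x, y) data.

    x_is_ordered: True when the x values depend on each other and have a
    meaningful order (for example, consecutive dates).
    """
    # Ordered, dependent x values: connecting the points is meaningful.
    if x_is_ordered:
        return "line chart"
    # Independent x values: plot the points without connecting them.
    return "scatter plot"


print(suggest_chart(x_is_ordered=True))   # e.g. monthly sales over a year
print(suggest_chart(x_is_ordered=False))  # e.g. height vs. shoe size
```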

Small Multiples

  • A series of plots arranged together, faceted by a category
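
Before any drawing happens, faceting amounts to splitting the data into one group per category value. A minimal sketch of that step follows; the `facet` helper, the column names, and the sales figures are all made up for illustration.

```python
from collections import defaultdict


def facet(rows, by):
    """Group rows (dicts) into one list per distinct value of row[by]."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[by]].append(row)
    return dict(panels)


# Made-up example data: yearly sales for two regions.
rows = [
    {"region": "North", "year": 2020, "sales": 10},
    {"region": "South", "year": 2020, "sales": 7},
    {"region": "North", "year": 2021, "sales": 12},
    {"region": "South", "year": 2021, "sales": 9},
]

panels = facet(rows, by="region")
for name, panel in panels.items():
    # Each panel would be drawn as its own small plot.
    print(name, [(r["year"], r["sales"]) for r in panel])
```

Each resulting group can then be handed to the same plotting routine, which is what makes the panels directly comparable.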

Line Plot

For proportions or comparisons, use grouped bar charts

Chart Relationships

  • bar chart - highlights individual values, supports comparisons, and can show rankings or deviations
  • boxplot - shows distributions and quantiles, especially useful when comparing distributions
  • pie chart - shows a part-to-whole relationship and is best suited for one category; poor for making comparisons
  • stacked bar chart - shows a part-to-whole relationship and is best suited for showing composition within categories and totals
  • bubble chart - shows how three or more sets of values vary; shows correlation
  • line chart - shows overall changes and patterns, usually over equally spaced intervals of time
  • map - values are encoded on physical locations and patterns may be drawn by comparing locations
  • scatterplot - shows how two paired sets of values (for example height and shoe size) vary; shows correlation
  • tables - work well when there is no inherent relation between the data you are conveying
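
The boxplot entry in the list above relies on quantiles. As a rough sketch, the five values a basic boxplot is built from can be computed with Python's standard library. The `five_number_summary` helper and the sample values are made up for illustration, and real plotting libraries may compute quartiles and whiskers with different conventions.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4)
    return data[0], q1, median, q3, data[-1]


# Made-up sample values.
summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)
```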

In this map of the United States, each dot represents a person and the color represents their race. The x and y positions are chosen based on the census block from which the person reported.

Geographic Charts

  • choropleth = geographic + color
  • cartogram = geographic + size
  • dot map = geographic + shape

Additional Chart types

  • Common Chart Types from Duke Introduction to Data Visualization Guide
  • Racial Dot Map by Dustin Cable
  • Choropleth Map
  • Cartogram
  • A Tour through the Visualization Zoo by Jeff Heer, Mike Bostock, and Vadim Ogievetsky

  • Bullet Graph - Stephen Few developed the bullet graph to replace meters and gauges that often fill too much valuable space on dashboards. You can read more about bullet graphs on Wikipedia.

  • Sparklines - Edward Tufte invented these bite-sized graphics to pack a punch of information in a small chart area. A reader can quickly see historical trends, anomalies, and the current status of a metric by viewing a sparkline. You can read more about sparklines on Wikipedia.
  • Cycle Plots - Originally created by Cleveland, Dunn, and Terpenning in 1978, cycle plots offer a way to investigate time series data in a different way than conventional line charts.
  • Connected Scatter Plots - Think back to the Gapminder data visualization. Could you reveal the same patterns in the data over the years without animation? Alberto Cairo says "Yes!". Alberto praises connected scatter plots and shares examples of them on his blog, The Functional Art.
  • Violin Plots - Violin plots are similar to box plots, except that they show the probability density of the data at different values. Nathan Yau describes violin plots and other ways to visualize and compare distributions on his blog Flowing Data.
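
To get a feel for how small a sparkline can be, the sketch below renders one as a single line of Unicode block characters. This is a text approximation rather than a true sparkline graphic, and the `sparkline` helper and the metric readings are made up for illustration.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a sequence of numbers as a one-line string of block characters."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant series has no variation; draw a flat line.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    # Scale each value to an index between 0 and top.
    return "".join(BARS[round((v - lo) / span * top)] for v in values)


# Made-up metric readings.
print(sparkline([3, 5, 4, 8, 6, 2, 7]))
```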

Preattentive Processing

Try to count the number of times 9 appears in each of the two examples below. Which one is faster to count?
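
A version of this exercise can be reproduced in a terminal. The sketch below prints the same digits twice, once plain and once with every 9 colored. It assumes a terminal that understands ANSI escape codes, the digit string is made up, and using color as the distinguishing attribute is an assumption made here.

```python
RED_BOLD = "\033[1;31m"  # ANSI escape: bold red text
RESET = "\033[0m"        # ANSI escape: back to the default style


def highlight(digits: str, target: str) -> str:
    """Return digits with every occurrence of target wrapped in ANSI color codes."""
    return digits.replace(target, f"{RED_BOLD}{target}{RESET}")


# Made-up digit string for the exercise.
digits = "3184957209461593872945019"
print(digits)                  # plain: the 9s must be found by reading
print(highlight(digits, "9"))  # highlighted: the 9s stand out by color
print(digits.count("9"))       # the answer, for checking your count
```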

Negative Space

You can see either a vase or two people facing each other

The arrow reinforces the company's goal as a shipping service

Cholera Map

https://upload.wikimedia.org/wikipedia/commons/2/27/Snow-cholera-map-1.jpg

Careful with Colors

Correcting Colors

Gestalt Principles of Perception

Many of these principles play an important role in choosing visual encodings and creating a hierarchy of information in a graphic

Chartjunk

  • Important to decide what you will leave out
  • heavy or dark grid lines
  • unnecessary text
  • ornamented chart axes
  • pictures within graphs
  • shading or 3D perspective

Less is more

Data Ink Ratio

data-ink ratio = (ink used to describe the data) / (total ink used in the graphic)

  • A high data-ink ratio is considered good

  • Let's improve the data-ink ratio of a visualization

Compare it with the first one
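
As a worked example of the ratio, the sketch below follows Tufte's formulation, in which the denominator is the total ink used in the graphic, so the result always falls between 0 and 1. The `data_ink_ratio` helper is hypothetical and the ink amounts are made-up numbers in arbitrary units.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of a graphic's ink that is used to describe the data."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must be between 0 and total_ink")
    return data_ink / total_ink


# Made-up ink amounts (arbitrary units) before and after removing chartjunk.
before = data_ink_ratio(data_ink=30, total_ink=100)
after = data_ink_ratio(data_ink=30, total_ink=40)
print(before, after)
```

The data ink stays the same in both calls; only the non-data ink shrinks, which is what raises the ratio.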

Lie Factor

lie factor = (size of the effect shown in the graphic) / (size of the effect shown in the data)

A graph with 0.95 < lie factor < 1.05 has high integrity
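
The formula and threshold above can be checked with a few lines of arithmetic. This sketch assumes that the "size of the effect" is measured as a relative change between two values, which the text does not spell out. The function names and the numbers are made up for illustration.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from first to second."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of the effect shown in the graphic divided by that in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


def has_high_integrity(factor: float) -> bool:
    """True when the lie factor lies strictly between 0.95 and 1.05."""
    return 0.95 < factor < 1.05


# Made-up example: the data grows from 100 to 150 (a 50% increase),
# but the bars drawn for it grow from 2 cm to 5 cm (a 150% increase).
factor = lie_factor(graphic_first=2, graphic_second=5, data_first=100, data_second=150)
print(factor, has_high_integrity(factor))
```

Here the graphic exaggerates the change threefold, so it falls well outside the high-integrity range.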

Grammar of Graphics

  • Hadley Wickham's A Layered Grammar of Graphics
  • separating aesthetic (e.g. line) from the data itself
  • It is an extensive theory that has influenced the development of graphics and visualization libraries (including D3 and its precursors), but in this class you will focus on three of its key principles:
    • Separation of data from aesthetics
    • Definition of common plot/chart elements
    • Composition of these common elements
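
The three principles can be illustrated with a toy data model. This is a sketch only: the `Layer` and `Plot` classes are hypothetical names invented here, not the API of ggplot2, D3, or any other library, and nothing is actually drawn.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One common plot element plus its aesthetic mapping."""
    geom: str         # e.g. "point" or "line"
    aesthetics: dict  # aesthetic name -> data column name


@dataclass
class Plot:
    """A plot is data plus a composition of layers."""
    data: list
    layers: list = field(default_factory=list)

    def add(self, layer: Layer) -> "Plot":
        self.layers.append(layer)
        return self  # returning self lets layers be chained

    def resolve(self):
        """Look up each layer's aesthetics in the data, one dict per layer."""
        return [
            {
                "geom": layer.geom,
                **{aes: [row[col] for row in self.data]
                   for aes, col in layer.aesthetics.items()},
            }
            for layer in self.layers
        ]


# Made-up data; the same rows feed two different layers.
data = [{"year": 2020, "sales": 10}, {"year": 2021, "sales": 12}]
plot = (
    Plot(data)
    .add(Layer("line", {"x": "year", "y": "sales"}))
    .add(Layer("point", {"x": "year", "y": "sales"}))
)
for resolved in plot.resolve():
    print(resolved)
```

The data never changes when a layer is added or swapped; only the mapping from columns to aesthetics does, which is the separation the first principle describes.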