Theory and Practice of Visualization

Objective: describe the theory and practice of effective, beautiful visualizations.

This notebook describes some of the considerations that go into creating visualizations that are meant to communicate quantitative information.

Tufte

Edward Tufte has written a number of excellent books on these topics. The first, and perhaps most significant, of these is The Visual Display of Quantitative Information. This notebook is mostly a summary of the main points of that book. His website also has excellent discussions of these and other topics.

Graphical excellence

On page 13:

Graphical displays should:

  • show the data
  • induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production, or something else
  • avoid distorting what the data have to say
  • present many numbers in a small space
  • make large data sets coherent
  • encourage the eye to compare different pieces of data
  • reveal the data at several levels of detail, from a broad overview to the fine structure
  • serve a reasonably clear purpose: description, exploration, tabulation, or decoration
  • be closely integrated with the statistical and verbal descriptions of a data set.

On page 51:

Graphical excellence...

  • is the well-designed presentation of interesting data--a matter of substance, of statistics, and of design.
  • consists of complex ideas communicated with clarity, precision, and efficiency.
  • is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.
  • is nearly always multivariate.
  • requires telling the truth about the data.

Graphical integrity

On page 57:

Lie Factor = (size of effect shown in graphic)/(size of effect in data)
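
As a rough illustration of the formula above, the sketch below computes a Lie Factor in Python. It assumes that "size of effect" means the relative change between two values, and the function names (effect_size, lie_factor) and the numbers are hypothetical, chosen only for illustration. A result near 1 suggests the graphic represents the data faithfully; larger values suggest exaggeration.

```python
def effect_size(first, second):
    """Relative change from the first value to the second (an assumed definition)."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Ratio of the effect shown in the graphic to the effect in the data."""
    data_effect = effect_size(data_first, data_second)
    if data_effect == 0:
        raise ValueError("the data show no effect, so the ratio is undefined")
    return effect_size(graphic_first, graphic_second) / data_effect


# Hypothetical numbers: the data rise from 20 to 30 (a 50% increase),
# while the bars drawn for them grow from 1.0 to 4.0 inches (a 300% increase).
factor = lie_factor(graphic_first=1.0, graphic_second=4.0,
                    data_first=20.0, data_second=30.0)
print(round(factor, 2))  # 6.0
```

Here the graphic overstates the change in the data by a factor of six.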

On page 77:

Graphical integrity is more likely to result if these principles are followed:

  • The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented.
  • Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data.
  • Show data variation, not design variation.
  • ...
  • The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data.
  • Graphics must not quote the data out of context.
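
One way to see why the number of depicted dimensions matters is to scale a two-dimensional icon by a one-dimensional quantity. The sketch below is a hypothetical illustration, not taken from the book: it assumes viewers read the icon's area as the quantity shown, and that "size of effect" means relative change. The function name is made up for this example.

```python
def lie_factor_for_scaled_icon(data_first, data_second):
    """Lie Factor when an icon's width AND height are both scaled by the data ratio.

    Assumes the viewer perceives the icon's area as the value being shown.
    """
    if data_first == 0:
        raise ValueError("first value must be non-zero")
    ratio = data_second / data_first
    if ratio == 1:
        raise ValueError("the data show no effect, so the ratio is undefined")
    data_effect = abs(ratio - 1.0)       # relative change in the data
    area_effect = abs(ratio ** 2 - 1.0)  # relative change in the icon's area
    return area_effect / data_effect


# Hypothetical numbers: the data double from 100 to 200,
# so an icon scaled in both directions covers four times the area.
print(lie_factor_for_scaled_icon(100.0, 200.0))  # 3.0
```

Under these assumptions, the distortion grows with the size of the change: the larger the ratio between the two values, the larger the Lie Factor.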

Data-ink

On page 92:

Data-ink ratio = (data-ink)/(total ink used to print the graphic) = proportion of a graphic's ink devoted to the non-redundant display of data-information
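
The ratio can be sketched numerically if ink is approximated by some measurable quantity. The example below assumes ink is counted in pixels, which is only one possible proxy, and the pixel counts and the function name data_ink_ratio are hypothetical.

```python
def data_ink_ratio(data_ink, total_ink):
    """Proportion of a graphic's ink devoted to displaying data."""
    if total_ink <= 0:
        raise ValueError("total ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data ink must lie between 0 and total ink")
    return data_ink / total_ink


# Hypothetical pixel counts for the same chart before and after
# erasing non-data ink such as heavy gridlines and a surrounding frame.
before = data_ink_ratio(data_ink=1200, total_ink=4800)
after = data_ink_ratio(data_ink=1200, total_ink=1600)
print(before, after)  # 0.25 0.75
```

The data ink is unchanged between the two versions; only the non-data ink was removed, which is what raises the ratio.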

On page 105:

  • Above all else show the data.
  • Maximize the data-ink ratio, within reason.
  • Erase non-data-ink, within reason.
  • Erase redundant data-ink, within reason.
  • Revise and edit.

Chartjunk

On page 121:

Forgo chartjunk, including moiré vibration, the grid, and the duck.

Multifunctioning graphical elements

On page 139:

Mobilize every graphical element, perhaps several times over, to show the data.