Treemaps are somewhat controversial in data visualization circles. According to Tufte's commandments, data graphics must, among other things:
The question many people arrive at with treemaps is: do they?
Let's explore.
If you haven't already, activate your dataviz environment and open Jupyter Notebook. You'll need to run this command at the top:
install.packages('treemap', repos='http://cran.us.r-project.org')
If you get an error message, close Jupyter, shut down the server in your terminal, and run this command: conda install r-essentials, which will update all the libraries. I needed to do this on a Mac; on a PC, the install worked without it.
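If you rerun the notebook often, you may not want to reinstall the package every time. Here is a minimal sketch of one way to install only when the package is missing. The helper name ensure_package is made up for this example; requireNamespace and install.packages are base R functions, and the repos URL is the one from the command above.

```r
# Hypothetical helper (not part of treemap): install a package only if
# it is not already available, then report whether it can be loaded.
ensure_package <- function(pkg, repos = "http://cran.us.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # TRUE if the package is now available, FALSE otherwise.
  invisible(requireNamespace(pkg, quietly = TRUE))
}
```

Calling ensure_package("treemap") at the top of the notebook would then skip the download whenever the package is already installed.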
For our example, we'll make a Treemap of our college enrollment data. The goal is to show which colleges are bigger than others, as well as the majors within them.
In [1]:
library(treemap)
In [2]:
enrollment <- read.csv("../../Data/collegeenrollment.csv")
In [3]:
head(enrollment)
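Before plotting, it can help to confirm that the size column is really numeric, since read.csv will read a column as text if its values contain stray characters such as thousands separators. The sketch below uses a small made-up data frame in place of enrollment. The column names match the ones used in this walkthrough, but the rows are invented for illustration and the real file may not need this step at all.

```r
# Hypothetical stand-in for the enrollment data; all values are invented.
toy <- data.frame(
  College = c("Arts", "Arts", "Engineering"),
  MajorName = c("History", "English", "Civil Engineering"),
  Total = c("1,200", "950", "780"),  # stored as text because of the comma
  stringsAsFactors = FALSE
)

# Check whether the size column is numeric before converting.
was_numeric <- is.numeric(toy$Total)

# Strip the thousands separators and convert the column to numbers.
toy$Total <- as.numeric(gsub(",", "", toy$Total))

# Check again after converting.
now_numeric <- is.numeric(toy$Total)
```

Running is.numeric(enrollment$Total) on the real data is the quick version of the same check.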
In [4]:
treemap(enrollment,
    index=c("College","MajorName"), # A vector of grouping variables: ORDER MATTERS.
    vSize="Total", # This determines the size, so it must be a numeric column.
    title="Majors at UNL, 2017", # Customize the title
    fontsize.title=24, # Change the font size of the title
    fontsize.labels=c(15,7), # Size of labels, one value per index level
    fontcolor.labels=c("white","black"), # Color of labels, one value per index level
    fontface.labels=c(2,1), # Font of labels: 1,2,3,4 for normal, bold, italic, bold-italic
    bg.labels=c("transparent"), # Background color of labels
    align.labels=list(
        c("left", "top"),
        c("right", "bottom")
    ), # Where to place labels in the rectangle, one pair per index level
    overlap.labels=0.5, # Number between 0 and 1: how much overlap between labels is tolerated
    inflate.labels=FALSE # If TRUE, labels are bigger when the rectangle is bigger
)
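One way to judge the treemap is to compare it against the numbers it encodes. With the default settings, each college's rectangle should cover roughly the same share of the plot as its share of total enrollment. The sketch below computes those shares on a small made-up data frame. The column names follow the ones used above, but the rows are invented and are not UNL figures.

```r
# Hypothetical stand-in for the enrollment data; all values are invented.
toy <- data.frame(
  College = c("Arts", "Arts", "Engineering", "Engineering", "Business"),
  MajorName = c("History", "English", "Civil", "Mechanical", "Finance"),
  Total = c(300, 450, 500, 700, 900)
)

# Sum enrollment by college; these sums drive the top-level rectangle sizes.
college_totals <- aggregate(Total ~ College, data = toy, FUN = sum)

# Sort from largest to smallest college.
college_totals <- college_totals[order(-college_totals$Total), ]

# Each college's share of all students.
college_totals$Share <- college_totals$Total / sum(college_totals$Total)

print(college_totals)
```

Swapping toy for enrollment gives a table to read alongside the plot when discussing whether the rectangles make the comparison easy or hard.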
Discussion: Does this treemap satisfy Tufte's commandments?