• Summary

  • load the package: require(tree)

  • train the model: model <- tree(formula = ..., data = ...)
  • summary: summary(model)
  • draw the tree:
    • plot(model)
    • annotate: text(model, pretty = 0)
  • print out details: just type the model object's name
  • make prediction: tree.pred <- predict(tree.carseats, CarseatsNew[-train,], type = "class")
  • prune trees: prune.carseats <- prune.misclass(tree.carseats, best = 13)
  • CV: cv.carseats <- cv.tree(tree.carseats, FUN = prune.misclass)

    • printing the CV object shows the candidate tree sizes, their deviance, and so on.
  • Question

  • For the default settings of tree(), what is the stopping criterion? Some terminal nodes contain only 5 observations while others contain 26.
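On the question above: in the tree package, splitting is governed by tree.control(), so the uneven leaf sizes follow from its defaults. A minimal sketch (the names ctl, d, and fit are mine, not from the notebook):

```r
library(tree)
library(ISLR)

# A split is attempted only when a node clears all three tree.control()
# thresholds: minsize = 10 (a node must hold at least 10 observations),
# mincut = 5 (each child must receive at least 5), and mindev = 0.01
# (the node's deviance must be at least 1% of the root deviance).
ctl <- tree.control(nobs = nrow(Carseats),
                    mincut = 5, minsize = 10, mindev = 0.01)

# Passing these defaults explicitly reproduces the default fit; mincut
# explains the 5-observation leaves, while mindev can halt splitting much
# earlier, leaving larger leaves such as the 26-observation one.
d <- transform(Carseats, High = as.factor(ifelse(Sales <= 8, "No", "Yes")))
fit <- tree(High ~ . - Sales, data = d, control = ctl)
summary(fit)
```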


In [ ]:
require(tree)
require(ISLR)

In [ ]:
require(tidyverse)

In [ ]:
summary(Carseats)

In [ ]:
ggplot(data = Carseats) +
    geom_histogram(mapping = aes(x = Sales), binwidth = 2)

In [ ]:
# tree() needs a factor response for classification; since R 4.0,
# character vectors are no longer converted to factors automatically.
High <- as.factor(ifelse(Carseats$Sales <= 8, "No", "Yes"))

In [ ]:
CarseatsNew <- transform(Carseats, High = High)

In [ ]:
tree.carseats <- tree(data = CarseatsNew, formula =  High ~ . - Sales)

In [ ]:
summary(tree.carseats)

In [ ]:
plot(tree.carseats)
text(tree.carseats, pretty = 0)

In [ ]:
tree.carseats

Part 2: split into training and test sets


In [ ]:
set.seed(1011)
train <- sample(1:nrow(Carseats), 250)
# head(train)
tree.carseats <- tree(High ~ . - Sales, CarseatsNew, subset = train)

In [ ]:
plot(tree.carseats)
text(tree.carseats, pretty = 0)

In [ ]:
tree.pred <- predict(tree.carseats, CarseatsNew[-train,], type = "class")
tree.pred

In [ ]:
with(CarseatsNew[-train,], table(tree.pred, High))
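A single accuracy number is often easier to track than the full confusion table. A self-contained sketch that rebuilds the split above (the name accuracy is mine):

```r
library(tree)
library(ISLR)

CarseatsNew <- transform(Carseats,
                         High = as.factor(ifelse(Sales <= 8, "No", "Yes")))
set.seed(1011)
train <- sample(1:nrow(CarseatsNew), 250)
fit <- tree(High ~ . - Sales, CarseatsNew, subset = train)
tree.pred <- predict(fit, CarseatsNew[-train, ], type = "class")

# Accuracy is the sum of the confusion table's diagonal over its total,
# i.e. the fraction of held-out rows classified correctly. The exact
# value depends on the seed and the 250-row split.
accuracy <- mean(tree.pred == CarseatsNew$High[-train])
accuracy
```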

Part 3: cross-validation


In [ ]:
cv.carseats <- cv.tree(tree.carseats, FUN = prune.misclass)

In [ ]:
cv.carseats

In [ ]:
plot(cv.carseats)
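Rather than hard-coding best = 13, the size that minimizes the CV misclassification count can be read off the cv.tree object (a sketch; cv.tree's fold assignment is random, so best.size can vary between runs):

```r
library(tree)
library(ISLR)

CarseatsNew <- transform(Carseats,
                         High = as.factor(ifelse(Sales <= 8, "No", "Yes")))
set.seed(1011)
train <- sample(1:nrow(CarseatsNew), 250)
fit <- tree(High ~ . - Sales, CarseatsNew, subset = train)

# With FUN = prune.misclass, $dev holds the CV misclassification counts,
# one entry per candidate tree size in $size.
cv.fit <- cv.tree(fit, FUN = prune.misclass)
best.size <- cv.fit$size[which.min(cv.fit$dev)]
best.size
```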

In [ ]:
prune.carseats <- prune.misclass(tree.carseats, best = 13)
plot(prune.carseats)
text(prune.carseats, pretty = 0)

In [ ]:
tree.pred <- predict(prune.carseats, CarseatsNew[-train,], type = "class")
with(CarseatsNew[-train,], table(tree.pred, High))
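To check what pruning cost, the full and pruned trees can be scored on the same test set (a sketch; best = 13 is carried over from the cell above, and the acc helper is mine):

```r
library(tree)
library(ISLR)

CarseatsNew <- transform(Carseats,
                         High = as.factor(ifelse(Sales <= 8, "No", "Yes")))
set.seed(1011)
train <- sample(1:nrow(CarseatsNew), 250)
test <- CarseatsNew[-train, ]

full <- tree(High ~ . - Sales, CarseatsNew, subset = train)
pruned <- prune.misclass(full, best = 13)

# Score each model on the held-out rows; a small drop (or none) for the
# pruned tree suggests the extra splits were fitting noise.
acc <- function(model) mean(predict(model, test, type = "class") == test$High)
res <- c(full = acc(full), pruned = acc(pruned))
res
```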