In [48]:
# Evaluate algorithms

In [46]:
# 10-fold cross-validation

In [47]:
# metric is an argument of train(), not trainControl(), so define it separately
metric<-"Accuracy"
cv<-trainControl(method="cv",number=10)
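
To make concrete what method="cv" with number=10 asks caret to do, here is a minimal sketch of 10-fold cross-validation written by hand. It is an illustration under assumptions, not caret's implementation: it uses the full built-in iris data rather than the notebook's train split, assigns folds at random without stratifying by class, and fits LDA with MASS::lda. The names k, fold_id and acc are made up for this example.

```r
library(MASS)  # provides lda(); MASS is one of R's recommended packages

set.seed(7)
k <- 10
# Assign each row to one of k folds at random (unstratified, unlike caret)
fold_id <- sample(rep(seq_len(k), length.out = nrow(iris)))

acc <- numeric(k)
for (i in seq_len(k)) {
  held_out <- iris[fold_id == i, ]                       # fold i is the validation set
  fit <- lda(Species ~ ., data = iris[fold_id != i, ])   # train on the other k-1 folds
  pred <- predict(fit, held_out)$class
  acc[i] <- mean(pred == held_out$Species)               # accuracy on the held-out fold
}

# Mean and standard deviation of the per-fold accuracies
c(mean = mean(acc), sd = sd(acc))
```

The mean and standard deviation computed at the end correspond in spirit to the Accuracy and Accuracy SD columns that print() reports for a fitted caret model.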

In [49]:
# Build models

In [50]:
# a) Linear algorithms: LDA

In [51]:
set.seed(7)

In [53]:
fit.lda<-train(Species~.,data=train,method="lda",metric=metric,trControl=cv)

In [54]:
#CART

In [55]:
set.seed(7)

In [56]:
fit.cart<-train(Species~.,data=train,method="rpart",metric=metric,trControl=cv)


Loading required package: rpart

In [57]:
#KNN

In [58]:
fit.knn<-train(Species~.,data=train,method="knn",metric=metric,trControl=cv)

In [60]:
#SVM

In [61]:
fit.svm<-train(Species~.,data=train,method="svmRadial",metric=metric,trControl=cv)


Loading required package: kernlab

Attaching package: ‘kernlab’

The following object is masked from ‘package:ggplot2’:

    alpha


In [62]:
# Random forest

In [63]:
fit.rf<-train(Species~.,data=train,method="rf",metric=metric,trControl=cv)


Loading required package: randomForest
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.

Attaching package: ‘randomForest’

The following object is masked from ‘package:ggplot2’:

    margin


In [64]:
results<-resamples(list(lda=fit.lda,cart=fit.cart,knn=fit.knn,svm=fit.svm,rf=fit.rf))  # collect the resampling results and check that the models are comparable
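
Paired comparisons between models (for example with diff() on a resamples object) are most meaningful when every model was evaluated on the same folds. In this notebook set.seed(7) is called before the LDA and CART fits only, so the KNN, SVM and random forest fits may have used different fold assignments. The sketch below uses plain base R to show why re-seeding before each train() call matters. make_folds is a hypothetical helper written for this illustration, not a caret function, and caret builds its own stratified folds.

```r
# Hypothetical helper: random assignment of n rows to k folds
make_folds <- function(n, k) sample(rep(seq_len(k), length.out = n))

n <- 120  # number of training rows reported by print(fit.lda)

set.seed(7)
folds_a <- make_folds(n, 10)

set.seed(7)                    # re-seeding restores the same random stream
folds_b <- make_folds(n, 10)

folds_c <- make_folds(n, 10)   # no re-seed: drawn from wherever the stream is now

identical(folds_a, folds_b)    # TRUE: same seed, same folds
identical(folds_a, folds_c)    # in practice FALSE: the folds differ
```

Calling set.seed(7) immediately before each train() call would apply the same idea to the KNN, SVM and random forest fits.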

In [65]:
summary(results)


Out[65]:
Call:
summary.resamples(object = results)

Models: lda, cart, knn, svm, rf 
Number of resamples: 10 

Accuracy 
       Min. 1st Qu. Median   Mean 3rd Qu. Max. NA's
lda  0.9091  0.9375 1.0000 0.9742       1    1    0
cart 0.8182  0.8365 0.9167 0.9149       1    1    0
knn  0.8333  0.9423 1.0000 0.9673       1    1    0
svm  0.8333  0.9167 0.9583 0.9417       1    1    0
rf   0.8333  0.9167 1.0000 0.9513       1    1    0

Kappa 
       Min. 1st Qu. Median   Mean 3rd Qu. Max. NA's
lda  0.8642  0.9062 1.0000 0.9614       1    1    0
cart 0.7250  0.7545 0.8750 0.8725       1    1    0
knn  0.7500  0.9122 1.0000 0.9508       1    1    0
svm  0.7447  0.8740 0.9375 0.9118       1    1    0
rf   0.7500  0.8750 1.0000 0.9268       1    1    0

In [66]:
dotplot(results)



In [67]:
print(fit.lda)  # LDA cross-validated accuracy is about 97% +/- 4% (mean +/- SD over the 10 folds)


Linear Discriminant Analysis 

120 samples
  4 predictor
  3 classes: 'setosa', 'versicolor', 'virginica' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 108, 109, 108, 107, 109, 109, ... 
Resampling results

  Accuracy   Kappa      Accuracy SD  Kappa SD  
  0.9742424  0.9614198  0.04152486   0.06218963

 

In [68]:
# Make predictions on the test set

In [69]:
predictions<-predict(fit.lda,test)

In [70]:
confusionMatrix(predictions,test$Species)  # confusion matrix for assessing accuracy


Out[70]:
Confusion Matrix and Statistics

            Reference
Prediction   setosa versicolor virginica
  setosa          8          0         0
  versicolor      0          9         0
  virginica       0          0        13

Overall Statistics
                                     
               Accuracy : 1          
                 95% CI : (0.8843, 1)
    No Information Rate : 0.4333     
    P-Value [Acc > NIR] : 1.273e-11  
                                     
                  Kappa : 1          
 Mcnemar's Test P-Value : NA         

Statistics by Class:

                     Class: setosa Class: versicolor Class: virginica
Sensitivity                 1.0000               1.0           1.0000
Specificity                 1.0000               1.0           1.0000
Pos Pred Value              1.0000               1.0           1.0000
Neg Pred Value              1.0000               1.0           1.0000
Prevalence                  0.2667               0.3           0.4333
Detection Rate              0.2667               0.3           0.4333
Detection Prevalence        0.2667               0.3           0.4333
Balanced Accuracy           1.0000               1.0           1.0000
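
The overall statistics above can be recomputed by hand from the confusion matrix counts. The following is a minimal base R sketch, assuming (as the caret documentation describes) that the confidence interval and the test against the no-information rate come from an exact binomial test. The variable names are chosen for this example only.

```r
# Counts copied from the confusion matrix above (rows = prediction, columns = reference)
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(8, 0,  0,
               0, 9,  0,
               0, 0, 13),
             nrow = 3, byrow = TRUE,
             dimnames = list(Prediction = classes, Reference = classes))

n_total   <- sum(cm)        # 30 test samples
n_correct <- sum(diag(cm))  # correct predictions sit on the diagonal
accuracy  <- n_correct / n_total

# No-information rate: the share of the largest reference class (virginica, 13/30)
nir <- max(colSums(cm)) / n_total

# Exact (Clopper-Pearson) 95% confidence interval for the accuracy
ci <- binom.test(n_correct, n_total)$conf.int

# One-sided test of whether the accuracy exceeds the no-information rate
p_value <- binom.test(n_correct, n_total, p = nir, alternative = "greater")$p.value

print(c(accuracy = accuracy, nir = nir, ci_lower = ci[1], ci_upper = ci[2]))
print(p_value)
```

With only 30 test samples, a perfect score still leaves a lower confidence bound of about 0.88, so the test-set accuracy of 1 is consistent with the cross-validated estimate of roughly 0.97.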
