Bank Marketing

The Frame, Acquire, Explore, and Refine steps are the same as in the previous notebook.

Exercise

Load the train and test datasets.
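
One possible way to approach the loading step is sketched below. The notebook does not say where the splits are stored or how they are delimited, so the helper name `load_bank`, the temporary stand-in file, and the comma separator are all assumptions. The point of the sketch is `stringsAsFactors = TRUE`: since R 4.0 `read.csv` leaves text columns as character by default, while a classification tree needs `deposit` to be a factor.

```r
# Hypothetical helper: read one split and convert text columns to factors.
load_bank <- function(path) {
  read.csv(path, stringsAsFactors = TRUE)
}

# Stand-in file built from the first two rows of the head(train) output
# in this notebook; replace demo_path with the real train/test locations.
demo_path <- tempfile(fileext = ".csv")
writeLines(c(
  "age,job,marital,education,default,balance,housing,loan,contact,day,month,duration,campaign,pdays,previous,poutcome,deposit",
  "58,management,married,tertiary,no,2143,yes,no,unknown,5,may,261,1,-1,0,unknown,no",
  "44,technician,single,secondary,no,29,yes,no,unknown,5,may,151,1,-1,0,unknown,no"
), demo_path)

demo <- load_bank(demo_path)
str(demo)  # text columns such as job and deposit are read as factors
```

Calling the same helper once per split would give the `train` and `test` data frames used in the rest of the notebook.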


In [ ]:


In [1]:


In [2]:
head(train)


Out[2]:
  age          job marital education default balance housing loan contact day month duration campaign pdays previous poutcome deposit
1  58   management married  tertiary      no    2143     yes   no unknown   5   may      261        1    -1        0  unknown      no
2  44   technician  single secondary      no      29     yes   no unknown   5   may      151        1    -1        0  unknown      no
3  33 entrepreneur married secondary      no       2     yes  yes unknown   5   may       76        1    -1        0  unknown      no
4  47  blue-collar married   unknown      no    1506     yes   no unknown   5   may       92        1    -1        0  unknown      no
5  33      unknown  single   unknown      no       1      no   no unknown   5   may      198        1    -1        0  unknown      no
6  28   management  single  tertiary      no     447     yes  yes unknown   5   may      217        1    -1        0  unknown      no

In [3]:
head(test)


Out[3]:
  age           job  marital education default balance housing loan  contact day month duration campaign pdays previous poutcome deposit
1  38 self-employed   single secondary      no     677     yes   no cellular  14   may      114        2    -1        0  unknown      no
2  58   blue-collar  married   primary      no    5445     yes   no cellular  14   apr      391        1    -1        0  unknown      no
3  55       retired  married secondary      no       5      no   no  unknown  20   jun      108        1    -1        0  unknown      no
4  26    management   single secondary      no      63      no   no cellular  28   jul       76        4    -1        0  unknown      no
5  48    technician divorced  tertiary      no     907      no  yes cellular   4   aug      103        1    -1        0  unknown      no
6  33    technician  married  tertiary      no     525     yes  yes  unknown  28   may      139        1    -1        0  unknown      no

5. Model

Decision Tree Model


In [6]:
library(party)


Loading required package: grid
Loading required package: mvtnorm
Loading required package: modeltools
Loading required package: stats4
Loading required package: strucchange
Loading required package: zoo

Attaching package: ‘zoo’

The following objects are masked from ‘package:base’:

    as.Date, as.Date.numeric

Loading required package: sandwich

In [16]:
model_DT <- ctree(deposit ~., data=train, controls=ctree_control(maxdepth=2))

In [17]:
print(model_DT)


	 Conditional inference tree with 4 terminal nodes

Response:  deposit 
Inputs:  age, job, marital, education, default, balance, housing, loan, contact, day, month, duration, campaign, pdays, previous, poutcome 
Number of observations:  35211 

1) duration <= 472; criterion = 1, statistic = 5454.5
  2) poutcome == {success}; criterion = 1, statistic = 4525.232
    3)*  weights = 975 
  2) poutcome == {failure, other, unknown}
    4)*  weights = 29606 
1) duration > 472
  5) duration <= 800; criterion = 1, statistic = 174.671
    6)*  weights = 3119 
  5) duration > 800
    7)*  weights = 1511 
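
The printed tree can be read as four nested rules, one per terminal node. As a minimal sketch, the splits above are written out by hand below to show which terminal node a record falls into. The function name `terminal_node` is hypothetical and not part of party; it only mirrors the thresholds printed by `print(model_DT)`.

```r
# Hand-written copy of the splits printed by print(model_DT).
# terminal_node() is a hypothetical helper for illustration only.
terminal_node <- function(duration, poutcome) {
  ifelse(duration <= 472,
         ifelse(poutcome == "success", 3L, 4L),  # nodes 3 and 4
         ifelse(duration <= 800, 6L, 7L))        # nodes 6 and 7
}

# First row of head(train): duration 261, poutcome "unknown"
terminal_node(261, "unknown")  # node 4
```

The node IDs reported by party itself, for example with `predict(model_DT, test, type = "node")`, should agree with this hand-written version if the rules were copied correctly.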

In [18]:
plot(model_DT)



In [19]:
plot(model_DT, type="simple")



In [20]:
testPrediction <- predict(model_DT, test, type="response")
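
With `type="response"` the predictions are class labels, so they can be cross-tabulated against the true labels. The sketch below uses made-up stand-in vectors so that it runs on its own; in the notebook, `actual` would be `test$deposit` and `predicted` would be `testPrediction`.

```r
# Made-up stand-in labels, for illustration only.
actual    <- factor(c("no", "no", "yes", "yes", "no", "yes"), levels = c("no", "yes"))
predicted <- factor(c("no", "yes", "yes", "no", "no", "yes"), levels = c("no", "yes"))

# Rows are true classes, columns are predicted classes.
confusion <- table(actual = actual, predicted = predicted)
print(confusion)

# Accuracy is the share of observations on the diagonal.
accuracy <- sum(diag(confusion)) / sum(confusion)
accuracy
```

Fixing the factor levels on both vectors keeps the table square even when the model never predicts one of the classes.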

Exercise

Compute the error and accuracy metrics:

  1. Precision
  2. Recall
  3. AUC
  4. TPR and FPR
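
As a starting point for this exercise, the sketch below computes the listed metrics in base R, treating "yes" as the positive class. The helper name `binary_metrics` and the toy data are made up for illustration. The `score` argument is assumed to be a predicted probability of "yes", which is needed for AUC; how to extract it from `model_DT` is not shown here.

```r
# Hypothetical helper: binary metrics with "yes" as the positive class.
# `score` is assumed to be a predicted probability of the positive class.
binary_metrics <- function(actual, predicted, score, positive = "yes") {
  is_pos   <- actual == positive
  pred_pos <- predicted == positive

  tp <- sum(is_pos & pred_pos)    # true positives
  fp <- sum(!is_pos & pred_pos)   # false positives
  fn <- sum(is_pos & !pred_pos)   # false negatives
  tn <- sum(!is_pos & !pred_pos)  # true negatives

  # AUC via the rank-sum (Mann-Whitney) identity; ties get average ranks.
  r     <- rank(score)
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  auc   <- (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

  c(precision = tp / (tp + fp),
    recall    = tp / (tp + fn),  # recall is the same quantity as TPR
    fpr       = fp / (fp + tn),
    auc       = auc)
}

# Made-up stand-in data, for illustration only.
actual    <- c("no", "no", "yes", "yes", "no", "yes")
score     <- c(0.1, 0.6, 0.8, 0.4, 0.2, 0.9)
predicted <- ifelse(score >= 0.5, "yes", "no")

binary_metrics(actual, predicted, score)
```

A tree with four terminal nodes can produce at most four distinct scores, so many observations will be tied; the average ranks used above handle such ties.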

In [ ]:

Exercise

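Now set maxdepth to 6 and measure the accuracy metrics again.

Then set maxdepth to 0 and measure them once more. In party's ctree_control, maxdepth = 0 is the default and places no restriction on the depth of the tree.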


In [ ]: