Employee Attrition

Employee attrition is a major challenge for any company. Retaining existing employees is generally easier than recruiting replacements in the open market, especially for a mid-market company.

Ms ABC, head of HR at XYZ, a mid-market company, has hired you as a data scientist to help her team predict which employees are at risk of leaving in the next month.

Her team has diligently gathered the data and shared it with you as a CSV file. The dataset has the following columns:

  • Satisfaction Level
  • Last evaluation
  • Number of projects
  • Average monthly hours
  • Time spent at the company
  • Whether they have had a work accident
  • Whether they have had a promotion in the last 5 years
  • Department (referred to as "sales" in the exploration steps below)
  • Salary
  • Whether the employee has left (the target variable to predict)

Can you help ABC and her team with this problem?

Load the libraries
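A possible set of imports for the steps that follow, assuming the exercise is done with pandas and scikit-learn (the notebook does not name its libraries, so this choice is an assumption):

```python
# Assumed toolkit: pandas/numpy for data handling, scikit-learn for modelling.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

print("pandas", pd.__version__, "| numpy", np.__version__)
```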


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Frame - find probability of attrition
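One way to read this framing is as binary classification: for each employee, estimate the probability that the target equals 1. A minimal sketch of the simplest such estimate, the overall attrition rate, using scikit-learn's DummyClassifier as a baseline. The labels below are made up for illustration.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Made-up target values: 1 = employee left, 0 = employee stayed.
y = np.array([1, 0, 0, 0, 1, 0, 0, 0])
X = np.zeros((len(y), 1))  # features are ignored by this baseline

# strategy="prior" always predicts the class frequencies seen in training.
baseline = DummyClassifier(strategy="prior").fit(X, y)
p_leave = baseline.predict_proba(X[:1])[0, 1]
print(f"Baseline probability of attrition: {p_leave:.2f}")
```

Any real model should produce probabilities that discriminate better than this constant baseline.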


In [ ]:


In [ ]:


In [ ]:

Acquire - load historical data
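A minimal sketch of loading the data with pandas.read_csv. The file name is not given here, so an in-memory sample stands in for it. The column names and the rows are hypothetical placeholders, not the real data.

```python
import io

import pandas as pd

# Hypothetical stand-in for the shared CSV; with the real file this would be
# pd.read_csv(<path to the csv>). Column names and values are assumptions.
sample_csv = io.StringIO(
    "satisfaction_level,last_evaluation,number_project,average_monthly_hours,"
    "time_spend_company,work_accident,promotion_last_5years,sales,salary,left\n"
    "0.38,0.53,2,157,3,0,0,sales,low,1\n"
    "0.80,0.86,5,262,6,0,0,technical,medium,0\n"
    "0.11,0.88,7,272,4,0,0,support,medium,1\n"
    "0.72,0.87,5,223,5,0,0,sales,high,0\n"
)

df = pd.read_csv(sample_csv)
print(df.shape)
print(df.dtypes)
```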


In [ ]:


In [ ]:


In [ ]:

Refine - check for NAs and outliers
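A small sketch of both checks, assuming missing values are counted per column and outliers are flagged with the common 1.5 x IQR rule (other rules are equally valid). The values below are made up, with one missing entry and one implausible value planted on purpose.

```python
import numpy as np
import pandas as pd

# Made-up values: one NaN and one implausible monthly-hours entry.
df = pd.DataFrame({
    "satisfaction_level": [0.38, 0.80, np.nan, 0.72, 0.41, 0.65],
    "average_monthly_hours": [157, 262, 272, 223, 159, 900],
})

# Missing values per column.
na_counts = df.isna().sum()


def iqr_outliers(series, k=1.5):
    """Flag values further than k * IQR outside the quartiles."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


outlier_mask = iqr_outliers(df["average_monthly_hours"])
print(na_counts)
print(df[outlier_mask])
```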


In [ ]:


In [ ]:


In [ ]:

Transform - encoding categorical variables
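A sketch of two common encodings, assuming salary is an ordered category with levels low, medium and high, and that the department column (called sales here) has no natural order. Both assumptions and all values are illustrative.

```python
import pandas as pd

# Made-up rows with the two categorical columns.
df = pd.DataFrame({
    "sales": ["sales", "technical", "support", "sales"],
    "salary": ["low", "medium", "high", "low"],
    "left": [1, 0, 0, 1],
})

# Ordinal encoding: salary levels are assumed to have a meaningful order.
salary_order = {"low": 0, "medium": 1, "high": 2}
encoded = df.assign(salary=df["salary"].map(salary_order))

# One-hot encoding: departments are treated as unordered.
encoded = pd.get_dummies(encoded, columns=["sales"], prefix="dept", dtype=int)
print(encoded)
```

Tree-based models can also work with integer-coded departments; one-hot encoding is used here only because it avoids implying an order.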


In [ ]:


In [ ]:


In [ ]:

Explore - shape of data

left
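A quick look at the target on its own, sketched on made-up labels: the class counts and the overall attrition rate, which shows how imbalanced the problem is.

```python
import pandas as pd

# Made-up target column: 1 = left, 0 = stayed.
df = pd.DataFrame({"left": [1, 0, 0, 1, 0, 0, 0, 0]})

class_counts = df["left"].value_counts()
attrition_rate = df["left"].mean()
print(class_counts)
print(f"Attrition rate: {attrition_rate:.1%}")
```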


In [ ]:


In [ ]:

sales, left
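A sketch of attrition by department, assuming the department column is named sales. The rows are made up for illustration.

```python
import pandas as pd

# Made-up rows: department and target.
df = pd.DataFrame({
    "sales": ["sales", "sales", "technical", "technical", "support", "support"],
    "left": [1, 0, 1, 1, 0, 0],
})

# Share of leavers and headcount per department.
by_dept = df.groupby("sales")["left"].agg(rate="mean", headcount="count")
print(by_dept.sort_values("rate", ascending=False))
```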


In [ ]:


In [ ]:

satisfaction level, average monthly hours, left
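One simple numeric view of these three variables is to compare group means for leavers and stayers; a scatter plot coloured by the target would be the visual equivalent. The column names and values below are assumptions.

```python
import pandas as pd

# Made-up rows for the two numeric features and the target.
df = pd.DataFrame({
    "satisfaction_level": [0.30, 0.80, 0.35, 0.90, 0.70, 0.20],
    "average_monthly_hours": [280, 160, 290, 150, 170, 300],
    "left": [1, 0, 1, 0, 0, 1],
})

# Mean satisfaction and hours for stayers (0) and leavers (1).
summary = df.groupby("left")[["satisfaction_level", "average_monthly_hours"]].mean()
print(summary)
```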


In [ ]:


In [ ]:

sales, salary, left
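A sketch of a two-way breakdown using a pivot table: the attrition rate for each department and salary band. The column names and rows are made up.

```python
import pandas as pd

# Made-up rows: department, salary band and target.
df = pd.DataFrame({
    "sales": ["sales", "sales", "technical", "technical", "support", "support"],
    "salary": ["low", "high", "low", "high", "low", "high"],
    "left": [1, 0, 1, 0, 0, 0],
})

# Rows are departments, columns are salary bands, cells are attrition rates.
pivot = df.pivot_table(index="sales", columns="salary", values="left", aggfunc="mean")
print(pivot)
```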


In [ ]:


In [ ]:

Model - Build a classifier

Decision Tree - 2 models (with max_depth=3 and max_depth=None)
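A minimal sketch of the two trees, fitted on synthetic data because the real file is not available here. The feature names and the rule that generates the labels are invented for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; the labelling rule is invented.
rng = np.random.default_rng(0)
n = 600
X = pd.DataFrame({
    "satisfaction_level": rng.uniform(0.1, 1.0, n),
    "average_monthly_hours": rng.integers(96, 311, n),
    "number_project": rng.integers(2, 8, n),
})
y = ((X["satisfaction_level"] < 0.4) | (X["average_monthly_hours"] > 270)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Model 1: shallow tree, easy to read. Model 2: grown until leaves are pure.
shallow_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
full_tree = DecisionTreeClassifier(max_depth=None, random_state=0).fit(X_train, y_train)

for name, tree in [("max_depth=3", shallow_tree), ("max_depth=None", full_tree)]:
    print(name, "depth:", tree.get_depth(), "test accuracy:", tree.score(X_test, y_test))
```

The unrestricted tree fits the training data perfectly by construction, so the comparison that matters is on held-out data.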


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Random Forest - 2 models (with n_estimators=10 and with oob)
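A sketch of the two forests on synthetic data. The second model uses n_estimators=100 together with oob_score=True; the tree count is an assumption, chosen because out-of-bag estimates are unreliable with very few trees.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the labelling rule is invented.
rng = np.random.default_rng(0)
n = 600
X = np.column_stack([rng.uniform(0.1, 1.0, n), rng.integers(96, 311, n)])
y = ((X[:, 0] < 0.4) | (X[:, 1] > 270)).astype(int)

# Model 1: a small forest of 10 trees.
rf_small = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Model 2: out-of-bag scoring evaluates each tree on the rows it did not see
# during bootstrapping, giving a validation estimate without a separate split.
rf_oob = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0).fit(X, y)

print("Trees in small forest:", len(rf_small.estimators_))
print("OOB accuracy:", rf_oob.oob_score_)
```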


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Model Selection - AUC and cross-validation
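A sketch of comparing candidate models by cross-validated AUC with cross_val_score and scoring="roc_auc". The data is synthetic, with 10% of the labels flipped so the scores are not trivially perfect. The candidate list and the 5-fold setting are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with label noise; the labelling rule is invented.
rng = np.random.default_rng(0)
n = 600
X = np.column_stack([rng.uniform(0.1, 1.0, n), rng.integers(96, 311, n)])
y = ((X[:, 0] < 0.4) | (X[:, 1] > 270)).astype(int)
flip = rng.random(n) < 0.1
y = np.where(flip, 1 - y, y)

candidates = {
    "tree_depth_3": DecisionTreeClassifier(max_depth=3, random_state=0),
    "tree_full": DecisionTreeClassifier(max_depth=None, random_state=0),
    "forest_10": RandomForestClassifier(n_estimators=10, random_state=0),
}

# 5-fold cross-validated AUC for each candidate.
results = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    for name, model in candidates.items()
}
mean_auc = {name: scores.mean() for name, scores in results.items()}
best_name = max(mean_auc, key=mean_auc.get)

for name, score in mean_auc.items():
    print(f"{name}: mean AUC = {score:.3f}")
print("Selected:", best_name)
```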


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Build - the ML API
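The notebook does not say which framework the API uses, so this sketch stays framework-neutral: a fitted model is serialised with pickle and wrapped in a function that takes a JSON request and returns a JSON response. The feature list, the function name predict_attrition and the response field attrition_probability are hypothetical.

```python
import json
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["satisfaction_level", "average_monthly_hours"]  # assumed feature order

# Toy model on synthetic data, standing in for the selected model.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(0.1, 1.0, 300), rng.integers(96, 311, 300)])
y = ((X[:, 0] < 0.4) | (X[:, 1] > 270)).astype(int)
model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Serialise the fitted model so a separate service process could load it.
model_bytes = pickle.dumps(model)
loaded_model = pickle.loads(model_bytes)


def predict_attrition(payload_json, model=loaded_model):
    """Turn a JSON record into a JSON response with the attrition probability."""
    record = json.loads(payload_json)
    row = [[float(record[name]) for name in FEATURES]]
    probability = model.predict_proba(row)[0, 1]
    return json.dumps({"attrition_probability": round(float(probability), 4)})


response = predict_attrition('{"satisfaction_level": 0.2, "average_monthly_hours": 280}')
print(response)
```

Keeping the prediction logic in a plain function makes it easy to test before any web framework is involved. Only unpickle model files from a trusted source.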


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Deploy - the ML API
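A minimal local deployment sketch using only the standard library's http.server. A production setup would more likely use a dedicated web framework and hosting service, neither of which is named here. The /predict route, the handler name and the single-feature toy model are assumptions.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy single-feature model standing in for the selected model.
rng = np.random.default_rng(0)
X = rng.uniform(0.1, 1.0, size=(200, 1))
y = (X[:, 0] < 0.4).astype(int)
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)


class PredictHandler(BaseHTTPRequestHandler):
    """Answers POST /predict with the predicted probability of attrition."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        proba = model.predict_proba([[payload["satisfaction_level"]]])[0, 1]
        body = json.dumps({"attrition_probability": float(proba)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the notebook output quiet


# Port 0 asks the operating system for any free port.
server = ThreadingHTTPServer(("127.0.0.1", 0), PredictHandler)
port = server.server_address[1]
worker = threading.Thread(target=server.serve_forever, daemon=True)
worker.start()
print(f"Serving on http://127.0.0.1:{port}/predict")

# A real deployment would keep running; stop here so the sketch exits cleanly.
server.shutdown()
server.server_close()
worker.join(timeout=5)
```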


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Interact - get prediction using API
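A sketch of a client that posts one employee record as JSON and reads back the prediction, using urllib from the standard library. Because no deployed endpoint is available here, a stub server that returns a fixed, made-up probability stands in for the API. The URL, route, field names and the helper get_prediction are hypothetical.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Stand-in for the deployed API: always returns a fixed, made-up value."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the request body
        body = json.dumps({"attrition_probability": 0.5}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the notebook output quiet


def get_prediction(url, features, timeout=5):
    """POST the features as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any system proxy for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


server = ThreadingHTTPServer(("127.0.0.1", 0), StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/predict"

result = get_prediction(url, {"satisfaction_level": 0.3, "average_monthly_hours": 280})
print(result)

server.shutdown()
server.server_close()
```

Against a real deployment, only the url value would change; the stub server would be dropped.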


In [ ]:


In [ ]:


In [ ]:


In [ ]: