Logistic - Explore the Data

Raw Data

You are provided with the following data: loan_data.csv
This is the historical data that the bank has provided. It has the following columns:

Application Attributes:

  • years: Number of years the applicant has been employed
  • ownership: Whether the applicant owns a house or not
  • income: Annual income of the applicant
  • age: Age of the applicant

Behavioural Attributes:

  • grade: Credit grade of the applicant

Outcome Variable:

  • amount: Amount of the loan provided to the applicant
  • interest: Interest rate charged to the applicant
  • default: Whether the applicant has defaulted or not

Let us build some intuition around the Loan Data

Frame the Problem

  • What are the features?
  • What is the target?
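
One possible way to frame the problem in code is sketched below. It assumes that default is the target and that every other column is a candidate feature; the notebook itself does not fix this choice. The names sample, features and target are illustrative, and the rows are the five shown by df.head() in this notebook, retyped by hand so the sketch runs without the CSV file.

```python
import pandas as pd

# The five rows shown by df.head() in this notebook, retyped by hand.
sample = pd.DataFrame({
    "default": [0, 0, 0, 0, 0],
    "amount": [5000, 2400, 10000, 5000, 3000],
    "interest": [10.65, 10.99, 13.49, 10.99, 10.99],
    "grade": ["B", "C", "C", "A", "E"],
    "years": [10.0, 25.0, 13.0, 3.0, 9.0],
    "ownership": ["RENT"] * 5,
    "income": [24000.0, 12252.0, 49200.0, 36000.0, 48000.0],
    "age": [33, 31, 24, 39, 24],
})

# Assumption: default is the target; all remaining columns are candidate features.
target = sample["default"]
features = sample.drop(columns=["default"])

print(features.columns.tolist())
print(target.name)
```

With the full dataset, the same two lines would apply to df instead of sample.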

In [ ]:

Load the Refined Data


In [4]:
#Load the libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [5]:
#Default Variables
%matplotlib inline
plt.rcParams['figure.figsize'] = (16,9)
plt.rcParams['font.size'] = 18
plt.style.use('fivethirtyeight')
pd.set_option('display.float_format', lambda x: '%.2f' % x)

In [6]:
#Load the dataset
df = pd.read_csv("data/loan_data_clean.csv")

In [7]:
df.head()


Out[7]:
   default  amount  interest  grade  years  ownership    income  age
0        0    5000     10.65      B  10.00       RENT  24000.00   33
1        0    2400     10.99      C  25.00       RENT  12252.00   31
2        0   10000     13.49      C  13.00       RENT  49200.00   24
3        0    5000     10.99      A   3.00       RENT  36000.00   39
4        0    3000     10.99      E   9.00       RENT  48000.00   24

Dual Variable Exploration


In [1]:
# Create a crosstab of default and grade

In [2]:
# Create a crosstab of default and grade - percentage by default type

In [3]:
# Create a crosstab of default and grade - percentage of all applicants

In [4]:
# Create a crosstab of default and grade - percentage by grade type
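
The crosstab variants above could be sketched with pd.crosstab and its normalize argument. The toy frame below is invented purely for illustration and is not taken from the loan data; in the notebook, df["default"] and df["grade"] would be passed instead. The variable names are hypothetical.

```python
import pandas as pd

# Toy rows invented for illustration only (NOT from loan_data_clean.csv).
toy = pd.DataFrame({
    "default": [0, 0, 1, 0, 1, 1, 0, 0],
    "grade": ["A", "A", "A", "B", "B", "C", "C", "C"],
})

# Raw counts of each (default, grade) combination.
counts = pd.crosstab(toy["default"], toy["grade"])

# Each row (default value) sums to 1: grade mix within each default type.
by_default = pd.crosstab(toy["default"], toy["grade"], normalize="index")

# All cells together sum to 1: share of the whole table.
of_all = pd.crosstab(toy["default"], toy["grade"], normalize="all")

# Each column (grade) sums to 1: default rate within each grade.
by_grade = pd.crosstab(toy["default"], toy["grade"], normalize="columns")

print(counts)
print(by_grade)
```

The column-normalised table is usually the most direct one to read for this problem, since the row for default = 1 then gives the default rate within each grade.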

Explore the impact of ownership on default


In [ ]:


In [ ]:


In [ ]:

Explore the impact of age on default


In [ ]:

Explore the impact of income on default


In [ ]:


In [17]:
# Create the transformed income variable
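
The notebook does not say which transformation is intended. A natural logarithm is a common choice for right-skewed income data, so the sketch below assumes that; the column name log_income is hypothetical. The income values are the five shown by df.head() in this notebook.

```python
import numpy as np
import pandas as pd

# Income values from the five rows shown by df.head() in this notebook.
sample = pd.DataFrame({"income": [24000.0, 12252.0, 49200.0, 36000.0, 48000.0]})

# Assumption: a natural-log transform; log_income is an illustrative column name.
# np.log requires strictly positive values, which holds for these rows.
sample["log_income"] = np.log(sample["income"])

print(sample)
```

If the real data contained zero incomes, np.log1p would be one alternative to consider, at the cost of a slightly different scale.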

Explore the impact of years on default


In [ ]:

Three Variable Exploration


In [5]:
#Plot age, years and default
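
As a tabular companion to the plot, one option is to bin age and years and compute the default rate in each cell. The sketch below uses toy rows invented for illustration (not from the loan data), and the bin edges and labels are arbitrary assumptions.

```python
import pandas as pd

# Toy rows invented for illustration only (NOT from loan_data_clean.csv).
toy = pd.DataFrame({
    "age": [22, 24, 31, 33, 39, 45, 52, 58],
    "years": [1, 3, 9, 10, 13, 20, 25, 30],
    "default": [1, 0, 1, 0, 0, 0, 1, 0],
})

# Hypothetical bin edges; intervals are right-closed, e.g. (20, 30].
toy["age_band"] = pd.cut(toy["age"], bins=[20, 30, 40, 60],
                         labels=["20-30", "30-40", "40-60"])
toy["years_band"] = pd.cut(toy["years"], bins=[0, 5, 15, 40],
                           labels=["0-5", "5-15", "15-40"])

# Mean of the 0/1 default flag per cell is the default rate in that cell.
rate = toy.pivot_table(values="default", index="age_band",
                       columns="years_band", aggfunc="mean", observed=False)

print(rate)
```

The resulting table could then be passed to sns.heatmap to get a visual version; cells with no observations appear as NaN.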

Explore the relationship of age, income and default


In [ ]:


In [ ]:

Explore the relationship of age, grade and default


In [ ]:


In [ ]: