Logistic - Explore the Data

Raw Data

You are provided with the following data: loan_data.csv
This is the historical data that the bank has provided. It has the following columns:

Application Attributes:

years: Number of years the applicant has been employed
ownership: Whether the applicant owns a house or not
income: Annual income of the applicant
age: Age of the applicant

Behavioural Attributes:

grade: Credit grade of the applicant

Outcome Variables:

amount: Amount of loan provided to the applicant
interest: Interest rate charged to the applicant
default: Whether the applicant has defaulted or not

Let us build some intuition around the loan data.

Frame the Problem

What are the features?
What is the target?



In [ ]:
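
One possible framing, sketched under assumptions: `default` is treated as the target and every other column as a candidate feature. The rows below are invented toy values, and the 0/1 coding of `default` is an assumption, not something the data description confirms.

```python
import pandas as pd

# Toy rows invented for illustration; the real data comes from the bank's CSV.
# Assumption: `default` is coded 1 for a defaulted loan and 0 otherwise.
toy = pd.DataFrame({
    "amount": [5000, 2400, 10000, 5000],
    "interest": [10.65, 10.99, 13.49, 10.99],
    "grade": ["B", "C", "C", "A"],
    "years": [10.0, 25.0, 13.0, 3.0],
    "ownership": ["RENT", "RENT", "RENT", "RENT"],
    "income": [24000.0, 12252.0, 49200.0, 36000.0],
    "age": [33, 31, 24, 39],
    "default": [0, 1, 0, 0],
})

target = "default"
features = [col for col in toy.columns if col != target]

X = toy[features]  # inputs a model may use
y = toy[target]    # outcome a model should predict

print(features)
print(y.value_counts())
```

Whether `amount` and `interest` belong among the features depends on whether they are known before the lending decision, so this split is only a starting point.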

Load the Refined Data



In [4]:

    
#Load the libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns



In [5]:

    
#Default Variables
%matplotlib inline
plt.rcParams['figure.figsize'] = (16,9)
plt.rcParams['font.size'] = 18
plt.style.use('fivethirtyeight')
pd.set_option('display.float_format', lambda x: '%.2f' % x)



In [6]:

    
#Load the dataset
df = pd.read_csv("data/loan_data_clean.csv")



In [7]:

    
df.head()

Dual Variable Exploration



In [1]:

    
# Create a crosstab of default and grade



In [2]:

    
# Create a crosstab of default and grade - percentage by default type



In [3]:

    
# Create a crosstab of default and grade - percentage by all type



In [4]:

    
# Create a crosstab of default and grade - percentage by grade type
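
The crosstab cells in this section can be sketched with `pd.crosstab` and its `normalize` argument. This is a minimal sketch on invented toy rows, assuming `default` is coded 0/1 and sits on the rows while `grade` sits on the columns.

```python
import pandas as pd

# Toy rows invented for illustration only.
toy = pd.DataFrame({
    "grade":   ["A", "A", "A", "B", "B", "C", "C", "C"],
    "default": [0,   0,   1,   0,   1,   0,   1,   1],
})

# Raw counts: rows are default classes, columns are grades.
counts = pd.crosstab(toy["default"], toy["grade"])

# Share within each default class (each row sums to 1).
by_default = pd.crosstab(toy["default"], toy["grade"], normalize="index")

# Share of all applicants (the whole table sums to 1).
by_all = pd.crosstab(toy["default"], toy["grade"], normalize="all")

# Share within each grade (each column sums to 1).
by_grade = pd.crosstab(toy["default"], toy["grade"], normalize="columns")

print(counts)
print(by_grade)
```

The column-normalized table is usually the one to read for risk, since it gives the default rate inside each grade.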

Explore the impact of `ownership` with `default`



In [ ]:



In [ ]:



In [ ]:
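
One way to look at `ownership` against `default` is a default rate per ownership category. This is a sketch on invented toy rows; the category labels other than `RENT` and the 0/1 coding of `default` are assumptions.

```python
import pandas as pd

# Toy rows invented for illustration; OWN and MORTGAGE are assumed labels.
toy = pd.DataFrame({
    "ownership": ["RENT", "RENT", "RENT", "RENT",
                  "OWN", "OWN", "MORTGAGE", "MORTGAGE"],
    "default":   [1, 0, 1, 0, 0, 0, 0, 1],
})

# With default coded 0/1, the mean of the column is the default rate.
summary = (
    toy.groupby("ownership")["default"]
       .agg(applicants="count", default_rate="mean")
       .sort_values("default_rate", ascending=False)
)

print(summary)
```

Keeping the applicant count next to the rate helps to avoid over-reading a rate that rests on very few rows.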

Explore the impact of `age` with `default`



In [ ]:
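
Because `age` is numeric, one option is to bin it and compare the default rate across bands. The sketch below uses invented toy rows; the bin edges are arbitrary choices for illustration and `default` is assumed to be coded 0/1.

```python
import pandas as pd

# Toy rows invented for illustration only.
toy = pd.DataFrame({
    "age":     [22, 24, 27, 31, 33, 39, 45, 52],
    "default": [1,  0,  1,  0,  0,  1,  0,  0],
})

# Arbitrary bin edges; intervals are closed on the right, e.g. (20, 30].
toy["age_band"] = pd.cut(toy["age"], bins=[20, 30, 40, 60])

# Default rate within each age band.
rate_by_band = toy.groupby("age_band", observed=True)["default"].mean()

# Complementary view: age distribution within each default class.
age_by_default = toy.groupby("default")["age"].describe()

print(rate_by_band)
print(age_by_default)
```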

Explore the impact of `income` with `default`



In [ ]:



In [17]:

    
# Create the transformed income variable
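
The notebook does not say which transformation is intended. A natural-log transform is a common choice for right-skewed variables such as income, so the sketch below assumes that; the column name `income_log` and the toy values are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy incomes invented for illustration, with one very large value
# to mimic a long right tail.
toy = pd.DataFrame({
    "income": [12252.0, 24000.0, 36000.0, 48000.0, 49200.0, 600000.0],
})

# Assumption: a natural-log transform; requires strictly positive incomes.
toy["income_log"] = np.log(toy["income"])

# Skewness before and after the transform.
print(toy["income"].skew(), toy["income_log"].skew())
```

If the data contained zero incomes, `np.log1p` would be a safer variant of the same idea.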

Explore the impact of `years` with `default`



In [ ]:

Three Variable Exploration



In [5]:

    
#Plot age, years and default
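
As a numeric companion to a plot of these three variables, the default rate can be tabulated over bands of `age` and `years`. This is a sketch on invented toy rows with arbitrary bin edges, assuming `default` is coded 0/1; it is a table, not the plot itself.

```python
import pandas as pd

# Toy rows invented for illustration only.
toy = pd.DataFrame({
    "age":     [22, 24, 27, 31, 33, 39, 45, 52],
    "years":   [1,  3,  6,  2,  10, 12, 4,  20],
    "default": [1,  0,  1,  1,  0,  0,  1,  0],
})

# Arbitrary bands for both numeric variables.
toy["age_band"] = pd.cut(toy["age"], bins=[20, 30, 40, 60])
toy["years_band"] = pd.cut(toy["years"], bins=[0, 5, 25])

# Mean of a 0/1 column per cell is the default rate for that cell.
rate = toy.pivot_table(index="age_band", columns="years_band",
                       values="default", aggfunc="mean", observed=True)

print(rate)
```

The same table could be passed to a heatmap to get a visual version of the comparison.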

Explore the relationship of `age`, `income` and `default`



In [ ]:



In [ ]:

Explore the relationship of `age`, `grade` and `default`



In [ ]:



In [ ]:
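
With one categorical variable (`grade`) and one numeric variable (`age`), a simple summary is the median age per grade, split by default status. The rows below are invented toy values, and `default` is assumed to be coded 0/1.

```python
import pandas as pd

# Toy rows invented for illustration only.
toy = pd.DataFrame({
    "grade":   ["A", "A", "A", "B", "B", "B"],
    "default": [0,   0,   1,   0,   1,   1],
    "age":     [30,  40,  24,  35,  25,  27],
})

# Rows are grades, columns are default classes, cells are median ages.
median_age = (
    toy.groupby(["grade", "default"])["age"]
       .median()
       .unstack("default")
)

print(median_age)
```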

	amount	interest	grade	years	ownership	income	age
0	5000	10.65	B	10.00	RENT	24000.00	33
1	2400	10.99	C	25.00	RENT	12252.00	31
2	10000	13.49	C	13.00	RENT	49200.00	24
3	5000	10.99	A	3.00	RENT	36000.00	39
4	3000	10.99	E	9.00	RENT	48000.00	24
