SBA Repay: Predicting Loan Defaults for Small Business Loans

Initial Model Testing - Logistic Regression
A Simple Decision Tree
Random Forest
Random Forest Feature Importances
Focusing on Shorter Duration Loans

1. Initial Model Testing - Logistic Regression



In [1]:

    
#Import necessary Python packages 

#data analysis tools
import numpy as np
import pandas as pd
import datetime
from dateutil.relativedelta import relativedelta

#plotting tools
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import graphviz 

#classification 
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score, cross_val_predict, StratifiedShuffleSplit
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, precision_score



In [2]:

    
#Load cleaned data
data = pd.read_pickle('loans_7a_matured')



In [3]:

    
data.columns









    Out[3]:





Index([      'ApprovalFiscalYear',             'TermInMonths',
                 'RevolverStatus',            'JobsSupported',
               'FranchiseCodeBin',                'SP_to2016',
       'SBAGuaranteedApprovalAdj',                       'AL',
                             'AR',                       'AZ',
       ...
                             2008,                       2009,
                             2010,                       2011,
                             2012,                       2013,
                             2014,                       2015,
                             2016,                       2017],
      dtype='object', length=106)

We will remove the approval year information in order to make the model useful for future years (not included in the data).



In [4]:

    
data.drop(list(data.columns)[80:], axis = 1, inplace = True)
data.drop('ApprovalFiscalYear', axis = 1, inplace = True)

The target variable is Loan Status - Paid in Full (PIF) versus Defaulted. Right now:

Negative class = defaulted.
Positive class = repaid (paid in full).



In [5]:

    
data['PIF'].value_counts()









    Out[5]:





1    492214
0    158049
Name: PIF, dtype: int64



In [6]:

    
# Percent of loans that defaulted out of paid in full + defaulted loans
print(str(np.round(100*(data['PIF']==0).sum() / len(data), 2)) + '% of matured loans defaulted')









    



24.31% of matured loans defaulted



In [7]:

    
# Select the features
X = data.drop(['PIF'], axis = 1)
# Select the target variable: switch class labels so that "defaulted" is the positive class 
# since this is what we really care about
y = (1 - data['PIF'])

class_names = ['Paid in Full', 'Defaulted']

From here on:
Class 0 = Paid in Full
Class 1 = Defaulted

We will first split our data into training and testing sets, using a stratified split since the classes are somewhat imbalanced. We will always run model optimization/selection on the training data (by introducing an additional validation portion or k-fold cross validation) and evaluate on the untouched test set.



In [8]:

    
# Set aside a test set
# Random stratified 70-30 split: preserves the original proportion of positive and negative class 
# examples in train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify = y,
                                                        test_size = 0.30, random_state = 101)
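To illustrate what the stratified split preserves, here is a minimal standard-library sketch. It is not scikit-learn's implementation; the helper `stratified_split` and the toy labels are assumptions made for illustration. It shuffles each class separately and sends the same fraction of each class to the test set, so the default rate is the same in both parts.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_size, seed):
    """Return (train_idx, test_idx) with per-class proportions preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)                      # shuffle within the class
        n_test = round(len(members) * test_size)  # same fraction from every class
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Toy labels (hypothetical): 760 repaid (0) and 240 defaulted (1), a 24% default rate
toy_labels = [0] * 760 + [1] * 240
train_idx, test_idx = stratified_split(toy_labels, test_size=0.30, seed=101)

train_rate = sum(toy_labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(toy_labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), train_rate, test_rate)
```

A purely random split would match these rates only on average, which matters more as the minority class gets smaller.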

We will start with (L2-regularized) logistic regression with default hyperparameters. We will scale all numeric features prior to model training and testing.



In [9]:

    
# Scale numerical features for logistic regression (with regularization)
num_cols = ['TermInMonths', 'JobsSupported', 'SP_to2016', 'SBAGuaranteedApprovalAdj']

# Get scaling parameters from training data, then apply the scaler to testing data as well
# Note: only these four scaled columns are passed to the logistic regression below
std_scale = StandardScaler().fit(X_train[num_cols])
X_train_std = std_scale.transform(X_train[num_cols])
X_test_std = std_scale.transform(X_test[num_cols])
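The point of fitting the scaler on the training data only can be sketched with the standard library. The loan terms below are hypothetical, and `fit_scaler` / `apply_scaler` are illustrative helpers rather than scikit-learn functions. The mean and population standard deviation are learned from the training values and then reused unchanged on the test values.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Learn the mean and population standard deviation from training values."""
    return mean(values), pstdev(values)

def apply_scaler(values, mu, sigma):
    """Standardize values with previously learned parameters."""
    return [(v - mu) / sigma for v in values]

train_terms = [60, 84, 120, 240, 300]   # hypothetical loan terms in months
test_terms = [84, 180]

mu, sigma = fit_scaler(train_terms)                   # parameters come from training data only
train_scaled = apply_scaler(train_terms, mu, sigma)
test_scaled = apply_scaler(test_terms, mu, sigma)     # test data reuses the same mu and sigma
print(mu, sigma, test_scaled)
```

Because the test values are transformed with the training parameters, an 84-month loan maps to the same scaled value in both sets, and no information from the test set leaks into training.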

We will first run logistic regression with default hyperparameters, using class_weight = 'balanced' to impose a higher penalty for misclassifying the minority class.
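As a rough sketch of what 'balanced' means: scikit-learn documents the weight of each class as n_samples / (n_classes * count of that class). The helper below is illustrative, and it uses the full-data counts shown earlier for convenience, whereas the model computes the weights from the training labels it is given.

```python
def balanced_class_weights(class_counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: n_samples / (n_classes * count)
            for label, count in class_counts.items()}

# Counts from the full data set, after relabeling: 0 = Paid in Full, 1 = Defaulted
counts = {0: 492214, 1: 158049}
weights = balanced_class_weights(counts)
print(weights)
```

Each defaulted loan is weighted roughly three times as heavily as a paid-in-full loan, so both classes contribute the same total weight to the loss.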



In [10]:

    
# Define the model
def_logreg_model = LogisticRegression(class_weight = 'balanced', random_state = 101)
# Train the model on scaled training data
def_logreg_model.fit(X_train_std, y_train)
# Test the model: make predictions on testing data
def_logreg_pred = def_logreg_model.predict(X_test_std)
# Compare model outputs with actual outputs (true labels first, then predictions)
print(classification_report(y_test, def_logreg_pred))



             precision    recall  f1-score   support

          0       0.90      0.76      0.82    147664
          1       0.50      0.75      0.60     47415

avg / total       0.80      0.75      0.77    195079

We should aim to avoid predicting that a loan will be paid in full when in fact it will default, i.e., we want to detect as many defaults (positive class) as possible. False negatives are costly, so we will pay particular attention to recall of the positive (Defaulted) class.

Recall of the Defaulted class is 0.75, so a quarter of defaults are missed. Precision is the weaker metric: only half of the loans flagged as defaults actually default (0.50). We can gain more insight into what went wrong by examining the confusion matrix.



In [11]:

    
# Function to display the confusion matrix - original or normalized

import itertools
def plot_confusion_matrix(cm, classes, title = 'Confusion matrix', cmap = plt.cm.Blues, normalize = False):
    """
    This function prints and plots the confusion matrix.
    Rows are true labels and columns are predicted labels.
    """
    
    if normalize:
        # Divide each row by its total, so every row of true labels sums to 1
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')

    print(cm)

    plt.imshow(cm, interpolation='nearest', cmap = cmap)
    plt.title(title, fontsize = 20)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, fontsize = 20)
    plt.yticks(tick_marks, classes, fontsize = 20)
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True Label', fontsize = 20)
    plt.xlabel('Predicted Label', fontsize = 20)
    plt.grid(False)



In [12]:

    
# Plot confusion matrix without normalization
# (true labels first, so rows are true classes and columns are predicted classes)
def_logreg_cm = confusion_matrix(y_test, def_logreg_pred)

plt.figure(figsize = (8,4))
plot_confusion_matrix(def_logreg_cm, classes = class_names, normalize = False,
                      title = 'Confusion Matrix')

# Plot normalized confusion matrix
plt.figure(figsize = (8,4))
plot_confusion_matrix(def_logreg_cm, classes = class_names, normalize = True,
                      title = 'Confusion Matrix with Normalization')



Confusion matrix, without normalization
[[111521  36143]
 [ 11876  35539]]
Normalized confusion matrix
[[ 0.75523486  0.24476514]
 [ 0.25046926  0.74953074]]

We see that about a quarter of defaulted loans are labeled as paid in full, and that roughly as many paid-in-full loans are flagged as defaults (36,143) as there are correctly detected defaults (35,539). We will try to improve the performance by optimizing hyperparameters using a grid search with 10 stratified shuffle splits of the training data (30% held out for validation in each split), picking the best model (optimal hyperparameters), and then applying it to the test data. One hyperparameter of importance for logistic regression is C, the inverse of the regularization strength (smaller values mean stronger regularization).



In [13]:

    
# Tune the hyperparameters: vary the regularization parameter C
# Try a wide range of values

param_grid = {'C': [0.0001, 0.0005, 0.001, 0.005, 0.1, 0.5, 1, 5, 10, 50, 100, 500, 1000, 5000]}

# 10 stratified shuffle splits of the training data (30% held out each time) to tune C
# scoring is left at its default, which is accuracy for classifiers
grid_logmodel = GridSearchCV(LogisticRegression(class_weight = 'balanced'), 
                             param_grid, refit = True, verbose = 1,  
                             cv = StratifiedShuffleSplit(n_splits = 10, test_size = 0.3, random_state = 101))
grid_logmodel.fit(X_train_std, y_train)









    



Fitting 10 folds for each of 14 candidates, totalling 140 fits






    



[Parallel(n_jobs=1)]: Done 140 out of 140 | elapsed:  1.0min finished






    Out[13]:





GridSearchCV(cv=StratifiedShuffleSplit(n_splits=10, random_state=101, test_size=0.3,
            train_size=None),
       error_score='raise',
       estimator=LogisticRegression(C=1.0, class_weight='balanced', dual=False,
          fit_intercept=True, intercept_scaling=1, max_iter=100,
          multi_class='ovr', n_jobs=1, penalty='l2', random_state=None,
          solver='liblinear', tol=0.0001, verbose=0, warm_start=False),
       fit_params={}, iid=True, n_jobs=1,
       param_grid={'C': [0.0001, 0.0005, 0.001, 0.005, 0.1, 0.5, 1, 5, 10, 50, 100, 500, 1000, 5000]},
       pre_dispatch='2*n_jobs', refit=True, return_train_score=True,
       scoring=None, verbose=1)



In [14]:

    
# See the chosen optimal parameter
grid_logmodel.best_params_









    Out[14]:





{'C': 1}

The default value, C = 1, turned out to be the best choice, so tuning C does not improve on the model above.

2. A Simple Decision Tree

We will next implement a simple decision tree model. We will limit the depth to four, for simplicity.



In [15]:

    
# Train and test a simple decision tree on the same stratified train/test split.
# Better accuracy could be achieved without the max_depth and min_samples_leaf constraints,
# but we will aim for simplicity here (to avoid overfitting and to make the tree easier to visualize).

# Decision trees do not need features to be scaled. For easier interpretability, we will go back to original data.
dtree = DecisionTreeClassifier(max_depth = 4, min_samples_leaf = 5, class_weight = 'balanced')

# Fit/train the model 
dtree.fit(X_train, y_train)

# Test the model
dtree_pred = dtree.predict(X_test)
# Display results (true labels first, then predictions)
print(classification_report(y_test, dtree_pred))



             precision    recall  f1-score   support

          0       0.97      0.77      0.86    147664
          1       0.57      0.94      0.71     47415

avg / total       0.88      0.81      0.83    195079



In [16]:

    
#Plot normalized confusion matrix (rows are true classes, columns are predicted classes)
dtree_cm = confusion_matrix(y_test, dtree_pred)

plt.figure(figsize = (8,4))
plot_confusion_matrix(dtree_cm, classes = class_names, normalize = True,
                      title = 'Normalized Confusion Matrix')



Normalized confusion matrix
[[ 0.77377018  0.22622982]
 [ 0.06249077  0.93750923]]

Relative to logistic regression, this model is doing better overall: the f1-score of the positive class rises from 0.60 to 0.71. The tree detects 94% of defaults (recall 0.94), but only 57% of the loans it flags actually default (precision 0.57), so it produces many false alarms. We would like to improve precision without sacrificing too much recall. Let's see what an individual tree looks like.



In [17]:

    
# Visualize the tree
# class_names must follow ascending class labels: 0 = Paid in Full (PIF), 1 = Defaulted (DEF)
dot_data = tree.export_graphviz(dtree, out_file = None, 
                         feature_names = X_train.columns,  
                         class_names = ['PIF', 'DEF'],  
                         filled = True, rounded = True,  
                         special_characters = True)  
graph = graphviz.Source(dot_data)  
graph.render("dec_tree_simple") 
graph









    Out[17]:

From the tree, it looks like the most important features are:

* Term in Months (most decisions/splits are made based on this feature, and it looks like loans < 83 months are more likely to default)
* S&P 1500 Index
* Guaranteed Amount

Let's examine Term in Months in more detail.



In [18]:

    
# Term in months seems to be the most important from the Decision Tree
sns.set_context('poster', font_scale = 1.2)

# factorplot creates its own figure (in seaborn >= 0.9 it is named catplot)
g = sns.factorplot(x = 'PIF', y = 'TermInMonths',
                   kind = 'bar', data = data, estimator = np.mean, palette = 'Set1' )

# In the original data, PIF = 0 is Defaulted and PIF = 1 is Repaid
g.set_xticklabels(['Defaulted', 'Repaid'])
plt.xlabel('')
plt.title('7A Matured Loans')
plt.ylabel('Mean Term in Months')
g.savefig('Term.png', dpi = 300)

Loans that were paid in full have, on average, longer terms than loans that defaulted.

3. Random Forest

To improve the performance further, we will use a random forest classifier. Random forests are an ensemble learning method that constructs a multitude of decision trees at training time and outputs the class that is the mode of the classes predicted by the individual trees or, in the scikit-learn implementation, the class with the highest average predicted probability across all trees.
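The two aggregation rules can disagree. The sketch below uses hypothetical per-tree probabilities for three trees; the helpers `majority_vote` and `average_probability` are illustrative and not part of scikit-learn.

```python
def majority_vote(tree_probs, threshold=0.5):
    """Each tree casts a hard vote; the most common class wins."""
    votes = [1 if p > threshold else 0 for p in tree_probs]
    return 1 if sum(votes) > len(votes) / 2 else 0

def average_probability(tree_probs, threshold=0.5):
    """Average the per-tree probabilities, then threshold the mean."""
    avg = sum(tree_probs) / len(tree_probs)
    return 1 if avg > threshold else 0

# Hypothetical P(Defaulted) predicted by three trees for one loan
tree_probs = [0.95, 0.40, 0.40]

hard = majority_vote(tree_probs)        # two of three trees say "Paid in Full"
soft = average_probability(tree_probs)  # the mean probability is about 0.58
print(hard, soft)
```

Here the hard vote predicts Paid in Full while the averaged probability predicts Defaulted, because one very confident tree outweighs two mildly confident ones.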



In [19]:

    
# Train and test a Random Forest classifier with default hyperparameters first (use 300 estimators to start with)
df_rf_n300 = RandomForestClassifier(n_estimators = 300, class_weight = 'balanced', n_jobs=-1)
# Fit/train the model
df_rf_n300.fit(X_train, y_train)
# Test the model: make predictions on the test set
df_rf_n300_pred = df_rf_n300.predict(X_test)
print(classification_report(y_test, df_rf_n300_pred))









    



             precision    recall  f1-score   support

          0       0.93      0.95      0.94    147664
          1       0.83      0.77      0.80     47415

avg / total       0.91      0.91      0.91    195079

This is better overall than both logistic regression and the single decision tree above: the f1-score of the Defaulted class rises to 0.80. Let's tune the hyperparameters next. We will again use 10-fold cross validation with stratified splits on the training data, pick the optimal parameters, and then apply the selected model to the test data.

We would like to look at a range of the following hyperparameters:

    1. n_estimators - the number of trees
    2. max_features - the maximum number of features considered when looking for the best split in each tree
    3. min_samples_leaf - the minimum number of samples required at a leaf (the end node of a decision tree); a smaller leaf makes the model more prone to capturing noise in the training data

We will optimize for the f1 score to account for both precision and recall of Defaults.
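The f1 score is the harmonic mean of precision and recall. As a small check, the sketch below (the helper name `f1_from` is an illustrative choice) reproduces the f1-score of the Defaulted class from the rounded precision and recall reported for the 300-tree forest, and shows how the harmonic mean penalizes a lopsided pair.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded values reported above for the Defaulted class (300 estimators)
rf_f1 = f1_from(0.83, 0.77)

# A hypothetical lopsided classifier: high precision, very low recall
lopsided_f1 = f1_from(0.95, 0.10)
lopsided_mean = (0.95 + 0.10) / 2

print(round(rf_f1, 2), round(lopsided_f1, 2), round(lopsided_mean, 2))
```

The arithmetic mean of the lopsided pair is above 0.5, while its f1 score stays below 0.2, which is why f1 rewards models that do reasonably well on both metrics.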

Due to the high computational demands of grid search, we will first check whether we can reduce the number of estimators without affecting performance. (The optimal way to do this would be to include a range of n_estimators values in all possible combinations with the other parameters we want to tune.)



In [20]:

    
# Train and test a Random Forest classifier with default hyperparameters first - use 100 estimators 
df_rf_n100 = RandomForestClassifier(n_estimators = 100, class_weight = 'balanced', n_jobs=-1)
# Fit/train the model
df_rf_n100.fit(X_train, y_train)
# Test the model: make predictions on the test set
df_rf_n100_pred = df_rf_n100.predict(X_test)
print(classification_report(y_test, df_rf_n100_pred))









    



             precision    recall  f1-score   support

          0       0.93      0.95      0.94    147664
          1       0.83      0.76      0.79     47415

avg / total       0.90      0.90      0.90    195079

A random forest with 100 estimators achieves nearly the same precision and recall as one with 300 estimators (f1-score of 0.79 versus 0.80 for Defaults), so let's use at most 100 estimators.



In [21]:

    
# GridSearch for RF
param_grid = {'max_features': [0.2, 'auto', 'log2'], 
              'n_estimators': [50, 100], 'min_samples_leaf': [1, 5, 10, 50, 100]}
grid_rf = GridSearchCV(RandomForestClassifier(class_weight = 'balanced', n_jobs = 4),
                param_grid, cv = 10, refit = True, verbose = 3, scoring = 'f1')
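To get a feel for the cost of this search, the grid can be expanded by hand. The sketch below uses only the standard library; `expand_grid` is an illustrative helper, not a scikit-learn function. It enumerates every combination in the two grids used in this notebook and multiplies by the number of cross-validation splits.

```python
from itertools import product

def expand_grid(param_grid):
    """List every combination of parameter values as a dict."""
    names = sorted(param_grid)
    return [dict(zip(names, values))
            for values in product(*(param_grid[name] for name in names))]

# Random forest grid from the cell above
rf_param_grid = {'max_features': [0.2, 'auto', 'log2'],
                 'n_estimators': [50, 100],
                 'min_samples_leaf': [1, 5, 10, 50, 100]}
# Logistic regression grid from Section 1
logreg_param_grid = {'C': [0.0001, 0.0005, 0.001, 0.005, 0.1, 0.5, 1,
                           5, 10, 50, 100, 500, 1000, 5000]}

n_splits = 10
rf_candidates = expand_grid(rf_param_grid)
logreg_candidates = expand_grid(logreg_param_grid)
rf_fits = len(rf_candidates) * n_splits
logreg_fits = len(logreg_candidates) * n_splits
print(len(rf_candidates), rf_fits, len(logreg_candidates), logreg_fits)
```

With 30 candidates and 10 folds, the random forest search trains 300 forests before the final refit, which is why reducing n_estimators beforehand saves a substantial amount of time.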



In [22]:

    
grid_rf.fit(X_train, y_train)









    



Fitting 10 folds for each of 30 candidates, totalling 300 fits
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.819527, total=  32.1s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:   36.0s remaining:    0.0s
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.824500, total=  32.5s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:  1.2min remaining:    0.0s
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.821089, total=  32.3s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.824209, total=  31.9s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.825624, total=  32.2s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.820174, total=  32.2s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.819891, total=  32.0s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.818649, total=  31.7s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.824519, total=  31.2s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.829354, total=  31.2s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.823320, total= 1.0min
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.826448, total= 1.0min
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.825178, total= 1.0min
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.825556, total= 1.0min
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.828164, total= 1.0min
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.824614, total= 1.0min
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.825152, total= 1.1min
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.825190, total= 1.0min
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.827839, total= 1.1min
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.829465, total= 1.1min
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.815233, total=  31.1s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.816638, total=  29.2s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.819848, total=  29.6s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.817653, total=  31.8s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.817094, total=  31.3s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.819069, total=  31.6s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.815598, total=  31.5s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.817924, total=  30.0s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.819034, total=  29.8s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.820020, total=  28.9s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.818769, total=  56.3s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.818249, total=  57.6s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.817182, total=  57.9s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.818219, total=  57.6s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.820101, total=  54.9s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.817878, total=  57.5s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.815593, total=  57.6s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.817331, total=  57.7s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.818899, total=  58.5s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.822699, total=  55.3s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.803012, total=  27.7s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.801235, total=  29.1s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.803172, total=  29.1s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.804733, total=  29.0s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.805263, total=  29.1s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.803326, total=  29.1s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.801111, total=  29.2s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.804223, total=  29.3s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.802991, total=  29.0s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.803113, total=  28.4s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.803761, total=  53.6s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.804338, total=  56.7s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.803967, total=  56.9s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.801446, total=  56.1s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.806374, total=  56.9s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.803352, total=  57.0s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.804197, total=  55.3s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.805285, total=  53.6s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.803113, total=  56.1s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.805535, total=  56.0s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.768537, total=  26.2s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.770544, total=  26.2s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.768658, total=  26.3s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.769805, total=  26.8s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.774337, total=  26.1s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.768070, total=  25.9s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.770446, total=  25.6s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.769442, total=  25.7s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.771883, total=  25.1s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.774100, total=  24.8s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.771915, total=  48.3s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.770585, total=  49.3s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.769401, total=  49.5s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.768599, total=  49.6s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.770855, total=  49.9s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.769694, total=  49.0s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.770403, total=  49.7s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.770932, total=  49.5s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.771027, total=  47.5s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.772408, total=  47.9s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.756169, total=  24.1s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.756107, total=  26.4s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.757020, total=  27.7s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.759254, total=  24.2s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.757136, total=  23.9s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.755372, total=  24.0s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.752006, total=  24.0s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.752171, total=  24.0s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.755631, total=  24.1s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.758183, total=  23.9s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.759067, total=  45.5s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.752581, total=  45.9s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.756655, total=  45.4s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.759363, total=  43.4s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.758900, total=  43.5s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.755245, total=  45.2s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.756856, total=  45.2s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.755425, total=  45.0s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.755422, total=  45.5s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.762964, total=  45.8s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.788519, total=  24.2s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.790092, total=  24.2s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.787594, total=  24.0s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.793764, total=  24.3s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.791509, total=  24.4s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.786576, total=  24.3s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.788404, total=  23.8s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.788106, total=  24.3s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.796984, total=  23.3s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.795841, total=  22.8s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.788444, total=  44.5s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.796983, total=  45.9s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.796518, total=  45.4s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.802912, total=  45.6s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.797704, total=  46.3s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.789151, total=  46.3s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.791105, total=  46.3s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.790502, total=  46.1s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.794046, total=  46.5s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.800886, total=  46.1s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.781652, total=  22.2s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.779459, total=  21.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.781917, total=  21.3s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.785484, total=  21.0s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.786709, total=  20.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.780335, total=  20.8s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.782629, total=  21.1s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.784492, total=  21.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.782750, total=  22.0s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.787042, total=  21.6s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.785195, total=  41.3s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.781701, total=  41.4s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.781812, total=  41.5s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.784502, total=  41.4s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.786628, total=  41.2s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.779726, total=  40.8s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.782395, total=  41.7s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.780946, total=  41.4s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.783254, total=  41.6s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.785785, total=  41.4s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.767874, total=  21.1s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.767199, total=  20.5s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.763612, total=  20.3s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.772984, total=  20.2s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.768590, total=  20.2s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.765462, total=  20.0s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.767442, total=  21.2s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.766687, total=  21.0s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.769147, total=  21.1s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.777027, total=  21.1s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.770320, total=  39.8s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.768813, total=  40.1s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.764729, total=  39.9s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.770200, total=  39.7s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.770319, total=  39.6s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.769659, total=  39.6s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.766500, total=  39.8s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.765918, total=  39.5s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.770937, total=  39.6s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.773886, total=  39.5s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.737704, total=  18.6s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.737647, total=  18.3s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.734512, total=  18.3s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.741496, total=  18.4s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.741510, total=  17.8s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.735243, total=  17.4s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.736710, total=  17.6s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.738617, total=  17.5s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.737032, total=  17.5s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.743652, total=  18.0s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.739088, total=  34.4s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.736759, total=  34.7s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.739132, total=  34.7s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.746570, total=  35.1s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.745726, total=  35.1s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.738031, total=  34.9s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.737206, total=  35.0s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.737242, total=  34.7s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.736890, total=  34.6s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.743696, total=  35.1s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.729027, total=  17.0s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.730214, total=  17.2s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.723167, total=  16.8s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.728121, total=  17.1s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.726107, total=  17.1s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.723144, total=  16.6s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.723785, total=  17.0s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.725895, total=  17.0s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.724083, total=  17.0s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.731316, total=  17.1s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.727444, total=  31.5s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.726207, total=  31.4s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.726204, total=  31.5s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.728769, total=  30.2s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.728746, total=  29.9s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.725110, total=  30.0s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.725381, total=  31.9s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.722839, total=  31.7s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.721177, total=  31.4s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.728084, total=  31.8s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.771795, total=  22.1s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.771176, total=  21.8s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.767069, total=  22.1s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.778527, total=  22.1s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.771965, total=  22.2s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.768072, total=  21.8s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.769334, total=  21.9s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.769172, total=  22.0s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.771755, total=  22.3s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.780740, total=  22.1s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.774888, total=  41.6s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.778022, total=  42.3s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.775131, total=  41.6s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.782712, total=  41.9s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.777895, total=  41.8s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.773865, total=  41.5s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.778342, total=  41.4s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.773598, total=  39.7s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.779478, total=  39.9s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.784256, total=  40.5s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.769631, total=  18.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.769085, total=  18.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.765031, total=  18.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.770423, total=  19.1s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.771883, total=  19.0s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.767044, total=  18.8s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.766294, total=  19.1s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.768913, total=  19.1s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.770118, total=  18.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.771723, total=  19.2s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.771128, total=  36.0s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.770132, total=  35.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.766868, total=  36.0s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.773444, total=  36.2s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.769691, total=  35.7s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.765020, total=  35.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.767412, total=  37.4s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.766910, total=  36.1s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.767654, total=  36.3s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.773730, total=  36.0s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.756178, total=  18.1s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.753765, total=  18.2s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.753839, total=  18.2s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.756282, total=  18.2s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.756713, total=  18.3s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.750339, total=  18.2s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.749113, total=  17.8s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.750912, total=  17.7s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.757026, total=  18.0s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.760863, total=  17.8s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.755583, total=  33.2s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.755144, total=  33.8s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.751023, total=  34.3s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.758466, total=  34.1s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.758501, total=  34.1s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.751308, total=  34.1s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.754069, total=  34.3s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.751134, total=  34.8s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.753628, total=  34.2s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.759864, total=  34.3s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.722897, total=  15.9s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.727957, total=  15.9s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.722718, total=  15.8s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.728538, total=  16.0s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.726537, total=  15.8s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.719018, total=  15.7s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.722191, total=  15.6s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.722318, total=  15.9s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.722966, total=  15.6s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.731150, total=  15.9s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.724290, total=  29.2s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.723919, total=  29.3s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.718633, total=  29.4s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.727786, total=  29.5s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.725967, total=  29.6s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.721349, total=  29.7s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.721040, total=  29.7s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.721226, total=  29.3s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.722054, total=  29.4s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.731446, total=  28.9s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.706727, total=  14.1s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.709476, total=  14.0s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.702897, total=  14.2s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.710406, total=  13.9s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.710826, total=  14.0s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.703580, total=  14.7s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.704260, total=  14.4s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.707675, total=  14.5s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.704474, total=  14.5s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.713174, total=  14.8s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.706537, total=  26.0s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.708831, total=  26.3s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.703920, total=  27.1s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.713935, total=  27.5s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.712668, total=  27.5s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.709534, total=  26.0s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.707275, total=  26.2s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.707422, total=  27.3s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.709410, total=  27.5s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.711047, total=  27.5s






    



[Parallel(n_jobs=1)]: Done 300 out of 300 | elapsed: 180.4min finished
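
The per-fold scores in the verbose log above are easier to compare once they are averaged per candidate. The following is a minimal sketch, not part of the original notebook, that parses a few log lines copied from above; the helper name `mean_score_by_candidate` is hypothetical. The fitted `grid_rf.cv_results_` attribute exposes per-candidate means directly, and those may differ slightly from a plain average because the search was run with `iid=True`.

```python
import re
from collections import defaultdict
from statistics import mean

# A few result lines copied from the verbose output above (not the full log).
log_text = """\
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.706727, total=  14.1s
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.709476, total=  14.0s
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.706537, total=  26.0s
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.708831, total=  26.3s
"""

# Only lines that report a finished fit contain "score=...".
result_line = re.compile(r"^\[CV\]\s+(?P<params>.+?), score=(?P<score>[0-9.]+), total=")

def mean_score_by_candidate(text):
    """Group fold scores by parameter string and return the plain mean of each group."""
    scores = defaultdict(list)
    for line in text.splitlines():
        match = result_line.match(line)
        if match:
            scores[match.group("params")].append(float(match.group("score")))
    return {params: mean(values) for params, values in scores.items()}

summary = mean_score_by_candidate(log_text)
for params, avg in summary.items():
    print(f"{params}: mean F1 = {avg:.6f}")
```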






    Out[22]:





GridSearchCV(cv=10, error_score='raise',
       estimator=RandomForestClassifier(bootstrap=True, class_weight='balanced',
            criterion='gini', max_depth=None, max_features='auto',
            max_leaf_nodes=None, min_impurity_split=1e-07,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=4,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False),
       fit_params={}, iid=True, n_jobs=1,
       param_grid={'max_features': [0.2, 'auto', 'log2'], 'n_estimators': [50, 100], 'min_samples_leaf': [1, 5, 10, 50, 100]},
       pre_dispatch='2*n_jobs', refit=True, return_train_score=True,
       scoring='f1', verbose=3)



In [23]:

    
print(grid_rf.best_params_)









    



{'max_features': 0.2, 'min_samples_leaf': 1, 'n_estimators': 100}



In [24]:

    
grid_rf_pred = grid_rf.predict(X_test)



In [25]:

    
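# Note: classification_report expects (y_true, y_pred). With the arguments in this
# order, the precision and recall columns below are swapped and support counts predictions.
print(classification_report(grid_rf_pred, y_test))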









    



             precision    recall  f1-score   support

          0       0.95      0.94      0.95    148874
          1       0.82      0.84      0.83     46205

avg / total       0.92      0.92      0.92    195079



In [26]:

    
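# Refit a forest with the best parameters found by the grid search
opt_rf = RandomForestClassifier(n_estimators = 100, max_features = 0.2, min_samples_leaf = 1, class_weight='balanced')
opt_rf.fit(X_train, y_train)
opt_rf_pred = opt_rf.predict(X_test)
# Note: classification_report expects (y_true, y_pred). With the arguments in this
# order, the precision and recall columns below are swapped and support counts predictions.
print(classification_report(opt_rf_pred, y_test))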









    



             precision    recall  f1-score   support

          0       0.95      0.94      0.94    149036
          1       0.81      0.84      0.82     46043

avg / total       0.92      0.92      0.92    195079
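
The two reports above were produced by passing the predictions as the first argument of `classification_report`, whose signature is `(y_true, y_pred)`. The sketch below uses made-up labels and a hypothetical helper, `precision_recall`, to illustrate the consequence under that assumption: swapping the two arguments exchanges precision and recall for a class.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical labels, for illustration only (1 = default, 0 = paid in full).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

correct_order = precision_recall(y_true, y_pred)  # (precision, recall)
swapped_order = precision_recall(y_pred, y_true)  # arguments exchanged

print("correct order:", correct_order)
print("swapped order:", swapped_order)
```

The F1 score is symmetric in precision and recall, so the f1-score column is unaffected by the argument order.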

4. Random Forest Feature Importances

Let's examine which features are contributing most in the optimal Random Forest model.



In [27]:

    
#Feature ranking for random forest
importances = opt_rf.feature_importances_
# `est` avoids reusing the name of the imported sklearn `tree` module
std = np.std([est.feature_importances_ for est in opt_rf.estimators_],
            axis=0)
indices = np.argsort(importances)[::-1]
fts = list(X_train.columns)
# Print the feature ranking
print("Feature ranking:")

for f in range(X_train.shape[1]):
   print("%d. feature %s (%f)" % (f + 1, fts[indices[f]], importances[indices[f]]))









    



Feature ranking:
1. feature TermInMonths (0.575866)
2. feature SBAGuaranteedApprovalAdj (0.101732)
3. feature SP_to2016 (0.097691)
4. feature JobsSupported (0.047171)
5. feature RevolverStatus (0.013784)
6. feature INDIVIDUAL (0.007467)
7. feature CA (0.006855)
8. feature Retail Trade (0.005870)
9. feature FL (0.004868)
10. feature Construction (0.004829)
11. feature Other Services (0.004709)
12. feature Health Care & Social Assistance (0.004662)
13. feature TX (0.004635)
14. feature Manufacturing (0.004490)
15. feature NY (0.004467)
16. feature Professional, Scientific, & Technical Services (0.004407)
17. feature Wholesale Trade (0.004002)
18. feature FranchiseCodeBin (0.003866)
19. feature PA (0.003514)
20. feature OH (0.003276)
21. feature Administrative/Support/Waste Management (0.003274)
22. feature NJ (0.003119)
23. feature Transportation & Warehousing (0.002974)
24. feature GA (0.002890)
25. feature IL (0.002762)
26. feature MN (0.002755)
27. feature PARTNERSHIP (0.002685)
28. feature MA (0.002684)
29. feature MI (0.002427)
30. feature MO (0.002335)
31. feature WI (0.002312)
32. feature UT (0.002204)
33. feature NC (0.002179)
34. feature WA (0.002101)
35. feature Arts, Entertainment, & Recreation (0.002060)
36. feature IN (0.002046)
37. feature PR (0.001989)
38. feature MD (0.001971)
39. feature CO (0.001949)
40. feature Real Estate Rental & Leasing (0.001846)
41. feature VA (0.001824)
42. feature AZ (0.001823)
43. feature CT (0.001782)
44. feature MT (0.001763)
45. feature Information (0.001740)
46. feature IA (0.001672)
47. feature TN (0.001639)
48. feature Agriculture (0.001622)
49. feature KS (0.001561)
50. feature NH (0.001550)
51. feature OK (0.001545)
52. feature MS (0.001428)
53. feature OR (0.001404)
54. feature ME (0.001404)
55. feature KY (0.001394)
56. feature LA (0.001328)
57. feature RI (0.001318)
58. feature Finance & Insurance (0.001301)
59. feature NE (0.001191)
60. feature AL (0.001139)
61. feature Educational Services (0.001120)
62. feature NV (0.001120)
63. feature ID (0.001119)
64. feature ND (0.001116)
65. feature VT (0.001083)
66. feature AR (0.001031)
67. feature NM (0.000945)
68. feature SC (0.000916)
69. feature HI (0.000897)
70. feature WV (0.000727)
71. feature Mining (0.000580)
72. feature SD (0.000553)
73. feature WY (0.000548)
74. feature DE (0.000468)
75. feature DC (0.000357)
76. feature Utilities (0.000163)
77. feature Public Administration (0.000058)
78. feature Management of Companies & Enterprises (0.000049)
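
To see how concentrated the ranking is, the importances can be accumulated from the top. This is a small sketch using the first five values copied from the output above; the variable names are illustrative and not from the original notebook.

```python
# Top of the printed ranking (values copied from the output above).
ranking = [
    ("TermInMonths", 0.575866),
    ("SBAGuaranteedApprovalAdj", 0.101732),
    ("SP_to2016", 0.097691),
    ("JobsSupported", 0.047171),
    ("RevolverStatus", 0.013784),
]

cumulative = []
running_total = 0.0
for name, importance in ranking:
    running_total += importance
    cumulative.append((name, running_total))
    print(f"{name:26s} {importance:.6f}  cumulative {running_total:.6f}")

# Share of total importance carried by the four highest-ranked features.
top4_share = cumulative[3][1]
```

On these numbers, the top four features account for roughly 82% of the total importance, with TermInMonths alone contributing more than half.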






    








In [28]:

    
# Plot the top ten feature importances of the optimized random forest method
importances = opt_rf.feature_importances_
# `est` avoids reusing the name of the imported sklearn `tree` module
std = np.std([est.feature_importances_ for est in opt_rf.estimators_],
             axis=0)
indices = np.argsort(importances)[::-1]

# Plot the feature importances of the forest
fig = plt.figure(figsize = (10,5))

plt.title("Feature Importances")
plt.barh(range(10), importances[indices][0:10][::-1],
       color="r", xerr = std[indices][0:10][::-1], align="center")
# Hand-written labels, listed in the order of the printed feature ranking
# (FL ranks 9th, Construction 10th), then reversed to match the bars.
plt.yticks(range(10), ['Term in Months', 'Amount', 'S&P 1500', 'Jobs Supported', 'Revolver Status',
                      'Individual', 'CA', 'Retail Trade', 'FL', 'Construction'][::-1])

plt.ylim([-1,10])
plt.tight_layout()


fig.savefig('OPTIMAL_RF_FImportance.png', dpi = 300)

Because Random Forests are nonlinear classifiers, the feature importances do not tell us whether a feature has a positive or negative effect on repayment, so let's try to infer the direction from the original data.



In [29]:

    
# Check if there is a difference in mean duration
# (PIF == 0: defaulted loans, PIF == 1: paid-in-full loans)
import scipy.stats
t, prob = scipy.stats.ttest_ind(data[data['PIF']==0]['TermInMonths'], data[data['PIF']==1]['TermInMonths'] )
print(t, prob)









    



-311.664506627 0.0

The negative t-statistic shows that the mean term is significantly shorter for defaulted loans (PIF = 0) than for paid-in-full loans (PIF = 1).
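
The sign of the statistic depends on the order of the two samples. As a rough illustration, the sketch below computes a pooled-variance two-sample t-statistic (the default behavior of `scipy.stats.ttest_ind`) on made-up loan terms; the data and the helper name `pooled_t` are assumptions for this example only.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t-statistic with pooled variance: (mean(a) - mean(b)) / standard error."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))

# Hypothetical loan terms in months, for illustration only.
defaulted_terms = [36, 48, 60, 60, 72]
paid_terms = [84, 84, 120, 96, 84]

t_stat = pooled_t(defaulted_terms, paid_terms)
print(t_stat)  # negative: the first sample has the smaller mean
```

Passing the paid-in-full sample first would flip the sign without changing the magnitude.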



In [30]:

    
# Check if there is a difference in mean amount
t, prob = scipy.stats.ttest_ind(data[data['PIF']==0]['SBAGuaranteedApprovalAdj'], data[data['PIF']==1]['SBAGuaranteedApprovalAdj'] )
print(t, prob)









    



-33.4039391709 1.94024593685e-244

The mean adjusted SBA-guaranteed amount is also significantly lower for defaulted loans.



In [31]:

    
# Check if there is a difference in mean number of jobs
t, prob = scipy.stats.ttest_ind(data[data['PIF']==0]['JobsSupported'], data[data['PIF']==1]['JobsSupported'] )
print(t, prob)









    



5.73036927693 1.00256282317e-08



In [32]:

    
#Examine loans with TermInMonths = 84
# Reset the index on a new frame rather than in place on a slice of `data`
data_84 = data[data['TermInMonths'] == 84].reset_index(drop = True)



In [33]:

    
data_84['PIF'].value_counts()









    Out[33]:





1    206616
0      1157
Name: PIF, dtype: int64

For loans with a term of exactly 84 months, only about 0.56% default (1,157 of 207,773)!
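
The default rate follows directly from the value counts above. A short check, with the counts copied from `Out[33]`:

```python
# Counts copied from the value_counts() output above.
paid_in_full = 206616  # PIF == 1
defaulted = 1157       # PIF == 0

total = paid_in_full + defaulted
default_rate = defaulted / total
print(f"{defaulted} of {total} loans defaulted ({default_rate:.2%})")
```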

5. Focusing on Shorter Duration Loans

Let's examine loans with duration < 84 months and repeat all the steps from above without TermInMonths.



In [34]:

    
# Consider only loans with TermInMonths < 84
data_l84 = data[data['TermInMonths'] < 84].reset_index(drop = True)

# Target is 1 for defaulted loans and 0 for paid-in-full loans
target_l84 = 1 - data_l84['PIF']



In [35]:

    
# Split into training and testing data as before
X_train_l84, X_test_l84, y_train_l84, y_test_l84 = train_test_split(data_l84.drop(['TermInMonths',  'PIF'],axis = 1), 
                                                                target_l84, 
                                                                stratify = target_l84,
                                                                test_size = 0.30, random_state = 101)



In [36]:

    
# Logistic Regression Classification 
logmodel_l84 = LogisticRegression(class_weight = 'balanced')
logmodel_l84.fit(X_train_l84, y_train_l84)
lm_l84_pred = logmodel_l84.predict(X_test_l84)
print(classification_report(y_test_l84, lm_l84_pred))









    



             precision    recall  f1-score   support

          0       0.59      0.15      0.24     51666
          1       0.46      0.87      0.60     41974

avg / total       0.53      0.48      0.40     93640
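
The f1-score column is the harmonic mean of precision and recall, which is why the paid-in-full class (label 0 here, since the target is `1 - PIF`) scores so poorly despite its higher precision. A quick sketch reproducing the two values from the rounded precision and recall above; the helper name `f1_from` is illustrative.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall copied from the report above (already rounded to two decimals).
f1_paid_in_full = f1_from(0.59, 0.15)  # class 0
f1_default = f1_from(0.46, 0.87)       # class 1

print(round(f1_paid_in_full, 2), round(f1_default, 2))
```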



In [37]:

    
# GridSearch for Logistic Regression
param_grid = {'C': [0.001, 0.005, 0.1, 0.5, 1, 5, 10, 50, 100, 500, 1000, 5000, 10000]}
grid_l84 = GridSearchCV(LogisticRegression(class_weight = 'balanced'), param_grid, refit=True,verbose=1)
grid_l84.fit(X_train_l84, y_train_l84)
print(grid_l84.best_params_)









    



Fitting 3 folds for each of 13 candidates, totalling 39 fits






    



[Parallel(n_jobs=1)]: Done  39 out of  39 | elapsed:  1.2min finished






    



{'C': 0.001}



In [38]:

    
grid_l84_pred = grid_l84.predict(X_test_l84)

print(classification_report(y_test_l84, grid_l84_pred))









    



             precision    recall  f1-score   support

          0       0.59      0.15      0.24     51666
          1       0.46      0.87      0.60     41974

avg / total       0.53      0.48      0.40     93640

Again, Logistic Regression does not seem to be sufficient. We will again tune hyperparameters for Random Forest using grid search with cross-validation.



In [39]:

    
# GridSearch for Random Forest
param_grid = {'max_features': [0.2, 'auto', 'log2'],
              'n_estimators': [50, 100], 'min_samples_leaf': [1, 5, 10, 50, 100]}
grid_rf_l84 = GridSearchCV(RandomForestClassifier(class_weight = 'balanced', n_jobs = 4),
                param_grid, cv = 10, refit = True, verbose = 3, scoring = 'f1')
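
Before launching the search, it can help to estimate how many fits it will run. The sketch below enumerates the grid with `itertools.product`; it mirrors the `param_grid` and `cv = 10` settings above, and the remaining variable names are illustrative.

```python
from itertools import product

# Same grid and fold count as in the cell above.
param_grid = {'max_features': [0.2, 'auto', 'log2'],
              'n_estimators': [50, 100],
              'min_samples_leaf': [1, 5, 10, 50, 100]}
cv_folds = 10

# One candidate per combination of parameter values.
candidates = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]
total_fits = len(candidates) * cv_folds

print(f"{len(candidates)} candidates x {cv_folds} folds = {total_fits} fits")
```

With `refit = True`, one additional fit on the full training set follows once the best candidate is known.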



In [40]:

    
grid_rf_l84.fit(X_train_l84, y_train_l84)









    



Fitting 10 folds for each of 30 candidates, totalling 300 fits
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.666631, total=  12.8s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:   15.0s remaining:    0.0s
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.662346, total=  12.9s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:   30.2s remaining:    0.0s
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.654345, total=  12.8s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.663999, total=  12.8s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.660588, total=  12.2s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.659413, total=  12.3s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.662374, total=  12.1s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.653367, total=  12.3s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.661986, total=  12.8s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=50, score=0.664318, total=  12.7s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.667977, total=  24.8s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.669225, total=  24.2s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.663436, total=  24.4s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.666844, total=  24.7s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.664641, total=  24.5s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.664548, total=  24.9s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.668228, total=  24.5s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.659651, total=  23.3s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.665133, total=  23.4s
[CV] max_features=0.2, min_samples_leaf=1, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=1, n_estimators=100, score=0.668154, total=  27.4s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.692674, total=  13.4s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.696114, total=  13.9s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.685529, total=  13.3s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.694882, total=  13.0s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.689659, total=  12.0s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.693180, total=  12.2s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.689121, total=  11.9s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.687010, total=  12.2s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.692253, total=  12.1s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=50 ...........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=50, score=0.690586, total=  12.0s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.691629, total=  22.9s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.695212, total=  23.0s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.687300, total=  22.8s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.696874, total=  23.1s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.690774, total=  23.0s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.690841, total=  21.5s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.690620, total=  21.3s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.685525, total=  21.3s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.692362, total=  22.1s
[CV] max_features=0.2, min_samples_leaf=5, n_estimators=100 ..........
[CV]  max_features=0.2, min_samples_leaf=5, n_estimators=100, score=0.690668, total=  22.4s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.692118, total=  11.3s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.693815, total=  11.3s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.682205, total=  11.0s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.694467, total=  11.2s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.689959, total=  11.2s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.692199, total=  11.3s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.689301, total=  11.0s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.687370, total=  11.0s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.686790, total=  11.0s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=50, score=0.690208, total=  11.1s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.691316, total=  21.2s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.694139, total=  21.2s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.682364, total=  21.2s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.697410, total=  21.5s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.690258, total=  21.0s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.691985, total=  21.3s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.690051, total=  20.1s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.684269, total=  20.0s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.689323, total=  20.1s
[CV] max_features=0.2, min_samples_leaf=10, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=10, n_estimators=100, score=0.692834, total=  20.3s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.680299, total=   9.3s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.681666, total=   9.2s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.669693, total=   9.3s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.685892, total=   9.3s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.679000, total=   9.2s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.681488, total=   9.2s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.679586, total=   9.1s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.674875, total=   9.4s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.671800, total=   9.3s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=50 ..........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=50, score=0.678400, total=   9.1s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.679835, total=  17.5s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.684237, total=  17.4s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.671817, total=  17.4s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.685309, total=  17.3s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.680435, total=  17.4s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.681511, total=  17.2s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.678700, total=  17.6s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.675334, total=  17.4s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.674762, total=  17.5s
[CV] max_features=0.2, min_samples_leaf=50, n_estimators=100 .........
[CV]  max_features=0.2, min_samples_leaf=50, n_estimators=100, score=0.680127, total=  17.1s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.672168, total=   8.3s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.675879, total=   8.5s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.663309, total=   8.2s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.678086, total=   8.4s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.674120, total=   8.0s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.674442, total=   8.0s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.672260, total=   7.8s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.669740, total=   7.8s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.666465, total=   8.0s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=50 .........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=50, score=0.670716, total=   7.8s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.674741, total=  14.7s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.677191, total=  14.7s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.665628, total=  15.5s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.680328, total=  15.5s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.673725, total=  15.7s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.674884, total=  15.5s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.671123, total=  15.6s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.671917, total=  15.7s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.669254, total=  15.7s
[CV] max_features=0.2, min_samples_leaf=100, n_estimators=100 ........
[CV]  max_features=0.2, min_samples_leaf=100, n_estimators=100, score=0.673753, total=  15.6s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.658801, total=   9.6s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.659060, total=   9.7s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.649992, total=   9.7s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.659136, total=   9.7s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.649749, total=   9.8s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.652311, total=   9.7s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.654951, total=   9.8s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.644762, total=   9.8s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.656255, total=   9.7s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=50, score=0.653676, total=   9.6s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.662417, total=  18.5s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.662935, total=  18.5s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.656037, total=  18.3s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.659818, total=  18.3s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.655685, total=  18.4s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.659055, total=  17.7s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.660911, total=  17.7s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.649834, total=  17.7s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.659755, total=  17.7s
[CV] max_features=auto, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=1, n_estimators=100, score=0.660628, total=  18.4s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.690823, total=   7.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.692218, total=   7.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.681533, total=   7.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.695001, total=   8.0s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.685654, total=   7.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.686418, total=   8.2s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.688447, total=   7.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.680116, total=   8.0s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.683318, total=   8.0s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=50, score=0.684941, total=   8.0s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.691220, total=  14.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.694393, total=  14.7s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.679268, total=  15.1s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.693581, total=  14.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.688082, total=  15.1s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.686561, total=  14.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.688691, total=  15.0s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.681451, total=  14.9s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.685764, total=  15.1s
[CV] max_features=auto, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=auto, min_samples_leaf=5, n_estimators=100, score=0.688860, total=  15.0s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.689544, total=   7.5s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.687932, total=   7.5s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.677385, total=   7.4s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.690156, total=   7.4s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.687802, total=   7.6s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.684715, total=   7.4s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.682442, total=   7.4s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.679482, total=   7.5s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.682193, total=   7.5s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=50, score=0.683510, total=   7.6s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.684546, total=  14.1s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.691461, total=  13.8s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.678830, total=  13.4s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.689271, total=  13.5s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.685884, total=  13.4s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.683735, total=  13.4s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.684622, total=  13.5s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.679957, total=  13.6s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.681488, total=  13.9s
[CV] max_features=auto, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=10, n_estimators=100, score=0.684064, total=  13.9s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.673127, total=   6.2s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.674274, total=   6.1s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.664635, total=   6.3s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.676645, total=   6.3s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.668052, total=   6.3s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.673197, total=   6.1s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.671520, total=   6.3s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.666191, total=   6.2s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.666938, total=   6.1s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=50, score=0.670619, total=   6.3s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.672717, total=  11.4s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.679101, total=  11.4s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.665086, total=  11.5s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.680066, total=  11.6s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.671416, total=  11.4s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.673085, total=  11.5s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.671877, total=  11.3s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.667577, total=  11.4s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.669507, total=  11.5s
[CV] max_features=auto, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=auto, min_samples_leaf=50, n_estimators=100, score=0.669924, total=  11.4s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.664832, total=   5.7s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.666125, total=   5.8s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.657515, total=   5.7s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.673370, total=   5.7s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.662684, total=   5.7s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.662170, total=   5.5s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.663197, total=   5.7s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.660698, total=   5.7s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.658432, total=   5.7s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=50, score=0.664916, total=   5.6s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.666565, total=  10.3s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.669438, total=  10.2s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.658173, total=  10.4s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.671259, total=  10.4s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.665417, total=  10.5s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.666330, total=  10.3s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.661362, total=  10.6s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.659330, total=  10.4s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.659032, total=  10.3s
[CV] max_features=auto, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=auto, min_samples_leaf=100, n_estimators=100, score=0.662671, total=  10.3s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.652100, total=   9.0s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.656628, total=   9.1s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.646728, total=   8.4s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.654046, total=   8.5s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.650351, total=   8.4s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.648565, total=   8.4s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.648034, total=   8.7s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.642593, total=   8.4s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.653301, total=   8.4s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=50, score=0.649713, total=   8.4s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.659262, total=  16.3s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.657553, total=  16.4s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.650333, total=  16.5s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.656775, total=  16.6s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.655385, total=  16.6s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.654120, total=  16.5s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.656832, total=  16.4s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.643297, total=  16.5s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.656975, total=  16.6s
[CV] max_features=log2, min_samples_leaf=1, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=1, n_estimators=100, score=0.656257, total=  16.3s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.686200, total=   6.8s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.690212, total=   6.7s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.677388, total=   6.8s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.690217, total=   7.0s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.683490, total=   6.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.684471, total=   6.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.684774, total=   6.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.679959, total=   6.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.681168, total=   6.8s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=50 ..........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=50, score=0.685200, total=   6.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.686811, total=  12.7s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.693200, total=  12.7s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.679333, total=  12.8s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.687601, total=  12.7s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.684832, total=  12.6s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.682786, total=  12.7s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.686439, total=  12.6s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.680193, total=  12.7s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.682441, total=  12.9s
[CV] max_features=log2, min_samples_leaf=5, n_estimators=100 .........
[CV]  max_features=log2, min_samples_leaf=5, n_estimators=100, score=0.684860, total=  12.8s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.681667, total=   6.3s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.685196, total=   6.4s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.673259, total=   6.3s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.685642, total=   6.4s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.682731, total=   6.3s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.678443, total=   6.2s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.683293, total=   6.4s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.673057, total=   6.3s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.676964, total=   6.1s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=50, score=0.679843, total=   6.1s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.681262, total=  11.3s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.687383, total=  11.5s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.674064, total=  11.6s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.684343, total=  11.3s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.680402, total=  11.4s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.680933, total=  11.3s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.678437, total=  11.6s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.675079, total=  11.8s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.676820, total=  11.8s
[CV] max_features=log2, min_samples_leaf=10, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=10, n_estimators=100, score=0.680521, total=  11.7s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.669506, total=   5.5s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.675879, total=   5.2s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.657820, total=   5.3s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.674400, total=   5.3s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.664675, total=   5.3s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.667477, total=   5.4s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.667142, total=   5.3s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.661198, total=   5.4s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.663466, total=   5.3s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=50 .........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=50, score=0.666565, total=   5.3s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.669411, total=   9.6s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.673615, total=   9.7s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.662535, total=   9.6s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.676418, total=   9.6s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.665182, total=   9.6s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.666835, total=   9.7s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.669683, total=   9.5s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.662029, total=   9.6s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.663114, total=   9.8s
[CV] max_features=log2, min_samples_leaf=50, n_estimators=100 ........
[CV]  max_features=log2, min_samples_leaf=50, n_estimators=100, score=0.668349, total=   9.4s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.664936, total=   4.7s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.668059, total=   4.8s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.656568, total=   4.8s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.668589, total=   4.8s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.657034, total=   4.7s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.659690, total=   4.9s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.658688, total=   4.7s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.651598, total=   4.8s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.658742, total=   4.9s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=50 ........
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=50, score=0.659360, total=   4.8s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.663526, total=   8.8s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.666463, total=   8.6s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.656867, total=   8.6s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.667582, total=   8.6s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.660972, total=   8.7s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.661409, total=   8.6s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.657826, total=   8.7s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.653651, total=   8.6s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.656847, total=   8.5s
[CV] max_features=log2, min_samples_leaf=100, n_estimators=100 .......
[CV]  max_features=log2, min_samples_leaf=100, n_estimators=100, score=0.659686, total=   8.6s






    



[Parallel(n_jobs=1)]: Done 300 out of 300 | elapsed: 68.5min finished






    Out[40]:





GridSearchCV(cv=10, error_score='raise',
       estimator=RandomForestClassifier(bootstrap=True, class_weight='balanced',
            criterion='gini', max_depth=None, max_features='auto',
            max_leaf_nodes=None, min_impurity_split=1e-07,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=4,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False),
       fit_params={}, iid=True, n_jobs=1,
       param_grid={'max_features': [0.2, 'auto', 'log2'], 'n_estimators': [50, 100], 'min_samples_leaf': [1, 5, 10, 50, 100]},
       pre_dispatch='2*n_jobs', refit=True, return_train_score=True,
       scoring='f1', verbose=3)



In [41]:

    
print(grid_rf_l84.best_params_)









    



{'max_features': 0.2, 'min_samples_leaf': 5, 'n_estimators': 100}
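
The verbose log above lists all 300 fold scores one by one, which makes the parameter combinations hard to compare. A more compact view is available from the `cv_results_` attribute of a fitted `GridSearchCV`. The sketch below shows the idea on synthetic data with a reduced grid; `X_demo`, `y_demo`, `demo_grid` and `grid_demo` are illustrative names and not objects from this notebook. In this notebook the same table could presumably be built from `grid_rf_l84.cv_results_`.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for X_train_l84 / y_train_l84 (assumption: not the SBA data)
X_demo, y_demo = make_classification(n_samples=400, n_features=10,
                                     weights=[0.55, 0.45], random_state=0)

# Reduced, illustrative grid; 'sqrt' is used instead of the notebook's 'auto',
# which newer scikit-learn releases no longer accept for classifiers
demo_grid = {'max_features': [0.2, 'sqrt'], 'min_samples_leaf': [1, 5]}

grid_demo = GridSearchCV(
    RandomForestClassifier(n_estimators=20, class_weight='balanced',
                           random_state=0),
    param_grid=demo_grid, scoring='f1', cv=3)
grid_demo.fit(X_demo, y_demo)

# One row per parameter combination, best mean F1 first
cv_summary = (pd.DataFrame(grid_demo.cv_results_)
              [['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
              .sort_values('rank_test_score')
              .reset_index(drop=True))
print(cv_summary)
```

Looking at `std_test_score` next to `mean_test_score` helps judge whether the gap between the best and second-best combination is larger than the fold-to-fold variation.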



In [42]:

    
# Run RF with optimal parameters
opt_rf_l84 = RandomForestClassifier(class_weight = 'balanced', max_features = 0.2, min_samples_leaf = 5, n_estimators = 100)
opt_rf_l84.fit(X_train_l84, y_train_l84)
opt_rf_l84_pred = opt_rf_l84.predict(X_test_l84)



In [43]:

    
print(classification_report(y_test_l84, opt_rf_l84_pred))









    



             precision    recall  f1-score   support

          0       0.75      0.75      0.75     51666
          1       0.69      0.70      0.70     41974

avg / total       0.73      0.73      0.73     93640

Random Forest again performs much better than Logistic Regression. Let's compare its performance with that of a random model. We define this classifier as one that knows the proportion (imbalance) of classes in the training data and assigns labels to the test data at random, in the same ratio of positive to negative examples that it saw in the training data.



In [44]:

    
# Build a random (control) model to compare Random Forest performance against

# proportion of positive class in training data
pos_prop = np.sum(y_train_l84)/len(y_train_l84)  

expected_pos_in_test = np.round(pos_prop*(len(y_test_l84)))
               
                
#control predicts according to the proportions of positive and negative examples in the training data
zs = np.zeros(len(y_test_l84) - int(expected_pos_in_test)) #zeros
os = np.ones((int(expected_pos_in_test)))
                
zo = np.concatenate((zs, os))
                
y_test_control = np.random.permutation(zo)



In [45]:

    
print(classification_report(y_test_l84, y_test_control))









    



             precision    recall  f1-score   support

          0       0.55      0.55      0.55     51666
          1       0.45      0.45      0.45     41974

avg / total       0.50      0.50      0.50     93640
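
The hand-built control is close in spirit to scikit-learn's `DummyClassifier` with `strategy='stratified'`. The difference is that `DummyClassifier` draws each prediction independently from the training class distribution, whereas the control above fixes the exact number of positive predictions and shuffles them. The sketch below uses synthetic labels with roughly 45% positives (an assumption chosen to resemble the report above, not the SBA labels). It also shows why such a baseline lands near 0.45 for the positive class: when predictions are independent of the labels, precision tends to the share of positives in the test set and recall to the share of predicted positives.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import precision_score, recall_score

rng = np.random.RandomState(0)

# Synthetic labels with about 45% positives (assumption, not the SBA data)
y_train_demo = (rng.rand(20000) < 0.45).astype(int)
y_test_demo = (rng.rand(20000) < 0.45).astype(int)

# DummyClassifier ignores the features, so a placeholder column suffices
X_train_demo = np.zeros((len(y_train_demo), 1))
X_test_demo = np.zeros((len(y_test_demo), 1))

# 'stratified' samples each prediction from the training class distribution
control = DummyClassifier(strategy='stratified', random_state=0)
control.fit(X_train_demo, y_train_demo)
control_pred = control.predict(X_test_demo)

control_precision = precision_score(y_test_demo, control_pred)
control_recall = recall_score(y_test_demo, control_pred)

# Precision should sit near the test prevalence,
# recall near the predicted-positive rate
print(control_precision, y_test_demo.mean())
print(control_recall, control_pred.mean())
```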

The random (control) model has much lower precision and recall than Random Forest for loans with duration < 84 months: about 0.45 versus about 0.70 for the positive class. Let's again examine feature importances.
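
One caveat for the ranking that follows: `feature_importances_` is impurity-based, and impurity-based importances are known to favor continuous or high-cardinality features over binary indicator columns such as the state and industry dummies. Permutation importance on held-out data is a common cross-check. The sketch below runs on synthetic data and assumes a scikit-learn release that provides `sklearn.inspection.permutation_importance` (newer than the version that produced the output in this notebook); `X_demo`, `rf_demo` and the other names are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the loan features (assumption)
X_demo, y_demo = make_classification(n_samples=600, n_features=6,
                                     n_informative=3, n_redundant=0,
                                     random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=0)

rf_demo = RandomForestClassifier(n_estimators=50, min_samples_leaf=5,
                                 random_state=0).fit(X_tr, y_tr)

# Impurity-based ranking, computed from the training data
impurity_rank = np.argsort(rf_demo.feature_importances_)[::-1]

# Permutation ranking: drop in F1 on held-out data when a column is shuffled
perm = permutation_importance(rf_demo, X_te, y_te, scoring='f1',
                              n_repeats=5, random_state=0)
perm_rank = np.argsort(perm.importances_mean)[::-1]

print("impurity-based order:", impurity_rank)
print("permutation order:   ", perm_rank)
```

If the two orderings broadly agree, the impurity-based ranking is less likely to be an artifact of feature type.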



In [46]:

    
# Feature ranking for random forest
fig = plt.figure(figsize = (15,5))
importances_rf_l84 = opt_rf_l84.feature_importances_
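# spread of importances across trees; `est` avoids reusing the name of the imported sklearn `tree` module
std = np.std([est.feature_importances_ for est in opt_rf_l84.estimators_],
            axis=0)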
indices_rf_l84 = np.argsort(importances_rf_l84)[::-1]
fts_rf_l84 = list(X_train_l84.columns)
# Print the feature ranking
print("Feature ranking:")

for f in range(X_train_l84.shape[1]):
   print("%d. feature %s (%f)" % (f + 1, fts_rf_l84[indices_rf_l84[f]], importances_rf_l84[indices_rf_l84[f]]))









    



Feature ranking:
1. feature SP_to2016 (0.289196)
2. feature SBAGuaranteedApprovalAdj (0.223007)
3. feature JobsSupported (0.106649)
4. feature RevolverStatus (0.034330)
5. feature CA (0.031564)
6. feature FL (0.028554)
7. feature INDIVIDUAL (0.016419)
8. feature Retail Trade (0.014239)
9. feature GA (0.012046)
10. feature FranchiseCodeBin (0.011234)
11. feature TX (0.010899)
12. feature Health Care & Social Assistance (0.010268)
13. feature NY (0.008815)
14. feature Construction (0.008623)
15. feature PA (0.008377)
16. feature Professional, Scientific, & Technical Services (0.008231)
17. feature MA (0.008086)
18. feature Other Services (0.007896)
19. feature Wholesale Trade (0.007742)
20. feature Manufacturing (0.007437)
21. feature IL (0.006598)
22. feature Transportation & Warehousing (0.006583)
23. feature PR (0.006115)
24. feature AZ (0.005460)
25. feature Administrative/Support/Waste Management (0.005118)
26. feature MT (0.005055)
27. feature OH (0.004972)
28. feature NJ (0.004888)
29. feature UT (0.004489)
30. feature MI (0.004296)
31. feature CO (0.004280)
32. feature MN (0.004009)
33. feature NH (0.003963)
34. feature MD (0.003683)
35. feature ME (0.003624)
36. feature NV (0.003541)
37. feature Agriculture (0.003445)
38. feature WA (0.003335)
39. feature PARTNERSHIP (0.003289)
40. feature Real Estate Rental & Leasing (0.003042)
41. feature CT (0.003041)
42. feature WI (0.002921)
43. feature MO (0.002813)
44. feature HI (0.002781)
45. feature IN (0.002743)
46. feature VA (0.002600)
47. feature VT (0.002396)
48. feature Arts, Entertainment, & Recreation (0.002383)
49. feature NC (0.002303)
50. feature ND (0.002160)
51. feature Information (0.002016)
52. feature RI (0.001963)
53. feature KY (0.001959)
54. feature Finance & Insurance (0.001841)
55. feature KS (0.001798)
56. feature TN (0.001698)
57. feature LA (0.001623)
58. feature NE (0.001589)
59. feature IA (0.001523)
60. feature OR (0.001522)
61. feature OK (0.001521)
62. feature AR (0.001499)
63. feature AL (0.001444)
64. feature MS (0.001178)
65. feature SC (0.001139)
66. feature Mining (0.001131)
67. feature ID (0.001121)
68. feature Educational Services (0.000991)
69. feature NM (0.000914)
70. feature DC (0.000572)
71. feature WV (0.000395)
72. feature WY (0.000381)
73. feature SD (0.000361)
74. feature DE (0.000246)
75. feature Utilities (0.000033)
76. feature Public Administration (0.000005)
77. feature Management of Companies & Enterprises (0.000000)
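
Because the states, industries, and business types are one-hot encoded, each categorical variable's importance is spread over many small entries in this ranking. One way to compare variables as a whole is to sum the importances of their dummy columns. The sketch below does this for the top 10 entries only. The values are copied from the ranking, but the `group_of` mapping and the group names are hypothetical and would need to be built from the real encoding.

```python
import pandas as pd

# Top-10 importances copied from the feature ranking
importances = pd.Series({
    "SP_to2016": 0.289196,
    "SBAGuaranteedApprovalAdj": 0.223007,
    "JobsSupported": 0.106649,
    "RevolverStatus": 0.034330,
    "CA": 0.031564,
    "FL": 0.028554,
    "INDIVIDUAL": 0.016419,
    "Retail Trade": 0.014239,
    "GA": 0.012046,
    "FranchiseCodeBin": 0.011234,
})

# HYPOTHETICAL mapping from each dummy column to the variable it encodes
group_of = {
    "CA": "State", "FL": "State", "GA": "State",
    "INDIVIDUAL": "BusinessType",
    "Retail Trade": "Industry",
}

# Columns missing from the mapping keep their own name
groups = [group_of.get(name, name) for name in importances.index]

# Sum importances within each group and sort from largest to smallest
grouped = importances.groupby(groups).sum().sort_values(ascending=False)
print(grouped)
```

With all 77 features included, the state total would also absorb the long tail of small state importances, so the grouped view can rank a variable differently from any of its individual dummies.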






    








In [47]:

    
# Plot the 10 largest feature importances of the forest
importances = opt_rf_l84.feature_importances_
std = np.std([est.feature_importances_ for est in opt_rf_l84.estimators_],
             axis=0)
indices = np.argsort(importances)[::-1]

fig = plt.figure(figsize = (10,5))

plt.title("Feature importances")
# Horizontal bars, reversed so the most important feature is drawn at the top;
# error bars show the standard deviation of the importance across trees
plt.barh(range(10), importances[indices][0:10][::-1],
         color="r", xerr=std[indices][0:10][::-1], align="center")
# Tick labels are hard-coded in ranking order (see the printed feature ranking),
# so they must be updated by hand if the ranking changes.
plt.yticks(range(10), ['S&P 1500', 'Amount', 'Jobs Supported', 'Revolver Status',
                       'CA',  'FL', 'Individual', 'Retail Trade', 'GA', 'Franchise'][::-1])

plt.ylim([-1,10])
plt.tight_layout()


fig.savefig('L84_RF_FImportance.png', dpi = 300)
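
The tick labels in the plot are typed in by hand, so they would silently go out of sync if the ranking changed after retraining. A small sketch of one way to derive them from the ranking instead is given here. It uses three features and their importances from the printed ranking, and the `display_names` dictionary is a hypothetical helper whose two entries mirror labels used in the plot.

```python
import numpy as np

# Three columns and their importances, copied from the feature ranking
feature_names = ["JobsSupported", "SP_to2016", "SBAGuaranteedApprovalAdj"]
importances = np.array([0.106649, 0.289196, 0.223007])

# HYPOTHETICAL lookup of display names; unmapped columns keep their raw name
display_names = {"SP_to2016": "S&P 1500", "SBAGuaranteedApprovalAdj": "Amount"}

# Indices of the features, sorted from most to least important
top = np.argsort(importances)[::-1][:3]

# Labels follow the ranking automatically
labels = [display_names.get(feature_names[i], feature_names[i]) for i in top]
print(labels)  # ['S&P 1500', 'Amount', 'JobsSupported']
```

In the plotting cell, a list built this way from `X_train_l84.columns` and `indices` could be reversed and passed to `plt.yticks` in place of the literal list.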