Facies classification using plurality voting (i.e. multi-class majority voting)

Contest entry by: Matteo Niccoli and Mark Dahl

Original contest notebook by Brendon Hall, Enthought


<span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">The code and ideas in this notebook,</span> by <span xmlns:cc="http://creativecommons.org/ns#" property="cc:attributionName">Matteo Niccoli and Mark Dahl,</span> are licensed under a Creative Commons Attribution 4.0 International License.

In this notebook we will attempt to predict facies from well log data using machine learning classifiers. The dataset comes from a class exercise from The University of Kansas on Neural Networks and Fuzzy Systems. This exercise is based on a consortium project to use machine learning techniques to create a reservoir model of the largest gas fields in North America, the Hugoton and Panoma Fields. For more info on the origin of the data, see Bohling and Dubois (2003) and Dubois et al. (2007).

The dataset consists of log data from nine wells that have been labeled with a facies type based on observation of core. We will use this log data to train several classifiers, and an ensemble of them, to predict facies types.

The plan

We will create three classifiers with pre-tuned parameters:

  • the best SVM in the competition (from our team's SVM submission)
  • the best Random Forest in the competition (from the leading submission, by gccrowther)
  • a multilayer perceptron (from previous notebooks, not submitted)

We will then try to predict the facies using a plurality voting approach (plurality voting = multi-class majority voting).

From the scikit-learn website: "The idea behind the voting classifier implementation is to combine conceptually different machine learning classifiers and use a majority vote or the average predicted probabilities (soft vote) to predict the class labels. Such a classifier can be useful for a set of equally well performing models in order to balance out their individual weaknesses."
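To make the soft-voting mechanics concrete, here is a minimal sketch (not part of the original workflow) of how weighted class probabilities are averaged before picking a class; the probabilities and weights below are made up for illustration.

import numpy as np

# Toy soft-voting example for one sample and three classes (made-up numbers).
# Each row holds one classifier's predicted class probabilities.
probas = np.array([[0.2, 0.5, 0.3],   # classifier 1
                   [0.1, 0.4, 0.5],   # classifier 2
                   [0.3, 0.3, 0.4]])  # classifier 3
weights = np.array([0.30, 0.33, 0.37])

# Weighted average of the probabilities, then pick the class with the highest score.
avg_proba = np.average(probas, axis=0, weights=weights)
print(avg_proba)             # [ 0.204  0.393  0.349]
print(np.argmax(avg_proba))  # class index 1 wins the soft vote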

Exploring the dataset

First, we will examine the data set we will use to train the classifier.


In [1]:
%matplotlib inline
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.colors as colors
from mpl_toolkits.axes_grid1 import make_axes_locatable

import pandas as pd
from pandas import set_option
set_option("display.max_rows", 10)
pd.options.mode.chained_assignment = None

from sklearn import preprocessing
from sklearn.metrics import f1_score, accuracy_score, make_scorer
from sklearn.model_selection import LeaveOneGroupOut

In [2]:
filename = 'facies_vectors.csv'
training_data = pd.read_csv(filename)
training_data


Out[2]:
Facies Formation Well Name Depth GR ILD_log10 DeltaPHI PHIND PE NM_M RELPOS
0 3 A1 SH SHRIMPLIN 2793.0 77.450 0.664 9.900 11.915 4.600 1 1.000
1 3 A1 SH SHRIMPLIN 2793.5 78.260 0.661 14.200 12.565 4.100 1 0.979
2 3 A1 SH SHRIMPLIN 2794.0 79.050 0.658 14.800 13.050 3.600 1 0.957
3 3 A1 SH SHRIMPLIN 2794.5 86.100 0.655 13.900 13.115 3.500 1 0.936
4 3 A1 SH SHRIMPLIN 2795.0 74.580 0.647 13.500 13.300 3.400 1 0.915
... ... ... ... ... ... ... ... ... ... ... ...
4144 5 C LM CHURCHMAN BIBLE 3120.5 46.719 0.947 1.828 7.254 3.617 2 0.685
4145 5 C LM CHURCHMAN BIBLE 3121.0 44.563 0.953 2.241 8.013 3.344 2 0.677
4146 5 C LM CHURCHMAN BIBLE 3121.5 49.719 0.964 2.925 8.013 3.190 2 0.669
4147 5 C LM CHURCHMAN BIBLE 3122.0 51.469 0.965 3.083 7.708 3.152 2 0.661
4148 5 C LM CHURCHMAN BIBLE 3122.5 50.031 0.970 2.609 6.668 3.295 2 0.653

4149 rows × 11 columns

This data is from the Council Grove gas reservoir in Southwest Kansas. The Panoma Council Grove Field is predominantly a carbonate gas reservoir encompassing 2700 square miles in southwestern Kansas. The training data come from nine wells (4149 examples), each example consisting of seven predictor variables and a rock facies (class) label; the validation (test) data (830 examples from two wells) have the same seven predictor variables in the feature vector. Facies are based on examination of cores from the nine wells, described at half-foot intervals. The predictor variables comprise five wireline log measurements and two geologic constraining variables derived from geologic knowledge; these are essentially continuous variables sampled at a half-foot rate.

The seven predictor variables are:

  • Five wireline log curves: gamma ray (GR), resistivity (ILD_log10), photoelectric effect (PE), neutron-density porosity difference (DeltaPHI) and average neutron-density porosity (PHIND). Note that some wells do not have PE.
  • Two geologic constraining variables: nonmarine-marine indicator (NM_M) and relative position (RELPOS).

The nine discrete facies (classes of rocks) are:

  1. Nonmarine sandstone
  2. Nonmarine coarse siltstone
  3. Nonmarine fine siltstone
  4. Marine siltstone and shale
  5. Mudstone (limestone)
  6. Wackestone (limestone)
  7. Dolomite
  8. Packstone-grainstone (limestone)
  9. Phylloid-algal bafflestone (limestone)

These facies aren't discrete, and gradually blend into one another. Some have neighboring facies that are rather close. Mislabeling within these neighboring facies can be expected to occur. The following table lists the facies, their abbreviated labels and their approximate neighbors.

Facies Label Adjacent Facies
1 SS 2
2 CSiS 1,3
3 FSiS 2
4 SiSh 5
5 MS 4,6
6 WS 5,7
7 D 6,8
8 PS 6,7,9
9 BS 7,8
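Given the adjacency table above, a scorer that also counts an adjacent facies as correct can be sketched as follows (this helper is ours, not part of the original notebook; the adjacency list is read straight off the table).

# Adjacency from the table above, zero-based (facies 1 -> index 0).
adjacent_facies = [[1], [0, 2], [1], [4], [3, 5], [4, 6], [5, 7], [5, 6, 8], [6, 7]]

def accuracy_with_adjacent(y_true, y_pred, adjacent=adjacent_facies):
    # Fraction of samples whose prediction is either exact or an adjacent facies.
    correct = 0
    for t, p in zip(y_true, y_pred):
        if p == t or (p - 1) in adjacent[t - 1]:   # facies labels run from 1 to 9
            correct += 1
    return correct / float(len(y_true))

# Quick check with made-up labels: predicting FSiS (3) for a true CSiS (2) counts as correct.
print(accuracy_with_adjacent([2, 5, 8], [3, 5, 1]))   # 2 out of 3 -> 0.667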

Let's clean up this dataset. The 'Well Name' and 'Formation' columns can be turned into a categorical data type.


In [3]:
training_data['Well Name'] = training_data['Well Name'].astype('category')
training_data['Formation'] = training_data['Formation'].astype('category')
training_data['Well Name'].unique()


Out[3]:
[SHRIMPLIN, ALEXANDER D, SHANKLE, LUKE G U, KIMZEY A, CROSS H CATTLE, NOLAN, Recruit F9, NEWBY, CHURCHMAN BIBLE]
Categories (10, object): [SHRIMPLIN, ALEXANDER D, SHANKLE, LUKE G U, ..., NOLAN, Recruit F9, NEWBY, CHURCHMAN BIBLE]

These are the names of the 10 training wells in the Council Grove reservoir. Data has been recruited into pseudo-well 'Recruit F9' to better represent facies 9, the Phylloid-algal bafflestone.

Before we plot the well data, let's define a color map so the facies are represented by consistent colors in all the plots in this tutorial. We also create the abbreviated facies labels, and add those to the training_data dataframe.


In [4]:
# 1=sandstone   2=c_siltstone   3=f_siltstone   4=marine_silt_shale
# 5=mudstone    6=wackestone    7=dolomite      8=packstone   9=bafflestone
facies_colors = ['#F4D03F', '#F5B041', '#DC7633','#A569BD',
       '#000000', '#000080', '#2E86C1', '#AED6F1', '#196F3D']

facies_labels = ['SS', 'CSiS', 'FSiS', 'SiSh', 'MS',
                 'WS', 'D','PS', 'BS']
#facies_color_map is a dictionary that maps facies labels
#to their respective colors
facies_color_map = {}
for ind, label in enumerate(facies_labels):
    facies_color_map[label] = facies_colors[ind]

def label_facies(row, labels):
    return labels[ row['Facies'] -1]
    
training_data.loc[:,'FaciesLabels'] = training_data.apply(lambda row: label_facies(row, facies_labels), axis=1)
training_data.describe()


Out[4]:
Facies Depth GR ILD_log10 DeltaPHI PHIND PE NM_M RELPOS
count 4149.000000 4149.000000 4149.000000 4149.000000 4149.000000 4149.000000 3232.000000 4149.000000 4149.000000
mean 4.503254 2906.867438 64.933985 0.659566 4.402484 13.201066 3.725014 1.518438 0.521852
std 2.474324 133.300164 30.302530 0.252703 5.274947 7.132846 0.896152 0.499720 0.286644
min 1.000000 2573.500000 10.149000 -0.025949 -21.832000 0.550000 0.200000 1.000000 0.000000
25% 2.000000 2821.500000 44.730000 0.498000 1.600000 8.500000 3.100000 1.000000 0.277000
50% 4.000000 2932.500000 64.990000 0.639000 4.300000 12.020000 3.551500 2.000000 0.528000
75% 6.000000 3007.000000 79.438000 0.822000 7.500000 16.050000 4.300000 2.000000 0.769000
max 9.000000 3138.000000 361.150000 1.800000 19.312000 84.400000 8.094000 2.000000 1.000000

This is a quick view of the statistical distribution of the input variables. Looking at the count row, all columns have 4149 valid values except PE, which has only 3232. We will drop the feature vectors that don't have a valid PE entry.


In [5]:
PE_mask = training_data['PE'].notnull().values
training_data = training_data[PE_mask]
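Dropping rows is the simplest option; an alternative we do not pursue here would be to impute the missing PE values, for example with the median (a sketch, reloading the raw CSV into a separate dataframe).

# Hypothetical alternative (not used in this notebook): keep all 4149 rows and
# fill the missing PE values with the median of the available measurements.
data_imputed = pd.read_csv(filename)
data_imputed['PE'] = data_imputed['PE'].fillna(data_imputed['PE'].median())
print(data_imputed['PE'].isnull().sum())   # 0 missing values after imputation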

In [6]:
training_data.describe()


Out[6]:
Facies Depth GR ILD_log10 DeltaPHI PHIND PE NM_M RELPOS
count 3232.000000 3232.000000 3232.000000 3232.000000 3232.000000 3232.000000 3232.000000 3232.000000 3232.000000
mean 4.422030 2875.824567 66.135769 0.642719 3.559642 13.483213 3.725014 1.498453 0.520287
std 2.504243 131.006274 30.854826 0.241845 5.228948 7.698980 0.896152 0.500075 0.286792
min 1.000000 2573.500000 13.250000 -0.025949 -21.832000 0.550000 0.200000 1.000000 0.010000
25% 2.000000 2791.000000 46.918750 0.492750 1.163750 8.346750 3.100000 1.000000 0.273000
50% 4.000000 2893.500000 65.721500 0.624437 3.500000 12.150000 3.551500 1.000000 0.526000
75% 6.000000 2980.000000 79.626250 0.812735 6.432500 16.453750 4.300000 2.000000 0.767250
max 9.000000 3122.500000 361.150000 1.480000 18.600000 84.400000 8.094000 2.000000 1.000000

Now we extract just the feature variables we need to perform the classification. The predictor variables are the five log values and two geologic constraining variables, and we are also using depth. We also get a vector of the facies labels that correspond to each feature vector.


In [7]:
y = training_data['Facies'].values
print(y[25:40])
print(np.shape(y))


[3 3 2 2 2 2 2 2 3 3 3 3 3 3 3]
(3232,)

In [8]:
X = training_data.drop(['Formation', 'Well Name','Facies','FaciesLabels'], axis=1)
print(np.shape(X))
X.describe(percentiles=[.05, .25, .50, .75, .95])


(3232, 8)
Out[8]:
Depth GR ILD_log10 DeltaPHI PHIND PE NM_M RELPOS
count 3232.000000 3232.000000 3232.000000 3232.000000 3232.000000 3232.000000 3232.000000 3232.000000
mean 2875.824567 66.135769 0.642719 3.559642 13.483213 3.725014 1.498453 0.520287
std 131.006274 30.854826 0.241845 5.228948 7.698980 0.896152 0.500075 0.286792
min 2573.500000 13.250000 -0.025949 -21.832000 0.550000 0.200000 1.000000 0.010000
5% 2632.775000 23.491000 0.237299 -5.600000 4.800000 2.523300 1.000000 0.070000
25% 2791.000000 46.918750 0.492750 1.163750 8.346750 3.100000 1.000000 0.273000
50% 2893.500000 65.721500 0.624437 3.500000 12.150000 3.551500 1.000000 0.526000
75% 2980.000000 79.626250 0.812735 6.432500 16.453750 4.300000 2.000000 0.767250
95% 3061.500000 106.268000 1.045606 12.000000 27.787400 5.369550 2.000000 0.962000
max 3122.500000 361.150000 1.480000 18.600000 84.400000 8.094000 2.000000 1.000000

In [9]:
scaler = preprocessing.StandardScaler().fit(X)
X = scaler.transform(X)

Make performance scorer

We create a micro-averaged F1 scorer to evaluate classifier performance.


In [10]:
Fscorer = make_scorer(f1_score, average = 'micro')

Pre-tuned SVM classifier and leave-one-well-out average F1 score

This is the Support Vector Machine classifier from our first submission.


In [11]:
from sklearn import svm
SVC_classifier = svm.SVC(C = 100, cache_size=2400, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma=0.01, kernel='rbf',
  max_iter=-1, probability=True, random_state=49, shrinking=True,
  tol=0.001, verbose=False)

In [12]:
f1_svc = []

wells = training_data["Well Name"].values
logo = LeaveOneGroupOut()

for train, test in logo.split(X, y, groups=wells):
    well_name = wells[test[0]]
    SVC_classifier.fit(X[train], y[train])
    pred_svc = SVC_classifier.predict(X[test])
    sc = f1_score(y[test], pred_svc, labels = np.arange(10), average = 'micro')
    print("{:>20s}  {:.3f}".format(well_name, sc))
    f1_svc.append(sc)
    
print "-Average leave-one-well-out F1 Score: %6f" % (sum(f1_svc)/(1.0*(len(f1_svc))))


     CHURCHMAN BIBLE  0.542
      CROSS H CATTLE  0.347
            LUKE G U  0.440
               NEWBY  0.400
               NOLAN  0.494
          Recruit F9  0.721
             SHANKLE  0.483
           SHRIMPLIN  0.590
-Average leave-one-well-out F1 Score: 0.502174
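The leave-one-well-out loop above is repeated for each classifier below. The same average score could also be obtained more compactly with cross_val_score and the Fscorer defined earlier; a sketch, not used in the rest of this notebook:

from sklearn.model_selection import cross_val_score

# Sketch: one micro-averaged F1 score per held-out well, using the scorer and
# the LeaveOneGroupOut splitter defined above.
cv_scores = cross_val_score(SVC_classifier, X, y, groups=wells,
                            scoring=Fscorer, cv=logo)
print(cv_scores.mean())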

Pre-tuned multi-layer perceptron classifier and average F1 score


In [13]:
from sklearn.neural_network import MLPClassifier
mlp_classifier = MLPClassifier(activation='logistic', alpha=0.01, batch_size='auto',
       beta_1=0.9, beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(100,), learning_rate='adaptive',
       learning_rate_init=0.001, max_iter=1000, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=49, shuffle=True,
       solver='adam', tol=0.0001, validation_fraction=0.1, verbose=False,
       warm_start=False)

In [14]:
f1_mlp = []

wells = training_data["Well Name"].values
logo = LeaveOneGroupOut()

for train, test in logo.split(X, y, groups=wells):
    well_name = wells[test[0]]
    mlp_classifier.fit(X[train], y[train])
    pred_mlp = mlp_classifier.predict(X[test])
    sc = f1_score(y[test], pred_mlp, labels = np.arange(10), average = 'micro')
    print("{:>20s}  {:.3f}".format(well_name, sc))
    f1_mlp.append(sc)
    
print "-Average leave-one-well-out F1 Score: %6f" % (sum(f1_mlp)/(1.0*(len(f1_mlp))))


     CHURCHMAN BIBLE  0.525
      CROSS H CATTLE  0.341
            LUKE G U  0.419
               NEWBY  0.415
               NOLAN  0.482
          Recruit F9  0.779
             SHANKLE  0.541
           SHRIMPLIN  0.575
-Average leave-one-well-out F1 Score: 0.509666

Pre-tuned extra trees

This is the extra trees classifier (a close relative of Random Forest) with the parameters tuned in the leading submission, by George Crowther, but without his engineered features.


In [15]:
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import VarianceThreshold
from sklearn.ensemble import ExtraTreesClassifier

ET_classifier = make_pipeline(
    VarianceThreshold(threshold=0.49),
    ExtraTreesClassifier(criterion="entropy", max_features=0.71,
                         n_estimators=500, random_state=49))

In [16]:
f1_ET = []

wells = training_data["Well Name"].values
logo = LeaveOneGroupOut()

for train, test in logo.split(X, y, groups=wells):
    well_name = wells[test[0]]
    ET_classifier.fit(X[train], y[train])
    pred_cv = ET_classifier.predict(X[test])
    sc = f1_score(y[test], pred_cv, labels = np.arange(10), average = 'micro')
    print("{:>20s}  {:.3f}".format(well_name, sc))
    f1_ET.append(sc)
    
print "-Average leave-one-well-out F1 Score: %6f" % (sum(f1_ET)/(1.0*(len(f1_ET))))


     CHURCHMAN BIBLE  0.498
      CROSS H CATTLE  0.337
            LUKE G U  0.434
               NEWBY  0.408
               NOLAN  0.494
          Recruit F9  0.912
             SHANKLE  0.486
           SHRIMPLIN  0.614
-Average leave-one-well-out F1 Score: 0.522719

Plurality voting classifier (multi-class majority voting)

We will use a weighted approach: the weights are somewhat arbitrary, but their proportions are informed by the average F1 scores of the individual classifiers.
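One way to make that proportionality explicit (a sketch only; not how the weights below were actually chosen) is to normalize the three average leave-one-well-out F1 scores computed above:

# Sketch: weights proportional to each classifier's average leave-one-well-out F1 score.
avg_f1 = np.array([np.mean(f1_svc), np.mean(f1_mlp), np.mean(f1_ET)])
print(avg_f1 / avg_f1.sum())   # roughly [0.33, 0.33, 0.34]; we use [0.3, 0.33, 0.37] below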


In [17]:
from sklearn.ensemble import VotingClassifier

In [18]:
eclf_cv = VotingClassifier(estimators=[
        ('SVC', SVC_classifier), ('MLP', mlp_classifier), ('ET', ET_classifier)], 
                        voting='soft', weights=[0.3,0.33,0.37])

Leave-one-well-out F1 scores


In [19]:
f1_ens = []

wells = training_data["Well Name"].values
logo = LeaveOneGroupOut()

for train, test in logo.split(X, y, groups=wells):
    well_name = wells[test[0]]
    eclf_cv.fit(X[train], y[train])
    pred_cv = eclf_cv.predict(X[test])
    sc = f1_score(y[test], pred_cv, labels = np.arange(10), average = 'micro')
    print("{:>20s}  {:.3f}".format(well_name, sc))
    f1_ens.append(sc)
    
print "-Average leave-one-well-out F1 Score: %6f" % (sum(f1_ens)/(1.0*(len(f1_ens))))


     CHURCHMAN BIBLE  0.554
      CROSS H CATTLE  0.351
            LUKE G U  0.451
               NEWBY  0.400
               NOLAN  0.501
          Recruit F9  0.912
             SHANKLE  0.519
           SHRIMPLIN  0.603
-Average leave-one-well-out F1 Score: 0.536423

Comments

Using the average F1 score from leave-one-well-out cross-validation as the metric, the plurality voting classifier is superior to the individual classifiers, including the pre-tuned classifier from the leading submission. However, the classifier in the official leading submission was trained with additional features engineered by George, and it outperforms our voting classifier, with an F1 score of 0.580 against our 0.579. This is, in our view, a clear indication that feature engineering is a key element in achieving the best possible prediction.

Predicting, displaying, and saving facies for blind wells


In [20]:
blind = pd.read_csv('validation_data_nofacies.csv') 
X_blind = np.array(blind.drop(['Formation', 'Well Name'], axis=1)) 
X_blind = scaler.transform(X_blind) 
y_pred = eclf_cv.fit(X, y).predict(X_blind) 
blind['Facies'] = y_pred

In [21]:
def make_facies_log_plot(logs, facies_colors):
    #make sure logs are sorted by depth
    logs = logs.sort_values(by='Depth')
    cmap_facies = colors.ListedColormap(
            facies_colors[0:len(facies_colors)], 'indexed')
    
    ztop=logs.Depth.min(); zbot=logs.Depth.max()
    
    cluster=np.repeat(np.expand_dims(logs['Facies'].values,1), 100, 1)
    
    f, ax = plt.subplots(nrows=1, ncols=6, figsize=(8, 12))
    ax[0].plot(logs.GR, logs.Depth, '-g')
    ax[1].plot(logs.ILD_log10, logs.Depth, '-')
    ax[2].plot(logs.DeltaPHI, logs.Depth, '-', color='0.5')
    ax[3].plot(logs.PHIND, logs.Depth, '-', color='r')
    ax[4].plot(logs.PE, logs.Depth, '-', color='black')
    im=ax[5].imshow(cluster, interpolation='none', aspect='auto',
                    cmap=cmap_facies,vmin=1,vmax=9)
    
    divider = make_axes_locatable(ax[5])
    cax = divider.append_axes("right", size="20%", pad=0.05)
    cbar=plt.colorbar(im, cax=cax)
    cbar.set_label((17*' ').join([' SS ', 'CSiS', 'FSiS', 
                                'SiSh', ' MS ', ' WS ', ' D  ', 
                                ' PS ', ' BS ']))
    cbar.set_ticks(range(0,1)); cbar.set_ticklabels('')
    
    for i in range(len(ax)-1):
        ax[i].set_ylim(ztop,zbot)
        ax[i].invert_yaxis()
        ax[i].grid()
        ax[i].locator_params(axis='x', nbins=3)
    
    ax[0].set_xlabel("GR")
    ax[0].set_xlim(logs.GR.min(),logs.GR.max())
    ax[1].set_xlabel("ILD_log10")
    ax[1].set_xlim(logs.ILD_log10.min(),logs.ILD_log10.max())
    ax[2].set_xlabel("DeltaPHI")
    ax[2].set_xlim(logs.DeltaPHI.min(),logs.DeltaPHI.max())
    ax[3].set_xlabel("PHIND")
    ax[3].set_xlim(logs.PHIND.min(),logs.PHIND.max())
    ax[4].set_xlabel("PE")
    ax[4].set_xlim(logs.PE.min(),logs.PE.max())
    ax[5].set_xlabel('Facies')
    
    ax[1].set_yticklabels([]); ax[2].set_yticklabels([]); ax[3].set_yticklabels([])
    ax[4].set_yticklabels([]); ax[5].set_yticklabels([])
    ax[5].set_xticklabels([])
    f.suptitle('Well: %s'%logs.iloc[0]['Well Name'], fontsize=14,y=0.94)

In [22]:
make_facies_log_plot(blind[blind['Well Name'] == 'STUART'], facies_colors)
make_facies_log_plot(blind[blind['Well Name'] == 'CRAWFORD'], facies_colors)



In [23]:
np.save('ypred.npy', y_pred)
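Alongside the NumPy array, the blind-well predictions can also be written to a CSV for inspection (a sketch; the file name is ours).

# Hypothetical export: predicted facies with well name and depth for the two blind wells.
blind[['Well Name', 'Depth', 'Facies']].to_csv('predicted_facies_blind_wells.csv', index=False)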

Displaying predicted versus original facies in the training data

This is a nice display to finish up with, as it gives us a visual idea of the predicted facies where we have facies from the core observations. The plot uses a function from the original notebook. Let's look at the well with the lowest F1 score from the previous code block, CROSS H CATTLE, and the one with the highest F1 (excluding Recruit F9), which is SHRIMPLIN.
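Rather than reading the scores off the printout, the lowest- and highest-scoring wells can be pulled out of f1_ens directly (a small sketch of ours; it relies on the split order matching the printed order above).

# Pair each held-out well with its ensemble F1 score from the voting-classifier loop.
lowo_well_names = [wells[test_idx[0]] for _, test_idx in logo.split(X, y, groups=wells)]
well_scores = dict(zip(lowo_well_names, f1_ens))
real_wells = {w: s for w, s in well_scores.items() if w != 'Recruit F9'}   # exclude the pseudo-well
print(min(real_wells, key=real_wells.get))   # CROSS H CATTLE
print(max(real_wells, key=real_wells.get))   # SHRIMPLIN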


In [24]:
def compare_facies_plot(logs, compadre, facies_colors):
    #make sure logs are sorted by depth
    logs = logs.sort_values(by='Depth')
    cmap_facies = colors.ListedColormap(
            facies_colors[0:len(facies_colors)], 'indexed')
    
    ztop=logs.Depth.min(); zbot=logs.Depth.max()
    
    cluster1 = np.repeat(np.expand_dims(logs['Facies'].values,1), 100, 1)
    cluster2 = np.repeat(np.expand_dims(logs[compadre].values,1), 100, 1)
    
    f, ax = plt.subplots(nrows=1, ncols=7, figsize=(9, 12))
    ax[0].plot(logs.GR, logs.Depth, '-g')
    ax[1].plot(logs.ILD_log10, logs.Depth, '-')
    ax[2].plot(logs.DeltaPHI, logs.Depth, '-', color='0.5')
    ax[3].plot(logs.PHIND, logs.Depth, '-', color='r')
    ax[4].plot(logs.PE, logs.Depth, '-', color='black')
    im1 = ax[5].imshow(cluster1, interpolation='none', aspect='auto',
                    cmap=cmap_facies,vmin=1,vmax=9)
    im2 = ax[6].imshow(cluster2, interpolation='none', aspect='auto',
                    cmap=cmap_facies,vmin=1,vmax=9)
    
    divider = make_axes_locatable(ax[6])
    cax = divider.append_axes("right", size="20%", pad=0.05)
    cbar=plt.colorbar(im2, cax=cax)
    cbar.set_label((17*' ').join([' SS ', 'CSiS', 'FSiS', 
                                'SiSh', ' MS ', ' WS ', ' D  ', 
                                ' PS ', ' BS ']))
    cbar.set_ticks(range(0,1)); cbar.set_ticklabels('')
    
    for i in range(len(ax)-2):
        ax[i].set_ylim(ztop,zbot)
        ax[i].invert_yaxis()
        ax[i].grid()
        ax[i].locator_params(axis='x', nbins=3)
    
    ax[0].set_xlabel("GR")
    ax[0].set_xlim(logs.GR.min(),logs.GR.max())
    ax[1].set_xlabel("ILD_log10")
    ax[1].set_xlim(logs.ILD_log10.min(),logs.ILD_log10.max())
    ax[2].set_xlabel("DeltaPHI")
    ax[2].set_xlim(logs.DeltaPHI.min(),logs.DeltaPHI.max())
    ax[3].set_xlabel("PHIND")
    ax[3].set_xlim(logs.PHIND.min(),logs.PHIND.max())
    ax[4].set_xlabel("PE")
    ax[4].set_xlim(logs.PE.min(),logs.PE.max())
    ax[5].set_xlabel('Facies')
    ax[6].set_xlabel(compadre)
    
    ax[1].set_yticklabels([]); ax[2].set_yticklabels([]); ax[3].set_yticklabels([])
    ax[4].set_yticklabels([]); ax[5].set_yticklabels([])
    ax[5].set_xticklabels([])
    ax[6].set_xticklabels([])
    f.suptitle('Well: %s'%logs.iloc[0]['Well Name'], fontsize=14,y=0.94)

In [25]:
eclf_cv.fit(X,y)
pred = eclf_cv.predict(X)

# attach the in-sample predictions to the original (unscaled) dataframe for plotting
X = training_data
X['Prediction'] = pred

In [26]:
compare_facies_plot(X[X['Well Name'] == 'CROSS H CATTLE'], 'Prediction', facies_colors)
compare_facies_plot(X[X['Well Name'] == 'SHRIMPLIN'], 'Prediction', facies_colors)


To do next:

  • replace the current Random Forest in this notebook with our own extra-trees classifier.
  • implement new features.
