TensorFlow with Scikit-Learn

Preliminary stuff you can safely ignore


In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
%matplotlib inline
%pylab inline


Populating the interactive namespace from numpy and matplotlib

In [3]:
# make sure we have at least version 0.18.1 of scikit-learn
!conda install --name root scikit-learn -y


Fetching package metadata .......
Solving package specifications: ..........

# All requested packages already installed.
# packages in environment at /home/nbcommon/anaconda3_410:
#
scikit-learn              0.18.1              np111py35_0  

In [4]:
# should be at least 0.18.1
import sklearn
sklearn.__version__


Out[4]:
'0.18.1'

In [5]:
# https://www.tensorflow.org/get_started/os_setup#anaconda_installation
!conda install --name root -c conda-forge tensorflow -y


Fetching package metadata .........
Solving package specifications: ..........

Package plan for installation in environment /home/nbcommon/anaconda3_410:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    conda-env-2.6.0            |                0         1017 B  conda-forge
    conda-4.2.13               |           py35_0         383 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         384 KB

The following packages will be SUPERCEDED by a higher-priority channel:

    conda:     4.2.13-py35_0 --> 4.2.13-py35_0 conda-forge
    conda-env: 2.6.0-0       --> 2.6.0-0       conda-forge

Pruning fetched packages from the cache ...
Fetching packages ...
conda-env-2.6. 100% |################################| Time: 0:00:00   1.37 MB/s
conda-4.2.13-p 100% |################################| Time: 0:00:00   1.32 MB/s
Extracting packages ...
[      COMPLETE      ]|###################################################| 100%
Unlinking packages ...
[      COMPLETE      ]|###################################################| 100%
Linking packages ...
[      COMPLETE      ]|###################################################| 100%

In [6]:
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)
# should be at least 0.11.0
tf.__version__


Out[6]:
'0.11.0'

In [7]:
# graph definition
matrix1 = tf.constant([[3., 3.]])
matrix2 = tf.constant([[2.],[2.]])
product = tf.matmul(matrix1, matrix2)

# launching the graph in a session
with tf.Session() as sess:
    result = sess.run([product])
    print(result)


[array([[ 12.]], dtype=float32)]
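
The deferred-execution model also supports feeding values in at run time. As a minimal sketch (not part of the original run), the same product can be computed with a placeholder instead of a constant:

# feed the left-hand matrix at run time instead of baking it into the graph
m1 = tf.placeholder(tf.float32, shape=[1, 2])
prod = tf.matmul(m1, matrix2)
with tf.Session() as sess:
    print(sess.run(prod, feed_dict={m1: [[4., 4.]]}))
# ==> [[ 16.]]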

In [8]:
sess = tf.InteractiveSession()

x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])

# Initialize 'x' using the run() method of its initializer op.
x.initializer.run()

# Add an op to subtract 'a' from 'x'.  Run it and print the result
sub = tf.sub(x, a)
print(sub.eval())
# ==> [-2. -1.]

# Close the Session when we're done.
sess.close()


[-2. -1.]

tf.contrib.learn: High-Level, Scikit-Learn-like API

https://www.tensorflow.org/tutorials/tflearn/


In [9]:
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target
X.shape, y.shape


Out[9]:
((150, 4), (150,))

In [10]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=3)
X_train.shape, y_train.shape, X_test.shape, y_test.shape


Out[10]:
((90, 4), (90,), (60, 4), (60,))

In [18]:
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]
clf = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=3,
                                            model_dir="/tmp/iris_model")

Note: a hidden-layer configuration of 10, 20, 10 is quite a lot of capacity for the 150-sample iris dataset; a single small hidden layer would likely generalize just as well. A sketch of that variant follows.
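
As a minimal sketch (not executed here; the model_dir path is just an illustrative choice), the same estimator with a single hidden layer looks like this:

small_clf = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                           hidden_units=[10],
                                           n_classes=3,
                                           model_dir="/tmp/iris_model_small")
# small_clf.fit(x=X_train, y=y_train, steps=2000)
# small_clf.evaluate(x=X_test, y=y_test)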


In [12]:
clf.fit(x=X_train, y=y_train, steps=2000)


Out[12]:
Estimator(params={'dropout': None, 'optimizer': 'Adagrad', 'gradient_clip_norm': None, 'activation_fn': <function relu at 0x7f6ecfc99400>, 'n_classes': 3, 'enable_centered_bias': True, 'hidden_units': [10, 20, 10], 'feature_columns': [_RealValuedColumn(column_name='', dimension=4, default_value=None, dtype=tf.float32, normalizer=None)], 'num_ps_replicas': 0, 'weight_column_name': None})

In [13]:
clf.evaluate(x=X_train, y=y_train)


Out[13]:
{'accuracy': 1.0, 'global_step': 6000, 'loss': 0.002281822}

In [14]:
clf.evaluate(x=X_test, y=y_test)


Out[14]:
{'accuracy': 0.94999999, 'global_step': 6000, 'loss': 0.47665894}

In [16]:
clf.predict(np.array([[6.4, 3.2, 4.5, 1.5], [5.8, 3.1, 5.0, 1.7]], dtype=float))


Out[16]:
array([1, 2])
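
Because predict() returns a plain NumPy array here, the results can also be scored with scikit-learn's own metrics. A minimal sketch (not part of the original run):

from sklearn.metrics import accuracy_score, classification_report

y_pred = clf.predict(x=X_test)
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))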
