Getting Started With TensorFlow

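This tutorial gets you started programming in TensorFlow. Before you begin, make sure you have installed TensorFlow. To get the most out of this tutorial, you should know: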

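  • How to program in Python.
  • At least a little about arrays.
  • Ideally, something about machine learning. If you know little or nothing about it, this tutorial is still a good place to start.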

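TensorFlow provides multiple APIs. The lowest-level API, TensorFlow Core, gives you complete programming control. We recommend TensorFlow Core for machine learning researchers and others who need fine-grained control over their models. The higher-level APIs are built on top of TensorFlow Core and are typically easier to learn and use. In addition, the higher-level APIs make repetitive tasks easier and more consistent between different users. A high-level API such as tf.contrib.learn helps you manage data sets, estimators, training, and inference. Note that some of the high-level TensorFlow APIs, those whose method names contain contrib, are still in development. Some contrib methods may change or become obsolete in later TensorFlow releases.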

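This tutorial begins with TensorFlow Core. Later, we show how to use some of the models in tf.contrib.learn. Understanding the principles of TensorFlow Core helps you understand how TensorFlow works internally.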

Tensors

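The central unit of data in TensorFlow is the tensor. A tensor consists of a set of primitive values shaped into an array of any number of dimensions. A tensor's rank is its number of dimensions. Here are some examples of tensors: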

3                                   # a rank 0 tensor; a scalar with shape []
[1., 2., 3.]                        # a rank 1 tensor; a vector with shape [3]
[[1., 2., 3.], [4., 5., 6.]]        # a rank 2 tensor; a matrix with shape [2, 3]
[[[1., 2., 3.]], [[7., 8., 9.]]]    # a rank 3 tensor with shape [2, 1, 3]
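
To make rank and shape concrete, here is a minimal plain-Python sketch that derives both from nested lists like the ones above. The helper names shape and rank are hypothetical (they are not TensorFlow functions), and the sketch assumes the lists are regularly nested, so every sub-list at a given depth has the same length.

```python
def shape(value):
    """Return the shape of a regularly nested list as a list of ints."""
    dims = []
    # Descend into the first element at each level and record the length.
    while isinstance(value, list):
        dims.append(len(value))
        if not value:
            break
        value = value[0]
    return dims


def rank(value):
    """The rank is the number of dimensions, i.e. the length of the shape."""
    return len(shape(value))


examples = [
    3,
    [1., 2., 3.],
    [[1., 2., 3.], [4., 5., 6.]],
    [[[1., 2., 3.]], [[7., 8., 9.]]],
]
for example in examples:
    print(rank(example), shape(example))
```

A scalar has an empty shape, which is why its rank is 0.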

TensorFlow Core tutorial

Importing TensorFlow

The canonical import statement for TensorFlow programs is as follows:


In [1]:
import tensorflow as tf

This gives Python access to all of TensorFlow's classes, methods, and symbols. Most of the documentation assumes you have already done this.

The Computational Graph

TensorFlow Core programs usually consist of two discrete phases:

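  1. Building the computational graph.
  2. Running the computational graph.

A computational graph is a series of TensorFlow operations arranged into a graph of nodes. Let's build a simple computational graph. Each node takes zero or more tensors as inputs and produces a tensor as an output. One type of node is a constant. Like all TensorFlow constants, it takes no inputs, and it outputs a value it stores internally. We can create two floating point tensors, node1 and node2, as follows: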

In [2]:
node1 = tf.constant(3.0, tf.float32)
node2 = tf.constant(4.0) # also tf.float32 implicitly
print(node1, node2)


Tensor("Const:0", shape=(), dtype=float32) Tensor("Const_1:0", shape=(), dtype=float32)

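Notice that printing the nodes does not output the values 3.0 and 4.0. Instead, they are nodes that produce 3.0 and 4.0, respectively, when evaluated. To actually evaluate the nodes, we must run the computational graph within a session. A session encapsulates the control and state of the TensorFlow runtime.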

The following code creates a session and then invokes its run method to run enough of the computational graph to evaluate node1 and node2:


In [3]:
sess = tf.Session()
print(sess.run([node1, node2]))


[3.0, 4.0]

We see the expected values of 3.0 and 4.0.

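We can build more complicated computations by combining tensor nodes with operations (operations are also nodes). For example, we can add our two constant nodes to produce a new graph: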


In [4]:
node3 = tf.add(node1, node2)
print("node3: ", node3)
print("sess.run(node3): ",sess.run(node3))


node3:  Tensor("Add:0", shape=(), dtype=float32)
sess.run(node3):  7.0
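
The split between building a graph and running it can be imitated in a few lines of plain Python. The sketch below is only an analogy and not how TensorFlow is implemented; the Const and Add classes and the run function are hypothetical names made up for this illustration.

```python
class Const:
    """A node with no inputs that outputs a value it stores internally."""

    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class Add:
    """A node that takes two nodes as inputs and outputs their sum."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self):
        # Nothing is computed until evaluate() is called.
        return self.left.evaluate() + self.right.evaluate()


def run(node):
    """Stand-in for a session: evaluates just enough of the graph for node."""
    return node.evaluate()


# Phase 1: build the graph. No arithmetic happens here.
node1 = Const(3.0)
node2 = Const(4.0)
node3 = Add(node1, node2)

# Phase 2: run the graph.
result = run(node3)
print(result)  # prints 7.0
```

As in the cells above, node3 itself is only a description of the computation; the number 7.0 appears only once the graph is run.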

TensorFlow provides a visualization tool called TensorBoard that can display the computational graph. Here is how TensorBoard visualizes the graph:

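As it stands, this graph is not especially interesting because it always produces a constant result. A graph can be parameterized to accept external inputs, known as placeholders. A placeholder is a promise to provide a value later.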


In [5]:
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
adder_node = a + b  # + provides a shortcut for tf.add(a, b)

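The preceding three lines are a bit like a function or a lambda in which we define two input parameters (a and b) and then an operation on them. We can evaluate this graph by using the feed_dict parameter to specify tensors that provide concrete values for these placeholders: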


In [6]:
print(sess.run(adder_node, {a: 3, b:4.5}))
print(sess.run(adder_node, {a: [1,3], b: [2, 4]}))


7.5
[ 3.  7.]
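
The function analogy can be written out directly. This plain-Python sketch is only a rough counterpart to adder_node, not TensorFlow code: the adder function is a hypothetical name, and the element-by-element branch imitates how the graph handled the list inputs above.

```python
def adder(a, b):
    """Plain-Python counterpart of adder_node: a and b act as the placeholders."""
    if isinstance(a, list):
        # Lists are added element by element, like the vector inputs above.
        return [x + y for x, y in zip(a, b)]
    return a + b


scalar_sum = adder(3, 4.5)
vector_sum = adder([1, 3], [2, 4])
print(scalar_sum)  # prints 7.5
print(vector_sum)  # prints [3, 7]
```

The difference is that the Python function computes immediately when called, while adder_node computes only when a session runs it with values fed in.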

In TensorBoard, the computational graph looks like this:

We can make the computational graph more complex by adding another operation. For example,


In [7]:
add_and_triple = adder_node * 3.
print(sess.run(add_and_triple, {a: 3, b:4.5}))


22.5

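The preceding computational graph looks as follows in TensorBoard:

In machine learning we typically want a model that can take arbitrary inputs, such as the one above. To make the model trainable, we need to be able to modify the graph so that it produces new outputs for the same input. Variables allow us to add trainable parameters to a graph. They are constructed with a type and an initial value: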


In [8]:
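# Pass dtype by keyword: the second positional argument of tf.Variable is
# `trainable`, not the data type.
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)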
x = tf.placeholder(tf.float32)
linear_model = W * x + b

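Constants are initialized when you call tf.constant, and their value can never change. By contrast, variables are not initialized when you call tf.Variable. To initialize all the variables in a TensorFlow program, you must explicitly call a special operation as follows: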


In [9]:
init = tf.global_variables_initializer()
sess.run(init)

init is a handle to the TensorFlow sub-graph that initializes all the global variables. Until we call sess.run, the variables are uninitialized.

Since x is a placeholder, we can evaluate linear_model for several values of x simultaneously:


In [11]:
print(sess.run(linear_model, {x:[1,2,3,4]}))


[ 0.          0.30000001  0.60000002  0.90000004]

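We have created a model, but we don't yet know how good it is. To evaluate the model on training data, we need a y placeholder to provide the desired values, and we need to write a loss function.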

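A loss function measures how far the current model is from the provided data. We use a standard loss model for linear regression, which sums the squares of the deltas between the current model and the provided data. linear_model - y creates a vector in which each element is the corresponding example's error delta. We call tf.square to square that error. Then we use tf.reduce_sum to sum all the squared errors into a single scalar that summarizes the error over all examples: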


In [12]:
y = tf.placeholder(tf.float32)
squared_deltas = tf.square(linear_model - y)
loss = tf.reduce_sum(squared_deltas)
print(sess.run(loss, {x:[1,2,3,4], y:[0,-1,-2,-3]}))


23.66
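
To see where the value 23.66 comes from, the same sum of squared errors can be reproduced in plain Python without TensorFlow. This is only a sketch: sum_squared_error is a hypothetical helper name, and the parameters are the initial values W = 0.3 and b = -0.3 used above.

```python
def sum_squared_error(W, b, xs, ys):
    """Sum over all examples of (W * x + b - y) ** 2."""
    return sum((W * x + b - y) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4]
ys = [0, -1, -2, -3]

# The predictions are about 0, 0.3, 0.6, 0.9, so the deltas are about
# 0, 1.3, 2.6, 3.9 and their squares about 0, 1.69, 6.76, 15.21.
initial_loss = sum_squared_error(0.3, -0.3, xs, ys)
print("%.2f" % initial_loss)  # prints 23.66
```

Calling the same helper with W = -1 and b = 1 gives a loss of zero, since the line y = -x + 1 passes through all four training points.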

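We could improve this manually by reassigning W and b to the perfect values of -1 and 1. A variable is initialized to the value provided to tf.Variable, but it can be changed using operations like tf.assign. For example, W=-1 and b=1 are the optimal parameters for this model. We can change W and b accordingly: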


In [13]:
fixW = tf.assign(W, [-1.])
fixb = tf.assign(b, [1.])
sess.run([fixW, fixb])
print(sess.run(loss, {x:[1,2,3,4], y:[0,-1,-2,-3]}))


0.0

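Here we guessed the "perfect" values of W and b, but the whole point of machine learning is to find the correct model parameters automatically. We show how to accomplish this in the next section.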

tf.train API

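A complete discussion of machine learning is beyond the scope of this tutorial. However, TensorFlow provides optimizers that slowly change each variable in order to minimize the loss function. The simplest optimizer is gradient descent. It modifies each variable according to the magnitude of the derivative of the loss with respect to that variable. In general, computing symbolic derivatives by hand is tedious and error-prone. Consequently, TensorFlow can automatically produce derivatives given only a description of the model, using the function tf.gradients. For simplicity, optimizers typically do this for you. For example,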


In [14]:
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

In [15]:
sess.run(init) # reset values to incorrect defaults.
for i in range(1000):
  sess.run(train, {x:[1,2,3,4], y:[0,-1,-2,-3]})

print(sess.run([W, b]))


[array([-0.9999969], dtype=float32), array([ 0.99999082], dtype=float32)]
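
For this small model the derivatives are simple enough to write by hand, which makes it possible to sketch in plain Python what a gradient descent step does. The code below is an illustration under stated assumptions, not TensorFlow's implementation: it uses the same data, the same starting values (W = 0.3, b = -0.3), the same learning rate of 0.01, and hand-derived gradients of the sum-of-squares loss.

```python
x_train = [1.0, 2.0, 3.0, 4.0]
y_train = [0.0, -1.0, -2.0, -3.0]

W, b = 0.3, -0.3
learning_rate = 0.01

for _ in range(1000):
    # Error of the current model on each example: W * x + b - y.
    errors = [W * x + b - y for x, y in zip(x_train, y_train)]
    # Derivatives of sum(error ** 2) with respect to W and b.
    grad_W = sum(2.0 * e * x for e, x in zip(errors, x_train))
    grad_b = sum(2.0 * e for e in errors)
    # Move each parameter a small step against its gradient.
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

print("W: %s b: %s" % (W, b))
```

After 1000 steps W and b end up very close to -1 and 1, in line with the values printed by the TensorFlow optimizer above. With an optimizer, none of the gradient expressions need to be written out.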

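Now we have done actual machine learning! Although this simple linear regression does not require much TensorFlow Core code, more complicated models and methods for feeding data into a model require more code. TensorFlow therefore provides higher-level abstractions for common patterns, structures, and functionality. We learn how to use some of these abstractions in the next section.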

Complete program

The complete trainable linear regression model is shown here:


In [16]:
import numpy as np
import tensorflow as tf

# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W * x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [1,2,3,4]
y_train = [0,-1,-2,-3]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset values to incorrect defaults
for i in range(1000):
  sess.run(train, {x:x_train, y:y_train})

# evaluate training accuracy
curr_W, curr_b, curr_loss  = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))


W: [-0.9999969] b: [ 0.99999082] loss: 5.69997e-11

This more complicated program can still be visualized in TensorBoard.

tf.contrib.learn

tf.contrib.learn is a high-level TensorFlow library that simplifies the mechanics of machine learning, including the following:

  • running training loops
  • running evaluation loops
  • managing data sets
  • managing feeding

tf.contrib.learn defines many common models.

Basic usage

Notice how much simpler the linear regression program becomes with tf.contrib.learn:


In [17]:
import tensorflow as tf
# NumPy is often used to load, manipulate and preprocess data.
import numpy as np

# Declare list of features. We only have one real-valued feature. There are many
# other types of columns that are more complicated and useful.
features = [tf.contrib.layers.real_valued_column("x", dimension=1)]

# An estimator is the front end to invoke training (fitting) and evaluation
# (inference). There are many predefined types like linear regression,
# logistic regression, linear classification, logistic classification, and
# many neural network classifiers and regressors. The following code
# provides an estimator that does linear regression.
estimator = tf.contrib.learn.LinearRegressor(feature_columns=features)

# TensorFlow provides many helper methods to read and set up data sets.
# Here we use `numpy_input_fn`. We have to tell the function how many batches
# of data (num_epochs) we want and how big each batch should be.
x = np.array([1., 2., 3., 4.])
y = np.array([0., -1., -2., -3.])
input_fn = tf.contrib.learn.io.numpy_input_fn({"x":x}, y, batch_size=4,
                                              num_epochs=1000)

# We can invoke 1000 training steps by invoking the `fit` method and passing the
# training data set.
estimator.fit(input_fn=input_fn, steps=1000)

# Here we evaluate how well our model did. In a real example, we would want
# to use a separate validation and testing data set to avoid overfitting.
print(estimator.evaluate(input_fn=input_fn))


INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: C:\Users\flynn\AppData\Local\Temp\tmpfg3y9nxz
INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000001A04F3A27F0>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_session_config': None, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': 'C:\\Users\\flynn\\AppData\\Local\\Temp\\tmpfg3y9nxz'}
WARNING:tensorflow:From c:\program files\python36\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\head.py:625: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
Instructions for updating:
Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into C:\Users\flynn\AppData\Local\Temp\tmpfg3y9nxz\model.ckpt.
INFO:tensorflow:loss = 4.25, step = 1
INFO:tensorflow:global_step/sec: 319.215
INFO:tensorflow:loss = 0.0811149, step = 101 (0.314 sec)
INFO:tensorflow:global_step/sec: 331.92
INFO:tensorflow:loss = 0.013703, step = 201 (0.301 sec)
INFO:tensorflow:global_step/sec: 299.637
INFO:tensorflow:loss = 0.00571763, step = 301 (0.334 sec)
INFO:tensorflow:global_step/sec: 272.996
INFO:tensorflow:loss = 0.00116984, step = 401 (0.366 sec)
INFO:tensorflow:global_step/sec: 337.501
INFO:tensorflow:loss = 0.000354086, step = 501 (0.296 sec)
INFO:tensorflow:global_step/sec: 337.598
INFO:tensorflow:loss = 7.46088e-05, step = 601 (0.297 sec)
INFO:tensorflow:global_step/sec: 343.372
INFO:tensorflow:loss = 1.47349e-05, step = 701 (0.291 sec)
INFO:tensorflow:global_step/sec: 347.5
INFO:tensorflow:loss = 2.81632e-06, step = 801 (0.288 sec)
INFO:tensorflow:global_step/sec: 352.457
INFO:tensorflow:loss = 7.74491e-07, step = 901 (0.283 sec)
INFO:tensorflow:Saving checkpoints for 1000 into C:\Users\flynn\AppData\Local\Temp\tmpfg3y9nxz\model.ckpt.
INFO:tensorflow:Loss for final step: 2.23433e-07.
WARNING:tensorflow:From c:\program files\python36\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\head.py:625: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
Instructions for updating:
Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
INFO:tensorflow:Starting evaluation at 2017-06-14-08:42:09
INFO:tensorflow:Restoring parameters from C:\Users\flynn\AppData\Local\Temp\tmpfg3y9nxz\model.ckpt-1000
INFO:tensorflow:Finished evaluation at 2017-06-14-08:42:11
INFO:tensorflow:Saving dict for global step 1000: global_step = 1000, loss = 1.70858e-07
{'loss': 1.708582e-07, 'global_step': 1000}

A custom model

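tf.contrib.learn does not lock you into its predefined models. Suppose we wanted to create a custom model that is not built into TensorFlow. We can still retain tf.contrib.learn's high-level abstractions for data sets, feeding, training, and so on. For illustration, we show how to implement our own equivalent of LinearRegressor using what we have learned about the lower-level API.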

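To define a custom model that works with tf.contrib.learn, we need to use tf.contrib.learn.Estimator. tf.contrib.learn.LinearRegressor is actually a sub-class of tf.contrib.learn.Estimator. Instead of sub-classing Estimator, we simply provide Estimator with a function, model_fn, that tells tf.contrib.learn how to evaluate predictions, training steps, and loss. The code is as follows: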


In [18]:
import numpy as np
import tensorflow as tf
# Model function for the custom estimator: builds predictions, loss, and
# the training op from the input features and labels.
def model(features, labels, mode):
  # Build a linear model and predict values
  W = tf.get_variable("W", [1], dtype=tf.float64)
  b = tf.get_variable("b", [1], dtype=tf.float64)
  y = W*features['x'] + b
  # Loss sub-graph
  loss = tf.reduce_sum(tf.square(y - labels))
  # Training sub-graph
  global_step = tf.train.get_global_step()
  optimizer = tf.train.GradientDescentOptimizer(0.01)
  train = tf.group(optimizer.minimize(loss),
                   tf.assign_add(global_step, 1))
  # ModelFnOps connects subgraphs we built to the
  # appropriate functionality.
  return tf.contrib.learn.ModelFnOps(
      mode=mode, predictions=y,
      loss=loss,
      train_op=train)

estimator = tf.contrib.learn.Estimator(model_fn=model)
# define our data set
x = np.array([1., 2., 3., 4.])
y = np.array([0., -1., -2., -3.])
input_fn = tf.contrib.learn.io.numpy_input_fn({"x": x}, y, 4, num_epochs=1000)

# train
estimator.fit(input_fn=input_fn, steps=1000)
# evaluate our model
print(estimator.evaluate(input_fn=input_fn, steps=10))


INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: C:\Users\flynn\AppData\Local\Temp\tmp3jfd9b4n
INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000001A04F7AF358>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_session_config': None, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': 'C:\\Users\\flynn\\AppData\\Local\\Temp\\tmp3jfd9b4n'}
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into C:\Users\flynn\AppData\Local\Temp\tmp3jfd9b4n\model.ckpt.
INFO:tensorflow:loss = 247.496701853, step = 1
INFO:tensorflow:global_step/sec: 511.142
INFO:tensorflow:loss = 0.000533193924613, step = 101 (0.197 sec)
INFO:tensorflow:global_step/sec: 618.761
INFO:tensorflow:loss = 7.35429351318e-05, step = 201 (0.162 sec)
INFO:tensorflow:global_step/sec: 580.982
INFO:tensorflow:loss = 2.90393887272e-06, step = 301 (0.173 sec)
INFO:tensorflow:global_step/sec: 579.053
INFO:tensorflow:loss = 6.44850839069e-07, step = 401 (0.172 sec)
INFO:tensorflow:global_step/sec: 569.392
INFO:tensorflow:loss = 2.70555060297e-08, step = 501 (0.176 sec)
INFO:tensorflow:global_step/sec: 371.484
INFO:tensorflow:loss = 3.2432840363e-09, step = 601 (0.271 sec)
INFO:tensorflow:global_step/sec: 404.233
INFO:tensorflow:loss = 2.67636255243e-10, step = 701 (0.246 sec)
INFO:tensorflow:global_step/sec: 582.709
INFO:tensorflow:loss = 6.11006128778e-12, step = 801 (0.171 sec)
INFO:tensorflow:global_step/sec: 630.344
INFO:tensorflow:loss = 1.67205773851e-12, step = 901 (0.159 sec)
INFO:tensorflow:Saving checkpoints for 1000 into C:\Users\flynn\AppData\Local\Temp\tmp3jfd9b4n\model.ckpt.
INFO:tensorflow:Loss for final step: 4.37125459069e-13.
INFO:tensorflow:Starting evaluation at 2017-06-14-08:53:25
INFO:tensorflow:Restoring parameters from C:\Users\flynn\AppData\Local\Temp\tmp3jfd9b4n\model.ckpt-1000
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Evaluation [10/10]
INFO:tensorflow:Finished evaluation at 2017-06-14-08:53:26
INFO:tensorflow:Saving dict for global step 1000: global_step = 1000, loss = 1.61285e-13
{'loss': 1.612852e-13, 'global_step': 1000}

Notice how similar the contents of the custom model() function are to the manual model training loop written with the lower-level API.

Next steps

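You now have a working knowledge of TensorFlow. We have several more tutorials you can look at. If you are a beginner in machine learning, see MNIST For ML Beginners; if you already have experience, see Deep MNIST for Experts.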