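Understanding the TensorFlow Framework and Its Programming Principles in Depth, Part 2: Programming Interfaces and the TensorBoard Visualization Tool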

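The previous part walked through TensorFlow's abstract programming model. In this part we get hands-on with the TensorFlow programming interfaces and with TensorBoard, its visualization tool.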

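TensorFlow offers both a C++ and a Python interface. The C++ interface is limited, whereas the Python interface is much richer and is backed by efficient numerical libraries such as numpy. We therefore recommend the Python interface.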

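Next, we walk step by step through using the Python interface to train a minimal perceptron that has only an input layer and an output layer, and use it to recognize the handwritten digits of the MNIST data set. With no hidden layer, this model is in effect softmax regression rather than a full multilayer perceptron (MLP). First we import the tensorflow library and download the data files to a specified directory.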

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Download and extract the MNIST data set.
# Retrieve the labels as one-hot-encoded vectors.
mnist = input_data.read_data_sets("/tmp/mnist", one_hot=True)

Here read_data_sets() is a helper shipped with the TensorFlow example programs that downloads the MNIST data set; calling it is all that is needed to fetch the data.
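
To make the one_hot=True flag concrete, here is a minimal sketch of one-hot encoding in plain Python. The helper name one_hot is hypothetical and is not part of TensorFlow; it only illustrates the label format that read_data_sets() returns when one_hot=True.

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere.

    Hypothetical helper for illustration; not a TensorFlow function.
    """
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


# The digit 3 becomes a 10-element vector with a single 1.0 at index 3.
encoded = one_hot(3)
print(encoded)
```

Each label therefore carries exactly one non-zero entry, which is what lets the later argmax comparison recover the digit.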

Next, we create a dataflow graph and define a series of operations in it:

graph = tf.Graph()
# Set our graph as the one to add nodes to
with graph.as_default():
    # Placeholder for input examples (None = variable dimension)
    examples = tf.placeholder(shape=[None, 784], dtype=tf.float32)
    # Placeholder for labels
    labels = tf.placeholder(shape=[None, 10], dtype=tf.float32)

    weights = tf.Variable(tf.truncated_normal(shape=[784, 10], stddev=0.1))
    bias = tf.Variable(tf.constant(0.05, shape=[10]))
    # Apply an affine transformation to the input features
    logits = tf.matmul(examples, weights) + bias
    estimates = tf.nn.softmax(logits)
    # Compute the cross-entropy
    cross_entropy = -tf.reduce_sum(labels * tf.log(estimates),
                                   reduction_indices=[1])
    # And finally the loss
    loss = tf.reduce_mean(cross_entropy)
    # Create a gradient-descent optimizer that minimizes the loss.
    # We choose a learning rate of 0.05
    optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)
    # Find the indices where the predictions were correct
    correct_predictions = tf.equal(tf.argmax(estimates, dimension=1),
                                   tf.argmax(labels, dimension=1))
    accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
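
To show what the optimizer's minimize(loss) step amounts to for this particular model, the following NumPy sketch performs one gradient-descent update by hand. It is an illustration under stated assumptions, not TensorFlow's implementation: the random stand-in batch, the helper names softmax and loss_and_grads, and the closed-form gradient (estimates - labels) / N for softmax combined with cross-entropy are introduced here and do not appear in the original listing.

```python
import numpy as np


def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def loss_and_grads(examples, labels, weights, bias):
    """Mean cross-entropy loss and its gradients for softmax regression."""
    estimates = softmax(examples @ weights + bias)
    cross_entropy = -np.sum(labels * np.log(estimates), axis=1)
    loss = cross_entropy.mean()
    # For softmax + cross-entropy, d(loss)/d(logits) = (estimates - labels) / N.
    d_logits = (estimates - labels) / examples.shape[0]
    return loss, examples.T @ d_logits, d_logits.sum(axis=0)


rng = np.random.default_rng(0)
# Stand-in data (an assumption): 32 fake images with mostly zero pixels,
# and random one-hot labels. Real MNIST batches would be used in practice.
mask = rng.random((32, 784)) < 0.1
examples = rng.random((32, 784)) * mask
labels = np.eye(10)[rng.integers(0, 10, size=32)]

weights = rng.normal(0.0, 0.1, size=(784, 10))
bias = np.full(10, 0.05)
learning_rate = 0.05

loss_before, grad_w, grad_b = loss_and_grads(examples, labels, weights, bias)
# One gradient-descent step: move the parameters against the gradient.
weights -= learning_rate * grad_w
bias -= learning_rate * grad_b
loss_after, _, _ = loss_and_grads(examples, labels, weights, bias)
print(loss_before, loss_after)
```

Repeating the update over many batches is, in essence, what running the optimizer operation in a training loop does.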

Here,

graph = tf.Graph()
# Set our graph as the one to add nodes to
with graph.as_default():

These two statements define the dataflow graph and open the block in which the graph's operations are declared.
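
The define-then-run idea behind these two statements can be illustrated with a toy dataflow graph in plain Python. This is a deliberately simplified sketch and not how TensorFlow is implemented: the ToyGraph class and its placeholder, add_op, and run methods are hypothetical names invented for this illustration.

```python
class ToyGraph:
    """A tiny deferred-execution graph: nodes are recorded first, run later."""

    def __init__(self):
        self.nodes = {}  # node name -> (function, names of input nodes)

    def placeholder(self, name):
        # A placeholder has no function; its value is supplied at run time.
        self.nodes[name] = (None, ())
        return name

    def add_op(self, name, func, *inputs):
        # Record the operation without computing anything yet.
        self.nodes[name] = (func, inputs)
        return name

    def run(self, target, feed):
        # Evaluate `target` by recursively evaluating its inputs.
        if target in feed:
            return feed[target]
        func, inputs = self.nodes[target]
        if func is None:
            raise KeyError("no value fed for placeholder %r" % target)
        return func(*(self.run(name, feed) for name in inputs))


graph = ToyGraph()
x = graph.placeholder("x")
w = graph.placeholder("w")
product = graph.add_op("product", lambda a, b: a * b, x, w)
shifted = graph.add_op("shifted", lambda a: a + 0.05, product)

# Nothing has been computed so far; values flow only when run() is called.
result = graph.run(shifted, feed={"x": 2.0, "w": 3.0})
print(result)
```

Declaring operations only records them in the graph; the actual arithmetic happens later, when concrete values are fed in.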

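In the training data, each example is a 28*28-pixel image flattened into 784 values, and each label is already encoded as a one-hot vector of 10 elements, so we define these placeholders: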

# Placeholder for input examples (None = variable dimension)
examples = tf.placeholder(shape=[None, 784], dtype=tf.float32)
# Placeholder for labels
labels = tf.placeholder(shape=[None, 10], dtype=tf.float32)
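
As a sketch of what the [None, 784] shape means in practice, the NumPy snippet below flattens batches of 28*28 images into rows of 784 values. The random stand-in images and the variable names are assumptions made for illustration; the None dimension of the placeholder simply means that any batch size is accepted.

```python
import numpy as np

rng = np.random.default_rng(0)

for batch_size in (1, 50):
    # Stand-in images (an assumption): random pixel intensities in [0, 1).
    images = rng.random((batch_size, 28, 28)).astype(np.float32)
    # Flatten each 28*28 image into one row of 784 features.
    flat = images.reshape(batch_size, 28 * 28)
    print(images.shape, "->", flat.shape)
```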

Because the input layer is fully connected to the output layer, we have:

weights = tf.Variable(tf.truncated_normal(shape=[784, 10], stddev=0.1))
bias = tf.Variable(tf.constant(0.05, shape=[10]))
# Apply an affine transformation to the input features
logits = tf.matmul(examples, weights) + bias
estimates = tf.nn.softmax(logits)

Here the weight matrix weights is a 784*10 matrix and bias is a vector of 10 elements. logits computes X·W + b, and estimates applies the softmax activation function to the result.
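
The shapes involved in X·W + b and the effect of softmax can be checked with a small NumPy sketch. It mirrors the TensorFlow lines above under stated assumptions: the batch of 5 random inputs and the softmax helper are introduced here for illustration, the weights are drawn from a plain normal distribution rather than a truncated one, and subtracting the row maximum is a common stability trick rather than something the original code does explicitly.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax: exponentiate, then normalize each row to sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # avoids overflow
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
examples = rng.random((5, 784))                 # X: batch of 5 flattened images
weights = rng.normal(0.0, 0.1, size=(784, 10))  # W: one column per digit class
bias = np.full(10, 0.05)                        # b: one offset per digit class

logits = examples @ weights + bias  # shape (5, 10); bias is broadcast per row
estimates = softmax(logits)         # each row is a probability distribution
print(logits.shape, estimates.sum(axis=1))
```

Each row of estimates can be read as the model's predicted probability for each of the 10 digit classes.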

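That completes the computation of the predicted y. But how do we measure the difference between labels and the predicted y? Rather than a plain quadratic cost function, we use the cross-entropy cost function here. Empirically, training with the cross-entropy cost often works better than training with the quadratic cost. See: "The cross-entropy cost function (its purpose and derivation)".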

So what is cross-entropy?

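The cross-entropy H(p, q) can be understood as the average code length needed to encode events drawn from the true distribution p when the code is built for the approximating model distribution q. (Yes, it does have to be tied back to information theory.)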
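
A small worked example may make the coding interpretation concrete. The sketch below uses the standard definition H(p, q) = -Σ p(x) log2 q(x), measured in bits; the two distributions are made up for illustration. The TensorFlow code earlier uses the natural logarithm instead, which only changes the unit from bits to nats.

```python
import math


def cross_entropy_bits(p, q):
    """H(p, q) = -sum_x p(x) * log2(q(x)), skipping terms where p(x) == 0."""
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)


p = [0.5, 0.5]    # true distribution (made-up example)
q = [0.25, 0.75]  # model distribution the code is built for

entropy = cross_entropy_bits(p, p)   # best achievable average code length
mismatch = cross_entropy_bits(p, q)  # average length when coding with q
print(entropy, mismatch)
```

The gap between the two values is the extra cost of coding with the wrong distribution, and it vanishes only when q equals p. Minimizing cross-entropy during training therefore pushes the model's estimates toward the label distribution.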

David 9