Tensors can be logged and viewed in TensorBoard, a browser-based machine learning visualization toolkit. Scalar tensors are shown as line charts, multidimensional tensors are shown as histograms, and images are displayed directly. The computational graph is also displayed, and there are tools for visualizing high-dimensional data in lower dimensions.
TensorBoard generates visualizations from summary data saved to a log directory by a TensorFlow summary writer. A summary is a special TensorFlow operation that takes in a tensor from the graph and outputs protocol buffers that can be written to disk. Specifically, during a TensorFlow run, summary operations are evaluated and saved to the log directory by the summary writer, then read continuously by TensorBoard, which visualizes the information in the browser as it updates. The summary writer appends records to an append-only file with "tfevents" in its filename.
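As a minimal sketch of this pipeline (assuming TensorFlow 1.x; the log directory /tmp/demo and the tag name are arbitrary), a scalar summary can be evaluated and written to an events file as follows:
import tensorflow as tf

# A scalar summary operation attached to a tensor in the graph.
value = tf.constant(0.5)
tf.summary.scalar("value", value)
summary_operation = tf.summary.merge_all()

# The FileWriter creates an append-only events.out.tfevents.* file in the log directory.
writer = tf.summary.FileWriter("/tmp/demo")
with tf.Session() as sesh:
    # Evaluating the summary operation yields a serialized protocol buffer.
    writer.add_summary(sesh.run(summary_operation), 0)
writer.close()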
Currently supported summary operations are as follows:
tf.summary.scalar
tf.summary.histogram
tf.summary.image
tf.summary.audio
tf.summary.text
Upcoming is tf.summary.tensor, a summary that can write out any type of value, because everything in TensorFlow is a tensor.
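As a brief, hedged illustration (TensorFlow 1.x graph mode; the tensors and tags here are invented), several of these operations can be attached to one graph and merged into a single operation:
import tensorflow as tf

activations = tf.random_normal([100])
image_batch = tf.zeros([1, 28, 28, 1])  # one blank 28x28 grayscale image

tf.summary.scalar("mean_activation", tf.reduce_mean(activations))
tf.summary.histogram("activations", activations)
tf.summary.image("examples", image_batch)
tf.summary.text("notes", tf.constant("step completed"))

# One operation that evaluates every summary defined above.
summary_operation = tf.summary.merge_all()
Evaluating summary_operation in a session and passing the result to a summary writer populates the corresponding dashboards.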
The scalar dashboard visualizes scalar statistics that vary over time, such as a model's loss or learning rate.
The histogram dashboard visualizes statistical distributions of tensors that vary over time. Each plot displays temporal slices of data, where each slice is a histogram of the tensor at a given step. Earlier times are towards the back while more recent times are towards the front.
The appearance of the histograms was designed by Shan Carter, who previously made interactive graphics at The New York Times, and was inspired by the cover of the Joywave album How Do You Feel?
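A minimal sketch of feeding this dashboard (TensorFlow 1.x; the distribution, tag, and log directory are invented for illustration) logs a normal distribution whose mean drifts over time, producing the front-to-back temporal slices described above:
import tensorflow as tf

k = tf.placeholder(tf.float32)
# Samples from a normal distribution whose mean shifts with k.
moving_normal = tf.random_normal(shape = [1000], mean = 5 * k)
tf.summary.histogram("moving_normal", moving_normal)
summary_operation = tf.summary.merge_all()

writer = tf.summary.FileWriter("/tmp/histogram_demo")
with tf.Session() as sesh:
    for step in range(100):
        # Each step contributes one temporal slice to the histogram plot.
        writer.add_summary(sesh.run(summary_operation, {k: step / 100.0}), step)
writer.close()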
The distribution dashboard is another way to visualize histogram data. It displays high-level statistics on distributions: each line on a plot represents a percentile of the distribution over the data. For example, the bottom line shows how the minimum value has changed over time and the middle line shows how the median has changed. In effect, the lines are drawn such that, read from inside to outside, they produce colored regions having widths $\sigma$, $2\sigma$ and $3\sigma$ respectively.
The image dashboard displays images logged with tf.summary.image. Each row corresponds to a different tag and each column corresponds to a run, and the latest image is always shown for each tag. Custom visualizations (e.g. matplotlib plots) can be displayed on this dashboard by encoding them as images, as in the sketch below.
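As a hedged sketch of that idea (TensorFlow 1.x; the figure, tag, and log directory are arbitrary), a matplotlib figure can be rendered to PNG bytes and wrapped in an image summary protocol buffer by hand:
import io
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
import tensorflow as tf

# Draw an arbitrary matplotlib figure.
figure = plt.figure()
plt.plot([0, 1, 2, 3], [0, 1, 4, 9])

# Render the figure to a PNG byte string.
buffer = io.BytesIO()
figure.savefig(buffer, format = "png")

# Wrap the PNG bytes in an image summary and write it directly.
image = tf.Summary.Image(encoded_image_string = buffer.getvalue())
summary = tf.Summary(value = [tf.Summary.Value(tag = "custom_plot", image = image)])

writer = tf.summary.FileWriter("/tmp/image_demo")
writer.add_summary(summary, 0)
writer.close()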
The audio dashboard embeds playable widgets for audio logged with tf.summary.audio. Each row corresponds to a different tag and each column corresponds to a run, and the latest audio is always shown for each tag.
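A minimal sketch for this dashboard (TensorFlow 1.x; the tone, sample rate, and log directory are invented) logs one second of a 440 Hz sine wave:
import numpy as np
import tensorflow as tf

sample_rate = 44100
t = np.linspace(0, 1, sample_rate)
# One second of a 440 Hz sine tone, shaped [batch, frames].
tone = np.sin(2 * np.pi * 440 * t).astype(np.float32)[np.newaxis, :]

audio = tf.placeholder(tf.float32, shape = [1, sample_rate])
tf.summary.audio("tone", audio, sample_rate)
summary_operation = tf.summary.merge_all()

writer = tf.summary.FileWriter("/tmp/audio_demo")
with tf.Session() as sesh:
    writer.add_summary(sesh.run(summary_operation, {audio: tone}), 0)
writer.close()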
The graph explorer visualizes a TensorFlow graph. For reasonable use of the visualizer, name scopes should be used to group the graph operations hierarchically -- TensorFlow graphs can easily have many thousands of nodes, which can be far too many to see easily all at once, or even to lay out using standard tools.
The embedding visualizer takes high-dimensional data and projects it down to two or three dimensions. One interesting way of using this is to take the input dataset and map it through the neural network to the final layer. That embedding is the learned representation of how the neural network is processing the information. So, the projection visualizes the input data after the model has embedded it in a high-dimensional space.
The embedding visualizer reads data from a model checkpoint file and can be configured with additional metadata. By default, it features the PCA and t-SNE methods and can color points by label.
It is particularly well-suited to images and vocabularies.
Because typical TensorFlow graphs are far too large to view all at once, variable names can be scoped, and the visualization uses this information to define a hierarchy on the nodes of the graph. By default, only the top of the hierarchy is shown.
TensorFlow graphs have two types of connections: data dependencies and control dependencies. Data dependencies show the flow of tensors between two operations and are displayed as solid arrows. Control dependencies are displayed as dotted lines.
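For example, a minimal sketch (the operation names are arbitrary) producing both connection types as the graph explorer would draw them:
import tensorflow as tf

a = tf.constant(1.0, name = "a")
b = tf.add(a, 1.0, name = "b")  # data dependency: solid arrow from a to b

with tf.control_dependencies([b]):
    # Control dependency: dotted line from b to c; no tensor flows between them.
    c = tf.identity(a, name = "c")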
Some steps in using TensorBoard are as follows: define a log directory, group graph operations under name scopes, attach summary operations, merge them into one operation, create a summary writer, and periodically evaluate the merged operation and write the result.
TB_SUMMARY_DIR = "/tmp/run"
with tf.name_scope("input"):
    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)
    tf.summary.histogram("input", x)
summary_operation = tf.summary.merge_all()
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sesh.graph)
_, summary = sesh.run([optimizer, summary_operation], feed_dict = feed_dict)
writer.add_summary(summary, step)
Clear out any previous logs, then launch TensorBoard pointing at the log directory:
rm -rf /tmp/run
tensorboard --logdir=/tmp/run
The following complete example puts these steps together.
In [1]:
import subprocess
import tensorflow as tf

path_logs = "/tmp/run"

# Stop any running TensorBoard, clear old logs, relaunch and open a browser tab.
subprocess.Popen(["killall tensorboard"], shell = True)
subprocess.Popen(["rm -rf {path_logs}".format(path_logs = path_logs)], shell = True)
subprocess.Popen(["tensorboard --logdir={path_logs}".format(path_logs = path_logs)], shell = True)
subprocess.Popen(["xdg-open http://127.0.1.1:6006"], shell = True)

tf.reset_default_graph()

with tf.name_scope("input"):
    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)
    tf.summary.histogram("input", x)

with tf.name_scope("architecture"):
    W = tf.Variable([0.3], dtype = tf.float32)
    b = tf.Variable([-0.3], dtype = tf.float32)
    linear_model = W * x + b
    tf.summary.histogram("W", W)
    tf.summary.histogram("b", b)
    tf.summary.histogram("linear_model", linear_model)

with tf.name_scope("loss"):
    loss = tf.reduce_sum(tf.square(linear_model - y))
    optimizer = tf.train.GradientDescentOptimizer(learning_rate = 0.01)
    train = optimizer.minimize(loss)
    tf.summary.scalar("loss", loss)

x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]

summary_operation = tf.summary.merge_all()
writer = tf.summary.FileWriter(path_logs)

with tf.Session() as sesh:
    writer.add_graph(sesh.graph)
    sesh.run(tf.global_variables_initializer())
    for i in range(1000):
        # Run one optimization step and evaluate the merged summaries.
        _, summary = sesh.run(
            [train, summary_operation],
            {
                x: x_train,
                y: y_train
            }
        )
        writer.add_summary(summary, i)
    current_W, current_b, current_loss = sesh.run(
        [W, b, loss],
        {
            x: x_train,
            y: y_train
        }
    )
    # Flush any buffered events before TensorBoard is stopped.
    writer.close()
print("W: {W}, b: {b}, loss: {loss}".format(W = current_W, b = current_b, loss = current_loss))
subprocess.Popen(["killall tensorboard"], shell = True);
Here, the linear model histogram can be seen approaching the defined target values (0, -1, -2, -3).
To visualize embeddings, there are three main steps to take. First, set up a 2D tensor variable that holds the embedding:
embedding_variable = tf.Variable(...)
Second, periodically save the model variables to a checkpoint in the log directory:
saver = tf.train.Saver()
saver.save(sesh, os.path.join(LOG_DIR, "model.ckpt"), step)
Third, optionally associate metadata such as labels with the embedding by pointing a projector configuration at the same directory:
from tensorflow.contrib.tensorboard.plugins import projector
# Create randomly initialized embedding weights which will be trained.
N = 10000 # number of items (vocabulary size)
D = 200 # dimensionality of the embedding
embedding_variable = tf.Variable(
tf.random_normal([N, D]),
name = "word_embedding"
)
configuration = projector.ProjectorConfig()
# (Multiple embeddings could be added.)
embedding = configuration.embeddings.add()
embedding.tensor_name = embedding_variable.name
# Link this tensor to its metadata file (e.g. labels).
embedding.metadata_path = os.path.join(LOG_DIR, "metadata.tsv")
# Use the same LOG_DIR where the checkpoint is stored.
summary_writer = tf.summary.FileWriter(LOG_DIR)
# Save a file projector_config.pbtxt at the directory LOG_DIR for TensorBoard to read when launched.
projector.visualize_embeddings(summary_writer, configuration)
This example creates 100 data points in a 10-dimensional embedding space; the projector then reduces them to 2 or 3 dimensions for display.
In [2]:
import os
import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector

path_logs = "./graphs/embedding_test"

# Create a dummy embedding matrix filled with pseudorandom numbers.
embedding_variable = tf.Variable(tf.truncated_normal([100, 10]), name = "embedding")

# Create a list of 100 labels for the data points and save them to a metadata
# file in the log directory, where TensorBoard can find it.
labels = [str(i) for i in range(1, 101)]
if not os.path.exists(path_logs):
    os.makedirs(path_logs)
with open(os.path.join(path_logs, "labels.csv"), mode = "wt", encoding = "utf-8") as file_metadata:
    file_metadata.write("\n".join(labels))

with tf.Session() as sesh:
    # Create a summary writer and specify the graph.
    writer = tf.summary.FileWriter(path_logs, sesh.graph)
    # Initialize the embedding variable.
    sesh.run(embedding_variable.initializer)
    # Create a configuration for the projector.
    configuration = projector.ProjectorConfig()
    # Add the embedding visualizer.
    embedding = configuration.embeddings.add()
    # Set the name of the embedding to the variable name.
    embedding.tensor_name = embedding_variable.name
    # Set the path of the metadata in order to label data points.
    embedding.metadata_path = os.path.join(path_logs, "labels.csv")
    # Add the summary writer and the configuration to the projector; this saves
    # a file projector_config.pbtxt at the log directory for TensorBoard to read.
    projector.visualize_embeddings(writer, configuration)
    # Save the model checkpoint.
    saver_embed = tf.train.Saver([embedding_variable])
    saver_embed.save(sesh, os.path.join(path_logs, "embedding_test.ckpt"), 1)
    writer.close()
Launch TensorBoard:
tensorboard --logdir=graphs/embedding_test
The data points can be set to display their labels instead of circles, or the pointer can be hovered over a circle to display the label of the corresponding data point.