This is based on the great tutorial by @odduer, available here, which shows how to visualize CIFAR-10 on TensorBoard.
Here you'll find an example of word vector visualization :)
We'll be using an embedding matrix trained with GloVe, a word vector generation model. The matrix contains 400,000 word vectors, each with a dimensionality of 50.
In [30]:
import numpy as np

# load the pre-trained GloVe embedding matrix
word_vector = np.load('word_vector.npy')
print(word_vector.shape)  # (400000, 50)
In [31]:
import os

LOG_DIR = 'tensorboard/embedding/'
# create the log directory if it doesn't exist yet
if not os.path.exists(LOG_DIR):
    os.makedirs(LOG_DIR)
In [32]:
word_list = np.load('word_list.npy')  # this maps an index to a word
word_list = word_list.tolist()  # originally loaded as a numpy array

# create the metadata.tsv file that labels each row of the embedding
with open(os.path.join(LOG_DIR, 'metadata.tsv'), 'w+') as metadata_file:
    # write the header (required when there is more than one column)
    metadata_file.write('Word\tIndex\n')
    # write one line per word, in the same order as the embedding rows
    for index, word in enumerate(word_list):
        metadata_file.write(str(word) + '\t' + str(index) + '\n')
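As a quick sanity check, you can look up the vector for a single word. This is a minimal sketch, not part of the original tutorial: it assumes the rows of word_vector line up with word_list (as the comment above states) and that the entries are plain strings; if the words were saved as bytes, compare against b'king' instead.
In [ ]:
# hypothetical sanity check: fetch the 50-dimensional vector for one word
word = 'king'  # assumed to be in the GloVe vocabulary
if word in word_list:
    vec = word_vector[word_list.index(word)]
    print(word, '->', vec.shape, vec[:5])  # first 5 dimensions
else:
    print(word, 'not found in word_list')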
In [33]:
import os
import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector
print('Tested with TensorFlow 1.2.0')
print('Your TensorFlow version:', tf.__version__)

# create a tf.Variable to store the embedding
embedding_var = tf.Variable(word_vector, name='embedding')
with tf.Session() as sess:
    # initialize the variable
    sess.run(embedding_var.initializer)
    # create a summary writer pointing at the log directory
    summary_writer = tf.summary.FileWriter(LOG_DIR)
    # configure the TensorBoard projector to show the embedding
    config = projector.ProjectorConfig()
    embedding = config.embeddings.add()
    embedding.tensor_name = embedding_var.name
    # attach the metadata; comment out the line below if you
    # don't have a metadata file
    embedding.metadata_path = 'metadata.tsv'
    projector.visualize_embeddings(summary_writer, config)
    # save a checkpoint so TensorBoard can read the variable values
    saver = tf.train.Saver([embedding_var])
    saver.save(sess, os.path.join(LOG_DIR, 'embedding.ckpt'), 1)  # global_step=1
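To see the visualization, start TensorBoard from a terminal and point it at the log directory (this assumes the tensorboard command that ships with TensorFlow is on your PATH):

tensorboard --logdir tensorboard/embedding/

Then open http://localhost:6006 in your browser and switch to the Embeddings tab to explore the word vectors.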
In [29]:
## TODO ## Insert screenshots of embedding