This notebook demonstrates the use of the tf.data.Dataset API to build pipelines to feed data to your program. It covers:
Dataset.Dataset with eager execution enabled.We recommend using the Datasets API for building performant, complex input pipelines from simple, re-usable pieces that will feed your model's training or evaluation loops.
If you're familiar with TensorFlow graphs, the API for constructing the Dataset object remains exactly the same when eager execution is enabled, but the process of iterating over elements of the dataset is slightly simpler.
You can use Python iteration over the tf.data.Dataset object and do not need to explicitly create an tf.data.Iterator object.
As a result, the discussion on iterators in the TensorFlow Guide is not relevant when eager execution is enabled.
In [0]:
# Import TensorFlow.
import tensorflow as tf
# Enable eager execution
tf.enable_eager_execution()
DatasetCreate a source dataset using one of the factory functions like Dataset.from_tensors, Dataset.from_tensor_slices or using objects that read from files like TextLineDataset or TFRecordDataset. See the TensorFlow Guide for more information.
In [0]:
ds_tensors = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5, 6])
# Create a CSV file
import tempfile
_, filename = tempfile.mkstemp()
with open(filename, 'w') as f:
f.write("""Line 1
Line 2
Line 3
""")
ds_file = tf.data.TextLineDataset(filename)
Use the transformations functions like map, batch, shuffle etc. to apply transformations to the records of the dataset. See the API documentation for tf.data.Dataset for details.
In [0]:
ds_tensors = ds_tensors.map(tf.square).shuffle(2).batch(2)
ds_file = ds_file.batch(2)
In [0]:
print('Elements of ds_tensors:')
for x in ds_tensors:
print(x)
print('\nElements in ds_file:')
for x in ds_file:
print(x)