Presented by NVIDIA's Deep Learning Institute
Created by Allison Gray and Myrieme Demouth
This lab was prepared for NVIDIA's 2017 GPU Technology Conference and consists of four main tasks.
Deep learning is used today in a wide variety of applications. In this lab, attendees will learn how to use convolutional neural networks to perform classification and recurrent neural networks to generate characters and sentences. We will then combine these two techniques to generate captions for images and videos.
This exercise explores how to get started with TensorFlow and convolutional neural networks (CNNs). A dataset of paintings by Van Gogh and other artists is used to create a binary classifier.
In this part of the lab, we will demonstrate the power of recurrent neural networks (RNNs) with character and code generation examples. Participants will experiment with RNNs, learn how to configure them, and use them to generate sentences. The captions from the MSCOCO dataset will be used to train an RNN with different network parameters.
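To make the idea of character generation concrete, here is a minimal sketch of how an RNN produces text one character at a time. It is not the lab's TensorFlow code: it is plain Python with random, untrained weights, so the output is gibberish. The class name CharRNN, the helper functions, and the hidden size of 16 are all assumptions chosen for illustration.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class CharRNN:
    """Hypothetical vanilla RNN with random (untrained) weights."""

    def __init__(self, vocab, hidden_size=16, seed=0):
        self.vocab = vocab
        self.hidden_size = hidden_size
        self.rng = random.Random(seed)
        self.w_xh = rand_matrix(hidden_size, len(vocab), self.rng)
        self.w_hh = rand_matrix(hidden_size, hidden_size, self.rng)
        self.w_hy = rand_matrix(len(vocab), hidden_size, self.rng)

    def step(self, char_index, hidden):
        # One-hot encode the current character.
        x = [1.0 if i == char_index else 0.0 for i in range(len(self.vocab))]
        # New hidden state mixes the current input with the previous state.
        pre = [a + b for a, b in zip(matvec(self.w_xh, x), matvec(self.w_hh, hidden))]
        hidden = [math.tanh(p) for p in pre]
        # Probability distribution over the next character.
        probs = softmax(matvec(self.w_hy, hidden))
        return probs, hidden

    def generate(self, seed_char, length):
        hidden = [0.0] * self.hidden_size
        index = self.vocab.index(seed_char)
        out = [seed_char]
        for _ in range(length):
            probs, hidden = self.step(index, hidden)
            # Sample the next character and feed it back in as the next input.
            index = self.rng.choices(range(len(self.vocab)), weights=probs)[0]
            out.append(self.vocab[index])
        return "".join(out)


vocab = list("abcdefghijklmnopqrstuvwxyz ")
model = CharRNN(vocab)
text = model.generate("a", 20)
print(text)
```

In this sketch, the hidden size and the vocabulary stand in for the kind of network parameters one might vary. Training, which is what makes the generated text meaningful, is left out entirely.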
Image captioning can be performed by combining a CNN with an RNN. Attendees will get hands-on experience combining CNN outputs with RNNs. The MSCOCO images and captions will be used to train and fine-tune a network to generate captions.
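One common way to combine the two networks is to use the CNN's feature vector to set the RNN's initial hidden state, then decode words one at a time. The sketch below illustrates that idea in plain Python under loud assumptions: the weights are random and untrained, the feature vector is a made-up stand-in for real CNN output, and the vocabulary, parameter names, and the function greedy_caption are hypothetical. The lab's own TensorFlow implementation may wire the networks together differently.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical toy vocabulary with start and end markers.
VOCAB = ["<start>", "<end>", "a", "painting", "of", "flowers"]
FEATURE_SIZE, EMBED_SIZE, HIDDEN_SIZE = 8, 4, 6


def init_params(seed=0):
    """Random, untrained weights; a real model would learn these."""
    rng = random.Random(seed)
    return {
        "w_img": rand_matrix(HIDDEN_SIZE, FEATURE_SIZE, rng),  # image -> hidden
        "embed": rand_matrix(len(VOCAB), EMBED_SIZE, rng),     # word embeddings
        "w_xh": rand_matrix(HIDDEN_SIZE, EMBED_SIZE, rng),
        "w_hh": rand_matrix(HIDDEN_SIZE, HIDDEN_SIZE, rng),
        "w_hy": rand_matrix(len(VOCAB), HIDDEN_SIZE, rng),
    }


def greedy_caption(image_features, params, max_len=10):
    # The CNN feature vector conditions the RNN through its initial state.
    hidden = [math.tanh(v) for v in matvec(params["w_img"], image_features)]
    token = VOCAB.index("<start>")
    caption = []
    for _ in range(max_len):
        x = params["embed"][token]
        pre = [a + b for a, b in zip(matvec(params["w_xh"], x),
                                     matvec(params["w_hh"], hidden))]
        hidden = [math.tanh(p) for p in pre]
        scores = matvec(params["w_hy"], hidden)
        # Greedy decoding: always pick the highest-scoring word.
        token = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[token] == "<end>":
            break
        caption.append(VOCAB[token])
    return caption


# Stand-in for the feature vector a CNN would compute from an image.
fake_features = [0.1 * i for i in range(FEATURE_SIZE)]
params = init_params()
caption = greedy_caption(fake_features, params)
print(" ".join(caption))
```

Because the weights are untrained, the printed caption is meaningless. The point is only the data flow: image features go in once, and the RNN then feeds each predicted word back in as the next input.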
In this last part, we will combine everything we learned in the previous three tasks to generate captions for video clips.