DRAW: A Recurrent Neural Network For Image Generation (2015 ICML)
Deep Recurrent Attentive Writer (DRAW)
- Iterative drawing (this paper) vs. one-shot drawing (previous approaches)
- Mimics the foveation of the human eye
- Spatial Attention
- A sequential variational auto-encoding framework (sketched below)
- Resembles the selective read and write operations developed for the Neural Turing Machine
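A minimal sketch of the iterative-drawing loop, in PyTorch and heavily simplified: no spatial attention (the read is plain concatenation, the write is a full-image linear layer), and names such as `MiniDRAW` and `write_head` are mine, not the paper's:

```python
import torch
import torch.nn as nn

class MiniDRAW(nn.Module):
    """Simplified DRAW loop (no attention): encode -> sample z -> decode -> add to canvas."""
    def __init__(self, x_dim=784, h_dim=256, z_dim=10, T=10):
        super().__init__()
        self.T, self.h_dim = T, h_dim
        self.enc_rnn = nn.LSTMCell(2 * x_dim + h_dim, h_dim)   # reads [x, error image, h_dec]
        self.dec_rnn = nn.LSTMCell(z_dim, h_dim)
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.write_head = nn.Linear(h_dim, x_dim)              # full-image "write" stand-in

    def forward(self, x):                                      # x: (batch, x_dim) in [0, 1]
        B = x.size(0)
        h_enc = c_enc = h_dec = c_dec = x.new_zeros(B, self.h_dim)
        canvas = x.new_zeros(B, x.size(1))
        kl = x.new_zeros(B)
        for _ in range(self.T):
            x_hat = x - torch.sigmoid(canvas)                  # error image: what is still missing
            h_enc, c_enc = self.enc_rnn(torch.cat([x, x_hat, h_dec], dim=1), (h_enc, c_enc))
            mu, logvar = self.mu(h_enc), self.logvar(h_enc)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
            kl = kl + 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(dim=1)
            h_dec, c_dec = self.dec_rnn(z, (h_dec, c_dec))
            canvas = canvas + self.write_head(h_dec)           # accumulate strokes on the canvas
        return torch.sigmoid(canvas), kl                       # reconstruction + summed KL term
```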
Variational Auto-encoder
- An auto-encoder: an encoder compresses the input into latent codes, a decoder reconstructs the input from them
- Loss is the negative variational lower bound (ELBO), i.e. a variational upper bound on the negative log-likelihood of the data
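Concretely (standard VAE notation, mine rather than verbatim from the paper):

$$\log p(x) \;\ge\; \mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z)\big] - \mathrm{KL}\big(q(z \mid x)\,\|\,p(z)\big)$$

so minimizing the loss $\mathcal{L} = \mathcal{L}^x + \mathcal{L}^z$ (reconstruction plus KL) minimizes an upper bound on $-\log p(x)$. In DRAW the latent term accumulates over the $T$ drawing steps: $\mathcal{L}^z = \sum_{t=1}^{T} \mathrm{KL}\big(Q(Z_t \mid h^{enc}_t)\,\|\,P(Z_t)\big)$.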
Descriptions
- both the encoder and decoder are recurrent networks
- moreover the encoder is privy to the decoder’s previous outputs
- A dynamically updated attention mechanism
- restricts both
- the input region observed by the encoder
- the output region modified by the decoder.
Overall Network
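One time step of the full system (equations as given in the paper; $c_t$ is the canvas):

$$
\begin{aligned}
\hat{x}_t &= x - \sigma(c_{t-1})\\
r_t &= \mathrm{read}(x, \hat{x}_t, h^{dec}_{t-1})\\
h^{enc}_t &= \mathrm{RNN}^{enc}\!\big(h^{enc}_{t-1}, [r_t, h^{dec}_{t-1}]\big)\\
z_t &\sim Q(Z_t \mid h^{enc}_t)\\
h^{dec}_t &= \mathrm{RNN}^{dec}(h^{dec}_{t-1}, z_t)\\
c_t &= c_{t-1} + \mathrm{write}(h^{dec}_t)
\end{aligned}
$$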
Attention Mechanism
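The paper's attention is a grid of $N \times N$ Gaussian filters whose center, stride, and precision are emitted by the decoder. A NumPy sketch of the read operation (function names are mine; `gX, gY, delta, sigma2, gamma` correspond to the paper's $g_X, g_Y, \delta, \sigma^2, \gamma$):

```python
import numpy as np

def filterbank(g, delta, sigma2, N, size):
    # Filter centers: mu_i = g + (i - N/2 - 0.5) * delta
    mu = g + (np.arange(N) - N / 2 - 0.5) * delta           # (N,)
    a = np.arange(size)                                     # pixel coordinates 0..size-1
    F = np.exp(-((a[None, :] - mu[:, None]) ** 2) / (2 * sigma2))
    return F / (F.sum(axis=1, keepdims=True) + 1e-8)        # each filter row sums to 1

def attentive_read(x, gX, gY, delta, sigma2, gamma, N=5):
    """Extract an N x N glimpse from image x: gamma * F_Y @ x @ F_X^T."""
    FY = filterbank(gY, delta, sigma2, N, x.shape[0])       # rows (image height)
    FX = filterbank(gX, delta, sigma2, N, x.shape[1])       # cols (image width)
    return gamma * FY @ x @ FX.T
```

The write operation uses the transposed filters, $\mathrm{write}(h^{dec}_t) = \tfrac{1}{\hat\gamma}\hat{F}_Y^{\top} w_t \hat{F}_X$, with its own attention parameters emitted from $h^{dec}_t$.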
Attend, Infer, Repeat: Fast Scene Understanding with Generative Models (2016 NIPS)
Introduction
- AIR sits at the intersection of structured probabilistic models and deep networks
- Prior work on deep generative methods (e.g., VAEs [9]) has been mostly unstructured
- Pro: produces impressive samples and likelihood scores
- Con: the learned representations lack interpretable meaning
Inference
- Attend, Infer, Repeat (AIR) framework
- treats inference as an iterative process
- implemented as a recurrent neural network that attends to one object at a time
- learns to use an appropriate number of inference steps for each image (see the sketch after this list)
- End-to-end learning is enabled by recent advances in amortized variational inference
- e.g., combining gradient-based optimization for continuous latent variables with black-box optimization for discrete ones
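A rough sketch of that inference loop, again in PyTorch and heavily simplified (class and attribute names are mine; the real model decodes `z_where`/`z_what` into rendered objects and trains against a reconstruction objective):

```python
import torch
import torch.nn as nn

class MiniAIRInference(nn.Module):
    """Simplified AIR inference: one attended object per step, halted by a Bernoulli z_pres."""
    def __init__(self, x_dim=2500, h_dim=256, z_what_dim=20, max_steps=3):
        super().__init__()
        self.max_steps, self.h_dim = max_steps, h_dim
        self.rnn = nn.LSTMCell(x_dim, h_dim)
        self.z_where = nn.Linear(h_dim, 3)          # window pose: scale + (x, y) shift
        self.z_what = nn.Linear(h_dim, z_what_dim)  # appearance code of the attended object
        self.z_pres = nn.Linear(h_dim, 1)           # logit of "there is another object"

    def forward(self, x):                           # x: (batch, x_dim) flattened image
        B = x.size(0)
        h = c = x.new_zeros(B, self.h_dim)
        present = x.new_ones(B)                     # stays 1 until z_pres = 0, then 0 forever
        steps = []
        for _ in range(self.max_steps):
            h, c = self.rnn(x, (h, c))
            p = torch.sigmoid(self.z_pres(h)).squeeze(1)
            z_pres = torch.bernoulli(p) * present   # discrete: no reparameterization path,
            present = present * z_pres              # hence black-box gradient estimators
            steps.append((self.z_where(h), self.z_what(h), z_pres))
        return steps
```

Gradients for the continuous `z_where`/`z_what` flow through the reparameterization trick; the discrete `z_pres` has no such path, which is why the paper falls back on black-box (score-function / NVIL-style) estimators for it.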