Administrative
● Everyone should be done with Assignment 3 by now
● Milestone grades will go out soon
Last class
● Spatial Transformers
● Segmentation
● Soft Attention
Videos
ConvNets for images
Feature-based approaches to Activity Recognition
Dense trajectories and motion boundary descriptors for action recognition
Wang et al., 2013 (code available!)
Case Study: AlexNet
[Krizhevsky et al. 2012]
Spatio-Temporal ConvNets
[3D Convolutional Neural Networks for Human Action Recognition, Ji et al., 2010]
[Sequential Deep Learning for Human Action Recognition, Baccouche et al., 2011]
Spatio-Temporal ConvNets
The variant with spatio-temporal convolutions worked best.
[Large-scale Video Classification with Convolutional Neural Networks, Karpathy et al., 2014]
Spatio-Temporal ConvNets
Learned filters on the first layer
[Large-scale Video Classification with Convolutional Neural Networks, Karpathy et al., 2014]
Spatio-Temporal ConvNets
Sports-1M: 1 million videos, 487 sports classes
[Large-scale Video Classification with Convolutional Neural Networks, Karpathy et al., 2014]
Spatio-Temporal ConvNets
3D VGGNet, basically.
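To make "3D convolution" concrete, here is a minimal sketch in PyTorch (a modern framework chosen for illustration; the clip size and filter count are assumptions, not from the slide):

```python
import torch
import torch.nn as nn

# A video clip is a 5D tensor: (batch, channels, frames, height, width).
clip = torch.randn(1, 3, 16, 112, 112)  # e.g. 16 RGB frames of 112x112

# A 2D conv treats each frame independently; a 3D conv also slides over time,
# so each neuron sees a small spatio-temporal volume of the video.
conv3d = nn.Conv3d(in_channels=3, out_channels=64,
                   kernel_size=(3, 3, 3),   # (time, height, width)
                   stride=1, padding=1)

out = conv3d(clip)
print(out.shape)  # torch.Size([1, 64, 16, 112, 112])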
Spatio-Temporal ConvNets
[Two-Stream Convolutional Networks for Action Recognition in Videos, Simonyan and Zisserman 2014]
[T. Brox and J. Malik, “Large displacement optical flow: Descriptor matching in variational motion estimation,” 2011]
Long-time Spatio-Temporal ConvNets
All 3D ConvNets so far use local motion cues to gain extra accuracy (on the order of half a second or so).
Q: What if the temporal dependencies of interest are much longer, e.g. several seconds between events?
Long-time Spatio-Temporal ConvNets
LSTM way before it was cool
Long-time Spatio-Temporal ConvNets
[Long-term Recurrent Convolutional Networks for Visual Recognition and Description, Donahue et al., 2015]
Long-time Spatio-Temporal ConvNets
[Beyond Short Snippets: Deep Networks for Video Classification, Ng et al., 2015]
Summary so far
We looked at two types of architectural patterns:
● 3D CONVNET over the video: finite temporal extent (neurons are only a function of finitely many video frames in the past)
Long-time Spatio-Temporal ConvNets
Beautiful: all neurons in the ConvNet are recurrent.
[Delving Deeper into Convolutional Networks for Learning Video Representations, Ballas et al., 2016]
Long-time Spatio-Temporal ConvNets
Normal ConvNet: a stack of convolution layers.
[Delving Deeper into Convolutional Networks for Learning Video Representations, Ballas et al., 2016]
Long-time Spatio-Temporal ConvNets
Here, CONV layer N also carries an RNN-like recurrence (GRU) over time.
[Delving Deeper into Convolutional Networks for Learning Video Representations, Ballas et al., 2016]
Long-time Spatio-Temporal ConvNets
Recall the RNN update equations: Vanilla RNN, GRU, LSTM.
[Delving Deeper into Convolutional Networks for Learning Video Representations, Ballas et al., 2016]
Long-time Spatio-Temporal ConvNets
Recall: RNNs. Key idea: replace every matrix multiply in the GRU with a convolution (GRU => Conv-GRU).
[Delving Deeper into Convolutional Networks for Learning Video Representations, Ballas et al., 2016]
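A minimal sketch of that substitution, assuming a PyTorch-style Conv-GRU cell; the layer sizes and names below are illustrative, not the authors' code:

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """GRU cell where every matrix multiply is replaced by a convolution,
    so the hidden state h is a (C, H, W) feature map instead of a vector."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        p = k // 2
        # Gates see both the input feature map and the previous hidden map.
        self.conv_zr = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)
        self.conv_h = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)

    def forward(self, x, h):
        zr = torch.sigmoid(self.conv_zr(torch.cat([x, h], dim=1)))
        z, r = zr.chunk(2, dim=1)              # update and reset gates
        h_tilde = torch.tanh(self.conv_h(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde       # gated update

# One recurrent step per frame, applied to CONV layer N's activations:
cell = ConvGRUCell(in_ch=64, hid_ch=64)
h = torch.zeros(1, 64, 28, 28)
for t in range(10):                            # 10 frames
    x_t = torch.randn(1, 64, 28, 28)           # layer-N feature map at time t
    h = cell(x_t, h)
```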
● RNN: infinite (in theory) temporal extent (neurons are a function of all video frames in the past)
● 3D CONVNET over the video: finite temporal extent (neurons are only a function of finitely many video frames in the past)
i.e. we obtain both at once: a ConvNet over the video in which every layer is recurrent.
Summary
- You think you need a Spatio-Temporal Fancy Video ConvNet
- STOP. Do you really?
- Okay, fine: do you want to model
  - local motion? (use 3D CONV), or
  - global motion? (use LSTM)
- Try using Optical Flow in a second stream (can work better sometimes)
- Try GRU-RCN! (imo best model)
Unsupervised Learning
Unsupervised Learning Overview
● Definitions
● Autoencoders
○ Vanilla
○ Variational
● Adversarial Networks
Supervised vs Unsupervised
Supervised Learning
Data: (x, y); x is data, y is label
Unsupervised Learning
Data: just x, no labels
Autoencoders
Input data x → Encoder → Features z
Autoencoders
Input data x → Encoder → Features z
Encoder: originally linear + nonlinearity (sigmoid); later deep, fully-connected; later ReLU CNN.
z is usually smaller than x (dimensionality reduction).
Autoencoders
Input data x → Encoder → Features z → Decoder → Reconstructed input data x̂
Autoencoders
Decoder: originally linear + nonlinearity (sigmoid); later deep, fully-connected; later ReLU CNN (upconv).
Example: encoder is a 4-layer conv net, decoder a 4-layer upconv net.
Autoencoders
Train for reconstruction with no labels!
Encoder and decoder sometimes share weights.
Example: dim(x) = D, dim(z) = H, w_e: H x D, w_d: D x H = w_e^T
Autoencoders
Loss function (often L2): ||x̂ - x||^2
Train for reconstruction with no labels!
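A minimal numpy sketch of the tied-weight example above, following the slide's "linear + sigmoid" recipe with the L2 reconstruction loss; the concrete sizes are assumptions:

```python
import numpy as np

D, H = 784, 64                      # dim(x) = D, dim(z) = H (assumed sizes)
rng = np.random.default_rng(0)
w_e = rng.normal(0, 0.01, (H, D))   # encoder weights, H x D
                                    # decoder weights w_d = w_e.T (shared)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

x = rng.random(D)                   # one input example
z = sigmoid(w_e @ x)                # encoder: features z
x_hat = w_e.T @ z                   # decoder with tied weights
loss = np.sum((x_hat - x) ** 2)     # L2 reconstruction loss, no labels needed
```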
Autoencoders
After training, throw away the decoder!
Input data x → Encoder → Features z
Autoencoders
Use the encoder to initialize a supervised model:
Input data x → Encoder → Features z → Classifier → Predicted label ŷ
Loss function: softmax, etc. (e.g. classes bird, plane, dog, deer, truck)
Train for the final task (sometimes with small data); fine-tune the encoder jointly with the classifier. A sketch follows.
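A minimal sketch of this fine-tuning setup in PyTorch; the 784/64/10 sizes, the nets, and the SGD settings are assumptions:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 64), nn.Sigmoid())  # from the autoencoder
classifier = nn.Linear(64, 10)                             # new head, 10 classes
model = nn.Sequential(encoder, classifier)                 # decoder thrown away

opt = torch.optim.SGD(model.parameters(), lr=0.1)          # fine-tune jointly
loss_fn = nn.CrossEntropyLoss()                            # softmax loss
x, y = torch.randn(8, 784), torch.randint(0, 10, (8,))     # small labeled batch
loss = loss_fn(model(x), y)
opt.zero_grad(); loss.backward(); opt.step()
```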
Autoencoders: Greedy Training
In the mid-2000s, layer-wise pretraining with Restricted Boltzmann Machines (RBMs) was common. Not common anymore.
Hinton and Salakhutdinov, "Reducing the Dimensionality of Data with Neural Networks", Science 2006
Autoencoders
Autoencoders can reconstruct data, and can learn features to initialize a supervised model.
Can we generate images from an autoencoder?
Variational Autoencoder
A Bayesian spin on an autoencoder - lets us generate data!
Variational Autoencoder
A Bayesian spin on an autoencoder!
Assume our data is generated like this: first sample a latent z, then sample x conditioned on z.
Intuition: x is an image, z gives class, orientation, attributes, etc.
Variational Autoencoder
Prior: assume p(z) is a unit Gaussian.
Conditional: assume p(x|z) is a diagonal Gaussian; predict its mean and variance with a neural net.
Kingma and Welling, ICLR 2014
Variational Autoencoder
Decoder network with parameters θ: from the latent state z, predict μ_x and Σ_x, the mean and (diagonal) covariance of x.
Decoder is fully-connected or upconvolutional.
Kingma and Welling, ICLR 2014
Variational Autoencoder: Encoder
By Bayes' rule the posterior is:
p(z | x) = p(x | z) p(z) / p(x)
● p(x | z): use the decoder network =)
● p(z): Gaussian =)
● p(x): intractable integral =(
Solution: approximate the posterior with an encoder network with parameters φ: from a data point x, predict μ_z and Σ_z, the mean and (diagonal) covariance of z.
Encoder is fully-connected or convolutional.
Kingma and Welling, ICLR 2014
Variational Autoencoder
Data point x → Encoder network → μ_z, Σ_z (mean and diagonal covariance of z)
Sample z from N(μ_z, Σ_z)
z → Decoder network → μ_x, Σ_x (mean and diagonal covariance of x)
Sample reconstructed x̂ from N(μ_x, Σ_x)
Kingma and Welling, ICLR 2014
Variational Autoencoder
Training is like a normal autoencoder: reconstruction loss at the end, regularization toward the prior in the middle.
● The reconstruction x̂ should be close to the data x.
● The predicted q(z | x) should be close to the prior p(z).
Kingma and Welling, ICLR 2014
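A minimal sketch of one VAE training step with the reparameterization trick (PyTorch; the single-linear encoder/decoder and all sizes are assumptions, and the decoder predicts only μ_x for brevity):

```python
import torch
import torch.nn as nn

D, Hdim = 784, 20                           # data dim, latent dim (assumed)
encoder = nn.Linear(D, 2 * Hdim)            # predicts mean and log-variance of z
decoder = nn.Linear(Hdim, D)                # predicts mean of x (mu_x)

x = torch.randn(16, D)                      # a batch of data points
mu_z, logvar_z = encoder(x).chunk(2, dim=1)

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
# so gradients flow through mu_z and logvar_z despite the sampling step.
eps = torch.randn_like(mu_z)
z = mu_z + torch.exp(0.5 * logvar_z) * eps

x_hat = decoder(z)                          # reconstruction (mean of p(x|z))
recon = ((x_hat - x) ** 2).sum(dim=1).mean()          # reconstruction loss
# KL(q(z|x) || N(0, I)) in closed form for diagonal Gaussians:
kl = 0.5 * (mu_z**2 + logvar_z.exp() - 1 - logvar_z).sum(dim=1).mean()
loss = recon + kl                           # negative ELBO (up to constants)
```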
Variational Autoencoder: Generate Data!
After the network is trained:
Sample z from the prior p(z) → Decoder network → μ_x, Σ_x → sample a generated x̂
Diagonal prior on z => independent latent variables
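Generation then needs only the decoder; a short sketch reusing the decoder and Hdim from the training sketch above:

```python
# After training: sample z from the unit-Gaussian prior and decode.
with torch.no_grad():
    z = torch.randn(64, Hdim)        # 64 samples from the prior N(0, I)
    generated = decoder(z)           # mu_x for each sample; one could also
                                     # sample with Sigma_x, but the mean is
                                     # the common choice for visualization
```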
Variational Autoencoder: Math
Maximum likelihood? Marginalize the joint distribution:
p(x) = ∫ p(x | z) p(z) dz
Intractable integral =(
Kingma and Welling, ICLR 2014
Variational Autoencoder: Math
Writing log p(x) under an expectation over the encoder's q(z | x) and rearranging gives a tractable variational lower bound (the "ELBO"):
● a reconstruction term, E_z[log p(x | z)], estimated by sampling with the reparameterization trick (see paper), and
● a regularizer, -KL(q(z | x) || p(z)), available in closed form for Gaussians.
Maximize the ELBO to train the encoder and decoder jointly.
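For reference, the bound being derived, written out; this is the standard derivation from Kingma and Welling (2014). The second line applies Bayes' rule inside the log, and the inequality drops a KL term that is always nonnegative:

```latex
\begin{aligned}
\log p_\theta(x)
  &= \mathbb{E}_{z \sim q_\phi(z \mid x)}\big[\log p_\theta(x)\big] \\
  &= \mathbb{E}_{z}\Big[\log \tfrac{p_\theta(x \mid z)\,p(z)}{p_\theta(z \mid x)}
      \cdot \tfrac{q_\phi(z \mid x)}{q_\phi(z \mid x)}\Big] \\
  &= \mathbb{E}_{z}\big[\log p_\theta(x \mid z)\big]
     - D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
     + D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p_\theta(z \mid x)\big) \\
  &\ge \underbrace{\mathbb{E}_{z}\big[\log p_\theta(x \mid z)\big]
     - D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z)\big)}_{\text{ELBO}}
\end{aligned}
```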
Generative Adversarial Nets
Can we generate images with less math?
Random noise z → Generator → Fake image x
Goodfellow et al, "Generative Adversarial Nets", NIPS 2014
Generative Adversarial Nets
Fake image x (fake examples: from the generator) or real image x (real examples: from the dataset) → Discriminator → Real or fake? y
Goodfellow et al, "Generative Adversarial Nets", NIPS 2014
Generative Adversarial Nets
Train the generator and discriminator jointly. After training, it is easy to generate images.
Goodfellow et al, "Generative Adversarial Nets", NIPS 2014
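A minimal sketch of the joint training loop (PyTorch; the tiny fully-connected nets, sizes, and optimizer settings are assumptions on top of the slide):

```python
import torch
import torch.nn as nn

Z, D = 100, 784                                  # noise dim, image dim (assumed)
G = nn.Sequential(nn.Linear(Z, 256), nn.ReLU(), nn.Linear(256, D), nn.Tanh())
Dnet = nn.Sequential(nn.Linear(D, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(Dnet.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(32, D)                    # stand-in for a real data batch
    # --- Discriminator: real -> 1, fake -> 0 ---
    fake = G(torch.randn(32, Z)).detach()        # don't backprop into G here
    d_loss = bce(Dnet(real), torch.ones(32, 1)) + \
             bce(Dnet(fake), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # --- Generator: fool the discriminator (fake -> 1) ---
    fake = G(torch.randn(32, Z))
    g_loss = bce(Dnet(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```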
Generative Adversarial Nets
Generated samples, and generated samples on CIFAR-10; each is shown next to its nearest neighbor from the training set.
Goodfellow et al, "Generative Adversarial Nets", NIPS 2014
Generative Adversarial Nets: Multiscale
Generate the image as a Laplacian pyramid, coarse to fine: generate a low-res image, then repeatedly upsample, generate a delta, and add it, until full resolution. Done!
Denton et al, "Deep generative image models using a Laplacian pyramid of adversarial networks", NIPS 2015
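A sketch of the coarse-to-fine sampling procedure under stand-in generators (the nets here are hypothetical placeholders; LAPGAN's residual generators also take a noise input, omitted for brevity):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in generators: one unconditional net for the low-res base, then one
# residual net per finer scale, conditioned on the upsampled image.
g0 = nn.Sequential(nn.Linear(100, 3 * 8 * 8), nn.Tanh())
residual_nets = [nn.Conv2d(3, 3, 3, padding=1) for _ in range(3)]

img = g0(torch.randn(1, 100)).view(1, 3, 8, 8)      # generate low-res (8x8)
for gk in residual_nets:                             # coarse to fine
    up = F.interpolate(img, scale_factor=2, mode='bilinear',
                       align_corners=False)          # upsample
    delta = gk(up)                                   # generate delta...
    img = up + delta                                 # ...and add
print(img.shape)                                     # 1 x 3 x 64 x 64: done!
```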
Generative Adversarial Nets: Multiscale
Discriminators work at every scale!
Denton et al, NIPS 2015
Generative Adversarial Nets: Simplifying
Generator: an upsampling network with fractionally-strided convolutions.
Discriminator: a convolutional network.
Radford et al, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks", ICLR 2016
Generative Adversarial Nets: Simplifying
Samples from the model look amazing!
Radford et al, ICLR 2016
Generative Adversarial Nets: Simplifying
Interpolating between random points in latent space.
Radford et al, ICLR 2016
Generative Adversarial Nets: Vector Math
Take samples from the model for three concepts: smiling woman, neutral woman, neutral man.
Average the Z vectors for each concept, then do arithmetic:
smiling woman - neutral woman + neutral man = smiling man
Radford et al, ICLR 2016
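A sketch of the procedure; `G` would be the trained generator, and the latent codes here are random stand-ins for codes whose decoded samples were labeled with each concept:

```python
import torch

# Latent codes whose decoded samples matched each concept; random stand-ins
# here, at DCGAN's usual latent size of 100.
z_smiling_woman = torch.randn(3, 100)   # 3 examples per concept, averaged
z_neutral_woman = torch.randn(3, 100)
z_neutral_man = torch.randn(3, 100)

# Average the Z vectors per concept, then do arithmetic in latent space:
z = z_smiling_woman.mean(0) - z_neutral_woman.mean(0) + z_neutral_man.mean(0)
# smiling_man = G(z.unsqueeze(0))   # decode with the trained generator G
```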
Generative Adversarial Nets: Vector Math
Likewise: glasses man - no glasses man + no glasses woman = woman with glasses
Radford et al, ICLR 2016
Putting everything together
Dosovitskiy and Brox, "Generating Images with Perceptual Similarity Metrics based on Deep Networks", arXiv 2016
Start from a variational autoencoder: x → encoder → (μ_z, Σ_z) → z → decoder → (μ_x, Σ_x) → x̂, with a pixel loss between x̂ and x.
Add a discriminator network y that judges "real or generated?" on x̂ (adversarial loss).
Add a pretrained AlexNet: compute features x_f of the real image and x̂_f of the reconstructed image, with an L2 loss between them.
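A sketch of the combined objective (the loss weights, module names, and the exact adversarial term are assumptions; the paper's formulation may differ in detail):

```python
import torch

def total_loss(x, x_hat, mu_z, logvar_z, disc, alexnet_feats,
               w_pix=1.0, w_feat=1.0, w_adv=1.0):
    """Pixel loss + AlexNet feature (perceptual) loss + adversarial loss,
    on top of the usual VAE KL term. `disc` and `alexnet_feats` are
    stand-ins for the discriminator and a frozen pretrained AlexNet."""
    pixel = ((x_hat - x) ** 2).mean()                     # pixel loss
    feat = ((alexnet_feats(x_hat) - alexnet_feats(x)) ** 2).mean()  # L2 in feature space
    adv = -torch.log(torch.sigmoid(disc(x_hat))).mean()   # fool "real or generated?"
    kl = 0.5 * (mu_z**2 + logvar_z.exp() - 1 - logvar_z).sum(1).mean()
    return w_pix * pixel + w_feat * feat + w_adv * adv + kl
```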
Putting everything together
Samples from the model, trained on ImageNet.
Dosovitskiy and Brox, "Generating Images with Perceptual Similarity Metrics based on Deep Networks", arXiv 2016
Recap
● Videos
● Unsupervised learning
○ Autoencoders: Traditional / variational
○ Generative Adversarial Networks
● Next time: Guest lecture from Jeff Dean