
Embryo Project (GAN)

Report

Project by Aneek Das under the supervision of Manoj Kumar

SENTHILNATHAN K
02/06/2019
Goal of the Project

 To classify images of embryos based on criteria such as blastocyst quality, pregnancy and the number of pro-nuclei, using deep learning.

 The dataset includes raw videos that capture embryo development from day 1 to day 5, with each day contributing close to 2,400 images.
Challenge

● Images differed at different hours, corresponding to the time frames in the videos.
● Each day had approximately 2,400 images, which is a very small amount of data for deep learning to achieve accurate classification.
Tasks undertaken to overcome the challenge

 GANs (Generative Adversarial Networks), a revolutionary deep-learning architecture, were used in this project to augment the existing dataset by generating synthetic images, thereby increasing the dataset size for better classification with deep learning.

 To tackle the challenge of images differing across time frames in the video, all 0 hpi (hours post insemination) images were extracted from the raw videos. These images would be used to check whether images from this single time point can be used to predict the above-mentioned classifications, thereby saving time.
Snapshots of extracted 0 hpi images from raw videos

Raw videos of the embryos (they can have 1-12 wells in them) | Extracted embryo
Procedure followed to extract images

 Classifying videos according to resolution and the number of wells in the video.
 Extracting each well according to the dimensions of the video; each well has a dimension of 250x250 pixels.
 Extracting wells which are at 0 hpi. This is done using optical character recognition.
 Extracting the embryo by cropping the excess space in each well.
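The cropping steps above can be sketched in a few lines of numpy. This is a minimal sketch, not the project's actual code: the well origins and the crop margin are illustrative placeholders (the real grid layout depends on the video resolution), and the OCR check for the 0 hpi timestamp is omitted.

```python
import numpy as np

WELL_SIZE = 250  # each well is 250x250 pixels, as stated above

def extract_wells(frame, origins):
    """Crop every 250x250 well out of a frame, given each well's
    top-left corner (the grid depends on the video's resolution)."""
    return [frame[y:y + WELL_SIZE, x:x + WELL_SIZE] for (y, x) in origins]

def crop_embryo(well, margin=25):
    """Remove `margin` pixels of excess space on every side of a well."""
    return well[margin:-margin, margin:-margin]

# Example: a dummy 500x500 frame holding a 2x2 grid of wells.
frame = np.zeros((500, 500), dtype=np.uint8)
wells = extract_wells(frame, [(0, 0), (0, 250), (250, 0), (250, 250)])
embryo = crop_embryo(wells[0])
```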
What is GAN?
 GANs (Generative Adversarial Networks) are a type of generative deep neural network that learns from data using adversarial training.

 A GAN consists of two networks: the generator and the discriminator.

 The generator generates new data instances, while the discriminator evaluates them for authenticity.

 The discriminator decides whether each instance of data it reviews belongs to the actual training dataset or not.
Working Principle
The generator takes as input a vector of random numbers (z), and transforms it into the form of the data we
are interested in imitating.

The discriminator takes as input a set of data, either real (x) or generated (G(z)), and produces a probability
of that data being real (P(x)).

The generator is then optimized in order to increase the probability of the generated data being rated
highly.
Gradient ascent expression for the discriminator:

    ∇_θd (1/m) Σ_{i=1..m} [ log D(x^(i)) + log(1 − D(G(z^(i)))) ]

By alternating gradient optimization between the two networks using these expressions on new batches of real and generated data each time, the GAN will slowly converge to producing data that is as realistic as the network is capable of modeling.

Gradient descent expression for the generator:

    ∇_θg (1/m) Σ_{i=1..m} log(1 − D(G(z^(i))))
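The two objectives can be written as plain functions of the discriminator's outputs on a batch. This is a minimal numpy sketch of the standard minimax expressions, with toy discriminator outputs standing in for a real network:

```python
import numpy as np

def discriminator_objective(d_real, d_fake):
    """Quantity the discriminator ASCENDS: mean log D(x) + mean log(1 - D(G(z)))."""
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

def generator_objective(d_fake):
    """Quantity the generator DESCENDS: mean log(1 - D(G(z)))."""
    return float(np.mean(np.log(1.0 - d_fake)))

# Toy discriminator outputs for one batch of real and one of generated samples.
d_real = np.array([0.9, 0.8, 0.95])  # D is fairly confident these are real
d_fake = np.array([0.1, 0.2, 0.05])  # D is fairly confident these are fake
print(discriminator_objective(d_real, d_fake))
print(generator_objective(d_fake))
```

In a real training loop, each network's parameters are updated in turn from the gradient of its own objective while the other network is held fixed.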


First architecture experiment
Feed-forward GAN refers to the architecture where both the generator and the discriminator are fully
connected networks.

This was done as an experiment to learn the working of GANs and see how this simple architecture
works on the present dataset.

Architecture (for the embryo dataset):

Generator:

● Input: noise of dimension 200
● Fully connected layer | 512 units | ReLU activation
● Fully connected layer | 1024 units | ReLU activation
● Output: fully connected layer | 62500 units | sigmoid activation

Discriminator:

● Input: 62500 pixel values (a flattened 250x250 image)
● Fully connected layer | 1024 units | ReLU activation
● Fully connected layer | 512 units | ReLU activation
● Output: fully connected layer | 1 unit | sigmoid activation
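As an illustration, the generator above can be traced shape-by-shape in plain numpy. The weights here are randomly initialised placeholders (biases omitted for brevity), so the output is noise rather than a trained sample:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(a):
    return np.maximum(a, 0.0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dense(n_in, n_out):
    # Small random weights; float32 keeps the 1024x62500 matrix manageable.
    return 0.01 * rng.standard_normal((n_in, n_out), dtype=np.float32)

W1, W2, W3 = dense(200, 512), dense(512, 1024), dense(1024, 62500)

def generator(z):
    h = relu(z @ W1)        # fully connected, 512 units
    h = relu(h @ W2)        # fully connected, 1024 units
    return sigmoid(h @ W3)  # 62500 = 250 * 250 pixel intensities

z = rng.standard_normal((1, 200), dtype=np.float32)  # noise of dimension 200
image = generator(z).reshape(250, 250)               # one synthetic well image
```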
Results of Feed Forward GAN
• The architecture was unable to learn any features from the dataset.
• The model did not have enough parameters to learn features from the dataset.
• After 20 epochs of training, the generator had a loss of 4.36 and the discriminator had a loss of 1.74.
• For an optimal GAN, the generator loss should be 0.693 and the discriminator loss 1.386, the values obtained when the discriminator outputs 0.5 for every sample.
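These targets follow from the cross-entropy losses at the equilibrium point, where the discriminator outputs 0.5 for every sample; a quick check:

```python
import math

# At the GAN optimum the discriminator outputs D = 0.5 for every sample.
d = 0.5
generator_loss = -math.log(d)                        # -log D(G(z))
discriminator_loss = -math.log(d) - math.log(1 - d)  # -log D(x) - log(1 - D(G(z)))

print(round(generator_loss, 3))      # 0.693
print(round(discriminator_loss, 3))  # 1.386
```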

Sample images generated by the feed forward GAN


Second architecture experiment using Stack GAN

 Stack GAN is used to generate photo-realistic images of higher dimensions.

 The Stage-I GAN sketches the primitive shape and colors of the object based on the input image, yielding low-resolution Stage-I images.

 The Stage-II GAN takes the Stage-I results and the input image as inputs, and generates high-resolution images with better details. It is able to rectify defects in the Stage-I results and add compelling details through the refinement process.
Architecture of Stack GAN
Generator:

Input: 512 noise points -> fully connected (4x4x512 units) -> batch normalization -> ReLU activation -> reshape to [-1, 4, 4, 512] -> 2D upsampling -> convolution (256 feature maps, kernel size 5, tanh activation) -> 2D upsampling -> convolution (128 feature maps, kernel size 5, tanh activation) -> 2D upsampling -> output layer: convolution (3 feature maps, kernel size 5, tanh activation)

Discriminator:

Input: 32x32x3 images -> convolution (512 feature maps, kernel size 5) -> average pooling -> convolution (256 feature maps, kernel size 5) -> average pooling -> convolution (128 feature maps, kernel size 5) -> average pooling -> fully connected (1024 units, tanh activation) -> output layer: fully connected (2 units, softmax activation)
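Assuming 'same'-padded convolutions and 2x2 average pooling (the slide does not state these explicitly), the discriminator's spatial dimensions can be traced as follows:

```python
# Trace the discriminator's spatial size: 'same' convolutions keep it,
# each 2x2 average pooling halves it.
size, channels = 32, 3            # input: 32x32x3 images
for maps in (512, 256, 128):      # three conv + average-pooling stages
    channels = maps               # convolution changes the channel count
    size //= 2                    # average pooling halves each dimension
flat = size * size * channels     # features entering the dense layer
print(size, channels, flat)       # 4 128 2048
```

Under these assumptions the 1024-unit fully connected layer receives 2048 flattened features.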
Results of Stack GAN
• The input images were resized to 32x32.
• Though the generator was able to learn the difference between the background and the well, it was not able to learn the features that constitute the embryo itself.
• This is primarily because the dataset is very small. Training was done on the extracted 0 hpi embryo images.

Sample images generated by the Stack GAN


Third architecture: Boundary Equilibrium GAN (BEGAN)

● BEGAN is a technique for training autoencoder-based GANs.

● The loss is derived from the Wasserstein distance.

● The method balances the generator and discriminator during training and provides a new approximate convergence measure.

● The generator can look into the features learnt by the discriminator, so that it only learns the relevant features.
Architecture for embryo dataset
Generator:

Input noise points -> fully connected layer -> reshaping layer -> (convolutional layer -> convolutional layer -> upscaling layer) x3 -> output convolutional layer

Discriminator:

Encoder: input layer -> (convolutional layer) x6 -> max pooling layer -> reshaping layer -> fully connected layer

Decoder: fully connected layer -> reshaping layer -> (convolutional layer -> convolutional layer -> 2D upscaling) x3 -> output convolutional layer
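The balancing mechanism described above can be sketched as a bookkeeping step over the discriminator's autoencoder reconstruction losses. The gamma and lambda values below are typical hyperparameters from the BEGAN paper, not values confirmed by this report:

```python
import numpy as np

def began_step(loss_real, loss_fake, k, gamma=0.5, lam=0.001):
    """One BEGAN bookkeeping step, where loss_real = L(x) and
    loss_fake = L(G(z)) are autoencoder reconstruction losses."""
    d_loss = loss_real - k * loss_fake               # discriminator loss
    g_loss = loss_fake                               # generator loss
    balance = gamma * loss_real - loss_fake          # equilibrium term
    k = float(np.clip(k + lam * balance, 0.0, 1.0))  # control variable update
    m_global = loss_real + abs(balance)              # convergence measure
    return d_loss, g_loss, k, m_global

# Toy reconstruction losses for one batch; k starts at 0 as in the paper.
d_loss, g_loss, k, m = began_step(loss_real=0.8, loss_fake=0.3, k=0.0)
```

The convergence measure m_global is what makes BEGAN training progress easy to monitor: it decreases as the generator improves, independent of the adversarial see-saw.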
Results for dataset containing all embryos from all 5 days
INFERENCE
• After 342,000 steps, the generator suffered from mode collapse.

• Since the dataset contained empty wells, the generator learned that it could fool the discriminator simply by generating empty wells without any features.

• After this point, the iterations produced mostly empty wells, and the generator progressively forgot the features it had learnt.

• The best generator checkpoint was at step 193,000, with a generator loss of 0.1380 and a discriminator accuracy of 0.4788.
Further Pre-processing

 To avoid mode collapse caused by the empty embryo wells in the dataset, all empty wells were removed.

 Techniques such as feature matching and minibatch discrimination were implemented to avoid mode collapse.
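Feature matching replaces the generator's usual objective with matching the batch-mean intermediate discriminator features of real and generated data, which discourages collapsing onto a single output. A minimal sketch (the feature dimensions here are illustrative, not the project's actual layer sizes):

```python
import numpy as np

def feature_matching_loss(f_real, f_fake):
    """Squared distance between the batch-mean discriminator features
    of real and generated samples."""
    return float(np.sum((f_real.mean(axis=0) - f_fake.mean(axis=0)) ** 2))

rng = np.random.default_rng(1)
f_real = rng.normal(size=(64, 128))  # intermediate D features, real batch
f_fake = rng.normal(size=(64, 128))  # intermediate D features, fake batch
loss = feature_matching_loss(f_real, f_fake)
```

Minibatch discrimination attacks the same failure from the discriminator's side: statistics of pairwise similarity across the batch are appended to the discriminator's features so it can detect a generator that produces near-identical samples.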
Results after 18 hours - all classes
Results for 68 hpi class 5 embryos
Final conclusions made from all the performed
experiments
 The generated images look quite good. However, there are certain flaws:

1. The generator is only able to generate two kinds of images.
2. Zooming into the images reveals many white pixels scattered across them.
3. These artifacts may hinder the features learnt by the classification model and lead to poor results.

 The individual classes across all 5 days have very few images.

 The number of training images for the 68 hpi class-5 embryos was only 492.

 As the images show, even though the generator is able to learn from the discriminator, the discriminator itself cannot learn the training data due to insufficient images.
 After 105,000 steps, the discriminator reaches an accuracy of 0.487, which is close to the optimal value (0.5), but the loss stops decreasing. This supports the point above.

 Since we are concerned with day-wise extracted images, more timestamps during each day have to be taken into consideration for there to be sufficient data.
THANKS
