
Embryo Project (GAN)

Report

Project by Aneek Das under the supervision of Manoj Kumar

SENTHILNATHAN K
02/06/2019
Goal of the Project

 To classify images of embryos based on criteria such as blastocyst quality, pregnancy and the number of pro-nuclei, using deep learning.

 The dataset includes raw videos that capture embryo development from day 1 to day 5, with each day contributing close to 2,400 images.
Challenge

● Images differed at different hours, corresponding to the time frames in the videos.
● Each day had approximately 2,400 images, which is a very small amount of data for deep learning to achieve accurate classification.
Tasks undertaken to overcome the challenge

 GANs (Generative Adversarial Networks), a revolutionary deep-learning architecture, were used in this project to augment the existing dataset by generating synthetic images, thereby increasing the dataset size for better classification with deep learning.

 To tackle the challenge of images differing across time frames in the video, all 0 hpi (hours post insemination) images were extracted from the raw videos. These images would be used to check whether images from this single time point can be used to predict the above-mentioned classifications, thereby saving time.
Snapshots of extracted 0 hpi images from raw videos

Raw videos of the embryos (they can have 1-12 wells in them) | Extracted embryo
Procedure followed to extract images

 Classifying videos according to resolution and the number of wells in the video.
 Extracting each well according to the dimensions of the video; each well has a dimension of 250x250 pixels.
 Extracting wells which are at 0 hpi. This is done using optical character recognition.
 Extracting the embryo by cropping the excess space in each well.
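The cropping steps above can be sketched in a few lines of numpy. This is a minimal sketch, not the project's actual code: the well origins and the crop margin are illustrative placeholders (the real grid layout depends on the video resolution), and the OCR check for the 0 hpi timestamp is omitted.

```python
import numpy as np

WELL_SIZE = 250  # each well is 250x250 pixels, as stated above

def extract_wells(frame, origins):
    """Crop every 250x250 well out of a frame, given each well's
    top-left corner (the grid depends on the video's resolution)."""
    return [frame[y:y + WELL_SIZE, x:x + WELL_SIZE] for (y, x) in origins]

def crop_embryo(well, margin=25):
    """Remove `margin` pixels of excess space on every side of a well."""
    return well[margin:-margin, margin:-margin]

# Example: a dummy 500x500 frame holding a 2x2 grid of wells.
frame = np.zeros((500, 500), dtype=np.uint8)
wells = extract_wells(frame, [(0, 0), (0, 250), (250, 0), (250, 250)])
embryo = crop_embryo(wells[0])
```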
What is GAN?
 GANs (Generative Adversarial Networks) are a type of generative deep neural network that learns from data using adversarial training.

 A GAN consists of two networks: the generator and the discriminator.

 The generator generates new data instances, while the discriminator evaluates them for authenticity.

 The discriminator decides whether each instance of data it reviews belongs to the actual training dataset or not.
Working Principle
The generator takes as input a vector of random numbers (z), and transforms it into the form of the data we
are interested in imitating.

The discriminator takes as input a set of data, either real (x) or generated (G(z)), and produces a probability
of that data being real (P(x)).

The generator is then optimized in order to increase the probability of the generated data being rated
highly.
Gradient ascent expression for the discriminator:

    ∇_θd (1/m) Σ_{i=1..m} [ log D(x^(i)) + log(1 − D(G(z^(i)))) ]

By alternating gradient optimization between the two networks using these expressions on new batches of real and generated data each time, the GAN will slowly converge to producing data that is as realistic as the network is capable of modeling.

Gradient descent expression for the generator:

    ∇_θg (1/m) Σ_{i=1..m} log(1 − D(G(z^(i))))
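The two objectives can be written as plain functions of the discriminator's outputs on a batch. This is a minimal numpy sketch of the standard minimax expressions, with toy discriminator outputs standing in for a real network:

```python
import numpy as np

def discriminator_objective(d_real, d_fake):
    """Quantity the discriminator ASCENDS: mean log D(x) + mean log(1 - D(G(z)))."""
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

def generator_objective(d_fake):
    """Quantity the generator DESCENDS: mean log(1 - D(G(z)))."""
    return float(np.mean(np.log(1.0 - d_fake)))

# Toy discriminator outputs for one batch of real and one of generated samples.
d_real = np.array([0.9, 0.8, 0.95])  # D is fairly confident these are real
d_fake = np.array([0.1, 0.2, 0.05])  # D is fairly confident these are fake
print(discriminator_objective(d_real, d_fake))
print(generator_objective(d_fake))
```

In a real training loop, each network's parameters are updated in turn from the gradient of its own objective while the other network is held fixed.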


First architecture experiment
Feed-forward GAN refers to the architecture where both the generator and the discriminator are fully
connected networks.

This was done as an experiment to learn the working of GANs and see how this simple architecture
works on the present dataset.

Architecture (for the embryo dataset):

Generator:

● Input: noise of dimension 200
● Fully connected layer | 512 units | ReLU activation
● Fully connected layer | 1024 units | ReLU activation
● Output: fully connected layer | 62500 units | sigmoid activation

Discriminator:

● Input: 62500 pixel values (a flattened 250x250 image)
● Fully connected layer | 1024 units | ReLU activation
● Fully connected layer | 512 units | ReLU activation
● Output: fully connected layer | 1 unit | sigmoid activation
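As an illustration, the generator above can be traced shape-by-shape in plain numpy. The weights here are randomly initialised placeholders (biases omitted for brevity), so the output is noise rather than a trained sample:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(a):
    return np.maximum(a, 0.0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dense(n_in, n_out):
    # Small random weights; float32 keeps the 1024x62500 matrix manageable.
    return 0.01 * rng.standard_normal((n_in, n_out), dtype=np.float32)

W1, W2, W3 = dense(200, 512), dense(512, 1024), dense(1024, 62500)

def generator(z):
    h = relu(z @ W1)        # fully connected, 512 units
    h = relu(h @ W2)        # fully connected, 1024 units
    return sigmoid(h @ W3)  # 62500 = 250 * 250 pixel intensities

z = rng.standard_normal((1, 200), dtype=np.float32)  # noise of dimension 200
image = generator(z).reshape(250, 250)               # one synthetic well image
```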
Results of Feed Forward GAN
• The architecture was unable to learn any features from the dataset.
• The model did not have enough parameters to learn features from the dataset.
• After 20 epochs of training, the generator had a loss of 4.36 and the discriminator had a loss of 1.74.
• For an optimal GAN, the generator loss should be 0.693 and the discriminator loss 1.386, the values obtained when the discriminator outputs 0.5 for every sample.
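These targets follow from the cross-entropy losses at the equilibrium point, where the discriminator outputs 0.5 for every sample; a quick check:

```python
import math

# At the GAN optimum the discriminator outputs D = 0.5 for every sample.
d = 0.5
generator_loss = -math.log(d)                        # -log D(G(z))
discriminator_loss = -math.log(d) - math.log(1 - d)  # -log D(x) - log(1 - D(G(z)))

print(round(generator_loss, 3))      # 0.693
print(round(discriminator_loss, 3))  # 1.386
```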

Sample images generated by the feed forward GAN


Second architecture experiment using Stack GAN

 Stack GAN is used to generate photo-realistic images of higher dimensions.

 The Stage-I GAN sketches the primitive shape and colors of the object based on the input image, yielding low-resolution Stage-I images.

 The Stage-II GAN takes the Stage-I results and the input image as inputs, and generates high-resolution images with better details. It is able to rectify defects in the Stage-I results and add compelling details through the refinement process.
Architecture of Stack GAN
Generator:

Input: 512 noise points -> fully connected (4x4x512 units) -> batch normalization -> ReLU activation -> reshape to [-1, 4, 4, 512] -> 2D upsampling -> convolution (256 feature maps, kernel size 5, tanh activation) -> 2D upsampling -> convolution (128 feature maps, kernel size 5, tanh activation) -> 2D upsampling -> output layer: convolution (3 feature maps, kernel size 5, tanh activation)

Discriminator:

Input: 32x32x3 images -> convolution (512 feature maps, kernel size 5) -> average pooling -> convolution (256 feature maps, kernel size 5) -> average pooling -> convolution (128 feature maps, kernel size 5) -> average pooling -> fully connected (1024 units, tanh activation) -> output layer: fully connected (2 units, softmax activation)
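Assuming 'same'-padded convolutions and 2x2 average pooling (the slide does not state these explicitly), the discriminator's spatial dimensions can be traced as follows:

```python
# Trace the discriminator's spatial size: 'same' convolutions keep it,
# each 2x2 average pooling halves it.
size, channels = 32, 3            # input: 32x32x3 images
for maps in (512, 256, 128):      # three conv + average-pooling stages
    channels = maps               # convolution changes the channel count
    size //= 2                    # average pooling halves each dimension
flat = size * size * channels     # features entering the dense layer
print(size, channels, flat)       # 4 128 2048
```

Under these assumptions the 1024-unit fully connected layer receives 2048 flattened features.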
Results of Stack GAN
• The input images were resized to 32x32.
• Though the generator was able to learn the difference between the background and the well, it was not able to learn the features that constitute the embryo itself.
• This is primarily because the dataset is very small. Training was done on the extracted 0 hpi embryo images.

Sample images generated by the Stack GAN


Third architecture: Boundary Equilibrium GAN (BEGAN)

● BEGAN is a technique for training autoencoder-based GANs.

● The loss is derived from the Wasserstein distance.

● The method balances the generator and discriminator during training and provides a new approximate convergence measure.

● The generator can look into the features learnt by the discriminator, so that it only learns the relevant features.
Architecture for embryo dataset
Generator:

Input noise points -> fully connected layer -> reshaping layer -> (convolutional layer -> convolutional layer -> upscaling layer) x3 -> output convolutional layer

Discriminator:

Encoder: input layer -> (convolutional layer) x6 -> max pooling layer -> reshaping layer -> fully connected layer

Decoder: fully connected layer -> reshaping layer -> (convolutional layer -> convolutional layer -> 2D upscaling) x3 -> output convolutional layer
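The balancing mechanism described above can be sketched as a bookkeeping step over the discriminator's autoencoder reconstruction losses. The gamma and lambda values below are typical hyperparameters from the BEGAN paper, not values confirmed by this report:

```python
import numpy as np

def began_step(loss_real, loss_fake, k, gamma=0.5, lam=0.001):
    """One BEGAN bookkeeping step, where loss_real = L(x) and
    loss_fake = L(G(z)) are autoencoder reconstruction losses."""
    d_loss = loss_real - k * loss_fake               # discriminator loss
    g_loss = loss_fake                               # generator loss
    balance = gamma * loss_real - loss_fake          # equilibrium term
    k = float(np.clip(k + lam * balance, 0.0, 1.0))  # control variable update
    m_global = loss_real + abs(balance)              # convergence measure
    return d_loss, g_loss, k, m_global

# Toy reconstruction losses for one batch; k starts at 0 as in the paper.
d_loss, g_loss, k, m = began_step(loss_real=0.8, loss_fake=0.3, k=0.0)
```

The convergence measure m_global is what makes BEGAN training progress easy to monitor: it decreases as the generator improves, independent of the adversarial see-saw.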
Results for dataset containing all embryos from all 5 days
INFERENCE
• After 342,000 steps, the generator suffered from mode collapse.

• Since the dataset contained empty wells, the generator learned that it could fool the discriminator simply by generating empty wells without any features.

• After this point, the iterations produced mostly empty wells, and the generator progressively forgot the features it had learnt.

• The best generator checkpoint was at step 193,000, with a generator loss of 0.1380 and a discriminator accuracy of 0.4788.
Further Pre-processing

 To avoid mode collapse caused by the empty embryo wells in the dataset, all empty wells were removed.

 Techniques such as feature matching and minibatch discrimination were implemented to avoid mode collapse.
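Feature matching replaces the generator's usual objective with matching the batch-mean intermediate discriminator features of real and generated data, which discourages collapsing onto a single output. A minimal sketch (the feature dimensions here are illustrative, not the project's actual layer sizes):

```python
import numpy as np

def feature_matching_loss(f_real, f_fake):
    """Squared distance between the batch-mean discriminator features
    of real and generated samples."""
    return float(np.sum((f_real.mean(axis=0) - f_fake.mean(axis=0)) ** 2))

rng = np.random.default_rng(1)
f_real = rng.normal(size=(64, 128))  # intermediate D features, real batch
f_fake = rng.normal(size=(64, 128))  # intermediate D features, fake batch
loss = feature_matching_loss(f_real, f_fake)
```

Minibatch discrimination attacks the same failure from the discriminator's side: statistics of pairwise similarity across the batch are appended to the discriminator's features so it can detect a generator that produces near-identical samples.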
Results after 18 hours - all classes
Results for 68 hpi class 5 embryos
Final conclusions made from all the performed
experiments
 The generated images look quite good. However, there are certain flaws:

1. The generator is only able to generate two kinds of images.
2. Zooming into the images reveals many white pixels scattered across them.
3. These artifacts may hinder the features learnt by the classification model and lead to poor results.

 The individual classes across all 5 days have very few images.

 The number of training images for the 68 hpi class-5 embryos was only 492.

 As the images show, even though the generator is able to learn from the discriminator, the discriminator itself cannot learn the training data due to insufficient images.
 After 105,000 steps, the discriminator reaches an accuracy of 0.487, which is close to the optimal value (0.5), but the loss stops decreasing. This supports the point above.

 Since we are concerned with day-wise extracted images, more timestamps during each day have to be taken into consideration for there to be sufficient data.
THANKS
