STATEMENT OF PURPOSE

I have been extensively involved in the modeling and simulation of physical systems since my M.Sc. in Physics. I
completed my M.Sc. at the Department of Physics, Government College University, Lahore, Pakistan, with
distinction, standing second in my university with a CGPA of 3.88/4.0. After my M.Sc., I was an active member of a
team developing a full-scope training simulation for the Chashma Power Plant, where my responsibility was to
simulate the I&C systems. The project gave me insight into real large-scale systems, in which a large number
of variables give rise to emergent behavior, and in which the relationships among the variables and their relative
significance to a given behavior are very difficult to ascertain. Such behavior is found everywhere in nature,
particularly in biological systems. During my MS in Computational Science and Engineering, I chose to model a
biological signaling pathway using Petri net theory for my MS research project. In this study, I
used Pathway Logic to identify cross talk within the insulin signaling pathway and Petri net tools to
determine the structural and dynamic properties of this pathway. I presented part of this study as a
research paper entitled "Identification of cross talk in insulin signaling pathway using Pathway Logic" at the IEEE
ICET-2013 conference. The paper is available online on IEEE Xplore.
My advanced knowledge of mathematics, my MS degree in computer science with a focus on modeling of biological
systems, and my practical experience in simulation provide the requisite skills for doctoral research. The field is in
its infancy in Pakistan; I therefore want to promote its use in solving real problems in
bioinformatics.
Biological systems are highly complex, and sophisticated approaches are required to model them. It is
becoming easier and cheaper to obtain immense amounts of information, in the form of variables, from a single
experiment. This is a result of advances in technology, which have generated massive amounts of data
in all fields of science, especially bioinformatics. For instance, with gene expression measurement and
next-generation sequencing, bioinformatics has witnessed a shift from more or less univariate approaches to
multivariate approaches, which are the natural choice for such data. The availability of a huge number of
variables for a given sample has raised the so-called dimensionality problem, which is typical of many
fields of science. More specifically, it is known as the large p, small n problem, i.e. many variables and few samples,
which results in multicollinearity and overfitting. Several approaches have been introduced to address this
issue, and partial least squares (PLS) is one of them. PLS has proven to be a very versatile method for
multivariate data analysis, and the number of its applications is steadily increasing. It is a supervised method
specifically developed to address the problem of making good predictions and data reductions in multivariate
problems. It has also been modified and implemented for classification problems in high-dimensional
biological data sets, but the classification performance is not satisfactory.
The challenge is to develop classification tools that can model biological systems with multivariate
data sets. Machine learning tools such as support vector machines (SVM) and Random Forest have proven
successful for small or moderate-sized data sets. For multivariate data sets, a solution may be
achieved through hybrid approaches based on PLS and machine learning classifiers such as SVM and
Random Forest. We can thus possibly combine the strengths of PLS (dimension reduction) and of
machine learning classification tools (superb classification performance) for the classification of
high-dimensional biological data sets.
Azmat Ali