
MASTER OF TECHNOLOGY

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

PROJECT SYNOPSIS

CPU TASK SCHEDULING USING GENETIC ALGORITHM

SUBMITTED BY

Abhineet Kaur
University Roll Number: 1302871

SUPERVISED BY

Dr. Baljit Singh Khehra
Associate Professor, C.S.E.

BABA BANDA SINGH BAHADUR ENGINEERING COLLEGE,
FATEHGARH SAHIB, PUNJAB

1. Introduction
The allocation of jobs to the CPU directly affects the performance of the system. Applications
differ in priority, their processing times are often unknown in advance, and in a multitasking
environment the preemption of tasks must also be taken into consideration. Because scheduling
depends on so many factors, it falls into the category of NP-hard problems. The scheduling and
mapping of a precedence-constrained task graph onto processors plays a vital role in both
parallel and distributed environments. This study is mainly concerned with the challenge of
scheduling programs that are represented as DAGs (directed acyclic graphs), that is, collections
of computational tasks and their precedence relations, onto a parallel processing system. The
main job of the task scheduler is to schedule the tasks in such a manner that the precedence
relations are maintained and the overall execution length is minimized. Many evolutionary
approaches have been developed for this problem [1].
1.1 Genetic Algorithm
A Genetic Algorithm (GA) uses the principles of natural genetics and natural selection to
construct search and optimization procedures. Genetic algorithms are good at navigating large,
potentially huge, search spaces, looking for optimal combinations and solutions that might not
otherwise be found in a lifetime.
Genetic algorithms are optimization algorithms that maximize or minimize a given function
[2]. The selection operator deserves a special position in a genetic algorithm, since it is the
one that mainly determines the evolutionary search space. It is used to improve the chances of
survival of the fittest individuals. There are many traditional selection mechanisms, as well as
user-specified selection mechanisms specific to the problem definition.
The three most important aspects of using a GA are:
1. Definition of objective function
2. Definition and implementation of genetic representation
3. Definition and implementation of genetic operators
Genetic Operators:
1. Selection: - This operator selects chromosomes in the population for reproduction.
Reproduction is the first operator applied to the population. Chromosomes are selected from
the population to be parents for cross over and to produce offspring. Various methods of
selecting chromosomes as parents for cross over are:
a. Roulette Wheel Selection
b. Boltzmann Selection
c. Tournament Selection
d. Rank Selection
e. Steady State Selection

The idea in all of them is that above-average strings are picked from the current population
and multiple copies of them are inserted in the mating pool in a probabilistic manner.

2. Cross Over: - The cross over operator is applied to the mating pool in the hope that it will
create a better string. It is a recombination operator that proceeds in three steps: first,
the reproduction operator selects at random a pair of individual strings for mating; then
a cross site is selected at random along the string length; finally, the position values
following the cross site are swapped between the two strings. Various types are:
i. Single site cross over
ii. Two point cross over
iii. Multipoint cross over
iv. Uniform cross over
v. Matrix cross over
3. Mutation: - After cross over, strings are subjected to mutation. Mutation of a bit involves
flipping it, changing 0 to 1 or 1 to 0, with a small mutation probability Pm. The classic
example of a mutation operator involves a probability that an arbitrary bit in a genetic
sequence will be changed from its original state. A common method of implementing the
mutation operator involves generating a random variable for each bit in a sequence. This
random variable determines whether or not a particular bit will be modified. This mutation
procedure, based on biological point mutation, is called single point mutation. Other
types are inversion and floating point mutation.
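
The three operators described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming chromosomes are lists of bits and that fitness values are non-negative numbers to be maximized (for a scheduling problem one might, for example, derive fitness from the inverse of the schedule length). The function names roulette_select, single_point_crossover and bit_flip_mutation are hypothetical and chosen for this example only.

```python
import random


def roulette_select(population, fitness, rng):
    """Roulette wheel selection: pick one chromosome with probability
    proportional to its (non-negative) fitness."""
    total = sum(fitness)
    spin = rng.random() * total  # point on the wheel, in [0, total)
    running = 0.0
    for chromosome, fit in zip(population, fitness):
        running += fit
        if spin < running:
            return chromosome
    return population[-1]  # guard against floating-point round-off


def single_point_crossover(a, b, rng):
    """Single site cross over: choose a random cross site and swap the
    tails of two equal-length strings (length must be at least 2)."""
    site = rng.randint(1, len(a) - 1)
    return a[:site] + b[site:], b[:site] + a[site:]


def bit_flip_mutation(chromosome, pm, rng):
    """Single point mutation: flip each bit independently with probability pm."""
    return [1 - bit if rng.random() < pm else bit for bit in chromosome]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    population = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1], [1, 0, 1, 0, 1, 0]]
    fitness = [1.0, 3.0, 2.0]  # illustrative values, not measured results
    parent_a = roulette_select(population, fitness, rng)
    parent_b = roulette_select(population, fitness, rng)
    child_a, child_b = single_point_crossover(parent_a, parent_b, rng)
    print(bit_flip_mutation(child_a, 0.05, rng))
    print(bit_flip_mutation(child_b, 0.05, rng))
```

Passing the random generator explicitly keeps each operator easy to test in isolation. The other selection and cross over variants listed above would replace only the corresponding function.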
Two hybrid genetic algorithms are to be taken into consideration: the Critical Path Genetic
Algorithm (CPGA) and the Task Duplication Genetic Algorithm (TDGA) [3]. The first
algorithm, CPGA, is based on using the idle time of the processors efficiently and
rescheduling the critical path nodes to reduce their start time. The second algorithm, TDGA,
is based on the task duplication principle to minimize the communication overheads.

2. Literature Survey
Ahmad and Kwong [4] confirm that using task duplication in scheduling can be useful,
especially when the communication-to-computation ratio (CCR) of a parallel algorithm on a
given system is high. This is usually the case in distributed systems such as clusters of
workstations. Both the Duplication Scheduling Heuristic (DSH) and the Bottom-up Top-down
Duplication Heuristic (BTDH) algorithms produce good solutions, with the latter
outperforming the former when the CCR is very high. However, the basic principle in both
algorithms is essentially the same: to duplicate a parent task if doing so improves the start
time of a node. Their proposed Critical Path Fast Duplication (CPFD) algorithm uses a new
technique that tries to start every task at the earliest possible time from the beginning of
the scheduling process. CPFD outperforms both of these algorithms without performing worse in
any of the 490 test cases. Moreover, it consistently performs better at low as well as high
values of CCR.
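
As a small illustration of the CCR mentioned above, the sketch below uses a commonly used definition: the average communication cost of the edges divided by the average computation cost of the nodes. The survey text itself does not spell out the formula, so this definition, the function name ccr and the dictionary-based graph representation are assumptions made for the example.

```python
def ccr(computation_cost, communication_cost):
    """Communication-to-computation ratio of a task graph.

    computation_cost:   {task: cost of executing the task}
    communication_cost: {(parent, child): cost of the edge}
    Assumed definition: average edge cost divided by average node cost.
    """
    average_communication = sum(communication_cost.values()) / len(communication_cost)
    average_computation = sum(computation_cost.values()) / len(computation_cost)
    return average_communication / average_computation


if __name__ == "__main__":
    # Hypothetical two-task graph: t1 -> t2
    nodes = {"t1": 2, "t2": 4}
    edges = {("t1", "t2"): 6}
    print(ccr(nodes, edges))
```

A value well above 1 indicates a communication-heavy graph, which is the situation in which duplicating a parent task is reported to help.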

Jin et al. [5] compared nine scheduling algorithms for multiprocessor task scheduling with
communication delays. The Duplication Scheduling Heuristic (DSH) provided short scheduling
times and schedules with the shortest makespan, at the additional expense incurred by
duplicating tasks on multiple processors. They concluded that, from a purely performance
point of view, DSH (a one-shot heuristic algorithm) is the best solution, but its deployment
needs to be subject to a careful cost-benefit analysis. One-shot heuristic algorithms without
task duplication can provide adequate performance and fast scheduling time; the Insertion
Scheduling Heuristic (ISH) proved to be the best of this group. The next group, scheduling
algorithms based on iterative search such as genetic algorithms, requires an order of
magnitude longer computation time but (with the exception of simulated annealing) yielded
better solutions with a shorter makespan than the one-shot heuristic algorithms without task
duplication. In this group, the best solutions were obtained by genetic algorithms and tabu
search. They conclude that the use of these algorithms is justified whenever the scheduling
can be done off-line, there is a need for repeated execution of the schedules, or the
makespan of the application is significantly longer than the scheduling time.

3. Problem Formulation
The model of the parallel system considered in this work can be described as follows [6].
The system consists of a limited number of fully connected homogeneous processors. Let a task
graph G be a Directed Acyclic Graph (DAG) composed of N nodes n1, n2, n3, ..., nN. Each node
is termed a task of the graph, which in turn is a set of instructions that must be executed
sequentially without preemption on the same processor. A node has one or more inputs. When
all inputs are available, the node is triggered to execute. A node with no parent is called
an entry node and a node with no child is called an exit node. The computation cost of a node
ni is denoted by weight(ni). The graph also has E directed edges representing a partial order
among the tasks. The partial order introduces a precedence-constrained DAG and implies that
if ni → nj, then nj is a child, which cannot start until its parent ni finishes. The weight
on an edge is called the communication cost of the edge and is denoted by c(ni, nj). This
cost is incurred if ni and nj are scheduled on different processors and is considered to be
zero if ni and nj are scheduled on the same processor. If a node ni is scheduled on processor
P, the start time and finish time of the node are denoted by ST(ni, P) and FT(ni, P)
respectively. After all nodes have been scheduled, the schedule length is defined as
max{FT(ni, P)} across all processors.
The objectives of the task scheduling problem are:
1. To find an assignment of the tasks to processors, together with their start times, such
that the schedule length is minimized, and
2. To preserve the precedence constraints.
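
The model above can be turned into a small evaluation routine that computes FT(ni, P) for every task and the resulting schedule length. This is a minimal sketch, assuming the schedule is given as a task order that already respects the precedence relations plus a task-to-processor mapping. The function name schedule_length, the dictionary layout and the example graph are hypothetical and do not come from the cited works.

```python
def schedule_length(weight, comm, order, processor_of):
    """Evaluate a non-preemptive schedule on homogeneous processors.

    weight:       {task: computation cost}
    comm:         {(parent, child): communication cost}
    order:        tasks listed so that every parent precedes its children
    processor_of: {task: processor the task is assigned to}
    Returns (schedule length, {task: finish time}).
    """
    parents = {task: [] for task in weight}
    for parent, child in comm:
        parents[child].append(parent)

    processor_free = {}  # time at which each processor becomes idle
    finish = {}          # FT of every task scheduled so far
    for task in order:
        proc = processor_of[task]
        ready = 0
        for parent in parents[task]:
            if parent not in finish:
                raise ValueError(f"{task} is ordered before its parent {parent}")
            # Communication cost is zero when both tasks share a processor.
            delay = 0 if processor_of[parent] == proc else comm[(parent, task)]
            ready = max(ready, finish[parent] + delay)
        start = max(ready, processor_free.get(proc, 0))  # ST of the task
        finish[task] = start + weight[task]
        processor_free[proc] = finish[task]
    return max(finish.values()), finish


if __name__ == "__main__":
    # Hypothetical diamond-shaped task graph used only for illustration.
    weight = {"a": 2, "b": 3, "c": 4, "d": 1}
    comm = {("a", "b"): 1, ("a", "c"): 5, ("b", "d"): 2, ("c", "d"): 1}
    order = ["a", "b", "c", "d"]
    two_processors = {"a": "P0", "b": "P1", "c": "P0", "d": "P0"}
    one_processor = {task: "P0" for task in weight}
    print(schedule_length(weight, comm, order, two_processors)[0])
    print(schedule_length(weight, comm, order, one_processor)[0])
```

In a genetic algorithm, a routine of this kind could serve as the objective function: each chromosome encodes an order and an assignment, and shorter schedule lengths correspond to fitter individuals.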

A Critical Path (CP) of a task graph is defined as the path with the maximum sum of node
and edge weights from an entry node to an exit node. The nodes on the CP are called CP Nodes
(CPNs). An example of a DAG is represented in Fig. 1, where the CP is drawn in bold.

Figure 1. A DAG in which t1, t7 and t9 are the CP nodes
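
The definition of the critical path can be illustrated with a short longest-path computation over the DAG. This is a sketch under stated assumptions: the graph is stored in dictionaries, the function name critical_path is hypothetical, and the example graph is a made-up diamond rather than the graph of Fig. 1.

```python
from collections import deque


def critical_path(weight, comm):
    """Return (CP nodes, CP length) for a task DAG.

    weight: {task: computation cost}
    comm:   {(parent, child): communication cost}
    The CP length is the largest sum of node and edge weights on any
    path from an entry node to an exit node.
    """
    children = {task: [] for task in weight}
    indegree = {task: 0 for task in weight}
    for parent, child in comm:
        children[parent].append(child)
        indegree[child] += 1
    entries = [task for task in weight if indegree[task] == 0]

    # Topological order (Kahn's algorithm).
    queue = deque(entries)
    order = []
    while queue:
        task = queue.popleft()
        order.append(task)
        for child in children[task]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Longest path from each task down to an exit node, children first.
    longest = {}
    next_on_path = {}
    for task in reversed(order):
        longest[task] = weight[task]
        next_on_path[task] = None
        for child in children[task]:
            candidate = weight[task] + comm[(task, child)] + longest[child]
            if candidate > longest[task]:
                longest[task] = candidate
                next_on_path[task] = child

    node = max(entries, key=longest.get)
    length = longest[node]
    path = []
    while node is not None:
        path.append(node)
        node = next_on_path[node]
    return path, length


if __name__ == "__main__":
    weight = {"a": 2, "b": 3, "c": 4, "d": 1}
    comm = {("a", "b"): 1, ("a", "c"): 5, ("b", "d"): 2, ("c", "d"): 1}
    print(critical_path(weight, comm))
```

Because the edge weights are included, the CP length is an upper estimate that assumes every communication cost on the path is actually paid; placing consecutive CP nodes on the same processor removes those costs.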

4. Significance of the Study
The objective of this study is to find an approach for optimized scheduling. The quality of a
task schedule can be checked by evaluating factors such as throughput, finishing time, and
processor utilization. The study aims to:
1. Use the idle time of the processors efficiently
2. Balance the load among processors
3. Reduce communication delays
4. Minimize the overall execution time

5. References
[1] Hou E. S. H., Ansari N., Hong R. 1994. A genetic algorithm for multiprocessor scheduling.
IEEE Transactions on Parallel and Distributed Systems, Volume 5, Issue 2: 113-20.
[2] Sivraj R., Ravivhandran T. 2011. A review of selection methods in genetic algorithm.
Volume 3: 3972-97.
[3] Fatma A. Omara, Mona M. Arafa. 2010. J. Parallel Distrib. Comput.: 13-22.
[4] Ahmad I., Kwong Y. August 1994. A new approach to scheduling parallel programs
using task duplication. In: Proceedings of the 23rd International Conf. on Parallel Processing.
[5] Shiyuan Jin, Guy Schiavone, Damla Turgut. 2007. A performance study of multiprocessor
task scheduling algorithms. Springer: 77-97.
[6] Y. Kwok, I. Ahmad. 1999. Static scheduling algorithms for allocating directed task
graphs to multiprocessors. ACM Comput. Surv., Volume 31: 406-471.
