
Parallel Max-Min Ant System Using MapReduce

Qing Tan1,2, Qing He1, and Zhongzhi Shi1


1 The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, 100190 Beijing, China
2 Graduate University of Chinese Academy of Sciences, 100049 Beijing, China
{tanq,heq,shizz}@ics.ict.ac.cn

Abstract. Ant colony optimization algorithms have been successfully applied to solve many problems. However, in some large-scale optimization problems involving large amounts of data, the optimization process may take hours or even days to reach an excellent solution. Developing parallel optimization algorithms is a common way to tackle this issue. In this paper, we present the MapReduce Max-Min Ant System (MRMMAS), an MMAS implementation based on the MapReduce parallel programming model. We describe MapReduce and show how MMAS can be naturally adapted to and expressed in this model, without explicitly addressing any of the details of parallelization. We evaluate MRMMAS on benchmark travelling salesman problems. The experimental results demonstrate that the proposed algorithm scales well and effectively reduces the running time compared with the traditional serial MMAS.

Keywords: Ant colony optimization, MMAS, Parallel MMAS, Travelling salesman problem, MapReduce, Hadoop.

1 Introduction

Max-Min Ant System (MMAS) [1] is an optimization algorithm inspired by the behavior of real ants. This evolutionary algorithm has become popular and has been found to be effective for solving NP-hard combinatorial problems such as the travelling salesman problem (TSP). However, as the number of cities grows, the algorithm often takes a very long time to obtain the optimal solution. Efficient parallel Ant Colony Optimization (ACO) [2] algorithms and implementation techniques are the key to meeting the scalability and performance requirements entailed in such cases. Several parallel implementations of the ACO algorithm already exist [3,4]. In the PACS [3] method, the artificial ants are first generated and separated into several groups; the ant colony system (ACS) is then applied to each group, and the groups communicate with each other at fixed cycles. The authors of [4] proposed two parallel strategies for the ant system: a synchronous parallel algorithm and a partially asynchronous parallel algorithm. All of the above methods require the programmer to design and implement the details of parallelization on different processors.


MapReduce [5] is a programming model and an associated implementation for the parallel processing of large datasets. Users only specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across a cluster of machines.
In this paper, we adapt the MMAS algorithm to the MapReduce framework and present MRMMAS, which makes the method applicable to large-scale problems. MRMMAS is simple, flexible, and scalable because it is designed in the MapReduce model. Since the TSP is the most typical application of MMAS, we present our MRMMAS method for solving the TSP and conduct comprehensive experiments to evaluate its performance on several TSP benchmark problems.
The rest of the paper is organized as follows. In Section 2, we present preliminary knowledge, including an overview of MapReduce and an introduction to the standard MMAS. Section 3 describes how MMAS can be cast in the MapReduce model and presents the map and reduce functions of MRMMAS in detail. Experimental results in Section 4 demonstrate that the proposed algorithm scales well on a computer cluster. Finally, we offer our conclusions in Section 5.

2 Preliminary Knowledge

2.1 MapReduce Overview

MapReduce, whose framework is shown in Fig. 1, is a simplified programming model that is well suited to parallel computation [6]. Under this model, programs are automatically distributed to a cluster of machines. In MapReduce, all data are organized in the form of keys with associated values. For example, in a program that counts the frequency of occurrences of different words, the key could be a word and its value would be the frequency of that word.
As its name suggests, map and reduce are the two basic stages of the model. In the first stage, the map function is called once for each input record. At each call, it may produce intermediate output records in the form of key-value pairs. In the second stage, these intermediate outputs are grouped by key, and the reduce function is called once for each key. Finally, the reduce function outputs the reduced results.
More specifically, the map function takes a single key-value pair and outputs a list of new key-value pairs. For each call, it may produce any number of intermediate key-value pairs. It can be formalized as:
Map: (Key1, Value1) → list((Key2, Value2))
In the second stage, these intermediate pairs are sorted and grouped by key, and the reduce function is called once for each key. The reduce function reads a key and a corresponding list of values and outputs a new list of values for that key. Mathematically, this would be written:
Reduce: (Key2, list(Value2)) → list(Value3)
The MapReduce model provides parallelism at a high level of abstraction. Since the map function only takes a single record, all map operations are independent of each other and fully parallelizable. Likewise, the reduce function can be executed in parallel on each set of intermediate pairs sharing the same key.
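As a concrete illustration of the two signatures above, the word-count example mentioned earlier could be written as the following minimal Python sketch (the function names wordcount_map and wordcount_reduce are ours and not tied to any particular MapReduce library):

def wordcount_map(key, value):
    # key: document name (ignored here), value: one line of text
    # emit an intermediate (word, 1) pair for every word in the line
    return [(word, 1) for word in value.split()]

def wordcount_reduce(key, values):
    # key: a word, values: the list of counts emitted for that word
    # output the total frequency of that word
    return sum(values)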


Fig. 1. Overview of the MapReduce execution framework

2.2 Max-Min Ant System (MMAS)

The Ant Colony Optimization (ACO) metaheuristic is a population-based approach inspired by the behavior of ant colonies in the real world. In ACO, solutions of the problem are constructed within a stochastic iterative process, by adding solution components to partial solutions. This process, together with the pheromone updating rule of ACO, makes the algorithm effective for solving combinatorial optimization problems.
Initially, each ant is randomly positioned on a starting node. Then each ant applies a state transition rule to incrementally build a solution. Once all the ants have built a complete solution, the solutions are evaluated and the pheromone updating rule is applied. The framework of the ACO algorithm can be represented as follows:
Procedure: ACO algorithm for static combinatorial problems
    Initialize parameters and pheromone trails;
    Loop /* at this level each loop is called an iteration */
        Put each ant in a random starting node;
        Loop /* construct solutions */
            Each ant applies a state transition rule to choose the next city to visit;
        Until all ants have built a complete solution
        Pheromone updating rule is applied;
    Until end condition is satisfied, usually a given iteration number is reached

Max-Min Ant System [1] is one of the best implementations of the ACO algorithm. It combines an improved exploitation of the best solutions with an effective mechanism for avoiding early search stagnation. It differs from the Ant System (AS) mainly in the following three aspects: (1) only one single ant adds pheromone after each iteration; (2) the range of possible pheromone trails on each solution component is limited to an interval [τ_min, τ_max]; (3) the initial pheromone trails are set to τ_max.
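For reference, the two rules referred to above can be written out explicitly; this is the standard MMAS formulation from [1], not something specific to our parallelization:

p_{ij} = \frac{[\tau_{ij}]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{l \in N_i} [\tau_{il}]^{\alpha}\,[\eta_{il}]^{\beta}},
\qquad
\tau_{ij} \leftarrow \min\left\{ \tau_{\max},\; \max\left\{ \tau_{\min},\; (1-\rho)\,\tau_{ij} + \Delta\tau_{ij}^{best} \right\} \right\},
\qquad
\Delta\tau_{ij}^{best} =
\begin{cases}
1 / L_{best} & \text{if edge } (i,j) \text{ belongs to the best tour} \\
0 & \text{otherwise}
\end{cases}

where N_i is the set of cities not yet visited by the ant currently at city i, α and β weight the pheromone and heuristic information, ρ is the evaporation rate, and L_best is the length of the best tour.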


3 MapReduce Based Parallel Max-Min Ant System

In this section, we present the main design of the MapReduce Max-Min Ant System (MRMMAS). First, we point out how MMAS can be naturally adapted to the MapReduce programming model and present the general idea of MRMMAS. Then we explain in detail how the computations can be formalized as map and reduce operations.
3.1 The Analysis of MMAS from Serial to Parallel

The whole procedure of MMAS is an iterative process. In every round of the iteration, the ant colony constructs feasible solutions through two rules: the state transition rule and the pheromone updating rule. In MMAS, the pheromone updating rule is applied only after all ants have built a complete solution. In other words, the pheromone level remains constant during the process of solution construction.
In MMAS, the most computationally intensive part is solution construction. In each iteration, every ant requires a large number of computations to decide which city to visit next from its current city. Fortunately, the pheromone updating rule in MMAS does not require communication among the ants within the same iteration; information is delivered to the ants of the following iterations only through the updated pheromone. It is therefore obvious that the computation of constructing a solution for one ant is independent of the construction of another ant in the same iteration, so the solution construction process can be executed in parallel. After this phase, all the constructed solutions are collected and the pheromone updating rule is carried out. The updated pheromone levels are then sent to each ant and used in the following iteration.
3.2 MMAS Based on MapReduce

In an iteration of MMAS, each ant in the swarm is placed at a starting node, chooses the next city to visit step by step, and evaluates its solution. All of these actions are completed independently of the rest of the swarm. Following the analysis above, the MRMMAS algorithm needs only one kind of MapReduce job. The map function performs the procedure of constructing a solution for one ant, and thus the map stage realizes the solution construction for all the ants in parallel. Then, the reduce function performs the procedure of updating the pheromone. For each round of the iteration, one such job is carried out to implement the whole process of MMAS. The procedure of MRMMAS is shown in the following.
Procedure: MapReduce MMAS for static combinatorial problems
    Initialize parameters and pheromone trails;
    Loop /* for each iteration, a MapReduce job is carried out */
        /* Map stage: a map function realizes the behavior of one ant */
            The ant is randomly put in a starting node;
            The ant applies a state transition rule to choose the next city to visit
                until a complete solution has been built; /* solution construction */
            Calculate the fitness of the solution; /* solution evaluation */
        /* Reduce stage */
            Pheromone updating rule is realized by a reduce function;
    Until end condition is satisfied, usually a given iteration number is reached

Map Function: First, the pheromone values, the heuristic information, and all of the parameters used in the state transition rule are transmitted to the map function from the main function of the MapReduce job. The MRMMAS map function, shown as Function 1, is called once for each ant in the population. The input dataset is stored on HDFS as a sequence file of <key, value> pairs, each of which represents a record in the dataset. The number of records is set to the size of the ant population, so the map function is carried out m times, where m is the size of the ant swarm. The dataset is split and globally broadcast to all mappers. Consequently, the process of solution construction for the ants is executed in parallel. In each map task, one ant constructs one solution according to the state transition rule. Then, the solution is evaluated and expressed as an output <key, value> pair.
Function 1: MRMMAS Map
def mapper(key, value):
    /* get τ[n][n], η[n][n], α, β from the MapReduce job */
    /* initialize tabuList */
    for i=1 to n do
        tabuList[i] = false;
    /* randomly put the ant in a starting node */
    currentPosition = randomInit(n);
    solution[1] = currentPosition;
    tabuList[currentPosition] = true;
    /* construct the solution through the state transition rule */
    for i=2 to n do
        /* calculate the visiting probability of each city */
        for j=1 to n do
            if (tabuList[j] == false) { product[j] = (τ[currentPosition][j])^α * (η[currentPosition][j])^β; }
            else { product[j] = 0; }
        /* randomly select a city to visit according to the probabilities */
        currentPosition = randomSelect(product);
        solution[i] = currentPosition;
        tabuList[currentPosition] = true;
    /* solution evaluation */
    fitness = Fit(solution);
    /* output the solution in a <key, value> pair */
    key = "Solution"; /* a string "Solution" */
    Take fitness+solution as value;
    output <key, value> pair;
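The helper randomSelect above is not spelled out in the paper; one possible implementation, assuming product[] holds the unnormalized visiting weights, is the following roulette-wheel selection sketch in Python:

import random

def random_select(product):
    # Roulette-wheel selection: city j is chosen with probability
    # product[j] / sum(product); cities already visited have weight 0
    # and can never be selected.
    total = sum(product)
    r = random.uniform(0, total)
    cumulative = 0.0
    for j, weight in enumerate(product):
        if weight == 0:
            continue
        cumulative += weight
        if r <= cumulative:
            return j
    # guard against floating-point round-off: fall back to the heaviest city
    return max(range(len(product)), key=lambda j: product[j])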

In the above procedure, the constructed solution and its fitness are output as a <key, value> pair. All of the mappers share the same key, so all of the solutions will be gathered together in the reduce step, and the information of the different solutions is carried in the different values. Suppose a TSP solution is [1-2-3-4-5-6-7-1] and its path length is 123.45; then the value is the string "123.45+1,2,3,4,5,6,7".
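As an illustration of this encoding, the following short Python sketch shows one possible way to build and parse such a value string; the helper names encode_value and decode_value are ours, and the decoder plays the role of the getFitness and getSolution helpers used in the reduce function below:

def encode_value(fitness, solution):
    # e.g. encode_value(123.45, [1, 2, 3, 4, 5, 6, 7]) returns "123.45+1,2,3,4,5,6,7"
    return str(fitness) + "+" + ",".join(str(city) for city in solution)

def decode_value(value):
    # split the value string back into the tour length and the list of cities
    fitness_part, tour_part = value.split("+")
    return float(fitness_part), [int(city) for city in tour_part.split(",")]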
Reduce Function: The input of the reduce function is the set of intermediate <key, value> pairs obtained from the map functions on each host. As described above, each pair includes a solution and its fitness. In the reduce function, we gather all the solutions constructed in the map step and obtain the best solution of the iteration as well as the best solution found since the beginning. Then we update the pheromone according to the pheromone updating rule of MMAS. These results are output as a <key, value> pair and are transmitted to all the mappers in the following iteration. The pseudo code of the MRMMAS reduce function is shown as Function 2.
Function 2: MRMMAS Reduce
def reducer(key, value_list):
    /* get τ[n][n], ρ and the global best solution gBest from the MapReduce job */
    /* of all of the solutions, find the best record in the current iteration */
    for value in value_list:
        fitness = getFitness(value); solution = getSolution(value);
        if (iBest is null) or (fitness > iBest)
            iBest = fitness; iBestSolution = solution;
    /* update the global best solution */
    if (iBest > gBest) { gBest = iBest; gBestSolution = iBestSolution; }
    /* pheromone evaporation */
    τ = (1 - ρ) * τ;
    /* pheromone deposit along the iteration-best solution */
    for all edges (i,j) in iBestSolution
        τ[i][j] = τ[i][j] + Δτ^best;
    /* keep the pheromone within [τ_min, τ_max] */
    for all edges (i,j)
        if (τ[i][j] > τ_max) { τ[i][j] = τ_max; }
        if (τ[i][j] < τ_min) { τ[i][j] = τ_min; }
    /* output the results in a <key, value> pair */
    Take gBest+gBestSolution as key;
    Take τ[n][n] as value;
    output <key, value> pair;
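To show how the two functions cooperate within one iteration, the following is a minimal local Python sketch of a single MRMMAS round. It only simulates the MapReduce data flow on one machine (a multiprocessing pool stands in for the cluster of mappers), mrmmas_map and mrmmas_reduce are placeholders for implementations of Function 1 and Function 2, and each input record is simply an ant index; on Hadoop the same structure is expressed as one MapReduce job per iteration.

from multiprocessing import Pool

def run_iteration(mrmmas_map, mrmmas_reduce, num_ants, num_workers=4):
    # Map stage: each ant builds and evaluates a tour independently,
    # so the m mapper calls can run in parallel (here on worker processes).
    with Pool(num_workers) as pool:
        pairs = pool.map(mrmmas_map, range(num_ants))
    # Shuffle stage: group the intermediate <key, value> pairs by key.
    # In MRMMAS every mapper emits the same key, so a single reduce call
    # receives all of the constructed solutions.
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    # Reduce stage: pick the iteration-best tour and update the pheromone.
    return {key: mrmmas_reduce(key, values) for key, values in grouped.items()}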


4 Experiments

In this section, we evaluate the performance of MRMMAS. Experiments were run on a cluster of computers, each of which has two 2.8 GHz cores and 4 GB of memory. Several generally available and typical TSP data sets were used as the test material.
Since MRMMAS performs the same calculations as a serial implementation of MMAS, MRMMAS and serial MMAS achieve the same level of accuracy under the same parameter settings. We compared the quality of the solutions and thus verified the correctness of MRMMAS. Therefore, the following experiments mainly focus on the efficiency of MRMMAS. We examine the average execution time per iteration because it shows whether the parallel implementation is an improvement. The first iteration of each run was excluded from the averages because it often ran slightly faster or slower than the remaining iterations due to initialization.
Figure 2 shows the average execution time of MRMMAS on the data set kroA100. The numbers reported are averaged over five runs of MRMMAS, and each run performs 50 iterations of MMAS. From the results, we can see that the running time is effectively reduced as the number of processors grows.
We use speedup as a measure of scalability. Speedup is defined as the ratio of the serial runtime for solving a problem to the time taken by the parallel algorithm to solve the same problem on p processing elements [7]. Thus, the speedup with p processors is S_p = t_1 / t_p. To measure the speedup, we increase the number of computers in the system. A perfect parallel algorithm demonstrates linear speedup: a system with p times the number of computers yields a speedup of p. However, linear speedup is difficult to achieve because of the communication cost among the computers in the cluster.
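For instance, with purely illustrative numbers, a run that takes t_1 = 120 seconds on one processor and t_4 = 40 seconds on four processors has a speedup of S_4 = 120/40 = 3, below the linear value of 4.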
Figure 3 shows the speedup performance of MRMMAS on different test sets. From the results, we can see that MRMMAS scales well up to 32 processors. However, the improvement gradually diminishes as the number of processors grows, because the implementation and communication overhead hinders further improvement. Moreover, the speedup on large-scale TSP instances is better than that on smaller instances, because computation accounts for a higher proportion of the total time.

Fig. 2. Execution times per iteration for MRMMAS on kroA100

Parallel Max-Min Ant System Using MapReduce

189

Fig. 3. Speedup for MRMMAS on different test data sets

5 Conclusions

Although the ACO algorithm has been successfully applied to solve many problems, its long running time is always an issue when dealing with large-scale problems. This paper presents a parallel MMAS algorithm based on MapReduce, a programming model that has been widely embraced by both academia and industry. In our implementation, the process of solution construction is carried out on different processors, and the MapReduce system balances the load dynamically and automatically. We have shown that MMAS can be naturally adapted to the MapReduce programming model, and the experimental results show that it scales well on a computer cluster.
Acknowledgments. Supported by the National Natural Science Foundation of China (No. 60933004, 60975039, 61175052, 61035003, 61072085) and the National High-tech R&D Program of China (863 Program) (No. 2012AA011003).

References
1. Stützle, T., Hoos, H.H.: MAX-MIN Ant System. Future Generation Computer Systems 16(8), 889–914 (2000)
2. Dorigo, M., Stützle, T.: Ant Colony Optimization. The MIT Press, Cambridge, MA (2004)
3. Chu, S.-C., Roddick, J.F., Pan, J.-S., Su, C.-J.: Parallel Ant Colony Systems. In: Zhong, N., Raś, Z.W., Tsumoto, S., Suzuki, E. (eds.) ISMIS 2003. LNCS (LNAI), vol. 2871, pp. 279–284. Springer, Heidelberg (2003)
4. Bullnheimer, B., Kotsis, G., Strauss, C.: Parallel Strategies for the Ant System. University of Vienna, Vienna (1997)
5. Dean, J., Ghemawat, S.: MapReduce: Simplified Data Processing on Large Clusters. Communications of the ACM 51(1), 107–113 (2008)
6. Lämmel, R.: Google's MapReduce Programming Model - Revisited. Science of Computer Programming 70(1), 1–30 (2008)
7. Grama, A., Gupta, A., Karypis, G., Kumar, V.: Introduction to Parallel Computing, 2nd edn. Addison-Wesley, Harlow (2003)
