Abstract
We hybridize the Particle Swarm Optimization (PSO) algorithm with the Fuzzy C-Means (FCM) and Learning Automata (LA)
algorithms for Software Cost Estimation (SCE). In this paper we test and evaluate the PSO-FCM and PSO-LA hybrid models on
software projects from the NASA dataset. The results show that the hybrid models reduce the Magnitude of Relative
Error (MRE) and the Mean Magnitude of Relative Error (MMRE) compared with the COCOMO model, and that they achieve a
higher Percentage of Relative Error Deviation (PRED).
Keywords: COCOMO Model, Fuzzy C-Means, Learning Automata, Particle Swarm Optimization, Software Cost Estimation
1. Introduction
the companies in better analyzing the feasibility and efficiently managing the projects [6,7]. The
main reasons projects fail to finish within cost and schedule are failure to use a management
methodology, poor quality of tool application, and imprecise estimation. SCE, which in software
project management covers planning for resources, costs, employees and time control, has become
a major concern for software companies [8,9].
Among SCE models, COCOMO I [10], COCOMO II [11], SLIM [12], and Function Point (FP) [13] belong to
the algorithmic models category. Relying on the COCOMO model, project managers cannot make
precise and reliable estimations of the time and cost required to finish a project. In algorithmic
models, estimation parameters are often obtained from experimental data of different previous
projects [14]. SCE algorithmic models work essentially on the basis of cost factors and scale
factors. The efficiency of these models also depends on the size of the projects, and variation in
project size results in numerous subsequent variations
A Novel PSO based Approach with Hybrid of Fuzzy C-Means and Learning Automata in Software Cost Estimation
2. Meta-Heuristic Algorithms
Meta-heuristic algorithms solve optimization problems through two factors: collective
cooperation and parallel search. The better an algorithm controls these two factors, the better
its ability to find solutions close to the optimum. Meta-heuristic algorithms based on collective
intelligence are tools for finding near-optimal solutions to optimization problems. In this
section we introduce the PSO algorithm, which is one of the most important population-based
algorithms.
Farhad Soleimanian Gharehchopogh, Laya Ebrahimi, Isa Maleki and Saman Joudati Gourabi
search space of possible solutions. In this space an evaluation criterion is defined, and the
quality of candidate solutions is assessed according to it. The change in the state of each
particle in a group is influenced by its own experience and its neighbors' knowledge, and the
search behavior of one particle affects the other particles in the group. This simple behavior
results in the formation of optimal areas in the search space. In the PSO algorithm, each particle
informs the other particles once it finds a good position; based on the values obtained for the
cost function, each particle decides with a certain probability whether to follow other particles,
and the search in the problem space proceeds on the basis of the particles' previous knowledge.
This keeps the particles from getting too close to one another and helps them solve continuous
optimization problems efficiently.
In the PSO algorithm, the individuals in the group are first created randomly in the problem
space, and the search for the optimal answer begins. In the overall structure of the search, each
individual follows the individual with the best fitness value while not forgetting its own
experience, that is, the position in which it has itself achieved its best fitness value.
Therefore, in each iteration of the algorithm each individual changes its next position according
to two values: the best position it has found so far (pbest) and the best position found by the
whole population so far, which is the best pbest in the population (gbest). Conceptually, pbest is
the memory of each individual. gbest is the general knowledge of the population, and when
individuals change their positions according to gbest they are trying to raise their own knowledge
to the level of the population's knowledge. Conceptually, the best particle of the group connects
every particle in the group to one another. The next position of each particle is determined using
Equations (1) and (2).
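$$v_{i+1} = w\,v_i + c_1 r_1 \,(\mathit{pbest}_i - x_i) + c_2 r_2 \,(\mathit{gbest}_i - x_i) \tag{1}$$
$$x_{i+1} = x_i + v_{i+1} \tag{2}$$
$$w = w_{max} - \frac{w_{max} - w_{min}}{i_{max}}\, i \tag{3}$$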
In Eq. (3), i_max is the maximum number of iterations of the algorithm and i is the current
iteration counter. The parameters w_max and w_min are the initial and final values of the inertia
weight over the run of the algorithm. The inertia weight changes linearly from 0.9 to 0.4 over the
run. High values of w favor global search and smaller values favor local search. To balance local
and global search, the inertia weight is reduced uniformly during the run. Therefore, as w
decreases the search becomes more local and concentrates around the optimal answer.
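The update rules in Equations (1) and (2), together with the linearly decreasing inertia weight, can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation used in the paper: the function names, the `sphere` test function, the zero initial velocities and the clamping of positions to the search bounds are all hypothetical choices. The defaults (50 particles, 100 iterations, c1 = c2 = 1.5, w from 0.9 to 0.4) follow the parameter settings reported in this paper.

```python
import random


def inertia_weight(i, i_max, w_max=0.9, w_min=0.4):
    """Inertia weight that decreases linearly from w_max to w_min."""
    return w_max - (w_max - w_min) * i / i_max


def pso(cost, dim, bounds, n_particles=50, i_max=100, c1=1.5, c2=1.5, seed=0):
    """Minimize `cost` over a box; returns (gbest, gbest_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions; zero initial velocities (an assumption).
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_cost = [cost(p) for p in x]
    g = min(range(n_particles), key=lambda k: pbest_cost[k])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for i in range(i_max):
        w = inertia_weight(i, i_max)
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (1): velocity update from inertia, pbest and gbest.
                v[k][d] = (w * v[k][d]
                           + c1 * r1 * (pbest[k][d] - x[k][d])
                           + c2 * r2 * (gbest[d] - x[k][d]))
                # Eq. (2): position update, clamped to the bounds (assumption).
                x[k][d] = min(hi, max(lo, x[k][d] + v[k][d]))
            c = cost(x[k])
            if c < pbest_cost[k]:
                pbest[k], pbest_cost[k] = x[k][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = x[k][:], c
    return gbest, gbest_cost


def sphere(p):
    """Hypothetical test cost function with its minimum at the origin."""
    return sum(t * t for t in p)


best, best_cost = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

In a cost-estimation setting, `cost` would presumably be an error measure over the training projects and each particle a vector of model parameters; that mapping is left open here.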
3. Proposed Models
Precise and reliable estimation for software projects is among the current challenges in software
engineering. Accurate cost estimation is a major factor in balancing manpower and time. Precise
estimation is a complicated process since the predicted results should be close to reality. SCE
algorithmic models work essentially on the basis of cost and scale factors, so the values of these
factors strongly affect the estimated cost and manpower. Most of these models also depend on the
project size, and variation in project size results in numerous variations in cost and time. Wrong
calculations and predictions of the cost factors also have major effects on the final result.
Determining cost factor values in software projects is very difficult and complicated. In the
hybrid models, Effort Multiplier (EM) factors [19] such as RELY, CPLX, STOR, TOOL and SCED, which
strongly affect estimation accuracy, are evaluated and tested.
The FCM clustering algorithm is the most widely used algorithm for grouping interrelated data
into clusters [22]. In the FCM algorithm, random points equal in number to the required clusters
are first selected as centers; the data are then assigned to these clusters according to
proximity, and the clusters are formed. By repeating this procedure, new centers are calculated in
each iteration as the mean of the data, and the data are then reassigned to the new clusters.
This procedure
$$J_m = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \,\lVert x_j - v_i \rVert^{2} \tag{4}$$
$$\sum_{i=1}^{c} u_{ij} = 1, \qquad j = 1, \ldots, n \tag{5}$$
In Eq. (4), c is the number of clusters, n is the number of data points, and m is the fuzziness
exponent, which can be any real number greater than 1. x_j is the jth data point, v_i is the center
of the ith cluster, and u_ij is the membership degree of the jth data point in the ith cluster. In
all fuzzy clustering algorithms, the number of clusters is defined first and initial values are
assigned to the clusters. Then these values are updated using Equations (6) and (7), and the
procedure is repeated until the change between successive iterations falls below a given threshold.
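$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m}\, x_j}{\sum_{j=1}^{n} u_{ij}^{m}} \tag{6}$$
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\lVert x_j - v_i \rVert}{\lVert x_j - v_k \rVert} \right)^{2/(m-1)}} \tag{7}$$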
Generally, FCM clustering is used to find structure in unlabeled data. The aim is to place the
data in different clusters such that an objective function is minimized. Therefore, by selecting
suitable fuzzy distance functions the data can be clustered optimally.
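The alternating updates of cluster centers and membership degrees in Equations (6) and (7) can be sketched in plain Python as below. This is a minimal illustration, not the authors' code: the function names, the small `eps` guard against division by zero, the stopping rule on the largest membership change, and the toy data set are all assumptions introduced for the example.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def update_centers(data, u, m):
    """Eq. (6): each center is the u_ij^m-weighted mean of the data."""
    dim = len(data[0])
    centers = []
    for row in u:  # one row of memberships per cluster i
        weights = [uij ** m for uij in row]
        total = sum(weights)
        centers.append([sum(w * x[d] for w, x in zip(weights, data)) / total
                        for d in range(dim)])
    return centers


def update_memberships(data, centers, m, eps=1e-12):
    """Eq. (7): membership of point j in cluster i from relative distances."""
    power = 2.0 / (m - 1.0)
    c, n = len(centers), len(data)
    u = [[0.0] * n for _ in range(c)]
    for j, x in enumerate(data):
        # eps keeps the ratio finite when a point coincides with a center.
        d = [max(dist(x, v), eps) for v in centers]
        for i in range(c):
            u[i][j] = 1.0 / sum((d[i] / d[k]) ** power for k in range(c))
    return u


def fcm(data, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Run FCM from randomly chosen data points as initial centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, c)]
    u = update_memberships(data, centers, m)
    for _ in range(max_iter):
        centers = update_centers(data, u, m)
        new_u = update_memberships(data, centers, m)
        change = max(abs(a - b)
                     for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if change < tol:  # stop once memberships no longer move (assumption)
            break
    return centers, u


# Hypothetical toy data with two visible groups.
data = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]]
centers, u = fcm(data, c=2)
```

By construction of Eq. (7), the memberships of every data point sum to one across clusters, which is the constraint stated in Eq. (5).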
In the hybrid PSO-FCM model, the minimum distance between clusters, the sum of distances within
each cluster, and the number of clusters are used as fitness parameters to improve the PSO
algorithm. Applying FCM gathers the data in the best cluster and gives the fitness function many
local optima. Each particle in the group holds a set of FCM rules. Over successive generations the
particles take different values that improve the fitness function. Ultimately the cluster with the
highest fitness of the particles is selected as the best set of rules to
$$p_j(n+1) = \begin{cases} (1-b)\,p_j(n) & \text{if } i = j \\ \dfrac{b}{r-1} + (1-b)\,p_j(n) & \text{if } i \neq j \end{cases} \tag{9}$$
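As a rough illustration of the penalty update in Eq. (9), the sketch below applies it to an action probability vector in Python. The function name `penalize` and the example values are hypothetical. The branch for the selected action (i = j) is assumed to be (1 - b) p_j(n), the standard linear penalty form, which is the choice that keeps the probabilities summing to one; r is the number of actions and b the penalty parameter.

```python
def penalize(p, i, b):
    """Penalty update of Eq. (9) after action i receives an unfavourable response.

    p: list of action probabilities, i: index of the selected action,
    b: penalty parameter in (0, 1).
    """
    r = len(p)  # number of actions
    return [
        # Selected action: assumed standard form (1 - b) * p_j(n).
        (1 - b) * pj if j == i
        # Other actions: b / (r - 1) + (1 - b) * p_j(n), as in Eq. (9).
        else b / (r - 1) + (1 - b) * pj
        for j, pj in enumerate(p)
    ]


# Hypothetical example: four equally likely actions, action 0 is penalized.
p = [0.25, 0.25, 0.25, 0.25]
p_next = penalize(p, i=0, b=0.1)
```

The probability of the penalized action drops, the freed mass is shared equally among the other actions, and the vector remains a valid probability distribution.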
$$MRE_i = \frac{\lvert Actual_i - Estimate_i \rvert}{Actual_i} \times 100 \tag{10}$$
$$MMRE = \frac{1}{N} \sum_{i=1}^{N} MRE_i \tag{11}$$
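The evaluation criteria can be computed as in the following Python sketch. The effort values are made up for illustration and are not taken from the NASA dataset. The `pred` function uses the conventional definition of PRED(x), the percentage of projects whose MRE does not exceed x, which is an assumption here because the definition is not reproduced in this text.

```python
def mre(actual, estimate):
    """Magnitude of Relative Error of one project, in percent (Eq. 10)."""
    return 100.0 * abs(actual - estimate) / actual


def mmre(actuals, estimates):
    """Mean Magnitude of Relative Error over N projects (Eq. 11)."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def pred(actuals, estimates, x):
    """Percentage of projects with MRE <= x (conventional PRED(x), assumed)."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return 100.0 * sum(1 for err in errors if err <= x) / len(errors)


# Hypothetical actual and estimated efforts for four projects.
actuals = [100.0, 200.0, 50.0, 80.0]
estimates = [90.0, 230.0, 50.0, 100.0]

print(mmre(actuals, estimates))      # 12.5
print(pred(actuals, estimates, 10))  # 50.0
```

Lower MRE and MMRE and higher PRED(x) indicate a more accurate model.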
(12)
Parameters        Value
Population Size   50
WMAX              0.9
WMIN              0.4
C1                1.5
C2                1.5
Iterations        100
1.25
1.05
1.05
In Table 3, we evaluate and compare 10 projects from the NASA dataset. As can be seen,
the PSO-LA hybrid model has a lower error rate than the intermediate COCOMO model.
The MRE comparison curves of the PSO-LA hybrid model and the intermediate COCOMO model on
10 projects from the NASA dataset are shown in Figure 8.
Figure 9 shows the MRE comparison between the PSO-FCM and PSO-LA hybrid models and the
intermediate COCOMO model on 10 projects from the NASA dataset. As can be seen, the PSO-FCM
hybrid model has a lower error rate than the PSO-LA hybrid model. The MRE of both the PSO-FCM
and PSO-LA hybrid models is also lower than that of the intermediate COCOMO model.
Table 4 shows the MMRE comparison between the hybrid models and the intermediate COCOMO
model for 60 projects from the NASA dataset. The results
Project No.   COCOMO   PSO-FCM (by No. Clusters)
11            27.21    2.20    3.42    1.86    2.15
15            9.36     11.20   8.56    13.15   7.91
19            3.8      5.23    6.12    7.32    3.5
21            22.51    10.78   13.98   11.2    9.61
25            39.54    12.85   10.84   16.87   9.62
30            31.56    12.79   17.15   15.48   9.31
35            28.46    12.76   8.65    11.35   2.76
39            23.76    15.12   12.34   5.23
44            24.50    32.41   26.14
55            3.64     8.16    9.56    6.31
Project No.   COCOMO   PSO-LA
11            27.21    4.56
15            9.36     11.23
19            3.8      6.72
21            22.51    8.69
25            39.54    13.25
30            31.56    17.61
35            28.46    7.41
39            23.76    16.35
44            24.50    12.97
55            3.64     8.54
2.72   21.48   17.45   4.56
Figure 7 shows the MRE comparison between the PSO-FCM hybrid model and the intermediate
COCOMO model on 10 projects from the NASA dataset. As can be seen, with clustering the PSO-FCM
hybrid model has a smaller error rate than the intermediate COCOMO model.
Models: COCOMO, PSO-FCM (by No. Clusters), PSO-LA
MMRE: 29.6, 25.36, 24.56, 24.22, 23.86, 26.32
[Figure: comparison of PRED(25), PRED(15), PRED(10) and PRED(5) for the models]
6. References
1. Gharehchopogh FS. Neural networks application in software cost estimation: a case study. 2011 International Symposium on Innovations in Intelligent Systems and Applications (INISTA 2011). 2011 15-18 Jun; Istanbul, Turkey: IEEE. p. 69-73.
2. Khalifelu ZA, Gharehchopogh FS. Comparison and evaluation data mining techniques with algorithmic models in software cost estimation. Procedia-Technology Journal. 2012; 1:65-71.
3. Khalifelu ZA, Gharehchopogh FS. A new approach in software cost estimation using regression based classifier. AWERProcedia Information Technology & Computer Science Journal. 2012; 2:252-56.
4. Khalifelu ZA, Gharehchopogh FS. A survey of data mining techniques in software cost estimation. AWERProcedia Information Technology & Computer Science Journal. 2012; 1:331-42.
5. Maleki I, Ghaffari A, Masdari M. A new approach for software cost estimation with hybrid genetic algorithm and ant colony optimization. International Journal of Innovation and Applied Studies. 2014; 5(1):72-81.
6. Li YF, Xie M, Goh TN. A study of project selection and feature weighting for analogy based software cost estimation. J Syst Software. 2009; 82(2):241-52.
7. Mittas N, Angelis L. Visual comparison of software cost estimation models by regression error characteristic analysis. J Syst Software. 2010; 83(4):621-37.
8. Jorgensen M, Shepperd M. A systematic review of software development cost estimation studies. IEEE Trans Software Eng. 2007; 33(1):40-53.
9. Anish M, Kamal P, Harish M. Software cost estimation using fuzzy logic. ACM SIGSOFT Software Engineering Notes. 2010; 35(1):115.
10. Boehm BW. Software Engineering Economics. Englewood Cliffs, New Jersey: Prentice-Hall; 1981.