Paper submitted 8 September 1994; accepted for publication 4 July 1995; discussion open
until September 1996
INTRODUCTION
For a building company (contractor), deciding whether to bid for a project is a
strategic decision which relies heavily on producing a quick cost forecast. Because
detailed estimating of a project's cost is time-consuming and expensive, the
decision has to be based on a rough forecast of the costs involved. To make such a
forecast, cost estimators must be able to associate partial project information,
including part of the design information and market factors, with previous cost
situations and data, without going through a deep reasoning and logical sequence
(McCaffer 1976). The mechanism that relates the present project to previous ones
is unstructured and not well understood. However, there is evidence that
experienced estimators are more confident in selecting information for estimating
and more consistent in associating historical cost data with a new project
(Morrison & Stevens 1981).
A number of models have been developed to analyse cost data in order to
improve cost estimating accuracy. Many of them aim at statistically generalizing
1996 Blackwell Science Ltd
mathematical associations from past cost data (Morrison & Stevens 1981; Beeston
1983), and then applying the associations to assist future cases. Despite the
simplicity and clarity of this approach, many professionals tend to doubt its
usefulness. One reason is perhaps that many factors are not representable in the
cost data, so that predictions based on the mathematical associations may be
unacceptably biased towards a narrow range of factors. The second reason is that
the values of variables relating to project costs may not lend themselves to curve
fitting, since the mathematical representations may not fit the quantitative
distribution well (Beeston 1983). A more subtle and seldom quoted reason is that
many estimators do not themselves fully appreciate how their results are arrived
at, and may tend to reject any structured approach to bidding (Woodward 1975).
There have also been research efforts (Ahmad & Minkarah 1988; Tavakoli &
Utomo 1989) in attempting to capture the essence of the estimating ability of
human estimators to form expert systems (ESs). Unfortunately, the creation of a
knowledge base is a long and expensive process. In addition, it is rarely clear how
cost estimating expertise can be articulated, which makes it extremely difficult to
acquire knowledge directly from cost estimators to form a knowledge base.
Artificial neural networks (ANNs) offer an alternative source of assistance to
cost estimators with distinct features. Neural networks are simple mathematical
models that shadow the microstructure of the brain (Rumelhart & McClelland
1986a). They exhibit properties that can be considered essential features of
human cognition: their abilities to self-organize knowledge from training data;
their abilities to deal with incomplete information; their abilities to cope with
complex relations; and even their fallibility. All these suggest that neural networks
exhibit a propensity to behave in a way parallel with the nature of cost estimating.
There have been several applications of neural networks to support construction
decision-making processes. One example is the study by Moselhi & Hegazy (1991),
which identifies the possibility of applying ANNs in construction cost estimation.
Another example is an analogy-based methodology (Hegazy & Moselhi 1994) which
combines neural networks with genetic algorithms to model markup estimation.
Other research includes applications of neural networks in modular construction
decision making (Murtaza & Fisher 1994).
This paper presents an experiment in using neural networks to analyse and
model cost estimation. Multi-layered neural networks are designed and implemented
to perform cost estimating at the bidding stage. The paper compares the
estimating performance of neural networks with that of conventional cost
estimation techniques; to facilitate the comparison, a regression-based method is
employed to provide estimating results. Moreover, the paper compares the
estimating performances of neural networks with different numbers of hidden
layers and different numbers of nodes on the hidden layers, in order to glean the
effects of layer/node number on estimating accuracy. In summary, this paper aims
to answer the following research questions:
1996 Blackwell Science Ltd, Engineering, Construction and Architectural Management 3 | 1, 2, 69-81
1 Will the neural network perform better than traditional methods, such as
regression-based methods?
2 Will the neural network with more hidden layers perform better in cost estimation?
3 Will the neural network with more nodes in each hidden layer perform better
than those with fewer nodes in each hidden layer?
BACKGROUND
abilities, that is to say, its ability to provide a generic functional mapping from
inputs to outputs. It is necessary to note, however, that the use of a neural network
is not to replace an estimator, but to provide some assistance for his or her
judgement, in the same light as a spreadsheet aids what-if analysis.
A typical neural network may be viewed as an evolved version of the perceptron of
Rosenblatt (1958), as illustrated in Fig. 1. Perceptrons (neurons) are arranged in
multiple, fully interconnected layers, but no communication is permitted between
the neurons within a layer. The input neurons are linked to the output neurons
through one or more interconnected hidden layers. The multi-layer perceptron is
referred to as a feed-forward network, since inputs are fed into the input layer and
propagate forward through the network topology to the output layer.
Associated with each connection is a numerical value, the strength or weight of
that connection: w_{i,j} is the weight of the connection between units i and j.
At the beginning of a training process, the connection weights are assigned
random values. As examples are presented during training, the backpropagation
mechanism (to be described later) modifies the connection weights in an iterative
process. When the iterative process has converged, the collection of connection
strengths captures and stores the information in the examples used in training.
Such a trained neural network is ready to be used: when presented with an
incomplete pattern of information, the network will produce a complete pattern
of information according to the information learned and stored in its
connection strengths.
A feed-forward network with backpropagation proceeds as follows. A neural
node sums the weighted inputs from its neighbouring lower (input) nodes,
compares this sum to its threshold value (θ) and passes the result to higher
(output) nodes through a function referred to as an interaction rule:
x_{k+1,j} = Σ_{i=1}^{n} w_{i,j} x_{k,i}

where x_{k,i} is the output from node i on layer k, w_{i,j} is the weight between node i on
layer k and node j on layer k + 1, and n is the number of nodes impinging on x_{k+1,j}.
The activation value of x_{k+1,j} is then computed using a sigmoid function as:

a_{k+1,j} = 1 / (1 + e^{-(x_{k+1,j} - θ)})

where θ is the threshold activation level, known as the offset. The output value is
computed from the activation value; in this experiment, the output value equals the
activation value. The output value is sent to the other processing units in the upper
layer to compute their output values.
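As an illustration of these two equations, a single layer-to-layer pass can be sketched in Python (the layer sizes, weights and threshold value below are invented for the example, not taken from the paper):

```python
import math

def sigmoid(x, theta=0.0):
    # Sigmoid activation with threshold (offset) theta, as in the equation above
    return 1.0 / (1.0 + math.exp(-(x - theta)))

def forward_layer(outputs_k, weights, theta=0.0):
    """Propagate outputs of layer k to layer k+1.

    outputs_k : list of node outputs x_{k,i} on layer k
    weights   : weights[i][j] = w_{i,j} between node i (layer k)
                and node j (layer k+1)
    """
    n_next = len(weights[0])
    layer_out = []
    for j in range(n_next):
        # Weighted sum of the inputs impinging on node j of layer k+1
        net = sum(outputs_k[i] * weights[i][j] for i in range(len(outputs_k)))
        # Output equals the activation value, as in the experiment described
        layer_out.append(sigmoid(net, theta))
    return layer_out

# Two inputs feeding three next-layer nodes (illustrative values)
x = [0.5, 1.0]
w = [[0.1, 0.2, 0.3],
     [0.4, 0.5, 0.6]]
hidden = forward_layer(x, w)
```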
The training (learning) of a multi-layer backpropagation neural network is an
iterative process. When a cost estimation case is presented to the neural network,
the network propagates the activation through to the processing nodes. The error
at the output nodes is then backpropagated to the hidden processing nodes, and
connection weights in the hidden nodes are modified by a modification rule. Each
presentation of one training case and subsequent modification of connection
weights is called a cycle.
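One such cycle can be sketched for a toy network as follows (an illustrative Python rendering of a standard backpropagation step under the generalized delta rule; the network shape, learning rate and training example are invented, not the configurations used in the experiments):

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Tiny network: 2 inputs -> 2 hidden nodes -> 1 output (illustrative shape)
w_ih = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)]
w_ho = [random.uniform(-0.5, 0.5) for _ in range(2)]
eta = 0.5  # learning rate

def train_cycle(inputs, target):
    # Forward pass
    hidden = [sigmoid(sum(inputs[i] * w_ih[i][j] for i in range(2)))
              for j in range(2)]
    out = sigmoid(sum(hidden[j] * w_ho[j] for j in range(2)))

    # Output-layer delta: a(1 - a)(t - a)
    delta_o = out * (1.0 - out) * (target - out)
    # Hidden-layer deltas: error is apportioned by connection strength
    delta_h = [hidden[j] * (1.0 - hidden[j]) * delta_o * w_ho[j]
               for j in range(2)]

    # Weight updates (gradient descent on the error)
    for j in range(2):
        w_ho[j] += eta * delta_o * hidden[j]
        for i in range(2):
            w_ih[i][j] += eta * delta_h[j] * inputs[i]
    return (target - out) ** 2

# Repeated cycles on one example shrink the squared error
errors = [train_cycle([1.0, 0.0], 0.8) for _ in range(200)]
```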
The modification rule used in the neural networks is based on the generalized
delta rule described by Rumelhart & McClelland (1986b). The modification
of weights is accomplished through gradient descent on the total error
for a given training case, as shown in the following equation:

Δw_{i,j} = η δ_j a_i

In this equation, η is a learning constant called the learning rate, and δ_j is the
gradient of the total error with respect to the net input of node j. At the output
layer, δ_j is determined from the difference between the expected activation t_j and
the computed activation a_j:

δ_j = a_j (1 - a_j)(t_j - a_j)
At the hidden nodes the expected activations are not known a priori. The
following equation gives a reasonable estimate of δ_j for the hidden nodes:

δ_j = o_j (1 - o_j) Σ_k δ_k w_{j,k}

In this equation, the error attributed to a hidden node depends on the error of
the nodes it is linked to. The amount of error from those nodes attributed to the
hidden node depends on the strength of the connection from the hidden unit to those
nodes; a hidden node with a strong excitatory connection to a node exhibiting
error will be strongly 'blamed' for that error, causing this connection weight to be
reduced. Variable o_j is defined as:
4 Construction cost
5 Inflation rate

The output node is the markup percentage. Some of these attributes have numerical
values, such as company size and inflation rate; others have nominal values,
such as need for work, ranging over low, medium and high. The nominal values are
coded as: 1 = low; 2 = medium; 3 = high. The input attributes represent factors
that have direct impacts on bidding decisions. The output attribute indicates the
markup percentage, as described in Fig. 3. It is necessary to note that there are
many factors, such as those affecting the construction cost of a project (e.g.
labour, material and services costs), which are not included in the input and
output attributes. Details of these cost factors are given in Skitmore (1988),
Seydel & Olson (1990) and Ogunlana & Thorpe (1991).
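The coding of attributes into a network input vector can be sketched as below (a hypothetical helper: the attribute names and the low/medium/high coding follow the text, but the function itself and the example values are illustrative):

```python
# Nominal levels coded as in the text: 1 = low, 2 = medium, 3 = high
NOMINAL = {"low": 1, "medium": 2, "high": 3}

def encode_bid(need_for_work, company_size, construction_cost, inflation_rate):
    """Build a numeric input vector for the network.

    need_for_work     : nominal ("low" / "medium" / "high")
    company_size      : numerical
    construction_cost : numerical
    inflation_rate    : numerical (percentage)
    """
    return [
        NOMINAL[need_for_work],
        company_size,
        construction_cost,
        inflation_rate,
    ]

vec = encode_bid("medium", 250, 1_200_000, 3.5)
```

In practice such raw values would also be scaled to a comparable range before being presented to a sigmoid network.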
Two decisions are important in designing the hidden layers of a neural network.
The first is to decide the number of hidden layers, which is related to the capacity
of the network. The second is to decide the number of nodes on each layer.
Currently, there are no rules for assisting these decisions. In order to investigate
the effects of layer number and node number on the estimating accuracy of a
neural network, two sets of experiments were arranged, as illustrated in Table 1.
In experiment 1, four networks with different layers were designed to facilitate a
comparison. The number of nodes on each layer in this set of networks was 12.
Four networks were also designed for experiment 2, where each network had the
same number of layers (four layers) but varying nodes, as specified in Table 1.
It is necessary to note that this experimentation is not an exhaustive test of
neural networks with all combinations of hidden layers and nodes, as such
combinations are virtually infinite. However, we expected to glean useful
generalizations from the performance of neural networks with the designed
configurations. In particular, it would be interesting to compare the estimating
results from regression-based estimation methods and neural network-based models
with actual values, so that the estimating abilities of these models could be
evaluated.
The artificial neural networks were then trained and tested against the bidding
examples. One hundred and fifty-five examples were used for training, while five
Table 1 General configurations of neural networks

Experiment 1 (no. of nodes is 12 on each layer)

  network name   network 11 (N11)  network 12 (N12)  network 13 (N13)  network 14 (N14)
  no. of layers         3                 5                 7                 9

Experiment 2 (each network has four layers)

  network name   network 21 (N21)  network 22 (N22)  network 23 (N23)  network 24 (N24)
  no. of nodes          5                 7                 9                11
examples were used for testing. The training was a lengthy process conducted
on a UNIX SUN workstation. The number of training iterations varied
from 1300 to almost 200 000 cycles. The variation of training cycles appeared to
be stochastic; for example, N13 took 2300 cycles, whereas N14 took 120 000.
During the training process, connection weights increased and decreased as a
neural network settled down to a stable cluster of mutually excitatory nodes.
It was found that increasing the number of nodes reduced the least-mean-square
error. N22 (seven nodes per hidden layer) stopped at a higher error level
(0.00872) than N24 (11 nodes per hidden layer), which stopped at 0.00034. The
expansion of nodes showed an influence on the error level, which is consistent with
Ersoy's experience (Ersoy 1990).
RESULTS
The regression method used in this experiment was a stepwise multiple linear
regression program in which the markup rate was the dependent variable and the
four input attributes used in the neural network models were the independent
variables, as described in the Appendix. The regression equation was formulated
from the 155 training examples and then tested on the other five examples. Testing
results on the five examples from the regression, experiment 1 and experiment 2
are listed in Table 2.
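As a sketch of such a baseline, an ordinary least-squares fit of markup rate against the input attributes might look as follows (invented data; this shows the general form of a linear fit, not the stepwise procedure described in the Appendix):

```python
import numpy as np

# Invented attribute rows: [need_for_work code, company size (scaled),
#                           construction cost (scaled), inflation rate]
X = np.array([
    [1, 0.2, 0.5, 2.0],
    [2, 0.4, 1.1, 2.5],
    [3, 0.6, 0.9, 3.0],
    [2, 0.8, 1.4, 3.5],
    [1, 0.5, 0.7, 2.2],
    [3, 0.9, 1.6, 4.0],
])
y = np.array([4.5, 6.6, 4.8, 7.4, 5.3, 7.8])  # markup %, invented

# Add an intercept column and solve the least-squares problem
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(row):
    # Apply the fitted linear equation to a new attribute row
    return float(np.dot(np.append(row, 1.0), coef))
```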
The best way to compare these models is to find which model provides the
closest estimation. In Table 2 the estimation results produced by regression are
generally higher than actual values, with an average error rate greater than 10%.
Table 2 Results from testing on the five testing examples

                        Experiment 1               Experiment 2
  example  actual  regression   N11   N12   N13   N14   N21   N22   N23   N24
     1      4.50      5.32     5.32  4.93  4.78  4.63  4.42  6.12  5.01  4.57
     2      6.56      7.45     7.28  7.03  6.92  6.28  7.35  6.50  6.52  6.33
     3      4.75      5.78     5.53  5.32  4.95  4.83  4.82  5.57  4.87  4.92
     4      7.45      8.32     7.01  7.38  7.68  6.95  6.84  7.73  7.18  7.75
     5      7.80      8.50     8.23  8.03  7.92  7.45  8.20  8.12  8.30  7.65
Results from the neural networks, however, are more accurate. This led us to
believe that neural network-based cost estimation models are superior to
regression-based models. The average estimation error rate of the neural networks
fluctuated around 10%, except that N22 had a surprisingly high error rate, which
we attributed to its higher convergence error, as stated earlier.
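The paper does not state its exact error measure; one plausible reading is the mean absolute percentage deviation from the actual markups, which can be checked directly against the Table 2 figures:

```python
def mean_abs_pct_error(actual, estimated):
    # Average of |estimate - actual| / actual, expressed as a percentage
    errs = [abs(e - a) / a for a, e in zip(actual, estimated)]
    return 100.0 * sum(errs) / len(errs)

# Values taken from Table 2
actual = [4.50, 6.56, 4.75, 7.45, 7.80]
regression = [5.32, 7.45, 5.78, 8.32, 8.50]
n14 = [4.63, 6.28, 4.83, 6.95, 7.45]

reg_err = mean_abs_pct_error(actual, regression)
n14_err = mean_abs_pct_error(actual, n14)
```

On the Table 2 values this measure gives roughly 15% for the regression estimates and about 4% for N14, consistent with the comparison drawn in the text.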
In Experiment 1, we found that increasing the number of layers significantly
improved the cost estimation performance of neural networks. The superiority of
N14 over the other networks in Experiment 1 was clearly demonstrated in Table
2. In experiment 2, although we observed that during the training stage the
increase of nodes in each layer reduced the convergence error level, we found no
evidence that the expansion of nodes had a significant effect on cost estimating
performance.
In general we believe that, if sufficient historical cost data is available, neural
networks can be trained to intelligently assist estimators. Results from neural
networks provide a starting point to estimators for a given estimating task.
CONCLUSIONS
From the foregoing experimental results, the neural network-based cost model
looks very promising. The neural network-based cost model captures the intuition
of cost estimators in making cost decisions. Using a neural network as a
representation for cost estimation does not impose a priori assumptions about
cost behaviour, but rather bases its estimation of cost on the training data. This
representation can easily be updated when new cost data become available. Thus,
the neural network model provides a flexible, dynamic medium in which to
represent the state of cost estimation expertise as it evolves.
From the experimental results, we found that neural networks performed better
in modelling cost estimation than regression-based methods. From the results of
Experiment 1, we found that increasing the number of hidden layers improved the
estimation performance of neural networks. But there was no evidence in
experiment 2 to suggest that increasing nodes in each hidden layer improved the
estimation performance of neural networks, although in the training stage neural
networks with more nodes in each layer had lower convergence error levels.
In training the neural networks we experienced all the difficulties with the
backpropagation training algorithm outlined by Wasserman (1989) and
Hecht-Nielsen (1990). These problems include the tendency of the network to become
trapped in local optima, to suffer from network paralysis as the weights move to
higher values, and to become temporarily unstable, that is, to forget what it has
already learned as it learns new facts. To solve these problems, genetic algorithms
can be used to adjust the weight distribution in a neural network (Dorsey & Mayer
1992; Hegazy & Moselhi 1994). Further research effort is needed to incorporate
the use of genetic algorithms to minimize these problems.
References
Ahmad, I. & Minkarah, I. (1988) An expert system for selecting bid markup. Proceedings of Fifth
Conference on Computing in Civil Engineering, ASCE, 229-238.
Beeston, D.T. (1983) Statistical Methods for Building Price Data. E&FN Spon, London.
Dorsey, R.E. & Mayer, W.J. (1992) Optimization using genetic algorithms. In Advances in Artificial
Intelligence in Economics, Finance and Management (eds A.B. Whinston & J.D. Johnson). JAI Press,
Greenwich, CT.
Ersoy, O. (1990) Tutorial at Hawaii International Conference on Systems Science. Morgan Kaufmann, U.S.A.
Harris, F. & McCaffer, R. (1983) Modern Construction Management, 2nd edn. Collins Publishing
Limited, UK.
Hecht-Nielsen, R. (1990) Counterpropagation networks. Applied Optics, 26(23), 4979-4984.
Hegazy, T. & Moselhi, O. (1994) Analogy-based solution to markup estimation problem. ASCE
Journal of Computing in Civil Engineering, 8(1), 72-87.
Kohonen, T. (1989) Self-Organization and Associative Memory. Springer-Verlag, New York.
LeCun, Y. (1986) Learning processes in an asymmetric threshold network. In Disordered Systems and
Biological Organization (eds E. Bienenstock, F. Fogelman Soulié & G. Weisbuch). Springer, Berlin.
McCaffer, R. (1976) Contractor's bidding behaviour and tender price prediction. PhD thesis, Loughborough
University of Technology.
Morrison, N. & Stevens, S. (1981) Cost planning and computers. Report for the Property Services
Agency. University of Reading, UK.
Moselhi, O. & Hegazy, T. (1991) Neural networks as tools in construction. ASCE Journal of Construction Engineering and Management, 117(4), 606-625.
Murtaza, M.B. & Fisher, D.J. (1994) Neuromodex - neural system for modular construction decision
making. ASCE Journal of Computing in Civil Engineering, 8(2), 221-233.
Ogunlana, S. & Thorpe, A. (1991) The nature of estimating accuracy: developing correct associations.
Building and Environment, 26(2), 77-86.
Pao, Y.H. (1989) Adaptive Pattern Recognition and Neural Networks. Addison-Wesley, Reading.
Rosenblatt, F. (1958) The perceptron: a probabilistic model for information storage and organization
in the brain. Psychological Review, 65, 386-408.
Rumelhart, D.E. & McClelland, J.L. (1986a) A general framework for parallel distributed processing.
In Parallel Distributed Processing: Exploration in the Microstructure of Cognition, Vol. 1, Foundations (eds
D.E. Rumelhart & J.L. McClelland). MIT Press, Cambridge, Massachusetts.
Rumelhart, D.E. & McClelland, J.L. (1986b) Learning internal representation by error propagation. In
Parallel Distributed Processing: Exploration in the Microstructure of Cognition, Vol. 1, Foundation (eds
D.E. Rumelhart & J.L. McClelland). MIT Press, Cambridge, Massachusetts.
Seydel, J. & Olson, D. (1990) Bids considering multiple criteria. Journal of Construction Engineering
Management, ASCE, 116(4), 609-623.
Skitmore, M. (1988) Factors affecting estimating accuracy. Cost Engineering, 30(12), 12-17.
Skitmore, M. (1989) Contract Bidding in Construction. Longman Scientific & Technical Publications,
UK.
Tavakoli, A. & Utomo, J. (1989) Bid markup assistant: an expert system. Journal of Cost Engineering,
31(6), 28-33.
Wasserman, P.D. (1989) Neural Computing: Theory and Practice. Van Nostrand Reinhold, New York.
Woodward, J.F. (1975) Quantitative Methods in Construction Management and Design. Macmillan,
London.
APPENDIX
The stepwise linear equations discovery (SLED) system is a regression-based
numerical modelling system that employs a guided searching method to generalize stepwise linear equations to approximate complex relationships between
cost variables. The SLED module anticipates that past cost data collected from
construction projects may be correlated in some nonlinear way and may not lend
themselves to fit only one linear equation within a given error bound and coverage
rate. However, the nonlinearity could be approximated by a set of stepwise linear
equations. SLED is thus designated to learn such stepwise linear equations.
SLED employs a linear formula template (Z = a1X1 + a2X2 + ... + anXn + b) in
generalizing stepwise empirical equations, where Z is the dependent variable,
(X1, X2, ..., Xn) are the independent variables, and a1, a2, ..., an and b are
constants. Two criteria are used to control the searching of stepwise empirical
formulas: MR is the majority ratio, denoting the minimum fraction of cost data
that must be covered by an equation within ±ε, where ε is the error bound. The
majority ratio also exhibits the credibility of the equation.
Figure 4 illustrates (through a two-variable example) how nonlinearly distributed
datasets can be modelled by a set of linear equations. Given two cost variables X
and Y, and their values (xj, yj), j = 1, ..., n, where n is the total number of
projects extracted from the completed cost estimates, the mechanism for
generating stepwise linear equations is outlined by the pseudo code listed in Fig.
5, where the generalization mechanism is illustrated by a two-variable regression
problem.
The learning process can best be described as a searching process guided by the
control criteria (MR and ε). It first sorts the input datasets in sequence
according to the distance from (xj, yj) to the origin (0, 0). Then the learning
module incrementally searches through the data space in order to retrieve the data
segments that fit linear equations. A data segment is an annular shape defined by
two radii d1 and d2, as shown in Fig. 4. For each data segment, SLED uses the
linear equation Z = a1X1 + a2X2 + ... + anXn + b to model the data enclosed in the
segment. After computing the constants a and b and formulating the equation, the
learning module measures the percentage of the examples in the data segment which
are covered by the equation within the ±ε bound. All equations that cover at least
Loop1:
    Xmin = X(i_first)
    Ymin = Y(i_first)
    Xmax = X(i_second)
    Ymax = Y(i_second)
Loop2:
    i_second = i_second + 1
    {compute a1, b1 by regression to construct equation X = a1Y + b1 based on
     datasets (X(j), Y(j), j = i_first to i_second)}
    a = a1
    b = b1
    evaluate equation X = a1Y + b1 by criteria MR and ε
    IF (evaluation is satisfactory) THEN
        Xmax = X(i_second)
        Ymax = Y(i_second)
        GOTO Loop2
    ELSE
        OUTPUT {X = aY + b, (Xmin ≤ X ≤ Xmax; Ymin ≤ Y ≤ Ymax)}
        IF (i_second = n) THEN
            EXIT
        ELSE
            i_first = i_second
            i_second = i_second + 1
            GOTO Loop1
        END IF
    END IF
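The loop of Fig. 5 can be rendered in executable form as follows (a simplified Python sketch: the least-squares helper and the coverage test are one reading of the MR and ε criteria described above, not the authors' implementation):

```python
def fit_line(xs, ys):
    # Least-squares fit of x = a*y + b, as in the pseudo code's X = a1*Y + b1
    n = len(ys)
    my = sum(ys) / n
    mx = sum(xs) / n
    denom = sum((y - my) ** 2 for y in ys) or 1.0
    a = sum((y - my) * (x - mx) for x, y in zip(xs, ys)) / denom
    return a, mx - a * my

def sled_segments(X, Y, mr, eps):
    """Greedy stepwise linear fitting: extend each segment while at least
    fraction mr of its points lie within +/- eps of the fitted line,
    then output the segment and start a new one (cf. Fig. 5)."""
    n = len(X)
    segments = []
    i_first, i_second = 0, 1
    a = b = 0.0
    while True:
        xs, ys = X[i_first:i_second + 1], Y[i_first:i_second + 1]
        a1, b1 = fit_line(xs, ys)
        within = sum(abs(x - (a1 * y + b1)) <= eps for x, y in zip(xs, ys))
        if within / len(xs) >= mr:
            a, b = a1, b1                      # segment still satisfactory
            if i_second == n - 1:
                segments.append((a, b, X[i_first], X[i_second]))
                return segments
            i_second += 1
        else:
            # Output the last satisfactory segment and start a new one
            segments.append((a, b, X[i_first], X[i_second - 1]))
            i_first = i_second
            if i_first >= n - 1:
                # a single leftover point is recorded as a constant segment
                segments.append((0.0, X[-1], X[-1], X[-1]))
                return segments
            i_second = i_first + 1

# Piecewise linear toy data: X follows Y with slope 1, then slope 6
Y = [0, 1, 2, 3, 4, 5, 6, 7]
X = [0, 1, 2, 3, 4, 10, 16, 22]
segs = sled_segments(X, Y, 1.0, 0.01)
```

With full coverage required (MR = 1.0) and a tight error bound, the sketch recovers the two underlying linear pieces of the toy data.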