
European Journal of Operational Research 74 (1994) 426-447

North-Holland

Theory and Methodology

Highlights and critical points in the theory and application of the Analytic Hierarchy Process

Thomas L. Saaty
J.M. Katz Graduate School of Business, University of Pittsburgh, Pittsburgh, PA 15260, USA

Received June 1992; revised December 1992

Abstract: This paper provides a detailed discussion with references on the fundamentals of the Analytic
Hierarchy Process and in particular of relative measurement. The points discussed are grouped under
the following categories: Structure in the AHP - Hierarchies and Networks, Scales of Measurement,
Judgments, Consistency and the Eigenvector, Synthesis and Normative vs. Descriptive. The paper also
includes a discussion of rank and a number of citations of rank reversals attributed to a variety of factors
ranging from intransitivity to procedure invariance, that are thought to be unexplained by Utility Theory
with its underlying principle to always preserve rank. It is shown that when there is synergy due to the
number of elements the AHP can be used to both preserve rank when it is desired to preserve it and
allow it to reverse when it should reverse.

1. Introduction

The Analytic Hierarchy Process (AHP) is a theory of measurement concerned with deriving domi-
nance priorities from paired comparisons of homogeneous elements with respect to a common criterion
or attribute. Such measurement can be extended to nonhomogeneous elements through 'clustering.' In a
multicriteria setting, the AHP can be used to scale elements in a hierarchy (feedforward) structure with
mutually independent elements in each level, or in a network (feedforward-feedback) system of
components allowing for dependence within and between components. Thus a hierarchy is a special case
of the more general system formulation, the network. Applications of the AHP have included parallel
hierarchies (one for benefits and one for costs), and solitary hierarchies (projected and idealized
planning, resource allocation). More complex applications of the AHP include the case of an infinite
number of elements (via Fredholm integral operators), and the modelling of neural firing and its
synthesis.
Because of its widespread use, the AHP has been repeatedly put under the microscope and every
aspect has been examined, questioned, and explained [16]. This is a healthy process for a new theory.
Replies to individual queries have appeared in the literature, and the theory has been extended over the

0377-2217/94/$07.00 © 1994 - Elsevier Science B.V. All rights reserved


SSDI 0377-2217(94)E0227-O
T.L. Saaty / Analytic Hierarchy Process 427

years. This paper provides a more detailed account of some of the basic technical and behavioral aspects
of the subject.
A number of applications of the AHP have been published in the literature. In a recent book, 'The
Silverlake Project', by IBM [3], the subject occupies an entire chapter in which it is said, "AHP is an
extraordinarily powerful decision-making tool. It brings structure to decision making, yet it's flexible..."
Using the AHP, IBM won the Malcolm Baldrige National Quality Award for producing the best-selling
computer, the AS/400. In another application to household population forecasts, Cook, Falchi, and
Mariano used the AHP to evaluate the impacts of cross-sectional variables which cannot be explicitly
captured in a time-series approach [8]. J.F. Bard used the AHP to rank Pareto-optimal solutions for
selecting automation options [2]. R.P. Hamalainen used the AHP to structure and set priorities on
alternative methods of power generation in Finland by working with members of the parliament of that
nation [18]. Golden, Wasil and Harker edited a book with a dozen applications of AHP in project
selection, the electric utility industry, the federal government, and others in medicine, politics, engineer-
ing and business [17]. In 1977 this author won an award from the Institute of Management Sciences on
an application of the AHP in transport planning in the Sudan [29,30]. Numerous applications of the AHP
have been made in industry and government by using the software package Expert Choice. The
proceedings of the First International Symposium on AHP held in Tianjin, China include a few of several
hundred applications (estimated by a member of the Academy of Sciences of China) of the AHP by the
government of China [28]. The Xerox Corporation has institutionalized use of the AHP in their strategic
decision making. It is also used by the Departments of Defense and of Energy in Washington.
Special issues of various journals have been dedicated to the AHP; two by Socio-Economic Planning
Sciences [46,47], two by Mathematical Modelling [24,25], and one by the European Journal of Opera-
tional Research [14]. Since 1989, a semi-annual journal called AHP and Decision Making has been
published by the Science Press in Beijing, China. A special issue of the Communications of the
Operations Research Society of Japan is about the AHP [10]. Nearly twenty books and translations have
appeared on the subject in several languages including Chinese, Japanese, Indonesian, French, German,
Russian and Portuguese. Several software packages implementing AHP have been reviewed in a recent
issue of the Journal of Multi-Criteria Decision Analysis [6]. There was a Second International Symposium
on AHP in Pittsburgh in 1991 with proceedings [27]. Also, two survey articles have appeared on the
subject [48,49]. This paper includes five sections to analyze the foundations and questions raised about
the AHP together with our explanation of why the theory is as it is today: 1) Structure in the AHP -
Hierarchies and Networks, 2) Scales of Measurement, 3) Judgments, Consistency and the Eigenvector, 4)
Synthesis, and 5) Normative-Descriptive.

2. Structure in the AHP - hierarchies and networks

The AHP emphasizes the use of elaborate structures to represent a decision problem. The soundness
of a decision is reflected at least as much by the richness and accuracy of the structure and relations in
the structure, as it is in assigning and manipulating numbers according to some theory. In fact, a
reasonable test of the efficiency of numerical measurement is how well it can deal with elaborate
structures without undue condensing and alteration of that structure to make it possible to perform the
manipulations. How well the mathematical operations generalize to cases of dependence within and
between clusters of elements is the acid test as to whether the theory seems natural or contrived. With its
use of relative ratio scales and corresponding formal operations, the AHP is capable of incorporating all
observed dependencies.

[Figure 1. Structural difference between a linear and a nonlinear network. Panels: "A Linear Hierarchy" and "A Nonlinear Network". Labels: component, element. An arrow from A to B means that A dominates B or that B depends on A.]

A hierarchy is a structure used to represent the simplest type of functional dependence of one level or
component of a system on another in a sequential manner. It is also a convenient way to decompose a
complex problem in search of cause-effect explanations in steps which form a linear chain. One result of
this approach is to assume the independence of an upper part or cluster of the hierarchy from the
functions of all its lower parts. The structure of hierarchies is linear and proceeds downward from the
most general and less controllable (goals, objectives, criteria, subcriteria) to the more concrete and
controllable factors terminating in the level of alternatives. A useful criterion to check the validity of a
hierarchy is to determine if the elements of an upper level can be used as common attributes to compare
the elements in the level immediately below with each other. There are two general types of hierarchies
that arise in planning. One is the forward process hierarchy which descends from the goal to time
horizons, environmental, political and social factors to actors and their objectives, policies, and so on
down to a level of contrast scenarios from which a composite scenario is derived with the measurement
approach of the AHP. The other is the backward process hierarchy descending from the goal of choosing
a best outcome, to feasible and desired anticipatory scenarios, to problems and opportunities facing the
future, to actors who control the problems and opportunities, their objectives and policies, to the most
likely policies of a particular actor to influence the other actors. Planning involves testing the effect of
the high priority policies for some actor in a second forward process hierarchy to close the gap between
the dominant contrast scenarios (or composite scenario) of the forward process and the anticipatory
scenarios of the backward process. The iterations are repeated.
There is a more general way, not necessarily linear, to structure a problem involving functional
dependence. It allows for feedback between clusters. It is a network system of which a hierarchy is a
special case.
Figure 1 depicts the structural difference between the two frameworks of a hierarchy and a network.
A network can be used to identify relationships among components using one's own thoughts,
relatively free of rules. It is especially suited for modeling dependence relations. Such a network
approach makes it possible to represent and analyze interactions and also to synthesize their mutual
effects by a single logical procedure. In a hierarchy we have the outer dependence of the elements in a
level on the elements in the level above. A loop means that there is inner dependence of elements within
a component. In the network diagram shown above, interaction within a cluster is called inner
dependence and interaction between clusters is called outer dependence. Inner dependence is analyzed
with respect to the attributes of a dominating cluster linked to the given cluster. Formal definitions of the
foregoing are found in [39]. In a given problem there may be several networks associated with each of the
criteria of a separate governing design hierarchy called the control hierarchy. Synthesizing interactions
into different networks involves the use of priorities from the control hierarchy.
Hierarchic structures are fundamental to planning and to the analysis of risk. The AHP has been
written on and used extensively in planning [40]. One cannot do planning without considering risk and
uncertainty, and therefore risk and uncertainty have also been addressed in the literature [33,34].
Concern with risk is represented in the hierarchy through scenarios expressing uncertainty to be faced,
through benefits and costs of risky operations and through criteria indicating both uncertain and risky
outcomes. Sensitivity analysis is used to test the effect of risky factors on the outcome. The use of
lotteries on a few tangible criteria as it is traditionally used provides no guarantee of a thorough
treatment of risk under varying environmental, political, economic and other conditions. Such thinking
requires an adequate structure to capture it.

3. Scales of measurement

Since there do not exist numerical units applicable in all domains, scales need to be generated to solve
problems according to specific goals. Despite this dilemma, traditional methods continue to use
homogeneous linear scales with a unit. It is this issue, however, which substantially differentiates AHP
from these approaches. Prior to this millennium, there were very few scales of measurement, if any, and
negative numbers had not been formalized. Yet it is incorrect to assume that decisions made in the past
were inferior because of the lack of scales of measurement. All people share a mental ability to perform
the more general approach of relative measurement to capture priorities without systematic organization
of the elements of the decision. The AHP organizes and quantifies this very process of thinking.
Decision making involves making tradeoffs reflected through arithmetic operations on the weights
used to represent judgments. The type of arithmetic performed on these weights is a major concern in
multicriteria problems since the freedom to add and multiply measurements does not always exist. For
example, one cannot meaningfully multiply two readings on the Fahrenheit scale. In this quest one is
faced with the question of what kinds of numerical scales there are. Widely known scales include ordinal
(invariant under strictly monotone increasing transformations), interval (invariant under positive linear
transformations), ratio (invariant under positive similarity transformations), and absolute (invariant
under the identity transformation) scales. A decision theory needs to justify the way it elicits judgments,
converts them to numbers, and produces an overall answer that belongs to one of these scales.
Additionally, the outcome of the calculation must remain in the same scale or change continuously if the
structure of the decision were to be extended or perturbed. What scale admits such flexibility?

3.1. Ratio and interval scales

Since ordinal scales cannot be added or multiplied, we need to examine interval and ratio scales as
candidate scales for decision making.
Interval scales have the form ax + b with a > 0, b ≠ 0. When b = 0, we have a ratio scale. In terms of
perturbations of b, the two scales are related. One can add or subtract interval scale numbers but not
multiply or divide them. Ratio scales permit manipulation with respect to all four arithmetic operations
keeping subtraction appropriately positive.
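The difference between the two scale types can be checked numerically. The following is a minimal sketch, not part of the AHP itself: it assumes the Celsius-to-Fahrenheit conversion as an example of an interval-scale transformation (ax + b with b ≠ 0) and a kilogram-to-pound conversion as an example of a ratio-scale transformation (b = 0). The function names and the sample readings are hypothetical.

```python
def celsius_to_fahrenheit(c):
    """Interval-scale change of unit: a*x + b with a = 9/5 and b = 32."""
    return 9.0 / 5.0 * c + 32.0


def kilograms_to_pounds(kg):
    """Ratio-scale change of unit: a*x with b = 0 (a is approximately 2.20462)."""
    return 2.20462 * kg


# Hypothetical readings chosen only for illustration.
c1, c2, c3 = 10.0, 20.0, 40.0
m1, m2 = 3.0, 6.0

# Ratios of interval-scale readings depend on the unit chosen.
ratio_celsius = c2 / c1
ratio_fahrenheit = celsius_to_fahrenheit(c2) / celsius_to_fahrenheit(c1)

# Ratios of differences of interval-scale readings do not depend on the unit.
diff_ratio_celsius = (c3 - c2) / (c2 - c1)
diff_ratio_fahrenheit = (
    (celsius_to_fahrenheit(c3) - celsius_to_fahrenheit(c2))
    / (celsius_to_fahrenheit(c2) - celsius_to_fahrenheit(c1))
)

# Ratios of ratio-scale readings are unchanged by a change of unit.
ratio_kg = m2 / m1
ratio_lb = kilograms_to_pounds(m2) / kilograms_to_pounds(m1)

print("interval scale, ratio of readings:", ratio_celsius, ratio_fahrenheit)
print("interval scale, ratio of differences:", diff_ratio_celsius, diff_ratio_fahrenheit)
print("ratio scale, ratio of readings:", ratio_kg, ratio_lb)
```

Only the ratio-scale readings keep the same ratio under a change of unit, which is the property that weighting and multiplying priorities relies on.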
Throughout the hierarchic structure, the AHP elicits judgments in the form of absolute numbers and
derives the ranks on a ratio scale of priorities. These interim scales are weighted to produce commensurate
ratio scales that can later be added to obtain an overall ranking of the alternatives. There are no
scaling constants for the criteria in the AHP with indeterminate scale affiliation as there are in other
theories using interval scales [23]. As in the temperature example, interval scale numbers cannot be used
to weight interval scale numbers. When the alternatives rated on an interval scale in one decision
become criteria for the alternatives of a new decision also measured on an interval scale, it is a puzzle
how one abandons the old calculations and begins new calculations to make things compatible. This is a
weakness of interval scales in decision making. An example of this occurrence is when policy alternatives
of one decision become criteria to judge consequences of policies as new alternatives. To use such an
approach to derive weights for the consequences, different technical and philosophical procedures are
needed to avoid multiplication of the two interval scale results! With ratio scales, one weights the ratios
from the consequences by the priority ratios of the criteria, which is what one would do anyway
continuing the previous composition operations of weighting and adding.
Consider a person given a choice between paying $1 with an assurance of a $10 return, or paying $2 to
get a $15 return. No gambles are involved in this transaction and one would reason 15 - 2 = 13 is more
than 10 - 1 = 9 and choose to pay $2. It would not be correct to form the ratios 15/2 = 7.5 and
10/1 = 10. This transaction should not be determined by benefit/cost ratios, but by differences. So when
does one use benefit/cost ratios? Answer: when dealing with two different ratio scales (for example, when
the rewards are gambles). Then there is one ratio scale for the reward and one for the gamble, and the
choice should be based on benefit/cost ratios. The benefit belongs to a different scale than the cost
represented by the gamble taken to earn it. Two different ratio scales cannot be added or subtracted -
only multiplied or divided. However, the resulting ratios now belong to the same scale among themselves
and all four operations can be performed on them again. According to the criterion of maximizing
expected value, gambling $1 to gain $10 is preferred to gambling $2 to gain $15 if the probability of
getting $10 is less than 1/6. As the risk decreases (the probability of getting $10 converges to 1) one
should again turn to differences and the gamble with the greater difference should be taken. However,
generally speaking, under risk one should use ratios.
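The riskless part of this example can be restated as a short computation. The sketch below is only an illustration of the two decision rules discussed above, using the dollar amounts from the text; the option labels and function names are assumptions introduced here.

```python
# Each option is (payment, assured return) in dollars, as in the text.
options = {
    "pay $1, receive $10": (1.0, 10.0),
    "pay $2, receive $15": (2.0, 15.0),
}


def net_gain(payment, assured_return):
    """Difference rule: both amounts lie on the same dollar scale."""
    return assured_return - payment


def benefit_cost_ratio(payment, assured_return):
    """Ratio rule: appropriate when benefit and cost lie on different scales."""
    return assured_return / payment


best_by_difference = max(options, key=lambda name: net_gain(*options[name]))
best_by_ratio = max(options, key=lambda name: benefit_cost_ratio(*options[name]))

for name, (payment, assured_return) in options.items():
    print(name, net_gain(payment, assured_return), benefit_cost_ratio(payment, assured_return))
print("difference rule picks:", best_by_difference)
print("ratio rule picks:", best_by_ratio)
```

The two rules disagree here: the difference rule selects the $2 payment (13 against 9), while the ratio rule would select the $1 payment (10 against 7.5), which is why the choice of rule has to follow from the scales involved.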

3.2. Absolute and relative scales

Abstractly, absolute measurement is the comparison of some value on a scale with the unit value of
the scale. The task is to define the meaning of a unit before using it to construct the scale. This is rarely
adequate in practice. Sometimes a range is first given and the unit is suitably defined without ascribing
deeper meaning to the operation. In essence a unit is arbitrary but needs consensus for its adoption.
A major philosophical difference between the AHP and theories based on absolute measurement is
that the latter require units of measurement to trade off weights for the criteria or attributes. For
example, for characteristics such as style and color, one must have an existing scale to measure style, and
another for color. Further, some means to trade off a unit of one against a unit of the other is required.
In the AHP, relative measurement does not trade off units in the same way, because measurement
ascends from paired comparisons to derive (rather than assume) a scale.
Measurement of phenomena on absolute scales is thought to be a practical but not a profound way to
associate numbers with the phenomena by using a conveniently chosen and mostly uniformly graded
instrument that is then applied to associate numbers with objects and events. These numbers are
surrogates, indicators, or stimuli to the mind educated about the significance of the magnitude of the
number, in terms of the goals and understanding of an individual. For a group, its members would have
to agree on how measurements are to be interpreted to lend credence to an objective acceptance. Thus
measurements have no intrinsic significance apart from the people who use them. Significance itself can
be represented in the form of priorities with respect to a hierarchy of objectives. These priorities need
not behave linearly or monotonically or even correspond to the original numbers. To the blind, the
comparison of two alternatives according to visual brightness is impossible, thus an absolute scale for
such comparison is valueless. But this person may be able to feel or judge brightness by comparing the
heat or electric intensity generated by the light, which has a meaning. Hence, it is imperative to
understand how to transform absolute measurements to priorities to take advantage of both qualitative
and quantitative information.
Two types of relative scales are encountered in multicriteria decisions. One is obtained by directly
converting absolute numbers to relative numbers as for example by dividing by their sum, known as
normalization. The other is in the form of priorities derived from comparisons of elements relative to a
common attribute or from comparison of numbers associated with them. The comparison of homoge-
neous elements, which fall within an order of magnitude, is made in terms of the dominance of one
element over another. This relationship is expressed as the number of times the more preferred entity is
preferred to the less preferred entity. The latter serves as the unit of comparison. Comparisons
expressing individual preferences are an innate ability of the mind. Measurement on an absolute scale
simply complies with the mechanics of associating a reading with its object. What the numbers mean is
not and cannot be designed into the scale. Note that absolute measurement applies to elements one at a
time. Relative measurement is based on comparing elements in multiples. Without priorities to interpret
and quantify information, one would need to create an infinite number of different homogeneous
measurement scales with a unit, one for each of the infinite number of properties known and would still
need a way to interpret and combine the resulting information in order to make a decision. Additionally,
one cannot reduce ethics, religion, happiness, politics, and society into fixed units as some theories do by
converting to dollars in an attempt to unify all measurements into some absolute scale with a unit. In
constructing priorities, consideration of how science generates and processes measurement should be
taken. When priorities are derived from ratios of measurements, they must give back their measurements
as they do in the AHP. Otherwise, setting priorities would itself be a contrived number crunching
process.

3.3. Tangibles and intangibles

We note that it is not meaningful to take numbers on the same absolute scale used to measure
alternatives for different criteria, normalize them for each criterion and then weight and synthesize and
achieve the same rank results. While it is true that a set of numbers such as dollars can be normalized to
develop a relative score, it does not follow that normalizing such numbers with respect to several criteria
and weighting by the priorities of the criteria and adding yields the same (correct) result that would be
obtained by weighting the numbers before normalization and then adding. The reason is that the
mapping from absolute to relative numbers is many to one (2/1 = 4/2 = 8/4 = 2000/1000 = ...). The
elements whose absolute measurements are 8 and 4 do not fall in the same cluster as those whose
absolute measurements are 2000 and 1000. This is how the AHP distinguishes between magnitudes
before ratios are formed. It is not possible to recover the original set. Thus to use absolute numbers
directly in the AHP for a priority scale, all those absolute measurements with respect to a single scale
must first be composed with respect to their several criteria. This leads to a single composite criterion
representing all the tangible criteria measured on that unit. The same process is repeated if other units
are used. It is only then that tangibles and intangibles are combined in a hierarchic framework.
Alternatively, all tangible criteria may be treated as intangibles in the hierarchy without using their unit
directly but benefiting from the measurements to carefully make paired comparisons on a priority scale.
Only then can they be normalized to combine with tangible criteria on a relative scale.
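The many-to-one character of normalization noted above can be seen in a few lines. This is a minimal sketch using the pairs (8, 4) and (2000, 1000) from the text; the helper name normalize is an assumption introduced for illustration.

```python
def normalize(values):
    """Convert absolute numbers to relative numbers by dividing by their sum."""
    total = sum(values)
    return [v / total for v in values]


small_pair = normalize([8, 4])
large_pair = normalize([2000, 1000])

# Both pairs map to the same relative numbers (2/3, 1/3), so the original
# magnitudes cannot be recovered from the normalized values alone.
print(small_pair)
print(large_pair)
```

Because the two outputs coincide, any information about magnitude has to be captured before normalization, for instance through the clustering and composition steps described in the text.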
The transformation of absolute numbers to relative numbers has little influence over how meaning is
assigned to generate priorities on a relative scale whose ratios may not be the same as those of the
corresponding absolute numbers. Priorities should not be combined with measurements unless they
coincide with them, in which case no difficulties arise. However priorities based on information from
different scales are a generalization that requires comparison of the criteria with respect to a higher
criterion. For emphasis, note that after absolute numbers are converted to priorities, one cannot take the
final scale and treat it arithmetically as if it is still the original scale of absolute numbers.
Suppose we wished to determine the best of three vacation sites A, B, C relative to travel cost and
lodging costs as shown in the following table:

Alternatives    Criteria
                C1: Travel cost ($)    C2: Lodging cost ($)
A                50                    200
B               100                    170
C               150                    230
Total cost      300                    600

The overall costs are given by

Cost A = 50 + 200 = 250, (minimum) (1a)

Cost B = 100 + 170 = 270, (1b)

Cost C = 150 + 230 = 380, (maximum) (1c)

and thus A would be the preferred site (minimized cost).


The same problem can be studied with a hierarchic interpretation using a relative scale. Upon
multiplication and division of the costs of the alternatives for each criterion by the total costs for that
criterion and adding we obtain

Cost A = 300 × (50/300) + 600 × (200/600), (2a)

Cost B = 300 × (100/300) + 600 × (170/600), (2b)

Cost C = 300 × (150/300) + 600 × (230/600), (2c)


which yields the same results as (1). The criteria derived their importance from the alternatives because
dollars have the same priority for both criteria. However, this is not always true for any unit and any
criterion. The quantities 300 and 600 can be used to determine the relative priorities of the criteria C1
and C2 (the ratio of 300 to 600 is 1 to 2, 1/3 to 2/3, or 0.333 to 0.667). If we compare these with respect to the
goal of selecting the best vacation site, we have

Goal    C1    C2     Priorities
C1      1     1/2    0.333
C2      2     1      0.667

For comparison of A, B, and C with respect to each criterion, use the ratios of the quantities in (2) for
the costs which are the relative values. When A is compared to B relative to travel one has
(50/300) ÷ (100/300) = 1/2 and so on. For each criterion the following matrices arise:

C1    A      B      C      Pri.
A     1      1/2    1/3    0.167
B     2      1      2/3    0.333
C     3      3/2    1      0.500

C2    A        B          C          Pri.
A     1        2/1.7      2/2.3      0.333
B     1.7/2    1          1.7/2.3    0.283
C     2.3/2    2.3/1.7    1          0.383

Weighting by the corresponding criteria priorities and adding yields

Cost A = 0.333 × 0.167 + 0.667 × 0.333 = 0.278, (minimum) (3a)

Cost B = 0.333 × 0.333 + 0.667 × 0.283 = 0.300, (3b)

Cost C = 0.333 × 0.500 + 0.667 × 0.383 = 0.422, (maximum) (3c)

which is the same as (1) and (2). Thus the costs obtained by additive hierarchic composition lead to the
same solution as an appropriate analysis of the original data.
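The agreement between (1), (2) and (3) can be verified directly from the table of costs. The sketch below is an illustration only: it takes the criteria priorities from the column totals (300 and 600) and the alternative priorities from normalizing each column, as in the text; the variable names are assumptions.

```python
# Travel and lodging costs in dollars for the three vacation sites.
costs = {"A": (50.0, 200.0), "B": (100.0, 170.0), "C": (150.0, 230.0)}

# Equation (1): overall cost in dollars.
total_cost = {site: travel + lodging for site, (travel, lodging) in costs.items()}

# Column totals give the criteria priorities (300 : 600, i.e. 1/3 and 2/3).
travel_total = sum(travel for travel, _ in costs.values())
lodging_total = sum(lodging for _, lodging in costs.values())
grand_total = travel_total + lodging_total
priority_travel = travel_total / grand_total
priority_lodging = lodging_total / grand_total

# Equation (3): weight the normalized costs by the criteria priorities and add.
relative_cost = {
    site: priority_travel * (travel / travel_total)
    + priority_lodging * (lodging / lodging_total)
    for site, (travel, lodging) in costs.items()
}

best_site = min(relative_cost, key=relative_cost.get)

for site in costs:
    print(site, total_cost[site], round(relative_cost[site], 3))
print("preferred site:", best_site)
```

Each relative cost equals the dollar cost divided by the grand total of 900, so the ranking and the ratios among the alternatives are the same as in (1).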
For elements of the same order of magnitude (homogeneous), the paired comparison judgments in the
matrices may be approximated by values from the scale 1-9 based on perception. This is useful when
there are no known numerical values to form the ratios. In such a case, the matrices of the above
example become:

C1    A    B      C      Priorities
A     1    1/2    1/3    0.163
B     2    1      1/2    0.297
C     3    2      1      0.540

C2    A    B    C    Priorities
A     1    1    1    0.333
B     1    1    1    0.333
C     1    1    1    0.333

Cost A = 0.333 × 0.163 + 0.667 × 0.333 = 0.276, (minimum) (4a)

Cost B = 0.333 × 0.297 + 0.667 × 0.333 = 0.321, (4b)

Cost C = 0.333 × 0.540 + 0.667 × 0.333 = 0.402, (maximum) (4c)

and A is again the preferred alternative. The approximation using a 1-9 scale could lead to a different
choice than the best one, but there would be no need to approximate if exact numbers are known.
However in general one needs to compare the dollar values according to the importance of their
magnitudes. The numerical differences between them may not be an adequate indicator of significance
to the decision maker.
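The priorities 0.163, 0.297 and 0.540 listed for C1 under the 1-9 scale agree, to three decimals, with the normalized principal right eigenvector of that judgment matrix, which is how the AHP derives priorities from paired comparisons. The sketch below approximates the eigenvector by power iteration in plain Python. The function name, the iteration count and the use of exact criteria priorities 1/3 and 2/3 are assumptions made for illustration, so the composite values can differ from (4) in the third decimal.

```python
def principal_eigenvector(matrix, iterations=100):
    """Approximate the normalized principal right eigenvector by power iteration."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    return weights


# Judgment matrices from the text (1-9 scale approximation).
travel_matrix = [
    [1.0, 1.0 / 2.0, 1.0 / 3.0],
    [2.0, 1.0, 1.0 / 2.0],
    [3.0, 2.0, 1.0],
]
lodging_matrix = [[1.0, 1.0, 1.0] for _ in range(3)]

travel_priorities = principal_eigenvector(travel_matrix)
lodging_priorities = principal_eigenvector(lodging_matrix)

# Criteria priorities taken as 1/3 and 2/3 (0.333 and 0.667 in the text).
criteria_priorities = (1.0 / 3.0, 2.0 / 3.0)

sites = ["A", "B", "C"]
composite_cost = {
    site: criteria_priorities[0] * travel_priorities[i]
    + criteria_priorities[1] * lodging_priorities[i]
    for i, site in enumerate(sites)
}
preferred = min(composite_cost, key=composite_cost.get)

print([round(p, 3) for p in travel_priorities])
print({site: round(value, 3) for site, value in composite_cost.items()})
print("preferred site:", preferred)
```

Power iteration is used here only to keep the example free of external libraries; any routine that returns the principal eigenvector would serve the same purpose.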
The example demonstrates that when the criteria weights are described in terms of the unit of
measurement of the alternatives, the arithmetic operations of the AHP can be used to duplicate with
relative numbers the answers one gets with absolute numbers. Yet that is not the purpose of the AHP.
The priorities associated with numbers may not vary linearly or monotonically with those numbers. In
fact for each problem the priorities would satisfy the needs of that problem according to the judgments
of the involved individual or group.
In a second example one unit of measurement is also used, but because the criteria do not signify the
units in which the alternatives are measured, their priorities do not derive from the measurement of the
alternatives, and the foregoing process is no longer valid to derive the weights of the criteria. Suppose we
have three foods and their content measured in milligrams for the two criteria, vitamin X and vitamin Y.
The importance of the criteria is no longer determined by the total or average milligram content of the
alternatives as before, but rather by the needs of the body for that vitamin to remain healthy.

Alternatives    Criteria
                Vitamin X                Vitamin Y
                (milligrams per unit)    (milligrams per unit)
A                50                      200
B               100                      170
C               150                      230
Total           300                      600

It may be harmful to get an excessive amount of one vitamin but healthy to get such an amount from
the other. We must establish priorities by comparing the criteria with respect to healthful contribution,
and the alternatives' milligram content for their positive contribution. The actual measurements and
their totals cannot determine the single best food to eat in small quantities.

3.4. Priority scales

Priority scales (which are derived ratio scales) are essential in multicriteria decisions. One may be
tempted to use readings from existing scales, but a single ordinary scale may not be unique. For example,
temperature can be measured either with a mercury or an alcohol thermometer and both readings
belong to an interval scale. Even when only mercury is used there can be different calibrations, such as
the Fahrenheit and Celsius scales with different size units. Different numbers are produced for
measurements with different scales. Measurements from different scales cannot be combined because
they are not commensurate. In the AHP, all such measurements are transformed into a uniform priority
scale based on the judged importance or preference for different readings as they affect our objectives.
The goal has a priority scale to measure the criteria. Each criterion in turn has a priority scale to
measure the alternatives. When the priorities of the criteria are used to weight the priorities of an
alternative with respect to each of those criteria, and the sum then taken, the result is the priority of the
alternative with respect to the goal and the measurements of the alternatives define a scale for
measuring them in terms of the goal.
Physics and mathematics extend some one-dimensional absolute measurements to multiple dimensions
such as areas, volumes, and measurement in higher dimensions. There are two ways to convert
these measurements to relative numbers. The first is to compare areas (or volumes) with other areas (or
volumes) directly. The second is to use logarithms which enable one to compose higher dimensional
measurements by adding corresponding one dimensional measurements, instead of multiplying them.
(This kind of transformation was discovered by Fechner to be in use by the mind to respond to stimuli
measured on an absolute scale [35].) Numbers associated with areas, volumes, or even lengths can be
suspect. A blanket whose total area is known to be adequate but is full of holes would not be a suitable
cover. A farmer can grow crops on a narrow elongated strip of land, that would be useless to a tennis
player who needs an area of the same magnitude with prescribed dimensions, to play tennis. Readings on
the same scale may have opposite interpretations in different settings. Freezing and boiling are both
good for preserving food, mid temperatures undesirable, whereas the opposite applies to human comfort.
While it is helpful to have numerical measurements, interpretation differs from problem to problem.
Priorities are needed to determine what numbers imply about the underlying situation. Higher dimensions
have properties not reflected in the arithmetic of composing lower dimensional measurements and
must be compared as they are on these properties to create meaningful priorities.
Observations made about numerical measurements as stimuli to be interpreted apply equally to
probabilities generated directly or through Bayes Theorem from posterior to prior by complex and
laborious calculations. Probabilities are not esoteric numbers that have deeper meaning than other
numbers used as stimuli. Saaty and Vargas [39] have proved that the supermatrix formulation of the AHP
implies Bayes Theorem, thus making probabilities part of the AHP priority framework. The outcomes of
probability computations have different implications in the mind of a decision maker about what they
mean depending on how high or how low the probabilities are. Traditional expected value reasoning is
not the only way nor always the best way to apply to every decision. Playing with probabilities is often a
numbers game because of the fascination generated in some academic minds due to the intricacies of
thought involved in the calculations and not because the significance of the probabilities in a decision is
an automatic consequence of the numerical probability.

4. Judgments, consistency and the eigenvector

In general people can only compare stimuli in a limited range where their perception is sensitive
enough to make distinctions. The range cannot be too wide or they will err. When the range is too wide
elements that are close together tend to be summarily lumped together. The AHP uses the fundamental
scale of absolute values 1-9 to represent paired comparison judgments to keep measurement within the
same order of magnitude. To cope with elements that may be wide apart, they are put in separate
homogeneous clusters linked by a common element from one to the next to allow connectedness in the
scaling operation, and the scale 1-9 is applied to compare elements in each cluster. The 1-9 scale can
also be used in decimal form to compare elements in a unit interval when the priorities of various
alternatives fall in that interval and distinctions need to be made among them. For example the paired
comparisons could use from 1.1 to 1.9, or any other interval, say 7.1 to 7.9, using the same semantic scale
applied to the decimals if one has the perception capability to do so.
In any comparison pair, the smaller of two elements A and B, say A, is taken as the unit and the
larger one is estimated as a multiple of that unit. If we assume, as we always do, that we can derive
weights $w_A$ and $w_B$ for the elements on a ratio scale, then the absolute number assigned to the larger
element is an estimate of this ratio and the comparison takes the form $(w_A/w_B)/1$. In the absence of a
scale for measuring every element (which, due to the paucity of known scales, is nearly always the case)
this approach is the most precise way to measure elements with respect to a property.
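
The reciprocal structure of such comparisons can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `reciprocal_matrix`, the dictionary layout and the sample judgments are hypothetical and do not come from the paper.

```python
def reciprocal_matrix(n, judgments):
    """Build an n x n paired comparison matrix.

    judgments maps (i, j), with i != j, to the number of times element i
    dominates element j (the dominated element j serves as the unit).
    The reverse entry (j, i) is forced to the reciprocal value.
    """
    a = [[1.0] * n for _ in range(n)]  # every element equals itself
    for (i, j), value in judgments.items():
        a[i][j] = float(value)
        a[j][i] = 1.0 / value
    return a


# Hypothetical judgments on the 1-9 scale: element 0 is 3 times element 1
# and 5 times element 2; element 1 is 2 times element 2.
A = reciprocal_matrix(3, {(0, 1): 3, (0, 2): 5, (1, 2): 2})
```

Only the judgments above the diagonal are elicited; the entries below it follow from the reciprocal relation.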
Every person is more familiar with language when making distinctions in the strength of their feelings
than they are with numbers. In the AHP the semantic scale for making homogeneous comparisons uses
the concepts of: equal, between, moderate, between, strong, between, very strong, between and extreme.
'Between' indicates compromise between the two values on both sides of the word. The values 1-9 are
assigned to the concepts sequentially starting with equal. (The fundamental scale 1-9, which is a very
small part of the scale 1 to ∞, may be viewed as a scale for comparing perturbations of the identity
element.)
Why not use a power scale $x, x^2, \ldots, x^n$ to match the semantic scale [22] instead of the 1-9 scale?
Isn't it better for preserving transitivity of preferences to use a power scale? Semantics have been found
to be very effective in producing a derived scale that matches known results. In using a power scale or
some other scale with these semantics there is the additional problem of choosing a base. Any base
chosen has to be justified on some empirical ground and can change from problem to problem. It is well
known from cognitive psychology that people are unable to make accurate comparisons past a certain
upper bound. Any scale that extends beyond that upper bound (and power scales do so very rapidly) is
likely to be inaccurate. When applied to physical phenomena beyond our ability to perceive or respond
to, the scale would lead to failure. Again we note for emphasis that there is no need to extend the scale;
instead, one should apply clustering as described earlier. Note that the values are too close in the case of
homogeneous clusters to be able to distinguish between a linear and a nonlinear approximation in a
small interval. The advantage of a power scale is that it would better enable one to enforce transitivity
(A > B, B > C imply A > C). The disadvantage is the sacrifice in validity because such a scale jumps too
far out of range too fast to be of any use when one tries to tie it to human perceptions. With the 1-9
scale one improves transitivity by using a method of consistency improvement on a tighter arithmetic
scale rather than quickly expanding past the range of perception (and practicality) with a geometric
power scale. This is best seen by considering an element and its reciprocal transpose in the matrix. One
may overestimate the value, but the reciprocal entry underestimates it. Both are used to compute the
eigenvector which captures intransitivities in the derived scale. This is far better than guessing at
values from a power scale, which improves transitivity but sacrifices the accuracy that is captured through
redundancy. If instead one were to use both a power scale and consistency improvement, the result would be a
derived scale which, despite transitivity, is inaccurate as it must conform to the wider ranging power
scale.
AHP questions (to elicit judgment) differ from utility theory questions. Because scales are needed to
trade off units between two attributes, utility theory must answer the question: how many units of one
attribute can be traded off with how many units of another? In the process of weighting the alternatives,
scaling constants are derived to establish the final ranks of the alternatives [21]. In the AHP the question
generally is: which of two attributes (alternatives) is more important, preferred or likely with respect to a
higher level attribute? Forming ratios is identical with taking the lesser criterion (alternative) as the unit
and estimating the other as a multiple of it. The process is repeatedly applied to implicitly trade off units
for every pair of elements throughout the structure in the AHP and to derive the scale for trading off a
unit of one criterion against a unit of the other. The scale derived from the comparisons yields a priority
scale that simultaneously trades units of all the elements in the comparison set.
The AHP need not carry out the full set of $n(n-1)/2$ comparisons, although redundancy is desirable to
produce a valid scale (a good approximation to an underlying ratio scale). Yet there are instances where
one may want to reduce the number of comparisons to near the minimum, $n - 1$, needed to derive a
ratio scale. Several ways have been proposed to do this. One is to automatically replace a judgment in the
$(i, j)$ position by the geometric mean of all ratios that yield $a_{ij}$ in the consistency relation $a_{ij} a_{jk} = a_{ik}$ and
then derive the scale. Another way is to put zeros for the unknown judgments [19-21].
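
A minimal sketch of the first approach follows, assuming that the ratios that yield $a_{ij}$ are the products $a_{ik} a_{kj}$ over every intermediate element $k$ for which both judgments are known (by reciprocity this equals $a_{ik}/a_{jk}$). The function name `fill_missing`, the use of `None` for a missing judgment and the single-pass strategy are illustrative assumptions, not the procedure of the cited references.

```python
import math


def fill_missing(a):
    """Fill entries marked None with a geometric mean of indirect ratios.

    For a missing a[i][j], every k with known a[i][k] and a[k][j] gives the
    estimate a[i][k] * a[k][j]; the geometric mean of these estimates is used.
    Entries with no usable k are left as None (single pass only).
    """
    n = len(a)
    filled = [row[:] for row in a]
    for i in range(n):
        for j in range(n):
            if a[i][j] is None:
                products = [a[i][k] * a[k][j] for k in range(n)
                            if k != i and k != j
                            and a[i][k] is not None and a[k][j] is not None]
                if products:
                    mean_log = sum(math.log(p) for p in products) / len(products)
                    filled[i][j] = math.exp(mean_log)
    return filled


# Hypothetical 3 x 3 case using only the minimum n - 1 = 2 judgments.
partial = [[1.0, 2.0, None],
           [0.5, 1.0, 3.0],
           [None, 1.0 / 3, 1.0]]
completed = fill_missing(partial)  # completed[0][2] is close to 2 * 3 = 6
```

With only $n - 1$ judgments the completed matrix is consistent by construction, which is why the redundancy of additional judgments is what carries information about validity.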
Paired instead of multiple comparisons are used because the object is to derive a ratio scale of relative
values. The redundancy of paired comparisons helps improve the validity of the resulting ratio scale, and
draws the maximum information from the judgments. Inconsistency measurement can be used to
determine which judgment needs to be reconsidered in view of the other judgments, and to
calculate a more consistent value that may be adopted if it is acceptable to the judge.

4.1. Types and uses of dominance in making comparisons

Dominance is the basic link between relations among qualities, and corresponding relations among
magnitudes associated with these qualities. Dominance requires a relative criterion or goal in order for
one entity to dominate another. Dominance precedes and is needed to derive utility. To determine
dominance, the following question is asked: Which of each pair of given elements dominates the other,
and how much more important (dominant) is it?
Although genetics plays an important part in shaping personalities and inclinations, humans need
knowledge and experience to form ideas, feelings and preferences. In total, these are used to judge what
is familiar, important, desired, or likely. Human memory is crucial to this process, however accurate or
inaccurate it may be. When faced with choices, humans tend to silently ask: Do I like this? Is it what I
want? Is it what I need? Is it important for this purpose? How important is the purpose itself?
There are different ways to ask the question of dominance in the AHP. Relative to conditions, one
needs to ask which is more important; relative to opportunities, one needs to ask which is more
preferred; relative to possibilities, one needs to ask which is more likely. In some problems the conditions
arise in connection with space, matter, and energy. They offer distinctions between sizes, shapes, density,
frequency, and others. Dominance applies in social, political, and behavioral areas. In fact, dominance is
such an essential part of human thinking, that it is difficult to find where it is not used, except perhaps in
the arts where one observes quietly. Even then some feelings tend to be dominant. Dominance is related
to the stimulation and firing of neurons, and is how humans quantify the strength of response to diverse
stimuli [41]. It is this firing that people try to verbalize in the form of judgment. In a decision some or all
of the questions about importance, preference and likelihood may be involved. However, each would be
applied to all the elements in a level of the hierarchy as it relates to the preceding level.
Dominance is useful in other areas that require measurement, such as in predicting the outcome of
interactions of natural phenomena, or in predicting the outcome of decisions based on preferences. In
passing, we note that to predict the outcome of competition, one does not need to answer questions
involving utility, but the concept of dominance is natural for that purpose. The AHP has been used on
numerous occasions to successfully predict the outcome of presidential elections, games, sports and other
forms of competition. Given an outcome in the AHP, one asks: Which is more likely to be the cause of
such an outcome and how important is this cause when compared with that cause? Conversely, given a
cause one asks which is the more likely effect or outcome of this cause. Such questions can only be
answered with respect to a criterion one has in mind. There are also problems involving natural
conditions imposed on a decision. In all such problems, the outcome is not determined by the preference
or utility of the judge, but by surmising the importance or dominance of the factor involved.

4.2. Dominance among criteria

It has been argued that one cannot assign importance weights to criteria without knowledge of the
ranges of the particular alternatives. The vitamin example shows that this is not a valid argument.
People compare criteria most frequently without consulting the existing alternatives and their measurements.
The weakness arises from focusing exclusively on the absolute measurement of alternatives one at
a time, giving rise to the belief that it is the only way to do ranking. Consider an arithmetically illiterate
person who is entirely unschooled in numbers, scales and arithmetic yet is a successful decision maker.
Such an individual can say, without any knowledge about how to scale the alternatives, not only which
alternative is more important on a criterion, but also which criterion is more important, and does it
correctly, not accidentally. We believe that he does it by comparing them.
Note that people can correctly label one attribute as more important than another in an alternative.
For example, one can say that a particular person is a better politician than a scientist. It can even be
noted that this person is a much better politician than a scientist or that (s)he is about as good a
politician as (s)he is a scientist. In the AHP criteria weights are often established with respect to a goal
and independently from the alternatives. From the standpoint of absolute measurement, multiplying the
relative weights of the alternatives by the weights of the criteria can be interpreted as a process of
rescaling the criteria weights. This is analogous to the vitamin example where the ranges of the
alternatives were irrelevant in deciding on the weights of the criteria. When the criteria derive their
importance from the alternatives as if they are emergent properties of a particular set of alternatives, the
AHP establishes their weights in terms of these alternatives. The first is a top down approach and the
second is a bottom up approach. It is interesting to note that when alternatives are tied or nearly tied
relative to the more important criteria, the less important criteria play a decisive role in ranking them.
Researchers have compared the use of dominance information with other forms of data such as
proximity, profile, and conjoint. In making dominance comparisons, the question of what dominates what
must be asked with regard to a third element, a property they have in common or a goal they serve.
Shepard writes [44],
Instead of giving 'i dominates j' the geometrical interpretation 'i falls beyond j in a particular direction', one can
give it the alternative geometrical interpretation 'i falls closer than j to a particular ideal point'. These two
interpretations become equivalent in the special case in which any ideal points move out sufficiently far beyond
the periphery of the configuration of the remaining points.
He also says there is a connection between proximity and dominance; for the first, closeness is taken
between elements alone whereas in the second their closeness is related to an ideal.
Coombs [9] proposed that some combination of attributes is 'ideal' to the decision maker who
compares other elements in the decision with them as a frame of reference. The ideal as a standard helps
to update what is important from experience and use it to evaluate the next choice to make (relative to
what has been experienced and not to the ideal).
Experienced individuals gather and organize information (by repetition) about the impact of certain
criteria in the satisfaction of a higher criterion or goal: the desired ideal. There are two models for
interpreting how criteria are assessed as to importance. The first is the $(w_i/w_j)$ model which assumes the
elements in a paired comparison are measured in the mind on some inborn or acquired ratio scale and
the ratios are then formed. To justify this, imagine the goal or criterion is divided according to the
contributions of the criteria into ranges of degrees of fulfillment. These ranges cover such intensities as:
negligible, moderate, strong, very strong, and extreme. A geometric scale is then used to assign values to
the ranges and ratios are formed from these values.
The second is the $(w_i/w_j)/1$ model mentioned before, in which the numerator is an absolute number,
a multiple of the unit denominator. The goal is divided into ranges of perception, as in the fundamental
scale of the AHP, as multiples of that unit, and a range is chosen and value assigned to the larger
element as a multiple of that unit. Reciprocal comparison in the AHP is the tradeoff of one unit against
another. This rationale explains how a person with no knowledge of actual weights, can hold a stone in
each hand, and judge how many times heavier one is over the other. A person can also ably answer: How
much more important is it for one to study than to do physical exercise to do well on a test? We said
before that when there are absolute scales they do not automatically signify importance and must be
interpreted in relative terms through comparisons. The ability to posit a criterion or goal in the mind,
and make comparisons relative to that goal is what people do. Experience indicates that people can do
this successfully and willingly without being coached. Schoemaker and Waid [43] have compared the
validity of the results obtained through the AHP with those obtained by other methods and found that
direct judgment comparisons give close results on their respective scales.

4.3. The measurement of inconsistency and the principal right eigenvector

The AHP deals with consistency explicitly because in making paired comparisons, just as in thinking,
people do not have the intrinsic logical ability to always be consistent. Thus if we can identify how
serious this inconsistency is and where it can be improved, we can improve the quality of a decision. In
the AHP we both locate the inconsistency and suggest the optimal value to improve it. The suggested
value or some value in that direction may be adopted if it is compatible with the overall understanding.
The measurement of inconsistency and the derived scale must be structurally linked in order to
determine the most inconsistent judgments. Otherwise a general measure of inconsistency given in
statistical terms cannot be brought to focus on particular judgments as one can with the eigenvalue and
the eigenvector.
The eigenvector is associated with the idea of dominance and consistency of judgments. It is the only
way to capture inconsistency in the judgments [37]. The degree of inconsistency is measured by the
deviation of the principal eigenvalue of the matrix of comparisons from the order of the matrix. We use
the right eigenvector, and not the left, because in the paired comparisons the smaller of two elements
serves as the unit and the larger element is given as a multiple of that unit. It is not possible to take the
larger of two elements and ask what fraction of it is the smaller one without first using the latter to
decompose the former. Thus we can only say what dominates what. What is dominated has the reciprocal
value. A dominance matrix also has a left eigenvector that measures 'dominated' through the forced
reciprocal relation. In principle one would like another left eigenvector that measures 'dominated'
directly by comparing the smaller element with the larger one, thus eliciting new kinds of information in
the comparisons, but as just noted this is not possible. The principal left and right eigenvectors of any
matrix are structurally linked. If the matrix is consistent, left and right eigenvectors are elementwise
reciprocal. But this can also hold true in special cases when the matrix is inconsistent [32].
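
As a rough illustration of how the principal right eigenvector and the deviation of the principal eigenvalue from the order of the matrix can be computed, the sketch below uses plain power iteration. The function names, the iteration count and the sample matrices are assumptions made for illustration. The index $(\lambda_{\max} - n)/(n - 1)$ is the consistency index commonly used with the AHP and is not stated in this section.

```python
def principal_eigen(a, iterations=200):
    """Approximate the principal right eigenvector (normalized to sum to 1)
    and the principal eigenvalue of a positive matrix by power iteration."""
    n = len(a)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        aw = [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(aw)              # w sums to 1, so sum(A w) estimates lambda_max
        w = [x / lam for x in aw]  # renormalize so the entries sum to 1
    return w, lam


def consistency_index(lam, n):
    """Deviation of the principal eigenvalue from the order n, scaled by n - 1."""
    return (lam - n) / (n - 1)


# A consistent matrix built from hypothetical weights (0.6, 0.3, 0.1).
consistent = [[1, 2, 6], [1 / 2, 1, 3], [1 / 6, 1 / 3, 1]]
w_c, lam_c = principal_eigen(consistent)

# A slightly inconsistent matrix: 3 * 2 differs from the judged value 5.
judged = [[1, 3, 5], [1 / 3, 1, 2], [1 / 5, 1 / 2, 1]]
w_j, lam_j = principal_eigen(judged)
ci_j = consistency_index(lam_j, 3)
```

In the consistent case the eigenvalue equals the order of the matrix and the index is zero; in the inconsistent case the eigenvalue exceeds the order and the index is positive.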
When the judgments are inconsistent, the need for the principal right eigenvector and for no other
derived scale has been mathematically proved in two ways. The first uses a theorem in matrix theory
which says that a small perturbation of a matrix leads to a small perturbation of its eigenvalues and left
and right eigenvectors. This demands that the principal eigenvector setting obtained algebraically in the
consistent case should go over to a principal eigenvector setting in the inconsistent case. The other way is
that due to inconsistency, all powers of the matrix of judgments participate in determining the priority
vector. When the contribution of each power, obtained by normalizing its row sums, is averaged over the
resulting vectors, the principal right eigenvector is obtained on passing to the limit [35]. Eigenvector ranking is
arrived at deductively without additional assumptions such as those made by least squares and logarithmic
least squares methods, which superimpose a criterion to minimize errors and derive ranks that do not
relate to inconsistency and sometimes lead to different rankings, which in the case of least squares may
not be unique [31].

5. Synthesis

In multicriteria decision making, synthesis of the rankings of the alternatives with respect to the
several criteria requires the use of mathematics. In the AHP hierarchic composition uses an additive
(linear) function and not a multiplicative [36] or some other kind of nonlinear function to produce the
overall rank of the alternatives because the most complex decision making dependencies are represented
by the supermatrix of the AHP for which composition is obtained by raising that matrix to infinite power
leading to an additive function. In a hierarchy dependence only occurs from level to level and the
elements in a level are also independent. In that case hierarchic composition is a special case of network
dominance composition and leads to an additive function from one level to the next. Multiple levels of
the hierarchy give rise to a multilinear form (a tensor) that does not behave as an additive linear
function, but is nonlinear.
Another argument derives from the questionable belief that one can fix a structure to model a
problem and adopt manipulations that satisfy some axioms. In the AHP more of the burden is placed on
a comprehensive structure adjusted to incorporate new information and also new expectations about the
priorities represented in terms of concrete criteria and subcriteria. It is not reasonable to think of
computational operations as some kind of magic to capture every important nuance one would like to see
come out. The outcome of the manipulations arising from paired comparisons in fact has far reaching
mathematical properties. Depending on the breadth and depth of the hierarchy, a multilinear form can
be used to come as closely as desired to any preconceived underlying answer.
In the supermatrix approach, the limiting priorities are given by a general nonlinear form with an
infinite number of terms each of which is an infinite product of linear variables. There can be no general
functional form for the computations that captures all expectations. No matter how general a composi-
tion function one may adopt, there would always be examples calling for a still more general one to
capture a higher order nonlinearity not captured by the existing method. At best we can derive
multilinear forms that come arbitrarily closely to any expectation. (Polynomials, and more generally
multinomials are dense in the space of continuous functions, and are a special case of multilinear forms
in n-dimensional space.)

5.1. Validity of a hierarchy - Independence-weak dependence

Consider a three level hierarchy with a goal, criteria and alternatives. Assume that the criteria are
dependent among themselves. The supermatrix representation is given by

\[
W = \begin{pmatrix} 0 & 0 & 0 \\ X & Y & 0 \\ 0 & Z & I \end{pmatrix}
\]

where X is the column vector of priorities of the criteria with respect to the goal, Y is the matrix of
column eigenvectors of interdependence among the criteria, and Z is the matrix of column eigenvectors
of the alternatives with respect to each criterion. W is a column stochastic matrix obtained by
appropriate weighting of the matrices corresponding to interactions between levels.
The k-th power of W that captures rank dominance along paths of length k is given by

\[
W^k = \begin{pmatrix} 0 & 0 & 0 \\ Y^{k-1} X & Y^k & 0 \\ Z \sum_{i=0}^{k-2} Y^i X & Z \sum_{i=0}^{k-1} Y^i & I \end{pmatrix}
\]
and the priorities are given by the limit of W k as k tends to infinity. It is given by

\[
W^\infty = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ Z(I-Y)^{-1} X & Z(I-Y)^{-1} & I \end{pmatrix}.
\]

Note that if Y = 0, and hence the criteria are independent among themselves, the weights of the
alternatives are given by ZX, the result of the additive model. Also when Y is not zero, but is a small
perturbation in a neighborhood of the null matrix, the additive model would still be a good representa-
tion of the limiting priorities and hierarchic composition is still valid. It is only when Y is a large
perturbation away from zero that the supermatrix solution should be used. In general, unless there are
strong dependencies among the criteria, the additive model is an adequate estimate of the priorities in a
hierarchy. Otherwise, the criteria should be redefined to ensure the independence of the new set.
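
The comparison between the limiting priorities $Z(I-Y)^{-1}X$ and the additive result $ZX$ can be checked numerically. The sketch below is illustrative only: the function names and all numbers are hypothetical, and the inverse is approximated by the series $\sum_k Y^k$, which assumes that the powers of $Y$ tend to zero.

```python
def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def limiting_priorities(x, y, z, tol=1e-12, max_terms=10_000):
    """Approximate Z (I - Y)^{-1} X by summing Z Y^k X over k = 0, 1, 2, ..."""
    total = [0.0] * len(x)
    term = x[:]                        # current term Y^k X, starting with k = 0
    for _ in range(max_terms):
        total = [t + u for t, u in zip(total, term)]
        term = mat_vec(y, term)
        if max(abs(u) for u in term) < tol:
            break
    return mat_vec(z, total)


# Hypothetical data: two criteria, two alternatives.
x = [0.7, 0.3]                         # criteria priorities with respect to the goal
z0 = [[0.7, 0.2], [0.3, 0.8]]          # alternatives under each criterion (columns sum to 1)
additive = mat_vec(z0, x)              # the additive model Z X with Y = 0

# Weak dependence among criteria; the blocks are weighted so that the
# columns of the stacked block (Y over Z) still sum to 1.
eps = 0.05
y = [[0.0, eps], [eps, 0.0]]
z = [[(1 - eps) * v for v in row] for row in z0]
limit = limiting_priorities(x, y, z)
```

For this small perturbation the limiting priorities stay within about 0.01 of the additive result, in line with the remark that hierarchic composition remains adequate when the dependence among criteria is weak.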
To test for the mutual independence of the criteria, one proceeds as follows. Construct a zero-one
matrix of criteria against criteria using the number one to signify dependence of one criterion on
another, and zero otherwise. A criterion need not depend on itself, as for example, an industry may not
use its own output. For each column (criterion) of this matrix construct a pairwise comparison matrix
only for the dependent criteria, derive an eigenvector and augment it with zeros for the excluded criteria.
If a column is all zeros, then assign a zero vector. The question in the comparison would be: For a given
criterion, which of two criteria depend more on that criterion?
In Multiattribute Utility Theory (MAUT) [15,24] there is more than one way to do marginal tradeoffs
for the criteria. Examples are probability equivalent and certainty equivalent tradeoffs without a unique
way to determine the utility function on which the decision rests. This seems to require the assistance of
an expert. Concern about it was expressed by MAUT practitioners McCord and de Neufville [29].

5.2. The validity of the derived rank

The outcome of hierarchic composition is a valid indicator of rank because it is derived from
mathematical considerations of dominance in preference. The eigenvector provides a unique representa-
tion of the rank of the elements compared when inconsistency is allowed. Similarly, hierarchic composi-
tion yields a unique representation of rank by considering priority dominance in a supermatrix of the
hierarchy. Additive composition is obtained as the limiting form of powers of this matrix [29].

5.3. Rank preservation and reversal

Relative measurement of alternatives introduces two additional properties unrecognized by absolute
measurement approaches: the number of alternatives and the portion of the total that an alternative has
for each criterion (which cannot be inserted as criteria in the traditional way that new information is
incorporated). Both of these properties change when an alternative is added or deleted, in some cases
causing rank reversal. According to Arthur L. Blumenthal [5],
"Absolute judgment is the identification of the magnitude of some simple stimulus,..., whereas comparative
judgment is the identification of some relation between two stimuli both present to the observer. Absolute
judgment involves the relation between a single stimulus and some information held in short-term memory -
information about some former comparison stimuli or about some previously experienced measurement scale ....
To make the judgment, a person must compare an immediate impression with memory impression of similar
stimuli ...".
Measurement on a scale with a unit is called absolute measurement. Alternatives are measured on
such a scale one at a time. Each of them is examined independently of how many other alternatives there
are and how large or small their measurements may be. In this case adding or deleting alternatives has
no effect on the ranks of the alternatives already measured unless criteria are added or deleted by the
added or deleted alternatives.
In relative measurement, the alternatives must be compared and a scale derived from the compar-
isons. The scale is only known at the end of this process not at the beginning. A new alternative must be
compared with the existing alternatives and not against a predetermined scale. Rank reversals occur
because the measurements are made in relative terms and hence give rise to changes in the previous
values of the alternatives. The composite ranks of the alternatives depend on what other alternatives
there are and also on those that are added or deleted. Thus rank reversal is an intrinsically legitimate
phenomenon. It has also been observed to occur in practice. From this we conclude that in multicriteria
decisions, absolute measurement is insufficient to do all types of rankings particularly those that should
lead to rank reversal.
We have noted that according to psychologists, the mind does:
1) absolute comparisons with standards established in memory from previous experience (living in the
past) whereby the criteria are decomposed into intensities such as: high, medium, low and the
alternatives one encounters are rated one at a time in terms of these intensities. Absolute measurement
in the AHP and in Utility Theory is based on this idea of one at a time assessment, and
2) relative measurement, that does not require prior standards.
Normalization in relative measurement is the vehicle to account for rank reversal in the presence of
synergy. The weight of an alternative, when compared to a particular criterion, is given by

\[
w_i = \sum_{j=1}^{m} w_{ij} x_j \Big/ \sum_{i=1}^{n} w_{ij}
\]
where $w_i$ depends on $w_{ij}$, the weight of the same alternative with respect to criterion $j$, $n$ is the number
of alternatives, $m$ is the number of criteria, and $x_j$ is the weight of the $j$-th criterion. In absolute
measurement only the numerator is used and in that case $w_i$ only depends on $m$, and on $x_j$. But in
relative measurement $w_i$ depends on all four factors. Introducing an alternative increases $n$ and also
increases the number of $w_{ij}$ and therefore changes $w_i$. Some $w_i$ are changed differently than others, and
rank reversal occurs. A similar argument applies if alternatives are deleted.
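
The effect of the normalizing denominator can be seen in a small numerical sketch. The function name `distributive_priorities`, the equal criteria weights and the measurements are hypothetical values chosen for illustration and are not taken from the paper.

```python
def distributive_priorities(scores, weights):
    """Composite priorities with each criterion column normalized to sum to 1.

    scores[i][j] plays the role of w_ij (alternative i on criterion j) and
    weights[j] plays the role of x_j.
    """
    n, m = len(scores), len(weights)
    column_sums = [sum(scores[i][j] for i in range(n)) for j in range(m)]
    return [sum(weights[j] * scores[i][j] / column_sums[j] for j in range(m))
            for i in range(n)]


weights = [1 / 3, 1 / 3, 1 / 3]
three = [[1, 9, 8],          # alternative A
         [9, 1, 9],          # alternative B
         [1, 1, 1]]          # alternative C
four = three + [[9, 1, 9]]   # alternative D, an exact copy of B

before = distributive_priorities(three, weights)
after = distributive_priorities(four, weights)
b_leads_before = before[1] > before[0]   # B ranks above A with three alternatives
a_leads_after = after[0] > after[1]      # A ranks above B once the copy is added
```

Adding the copy changes the column sums by different amounts under each criterion, so the priorities of A and B shift unequally and their order reverses.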
Absolute measurement cannot take into consideration the effect of new criteria that arise from the
synergy between the alternatives taken as a group, such as 'manyness' that indicates how many
alternatives there are, or the effect of how well or badly each alternative measures on the criteria. The
reason why it cannot, is that 'how many' varies with each new alternative. Any alternative that has
already been assessed cannot now be reassessed as that would imply the dependence of the alternatives
on each other. In relative measurement, the effect of synergy on the criteria is uniformly captured
through normalization across all the criteria to adjust their weights and on the alternatives by including it
as a separate criterion when for example special emphasis is placed on manyness itself as a criterion.
Rank reversal that results from this synergy is an inevitable phenomenon that is not allowed for by
absolute measurement. Thus one cannot make a categorical statement about rank preservation in all
situations, unless one's ranking horizon is limited to absolute measurement which is only one of the two
modes of operation of the mind.
Since relative measurement can be used to account for rank reversal, the question is whether it can
also be used to preserve rank in situations where it is determined in advance that the presence of other
alternatives should have no effect on the ranks of the existing alternatives. The reply is in the affirmative
with regard to the number of alternatives by not normalizing but by using the ideal mode described
below. In this manner one also prevents rank reversal if the relative measurement of an added or deleted
alternative is smaller than the relative measurements of the other alternatives. Such an alternative is
loosely called irrelevant.
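
A sketch of how rank can be preserved follows. It assumes that the ideal mode divides the measurements under each criterion by the largest of them instead of by their sum; this reading, the function name `ideal_priorities` and the numbers are assumptions made for illustration.

```python
def ideal_priorities(scores, weights):
    """Composite scores with each criterion column divided by its largest entry."""
    n, m = len(scores), len(weights)
    column_max = [max(scores[i][j] for i in range(n)) for j in range(m)]
    return [sum(weights[j] * scores[i][j] / column_max[j] for j in range(m))
            for i in range(n)]


weights = [1 / 3, 1 / 3, 1 / 3]
three = [[1, 9, 8],          # alternative A
         [9, 1, 9],          # alternative B
         [1, 1, 1]]          # alternative C
four = three + [[9, 1, 9]]   # alternative D, an exact copy of B

ideal_before = ideal_priorities(three, weights)
ideal_after = ideal_priorities(four, weights)
# The largest entry under each criterion is unchanged by the added
# alternative, so the scores of A, B and C stay exactly as they were.
unchanged = all(abs(p - q) < 1e-12
                for p, q in zip(ideal_before, ideal_after[:3]))
```

Under this assumption, any added alternative that does not exceed the current largest measurement under a criterion leaves the existing scores, and hence their order, untouched.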
Economists have designed real life experiments that lead to rank reversal and have not yet found an
explanation for it in terms of utility theory. They write:
"One of the most puzzling paradoxes in decision theory is the phenomenon of preference reversal.
This phenomenon seems to contradict the transitivity axiom, a cornerstone of expected utility theory"
[47]; "The preference reversal phenomenon which is inconsistent with the traditional statement of
preference theory remains" [20]; "This reversal procedure has profound operational significance. It
places neoclassical economics upon a more satisfactory behavioral basis, so that all axioms become
testable hypotheses about observable variables" [9]; "In experimental studies, the behavioral pattern
called the preference reversal phenomenon remains robust in spite of the many different kinds of
improvements in experimental design" [8]; "The preference-reversal phenomenon has been shown in
many studies, but its causes have not been established. The results show that preference reversal cannot
be adequately explained by violations of independence, the reduction axiom, or transitivity. The main
cause of preference reversal is the failure of procedure invariance" [52].
The theory of absolute measurement can mislead by forcing rank preservation. Many such violations
in practice have been cited by specialists in Utility Theory [41]. Frequently, these arise from the number
of copies or from actual measurement of alternatives that affect an earlier ranking. Counterexamples
to theories that rank alternatives individually have appeared, showing that copies, phantoms, and decoys
can cause rank reversal. A phantom alternative, well known to marketers, is a higher priced product
advertised as available only to cause people to shift from a lower priced to an intermediate priced
product. Thus a hypothetical alternative causes a change in people's choice. A decoy is a one-of-a-kind
item that is often sold out but is intended to attract customers' attention, for example in a sale.
Regularity is a condition of choice theory that has to do with rank preservation. R. Corbin and
A. Marley [13] provide an example that,
"concerns a lady in a small town, who wishes to buy a hat. She enters the only hat store in town, and
finds two hats, A and B, that she likes equally well, and so might be considered equally likely to buy.
However, now suppose that the sales clerk discovers a third hat, C, identical to A. Then the lady may
well choose hat B for sure (rather than risk the possibility of seeing someone wearing a hat just like hers),
a result that contradicts regularity".
Here is a plausible set of judgments in the case of the two hats A and B, with A preferred to B.
Adding C, which is a copy of A, changes the preference to B over A with respect to uniqueness and thus
makes it the more desired choice overall.

Two hats:

            Style (0.4)                  Uniqueness (0.6)
        A      B      Priority       A      B      Priority
A       1      3      0.75           1      1      0.5
B       1/3    1      0.25           1      1      0.5

A = 0.75 × 0.4 + 0.5 × 0.6 = 0.6,
B = 0.25 × 0.4 + 0.5 × 0.6 = 0.4.

Three hats:

              Style (0.4)                       Uniqueness (0.6)
        A      B      C      Priority       A      B      C      Priority
A       1      3      1      0.42           1      1/6    1      0.125
B       1/3    1      1/3    0.16           6      1      6      0.75
C       1      3      1      0.42           1      1/6    1      0.125

A = 0.42 × 0.4 + 0.125 × 0.6 = 0.238,
B = 0.16 × 0.4 + 0.75 × 0.6 = 0.514,
C = 0.42 × 0.4 + 0.125 × 0.6 = 0.238.

If another copy of A is introduced, paired comparisons would again make A more preferred to B,
whereas one at a time comparisons maintain the dominance of B over A.
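The hat example can be checked numerically. The following Python sketch is an illustration only: it assumes that the local priorities are the principal eigenvector of each comparison matrix (approximated here by power iteration) and that they are combined in the distributive mode; the function names `priorities` and `synthesize` are hypothetical. The exact priorities of the three-hat Style matrix are 3/7, 1/7 and 3/7, so the composite values differ slightly from the rounded figures printed above, but the reversal from A to B is the same.

```python
def priorities(matrix, iterations=100):
    """Approximate the principal eigenvector by power iteration, normalized to sum to 1."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        w = [x / total for x in w]
    return w


def synthesize(criterion_weights, local_priorities):
    """Distributive mode: weighted sum of the normalized local priorities."""
    n = len(local_priorities[0])
    return [
        sum(cw * lp[i] for cw, lp in zip(criterion_weights, local_priorities))
        for i in range(n)
    ]


weights = [0.4, 0.6]  # Style, Uniqueness (from the example above)

# Two hats: A is preferred on style, both are equally unique.
style_2 = [[1, 3], [1 / 3, 1]]
unique_2 = [[1, 1], [1, 1]]
two_hats = synthesize(weights, [priorities(style_2), priorities(unique_2)])

# Three hats: C copies A, so B becomes strongly preferred on uniqueness.
style_3 = [[1, 3, 1], [1 / 3, 1, 1 / 3], [1, 3, 1]]
unique_3 = [[1, 1 / 6, 1], [6, 1, 6], [1, 1 / 6, 1]]
three_hats = synthesize(weights, [priorities(style_3), priorities(unique_3)])

print("two hats   (A, B):   ", [round(x, 3) for x in two_hats])
print("three hats (A, B, C):", [round(x, 3) for x in three_hats])
```

With two hats A ranks first; with three hats B ranks first, because the judgments on uniqueness change once a copy of A is present.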
Another example of reversal is when a state has 60% conservatives and 40% liberals. If there is one
liberal and one conservative candidate the result favors the conservative 60-40%. If a third less popular
conservative candidate is brought in, the conservative vote is split and the liberal candidate becomes the
winner. Vote splitting is a time honored political tactic.
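A small arithmetic sketch of the vote-splitting case follows. The 35/25 division of the conservative vote is an assumed figure chosen only for illustration (the text states only that the third candidate is less popular), and `plurality_winner` is a hypothetical helper.

```python
def plurality_winner(shares):
    """Return the candidate with the largest vote share."""
    return max(shares, key=shares.get)


# Two candidates: the electorate divides 60-40.
two_way = {"conservative": 60, "liberal": 40}

# Assumed split: a second, less popular conservative draws 25 of the 60 points.
three_way = {"conservative_1": 35, "conservative_2": 25, "liberal": 40}

print(plurality_winner(two_way))    # conservative
print(plurality_winner(three_way))  # liberal
```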
In the early 1980s Belton and Gear [4] gave a numerical example purporting to show that the
AHP violates the utility theory mandate of rank preservation. They concluded that the AHP leads to
arbitrary ranking of alternatives. One can show that if copies, as in the hat case, cause a change in rank,
then so can small perturbations of copies, perturbations of these, and so on. Thus an arbitrary
alternative can also cause a change in rank. A number of other examples are included in [36], where
wrongly enforcing rank preservation every time has been shown by several authors to go against what
is observed in practice.
But there are situations where rank preservation is proper. In admitting students to a school, the rank
of students already accepted and notified for the term should remain unchanged if a new application is
added to the collection. This is an instance where standards are applied to each alternative individually
(thus minimizing regret by not changing an earlier decision). Absolute measurement prohibits the
influence of other alternatives as a group from affecting a rating of the alternatives one at a time, as also
does the ideal mode in relative measurement.
The AHP has been extended to include an ideal mode by dividing the weights of the alternatives on
each criterion by the largest weight among them. In this manner, a new alternative would be compared
only with the 'ideal' alternative for that criterion. There can be no rank reversal when irrelevant
alternatives are added [17]. In addition, a best alternative need not be unique. The distributive mode
with normalization is used when uniqueness is important (the hat example), where rank can depend on
properties of the alternatives as a group, and in resource allocation and planning where the relative rank
of all the alternatives must be considered.
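The difference between the two modes can be sketched in a few lines of Python. The scores below are hypothetical ratio-scale values for three alternatives A, B, C on three equally weighted criteria, chosen only for illustration; a fourth alternative D is then added as an exact copy of B. The function names `distributive`, `ideal` and `overall` are likewise assumptions of this sketch.

```python
def distributive(scores):
    """Normalize the scores on one criterion so that they sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def ideal(scores):
    """Divide the scores on one criterion by the largest score among them."""
    top = max(scores)
    return [s / top for s in scores]


def overall(criterion_weights, score_table, mode):
    """Weighted sum over criteria of the local priorities produced by `mode`."""
    local = [mode(row) for row in score_table]
    n = len(score_table[0])
    return [
        sum(w * col[i] for w, col in zip(criterion_weights, local))
        for i in range(n)
    ]


weights = [1 / 3, 1 / 3, 1 / 3]

# Hypothetical scores; rows are criteria, columns are alternatives A, B, C.
before = [[1, 9, 1], [9, 1, 1], [8, 9, 1]]
# Add D, an exact copy of B, as a fourth column.
after = [row + [row[1]] for row in before]

dist_before = overall(weights, before, distributive)
dist_after = overall(weights, after, distributive)
ideal_before = overall(weights, before, ideal)
ideal_after = overall(weights, after, ideal)

print("distributive, before:", [round(x, 3) for x in dist_before])
print("distributive, after: ", [round(x, 3) for x in dist_after])
print("ideal, before:       ", [round(x, 3) for x in ideal_before])
print("ideal, after:        ", [round(x, 3) for x in ideal_after])
```

In this sketch the distributive mode ranks B above A before the copy is added and A above B afterwards, whereas the ideal-mode values of A, B and C are unchanged because the largest score on each criterion is unchanged. The ideal-mode composites do not sum to one; they may be renormalized for reporting without affecting rank.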

6. Normative-descriptive [7]

All science is descriptive not normative. It is based on the notion that knowledge is incomplete. It uses
language and mathematics to understand, describe and predict events sometimes as a test of the accuracy
of the theory. Events involve two things: controllable and uncontrollable conditions (e.g. laws),
and people or objects characterized by matter, energy and motion that are influenced by, and sometimes
influence, these conditions. A missile's path is subject to uncontrollable forces like gravity and
controllable forces like the initial aim of the missile, its weight, perhaps the wind, and others. The
conditions are not determined by the objects involved. The idea is to get the missile from A to B by
describing its path with precision.
Economics based on expected utility theory is predicated on the idea that the collective behavior of
many individuals each motivated by self interest determines the market conditions which in turn
influence or control each individual's behavior. In this case both the objects and the conditions are 'up
for grabs' because behavior is subject to rational influences that are thought to be understood. By
optimizing individual behavior through rationality one can optimize the collective conditions and the
resulting system, plus or minus some corrections in the conditions. But conditions are not all economic.
Some are environmental, some social, some political and others cultural. We know little about their
interactions. In attempting to include everything, normative theories treat intangible criteria as tangibles
by postulating a convenient economic scale. It is hard to justify reducing all intangibles to economics in
order to give the appearance of completeness. It is not clear that economic progress solves all problems.
To the contrary, some believe that it can create problems in other areas of human concern.
Multiattribute Utility Theory is essentially a single criterion measurement theory based on utility
theory and economic value. In the final analysis, it is a reductionist theory. All factors must be traded off
in economic terms. Perhaps the reason for this position was technical rather than philosophical. Because
of the behavioral nature of the subject and the presence of intangibles, it did not have many choices for
tools, as it did not know how to deal with priorities. It simply generalized on what economists had
created to accommodate the idea of utility. The axioms of utility theory are assumed to be founded in
behavior, but do not describe the paradoxes encountered nor do they show how to surmount the
difficulties faced in predicting such things as the ups and downs of the economy. Utility theory is
concerned with arithmetic operations and has no axioms about structure.
A normative theory requires criteria established by particular people external to the process of
decision making. Experts often disagree on the criteria to judge the excellence of a normative theory and
the decision resulting from it. For example, a basic criterion of Utility Theory is the principle of
rationality which says that if a person is offered more of that which he values, he should opt to take
more. It is precisely in response to this dictum that Herbert Simon [49] developed his idea of sufficiency
(satisficing). Whenever we are saturated even with a highly valued commodity, there is a cutoff point
where the marginal increase in total value is less than or equal to zero. A theory constructed to satisfy
such a fundamental assumption will encounter difficulties in its applications. In such situations rank
reversals would be appropriate in order to prevent the disadvantages of oversaturation.
The AHP is a descriptive theory in the sense of the physical sciences. It treats people separately from
the conditions in which they find themselves because so far we have no comprehensive and integrated
theory of socio-economic-political-environmental-cultural factors that would enable us to deduce opti-
mality principles for people's behavior. Completeness and optimality imply that there is an underlying
order of which the solution of a problem simply elicits the relevant part of the order. There should be no
surprises by changing rank due to some left out alternative particularly if it is 'irrelevant'.
The AHP does not insist that a decision is necessarily a wrong one unless it is done in some prescribed
way and according to some rules. The purpose of the AHP is to assist people in organizing their thoughts
and judgments to make more effective decisions. Its structures are based on observations of how
influences are transmitted and its arithmetic derives from how psychologists have observed people to
function in attempting to understand their behavior. The axioms of the AHP deal with the reciprocal
property, homogeneous comparisons, inner and outer dependence (hierarchies and networks), and
expectations about the outcome and about fulfilling prior commitments with respect to rank.
The AHP begins with the traditional concept of ordinal preference and advances further into
numerical paired comparisons from which a ranking is derived. By imposing a multiplicative structure on
the numbers (a_ij · a_jk = a_ik), the reciprocal condition is obtained. Thus the AHP infers behavioral
characteristics of judgments (inconsistency and intransitivity) from its basic framework of paired compar-
isons. It begins by taking situations with a known underlying ratio scale and hence known comparison
ratios, and shows how its method of deriving a scale uniquely through the eigenvector gives back the
original scale. Then through perturbation it shows that a derived scale should continue through the
eigenvector to approximate the original scale providing that there is high consistency.
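The recovery of a known scale, and its approximate recovery under a small perturbation, can be illustrated as follows. This Python sketch is an assumption-laden illustration rather than the paper's own computation: the scale (0.5, 0.3, 0.2) and the 20% perturbation of one judgment are hypothetical, the eigenvector is approximated by power iteration, and inconsistency is summarized by the consistency index (λmax - n)/(n - 1) commonly used with the AHP.

```python
def principal_eigen(matrix, iterations=200):
    """Power iteration: returns (eigenvector summing to 1, estimate of lambda_max)."""
    n = len(matrix)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(v)  # w sums to 1, so sum(A w) estimates lambda_max
        w = [x / lam for x in v]
    return w, lam


# A known underlying ratio scale (hypothetical values).
scale = [0.5, 0.3, 0.2]
n = len(scale)

# Consistent judgments: a_ij = w_i / w_j, so a_ij * a_jk = a_ik.
consistent = [[wi / wj for wj in scale] for wi in scale]
w_exact, lam_exact = principal_eigen(consistent)

# Perturb one judgment by 20% and keep the matrix reciprocal.
perturbed = [row[:] for row in consistent]
perturbed[0][1] *= 1.2
perturbed[1][0] = 1.0 / perturbed[0][1]
w_pert, lam_pert = principal_eigen(perturbed)

ci = (lam_pert - n) / (n - 1)  # consistency index

print("recovered scale:", [round(x, 4) for x in w_exact])
print("perturbed scale:", [round(x, 4) for x in w_pert])
print("lambda_max:", round(lam_pert, 4), "CI:", round(ci, 4))
```

For the consistent matrix the eigenvector reproduces the original scale and λmax equals n; for the perturbed matrix λmax slightly exceeds n and the derived scale stays close to the original.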
How then do we judge the soundness of a decision theory besides legislating normative standards? In
the AHP the optimality of a decision depends on the value system and experience of the consistent
decision maker and not on the rules and thoughts of experts creating scales. (S)he is assisted by
mathematical procedures that produce the best rank order relative to the structure of his or her decision
framework. In the AHP, as in any activity in life, the decision maker does not need to rely on an expert
but rather becomes an expert through study and practice and the assistance of other decision makers.

7. Conclusions

J. Aczel [1] cites an anonymous quotation which says:


"An economist is someone who cannot see something working in practice without asking whether it would work
in theory".
As a descriptive rather than a prescriptive theory, the AHP will undergo adaptation by practitioners.
Apart from the elicitation of judgment regarding dependence in the AHP, a uniform process is applied
throughout to derive the priorities. The arithmetic operations in the AHP are based on the notion of
dominance that is a prominent aspect of human thinking. These operations are also known in
mathematics, the behavioral sciences, and generalizations in hierarchies and networks. In applications
they give rise to what is known as an additive model in a hierarchy and a compound nonlinear model in a
network. The former leads to a multilinear form, a tensor, which from its definition is nonlinear. With
such forms one can approximate arbitrarily closely to any underlying measurements. Thus in a suffi-
ciently elaborate hierarchy, and more generally throughout the AHP, an additive model is appropriate.
As a descriptive theory, the AHP processes ranking through four general modes: 1) the absolute mode,
to preserve rank or impose known standards on the alternatives, 2) the ideal mode, to preserve rank from
irrelevant alternatives, 3) the distributive mode, to allow rank to change when the number and
measurement of alternatives can affect preference among the alternatives by considering them as a set
rather than one at a time, and 4) the supermatrix mode, to deal with dependence among the criteria or
the alternatives and between the criteria and the alternatives. The extension of the AHP to neural
networks with Fredholm operators as generalizations of the eigenvalue formulation applied in the
context of a hierarchy and a network will appear in the Journal of Mathematical Psychology [44]. My
thanks to David Hauser, Robert Nachtmann, Kirti Peniwati, Ed Wasil, an unknown referee and Sarah
Becker for their help and suggestions in preparing this final version of the paper.

References

[1] Aczel, J., "Why, how and how not to apply mathematics to economics and to other social and behavioral sciences. An
example: Merging relative scores", in: Proceedings of the Seminar Debrecen-Graz, Grazer Math. Ber. 315, 1991.
[2] Bard, J.F., "A multiobjective methodology for selecting subsystem automation options", Management Science 32/12 (1986)
1628-1641.
[3] Bauer, R.A., Collar, E., and Tang, V., The Silverlake Project - Transformation at IBM, Oxford University Press, New York,
1992.
[4] Belton, V., and Gear, T., "On a short-coming of Saaty's method of analytic hierarchies", Omega 11/3 (1983) 228-230.
[5] Blumenthal, A.L., The Process of Cognition, Prentice-Hall, Englewood Cliffs, NJ, 1977.
[6] Buede, D.M., "Software review: Three packages for AHP: Criterium, Expert Choice and HIPRE 3+", Journal of Multi-Criteria
Decision Analysis 1 (1992) 119-121.
[7] Bunge, M., Treatise on Basic Philosophy, Vol. 7, D. Reidel, Boston, MA, 1985.
[8] Chu, Y.-P., and Chu, R.-L., "The subsidence of preference reversals in simplified and marketlike experimental settings",
American Economic Review 80/4 (1990) 902-911.
[9] Clark, S.A., "Revealed independence and quasi-linear choice", Oxford Economic Papers 40/3 (1988) 550-559.
[10] Cook, T., Falchi, P., and Mariano, R., "An urban allocation model combining time series and analytic hierarchical methods",
Management Science 30/2 (1984) 198-208.
[11] Coombs, C.H., A Theory of Data, Wiley, New York, 1964.
[12] Communications of the Operations Research Society of Japan 31/8 (1986).
[13] Corbin, R., and Marley, A.A.J., "Random utility models with equality: An apparent, but not actual, generalization of random
utility models", Journal of Mathematical Psychology 11 (1974) 274-293.
[14] Deturck, D.M., "The approach to consistency in the analytic hierarchy process", Mathematical Modelling 9 (1987) 345-352.
[15] Edwards, W., and Newman, J.R., Multiattribute Evaluation, Series: Quantitative applications in the social sciences, Number
07-026, Sage, Beverly Hills, CA, 1982.
[16] European Journal of Operational Research 48/1 (1990), Special Issue: Decision Making by the Analytic Hierarchy Process:
Theory and Applications.
[17] Forman, E.H., "Multicriteria prioritization in open and closed systems", George Washington University, 1992, forthcoming.
[18] Forman, E.H., "Facts and fictions about the Analytic Hierarchy Process", in: T.L. Saaty, Multicriteria Decision Making, RWS
Publications, Pittsburgh, PA, 1990.
[19] Golden, B.L., Wasil, E.A., and Harker, P.T. (eds.), The Analytic Hierarchy Process: Applications and Studies, Springer-Verlag,
New York, 1989.
[20] Grether, D.M., and Plott, C.R., "Economic theory of choice and the preference reversal phenomenon", American Economic
Review 69/4 (1979) 623.
[21] Hamalainen, R.P., "Computer assisted energy policy analysis in the parliament of Finland", Interfaces 18/4 (1988) 12-23.
[22] Harker, P.T., "Incomplete pairwise comparisons in the analytic hierarchy process", Mathematical Modelling 9 (1987) 837-848.
[23] Harker, P.T., "Alternative modes of questioning in the analytic hierarchy process", Mathematical Modelling 9 (1987) 353-360.
[24] Harker, P.T., and Millet, J., "Globally effective questioning in the Analytic Hierarchy Process", European Journal of
Operational Research 48/1 (1990) 88-97.
[25] Holder, R.D., "Some comments on the analytic hierarchy process", Journal of the Operational Research Society 41 (1990)
1073-1076.
[26] Keeney, R.L., and Raiffa, H., Decisions with Multiple Objectives: Preference and Value Tradeoffs, Wiley, New York, 1976.
[27] Mathematical Modelling 9/3-5 (1987), Special Issue: The Analytic Hierarchy Process - Theoretical Developments and Some
Applications.
[28] Mathematical Modelling, 1993, forthcoming.
[29] McCord, M., and de Neufville, R., "Empirical demonstration that expected utility decision analysis is not operational", in: S.
Wenstop (ed.), Foundation of Utility and Risk Theory with Applications, D. Reidel, Boston, MA, 1983, 181-200.
[30] "Proceedings of the 2nd International Symposium on the Analytic Hierarchy Process: Vol. 1 and 2", University of Pittsburgh,
Pittsburgh, PA, 1991.
[31] "Reprints of International Symposium on the Analytic Hierarchy Process", Tianjin University, Tianjin, China, 1988.
[32] Saaty, T.L., "Scenarios and priorities in transport planning: Application to the Sudan", Transportation Research 11/5 (1977).
[33] Saaty, T.L., "The Sudan transport study", Interfaces 8/1 (1977) 37-57.
[34] Saaty, T.L., and Vargas, L.G., "Inconsistency and rank preservation", Journal of Mathematical Psychology 28/2 (1984).
[35] Saaty, T.L., "Axiomatic foundation of the Analytic Hierarchy Process", Management Science 32/7 (1986) 841-855.
[36] Saaty, T.L., and Vargas, L.G., "Uncertainty and rank order in the analytic hierarchy process", European Journal of
Operational Research 32 (1987) 107-117.
[37] Saaty, T.L., "Risk: Its priority and probability, the Analytic Hierarchy Process", Risk Analysis 7/2 (1987).
[38] Saaty, T.L., "Rank according to Perron: A new insight", Mathematics Magazine 60/4 (1987) 211-213.
[39] Saaty, T.L., "A note on multiplicative operations in the Analytic Hierarchy Process", Preprints of International Symposium on
the Analytic Hierarchy Process, Tianjin University, Tianjin, China, 1988, 82-86.
[40] Saaty, T.L., Multicriteria Decision Making: The Analytic Hierarchy Process, RWS Publications, 4922 Ellsworth Ave., Pittsburgh,
PA, 1990.
[41] Saaty, T.L., "Resolution of the rank preservation-reversal issue", 1993, forthcoming.
[42] Saaty, T.L., and Vargas, L.G., Prediction, Projection and Forecasting, Kluwer Academic, Boston, MA, 1991.
[43] Saaty, T.L., and Kearns, K.P., Analytical Planning, RWS Publications, Pittsburgh, PA, 1991.
[44] Saaty, T.L., and Vargas, L.G., "A model of neural impulse firing and synthesis", forthcoming in Journal of Mathematical
Psychology, 1993.
[45] Saaty, T.L., and Vargas, L.G., "Deriving Bayes theorem from the Analytic Hierarchy Process", 1992, forthcoming.
[46] Schoemaker, P.J.H., and Waid, C.C., "An experimental comparison of different approaches to determining weights in additive
utility models", Management Science 28/2 (1982) 182-196.
[47] Segal, U., "Does the preference reversal phenomenon necessarily contradict the independence axiom?", American Economic
Review 78/1 (1988) 233-236.
[48] Shepard, R.N., "A taxonomy of some principal types of data and of multidimensional methods for their analysis", in: R.N.
Shepard, A.K. Romney and S.B. Nerlove (eds.), Multidimensional Scaling, Seminar Press, New York, 1972.
[49] Simon, H.A., "A behavioral model of rational choice", Quarterly Journal of Economics 69 (1955) 99-118.
[50] Socio-Economic Planning Sciences 20/6 (1986), Special Issue: The Analytic Hierarchy Process.
[51] Socio-Economic Planning Sciences 25/2 (1991), Special Issue: Public-Sector Applications of the Analytic Hierarchy Process.
[52] Tversky, A., Slovic, P., and Kahneman, D., "The causes of preference reversal", American Economic Review 80/1 (1990)
204-217.
[53] Vargas, L.G., "An overview of the Analytic Hierarchy Process and its applications", European Journal of Operational Research
48/1 (1990) 2-8.
[54] Zahedi, F., "The Analytic Hierarchy Process - A survey of the method and its applications", Interfaces 16/4 (1986) 96-108.
