
APPENDIX A

Probability and Statistics

This appendix is intended to serve as a brief review of the probability and statistics
concepts used in this text. Students requiring more review than is available in this ap-
pendix should consult one of the texts listed in the bibliography.

A.1 PROBABILITY

Uncertainty in organizational decision making is a fact of life. Demand for an organization's output is uncertain. The number of employees who will be absent from work on any given day is uncertain. The price of a stock tomorrow is uncertain. Whether or not it will snow tomorrow is uncertain. Each of these events is more or less uncertain. We do not know exactly whether or not the event will occur, nor do we know the value that a particular random variable (e.g., price of stock, demand for output, number of absent employees) will assume.
In common terminology we reflect our uncertainty with such phrases as “not very
likely,’’ “not a chance,’’ “for sure.’’ But, while these descriptive terms communicate
one’s feeling regarding the chances of a particular event’s occurrence, they simply are
not precise enough to allow analysis of chances and odds.
Simply put, probability is a number on a scale used to measure uncertainty. The
range of the probability scale is from 0 to 1, with a 0 probability indicating that an
event has no chance of occurring and a probability of 1 indicating that an event is ab-
solutely sure to occur. The more likely an event is to occur, the closer its probability is
to 1. This probability definition, which is general, needs to be further augmented to il-
lustrate the various types of probability that decision makers can assess. There are
three types of probability that the operations manager should be aware of:


• Subjective probability
• Logical probability
• Experimental probability

Subjective Probability
Subjective probability is based on individual information and belief. Different indi-
viduals will assess the chances of a particular event in different ways, and the same
individual may assess different probabilities for the same event at different points in
time. For example, one need only watch the blackjack players in Las Vegas to see that
different people assess probabilities in different ways. Also, daily trading in the stock
market is the result of different probability assessments by those trading. The sellers
sell because it is their belief that the probability of appreciation is low, and the buyers
buy because they believe that the probability of appreciation is high. Clearly, these
different probability assessments are about the same events.

Logical Probability
Logical probability is based on physical phenomena and on symmetry of events. For
example, the probability of drawing a three of hearts from a standard 52-card playing
deck is 1/52. Each card has an equal likelihood of being drawn. In flipping a coin, the
chance of “heads’’ is 0.50. That is, since there are only two possible outcomes from
one flip of a coin, each event has one-half the total probability, or 0.50. A final example is the roll of a single die. Since each of the six sides is identical, the chance of any one event occurring (i.e., a 6, a 3, etc.) is 1/6.

Experimental Probability
Experimental probability is based on frequency of occurrence of events in trial situa-
tions. For example, in determining the appropriate inventory level to maintain in the
raw material inventory, we might measure and record the demand each day from that
inventory. If, in 100 days, demand was 20 units on 16 days, the probability of demand
equaling 20 units is said to be 0.16 (i.e., 16/100). In general, experimental probability
of an event is given by
probability of event = (number of times event occurred) / (total number of trials)
Both logical and experimental probability are referred to as objective probability
in contrast to the individually assessed subjective probability. Each of these is based
on, and directly computed from, facts.
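To make the frequency calculation concrete, here is a minimal sketch in Python; the 16 days of demand at 20 units match the example above, while the other tallies are hypothetical.

from collections import Counter

# 100 days of recorded demand, tallied by value. The 16 days at 20 units
# come from the text; the remaining counts are hypothetical.
demand_days = Counter({20: 16, 21: 30, 22: 34, 23: 20})
total_trials = sum(demand_days.values())        # 100 days in all
p_demand_20 = demand_days[20] / total_trials    # 16/100 = 0.16
print(p_demand_20)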

A.2 EVENT RELATIONSHIPS AND PROBABILITY LAWS

Events are classified in a number of ways that allow us to state rules for probability
computations. Some of these classifications and definitions follow.

1. Independent events: events are independent if the occurrence of one does not af-
fect the probability of occurrence of the others.
2. Dependent events: events are termed dependent if the occurrence of one does af-
fect the probability of occurrence of others.
3. Mutually exclusive events: two events are termed mutually exclusive if the occur-
rence of one precludes the occurrence of the other. For example, in the birth of a
child, the events “It’s a boy!’’ and “It’s a girl!’’ are mutually exclusive.
4. Collectively exhaustive events: a set of events is termed collectively exhaustive if
on any one trial at least one of them must occur. For example, in rolling a die, one
of the events 1, 2, 3, 4, 5, or 6 must occur; therefore, these six events are collec-
tively exhaustive.
We can also define the union and intersection of two events. Consider two events
A and B. The union of A and B includes all outcomes in A or B or in both A and B. For
example, in a card game you will win if you draw a diamond or a jack. The union of
these two events includes all diamonds (including the jack of diamonds) and the re-
maining three jacks (hearts, clubs, spades). The or in the union is the inclusive or.
That is, in our example you will win with a jack or a diamond or a jack of diamonds
(i.e., both events).
The intersection of two events includes all outcomes that are members of both
events. Thus, in our previous example of jacks and diamonds, the jack of diamonds is
the only outcome contained in both events and is therefore the only member of the in-
tersection of the two events.
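As an illustration (a sketch, not part of the original text), the union and intersection of these two events can be listed explicitly with Python sets:

ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = {(rank, suit) for rank in ranks for suit in suits}   # 52 cards

jacks = {card for card in deck if card[0] == "J"}
diamonds = {card for card in deck if card[1] == "diamonds"}

print(len(jacks | diamonds))   # 16: the 13 diamonds plus the 3 other jacks
print(jacks & diamonds)        # {('J', 'diamonds')}: the only card in both events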
Let us now consider the relevant probability laws based on our understanding of
the above definitions and concepts. For ease of exposition let us define the following
notation:
P(A) = probability that event A will occur
P(B) = probability that event B will occur
If two events are mutually exclusive, then their joint occurrence is impossible.
Hence, P(A and B) = 0 for mutually exclusive events. If the events are not mutually
exclusive, P(A and B) can be computed (as we will see in the next section); this prob-
ability is termed the joint probability of A and B. Also, if A and B are not mutually ex-
clusive, then we can also define the conditional probability of A given that B has al-
ready occurred or the conditional probability of B given that A has already occurred.
These probabilities are written as P(A|B) and P(B|A), respectively.

The Multiplication Rule


The joint probability of two events that are not mutually exclusive is found by using the multiplication rule. If the events are dependent, the joint probability is given by

P(A and B) = P(A) × P(B|A) or P(B) × P(A|B)

If the events are independent, then P(B|A) and P(A|B) are equal to P(B) and P(A), respectively, and therefore the joint probability is given by

P(A and B) = P(A) × P(B)


From these two relationships, we can find the conditional probability for two dependent events from

P(A|B) = P(A and B) / P(B)

and

P(B|A) = P(A and B) / P(A)

Also, P(A) and P(B) can be computed if the events are independent, as

P(A) = P(A and B) / P(B)

and

P(B) = P(A and B) / P(A)
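These relationships can be checked on the card events used in the next subsection, where drawing a jack and drawing a diamond are independent events. A brief sketch (an illustration, not from the original text):

from fractions import Fraction

p_jack = Fraction(4, 52)       # 4 jacks in a 52-card deck
p_diamond = Fraction(13, 52)   # 13 diamonds
p_both = Fraction(1, 52)       # only the jack of diamonds is in both events

# Independent events: the joint probability is the product of the marginals.
assert p_both == p_jack * p_diamond

# The conditional probabilities reduce to the marginals under independence.
assert p_both / p_diamond == p_jack    # P(A|B) = P(A and B)/P(B) = P(A)
assert p_both / p_jack == p_diamond    # P(B|A) = P(A and B)/P(A) = P(B)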

The Addition Rule


The addition rule is used to compute the probability of the union of two events. If two events are mutually exclusive, then P(A and B) = 0, as we indicated previously. Therefore, the probability of either A or B is simply

P(A or B) = P(A) + P(B)

But, if the events are not mutually exclusive, then the probability of A or B is given by

P(A or B) = P(A) + P(B) − P(A and B)
We can see the reasonableness of this expression by looking at the following Venn diagram.

[Venn diagram: two overlapping circles labeled A and B; the shaded region where they overlap is the intersection of A and B]

The two circles represent the probabilities of the events A and B, respectively. The
shaded area represents the overlap in the events; that is, the intersection of A and B. If
we add the area of A and the area of B, we have included the shaded area twice. Therefore, to get the total area of A or B, we must subtract the intersection area once.
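Continuing the jacks-and-diamonds example numerically (a sketch, not from the original text):

from fractions import Fraction

p_jack, p_diamond, p_both = Fraction(4, 52), Fraction(13, 52), Fraction(1, 52)

# Addition rule: subtract the intersection so it is not counted twice.
p_win = p_jack + p_diamond - p_both
print(p_win)   # 4/13, i.e., 16/52: the 13 diamonds plus the 3 other jacks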

If two events are collectively exhaustive, then the probability of (A or B) is equal to 1. That is, for two collectively exhaustive events, one or the other or both must occur, and therefore P(A or B) must be 1.

A.3 STATISTICS

Because events are uncertain, we must employ special analyses in organizations to ensure that our decisions recognize the chance nature of outcomes. We employ statistics and statistical analysis to
1. Concisely express the tendency and the relative uncertainty of a particular situa-
tion.
2. Develop inferences or understanding about a situation.
“Statistics’’ is an elusive and often misused term. Batting averages, birth weights, and student grade points are all statistics. They are descriptive statistics. That is, they are
quantitative measures of some entity and, for our purposes, can be considered as data
about the entity. The second use of the term “statistics’’ is in relation to the body of
theory and methodology used to analyze available evidence (typically quantitative)
and to develop inferences from the evidence.
Two descriptive statistics that are often used in presenting information about a population of items (and consequently in inferring some conclusions about the population) are the mean and the variance. The mean of a population (denoted as µ) can be computed in two ways, each of which gives identical results.
µ = Σ_{j=1}^{k} Xj P(Xj)

where

k = the number of discrete values that the random variable Xj may assume
Xj = the value of the random variable
P(Xj) = the probability (or relative frequency) of Xj in the population
Also, the mean can be computed as

µ = Σ_{i=1}^{N} Xi / N

where
N = the size of the population (the number of different items in the population)
Xi = the value of the ith item in the population
The mean is also termed the expected value of the population and is written as E(X).

The variance of the items in the population measures the dispersion of the items
about their mean. It is computed in one of the following two ways:

σ² = Σ_{j=1}^{k} (Xj − µ)² P(Xj)

or

σ² = Σ_{i=1}^{N} (Xi − µ)² / N

The standard deviation, another measure of dispersion, is simply the square root of the variance, or

σ = √σ²

Descriptive Versus Inferential Statistics


Organizations are typically faced with decisions for which a large portion of the rele-
vant information is uncertain. In hiring graduates of your university, the “best’’
prospective employee is unknown to the organization. Also, in introducing a new
product, proposing a tax law change to boost employment, drilling an oil well, and so
on, the outcomes are always uncertain.
Statistics can often aid management in reducing this uncertainty. This is accom-
plished through the use of one or the other, or both, of the purposes of statistics. That
is, statistics is divided according to its two major purposes: describing the major char-
acteristics of a large mass of data and inferring something about a large mass of data
from a smaller sample drawn from the mass. One methodology summarizes all the
data; the other reasons from a small set of the data to the larger total.
Descriptive statistics uses such measures as the mean, median, mode, range, vari-
ance, standard deviation, and such graphical devices as the bar chart and the his-
togram. When an entire population (a complete set of objects or entities with a com-
mon characteristic of interest) of data is summarized by computing such measures as
the mean and the variance of a single characteristic, the measure is referred to as a pa-
rameter of that population. For example, if the population of interest is all female
freshmen at your university and all their ages were used to compute an arithmetic av-
erage of 19.2 years, this measure is called a parameter of that population.
Inferential statistics also uses means and variance, but in a different manner. The
objective of inferential statistics is to infer the value of a population parameter
through the study of a small sample (a portion of a population) from that population.
For example, a random sample of 30 freshmen females could produce the information
that there is 90 percent certainty that the average age of all freshmen women is be-
tween 18.9 and 19.3 years. We do not have as much information as if we had used the
entire population, but then we did not have to spend the time to find and determine
the age of each member of the population either.

Before considering the logic behind inferential statistics, let us define the primary
measures of central tendency and dispersion used in both descriptive and inferential
statistics.

Measures of Central Tendency


The central tendency of a group of data represents the average, middle, or “normal’’
value of the data. The most frequently used measures of central tendency are the
mean, the median, and the mode.
The mean of a population of values was given earlier as

µ = Σ_{i=1}^{N} Xi / N

where

µ = the mean (µ pronounced “mu’’)


Xi = the value of the ith data item
N = the number of data items in the population

The mean of a sample of items from a population is given by

X̄ = Σ_{i=1}^{n} Xi / n

where

X̄ = the sample mean (pronounced “X bar’’)
Xi = the value of the ith data item in the sample
n = the number of data items selected in the sample

The median is the middle value of a population of data (or sample) where the
data are ordered by value. That is, in the following data set

3, 2, 9, 6, 1, 5, 7, 3, 4

4 is the median since (as you can see when we order the data)

1, 2, 3, 3, 4, 5, 6, 7, 9

50 percent of the data values are above 4 and 50 percent below 4. If there are an even
number of data items, then the mean of the middle two is the median. For example, if
there had also been an 8 in the above data set, the median would be 4.5 = (4 + 5)/2.
The mode of a population (or sample) of data items is the value that most fre-
quently occurs. In the above data set, 3 is the mode of the set. A distribution can have
more than one mode if there are two or more values that appear with equal frequency.
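All three measures can be computed for the data set above with Python's standard statistics module (a sketch, not part of the original text):

import statistics

data = [3, 2, 9, 6, 1, 5, 7, 3, 4]
print(statistics.mean(data))     # 4.44...: the sum 40 divided by 9 items
print(statistics.median(data))   # 4: the middle of the nine ordered values
print(statistics.mode(data))     # 3: the most frequently occurring value

print(statistics.median(data + [8]))   # 4.5: mean of the middle two when n is even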

Measures of Dispersion
Dispersion refers to the scatter around the mean of a distribution of values. Three
measures of dispersion are the range, the variance, and the standard deviation.
The range is the difference between the highest and the lowest value of the data
set, that is, Xhigh − Xlow.
The variance of a population of items is given by

σ² = Σ_{i=1}^{N} (Xi − µ)² / N

where
σ² = the population variance (pronounced “sigma squared”)
The variance of a sample of items is given by

S² = Σ_{i=1}^{n} (Xi − X̄)² / n

where
S² = the sample variance
The standard deviation is simply the square root of the variance. That is,

σ = √[Σ_{i=1}^{N} (Xi − µ)² / N]

and

S = √[Σ_{i=1}^{n} (Xi − X̄)² / n]

σ and S are the population and sample standard deviations, respectively.
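A short sketch of these dispersion measures on the same data set used above. Note, as an assumption worth flagging, that Python's statistics.variance divides by n − 1 rather than the n in the text's formula, so the text's definition is computed directly here:

data = [3, 2, 9, 6, 1, 5, 7, 3, 4]
n = len(data)

data_range = max(data) - min(data)   # X_high − X_low = 9 − 1 = 8

x_bar = sum(data) / n
s_squared = sum((x - x_bar) ** 2 for x in data) / n   # the text's S² (divides by n)
s = s_squared ** 0.5                                  # sample standard deviation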

Inferential Statistics
A basis of inferential statistics is the interval estimate. Whenever we infer from partial data to an entire population, we do so with some uncertainty in our inference. Specifying an interval estimate (e.g., the average weight is between 10 and 12 pounds) rather than a point estimate (e.g., the average weight is 11.3 pounds) simply helps to convey that uncertainty. The interval estimate is not as precise as the point estimate.

Inferential statistics uses probability samples, where the chance of selection of each item is known. A random sample is one in which each item in the population has an equal chance of selection.
The procedure used to estimate a population mean from a sample is to
1. Select a sample of size n from the population.

2. Compute X̄, the sample mean, and S, the sample standard deviation.

3. Compute the precision of the estimate (i.e., the ± limits around X̄ within which the mean µ is believed to lie).
Steps 1 and 2 are straightforward, relying on the equations we have presented in ear-
lier sections. Step 3 deserves elaboration.
The precision of an estimate for a population parameter depends on two things:
the standard deviation of the sampling distribution, and the confidence you desire to
have in the final estimate. Two statistical laws provide the logic behind Step 3.
First, the law of large numbers states that as the size of a sample increases toward
infinity, the difference between the estimate of the mean and the true population mean
tends toward zero. For practical purposes, a sample of size 30 is assumed to be “large
enough’’ for the sample estimate to be a good estimate of the population mean.
Second, the central limit theorem states that if all possible samples of size n were
taken from a population with any distribution, the distribution of the means of those
samples would be normally distributed with a mean equal to the population mean and
a standard deviation equal to the standard deviation of the population divided by the
square root of the sample size. That is, if we took all of the samples of size 100 from
the population shown in Figure 1, the sampling distribution would be as shown in
Figure 2. The logic behind Step 3 is that
1. Any sample of size n from the population can be considered to be one observation from the sampling distribution, with mean µx̄ = µ and standard deviation

σx̄ = σ/√n

Figure 1 Population distribution (µ = 50, σ = 20).
Figure 2 Sampling distribution of X̄ (µx̄ = 50, σx̄ = σ/√n = 20/√100 = 2).



2. From our knowledge of the normal distribution, we know that there is a number (see normal probability table, back cover) associated with each probability value of a normal distribution (e.g., the probability that an item will be within ±2 standard deviations of the mean of a normal distribution is 95.45 percent; Z = 2 in this case).
3. The value of the number Z is simply the number of standard deviations that a given point lies away from the mean. That is,

Z = (X − µ) / σ

or, in the case of Step 3,

Z = (X̄ − µx̄) / σx̄

4. The precision of a sample estimate is given by Zσx̄.

5. The interval estimate is given by the point estimate X̄ plus or minus the precision, or X̄ ± Zσx̄.

In the previous example shown in Figures 1 and 2, suppose that a sample estimate X̄ was 56 and the population standard deviation σ was 20. Also, suppose that the desired confidence was 90 percent. Since the associated Z value for 90 percent is 1.645, the interval estimate for µ is

56 ± 1.645 (20/√100)

or

56 ± 3.29, or 52.71 to 59.29
This interval estimate of the population mean is based solely on information de-
rived from a sample and states that the estimator is 90 percent confident that the true
mean is between 52.71 and 59.29. There are numerous other sampling methods and
other parameters that can be estimated; the student is referred to one of the references
in the bibliography for further discussion.
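The interval arithmetic in this example is easy to reproduce; a minimal sketch using the values from the text:

import math

x_bar, sigma, n, z = 56, 20, 100, 1.645      # sample mean, σ, sample size, Z for 90%

precision = z * sigma / math.sqrt(n)         # Zσx̄ = 1.645 × 20/√100 = 3.29
print(x_bar - precision, x_bar + precision)  # ≈ 52.71 and 59.29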

Standard Probability Distributions


The normal distribution, discussed and shown in Figure 2, is probably the most com-
mon probability distribution in statistics. Some other common distributions are the
Poisson, a discrete distribution, and the negative exponential, a continuous distribu-
tion. In project management, the beta distribution plays an important role. A continu-
ous distribution, it is generally skewed, as in Figure 1. Two positive parameters, alpha
and beta, determine the distribution's shape. Its mean, µ, and variance, σ², are given by

µ = α / (α + β)

σ² = αβ / [(α + β)² (α + β + 1)]
These are often approximated by
µ = (a + 4m + b)/6
and a standard deviation approximated by
σ = (b − a)/6
where
a is the optimistic value that might occur once in a hundred times,
m is the most likely (modal) value, and
b is the pessimistic value that might occur once in a hundred times.
Recent research (Keefer and Verdini, 1993) has indicated that a much better approximation is given by

µ = 0.630d + 0.185(c + e)

σ² = 0.630(d − µ)² + 0.185[(c − µ)² + (e − µ)²]

where
c is an optimistic value at one in 20 times,
d is the median, and
e is a pessimistic value at one in 20 times.
See Chapter 8 for another method of approximating µ and σ².
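As a worked comparison of the two approximations, the three-point activity-time estimates below are hypothetical, and each formula is applied exactly as given above:

# Hypothetical three-point estimates for one activity.
a, m, b = 2.0, 5.0, 12.0        # 1-in-100 optimistic, most likely, 1-in-100 pessimistic
mu_pert = (a + 4 * m + b) / 6   # (2 + 20 + 12)/6 ≈ 5.67
sigma_pert = (b - a) / 6        # 10/6 ≈ 1.67

c, d, e = 2.5, 5.0, 10.0        # 1-in-20 optimistic, median, 1-in-20 pessimistic
mu_kv = 0.630 * d + 0.185 * (c + e)   # ≈ 5.46
var_kv = 0.630 * (d - mu_kv) ** 2 + 0.185 * ((c - mu_kv) ** 2 + (e - mu_kv) ** 2)
sigma_kv = var_kv ** 0.5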

BIBLIOGRAPHY

ANDERSON, D., D. SWEENEY, and T. WILLIAMS. Statistics for Business and Economics, 7th ed. Cincinnati, OH: South-Western, 1998.

BHATTACHARYYA, G., and R. A. JOHNSON. Mathematical Statistics. Paramus, NJ: Prentice-Hall, 1999.

KEEFER, D. L., and W. A. VERDINI. “Better Estimation of PERT Activity Time Parameters.’’ Management Science, September 1993.

MENDENHALL, W., R. L. SCHAEFFER, and D. WACKERLY. Mathematical Statistics with Applications, 3rd ed. Boston: PWS-Kent, 1986.

NETER, J., W. WASSERMAN, and G. A. WHITMORE. Applied Statistics, 3rd ed. Boston: Allyn and Bacon, 1987.
