
Ang and Tang: Probability Concepts in Engineering (2nd Ed., 2004)
Chapter 3: Analytical Models of Random Phenomena
Analytical Models of Random Phenomena
Cheng-Liang Chen
PSE Laboratory
Department of Chemical Engineering
National Taiwan University
Chen CL 1
Analytical Models of Random Phenomena
Chen CL 2
Random Events and Random Variables
In engineering and the physical sciences, many random phenomena of interest
are associated with the numerical outcomes of some physical quantities.
The number of bulldozers that remain operating after 6 months,
The time required to complete a project,
The flood level (in meters) of a river above the mean flow level.
Sometimes, the possible outcomes are not in numerical terms,
Failure or survival of a chain,
Incompletion or completion of a project,
Closing and opening of highway routes.
These events may also be identified in numerical terms by artificially assigning
numerical values to each of the possible outcomes,
Assigning a numerical value of 1 to survival of a chain
Assigning a numerical value of 1 to completion of a project
Assigning a numerical value of 1 to opening of a highway route
Therefore, the possible outcomes of a random phenomenon can be represented
by numerical values, either naturally or assigned artificially.
Chen CL 3
Random Events and Random Variables
In any case, an outcome or event may then be identied through the value or
range of values of a function, which is called a random variable X.
If the values of X represent floods above the mean flow level,
then X > 2 m stands for the occurrence of a flood higher than 2 m;
If X represents the possible states of a chain (failure or survival),
then X = 0 means failure of the chain.
A random variable is a mathematical device for identifying events in numerical
terms.
In terms of the random variable X, we can speak of an event as (X = a), (X > a), (X ≤ a), or (a < X ≤ b).
A random variable may be considered as a mathematical function or rule that maps (or transforms) events in a sample space into the number system (i.e., the real line).
Chen CL 4
Random Events and Random Variables
The advantages and purpose of identifying events in numerical terms:
To conveniently represent events analytically,
To graphically display events and their respective probabilities.
For example:
Mutually exclusive events are mapped into nonoverlapping intervals on the
real line, whereas
Intersecting events are represented by the respective overlapping intervals on
the real line.
In Fig. 3.1, the events E1 and E2 are mapped into the real line through the random variable X, and thus can be identified, respectively, as indicated below (a < c < b < d):
E1 = (a < X ≤ b)
E2 = (c < X ≤ d)
E1 ∩ E2 = (c < X ≤ b)
E1 ∪ E2 = (a < X ≤ d)
complement of (E1 ∪ E2) = (X ≤ a) ∪ (X > d)
Chen CL 5
Probability Distribution of a Random Variable
As the values or ranges of values of a random variable represent events, the numerical values of the random variable are associated with specific probabilities or probability measures.
These probability measures may be assigned according to prescribed rules, which are called probability distributions or probability laws.
If X is a random variable, its probability distribution can always be described by its cumulative distribution function (CDF),
F_X(x) = P(X ≤ x)   for all x
X is a discrete RV if only discrete values of x have positive probabilities.
X is a continuous RV if probability measures are defined for all values of x.
Probability distribution for a discrete random variable X:
Probability mass function (PMF): p_X(x) ≡ P(X = x)
Cumulative distribution function (CDF): F_X(x) = Σ_{x_i ≤ x} P(X = x_i) = Σ_{x_i ≤ x} p_X(x_i)
Chen CL 6
Probability Distribution of a Random Variable
Probability distribution for a continuous random variable X:
Probability density function (PDF), f_X(x):  P(a < X ≤ b) = ∫_a^b f_X(x) dx
Cumulative distribution function (CDF):  F_X(x) = P(X ≤ x) = ∫_{−∞}^x f_X(τ) dτ
f_X(x) = dF_X(x)/dx,   and   f_X(x) dx = P(x < X ≤ x + dx)
Chen CL 7
Probability Distribution of a Random Variable
Any function used to represent the probability distribution of a random variable
must necessarily satisfy the axioms of probability theory.
If F_X(x) is the CDF of X, then it must satisfy the following conditions:
(i) F_X(−∞) = 0 and F_X(+∞) = 1.0
(ii) F_X(x) ≥ 0 for all values of x, and is nondecreasing with x.
(iii) F_X(x) is continuous to the right with x.
Some observations:
P(a < X ≤ b) = ∫_{−∞}^b f_X(x) dx − ∫_{−∞}^a f_X(x) dx   (continuous X)
P(a < X ≤ b) = Σ_{x_i ≤ b} p_X(x_i) − Σ_{x_i ≤ a} p_X(x_i)   (discrete X)
P(a < X ≤ b) = F_X(b) − F_X(a)
Chen CL 8
Probability Distribution of a Random Variable
Ex: Mapping Events into Real Line
Consider Example 2.1 again, which involves a discrete random variable.
Using X as the random variable whose values
represent the number of operating bulldozers after
6 months, the events of interest are mapped into
the real line as shown in Fig. E3.1a.
Thus, (X = 0), (X = 1), (X = 2), and (X = 3)
now represent the corresponding events of interest.
Assuming again that each of the three bulldozers is equally likely to be operating or nonoperating after 6 months (i.e., the probability of operating is 0.5), and that the conditions of the bulldozers are statistically independent, the PMF and CDF of X are shown in Figs. E3.1b and E3.1c.
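As a quick numerical illustration (not in the original text), the PMF and CDF of Figs. E3.1b-c can be reproduced with a short Python sketch; it assumes the binomial model b(x; n = 3, p = 0.5) implied by three independent, equally likely bulldozers, and uses only the standard library.

```python
from math import comb

n, p = 3, 0.5          # three bulldozers, each operating with probability 0.5

def pmf(x):
    """Binomial PMF: probability that exactly x of the n bulldozers operate."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

cdf = 0.0
for x in range(n + 1):
    cdf += pmf(x)
    print(f"x = {x}:  p_X(x) = {pmf(x):.4f}   F_X(x) = {cdf:.4f}")
# Expected: p_X = 1/8, 3/8, 3/8, 1/8 and F_X = 0.125, 0.5, 0.875, 1.0
```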
Chen CL 9
Probability Distribution of a Random Variable
Ex: Load on A Beam
For a continuous random variable, consider the
100-kg load in Example 2.5. If the load is equally likely to be placed anywhere along the 10-m span of the beam, then the load position X is uniformly distributed over 0 < x ≤ 10, with PDF and CDF
f_X(x) = c = 1/10   for 0 < x ≤ 10;   f_X(x) = 0 otherwise
F_X(x) = ∫_0^x c dξ = 0 for x ≤ 0;   x/10 for 0 < x ≤ 10;   1 for x > 10
P(2 < X ≤ 5) = ∫_2^5 (1/10) dx = 0.30,   or equivalently
P(2 < X ≤ 5) = F_X(5) − F_X(2) = 5/10 − 2/10 = 0.30
Chen CL 10
Probability Distribution of a Random Variable
Ex: Useful Life of Welding Machines
The useful life, T (in hours) of welding machines is
not completely predictable, but may be described
by the exponential distribution, with the following
PDF and CDF (λ is a constant):
f_T(t) = λ e^(−λt) for t ≥ 0;   f_T(t) = 0 for t < 0
F_T(t) = ∫_0^t λ e^(−λτ) dτ = 1 − e^(−λt) for t ≥ 0;   F_T(t) = 0 for t < 0
Chen CL 11
Main Descriptors of a Random Variable
Central Values
The main descriptors contain information on the properties of the random variable that are of first importance in many practical applications.
Mean value or expected value, E(X), of a random variable X:
μ_X = E(X) = Σ_i x_i p_X(x_i) if X is a discrete RV;   ∫_{−∞}^{∞} x f_X(x) dx if X is a continuous RV
The median, x_m, of a random variable X is the value at which the CDF is 50% (F_X(x_m) = 0.50); thus larger and smaller values are equally probable.
The mode, x̃, is the most probable value of a random variable X; i.e., it is the value of the random variable with the largest probability or the highest probability density.
Chen CL 12
Main Descriptors of a Random Variable
Mathematical Expectation
Given a function g(X), its expected value E[g(X)] can be obtained as a generalization of the previous equation.
E[g(X)] is known as the mathematical expectation of g(X) and is the weighted average of the function g(X):
E[g(X)] = Σ_i g(x_i) p_X(x_i) if X is a discrete RV;   ∫_{−∞}^{∞} g(x) f_X(x) dx if X is a continuous RV
Chen CL 13
Main Descriptors of a Random Variable
Measures of Dispersion
A measure of dispersion is used to indicate how widely or narrowly the values of a random variable are dispersed.
Of special interest is a quantity that gives a measure of how closely or widely the values of the variate are clustered around a central value; take g(X) = (X − μ_X)²:
Var(X) = Σ_i (x_i − μ_X)² p_X(x_i) if X is a discrete RV;   ∫_{−∞}^{∞} (x − μ_X)² f_X(x) dx if X is a continuous RV
Chen CL 14
Main Descriptors of a Random Variable
Measures of Dispersion
Var(X) = ∫_{−∞}^{∞} (x − μ_X)² f_X(x) dx
       = ∫_{−∞}^{∞} (x² − 2μ_X x + μ_X²) f_X(x) dx
       = E(X²) − 2μ_X E(X) + μ_X²
       = E(X²) − μ_X²
A more convenient measure of dispersion is the square root of the variance, the standard deviation σ_X:
σ_X = √Var(X)
The coefficient of variation (c.o.v.) is a nondimensional measure of dispersion relative to the central value:
δ_X = σ_X / μ_X
Chen CL 15
Main Descriptors of a Random Variable
Measures of Dispersion
Ex: Operating Bulldozers
The PMF of the number of operating bulldozers
after 6 months is shown in Fig. E3.1b. On
this basis, we obtain the expected number of
operating bulldozers after 6 months as
μ_X = E(X) = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 1.50
As the random variable is discrete, the mean value of 1.5 is not necessarily
a possible value; in this case, we may only conclude that the mean number
of operating bulldozers is between 1 and 2 at the end of 6 months. The
corresponding variance is
Var(X) = (0 − 1.5)²(1/8) + (1 − 1.5)²(3/8) + (2 − 1.5)²(3/8) + (3 − 1.5)²(1/8)
       = [ 0²(1/8) + 1²(3/8) + 2²(3/8) + 3²(1/8) ] − (1.5)² = 0.75
σ_X = √0.75 = 0.866
δ_X = 0.866/1.50 = 0.577
which means that the degree of dispersion is over 50% of the mean value, a
relatively large dispersion.
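These descriptors can be checked numerically; the sketch below is illustrative only and assumes the PMF values 1/8, 3/8, 3/8, 1/8 of Fig. E3.1b.

```python
from math import sqrt

pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}      # PMF of operating bulldozers

mean = sum(x * p for x, p in pmf.items())               # E(X)
var = sum((x - mean) ** 2 * p for x, p in pmf.items())  # Var(X) = E[(X - mu)^2]
std = sqrt(var)                                         # sigma_X
cov = std / mean                                        # delta_X = sigma_X / mu_X

print(mean, var, std, cov)   # 1.5, 0.75, 0.866..., 0.577...
```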
Chen CL 16
Main Descriptors of a Random Variable
Measures of Dispersion
Ex: Useful Life Time
The useful life, T, of welding machines is a random variable with an exponential
probability distribution; the PDF and CDF are
f_T(t) = λ e^(−λt)   and   F_T(t) = 1 − e^(−λt),   t ≥ 0
Chen CL 17
Main Descriptors of a Random Variable
Measures of Dispersion
Ex: Useful Life Time
The mean life of the welding machines is
μ_T = E(T) = ∫_0^∞ t λ e^(−λt) dt = 1/λ
The parameter λ of the exponential distribution is thus the reciprocal of the mean value, λ = 1/E(T). The mode is zero, whereas the median life t_m, the variance, the standard deviation, and the coefficient of variation are
0.50 = ∫_0^{t_m} λ e^(−λt) dt = −e^(−λt) |_0^{t_m} = 1 − e^(−λ t_m)
t_m = −ln(0.50)/λ = 0.693/λ = 0.693 μ_T
Var(T) = ∫_0^∞ (t − 1/λ)² λ e^(−λt) dt = 1/λ²
σ_T = 1/λ = μ_T,   δ_T = σ_T/μ_T = 1.0
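A quick numerical check of these exponential descriptors (a sketch, with an arbitrary illustrative value λ = 1/1000 per hour) follows; it inverts the CDF for the median and confirms that the c.o.v. equals 1.

```python
from math import log, sqrt

lam = 1.0 / 1000.0                 # illustrative rate (per hour); mean life = 1/lam

mean = 1.0 / lam                   # E(T) = 1/lambda
median = -log(0.5) / lam           # from 0.5 = 1 - exp(-lam * t_m)
var = 1.0 / lam**2                 # Var(T) = 1/lambda^2
cov = sqrt(var) / mean             # sigma_T / mu_T = 1

print(mean, median, 0.693 * mean, cov)   # 1000.0, 693.1..., 693.0, 1.0
```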
Chen CL 18
Main Descriptors of a Random Variable
Measures of Skewness
A measure of asymmetry or skewness is the third central moment:
E[(X − μ_X)³] = Σ_i (x_i − μ_X)³ p_X(x_i) if X is a discrete RV;   ∫_{−∞}^{∞} (x − μ_X)³ f_X(x) dx if X is a continuous RV
θ = E[(X − μ_X)³] / σ_X³   is the (dimensionless) skewness coefficient.
The third central moment is positive (or negative) if the values of X above μ_X are more (or less) widely dispersed than the values below μ_X.
Chen CL 19
Useful Probability Distributions
There are a number of both discrete and continuous distribution
functions that are especially useful because of one or more of the
following reasons:
The function is the result of an underlying physical process and can
be derived on the basis of certain physically reasonable assumptions.
The function is the result of some limiting process.
It is widely known, and the necessary probability and statistical
information (including probability tables) are widely available.
Chen CL 20
Useful Probability Distributions
Gaussian (Normal) Distribution
Gaussian (Normal) Distribution, N(μ, σ):
f_X(x) = (1/(σ√(2π))) exp[ −½ ((x − μ)/σ)² ],   −∞ < x < ∞
Chen CL 21
Useful Probability Distributions
Gaussian (Normal) Distribution
Standard Normal Distribution, N(0, 1):
f_S(s) = (1/√(2π)) e^(−s²/2),   −∞ < s < ∞
CDF: Φ(s) ≡ F_S(s);   percentile value: s_p = Φ^(−1)(p)
Φ(−s) = 1 − Φ(s),   Φ^(−1)(p) = −Φ^(−1)(1 − p)
Chen CL 22
Useful Probability Distributions
Gaussian (Normal) Distribution
Standard normal CDF Φ(z) (Table A.1)
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
0.3 .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
0.4 .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
0.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
0.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7518 .7549
0.7 .7580 .7612 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
0.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
0.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
1.0 .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1 .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2 .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3 .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4 .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319
1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
2.0 .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1 .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2 .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3 .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4 .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
2.5 .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9 .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
3.0 .9986 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990
3.1 .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993
3.2 .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995
3.3 .99952 3.5 .99977 3.7 .99989 3.9 .99995 4.5 1.0000
3.4 .99966 3.6 .99984 3.8 .99993 4.0 .99997
Chen CL 23
Useful Probability Distributions
Gaussian (Normal) Distribution
The areas (or probabilities) covered within ±1σ, ±2σ, and ±3σ (i.e., within one, two, and three standard deviations about the mean μ = 0 of the standard normal distribution) are, respectively, 68.3%, 95.4%, and 99.7%:
Φ(1) − Φ(−1) = Φ(1) − [1 − Φ(1)] = 0.683
Φ(2) − Φ(−2) = 0.954
Φ(3) − Φ(−3) = 0.997
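In place of Table A.1, Φ(s) can be evaluated through the error function; the helper below is a sketch (the name std_normal_cdf is ours, not the text's) and reproduces the 68.3%, 95.4%, and 99.7% figures.

```python
from math import erf, sqrt

def std_normal_cdf(s):
    """Standard normal CDF Phi(s) expressed through the error function."""
    return 0.5 * (1.0 + erf(s / sqrt(2.0)))

for k in (1, 2, 3):
    prob = std_normal_cdf(k) - std_normal_cdf(-k)   # area within +/- k sigma
    print(f"P(|S| <= {k}) = {prob:.3f}")
# Expected: 0.683, 0.954, 0.997
```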
Chen CL 24
Useful Probability Distributions
Gaussian (Normal) Distribution
Probabilities of any normal distribution N(μ, σ) can be evaluated using Φ(s):
P(a < X ≤ b) = (1/(σ√(2π))) ∫_a^b exp[ −½ ((x − μ)/σ)² ] dx
Substituting s = (x − μ)/σ, dx = σ ds:
P(a < X ≤ b) = (1/√(2π)) ∫_{(a−μ)/σ}^{(b−μ)/σ} e^(−s²/2) ds
             = Φ((b − μ)/σ) − Φ((a − μ)/σ)
Chen CL 25
Useful Probability Distributions
Gaussian (Normal) Distribution
Drainage from A Community
The drainage from a community during a storm is a normal random variable
estimated to have a mean of 1.2 million gallons per day (mgd) and a standard
deviation of 0.4 mgd; i.e., N(1.2, 0.4) mgd. If the storm drain system is designed
with a maximum drainage capacity of 1.5 mgd, what is the underlying probability of flooding during a storm that is assumed in the design of the drainage system?
Flooding in the community will occur when the drainage load exceeds the capacity of the drainage system; the probability of flooding is
P(X > 1.5) = 1 − P(X ≤ 1.5) = 1 − Φ((1.5 − 1.2)/0.4) = 1 − Φ(0.75) = 1 − 0.7734 = 0.227
Chen CL 26
Useful Probability Distributions
Gaussian (Normal) Distribution
Drainage from A Community
The probability that the drainage during a storm will be between 1.0 mgd and 1.6 mgd is
P(1.0 < X ≤ 1.6) = Φ((1.6 − 1.2)/0.4) − Φ((1.0 − 1.2)/0.4)
                 = Φ(1.0) − Φ(−0.5) = Φ(1.0) − [1 − Φ(0.5)]
                 = 0.8413 − [1 − 0.6915] = 0.533
The 90-percentile drainage load from the community during a storm is the value of the random variable at which the cumulative probability is 0.90:
P(X ≤ x_0.90) = Φ((x_0.90 − 1.2)/0.40) = 0.90
(x_0.90 − 1.2)/0.40 = Φ^(−1)(0.90) = 1.28   (Table A.1)
x_0.90 = 1.28(0.40) + 1.2 = 1.71 mgd
Chen CL 27
Useful Probability Distributions
Lognormal Distribution
Logarithmic Normal (Lognormal) Distribution
If a random variable X has a lognormal distribution, its PDF is
f_X(x) = (1/(√(2π) ζ x)) exp[ −½ ((ln x − λ)/ζ)² ],   x ≥ 0
with parameters
λ = E(ln X)   and   ζ = √Var(ln X)
Chen CL 28
Useful Probability Distributions
Lognormal Distribution
If X is lognormal with parameters λ and ζ, then ln X is normal with mean λ and standard deviation ζ; i.e., ln X ~ N(λ, ζ).
P(a < X ≤ b) = (1/(√(2π) ζ)) ∫_a^b (1/x) exp[ −½ ((ln x − λ)/ζ)² ] dx
Substituting s = (ln x − λ)/ζ, dx = ζ x ds:
P(a < X ≤ b) = (1/√(2π)) ∫_{(ln a − λ)/ζ}^{(ln b − λ)/ζ} e^(−s²/2) ds
             = Φ((ln b − λ)/ζ) − Φ((ln a − λ)/ζ)
Chen CL 29
Useful Probability Distributions
Lognormal Distribution
μ_X = E(X) = ∫_0^∞ x f_X(x) dx = (1/(√(2π) ζ)) ∫_0^∞ exp[ −½ ((ln x − λ)/ζ)² ] dx
Substituting y = ln x (so x = e^y and dx = e^y dy):
μ_X = (1/(√(2π) ζ)) ∫_{−∞}^{∞} e^y exp[ −½ ((y − λ)/ζ)² ] dy
    = (1/(√(2π) ζ)) ∫_{−∞}^{∞} exp[ y − ½ ((y − λ)/ζ)² ] dy
Completing the square, y − ½((y − λ)/ζ)² = −½((y − (λ + ζ²))/ζ)² + (λ + ½ζ²), so
μ_X = exp(λ + ½ζ²) · { (1/(√(2π) ζ)) ∫_{−∞}^{∞} exp[ −½ ((y − (λ + ζ²))/ζ)² ] dy }
The braced integral is the total area under the N(λ + ζ², ζ) density, which is 1.0; hence
μ_X = exp(λ + ½ζ²)   ⇒   λ = ln(μ_X) − ½ζ²
Chen CL 30
Useful Probability Distributions
Lognormal Distribution
E(X²) = (1/(√(2π) ζ)) ∫_{−∞}^{∞} e^(2y) exp[ −½ ((y − λ)/ζ)² ] dy
      = (1/(√(2π) ζ)) ∫_{−∞}^{∞} exp{ −(1/(2ζ²)) [ y² − 2(λ + 2ζ²)y + λ² ] } dy
      = exp[2(λ + ζ²)] · { (1/(√(2π) ζ)) ∫_{−∞}^{∞} exp[ −½ ((y − (λ + 2ζ²))/ζ)² ] dy }
The braced integral is the area under the N(λ + 2ζ², ζ) density, which is 1.0; hence
E(X²) = exp[2(λ + ζ²)] = exp[ 2(λ + ½ζ²) + ζ² ] = μ_X² e^(ζ²)
Var(X) = E(X²) − μ_X² = μ_X² ( e^(ζ²) − 1 )
ζ² = ln[ 1 + (σ_X/μ_X)² ] = ln( 1 + δ_X² )
ζ ≈ δ_X   (if δ_X ≤ 0.3)
Chen CL 31
Useful Probability Distributions
Lognormal Distribution
The median, instead of the mean, is often used to designate the
central value of a lognormal random variable,
0.5 = P(X ≤ x_m) = Φ((ln x_m − λ)/ζ)
Φ^(−1)(0.50) = 0 = (ln x_m − λ)/ζ   ⇒   λ = ln x_m,   x_m = e^λ
μ_X = exp(λ + ½ζ²) = e^λ · exp[ ½ ln(1 + δ_X²) ] = x_m √(1 + δ_X²) > x_m
Chen CL 32
Useful Probability Distributions
Lognormal Distribution
Drainage from A Community
In Example 3.9, if the distribution of storm drainage from the community is a
lognormal random variable instead of normal, with the same mean and standard
deviation, the probability of flooding during a storm would be evaluated as follows.
First, we obtain the parameters λ and ζ of the lognormal distribution:
ζ² = ln[ 1 + (0.4/1.2)² ] = ln(1.111) = 0.105,   ζ = 0.324
λ = ln(1.20) − ½(0.324)² = 0.130
P(X > 1.50) = 1 − P(X ≤ 1.50) = 1 − Φ((ln 1.5 − 0.130)/0.324) = 1 − Φ(0.85) = 1 − 0.8023 = 0.198   (vs. 0.227)
which may be compared with the probability of 0.227 from Example 3.9,
illustrating the fact that the result depends on the underlying distribution of the
random variable.
Chen CL 33
Useful Probability Distributions
Lognormal Distribution
Drainage from A Community
Also, with the lognormal distribution, we obtain the probability that the drainage
will be between 1.0 mgd and 1.6 mgd:
P(1.0 < X ≤ 1.6) = Φ((ln 1.6 − 0.130)/0.324) − Φ((ln 1.0 − 0.130)/0.324)
                 = Φ(1.049) − Φ(−0.401) = Φ(1.049) − [1 − Φ(0.401)]
                 = 0.8531 − [1 − 0.6554] = 0.509   (vs. 0.533)
The 90% value of the drainage load from the community is
P(X ≤ x_0.90) = Φ((ln x_0.90 − 0.130)/0.324) = 0.90
(ln x_0.90 − 0.130)/0.324 = Φ^(−1)(0.90) = 1.28   (Table A.1)
x_0.90 = e^((0.324)(1.28) + 0.130) = 1.72 mgd   (vs. 1.71)
Chen CL 34
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
In many engineering applications, there are often problems involving the
occurrence or recurrence of an event, which is unpredictable, in a sequence
of discrete trials. For example,
In allocating a fleet of construction equipment for a project, the anticipated conditions of every piece of equipment in the fleet over the duration of the project would have some bearing on the determination of the required fleet size.
In planning the flood control system for a river basin, the annual maximum flow of the river over a sequence of years would be important in determining the design flood level. In these cases, the operational conditions of every piece of equipment, and the annual maximum flow of the river relative to a specified flood level, constitute the respective trials.
In these problems, there are only two possible outcomes in each trial: occurrence and nonoccurrence of an event.
Each piece of equipment may or may not malfunction over the duration of the project;
In each year, the maximum flow of the river may or may not exceed some specified flood level.
Chen CL 35
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
Problems of the type that we just described above may be modeled by a
Bernoulli sequence, which is based on the following assumptions:
1. In each trial, there are only two possibilities: the occurrence and nonoccurrence of an event.
2. The probability of occurrence of the event in each trial is constant.
3. The trials are statistically independent.
In the two examples introduced above, we may model each of the problems as a Bernoulli sequence as follows:
If, over the duration of the project, the operational conditions between equipment are statistically independent and the probability of malfunction is the same for every piece of equipment, then the conditions of the entire fleet of equipment constitute a Bernoulli sequence.
If the annual maximum floods between any 2 years are statistically independent and in each year the probability of the floods exceeding some specified level is constant, then the annual maximum floods over a series of years can be modeled as a Bernoulli sequence.
Chen CL 36
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
In a Bernoulli sequence, if X is the random number of occurrences
of an event among n trials, in which the probability of occurrence
of the event in each trial is p and the corresponding probability of
nonoccurrence is (1 − p), then the probability of exactly x occurrences among the n trials is governed by the binomial PMF,
P(X = x) = C(n, x) p^x (1 − p)^(n−x) ≡ b(x; n, p)
         = [ n!/(x!(n − x)!) ] p^x (1 − p)^(n−x),   x = 0, 1, ..., n
and the CDF is
P(X ≤ x) = Σ_{k=0}^{x} C(n, k) p^k (1 − p)^(n−k) ≡ B(x; n, p)
where C(n, x) denotes the binomial coefficient.
Chen CL 37
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
Cumulative Values for the Binomial Probability Distribution (Table A.2)
B(x; n, p) = P(X ≤ x) = Σ_{k=0}^{x} b(k; n, p)
Chen CL 38
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
x p = .01 p = .05 p = .10 p = .20 p = .30 p = .40 p = .50
n=1 0 0.9900 0.9500 0.9000 0.8000 0.7000 0.6000 0.5000
1 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
n=2 0 0.9801 0.9025 0.8100 0.6400 0.4900 0.3600 0.2500
1 0.9999 0.9975 0.9900 0.9600 0.9100 0.8400 0.7500
2 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
n =3 0 0.9703 0.8574 0.7290 0.5120 0.3430 0.2160 0.1250
1 0.9997 0.9927 0.9720 0.8960 0.7840 0.6480 0.5000
2 1.0000 0.9999 0.9990 0.9920 0.9730 0.9360 0.8750
3 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
n=4 0 0.9606 0.8145 0.6561 0.4096 0.2401 0.1296 0.0625
1 0.9994 0.9860 0.9477 0.8192 0.6517 0.4752 0.3125
2 1.0000 0.9995 0.9963 0.9728 0.9163 0.8208 0.6875
3 1.0000 1.0000 0.9999 0.9984 0.9919 0.9744 0.9375
4 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
n=5 0 0.9510 0.7738 0.5905 0.3277 0.1681 0.0778 0.0313
1 0.9990 0.9774 0.9185 0.7373 0.5282 0.3370 0.1875
2 1.0000 0.9988 0.9914 0.9421 0.8369 0.6826 0.5000
3 1.0000 1.0000 0.9995 0.9933 0.9692 0.9130 0.8125
4 1.0000 1.0000 1.0000 0.9997 0.9976 0.9898 0.9688
5 1.0000 1.0000 1.0000 1.0000
n=10 0 0.9044 0.5987 0.3487 0.1074 0.0282 0.0060 0.0010
1 0.9957 0.9139 0.7361 0.3758 0.1493 0.0464 0.0107
2 0.9999 0.9885 0.9298 0.6778 0.3828 0.1673 0.0547
3 1.0000 0.9990 0.9872 0.8791 0.6496 0.3823 0.1719
4 1.0000 0.9999 0.9984 0.9672 0.8497 0.6331 0.3770
5 1.0000 1.0000 0.9999 0.9936 0.9526 0.8338 0.6230
6 1.0000 1.0000 1.0000 0.9991 0.9894 0.9452 0.8281
7 0.9999 0.9999 0.9877 0.9453
8 1.0000 1.0000 0.9983 0.9893
9 0.9999 0.9990
10 1.0000 1.0000
Chen CL 39
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
x p = .01 p = .05 p = .10 p = .20 p = .30 p = .40 p = .50
n=20 0 0.8179 0.3585 0.1216 0.0115 0.0008 0.0000 0.0000
1 0.9831 0.7358 0.3917 0.0692 0.0076 0.0005 0.0000
2 0.9990 0.9245 0.6769 0.2061 0.0355 0.0036 0.0002
3 1.0000 0.9841 0.8670 0.4114 0.1071 0.0160 0.0013
4 1.0000 0.9974 0.9568 0.6296 0.2375 0.0510 0.0059
5 1.0000 0.9997 0.9887 0.8042 0.4164 0.1256 0.0207
6 1.0000 1.0000 0.9976 0.9133 0.6080 0.2500 0.0577
7 1.0000 1.0000 0.9996 0.9679 0.7723 0.4159 0.1316
8 1.0000 1.0000 0.9999 0.9900 0.8867 0.5956 0.2517
9 1.0000 1.0000 1.0000 0.9974 0.9520 0.7553 0.4119
10 0.9994 0.9829 0.8725 0.5881
11 0.9999 0.9949 0.9435 0.7483
12 1.0000 0.9987 0.9790 0.8684
13 0.9997 0.9935 0.9423
14 1.0000 0.9984 0.9793
15 0.9997 0.9941
16 1.0000 0.9987
17 0.9998
18 1.0000
n=50 0 0.6050 0.0769 0.0052 0.0000 0.0000 0.0000 0.0000
1 0.9106 0.2794 0.0338 0.0002 0.0000 0.0000 0.0000
2 0.9862 0.5405 0.1117 0.0013 0.0000 0.0000 0.0000
3 0.9984 0.7604 0.2503 0.0057 0.0000 0.0000 0.0000
4 0.9999 0.8964 0.4312 0.0185 0.0002 0.0000 0.0000
5 1.0000 0.9622 0.6161 0.0480 0.0007 0.0000 0.0000
6 1.0000 0.9882 0.7702 0.1034 0.0025 0.0000 0.0000
7 1.0000 0.9968 0.8779 0.1904 0.0073 0.0001 0.0000
8 1.0000 0.9992 0.9421 0.3073 0.0183 0.0002 0.0000
9 1.0000 0.9998 0.9755 0.4437 0.0402 0.0008 0.0000
10 1.0000 1.0000 0.9906 0.5836 0.0789 0.0022 0.0000
Chen CL 40
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
x p = .01 p = .05 p = .10 p = .20 p = .30 p = .40 p = .50
n=50 11 1.0000 1.0000 0.9968 0.7107 0.1390 0.0057 0.0000
12 1.0000 1.0000 0.9990 0.8139 0.2229 0.0133 0.0002
13 1.0000 1.0000 0.9997 0.8894 0.3279 0.0280 0.0005
14 1.0000 1.0000 0.9999 0.9393 0.4468 0.0540 0.0013
15 1.0000 1.0000 1.0000 0.9692 0.5692 0.0955 0.0033
16 0.9856 0.6839 0.1561 0.0077
17 0.9937 0.7822 0.2369 0.0164
18 0.9975 0.8594 0.3356 0.0325
19 0.9991 0.9152 0.4465 0.0595
20 0.9997 0.9522 0.5610 0.1013
21 0.9999 0.9749 0.6701 0.1611
22 1.0000 0.9877 0.7660 0.2399
23 0.9944 0.8438 0.3359
24 0.9976 0.9022 0.4439
25 0.9991 0.9427 0.5561
26 0.9997 0.9686 0.6641
27 0.9999 0.9840 0.7601
28 1.0000 0.9924 0.8389
29 0.9966 0.8987
30 0.9986 0.9405
31 0.9995 0.9675
32 0.9998 0.9836
33 0.9999 0.9923
34 1.0000 0.9967
35 0.9987
36 0.9995
37 0.9998
38 1.0000
Chen CL 41
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
x p = .01 p = .05 p = .10 p = .20 p = .30 p = .40 p = .50
n=100 0 0.3660 0.0059 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.7358 0.0371 0.0003 0.0000 0.0000 0.0000 0.0000
2 0.9206 0.1183 0.0019 0.0000 0.0000 0.0000 0.0000
3 0.9816 0.2578 0.0078 0.0000 0.0000 0.0000 0.0000
4 0.9966 0.4360 0.0237 0.0000 0.0000 0.0000 0.0000
5 0.9995 0.6160 0.0576 0.0000 0.0000 0.0000 0.0000
6 0.9999 0.7660 0.1172 0.0001 0.0000 0.0000 0.0000
7 1.0000 0.8720 0.2061 0.0003 0.0000 0.0000 0.0000
8 1.0000 0.9369 0.3209 0.0009 0.0000 0.0000 0.0000
9 1.0000 0.9718 0.4513 0.0023 0.0000 0.0000 0.0000
10 1.0000 0.9885 0.5832 0.0057 0.0000 0.0000 0.0000
11 1.0000 0.9957 0.7030 0.0126 0.0000 0.0000 0.0000
12 1.0000 0.9985 0.8018 0.0253 0.0000 0.0000 0.0000
13 1.0000 0.9995 0.8761 0.0469 0.0001 0.0000 0.0000
14 1.0000 0.9999 0.9274 0.0804 0.0002 0.0000 0.0000
15 1.0000 1.0000 0.9601 0.1285 0.0004 0.0000 0.0000
16 1.0000 1.0000 0.9794 0.1923 0.0010 0.0000 0.0000
17 1.0000 1.0000 0.9900 0.2712 0.0022 0.0000 0.0000
18 1.0000 1.0000 0.9954 0.3621 0.0045 0.0000 0.0000
19 1.0000 1.0000 0.9980 0.4602 0.0089 0.0000 0.0000
20 1.0000 1.0000 0.9992 0.5595 0.0165 0.0000 0.0000
21 1.0000 1.0000 0.9997 0.6540 0.0288 0.0000 0.0000
22 1.0000 1.0000 0.9999 0.7389 0.0479 0.0001 0.0000
23 1.0000 1.0000 1.0000 0.8109 0.0755 0.0003 0.0000
24 0.8686 0.1136 0.0006 0.0000
25 0.9125 0.1631 0.0012 0.0000
Chen CL 42
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
x p = .01 p = .05 p = .10 p = .20 p = .30 p = .40 p = .50
n=100 26 0.9442 0.2244 0.0024 0.0000
27 0.9658 0.2964 0.0046 0.0000
28 0.9800 0.3768 0.0084 0.0000
29 0.9888 0.4623 0.0148 0.0000
30 0.9939 0.5491 0.0248 0.0000
31 0.9969 0.6331 0.0398 0.0001
32 0.9984 0.7107 0.0615 0.0002
33 0.9993 0.7793 0.0913 0.0004
34 0.9997 0.8371 0.1303 0.0009
35 0.9999 0.8839 0.1795 0.0018
36 0.9999 0.9201 0.2386 0.0033
37 1.0000 0.9470 0.3068 0.0060
38 0.9660 0.3822 0.0105
39 0.9790 0.4621 0.0176
40 0.9875 0.5433 0.0284
41 0.9928 0.6225 0.0443
42 0.9960 0.6967 0.0666
43 0.9979 0.7635 0.0967
44 0.9989 0.8211 0.1356
45 0.9995 0.8689 0.1841
46 0.9997 0.9070 0.2421
47 0.9999 0.9362 0.3086
48 0.9999 0.9577 0.3822
49 1.0000 0.9729 0.4602
50 0.9832 0.5398
Chen CL 43
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
x p = .01 p = .05 p = .10 p = .20 p = .30 p = .40 p = .50
n=100 51 0.9900 0.6178
52 0.9942 0.6914
53 0.9968 0.7579
54 0.9983 0.8159
55 0.9991 0.8644
56 0.9996 0.9033
57 0.9998 0.9334
58 0.9999 0.9557
59 1.0000 0.9716
60 0.9824
61 0.9895
62 0.9940
63 0.9967
64 0.9982
65 0.9991
66 0.9996
67 0.9998
68 0.9999
69 1.0000
Chen CL 44
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
Road Graders of A Highway Project
Five road graders are used in the construction of a highway project. The operational life T of each grader is a lognormal random variable with a mean life of μ_T = 1500 hr and a c.o.v. of 30% (δ_T = σ_T/μ_T = 0.3; see Fig. E3.14).
Chen CL 45
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
Road Graders of A Highway Project
Assuming statistical independence among the conditions of the machines, the probability that two of the five machines will malfunction in less than 900 hr of operation can be evaluated. The parameters of the lognormal distribution are
ζ² = ln(1 + 0.3²) = 0.086,   ζ ≈ 0.30;   λ = ln(1500) − ½(0.3)² = 7.27
Then, the probability that a machine will malfunction within 900 hr is
p = P(T ≤ 900) = Φ((ln 900 − 7.27)/0.30) = Φ(−1.56) = 0.0594
For the five machines taken collectively, the actual operational lives of the different machines may conceivably be as shown in Fig. E3.14; i.e., machines No. 1 and 4 have operational lives less than 900 hr, whereas machines No. 2, 3, and 5 have operational lives longer than 900 hr. The corresponding probability of this exact sequence is p²(1 − p)³. But the two malfunctioning machines may happen to be any two of the five; therefore, the number of sequences with two malfunctioning machines among the five is 5!/(2!3!) = 10. Consequently, if X is the number of road graders malfunctioning within 900 hr,
P(X = 2) = 10 (0.0594)² (1 − 0.0594)³ = 0.0294
Chen CL 46
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
Road Graders of A Highway Project
The probability of malfunction among the five graders (i.e., that there will be malfunctions in one or more machines) would be
P(X ≥ 1) = 1 − P(X = 0) = 1 − (1 − 0.0594)^5 = 0.2638
The probability that there will be no more than two machines malfunctioning within 900 hr is
P(X ≤ 2) = Σ_{k=0}^{2} C(5, k) (0.0594)^k (1 − 0.0594)^(5−k)
         = (0.9406)^5 + 5(0.0594)(0.9406)^4 + 10(0.0594)²(0.9406)³
         = 0.7362 + 0.2325 + 0.0294 = 0.9981
This last result involves the CDF of the binomial distribution, which is tabulated in Table A.2 for limited values of the parameters. Using Table A.2 with n = 5, x = r = 2, and p = 0.05, we obtain the approximate value 0.9988.
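The whole calculation for this example — lognormal parameters, the per-machine probability p, and the three binomial results — can be scripted as below. This is an illustrative sketch with Φ computed through erf rather than read from Table A.1; the small differences from the slide's numbers come from the rounded λ and ζ used in the text.

```python
from math import erf, sqrt, log, comb

def std_normal_cdf(s):
    return 0.5 * (1.0 + erf(s / sqrt(2.0)))

# Lognormal operational life: mean 1500 hr, c.o.v. 0.30
zeta = sqrt(log(1.0 + 0.3 ** 2))                 # ~0.294 (rounded to 0.30 in the text)
lam = log(1500.0) - 0.5 * 0.3 ** 2               # ~7.27
p = std_normal_cdf((log(900.0) - lam) / 0.30)    # P(T <= 900 hr), ~0.06 (text: 0.0594)

def binom_pmf(x, n, p):
    # probability of exactly x malfunctions among n independent machines
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

print(binom_pmf(2, 5, p))                          # cf. 0.0294 in the text
print(1 - binom_pmf(0, 5, p))                      # cf. 0.2638
print(sum(binom_pmf(k, 5, p) for k in range(3)))   # cf. 0.9981
```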
Chen CL 47
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
In modeling problems with the Bernoulli sequence, the individual
trials must be discrete and statistically independent.
Certain continuous problems may be modeled (approximately at
least) with the Bernoulli sequence.
For example, time and space problems, which are generally
continuous, may be modeled with the Bernoulli sequence by
discretizing time (or space) into appropriate intervals and admitting
only two possibilities within each interval;
what happens in each time (or space) interval then constitutes a trial, and the finite series of intervals is then a Bernoulli sequence.
Chen CL 48
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
Rationing based on Annual Rainfall
The annual rainfall (accumulated generally during the winter and spring) of each
year in Orange County, California, is a Gaussian random variable with a mean of
15 in. and a standard deviation of 4 in.; i.e., N(15, 4).
Suppose the current water policy of the county is such that if the annual rainfall
is less than 7 in. for a given year, water rationing will be required during the
summer and fall of that year.
Assuming X is the annual rainfall, the probability of water rationing in Orange
County in any given year is then
P(X < 7) = Φ((7 − 15)/4) = Φ(−2.0) = 1 − Φ(2.0) = 1 − 0.9772 = 0.0228
Chen CL 49
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
Rationing based on Annual Rainfall
If the county wishes to reduce the probability of water rationing to half that of
the current policy, the annual rainfall below which rationing has to be imposed
would be determined as follows:
P(X < x_r) = Φ((x_r − 15)/4) = ½(0.0228) = 0.0114
(x_r − 15)/4 = Φ^(−1)(0.0114) = −Φ^(−1)(0.9886) = −2.28
x_r = 15 − (4)(2.28) = 5.88 in.
Under the current water policy, and assuming that the annual rainfalls between years are statistically independent, the probability that in the next 5 years there will be at least 1 year in which water rationing will be necessary would be determined as follows.
Denoting N as the number of years in the next 5 years in which rationing would be imposed, the probability is
P(N ≥ 1) = 1 − P(N = 0) = 1 − C(5, 0)(0.0228)^0 (0.9772)^5 = 0.109
Chen CL 50
Useful Probability Distributions
Bernoulli Sequence and Binomial Distribution
Rationing based on Annual Rainfall
Whenever the annual rainfall is less than 7 in. in any given year, the probability
of damage to the agricultural crops in the county is 30%.
Assuming that crop damages between dry years (i.e., with rainfall less than 7
in.) are statistically independent, the probability of crop damage (denoted D) in
the next 3 years may be of interest. In this case, the probability of crop damage
would depend on the number of years (between 0 and 3) that the annual rainfall
will be less than 7 in.; therefore, the solution requires the theorem of total
probability, as follows:
P(D) = 1 − P(D̄)
     = 1 − [ (1.00)(0.9772)^3
             + (0.70)^1 (1.00)^2 C(3, 1)(0.0228)(0.9772)^2
             + (0.70)^2 (1.00)^1 C(3, 2)(0.0228)^2 (0.9772)
             + (0.70)^3 (1.00)^0 C(3, 3)(0.0228)^3 ]
     = 1 − [0.9331 + 0.0457 + 0.0007 + 0] = 1 − 0.9795 = 0.020
The probability of crop damage in the next 3 years is only 2%.
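A numerical check of the rationing and crop-damage computations is sketched below (illustrative only; the variable names are ours and follow the events defined in the text).

```python
from math import erf, sqrt, comb

def std_normal_cdf(s):
    return 0.5 * (1.0 + erf(s / sqrt(2.0)))

p_dry = std_normal_cdf((7.0 - 15.0) / 4.0)            # P(rainfall < 7 in.) ~ 0.0228

# At least one rationing year in the next 5 years
p_rationing_5yr = 1.0 - (1.0 - p_dry) ** 5             # ~0.109

# P(no crop damage) via total probability over the number of dry years in 3 years
p_no_damage = sum(
    (0.70 ** k) * comb(3, k) * p_dry ** k * (1.0 - p_dry) ** (3 - k)
    for k in range(4)
)
print(round(p_rationing_5yr, 3), round(1.0 - p_no_damage, 3))   # ~0.109, ~0.020
```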
Chen CL 51
Useful Probability Distributions
Geometric Distribution
In a Bernoulli sequence, the number of trials until a specified event occurs for the first time is governed by the geometric distribution.
We might observe that if the event occurs for the first time on the nth trial, there must be no occurrence of this event in any of the prior (n − 1) trials.
Geometric Distribution:
If N is the random variable representing the number of trials until the occurrence of the event, then
P(N = n) = p q^(n−1),   n = 1, 2, ...;   (q = 1 − p)
Chen CL 52
Useful Probability Distributions
Geometric Distribution
Recurrence Time and Return Period
In a time (or space) problem that is appropriately discretized into time (or space) intervals and can be modeled as a Bernoulli sequence, the number of time intervals until the first occurrence of an event, T = N, is called the first occurrence time.
If the discretized time intervals in the sequence are statistically independent, the time until the first occurrence of an event must have the same distribution as the time between any two consecutive occurrences of the same event; thus the probability distribution of the recurrence time is equal to that of the first occurrence time.
The recurrence time in a Bernoulli sequence is therefore also governed by the geometric distribution.
The mean recurrence time, popularly known in engineering as the (average) return period, is
T̄ = E(T) = Σ_{t=1}^{∞} t p q^(t−1) = p (1 + 2q + 3q² + ...) = p / (1 − q)² = 1/p
Chen CL 53
Useful Probability Distributions
Geometric Distribution
Recurrence Time and Return Period, Ex: Building Design
Suppose that the building code for the design of buildings in a coastal region specifies the 50-yr wind as the design wind; that is, a wind velocity with a return period of 50 years, or, on the average, a design wind that may be expected to occur once every 50 yr.
In this case, the probability of encountering the 50-yr wind velocity in any 1 yr is p = 1/50 = 0.02.
The probability that a newly completed building in the region will be subjected to the design wind velocity for the first time in the fifth year after its completion is
P(T = 5) = (0.02)(0.98)^4 = 0.018
Chen CL 54
Useful Probability Distributions
Geometric Distribution
Recurrence Time and Return Period, Ex: Building Design
The probability that the first such wind velocity will occur within 5 yr after completion of the building would be
P(T ≤ 5) = Σ_{t=1}^{5} (0.02)(0.98)^(t−1) = 0.02 + 0.0196 + 0.0192 + 0.0188 + 0.0184 = 0.096
This latter event (the first occurrence of the wind velocity within 5 yr) is the same as the event of at least one 50-yr wind in 5 yr, which is also the complement of no 50-yr wind in 5 years; the desired probability may therefore also be calculated as 1 − (0.98)^5 = 0.096.
The above is quite different from the event of experiencing exactly one 50-yr wind in 5 yr; the probability in that case is given by the binomial probability C(5, 1)(0.02)(0.98)^4 = 0.092.
Chen CL 55
Useful Probability Distributions
Geometric Distribution
Recurrence Time and Return Period, Ex: Oshore Platform
A fixed offshore platform is designed for a wave height of 8 m above the mean sea level. This wave height corresponds to a 5% probability of being exceeded per year.
The return period of the design wave height is
T̄ = 1/0.05 = 20 yr
The probability that the platform will be subjected to the design wave height within the return period is
P(H > 8 m within 20 yr) = 1 − P(no exceedance in 20 yr) = 1 − (0.95)^20 = 1 − 0.3585 = 0.6415
Chen CL 56
Useful Probability Distributions
Geometric Distribution
Recurrence Time and Return Period, Ex: Oshore Platform
The probability that the first exceedance of the design wave height will occur after the third year is, by the geometric distribution,
P(T > 3) = 1 − P(T ≤ 3)
         = 1 − [ 0.05(0.95)^(1−1) + 0.05(0.95)^(2−1) + 0.05(0.95)^(3−1) ]
         = 1 − [0.05 + 0.0475 + 0.0451]
         = 1 − 0.1426 = 0.8574
If the first exceedance of the design wave height should occur after the third year as stipulated above, the probability that such a first exceedance will occur in the fifth year is then
P(T = 5 | T > 3) = P(T = 5 ∩ T > 3) / P(T > 3) = P(T = 5) / P(T > 3) = 0.05(0.95)^4 / 0.8574 = 0.0475
Chen CL 57
Useful Probability Distributions
Geometric Distribution
Recurrence Time and Return Period
The probability of no occurrence of an event within its return period T̄ is
P(no occurrence in T̄) = (1 − p)^T̄ = 1 − T̄p + [T̄(T̄ − 1)/2!] p² − [T̄(T̄ − 1)(T̄ − 2)/3!] p³ + ...
                      ≈ e^(−T̄p) = e^(−1) = 0.3679   (since T̄ = 1/p)
P(occurrence in T̄) = 1 − 0.3679 = 0.6321
For a rare event, defined as one with a long return period T̄, the probability of the event occurring within its return period is therefore always ≈ 0.632.
This result is a useful approximation even for return periods that are not very long; for instance, for T̄ = 20 time intervals, as in Example 3.16, the probability is (p = 1/20; q = 1 − 1/20)
P(occurrence in T̄) = 1 − (1 − 1/20)^20 = 1 − 0.3585 = 0.6415
which shows that the error of the exponential approximation above is less than 1.5% ((0.6415 − 0.6321)/0.6321 = 0.0149).
Chen CL 58
Useful Probability Distributions
Negative Binomial Distribution
The geometric PMF is the probability law governing the number of trials, or discrete time units, until the first occurrence of an event in a Bernoulli sequence.
The number of time units (or trials) until a subsequent occurrence of the same event is governed by the negative binomial distribution.
If T_k is the number of time units until the kth occurrence of the event in a series of Bernoulli trials, then
P(T_k = n) = C(n − 1, k − 1) p^k q^(n−k)   for n = k, k + 1, ...
P(T_k = n) = 0   for n < k
Chen CL 59
Useful Probability Distributions
Negative Binomial Distribution
If the kth occurrence of an event is realized at the nth trial, there must be exactly (k − 1) occurrences of the event in the prior (n − 1) trials, and at the nth trial the event also occurs.
Thus, from the binomial law, we obtain the probability
P(T_k = n) = C(n − 1, k − 1) p^(k−1) q^(n−k) · p
Chen CL 60
Useful Probability Distributions
Negative Binomial Distribution
Ex: Building Design
In the previous building example, the probability that the building in the region will be subjected to the design wind for the third time in the tenth year is
P(T_3 = 10) = C(10 − 1, 3 − 1)(0.02)³(0.98)^(10−3) = 36(0.000008)(0.8681) = 0.00025
The probability that the third design wind will occur within 5 years would be
P(T_3 ≤ 5) = Σ_{n=3}^{5} C(n − 1, 2)(0.02)³(0.98)^(n−3)
           = C(2, 2)(0.02)³(0.98)^0 + C(3, 2)(0.02)³(0.98)^1 + C(4, 2)(0.02)³(0.98)²
           = (0.000008) + 3(0.000008)(0.98) + 6(0.000008)(0.9604) = 0.00008
Chen CL 61
Useful Probability Distributions
Negative Binomial Distribution
Ex: A Steel Cable Problem
A steel cable is built up of a number of independent wires as shown in Fig.
E3.19. Occasionally, the cable is subjected to high overloads; on such occasions
the probability of fracture of one of the wires is 0.05, and the failure of two or
more wires during a single overload is unlikely.
Chen CL 62
Useful Probability Distributions
Negative Binomial Distribution
Ex: A Steel Cable Problem
If the cable must be replaced when the third wire fails, the probability that the cable can withstand at least five overloads can be determined as follows.
First, we observe that the third wire failure must occur at or after the sixth overload. Hence, the required probability is
P(T_3 ≥ 6) = 1 − P(T_3 < 6) = 1 − Σ_{n=3}^{5} P(T_3 = n)
           = 1 − C(2, 2)(0.05)³(0.95)^0 − C(3, 2)(0.05)³(0.95)^1 − C(4, 2)(0.05)³(0.95)²
           = 1 − 0.0012 = 0.9988
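A short sketch of the negative binomial PMF, used here to recompute the building-design and steel-cable results above (the helper name neg_binom_pmf is ours, not the text's):

```python
from math import comb

def neg_binom_pmf(n, k, p):
    """P(T_k = n): the k-th occurrence happens exactly on trial n."""
    return comb(n - 1, k - 1) * p ** k * (1 - p) ** (n - k)

# Building example: third design wind in the 10th year, and within 5 years
print(neg_binom_pmf(10, 3, 0.02))                                 # ~0.00025
print(sum(neg_binom_pmf(n, 3, 0.02) for n in range(3, 6)))        # ~0.00008

# Steel cable: probability of withstanding at least five overloads
print(1.0 - sum(neg_binom_pmf(n, 3, 0.05) for n in range(3, 6)))  # ~0.9988
```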
Chen CL 63
Useful Probability Distributions
Poisson Process and Poisson Distribution
Many physical problems of interest to engineers and scientists involve the possible
occurrences of events at any point in time and/or space.
Earthquakes could strike at any time and anywhere over a seismically active
region in the world;
Fatigue cracks may occur anywhere along a continuous weld; and
Traffic accidents could happen at any time on a given highway.
Conceivably, such space-time problems may be modeled also with the Bernoulli
sequence, by dividing the time or space into appropriate small intervals, and
assuming that an event will either occur or not occur (only two possibilities)
within each interval, thus constituting a Bernoulli trial.
However, if the event can randomly occur at any instant of time (or at any point
in space), it may occur more than once in any given time or space interval. In
such cases, the occurrences of the event may be more appropriately modeled
with a Poisson process or Poisson sequence.
Chen CL 64
Useful Probability Distributions
Poisson Process and Poisson Distribution
Formally, the Poisson process is based on the following assumptions:
1. An event can occur at random and at any instant of time or any point in
space.
2. The occurrence(s) of an event in a given time (or space) interval is statistically
independent of that in any other nonoverlapping interval.
3. The probability of occurrence of an event in a small interval Δt is proportional to Δt and can be given by ν Δt, where ν is the mean occurrence rate of the event (assumed to be constant).
4. The probability of two or more occurrences in Δt is negligible (of higher order in Δt).
If X_t is the number of occurrences in a time (or space) interval (0, t), then the number of statistically independent occurrences of an event in t (time or space) is governed by the Poisson PMF,
P(X_t = x) = [ (νt)^x / x! ] e^(−νt),   x = 0, 1, 2, ...
where ν is the mean occurrence rate, i.e., the average number of occurrences of the event per unit time (or space) interval; E(X_t) = νt = Var(X_t).
Chen CL 65
Useful Probability Distributions
Poisson Process and Poisson Distribution
The Bernoulli sequence approaches the Poisson process as the time (or space)
interval is decreased.
From previous statistical data of trac counts, an average of 60 cars per hour
was observed to make left turns at a given intersection. Then, suppose we
are interested in the probability of exactly 10 cars making left turns at the
intersection in a 10-min interval.
As an approximation, we may first divide the 1-hr duration into 120 intervals of 30 sec each (1 hour → 120 intervals; 10 min → 20 intervals), such that the probability of a left turn (L.T.) in any 30-sec interval would be p = 60/120 = 0.5 (120 × 0.5 = 60). Then, allowing no more than one L.T. in any 30-sec interval, the problem is reduced to the binomial probability of the occurrence of 10 L.T. among the maximum possible of 20 L.T. in the 10-min interval, in which the probability of an L.T. in each 30-sec interval is 0.5. Thus,
P(10 L.T. in 10 min) = C(20, 10)(0.5)^10 (0.5)^(20−10) = 0.1762
Chen CL 66
Useful Probability Distributions
Poisson Process and Poisson Distribution
The above solution is grossly approximate because it assumes that no more than
one car will be making L.T. in a 30-sec interval; obviously, two or more L.T.s
are possible.
The solution would be improved if we selected a shorter time interval, say a 10-sec interval (1 hour → 360 intervals; 10 min → 60 intervals). Then the probability of an L.T. in each interval is p = 60/360 = 0.1667 (360 × 0.1667 = 60), and
P(10 L.T. in 10 min) = C(60, 10)(0.1667)^10 (0.8333)^(60−10) = 0.1370
Further improvements can be made by subdividing the time into still shorter intervals. If the time t is subdivided into n equal intervals, then the binomial PMF would give
P(x occurrences in t) = C(n, x)(μ/n)^x (1 − μ/n)^(n−x)
where μ is the average number of occurrences of the event in time t.
Chen CL 67
Useful Probability Distributions
Poisson Process and Poisson Distribution
μ = νt: average number of events in t (min)
ν: average number of events per unit time (per min)
Divide t into n intervals, with p = μ/n = νt/n (events per trial).
(Assume no more than one event occurs in any one interval.)
P(x occurrences in t) = lim_{n→∞} C(n, x)(μ/n)^x (1 − μ/n)^(n−x)
                      = lim_{n→∞} [ n!/(x!(n − x)!) ] (μ/n)^x (1 − μ/n)^(n−x)
                      = lim_{n→∞} [ (n/n)((n − 1)/n)···((n − x + 1)/n) ] (μ^x/x!) (1 − μ/n)^n (1 − μ/n)^(−x)
As n → ∞, the first bracket → 1 (since n ≫ x), (1 − μ/n)^(−x) → 1, and
(1 − μ/n)^n → 1 − μ + μ²/2! − μ³/3! + ... = e^(−μ)
Therefore
P(x occurrences in t) = (μ^x/x!) e^(−μ) = [ (νt)^x/x! ] e^(−νt)
Chen CL 68
Useful Probability Distributions
Poisson Process and Poisson Distribution
On this basis, with ν = 1 L.T. per minute, the probability of x = 10 L.T. in t = 10 min is then
P(X_10 = 10) = [ (νt)^x/x! ] e^(−νt) = [ (1 × 10)^10/10! ] e^(−1×10) = 0.125
compared with the binomial approximations of 0.1370 and 0.1762 obtained above.
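The convergence of the binomial approximation toward the Poisson result (0.1762 → 0.1370 → 0.125) can be demonstrated by shrinking the interval length; the sketch below uses the same numbers as the text (ν = 1 L.T./min, t = 10 min, x = 10).

```python
from math import comb, exp, factorial

mu = 1.0 * 10.0           # nu * t: expected left turns in 10 minutes
x = 10

for n in (20, 60, 600, 6000):                  # number of sub-intervals in the 10 min
    p = mu / n                                 # P(one L.T. per sub-interval)
    binom = comb(n, x) * p ** x * (1 - p) ** (n - x)
    print(n, round(binom, 4))                  # 0.1762, 0.1370, ... -> Poisson limit

poisson = mu ** x / factorial(x) * exp(-mu)
print("Poisson:", round(poisson, 4))           # ~0.1251
```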
Chen CL 69
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: Severe Rainstorms
Historical records of severe rainstorms in a town over the last 20 years indicate an average of four rainstorms per year. Assuming that the occurrences of rainstorms may be modeled as a Poisson process, the probability that there will not be any rainstorms next year is
P(X_t = 0) = [ (4 × 1)^0/0! ] e^(−4×1) = 0.018
The probability of four rainstorms next year is
P(X_t = 4) = [ (4 × 1)^4/4! ] e^(−4×1) = 0.195
The PMF gives the probabilities of the different numbers of rainstorms in a year (X_t = 0, 1, ...).
Chen CL 70
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: Severe Rainstorms
Although the average yearly occurrence of rainstorms is four, the probability of actually experiencing four rainstorms in a year is less than 20%.
The probability of two or more rainstorms (x ≥ 2) in the next year is
P(X_1 ≥ 2) = 1 − P(X_1 ≤ 1) = 1 − P(X_1 = 0) − P(X_1 = 1)
           = 1 − Σ_{x=0}^{1} [ (4 × 1)^x/x! ] e^(−4×1)
           = 1 − 0.018 − 0.074 = 0.908
Chen CL 71
Useful Probability Distributions
Poisson Process and Poisson Distribution
Cumulative Values for the Poisson Probability Distribution
P(x; νt) = P(X ≤ x) = Σ_{k=0}^{x} p(k; νt)
Chen CL 72
Useful Probability Distributions
Poisson Process and Poisson Distribution
νt
x 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0
0 0.3679 0.1353 0.0498 0.0183 0.0067 0.0025 0.0009 0.0003 0.0001 0.0000
1 0.7358 0.4060 0.1991 0.0916 0.0404 0.0174 0.0073 0.003 0.0012 0.0005
2 0.9197 0.6767 0.4232 0.2381 0.1247 0.0620 0.0296 0.0138 0.0062 0.0028
3 0.9810 0.8571 0.6472 0.4335 0.2650 0.1512 0.0818 0.0424 0.0212 0.0103
4 0.9963 0.9473 0.8153 0.6288 0.4405 0.2851 0.1730 0.0990 0.0550 0.0293
5 0.9994 0.9834 0.9161 0.7851 0.6160 0.4457 0.3007 0.1912 0.1157 0.0671
6 0.9999 0.9955 0.9665 0.8893 0.7622 0.6063 0.4497 0.3134 0.2068 0.1301
7 1.0000 0.9989 0.9881 0.9489 0.8666 0.7440 0.5987 0.4530 0.3239 0.2202
8 0.9998 0.9962 0.9786 0.9319 0.8472 0.7291 0.5926 0.4557 0.3328
9 1.0000 0.9989 0.9919 0.9682 0.9161 0.8305 0.7166 0.5874 0.4579
10 0.9997 0.9972 0.9863 0.9574 0.9015 0.8159 0.7060 0.5830
11 0.9999 0.9991 0.9945 0.9799 0.9466 0.8881 0.8030 0.6968
12 1.0000 0.9997 0.9980 0.9912 0.9730 0.9362 0.8758 0.7916
13 0.9999 0.9993 0.9964 0.9872 0.9658 0.9262 0.8645
14 1.0000 0.9998 0.9986 0.9943 0.9827 0.9585 0.9165
15 0.9999 0.9995 0.9976 0.9918 0.9780 0.9513
16 1.0000 0.9998 0.9990 0.9963 0.9889 0.9730
17 0.9999 0.9996 0.9984 0.9947 0.9857
18 1.0000 0.9999 0.9993 0.9976 0.9928
19 0.9999 0.9997 0.9989 0.9965
20 1.0000 0.9999 0.9996 0.9984
21 1.0000 0.9998 0.9993
22 0.9999 0.9997
23 1.0000 0.9999
24 0.9999
25 1.0000
Chen CL 73
Useful Probability Distributions
Poisson Process and Poisson Distribution
νt
x 11.0 12.0 13.0 14.0 15.0 16.0 17.0 18.0 19.0 20.0
0 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.0002 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
2 0.0012 0.0005 0.0002 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
3 0.0049 0.0023 0.0011 0.0005 0.0002 0.0001 0.0000 0.0000 0.0000 0.0000
4 0.0151 0.0076 0.0037 0.0018 0.0009 0.0004 0.0002 0.0001 0.0000 0.0000
5 0.0375 0.0203 0.0107 0.0055 0.0028 0.0014 0.0007 0.0003 0.0002 0.0001
6 0.0786 0.0458 0.0259 0.0142 0.0076 0.0040 0.0021 0.0010 0.0005 0.0003
7 0.1432 0.0895 0.0540 0.0316 0.0180 0.0100 0.0054 0.0029 0.0015 0.0008
8 0.2320 0.1550 0.0998 0.0621 0.0374 0.0220 0.0126 0.0071 0.0039 0.0021
9 0.3405 0.2424 0.1658 0.1094 0.0699 0.0433 0.0261 0.0154 0.0089 0.0050
10 0.4599 0.3472 0.2517 0.1757 0.1185 0.0774 0.0491 0.0304 0.0183 0.0108
11 0.5793 0.4616 0.3532 0.2600 0.1847 0.1270 0.0847 0.0549 0.0347 0.0214
12 0.6887 0.5760 0.4631 0.3585 0.2676 0.1931 0.1350 0.0917 0.0606 0.0390
13 0.7813 0.6815 0.5730 0.4644 0.3632 0.2745 0.2009 0.1426 0.0984 0.0661
14 0.8540 0.7720 0.6751 0.5704 0.4656 0.3675 0.2808 0.2081 0.1497 0.1049
15 0.9074 0.8444 0.7636 0.6694 0.5681 0.4667 0.3714 0.2866 0.2148 0.1565
16 0.9441 0.8987 0.8355 0.7559 0.6641 0.5660 0.4677 0.3750 0.2920 0.2211
17 0.9678 0.9370 0.8905 0.8272 0.7489 0.6593 0.5640 0.4686 0.3784 0.2970
18 0.9823 0.9626 0.9302 0.8826 0.8195 0.7423 0.6549 0.5622 0.4695 0.3814
19 0.9907 0.9787 0.9573 0.9235 0.8752 0.8122 0.7363 0.6509 0.5606 0.4703
20 0.9953 0.9884 0.9750 0.9521 0.9170 0.8682 0.8055 0.7307 0.6472 0.5591
21 0.9977 0.9939 0.9859 0.9711 0.9469 0.9108 0.8615 0.7991 0.7255 0.6437
22 0.9989 0.9969 0.9924 0.9833 0.9672 0.9418 0.9047 0.8551 0.7931 0.7206
23 0.9995 0.9985 0.9960 0.9907 0.9805 0.9633 0.9367 0.8989 0.8490 0.7875
24 0.9998 0.9993 0.9980 0.9950 0.9888 0.9777 0.9593 0.9317 0.8933 0.8432
Chen CL 74
Useful Probability Distributions
Poisson Process and Poisson Distribution
νt
x 11.0 12.0 13.0 14.0 15.0 16.0 17.0 18.0 19.0 20.0
25 0.9999 0.9997 0.9990 0.9974 0.9938 0.9869 0.9747 0.9554 0.9269 0.8878
26 1.0000 0.9999 0.9995 0.9987 0.9967 0.9925 0.9848 0.9718 0.9514 0.9221
27 0.9999 0.9998 0.9994 0.9983 0.9959 0.9912 0.9827 0.9687 0.9475
28 1.0000 0.9999 0.9997 0.9991 0.9978 0.9950 0.9897 0.9805 0.9657
29 1.0000 0.9999 0.9996 0.9989 0.9973 0.9940 0.9881 0.9782
30 0.9999 0.9998 0.9994 0.9985 0.9967 0.9930 0.9865
31 1.0000 0.9999 0.9997 0.9992 0.9982 0.9960 0.9919
32 0.9999 0.9999 0.9996 0.9990 0.9978 0.9953
33 1.0000 0.9999 0.9998 0.9995 0.9988 0.9973
34 1.0000 0.9999 0.9997 0.9994 0.9985
35 0.9999 0.9999 0.9997 0.9992
36 1.0000 0.9999 0.9998 0.9996
37 1.0000 0.9999 0.9998
38 1.0000 0.9999
39 0.9999
40 1.0000
Chen CL 75
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: Left-turn Bay Design
In designing the left-turn bay at a state highway intersection, the vehicles making left turns at the intersection may be modeled as a Poisson process.
If the cycle time of the traffic light for left turns is 1 min, and the design criterion requires a left-turn lane that will be sufficient 96% of the time (which may be the criterion in some states in the United States), the lane distance, in terms of car lengths, to allow for an average of 100 left turns per hour may be determined as follows.
Chen CL 76
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: Left-turn Bay Design
The mean rate of left turns at the intersection is ν = 100/60 per minute.
Suppose the design length of the left-turn lane is x car lengths.
Then, during a 1-min cycle of the traffic light, the design criterion requires that the probability of no more than x cars waiting for left turns must be at least 96%:
P(X_{t=1} ≤ x) = Σ_{k=0}^{x} (1/k!)(100/60 × 1)^k e^(−(100/60)×1) ≥ 0.96
If x = 3:  P(X_{t=1} ≤ 3) = Σ_{k=0}^{3} (1/k!)(100/60)^k e^(−100/60) = 0.910
If x = 4:  P(X_{t=1} ≤ 4) = Σ_{k=0}^{4} (1/k!)(100/60)^k e^(−100/60) = 0.968
A left-turn bay of four car lengths at the intersection is therefore sufficient to satisfy the design requirement.
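The design check can be automated by scanning x upward until the Poisson CDF reaches the 96% criterion; the sketch below is illustrative (exact values differ slightly from the text's 0.910 and 0.968 because of rounding).

```python
from math import exp, factorial

nu_t = 100.0 / 60.0          # mean left turns per 1-min signal cycle

def poisson_cdf(x, m):
    return sum(m ** k / factorial(k) * exp(-m) for k in range(x + 1))

x = 0
while poisson_cdf(x, nu_t) < 0.96:    # smallest bay length meeting the criterion
    x += 1
print(x, round(poisson_cdf(3, nu_t), 3), round(poisson_cdf(4, nu_t), 3))
# 4 car lengths; P(X <= 3) ~ 0.91, P(X <= 4) ~ 0.97
```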
Chen CL 77
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: Trac Control at A School Crosswalk
The street width at a school crosswalk is D ft, and a child crossing the street walks at a speed of 3.5 ft/sec. In other words, it takes a child t = D/3.5 sec to cross the street.
Suppose 60 free intervals (of t seconds each) per hour, on the average, are desired at this crossing; how much average traffic volume can be allowed at this crosswalk before crossing controls become necessary?
Assume that the cars passing the crosswalk constitute a Poisson process.
The number of t-sec intervals in an hour is 3600/t, whereas in an interval of t sec the probability of no cars passing through the crosswalk is P(X = 0) = [ (νt)^0/0! ] e^(−νt) = e^(−νt), where ν is the average vehicular traffic per second.
Therefore the maximum average traffic volume that can be allowed is such that the mean number of free intervals equals 60; that is,
Chen CL 78
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: Trac Control at A School Crosswalk
60 = (3600/t) e^(−νt) = [ 3600/(D/3.5) ] e^(−ν D/3.5)
ν = (3.5/D) ln[ (3600)(3.5)/((60)(D)) ]
For D = 25:  ν = (3.5/25) ln[ (3600)(3.5)/((60)(25)) ] = 0.298 vehicles/sec ≈ 1073 vehicles/hr
For various street widths D, the maximum traffic flow that can be allowed before pedestrian crossing controls should be installed is:
D (ft):      25     40     60     75
ν (veh/hr):  1073   522    263    173
The above method has been adopted by the Joint Committee of the Institute of Traffic Engineers and the International Association of Chiefs of Police.
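The D-versus-ν table can be regenerated directly from the closed-form expression above; the sketch below (the function name and default arguments are ours) reproduces the listed values to within rounding.

```python
from math import log

def max_traffic_per_hour(D, speed=3.5, free_per_hour=60.0):
    """Largest mean flow nu (veh/hr) leaving 60 free gaps of D/speed seconds per hour."""
    t = D / speed                                          # crossing time in seconds
    nu = (1.0 / t) * log(3600.0 / (free_per_hour * t))     # veh/sec
    return nu * 3600.0                                     # veh/hr

for D in (25, 40, 60, 75):
    print(D, round(max_traffic_per_hour(D)))               # ~1073, 522, 263, 173
```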
Chen CL 79
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: A Steel Pipeline Problem
A major steel pipeline is used to transport crude oil from an oil production platform to a refinery over a distance of 100 km. Even though the entire pipeline is inspected once a year and repaired as necessary, the steel material is subject to damaging corrosion.
Assume that, from past inspection records, damaging corrosion is found to occur at an average rate of 0.15 locations per km of pipeline. In this case, if the occurrence of corrosion along the pipeline is modeled as a Poisson process with a mean occurrence rate of ν = 0.15/km, the probability that there will be 10 locations of damaging corrosion between inspections is
P(X_100 = 10) = [ (0.15 × 100)^10/10! ] e^(−0.15×100) = 0.049
Chen CL 80
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: A Steel Pipeline Problem
The probability of at least five corrosion sites between inspections (over the 100 km) is
P(X_100 ≥ 5) = 1 − P(X_100 ≤ 4)
             = 1 − Σ_{n=0}^{4} [ (0.15 × 100)^n/n! ] e^(−0.15×100)
             = 1 − e^(−15) [ 1 + 15 + 112.5 + 562.5 + 2109.4 ]
             = 1 − (3 × 10^(−7))(2800.4)
             = 1 − 0.0009 = 0.9991
Chen CL 81
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: A Steel Pipeline Problem
At any one of the corrosion sites, there may be one or more cracks that could initiate fracture failure. If the probability of this event occurring at a corrosion site is 0.001, the probability of fracture failure along the entire 100-km pipeline between inspection and repair would be (denoting F for fracture failure)
P(F) = 1 − P(F̄) = 1 − P(F̄ ∩ X_100 ≥ 0)
     = 1 − Σ_{n=0}^{∞} P(F̄ | X_100 = n) P(X_100 = n)
     = 1 − Σ_{n=0}^{∞} (1 − 0.001)^n [ (0.15 × 100)^n/n! ] e^(−0.15×100)
     = 1 − e^(−15) [ 1 + (0.999)(15/1!) + (0.999)²(15²/2!) + (0.999)³(15³/3!) + ... ]
     = 1 − e^(−15) e^(0.999×15) = 1 − e^(−(1−0.999)(15)) = 1 − e^(−0.015) = 1 − 0.985 = 0.015
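The three pipeline results (0.049, 0.9991, 0.015) can be recomputed as sketched below; the infinite sum for the fracture probability is replaced by its closed form 1 − e^(−(1 − 0.999)νt), which the series above justifies.

```python
from math import exp, factorial

nu_t = 0.15 * 100.0                      # mean number of corrosion sites in 100 km

def poisson_pmf(n, m):
    return m ** n / factorial(n) * exp(-m)

print(round(poisson_pmf(10, nu_t), 3))                               # ~0.049
print(round(1.0 - sum(poisson_pmf(n, nu_t) for n in range(5)), 4))   # ~0.9991
print(round(1.0 - exp(-(1.0 - 0.999) * nu_t), 3))                    # ~0.015
```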
Chen CL 82
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: Large Earthquakes
In the last 50 years, suppose that there were two large earthquakes (with magnitudes M ≥ 6) in Southern California. If we model the occurrences of such large earthquakes as a Bernoulli sequence, the probability of such large earthquakes in Southern California in the next 15 years would be evaluated as follows. First, the annual probability of occurrence of large earthquakes is p = 2/50 = 0.04. Then
P(X ≥ 1) = 1 − P(X = 0) = 1 − C(15, 0)(0.04)^0 (0.96)^15 = 0.458
If the occurrences of large earthquakes in Southern California were instead modeled as a Poisson process, we would first determine the mean occurrence rate as ν = 2/50 = 0.04 per year, and the probability of such large earthquakes in the next 15 years then becomes
P(X_15 ≥ 1) = 1 − P(X_15 = 0) = 1 − [ (0.04 × 15)^0/0! ] e^(−0.04×15) = 0.451
Chen CL 83
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: Large Earthquakes
Suppose that during an earthquake of M ≥ 6, the ground shaking intensity Y
at a particular building site has a lognormal distribution with a median of 0.20g
and a c.o.v. of 0.25 (i.e., λ = ln(x_m) = ln(0.2g), and ζ ≈ δ_Y = 0.25 since the c.o.v. is < 0.30).
If the seismic capacity of a building is 0.30g, the probability that the building
will suffer damage during an earthquake of magnitude M ≥ 6 would be
$$P(D \mid M \ge 6) = P(Y > 0.30g) = 1 - P(Y \le 0.30g) = 1 - \Phi\!\left(\frac{\ln(0.3g) - \lambda}{\zeta}\right)$$

$$= 1 - \Phi\!\left(\frac{\ln(0.3g) - \ln(0.2g)}{0.25}\right) = 1 - \Phi\!\left(\frac{\ln(1.5)}{0.25}\right) = 1 - 0.947 = 0.053$$

$$P(\bar D \mid M \ge 6) = 1 - P(D \mid M \ge 6) = 1 - 0.053 = 0.947$$
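The lognormal exceedance probability can be evaluated with the standard normal CDF (a sketch assuming SciPy; the names are illustrative):

```python
import math
from scipy.stats import norm

median_y = 0.20       # median ground-shaking intensity (g)
zeta = 0.25           # approx. equal to the c.o.v. (valid for c.o.v. < 0.30)
capacity = 0.30       # seismic capacity (g)

p_damage = 1 - norm.cdf((math.log(capacity) - math.log(median_y)) / zeta)
print(p_damage, 1 - p_damage)     # approx. 0.053 and 0.947
```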
Chen CL 84
Useful Probability Distributions
Poisson Process and Poisson Distribution
Ex: Large Earthquakes
In the next 20 years, the probability that the building will not suffer damage
from large earthquakes (assuming a Poisson process for the occurrences of large
earthquakes) would be (with ν = 0.04 per year, t = 20 years, and P(D̄ | M ≥ 6) = 0.947)

$$P(\bar D \text{ in 20 years}) = \sum_{n=0}^{\infty} (0.947)^n\, \frac{(0.04 \times 20)^n}{n!}\, e^{-0.04 \times 20}$$

$$= e^{-0.80}\left[1 + (0.947)\frac{(0.80)}{1!} + (0.947)^2\frac{(0.80)^2}{2!} + (0.947)^3\frac{(0.80)^3}{3!} + \cdots\right]$$

$$= e^{-0.80}\, e^{0.947 \times 0.80} = e^{-0.0424} = 0.958$$
Chen CL 85
Useful Probability Distributions
Poisson Process and Poisson Distribution
Further Notes
In both the Bernoulli sequence and the Poisson process, the
occurrences of an event in different trials (in the case of the
Bernoulli model) or in different intervals (in the Poisson model)
are statistically independent.
More generally, the occurrence of a given event in one trial (or
interval) may affect the occurrence or nonoccurrence of the same
event in subsequent trials (or intervals).
In other words, the probability of occurrence of an event in a given
trial may depend on earlier trials, and thus could involve conditional
probabilities.
If this conditional probability depends on the immediately preceding
trial (or interval), the resulting model is a Markov chain (or Markov
process).
Chen CL 86
Useful Probability Distributions
The Exponential Distribution
In the case of a Bernoulli sequence, the recurrence time between events (the
number of trials between occurrences) is described by the geometric distribution.
If the occurrences of an event constitute a Poisson process, the recurrence time
would be described by the exponential distribution.
In the case of a Poisson process, if T_1 is the time until the first occurrence of an
event, then (T_1 > t) means that there is no occurrence of the event in (0, t);

$$P(T_1 > t) = P(X_t = 0) = \frac{(\nu t)^0}{0!}\, e^{-\nu t} = e^{-\nu t}$$
Because the occurrences of an event in nonoverlapping intervals are statistically
independent, T_1 is also the recurrence time between two consecutive occurrences
of the same event.
The CDF (and PDF) of T_1, therefore, is the exponential distribution:

$$F_{T_1}(t) = P(T_1 \le t) = 1 - e^{-\nu t} \qquad f_{T_1}(t) = \frac{dF_{T_1}}{dt} = \nu\, e^{-\nu t}$$
Chen CL 87
Useful Probability Distributions
The Exponential Distribution
If the mean occurrence rate, ν, is constant, the mean recurrence time, E(T_1), or
return period, for a Poisson process can be shown to be

$$E(T_1) = \frac{1}{\nu}$$

This may be compared with the corresponding return period of 1/p for the
Bernoulli sequence.
However, for events with small occurrence rate ν, 1/ν ≈ 1/p.
Observation: in a Poisson process with occurrence rate ν, the probability of an
event occurring in a unit time interval (i.e., t = 1) is

$$p = P(X_1 = 1) = \nu\, e^{-\nu} = \nu\left(1 - \nu + \frac{1}{2}\nu^2 - \cdots\right) \approx \nu \quad \text{for small } \nu$$
For rare events, i.e., events with small mean occurrence rates or long return
periods, the Bernoulli and the Poisson models should give approximately the
same results.
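A small numerical illustration of this approximation (plain Python, not from the text):

```python
import math

# For small nu, the probability of one occurrence in a unit interval,
# nu * exp(-nu), is essentially nu itself.
for nu in (0.5, 0.1, 0.01):
    p = nu * math.exp(-nu)
    print(f"nu = {nu:5.2f}: p = {p:.5f}, relative difference = {(nu - p) / nu:.3f}")
```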
Chen CL 88
Useful Probability Distributions
The Exponential Distribution
Ex: Earthquakes in San Francisco
According to Benjamin (1968), the historical record of earthquakes in San
Francisco from 1836 to 1961 shows that there were 16 earthquakes with ground
motion intensity in MM-scale of VI or higher. If the occurrence of such high-
intensity earthquakes in the San Francisco-Bay Area can be assumed to constitute
a Poisson process, the probability that the next high-intensity earthquake will
occur within the next 2 years would be evaluated as follows.
The mean occurrence rate of high-intensity earthquakes in the region is

$$\nu = \frac{16}{125} = 0.128 \text{ quakes per year}$$

$$P(T_1 \le 2) = 1 - e^{-0.128 \times 2} = 0.226$$
Chen CL 89
Useful Probability Distributions
The Exponential Distribution
Ex: Earthquakes in San Francisco
The above is equivalent to the probability of the occurrence of such high-intensity
earthquakes (one or more) in the next two years. With the Poisson model, this
latter probability would be
$$P(X_2 \ge 1) = 1 - P(X_2 \le 0) = 1 - P(X_2 = 0) = 1 - \frac{(0.128 \times 2)^0}{0!}\, e^{-0.128 \times 2} = 1 - e^{-0.128 \times 2} = 0.226$$
The probability that no earthquake of this high intensity will occur in the next
10 years is (equivalently, by the Poisson distribution)

$$P(T_1 > 10) = e^{-0.128 \times 10} = 0.278$$

$$P(X_{10} = 0) = \frac{(0.128 \times 10)^0}{0!}\, e^{-0.128 \times 10} = 0.278$$
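These results can be checked with SciPy's exponential and Poisson distributions (a sketch; note that scipy parameterizes the exponential by scale = 1/ν):

```python
from scipy.stats import expon, poisson

nu = 16 / 125                            # quakes per year
print(expon.cdf(2, scale=1 / nu))        # P(T1 <= 2),  approx. 0.226
print(1 - poisson.pmf(0, nu * 2))        # P(X_2 >= 1), same event, approx. 0.226
print(expon.sf(10, scale=1 / nu))        # P(T1 > 10),  approx. 0.278
print(poisson.pmf(0, nu * 10))           # P(X_10 = 0), approx. 0.278
```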
Chen CL 90
Useful Probability Distributions
The Exponential Distribution
Ex: Earthquakes in San Francisco
The return period of an intensity-VI earthquake in San Francisco is

$$\bar T_1 = \frac{1}{0.128} = 7.8 \text{ years}$$

The probability of occurrence of large earthquakes within a given time t is given
by the CDF of T_1:

$$P(T_1 \le t) = 1 - e^{-0.128\, t}, \qquad t = 5, 10, \ldots$$

The probability of high-intensity earthquakes occurring within the return period
of 7.8 years in the San Francisco area would be

$$P(T_1 \le 7.8) = 1 - e^{-0.128 \times 7.8} = 1 - e^{-1.0} = 0.632$$
Chen CL 91
Useful Probability Distributions
The Exponential Distribution
For a Poisson process, the probability of an event occurring (once or more) within
its return period is always equal to 1 − e^{−1} = 0.632.
This may be compared with the corresponding probability for events with long
return periods in the Bernoulli model.
The exponential distribution is also useful as a general-purpose probability
function.
The PDF, CDF, and mean and variance of the exponential distribution are

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases} \qquad F_X(x) = \begin{cases} 1 - e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$$

$$\mu_X = \frac{1}{\lambda} \qquad \sigma_X^2 = \frac{1}{\lambda^2}$$
Chen CL 92
Useful Probability Distributions
The Shifted Exponential Distribution
The PDF and CDF of the exponential distribution start at x = 0.
In general, the distribution can start at any positive value of x;
the resulting distribution may be called the shifted exponential distribution.
The corresponding PDF and CDF, starting at a, are

$$f_X(x) = \begin{cases} \lambda e^{-\lambda(x-a)} & x \ge a \\ 0 & x < a \end{cases} \qquad F_X(x) = \begin{cases} 1 - e^{-\lambda(x-a)} & x \ge a \\ 0 & x < a \end{cases}$$
Chen CL 93
Useful Probability Distributions
The Shifted Exponential Distribution
The exponential distribution is appropriate for modeling the distribution of the
operational life, or time-to-failure, of systems under a chance (constant) failure
rate condition.
In this regard, the parameter λ is related to the mean life or mean time-to-failure
E(T) as

$$\lambda = \frac{1}{E(T)}$$

For a random variable X with the shifted exponential distribution starting at
x = a, the mean value of X would be

$$E(X) = a + \frac{1}{\lambda} \qquad E(X - a) = \frac{1}{\lambda} \qquad \sigma_X = \frac{1}{\lambda}$$
Chen CL 94
Useful Probability Distributions
The Shifted Exponential Distribution
Ex: Diesel Engines to Generate Backup Electrical Power
Suppose that four identical diesel engines are used to generate backup electrical
power for the emergency control system of a nuclear power plant. Assume
that at least two of the diesel-powered units are required to supply the needed
emergency power; in other words, at least two of the four engines must start
automatically during sudden loss of outside electrical power.
The operational life T of each diesel engine may be modeled with the shifted
exponential distribution, with a rated mean operational life of 15 years and a
guaranteed minimum life of 2 years.
In this case, the reliability of the emergency backup system would clearly be of
interest.
For example, the probability that at least two of the four diesel engines will
start automatically during an emergency within the first 4 years of the life of the
system can be determined as follows.
Chen CL 95
Useful Probability Distributions
The Shifted Exponential Distribution
Ex: Diesel Engines to Generate Backup Electrical Power
First, the probability that any one of the engines will start without any problem
within 4 years is (with λ = 1/(15 − 2), x = 4, a = 2)

$$P(T > 4) = 1 - P(T \le 4) = 1 - \left[1 - e^{-\left(\frac{1}{15-2}\right)(4-2)}\right] = 0.8574$$

Then, denoting N as the number of engines starting during an emergency, the
reliability of the backup system within 4 years is

$$P(N \ge 2) = \sum_{n=2}^{4} \binom{4}{n}(0.8574)^n (0.1426)^{4-n} = 1 - \sum_{n=0}^{1} \binom{4}{n}(0.8574)^n (0.1426)^{4-n}$$

$$= 1 - \binom{4}{0}(0.8574)^0(0.1426)^{4} - \binom{4}{1}(0.8574)^1(0.1426)^{3} = 1 - 0.0004 - 0.0099 = 0.990$$
Therefore, the reliability of the backup system within 4 years is 99%, even
though the reliability of each engine is only about 86%.
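A compact check of both steps (a sketch assuming SciPy; scipy's expon takes the shift as loc and 1/λ as scale):

```python
from scipy.stats import expon, binom

a, mean_life = 2.0, 15.0              # guaranteed minimum life and mean life (years)
lam = 1 / (mean_life - a)             # shifted-exponential parameter

p_engine = expon.sf(4, loc=a, scale=1 / lam)   # P(T > 4) for one engine, approx. 0.8574
p_system = binom.sf(1, 4, p_engine)            # P(N >= 2) = 1 - P(N <= 1), approx. 0.990
print(p_engine, p_system)
```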
Chen CL 96
Useful Probability Distributions
The Gamma Distribution
The PDF, mean, and variance of the gamma distribution for a random variable X
are (ν and k are the parameters of the distribution)

$$f_X(x) = \begin{cases} \dfrac{\nu(\nu x)^{k-1}}{\Gamma(k)}\, e^{-\nu x} & x \ge 0 \\[4pt] 0 & x < 0 \end{cases} \qquad \mu_X = \frac{k}{\nu} \qquad \sigma_X^2 = \frac{k}{\nu^2}$$

$$\Gamma(k) = \int_0^{\infty} x^{k-1} e^{-x}\, dx; \quad \text{for } k > 1.0: \ \Gamma(k) = (k-1)\Gamma(k-1) = (k-1)(k-2)\cdots(k-i)\Gamma(k-i)$$
Chen CL 97
Useful Probability Distributions
The Gamma Distribution
Calculation of the probability involving the gamma distribution can be performed
using tables of the incomplete gamma function, which are usually given for the
ratio (e.g., Harter, 1963):
$$I(u, k) = \frac{\int_0^{u} y^{k-1} e^{-y}\, dy}{\Gamma(k)}$$

$$P(a < X \le b) = \frac{\nu^k}{\Gamma(k)} \int_a^b x^{k-1} e^{-\nu x}\, dx \qquad \text{(let } y = \nu x\text{)}$$

$$= \frac{1}{\Gamma(k)}\left[\int_0^{\nu b} y^{k-1} e^{-y}\, dy - \int_0^{\nu a} y^{k-1} e^{-y}\, dy\right] = I(\nu b, k) - I(\nu a, k)$$
Therefore, in effect, the incomplete gamma function ratio is also the CDF of the
gamma distribution.
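In practice the tabulated incomplete gamma function ratio can be replaced by a library CDF; the sketch below (assuming SciPy, with purely illustrative parameter values not taken from the text) shows the correspondence P(X ≤ x) = I(νx, k):

```python
from scipy.stats import gamma
from scipy.special import gammainc

k, nu, x = 2.0, 0.5, 3.0                      # illustrative values, not from the text
print(gamma.cdf(x, a=k, scale=1 / nu))        # CDF of the gamma distribution
print(gammainc(k, nu * x))                    # regularized incomplete gamma I(nu*x, k): same value
```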
Chen CL 98
Useful Probability Distributions
The Gamma Distribution
Ex: Load on Buildings
The gamma distribution may be used to represent the distribution of the
equivalent uniformly distributed load (EUDL) on buildings. For a particular
building, if the mean EUDL is 15 psf (pounds per square foot) and the c.o.v. is
25%, the parameters of the appropriate gamma distribution are,
$$\delta = \frac{\sigma_X}{\mu_X} = \frac{\sqrt{k}/\nu}{k/\nu} = \frac{1}{\sqrt{k}} \quad\Rightarrow\quad k = \frac{1}{\delta^2} = \frac{1}{(0.25)^2} = 16 \qquad \nu = \frac{k}{\mu_X} = \frac{16}{15} = 1.067$$
The design live load is generally specified (conservatively) to be on the high side.
For instance, if the design EUDL is specified to be 25 psf, the probability that
this design load will be exceeded is

$$P(L > 25) = 1 - P(L \le 25) = 1 - I(1.067 \times 25,\ 16) = 1 - I(26.67, 16) = 1 - 0.671 = 0.329$$
Chen CL 99
Useful Probability Distributions
Gamma Distribution and Poisson Process
If the occurrences of an event constitute a Poisson process in time, then the
time until the kth occurrence of the event is governed by the gamma distribution.
Earlier, in Sect. 3.2.7, we saw that the time until the first occurrence of the
event is governed by the exponential distribution.
Let T_k denote the time until the kth occurrence of an event; then (T_k ≤ t)
means that there were k or more occurrences of the event in time t.
Hence, on the basis of Eq. 3.34, we obtain the CDF of T_k as

$$F_{T_k}(t) = \sum_{x=k}^{\infty} P(X_t = x) = 1 - \sum_{x=0}^{k-1} \frac{(\nu t)^x}{x!}\, e^{-\nu t} = 1 - \left[1 + \frac{(\nu t)}{1!} + \frac{(\nu t)^2}{2!} + \cdots + \frac{(\nu t)^{k-1}}{(k-1)!}\right] e^{-\nu t}$$

$$f_{T_k}(t) = \frac{dF_{T_k}(t)}{dt} = \frac{\nu(\nu t)^{k-1}}{(k-1)!}\, e^{-\nu t} \qquad t \ge 0 \qquad \text{(Eq. 3.44)}$$
Chen CL 100
Useful Probability Distributions
Gamma Distribution and Poisson Process
For k = 1, i.e., for the time until the first occurrence of an event, Eq. 3.44
reduces to the exponential distribution of Eq. 3.36:

$$f_{T_k}(t) = \frac{\nu(\nu t)^{k-1}}{(k-1)!}\, e^{-\nu t} \quad \text{(gamma distribution)} \qquad f_{T_1}(t) = \nu\, e^{-\nu t} \quad \text{(exponential distribution)}$$
The above gamma distribution with integer k is known also as the Erlang
distribution.
In this case, the mean time until the kth occurrence of an event and its variance
are

$$E(T_k) = \frac{k}{\nu} \qquad \mathrm{Var}(T_k) = \frac{k}{\nu^2}$$
Chen CL 101
Useful Probability Distributions
Gamma Distribution and Poisson Process
Ex: Fatal Accidents on A Particular Highway
Suppose that fatal accidents on a particular highway occur on the average about
once every 6 months. If we can assume that the occurrences of accidents on this
highway constitute a Poisson process, with mean occurrence rate ν = 1/6
per month, the time until the occurrence of the first accident (or between
two consecutive accidents) would be described by the exponential distribution,
specifically with the following PDF:

$$f_{T_1}(t) = \frac{1}{6}\,\frac{(t/6)^{(1-1)}}{(1-1)!}\, e^{-t/6}$$

The time until the occurrence of the second accident (or the time between every
other accident) on the same highway is described by the gamma distribution,
with the PDF

$$f_{T_2}(t) = \frac{1}{6}\,\frac{(t/6)^{(2-1)}}{(2-1)!}\, e^{-t/6}$$
Chen CL 102
Useful Probability Distributions
Gamma Distribution and Poisson Process
Ex: Fatal Accidents on A Particular Highway
Whereas the time until the occurrence of the third accident would also be
gamma distributed, with the PDF

$$f_{T_3}(t) = \frac{1}{6}\,\frac{(t/6)^{(3-1)}}{(3-1)!}\, e^{-t/6}$$
The above PDFs are illustrated graphically in Fig. E3.28, and the corresponding
mean occurrence times of T_1, T_2, and T_3 are, respectively, 6, 12, and 18 months.
Note: We might recognize that the exponential and gamma distributions
are the continuous analogues, respectively, of the geometric and negative
binomial distributions; that is, the geometric and negative binomial distributions
govern the first and kth occurrence times in a Bernoulli sequence, whereas the
exponential and gamma distributions govern the corresponding occurrence times
of a Poisson process.
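A brief simulation check of these statements (a sketch assuming NumPy; the seed and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
nu = 1 / 6                                    # accidents per month
n_sim = 100_000

# Inter-accident times are exponential; accumulating them gives the arrival
# times T1, T2, T3, whose means should be about 6, 12, and 18 months.
gaps = rng.exponential(scale=1 / nu, size=(n_sim, 3))
arrival_times = gaps.cumsum(axis=1)
print(arrival_times.mean(axis=0))             # approx. [6, 12, 18]
```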
Chen CL 103
Useful Probability Distributions
Shifted Gamma Distribution
Most probability distributions are described with two parameters, or even with
one parameter, such as the exponential distribution.
The shifted gamma distribution is one of the few exceptions, with three
parameters.
A three-parameter distribution may be useful for fitting statistical data in which
the skewness (involving the third moment) is significant; in particular,
the third parameter would be necessary in order to explicitly include the skewness
in the observed data.
As an extension of Eq. 3.42, the PDF of the three-parameter shifted gamma
distribution for a random variable X may be expressed as (with parameters ν, k, and γ)

$$f_X(x) = \frac{\nu\,[\nu(x-\gamma)]^{k-1}}{\Gamma(k)}\, e^{-\nu(x-\gamma)} \qquad x \ge \gamma$$

$$\mu_X = \gamma + \frac{k}{\nu} \qquad \sigma_X^2 = \frac{k}{\nu^2}$$
Chen CL 104
Useful Probability Distributions
Shifted Gamma Distribution
Ex: Residual Stresses in Flanges of Steel H-Section
The three-parameter gamma distribution can be shown to give a better fit to
statistical data when there is significant skewness in the observed data. For
instance, shown in Fig. E3.29 is the histogram of measured residual stresses in
the flanges of steel H-sections. The mean, standard deviation, and skewness
coefficient of the measured ratios of residual stress/yield stress are, respectively,
0.3561, 0.1927, and 0.8230.
Clearly, because the data show significant skewness, a three-parameter
distribution is necessary in order to include the skewness for adequately fitting
the histogram of the measured residual stresses. As shown in Fig. E3.29, the
three-parameter gamma PDF (solid curve) that includes the skewness of 0.8230
has a much closer fit to the histogram than the normal or lognormal distributions,
which are, of course, two-parameter distributions. This is further verified later
in Example 7.10 with the K-S goodness-of-fit test.
Chen CL 105
Useful Probability Distributions
Hypergeometric Distribution
The hypergeometric distribution arises when samples from a finite population,
consisting of two types of elements (e.g., good and bad), are being examined.
It is the basic distribution underlying many sampling plans used in connection
with acceptance sampling and quality control.
Consider a lot of N items, among which m are defective and the remaining
(N − m) items are good.
If a sample of n items is taken at random from this lot, the probability that
there will be x defective items in the sample is given by the hypergeometric
distribution as follows:
$$P(X = x) = \frac{\dbinom{m}{x}\dbinom{N-m}{n-x}}{\dbinom{N}{n}} \qquad x = 0, 1, 2, \ldots, m$$
Chen CL 106
Useful Probability Distributions
Hypergeometric Distribution
The above distribution is based on the following: in the lot of N items, the
number of samples of size n is $\binom{N}{n}$; among these, the number of samples
with x defectives is $\binom{m}{x}\binom{N-m}{n-x}$.
Therefore, assuming that the samples are equally likely to be selected, we obtain
the hypergeometric distribution.
Chen CL 107
Useful Probability Distributions
Hypergeometric Distribution
Ex: Detection of Strain Gages
In a box of 100 strain gages, suppose we suspect that there may be four gages
that are defective. If six of the gages from the box were used in an experiment,
the probability that one (and zero) defective gage was used in the experiment is
evaluated as follows (in this case, we have N = 100, m = 4, and n = 6); thus,
$$P(X = 1) = \frac{\dbinom{4}{1}\dbinom{100-4}{6-1}}{\dbinom{100}{6}} = 0.205 \qquad P(X = 0) = \frac{\dbinom{4}{0}\dbinom{100-4}{6-0}}{\dbinom{100}{6}} = 0.778$$
The probability that at least one defective gage was used in the experiment is

$$P(X \ge 1) = 1 - P(X = 0) = 1 - 0.778 = 0.222$$
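The same numbers follow from SciPy's hypergeometric distribution (a sketch; note scipy's argument convention M = lot size, n = defectives in the lot, N = sample size):

```python
from scipy.stats import hypergeom

rv = hypergeom(M=100, n=4, N=6)      # lot of 100 with 4 defectives; sample of 6
print(rv.pmf(1))                     # P(X = 1), approx. 0.205
print(rv.pmf(0))                     # P(X = 0), approx. 0.778
print(1 - rv.pmf(0))                 # P(X >= 1), approx. 0.222
```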
Chen CL 108
Useful Probability Distributions
Hypergeometric Distribution
Ex: A Large Reinforced Concrete Construction Project
In a large reinforced concrete construction project, 100 concrete cylinders are
to be collected from the daily concrete mixes delivered to the construction
site. Furthermore, to ensure material quality, the acceptance/rejection criterion
requires that ten of these cylinders (selected at random) must be tested for
crushing strength after curing for 1 week, and nine of the ten cylinders tested
must have a required minimum strength.
Q: Is the acceptance/rejection criterion stringent enough?
Whether the acceptance/rejection criterion is too stringent, or not stringent
enough, depends on whether it is difficult or easy for poor-quality concrete mixes
to go undetected.
Chen CL 109
Useful Probability Distributions
Hypergeometric Distribution
Ex: A Large Reinforced Concrete Construction Project
For example, if a fraction d of the concrete cylinders is defective, then on the basis
of the specified acceptance/rejection criterion, the probability of rejection of the
daily concrete mixes would be (denoting X as the number of defective cylinders
among the ten tested)

$$P(X > 1) = 1 - P(X \le 1) = 1 - \left[\frac{\dbinom{100d}{0}\dbinom{100(1-d)}{10}}{\dbinom{100}{10}} + \frac{\dbinom{100d}{1}\dbinom{100(1-d)}{9}}{\dbinom{100}{10}}\right]$$
Chen CL 110
Useful Probability Distributions
Hypergeometric Distribution
Ex: A Large Reinforced Concrete Construction Project
For example, if there are 5% (or 2%) defectives in the daily concrete mixes, i.e.,
d = 5% (or 2%):

$$d = 5\%: \quad P(\text{rejection}) = 1 - \left[\frac{\dbinom{5}{0}\dbinom{95}{10}}{\dbinom{100}{10}} + \frac{\dbinom{5}{1}\dbinom{95}{9}}{\dbinom{100}{10}}\right] = 1 - [0.5837 + 0.0034] = 0.413$$

$$d = 2\%: \quad P(\text{rejection}) = 1 - \left[\frac{\dbinom{2}{0}\dbinom{98}{10}}{\dbinom{100}{10}} + \frac{\dbinom{2}{1}\dbinom{98}{9}}{\dbinom{100}{10}}\right] = 1 - [0.8091 + 0.1818] = 0.009$$
Chen CL 111
Useful Probability Distributions
Hypergeometric Distribution
Ex: A Large Reinforced Concrete Construction Project
Therefore, if 5% of the concrete mixes were defective, it is likely (with 41%
probability) that the defective material will be discovered with the proposed
acceptance/rejection criterion,
whereas if 2% of the concrete mixes were defective, the likelihood of the daily
mixes being rejected is very low (with 0.009 probability).
Hence, if the contract requires concrete with less than 2% defectives, then the
proposed acceptance/rejection criterion is not stringent enough;
on the other hand, if material with 5% defectives is acceptable, then the proposed
criterion may be satisfactory.
Chen CL 112
Useful Probability Distributions
Beta Distribution
Most probability distributions are for random variables whose ranges of values are
unlimited in one or both directions.
In some engineering applications, there may be problems in which there are finite
lower and upper bounds on the values of the random variables; in these cases, probability
distributions with finite lower and upper limits would be appropriate.
The beta distribution is one of the few distributions appropriate for a random
variable whose range of possible values is bounded, say between a and b. Its
PDF is given by
$$f_X(x) = \begin{cases} \dfrac{1}{B(q,r)}\,\dfrac{(x-a)^{q-1}(b-x)^{r-1}}{(b-a)^{q+r-1}} & a \le x \le b \\[6pt] 0 & \text{otherwise} \end{cases}$$

$$f_X(x) = \begin{cases} \dfrac{1}{B(q,r)}\, x^{q-1}(1-x)^{r-1} & 0 \le x \le 1 \\[6pt] 0 & \text{otherwise} \end{cases} \qquad \text{(standard beta distribution)}$$

$$B(q,r) = \int_0^1 x^{q-1}(1-x)^{r-1}\, dx = \frac{\Gamma(q)\Gamma(r)}{\Gamma(q+r)} \qquad \text{(beta function)}$$
Chen CL 113
Useful Probability Distributions
Beta Distribution
The probability associated with a beta distribution can be evaluated in terms of
the incomplete beta function, and values of the incomplete beta function ratio
B_x(q, r)/B(q, r) have been tabulated.
Chen CL 114
Useful Probability Distributions
Beta Distribution
$$B_x(q, r) = \int_0^x y^{q-1}(1-y)^{r-1}\, dy \qquad 0 < x < 1.0$$

$$F_X(x) = \frac{1}{B(q,r)}\int_0^x y^{q-1}(1-y)^{r-1}\, dy = \frac{B_x(q,r)}{B(q,r)} \equiv \beta(x \mid q, r) \qquad \beta(x \mid q, r) = 1 - \beta(1-x \mid r, q)$$

$$P(x_1 < X \le x_2) = \frac{1}{B(q,r)}\int_{x_1}^{x_2} \frac{(x-a)^{q-1}(b-x)^{r-1}}{(b-a)^{q+r-1}}\, dx$$

Letting $y = \dfrac{x-a}{b-a}$, so that $1-y = \dfrac{b-x}{b-a}$ and $dy = \dfrac{dx}{b-a}$,

$$= \frac{1}{B(q,r)}\left[\int_0^{(x_2-a)/(b-a)} y^{q-1}(1-y)^{r-1}\, dy - \int_0^{(x_1-a)/(b-a)} y^{q-1}(1-y)^{r-1}\, dy\right]$$

and with $u = \dfrac{x_2-a}{b-a}$, $v = \dfrac{x_1-a}{b-a}$,

$$= \beta(u \mid q, r) - \beta(v \mid q, r)$$

$$\mu_X = a + \frac{q}{q+r}(b-a) \qquad \sigma_X^2 = \frac{qr}{(q+r)^2(q+r+1)}(b-a)^2$$

$$\theta_X = \frac{2(r-q)}{(q+r)(q+r+2)}(b-a) \qquad \tilde x = a + \frac{q-1}{q+r-2}(b-a) \quad \text{(mode of } X)$$
Chen CL 115
Useful Probability Distributions
Beta Distribution
Ex: Duration Required to Complete An Activity
The duration required to complete an activity in a construction project has been
estimated by the subcontractor to be as follows:
Minimum duration = 5 days
Maximum duration = 10 days
Expected duration = 7 days
The coefficient of variation of the required duration is estimated to be 10%.
In this case, the beta distribution may be appropriate with a = 5 days and
b = 10 days. The parameters of the distribution would be determined as follows:

$$5 + \frac{q}{q+r}(10-5) = 7 \quad\Rightarrow\quad q = \frac{2}{3}\, r$$

$$\frac{qr}{(q+r)^2(q+r+1)}(10-5)^2 = (0.1 \times 7)^2 \quad\Rightarrow\quad q = 3.26,\ r = 4.89$$
Chen CL 116
Useful Probability Distributions
Beta Distribution
Ex: Duration Required to Complete An Activity
The probability that the activity will be completed within 9 days is

$$P(T \le 9) = \beta(u \mid 3.26, 4.89), \qquad u = \frac{9-5}{10-5} = 0.8$$

From tables of the incomplete beta function ratio, we obtain, after suitable
interpolation,

$$P(T \le 9) = \beta(0.8 \mid 3.26, 4.89) = 0.993$$
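The tabulated incomplete beta function ratio can be replaced by a library CDF; a sketch (assuming SciPy) for the beta distribution scaled onto [5, 10]:

```python
from scipy.stats import beta

a, b = 5.0, 10.0            # minimum and maximum duration (days)
q, r = 3.26, 4.89           # parameters found above

# P(T <= 9): scipy's loc/scale shift maps the standard beta onto [a, b]
print(beta.cdf(9.0, q, r, loc=a, scale=b - a))    # approx. 0.993
```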
Chen CL 117
Multiple Random Variables
Chen CL 118
Multiple Random Variables
Ex: Rainfall Intensity and Temperature
Rainfall intensity at a gauge station: RV X
Temperature for run-off of a river: RV Y
(X = x, Y = y), or [(X = x) ∩ (Y = y)]:
a joint event defined by values of the RVs in the X-Y space

Joint probability mass function:

$$p_{X,Y}(x, y) = P[X = x \text{ and } Y = y]$$

Joint probability distribution function:

$$F_{X,Y}(x, y) = P[X \le x \text{ and } Y \le y] = \sum_{x_i \le x}\, \sum_{y_j \le y} p_{X,Y}(x_i, y_j)$$
Chen CL 119
Multiple Random Variables
Ex: May Temperature and Rainfall of US City
$$p_{X,Y}(65, 4) = \frac{14}{50} = 0.28$$

$$F_{X,Y}(55, 1) = \sum_{x_i \le 55}\, \sum_{y_j \le 1} p_{X,Y}(x_i, y_j) = 0 + 0 + 0.02 + 0.02 + 0.02 + 0.04 = 0.10$$
Chen CL 120
Multiple Random Variables
Joint Probability Distribution
The cumulative probability of the joint occurrence of the events defined by
X ≤ x and Y ≤ y is

$$F_{X,Y}(x, y) \equiv P[X \le x,\ Y \le y]$$

Axioms of probability:

$$F_{X,Y}(-\infty, -\infty) = 0 \qquad F_{X,Y}(\infty, \infty) = 1$$
$$F_{X,Y}(-\infty, y) = 0 \qquad F_{X,Y}(\infty, y) = F_Y(y) = P[Y \le y]$$
$$F_{X,Y}(x, -\infty) = 0 \qquad F_{X,Y}(x, \infty) = F_X(x) = P[X \le x]$$

Note: F_{X,Y}(x, y) is nonnegative and nondecreasing in x and y.
Chen CL 121
Multiple Random Variables
X, Y are Discrete RVs
Probability mass function:

$$F_{X,Y}(x, y) \equiv P[X \le x,\ Y \le y] = \sum_{x_i \le x}\sum_{y_j \le y} P[X = x_i,\ Y = y_j] = \sum_{x_i \le x}\sum_{y_j \le y} p_{X,Y}(x_i, y_j)$$

Conditional PMF:

$$p_{X|Y}(x \mid y) = P[X = x \mid Y = y] = \frac{p_{X,Y}(x, y)}{p_Y(y)} \qquad p_{Y|X}(y \mid x) = P[Y = y \mid X = x] = \frac{p_{X,Y}(x, y)}{p_X(x)}$$
Chen CL 122
Multiple Random Variables
X, Y are Discrete RVs
Marginal PMF:

$$p_X(x) = P[X = x] = \sum_{y_j} P[X = x \mid Y = y_j]\, P[Y = y_j] = \sum_{y_j} P[X = x,\ Y = y_j] = \sum_{y_j} p_{X,Y}(x, y_j)$$

$$p_Y(y) = P[Y = y] = \sum_{x_i} P[Y = y \mid X = x_i]\, P[X = x_i] = \sum_{x_i} p_{X,Y}(x_i, y)$$

Statistical independence:

$$p_{X|Y}(x \mid y) = p_X(x) \qquad p_{Y|X}(y \mid x) = p_Y(y) \qquad p_{X,Y}(x, y) = p_X(x)\, p_Y(y)$$
Chen CL 123
Multiple Random Variables
X, Y are Discrete RVs
Ex: Construction Labor Survey
From a survey of construction labor:
work duration (x = 6, 8, 10, 12 hrs) and
productivity (y = 50%, 70%, 90%)
Joint PMF p_{X,Y}(x, y):

(x, y)      # of obs.   Relative frequency
6, 50           2          0.014
6, 70           5          0.036
6, 90          10          0.072
8, 50           5          0.036
8, 70          30          0.216
8, 90          25          0.180
10, 50          8          0.058
10, 70         25          0.180
10, 90         11          0.079
12, 50         10          0.072
12, 70          6          0.043
12, 90          2          0.014
Total         139          1.000
Chen CL 124
Multiple Random Variables
X, Y are Discrete RVs
Ex: Construction Labor Survey
Marginal PMF:

$$p_X(x) = \sum_{y_j = 50, 70, 90} p_{X,Y}(x, y_j) \qquad p_X(8) = 0.036 + 0.216 + 0.180 = 0.432$$

Conditional probability:

$$p_{Y|X}(90\% \mid 8) = \frac{p_{X,Y}(8, 90\%)}{p_X(8)} = \frac{0.180}{0.432} = 0.417$$
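The marginal and conditional PMFs can be computed directly from the joint PMF table (plain Python; the dictionary below simply re-enters the table values):

```python
# Joint PMF: keys are (work duration x in hours, productivity y in %)
pmf = {
    (6, 50): 0.014, (6, 70): 0.036, (6, 90): 0.072,
    (8, 50): 0.036, (8, 70): 0.216, (8, 90): 0.180,
    (10, 50): 0.058, (10, 70): 0.180, (10, 90): 0.079,
    (12, 50): 0.072, (12, 70): 0.043, (12, 90): 0.014,
}

p_x8 = sum(p for (x, y), p in pmf.items() if x == 8)   # marginal p_X(8), approx. 0.432
p_y90_given_x8 = pmf[(8, 90)] / p_x8                   # conditional, approx. 0.417
print(p_x8, p_y90_given_x8)
```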
Chen CL 125
Multiple Random Variables
X, Y are Continuous RVs
Probability density function:

$$f_{X,Y}(x, y)\, dx\, dy \equiv P[x < X \le x + dx,\ y < Y \le y + dy]$$

$$F_{X,Y}(x, y) \equiv \int_{-\infty}^{x}\int_{-\infty}^{y} f_{X,Y}(u, v)\, dv\, du$$

Note:

$$(1)\quad f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\, \partial y} \qquad (2)\quad P[a < X \le b,\ c < Y \le d] = \int_a^b \int_c^d f_{X,Y}(u, v)\, dv\, du$$
Chen CL 126
Multiple Random Variables
X, Y are Continuous RVs
Conditional PDF:

$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} \qquad f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$

Marginal PDF:

$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy \qquad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx$$
Chen CL 127
Multiple Random Variables
Bivariate Normal Distribution
Probability density function:

$$f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1-\rho^2}} \exp\!\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_X)^2}{\sigma_X^2} - \frac{2\rho(x-\mu_X)(y-\mu_Y)}{\sigma_X \sigma_Y} + \frac{(y-\mu_Y)^2}{\sigma_Y^2}\right]\right\}$$
Chen CL 128
Multiple Random Variables
Bivariate Normal Distribution
Ex: Radar Network for Tracking a Satellite
Tracking a satellite using a radar network.
Forecast errors in azimuth (X) and elevation (Y):

$$\mu_X = \mu_Y = 0, \qquad \sigma_X = 5 \text{ sec}, \qquad \sigma_Y = 2 \text{ sec}, \qquad \rho = 0$$
Bivariate normal density function:

$$f_{X,Y}(x, y) = \frac{1}{20\pi} \exp\!\left[-\frac{1}{2}\left(\frac{x^2}{5^2} + \frac{y^2}{2^2}\right)\right]$$

Bivariate normal distribution function:

$$F_{X,Y}(x, y) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,(5)} \exp\!\left[-\frac{v^2}{2(25)}\right] \left\{\int_{-\infty}^{y} \frac{1}{\sqrt{2\pi}\,(2)} \exp\!\left[-\frac{u^2}{2(4)}\right] du\right\} dv = \Phi\!\left(\frac{x}{5}\right)\Phi\!\left(\frac{y}{2}\right)$$

The probability that the forecast azimuth and elevation errors do not each exceed
+3 seconds is

$$P[X \le 3 \text{ and } Y \le 3] = \Phi\!\left(\frac{3}{5}\right)\Phi\!\left(\frac{3}{2}\right) = (0.7257)(0.9332) = 0.677$$
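Because ρ = 0, the joint CDF factors into two univariate normal CDFs; a sketch (assuming SciPy) that also cross-checks against the bivariate normal directly:

```python
from scipy.stats import norm, multivariate_normal

sigma_x, sigma_y = 5.0, 2.0
print(norm.cdf(3 / sigma_x) * norm.cdf(3 / sigma_y))            # approx. 0.677

mvn = multivariate_normal(mean=[0.0, 0.0],
                          cov=[[sigma_x**2, 0.0], [0.0, sigma_y**2]])
print(mvn.cdf([3.0, 3.0]))                                      # same value
```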
Chen CL 129
Covariance

$$\sigma_{XY} \equiv \mathrm{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big] = E[XY] - E(X)E(Y)$$

If X and Y are statistically independent, Cov(X, Y) = 0:

$$f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$$

$$E[XY] = \iint xy\, f_{X,Y}(x, y)\, dy\, dx = \int x f_X(x)\, dx \int y f_Y(y)\, dy = E[X]\, E[Y]$$

If Cov(X, Y) is large and positive: values of X and Y tend to be both large or both
small relative to their respective means.
Chen CL 130
Covariance

$$\sigma_{XY} \equiv \mathrm{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big] = E[XY] - E(X)E(Y)$$

If Cov(X, Y) is large and negative: large values of X tend to occur with small values
of Y, and small X with large Y.
If Cov(X, Y) is small: there is little linear relationship between X and Y.
Cov(X, Y) is a measure of the degree of linear interrelationship between the variates X and Y.
Problem: Cov(X, Y) depends on the scales (units) of X and Y!
Chen CL 131
Correlation

$$\rho \equiv \frac{\mathrm{Cov}(X, Y)}{\sigma_X\, \sigma_Y} \in [-1, 1] \qquad \text{(independent of scaling)}$$

(Fig.: scatter plots illustrating ρ = +1.0, ρ = −1.0, 0 < ρ < 1.0, and ρ = 0)

Note: ρ is a measure of the linear relationship between X and Y;
it does NOT imply a causal effect between the variables.
Chen CL 132
Covariance and Correlation
Ex: A Cantilever Beam
S_1, S_2: statistically independent random loads, with means and standard deviations (μ_1, σ_1) and (μ_2, σ_2)
Shear force at the support: Q = S_1 + S_2; bending moment at the support: M = aS_1 + 2aS_2

$$\mu_Q = \mu_1 + \mu_2 \qquad \sigma_Q^2 = \sigma_1^2 + \sigma_2^2$$

$$\mu_M = a\mu_1 + 2a\mu_2 \qquad \sigma_M^2 = a^2\left(\sigma_1^2 + 4\sigma_2^2\right)$$
Chen CL 133
Covariance and Correlation
Ex: A Cantilever Beam
$$E[QM] = E\big[(S_1 + S_2)(aS_1 + 2aS_2)\big] = aE[S_1^2] + 3a\underbrace{E[S_1 S_2]}_{=E[S_1]E[S_2]} + 2aE[S_2^2]$$

$$= a(\sigma_1^2 + \mu_1^2) + 3a\,\mu_1\mu_2 + 2a(\sigma_2^2 + \mu_2^2)$$

$$= a(\sigma_1^2 + 2\sigma_2^2) + a(\mu_1^2 + 2\mu_2^2 + 3\mu_1\mu_2) = a(\sigma_1^2 + 2\sigma_2^2) + \mu_Q\mu_M$$

$$\mathrm{Cov}(Q, M) = E[QM] - \mu_Q\mu_M = a(\sigma_1^2 + 2\sigma_2^2)$$

$$\rho_{QM} = \frac{\mathrm{Cov}(Q, M)}{\sigma_Q\, \sigma_M} = \frac{a(\sigma_1^2 + 2\sigma_2^2)}{\sqrt{\left(\sigma_1^2 + \sigma_2^2\right)\, a^2\left(\sigma_1^2 + 4\sigma_2^2\right)}}$$

If σ_1 = σ_2:

$$\rho_{QM} = \frac{3}{\sqrt{10}} = 0.948$$
Q and M are strongly (linearly) correlated at the support, but there is NO causal
relation between them.
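A simulation check of ρ_QM (a sketch assuming NumPy; the load distributions, means, and lever arm are illustrative, since the result 3/√10 depends only on independence and σ_1 = σ_2):

```python
import numpy as np

rng = np.random.default_rng(1)
a = 1.0                                    # lever arm (any positive value)
mu, sigma = 10.0, 2.0                      # common mean and standard deviation

s1 = rng.normal(mu, sigma, 1_000_000)      # independent loads S1, S2
s2 = rng.normal(mu, sigma, 1_000_000)
q = s1 + s2                                # shear force at the support
m = a * s1 + 2 * a * s2                    # bending moment at the support

print(np.corrcoef(q, m)[0, 1])             # approx. 3/sqrt(10) = 0.948
```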
Chen CL 134
Thank You for Your Attention
Questions Are Welcome