
PROBABILITY

Introduction
Probability is a set of mathematical concepts and techniques for dealing with
uncertainty (i.e. outcomes we cannot predict with 100% accuracy), which arises in
almost every walk of life, such as gambling, investment, accidents, waiting times and
measurements.

Counting Rules
In many problems of probability and statistics, we need to list all the alternatives that
are possible in a given situation, or at least determine how many different possibilities
there are.
1. Factorial Notation:
For any natural number n, we use the notation n!, read as n factorial, to
denote the product of the first n consecutive positive integers. That is,
$$n! = (1)(2)(3)\cdots(n-2)(n-1)(n) = (n-1)!\,(n)$$
Note also that 0! = 1.
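The definition above can be checked numerically. The following is a minimal sketch in Python; the function name `factorial_product` is chosen here for illustration only and is compared against the standard library's `math.factorial`.

```python
import math


def factorial_product(n: int) -> int:
    """Return n! as the product 1 * 2 * ... * n, with 0! = 1 (empty product)."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


five_factorial = factorial_product(5)   # 1 * 2 * 3 * 4 * 5
zero_factorial = factorial_product(0)   # defined to be 1

# The identity n! = (n - 1)! * n from the text, checked for n = 6.
identity_holds = factorial_product(6) == factorial_product(5) * 6
```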

2. Basic Rules of Counting


(a) Multiplication Principle
Suppose n choices must be made, with $m_1$ ways to make choice 1, and for
each of these ways, $m_2$ ways to make choice 2, and so on, and with $m_n$
ways to make choice n. Then, there are $m_1 \times m_2 \times \cdots \times m_n$
different ways to make the entire sequence of choices.
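As an illustration of the multiplication principle, the sketch below counts the possible meals on a hypothetical menu (the menu items are invented for this example) in two ways: by multiplying the numbers of options, and by listing every sequence of choices.

```python
from itertools import product
from math import prod

# Hypothetical menu: 3 starters, 4 mains, 2 desserts.
starters = ["soup", "salad", "spring roll"]
mains = ["sushi", "ramen", "curry", "tempura"]
desserts = ["mochi", "ice cream"]

# Multiplication principle: m1 * m2 * m3.
count_by_rule = prod(len(options) for options in (starters, mains, desserts))

# Brute-force listing of every possible sequence of choices.
all_meals = list(product(starters, mains, desserts))
count_by_listing = len(all_meals)
```

Both counts are 3 × 4 × 2 = 24.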

(b) Simple Permutation


Ordered arrangements (i.e. order is important) of objects are called
permutations. The number of different (i.e. distinguishable) ordered
arrangements of r objects chosen from n different (or distinct) objects
without repetition is given by the formula:
$$_nP_r = n(n-1)(n-2)\cdots(n-r+1) = \frac{n!}{(n-r)!}$$

(c) Combination
The number of different ways of choosing r different objects from a set of n
distinct objects without regard to the order of selection is given by the formula:
$$_nC_r = C^n_r = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$
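The permutation and combination formulas can be compared with a direct listing. This is a small sketch, assuming Python 3.8 or later (for `math.perm` and `math.comb`); the values n = 5 and r = 3 and the labels A to E are arbitrary choices for illustration.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

n, r = 5, 3
objects = "ABCDE"  # five distinct objects

n_P_r = perm(n, r)  # n! / (n - r)!
n_C_r = comb(n, r)  # n! / (r! (n - r)!)

# Listing the arrangements and selections explicitly gives the same counts.
listed_permutations = list(permutations(objects, r))
listed_combinations = list(combinations(objects, r))

# Each selection of r objects can be ordered in r! ways, so nPr = nCr * r!.
relation_holds = n_P_r == n_C_r * factorial(r)
```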

Probability
1 Notation
(a) $P(E)$ is the probability (or likelihood or chance) that a particular random
event E will occur.
(b) $P(E')$ is the probability (or likelihood or chance) that a particular random
event E will not occur. E' is known as the complementary event of E.
(c) $P(E \cup F)$ is the probability that either event E or event F or both will occur.
(d) $P(E \cap F)$ is the probability that both events E and F will occur.
(e) $P(E \mid F) = \frac{P(E \cap F)}{P(F)}$ is the probability that event E will occur on the
condition that event F has occurred.

2 Rules
(a) $0 \le P(E) \le 1$
(b) $P(E) + P(E') = 1$
(c) $P(E \cup F) = P(E) + P(F) - P(E \cap F)$
(d) For any mutually exclusive events E and F (i.e. events E and F will not
occur simultaneously),
(i) $P(E \cap F) = 0$
(ii) $P(E \cup F) = P(E) + P(F)$
(e) For any independent events E and F (i.e. when the outcome of one event
does not affect the probability of occurrence of another event),
(i) $P(E \mid F) = P(E)$
(ii) $P(F \mid E) = P(F)$
(iii) $P(E \cap F) = P(E)P(F)$
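These rules can be verified on a small sample space. The sketch below uses two fair dice, with E = "the first die shows 6" and F = "the total is 7"; both the events and the helper `prob` are chosen here purely for illustration. Exact fractions avoid rounding issues.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
sample_space = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """Probability of an event, given as a true/false test on an outcome."""
    favourable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favourable, len(sample_space))


def in_E(outcome) -> bool:
    return outcome[0] == 6  # first die shows 6


def in_F(outcome) -> bool:
    return outcome[0] + outcome[1] == 7  # total is 7


p_E = prob(in_E)
p_F = prob(in_F)
p_not_E = prob(lambda o: not in_E(o))
p_E_and_F = prob(lambda o: in_E(o) and in_F(o))
p_E_or_F = prob(lambda o: in_E(o) or in_F(o))
p_E_given_F = p_E_and_F / p_F

complement_rule = p_E + p_not_E == 1
addition_rule = p_E_or_F == p_E + p_F - p_E_and_F
independent = p_E_and_F == p_E * p_F and p_E_given_F == p_E
```

For this particular pair of events, rule (e) holds, so E and F are independent even though they are not mutually exclusive.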

Conditional Probability
The conditional probability of event E given event F,
$P(E \mid F) = \frac{P(E \cap F)}{P(F)}$, takes
into account information about the occurrence of event F to find the probability of E.
This concept can be extended to revise probabilities based on new information and to
determine the probability that a particular effect was due to a specific cause. The
procedure for revising these probabilities is known as Bayes' theorem:
$$P(E_i \mid F) = \frac{P(E_i \cap F)}{P(F)} = \frac{P(E_i)P(F \mid E_i)}{P(F)} = \frac{P(E_i)P(F \mid E_i)}{\sum_k P(E_k)P(F \mid E_k)}$$

Some diagrams, such as probability trees, are useful aids for applying Bayes' theorem.
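The formula translates directly into a few lines of code. The following is a minimal sketch; the function name `bayes_posteriors` and the two-machine figures are hypothetical, introduced only to show the mechanics (priors times conditional probabilities, divided by their total).

```python
def bayes_posteriors(priors, likelihoods):
    """Return P(E_i | F) for each cause E_i.

    priors[i]      = P(E_i), for mutually exclusive causes whose priors sum to 1
    likelihoods[i] = P(F | E_i)
    """
    joints = [p * l for p, l in zip(priors, likelihoods)]  # P(E_i and F)
    p_f = sum(joints)                                      # P(F), the denominator
    return [joint / p_f for joint in joints]


# Hypothetical figures: machine 1 makes 60% of items, 2% of them defective;
# machine 2 makes 40% of items, 5% of them defective. F = "item is defective".
posteriors = bayes_posteriors([0.6, 0.4], [0.02, 0.05])
```

With these assumed figures, a defective item came from machine 1 with probability 0.012 / 0.032 = 0.375 and from machine 2 with probability 0.625.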

Normal Distribution
Frequency polygons or frequency distributions can assume almost any shape or form,
depending on the data. However, the data obtained from many experiments in
practical situations often follow a common pattern, known as the normal distribution
or the Gaussian distribution. It is generally regarded as the most important
distribution, and much statistical theory is based on it.

1. Properties of normal distributions, where X follows $N(\mu, \sigma^2)$, include:
(a) it is bell-shaped and thus symmetrical about its mean in appearance,
(b) the mean, the median, mid-range, mid-hinge and the mode are all equal,
(c) the total area under the curve above the x-axis is equal to 1,
(d) the normal distribution is completely determined by its parameters, mean $\mu$
and standard deviation $\sigma$. That is, each different value of $\mu$ or $\sigma$
specifies a different normal distribution,
(e) the area under the normal curve from X = a to X = b is the probability
that an observed data value will be between a and b.

2. Standard Normal Distribution


The normal distribution is really a family of distributions in which one member
is distinguished from another on the basis of the values of the mean $\mu$ and standard
deviation $\sigma$. In other words, there is a different normal distribution for each
different value of either the mean $\mu$ or the standard deviation $\sigma$. The most
important member of this family of distributions is the standard normal
distribution, which has mean $\mu = 0$ and standard deviation $\sigma = 1$, i.e.
Z follows $N(\mu = 0, \sigma^2 = 1^2)$.
Moreover, the area under a normal curve for X following $N(\mu, \sigma^2)$ between X = a
and X = b is the same as the area under the standard normal curve between the
Z-score for a (i.e. $\frac{a-\mu}{\sigma}$) and the Z-score for b (i.e. $\frac{b-\mu}{\sigma}$). That is,
$$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right) \quad\text{or}\quad \frac{X-\mu}{\sigma} = Z$$
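The equivalence between the two areas can be checked with Python's `statistics.NormalDist` (available from Python 3.8). The values of the mean, standard deviation, a and b below are arbitrary assumptions for illustration, not taken from the notes.

```python
from statistics import NormalDist

# Assumed illustrative values: X follows N(mean = 100, sd = 15).
mu, sigma = 100.0, 15.0
a, b = 85.0, 130.0

X = NormalDist(mu, sigma)
Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area under the curve of X between a and b.
p_direct = X.cdf(b) - X.cdf(a)

# Same area via Z-scores and the standard normal curve.
z_a = (a - mu) / sigma
z_b = (b - mu) / sigma
p_via_z = Z.cdf(z_b) - Z.cdf(z_a)
```

Here the Z-scores are -1 and 2, and both routes give an area of about 0.8186, which is what a Z-table lookup would also give.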

Sampling Distributions of Sample Means
The sampling distribution of the sample means $\bar{X}$ is the (probability) distribution
consisting of all possible sample means of a given sample size n, selected from a
population (with population mean $\mu$ and population standard deviation $\sigma$). Both
the population mean $\mu$ and population standard deviation $\sigma$ are constants but usually
unknown to us.
Sampling distributions of sample means $\bar{X}$ depend on three factors:
a. shape of population: normal (symmetrical) or non-normal (non-symmetrical),
b. population standard deviation $\sigma$: known or unknown, and
c. sample size n: large or small.

1. Central Limit Theorem:


The sampling distribution of the sample means $\bar{X}$ from most populations (with
population mean $\mu$ and population standard deviation $\sigma$) is approximately
normally distributed if the sample size n is at least 30 observations.

That is, $\bar{X}$ follows $N\left(\mu_{\bar{X}} = \mu,\ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\right)$,

where $\mu_{\bar{X}}$ and $\sigma_{\bar{X}}$ are the mean and standard deviation of the distribution of
sample means $\bar{X}$. (Note: $\sigma_{\bar{X}}$ is also known as the standard error of the
sample means.) As the sample size increases, the standard error of the sample
means decreases, so that a larger proportion of sample means $\bar{X}$ is closer to the
population mean $\mu$. If the population standard deviation $\sigma$ is not available,
substitute $\sigma$ with the sample standard deviation
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}.$$
2. If the population distribution is fairly symmetrical, the sampling distribution of
the sample means is approximately normal if samples of at least 15 observations
are selected.
3. If the population is normally distributed, the sampling distribution of the sample
means is normally distributed regardless of the sample size.
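A short simulation can make the Central Limit Theorem concrete. The sketch below is an illustration under stated assumptions, not part of the original notes: the population is taken to be exponential with rate 1 (strongly skewed, with mean 1 and standard deviation 1), and the seed and number of repeated samples are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(2024)  # fixed seed so the run is repeatable

n = 30              # sample size, the threshold quoted in the text
num_samples = 2000  # number of repeated samples (arbitrary choice)

# Assumed population: exponential with rate 1, so mean = 1 and sd = 1.
population_mean = 1.0
population_sd = 1.0

# Draw many samples of size n and record each sample mean.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)          # should be near the population mean
sd_of_means = stdev(sample_means)           # should be near the standard error
standard_error = population_sd / sqrt(n)    # sigma / sqrt(n)
```

Even though the assumed population is far from normal, the simulated sample means cluster around 1 with a spread close to 1/√30 ≈ 0.183.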

Sampling Distribution of Sample Proportions
The sampling distribution of sample proportions $\hat{P}$ is a (probability) distribution
consisting of all possible sample proportions of a given sample size n selected from a
population with population proportion p (which is a constant but usually unknown to
us). Based on the Central Limit Theorem, for a sufficiently large sample the sampling
distribution of sample proportions $\hat{P}$ approximately follows a normal distribution
$N\left(\mu_{\hat{P}} = p,\ \sigma_{\hat{P}} = \sqrt{\frac{p(1-p)}{n}}\right)$.
If p is not available, substitute p with the sample proportion $\hat{p}$. As the sample
size increases, the standard error of the sample proportions decreases, so that a larger
proportion of sample proportions $\hat{P}$ is closer to the population proportion p.
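The standard error formula for proportions can be evaluated directly. In this sketch the population proportion p = 0.4, the sample size n = 100 and the threshold 0.45 are assumed values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

p = 0.4   # assumed population proportion
n = 100   # assumed sample size

# Standard error of the sample proportion: sqrt(p (1 - p) / n).
standard_error = sqrt(p * (1 - p) / n)

# Normal approximation to the sampling distribution of the sample proportion.
sampling_dist = NormalDist(mu=p, sigma=standard_error)

# Probability that a sample proportion exceeds 0.45.
p_above_45 = 1 - sampling_dist.cdf(0.45)

# Quadrupling the sample size halves the standard error.
standard_error_n400 = sqrt(p * (1 - p) / (4 * n))
```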

EXAMPLE 1:
(a) A sushi cafe has three waiters. On a particular day, waiter A serves 50% of the
customers and she has a record of incorrect service in 1% of the orders. Waiter B
serves 30% of the customers, and he has a record of 0.07% incorrect service.
Waiter C serves the rest of the customers, and he has a record of incorrect service
in 1.2% of the orders.
(i) Present the above information with a tree diagram.
(ii) By using the result of part (i) or otherwise, find the probability of being
served by waiter A, given that you have been served an incorrect order.
(iii) Comment with justification on the service level of the cafe.

(b) In order to survive in the competitive market, the sushi cafe also provides delivery
service. The manager finds that the delivery times of sushi to customers are
normally distributed with a mean of 30 minutes and a standard deviation of 10
minutes. She promises to deliver sushi to customers within 35 minutes and will
give a free salad dish costing $8 for each late delivery.

(i) If a customer has already waited for 25 minutes, what is the chance that the
customer will finally get a free salad?

(ii) If there are 500 delivery orders a month, what is the expected monthly cost of
giving free salad dishes?

(iii) If the manager decides to cut 50% of the expected monthly cost of giving free
salad in (ii) above, how should the promised delivery time be revised?

Example 1 Outlined Solution
a (i)
Let A = event of waiter A serving, B = event of waiter B serving, and C = event
of waiter C serving.
Cr = event of a correct service, Cr' = event of an incorrect service

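Tree diagram (each branch: prior probability, then conditional probability, then joint probability):

Waiter (prior)   Service (conditional)   Joint probability
A (0.5)          Cr  (0.99)              0.5 × 0.99   = 0.495
A (0.5)          Cr' (0.01)              0.5 × 0.01   = 0.005
B (0.3)          Cr  (0.9993)            0.3 × 0.9993 = 0.29979
B (0.3)          Cr' (0.0007)            0.3 × 0.0007 = 2.1 × 10^-4
C (0.2)          Cr  (0.988)             0.2 × 0.988  = 0.1976
C (0.2)          Cr' (0.012)             0.2 × 0.012  = 0.0024
                                         Total        = 1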

a (ii)
$$P(A \mid Cr') = \frac{0.5 \times 0.01}{0.5 \times 0.01 + 0.3 \times 0.0007 + 0.2 \times 0.012} = \frac{0.005}{0.005 + 2.1 \times 10^{-4} + 0.0024} = \frac{500}{761} \approx 0.6570$$

a (iii)
The likelihood that a customer would be served correctly is

$$P(Cr) = 0.5 \times 0.99 + 0.3 \times 0.9993 + 0.2 \times 0.988 = 0.495 + 0.29979 + 0.1976 \approx 0.9924$$

which is very close to 1. This reflects a very good service level.
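The arithmetic in parts a(ii) and a(iii) can be reproduced with exact fractions. This is a sketch using the figures given in the question; the dictionary and variable names are chosen here for illustration.

```python
from fractions import Fraction

# Share of customers served by each waiter (prior probabilities).
priors = {
    "A": Fraction(50, 100),
    "B": Fraction(30, 100),
    "C": Fraction(20, 100),
}

# Probability of an incorrect service for each waiter: 1%, 0.07%, 1.2%.
p_incorrect_given = {
    "A": Fraction(1, 100),
    "B": Fraction(7, 10000),
    "C": Fraction(12, 1000),
}

# Joint probabilities P(waiter and incorrect service), one per tree branch.
joint_incorrect = {w: priors[w] * p_incorrect_given[w] for w in priors}

p_incorrect = sum(joint_incorrect.values())               # P(Cr')
p_a_given_incorrect = joint_incorrect["A"] / p_incorrect  # P(A | Cr')
p_correct = 1 - p_incorrect                               # P(Cr)
```

The exact results are P(A | Cr') = 500/761 and P(Cr) = 0.99239, matching the rounded values 0.6570 and 0.9924 in the solution.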

(b)

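(i) Let X be the delivery time.

X follows a normal distribution with mean = 30 and standard deviation = 10.

The customer gets a free salad if X > 35, given that X ≥ 25 has already been observed.
Because a delivery that takes more than 35 minutes has necessarily taken at least
25 minutes, the joint event is simply X > 35. Hence
$$P(X > 35 \mid X \ge 25) = \frac{P(X > 35)}{P(X \ge 25)} = \frac{P\left(Z > \frac{35-30}{10}\right)}{P\left(Z \ge \frac{25-30}{10}\right)} = \frac{P(Z > 0.5)}{P(Z \ge -0.5)} = \frac{1 - 0.6915}{1 - 0.3085} = \frac{0.3085}{0.6915} \approx 0.4461$$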
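As a cross-check on the table lookup in part (i), the same conditional probability can be computed with `statistics.NormalDist`. This is a sketch; the variable names are chosen here, and the result differs from the table-based answer only in the fourth decimal place because no rounding of Z-table values is involved.

```python
from statistics import NormalDist

# Delivery time X: normal with mean 30 minutes and standard deviation 10 minutes.
delivery = NormalDist(mu=30, sigma=10)

p_late = 1 - delivery.cdf(35)        # P(X > 35)
p_waited_25 = 1 - delivery.cdf(25)   # P(X >= 25)

# P(X > 35 | X >= 25): the event X > 35 already implies X >= 25.
p_free_salad_given_wait = p_late / p_waited_25
```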

(ii) Expected Monthly cost

= 500*P(Z > (35-30)/10)*$8

= 500*(1 - 0.6915)*$8

= $1234
(iii) Let T be the new time required.

P(X > T) = 50%*P(X > 35) = 50%*0.3085 = 0.154

$\frac{T - 30}{10} = 1.02$ from the Z-table, as P(Z > 1.02) = 0.1539

T = 30+10*1.02

= 40.2 min
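Parts (ii) and (iii) can be checked the same way, using the inverse of the normal cumulative distribution instead of reading the Z-table backwards. This is a sketch with names chosen for illustration; small differences from the table-based answers come from table rounding.

```python
from statistics import NormalDist

delivery = NormalDist(mu=30, sigma=10)  # delivery time in minutes

orders_per_month = 500
salad_cost = 8  # dollars per late delivery

# (ii) Expected monthly cost = orders * P(late) * cost per salad.
p_late = 1 - delivery.cdf(35)
expected_monthly_cost = orders_per_month * p_late * salad_cost

# (iii) Halving the expected cost means halving the probability of a late delivery.
target_p_late = 0.5 * p_late

# Find T such that P(X > T) = target_p_late, i.e. P(X <= T) = 1 - target_p_late.
revised_promise = delivery.inv_cdf(1 - target_p_late)
```

The expected cost comes out at about $1234 and the revised promised time at about 40.2 minutes, in line with the outlined solution.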

EXAMPLE 2:
(a) A couple, Bob and Doris, have life insurance policies. The risk department of
their insurance company is reviewing their policies and, based on the information
they have provided, has assessed the probability that Bob will live 20 more years
as 0.8 and the probability that Doris will live 20 more years as 0.85.

(i) Do we have sufficient information to find the probability that both Bob and
Doris will live 20 more years? If your answer is NO, state an assumption
which can help to find the probability. (2 marks)

(ii) Based on your answer or assumption in Part (i), find the probability that at
least one of them will live 20 more years. (5 marks)

(iii) Based on your answer or assumption in Part (i), find the probability that
exactly one of them will live 20 more years. (5 marks)

(b) A new test has been recently introduced to diagnose EBOLA. For people with
EBOLA, it is known that the test scores follow a normal distribution with mean
70 and standard deviation 5. For people without the disease, the test scores
follow another normal distribution with mean 61 and the same standard deviation
5. It is estimated that 15% of the population in a seriously affected city in
Africa has the disease. It is proposed that a person be classified as having the
disease if the person's test score exceeds 66; otherwise the person will be
classified as not having the disease. If a person is randomly selected from the
population to take the test,
(i) what is the probability that this person will be classified as having the disease?
(8 marks)
(ii) find the probability that this person will be misclassified. (5 marks)

(TOTAL 25 MARKS)

EXAMPLE 2 Outlined Solution


(a)
(i) Let B be the event that Bob will live 20 more years and
D be the event that Doris will live 20 more years.
There are two cases for calculating P(B and D):
P(B and D) = P(B) × P(D) if B and D are independent (i.e. whether one person
lives 20 more years does not affect the chance that the other person lives 20
more years).
P(B and D) = P(B) × P(D|B) = P(D) × P(B|D) if B and D are dependent (i.e.
whether one person lives 20 more years affects the chance that the other person
lives 20 more years).
Since we do not know how the life of one person depends on the other (i.e.
P(B|D) and P(D|B) are not known), we do not have sufficient information. We
therefore assume B and D to be independent.
(ii) P(B or D) = P(B) + P(D) - P(B and D)
= 0.8 + 0.85 - 0.8 × 0.85
= 0.8 + 0.85 - 0.68 = 0.9700
(iii) P(B and D') = P(B) × (1 - P(D))
= 0.8 × (1 - 0.85) = 0.1200
P(B' and D) = (1 - P(B)) × P(D)
= (1 - 0.8) × 0.85 = 0.1700
Answer: 0.1200 + 0.1700 = 0.2900
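The three answers to part (a) follow from the independence assumption and can be checked in a few lines. This is a sketch; the variable names are chosen here, and the last line adds a complement-rule check that is not in the outlined solution.

```python
p_bob = 0.8     # P(B): Bob lives 20 more years
p_doris = 0.85  # P(D): Doris lives 20 more years

# Assumption from part (i): B and D are independent.
p_both = p_bob * p_doris

# (ii) At least one lives 20 more years: addition rule.
p_at_least_one = p_bob + p_doris - p_both

# (iii) Exactly one lives 20 more years: (B and not D) or (not B and D).
p_exactly_one = p_bob * (1 - p_doris) + (1 - p_bob) * p_doris

# Cross-check of (ii) via the complement: 1 - P(neither).
p_neither = (1 - p_bob) * (1 - p_doris)
```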

(b) This question is very similar to one example we did in class (i.e. Supplementary
Ex 7.4, Q18). Please refer to that problem/solution and use the probability tree
diagram to solve this question as follows:

Let X and Y be the scores of those with and without the disease.
$X \sim N(\mu = 70, \sigma^2 = 25)$ and $Y \sim N(\mu = 61, \sigma^2 = 25)$
(i) There are two possibilities that a person is classified as having the disease
- the person has the disease and the person is also classified as having
the disease or
- the person does not have the disease but the person is classified as
having the disease.
Therefore, the required probability
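$$\begin{aligned}
&= 0.15 \times P(X > 66) + (1 - 0.15) \times P(Y > 66) \\
&= 0.15 \times P\left(Z > \frac{66 - 70}{5}\right) + 0.85 \times P\left(Z > \frac{66 - 61}{5}\right) \\
&= 0.15 \times P(Z > -0.8) + 0.85 \times P(Z > 1) = 0.15 \times 0.7881 + 0.85 \times 0.1587 \\
&\approx 0.2531
\end{aligned}$$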

(ii) There are two possibilities that a person is misclassified


- the person has the disease but the person is classified as not having the
disease or
- the person does not have the disease but the person is classified as
having the disease.
Therefore, the required probability
$$\begin{aligned}
&= 0.15 \times P(X < 66) + (1 - 0.15) \times P(Y > 66) \\
&= 0.15 \times P(Z < -0.8) + 0.85 \times P(Z > 1) = 0.15 \times 0.2119 + 0.85 \times 0.1587 \\
&\approx 0.1667
\end{aligned}$$
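Both parts of (b) combine the two normal distributions with the 15% prevalence, which can be checked numerically. This is a sketch using `statistics.NormalDist`; the variable names are chosen for illustration.

```python
from statistics import NormalDist

scores_with_disease = NormalDist(mu=70, sigma=5)     # X
scores_without_disease = NormalDist(mu=61, sigma=5)  # Y

prevalence = 0.15  # proportion of the population with the disease
cutoff = 66        # classified as having the disease if the score exceeds this

p_true_positive = 1 - scores_with_disease.cdf(cutoff)      # P(X > 66)
p_false_negative = scores_with_disease.cdf(cutoff)         # P(X < 66)
p_false_positive = 1 - scores_without_disease.cdf(cutoff)  # P(Y > 66)

# (i) Classified as having the disease: true positives plus false positives.
p_classified_positive = (
    prevalence * p_true_positive + (1 - prevalence) * p_false_positive
)

# (ii) Misclassified: false negatives plus false positives.
p_misclassified = (
    prevalence * p_false_negative + (1 - prevalence) * p_false_positive
)
```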

Class Practice 1:

(a) Replacement times for TV sets are normally distributed with a mean of 8.2 years
and a standard deviation of 1.1 years, based on a Consumer Council report.
(i) Find the probability that a randomly selected TV will have a replacement
time of less than 5.0 years.
(ii) If you want to provide a warranty so that only 2% of the TV sets will be
replaced before the warranty expires, what is the time length of the
warranty?
(iii) Suppose you have a 4-year-old TV. What is the chance that you will not
have to replace your TV within the coming two years?

(b) An insurance company classifies drivers into 3 categories: safe, normal and risky.
From past records, the proportion of drivers in each category and the probability
that a driver in each category will not incur accidents in a year are as shown in
the following table.

Category   Proportion   Probability of not having accidents in a year
Safe       0.15         0.98
Normal     0.8          0.93
Risky      0.05         0.85

Suppose a new customer makes a third party insurance contract with the
company. What is the probability that (s)he will incur accidents within a year?

(c) AD students taking a particular course may suffer from two kinds of problems,
repeat (R) and drop (D), which occur independently. 20% of students repeat
the course and 5% of students drop the course eventually.
Find the proportion of students who suffer at least one of these two problems.

(d) Seven students (A, B, C, D, E, F and G) are going to present one by one. Find
the number of arrangements that can be made so that A will present before D and
D before E.

(e) Each student at a local university has to take only C(omputing) or only
M(athematics) or both C and M. The probability that a student is taking C
given that (s)he is taking M is 1/5, and the probability that a student is taking M
given that (s)he is taking C is 1/3. Find the probability that a student selected at
random is taking both C and M.

Class Practice 2:
The lifetime of a certain type of battery is normally distributed with a mean of 8,200 hours
and a standard deviation of 50 hours.
(a) What is the probability that a battery of this type has a lifetime between 8,210 and
8,220 hours?
(b) A random sample of 40 such batteries is taken. What is the probability that
their mean lifetime will be between 8,210 and 8,220 hours?
