Rohini Somanathan
Administrative Information
Internal Assessment: 25% for Part 1
1. Midterm: 20%
2. Lab assignments, tutorial attendance and class participation: 5%
Problem Sets: Do as many problems from the book as you can. All odd-numbered
exercises have solutions, so focus on these.
Tutorials: Check the notice board in front of the lecture theatre for lists.
Punctuality is critical: coming in late disturbs the rest of the class and me.
[Figure: probability of each outcome k = 0, ..., 10; vertical axis: probability]
P(0)=.001, P(1)=.01, P(2)=.044, P(3)=.12, P(4)=.21, P(5)=.25 ... (display binomial(10, k, .5))
When should we conclude that there is gender bias? Can we get an estimate of this bias?
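The probabilities listed for this example can be computed directly. The sketch below assumes the model is Binomial(10, 0.5), that is, 10 independent selections with probability .5 of picking a woman each time. The function name `binomial_pmf` is illustrative and not from the slides.

```python
from math import comb

def binomial_pmf(n: int, k: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumed model: n = 10 selections, success probability 0.5.
pmf = [binomial_pmf(10, k, 0.5) for k in range(11)]
for k, prob in enumerate(pmf):
    print(f"P({k}) = {prob:.4f}")
```

Outcomes far out in the tails of this distribution are the ones that would be hard to reconcile with unbiased selection.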
Definitions
An experiment is any process whose outcome is not known in advance with certainty. These
outcomes may be random or non-random, but we should be able to specify all of them and
attach probabilities to them.
Experiment                   Event
10 coin tosses               4 heads
select 10 LS MPs             one is female
go to your bus-stop at 8
Let us define the event A as at least one head. Then $A = \{s_1, \ldots, s_7\}$ and $A^c = \{s_8\}$. A and $A^c$ are
exhaustive events.
The events "exactly one head" and "exactly two heads" are mutually exclusive events.
Notice that there are lots of different ways in which we can define a sample space, and the
most useful way to do so depends on the event we are interested in (the number of heads or,
when picking from a deck of cards, the suit, the number or both).
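$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i) \quad \text{for disjoint events } A_1, A_2, \ldots$$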
Note:
We will typically use P(A) or Pr(A) instead of $\mathcal{P}(A)$.
For finite sample spaces, $\mathcal{S}$ is straightforward to define. For any S which is a subset of the
real line (and therefore infinite), let $\mathcal{S}$ be the set of all intervals in S.
Result 3: If $A_1$ and $A_2$ are subsets of S such that $A_1 \subset A_2$, then $P(A_1) \leq P(A_2)$.
Proof: Let's write $A_2$ as $A_2 = A_1 \cup (A_1^c \cap A_2)$. Since these are disjoint, we can use property
3 to get $P(A_2) = P(A_1) + P(A_1^c \cap A_2)$. The second term on the RHS is non-negative (by
axiom 1), so $P(A_2) \geq P(A_1)$.
Result 4: For each $A \subset S$, $0 \leq P(A) \leq 1$.
Proof: Since $\emptyset \subset A \subset S$, we can directly apply the previous result to obtain
$P(\emptyset) \leq P(A) \leq P(S)$ or $0 \leq P(A) \leq 1$.
(b) $\frac{1}{2} < x + y < \frac{3}{2}$
(c) $y < 1 - x^2$
(d) $x = y$
answers: (1) 1/2, 1/6, 3/8 (2) .1, .4 (3) $1 - \pi/4$, 3/4, 2/3, 0
The probability of any event A can now be found as the sum of $p_i$ for all outcomes $s_i$ that
belong to A.
A sample space containing n outcomes is called a simple sample space if the probability
assigned to each of the outcomes $s_1, \ldots, s_n$ is $\frac{1}{n}$. Probability measures are easy to define in
such spaces. If the event A contains exactly m outcomes, then $P(A) = \frac{m}{n}$.
Notice that for the same experiment, we can define the sample space in multiple ways
depending on the events of interest. For example, suppose we're interested in obtaining a
given number of heads in the tossing of 3 coins. Our sample space can either comprise all
8 possible outcomes (a simple space) or just four outcomes (0, 1, 2 and 3 heads).
We can arrive at the total number of elements in a sample space by listing all possible
outcomes. A simple sample space for a coin-tossing experiment with 3 fair coins would have
eight possible outcomes, a roll of two dice would have 36, etc. We then just count the
number of elements contained in our event A and divide this by the total number of
outcomes to get our probability (P(2 heads) = 3/8 and P(sum of 7) = 1/6).
Listing outcomes can take a long time, and we can use a number of counting methods to
make things easier and avoid mistakes.
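For small experiments the listing approach can be automated. The following is a minimal sketch that enumerates a simple sample space and counts favourable outcomes; the helper name `probability` is an illustrative choice, not notation from the slides.

```python
from fractions import Fraction
from itertools import product

def probability(sample_space, event):
    """P(A) = m / n in a simple sample space: favourable outcomes over all outcomes."""
    outcomes = list(sample_space)
    favourable = [s for s in outcomes if event(s)]
    return Fraction(len(favourable), len(outcomes))

# Three fair coins: 2**3 = 8 equally likely outcomes.
coins = list(product("HT", repeat=3))
p_two_heads = probability(coins, lambda s: s.count("H") == 2)

# Two dice: 6 * 6 = 36 equally likely outcomes.
dice = list(product(range(1, 7), repeat=2))
p_sum_seven = probability(dice, lambda s: sum(s) == 7)

print(p_two_heads, p_sum_seven)
```

Using `Fraction` keeps the results exact, so they can be compared directly with the hand-computed values.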
Permutations
Suppose we are sampling k objects from a total of n distinct objects without replacement.
We are interested in the total number of different arrangements of these objects we can
obtain.
We first pick one object: this can happen in n different ways. Since we are now left with
$n - 1$ objects, the second one can be picked in $(n - 1)$ different ways, and so on.
The total number of permutations of n objects taken k at a time is given by
$$P_{n,k} = n(n-1)\cdots(n-k+1)$$
and $P_{n,n} = n!$
$P_{n,k}$ can alternatively be written as:
$$P_{n,k} = n(n-1)\cdots(n-k+1) = n(n-1)\cdots(n-k+1)\,\frac{(n-k)!}{(n-k)!} = \frac{n!}{(n-k)!}$$
In the case with replacement, we can apply the multiplication rule derived above. In this
case there are n outcomes possible for each of the k selections, so the number of elements in
S is $n^k$.
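Both counts can be checked by brute force for small n and k. The sketch below uses illustrative values n = 5 and k = 3 (not from the slides) and compares explicit enumeration with the formulas.

```python
from itertools import permutations, product
from math import factorial, perm

n, k = 5, 3  # illustrative values, not from the text
objects = range(n)

# Without replacement: ordered arrangements of k out of n distinct objects.
without_replacement = len(list(permutations(objects, k)))
# With replacement: n choices for each of the k selections.
with_replacement = len(list(product(objects, repeat=k)))

print(without_replacement, perm(n, k), factorial(n) // factorial(n - k))
print(with_replacement, n**k)
```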
It turns out that for k = 23 this number is .507, so you should take the bet (if you are not
risk-averse)
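The figure for k = 23 can be reproduced with the permutation formula. This sketch assumes "this number" is the probability that at least two of k people share a birthday, with 365 equally likely birthdays; the function name is illustrative.

```python
from math import perm

def prob_shared_birthday(k: int, days: int = 365) -> float:
    """P(at least two of k people share a birthday), all days equally likely."""
    # Complement: all k birthdays distinct, P_{days,k} / days**k.
    return 1 - perm(days, k) / days**k

print(round(prob_shared_birthday(23), 3))
```

Under the same assumptions the probability for k = 22 is still below one half, which is why 23 is the break-even group size for an even-money bet.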
$$\frac{n!}{n_1!\,n_2!\cdots n_k!}$$
Examples:
A student organization of 1000 people is picking 4 office-bearers and 8 members for its
managing council. The total number of ways of picking these groups is given by $\frac{1000!}{4!\,8!\,988!}$.
105 students have to be organized into 4 tutorial groups, 3 with 25 students each and
one with the remaining 30 students. How many ways can students be assigned to
groups?
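Counts of this size are awkward by hand but easy with exact integer arithmetic. The sketch below implements the multinomial coefficient; the function name `multinomial` is illustrative, and the tutorial example assumes the four groups are distinguishable (for instance, they meet at different times).

```python
from math import factorial

def multinomial(n: int, group_sizes: list[int]) -> int:
    """n! / (n_1! n_2! ... n_k!) for group sizes that sum to n."""
    if sum(group_sizes) != n:
        raise ValueError("group sizes must sum to n")
    result = factorial(n)
    for size in group_sizes:
        result //= factorial(size)  # exact at every step
    return result

# 4 office-bearers, 8 council members, 988 others out of 1000 members.
council = multinomial(1000, [4, 8, 988])
# Three tutorial groups of 25 and one of 30, treating the groups as distinguishable.
tutorials = multinomial(105, [25, 25, 25, 30])
print(council)
print(len(str(tutorials)), "digits")
```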
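$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P(A_1 \cap A_2 \cap \cdots \cap A_n)$$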
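The formula for the probability of a union of n events can be checked numerically on a small example. The sketch below uses three made-up events on one roll of a fair die (an illustrative setup, not from the slides) and compares the direct probability of the union with the alternating sum over intersections.

```python
from fractions import Fraction
from itertools import combinations

# Illustrative setup (not from the text): one roll of a fair die.
sample_space = set(range(1, 7))
events = [{2, 4, 6}, {1, 2, 3, 4}, {3, 6}]

def prob(event):
    return Fraction(len(event), len(sample_space))

# Left-hand side: probability of the union, computed directly.
direct = prob(set().union(*events))

# Right-hand side: alternating sum over intersections of 1, 2, ..., n events.
alternating = Fraction(0)
for r in range(1, len(events) + 1):
    sign = (-1) ** (r + 1)
    for group in combinations(events, r):
        alternating += sign * prob(set.intersection(*group))

print(direct, alternating)
```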
Independent Events
Definition: Let A and B be two events in a sample space S. Then A and B are independent
iff $P(A \cap B) = P(A)P(B)$. If A and B are not independent, A and B are said to be dependent.
Events may be independent because they are physically unrelated: tossing a coin and rolling
a die, two different people falling sick with some non-infectious disease, etc.
This need not be the case, however. It may just be that one event provides no relevant
information on the likelihood of occurrence of the other.
Example:
The event A is getting an even number on a roll of a die.
The event B is getting one of the first four numbers.
The intersection of these two events is the event of rolling the number 2 or 4, which we
know has probability $\frac{1}{3}$.
Are A and B independent? Yes, because $P(A)P(B) = \frac{1}{2} \cdot \frac{2}{3} = \frac{1}{3}$.
This is because the occurrence of A does not affect the likelihood that B will occur, or
vice-versa. Why?
If A and B are independent, then A and $B^c$ are also independent, as are $A^c$ and $B^c$. (We
require $P(A \cap B^c) = P(A)P(B^c)$. But $A = (A \cap B) \cup (A \cap B^c)$, so with A and B independent,
$P(A \cap B^c) = P(A) - P(A)P(B) = P(A)[1 - P(B)] = P(A)P(B^c)$. Starting now with A and
$B^c$, we can use the same argument to show $A^c$ and $B^c$ independent.)
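The die example and the result on complements can both be verified by counting outcomes. This is a minimal sketch; the helper names `prob` and `independent` are illustrative.

```python
from fractions import Fraction

S = set(range(1, 7))   # one roll of a fair die
A = {2, 4, 6}          # even number
B = {1, 2, 3, 4}       # one of the first four numbers

def prob(event):
    return Fraction(len(event), len(S))

def independent(E, F):
    """Check the definition P(E and F) = P(E) P(F)."""
    return prob(E & F) == prob(E) * prob(F)

print(prob(A & B), prob(A) * prob(B))
print(independent(A, B), independent(A, S - B), independent(S - A, S - B))
```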
3. Suppose A and B are disjoint sets in S. Does it tell us anything about the independence of
events A and B?
4. Remember that disjointness is a property of sets, whereas independence is a property of the
associated probability measure: whether events are dependent depends on the probability
measure that is being used.
In this case, $P(A_1 \cap A_2 \cap A_3) = P(3, 6) = \frac{1}{36} = (\frac{1}{2})(\frac{1}{2})(\frac{1}{9}) = P(A_1)P(A_2)P(A_3)$ but
$P(A_1 \cap A_3) = P(3, 6) = \frac{1}{36} \neq P(A_1)P(A_3) = \frac{1}{18}$, so the events are not independent, nor
pairwise independent.
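The point that the product rule can hold for three events while failing for a pair can be checked by enumeration. The definitions of the three events are not shown here, so the sketch below assumes two rolls of a die with hypothetical events (first roll in {1, 2, 3}, first roll in {3, 4, 5}, sum equal to 9), chosen only because they reproduce the probabilities 1/2, 1/2, 1/9 and the outcome (3, 6) quoted in the text.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))  # two rolls of a fair die

# Assumed events, chosen only because they match the probabilities in the text.
A1 = {s for s in S if s[0] in (1, 2, 3)}
A2 = {s for s in S if s[0] in (3, 4, 5)}
A3 = {s for s in S if sum(s) == 9}

def prob(event):
    return Fraction(len(event), len(S))

# The product rule holds for the triple ...
print(prob(A1 & A2 & A3), prob(A1) * prob(A2) * prob(A3))
# ... but fails for the pair A1, A3.
print(prob(A1 & A3), prob(A1) * prob(A3))
```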
Conditional probability
When we conduct an experiment, we are absolutely sure that the event S will occur.
Suppose now we have some additional information about the outcome, say that it is an
element of $B \subset S$.
What effect does this have on the probabilities of events in S? How exactly can we use such
additional information to compute conditional probabilities?
Example: The experiment involves tossing two fair coins in succession. What is the
probability of two tails? Suppose you know the first one is a head? What if it is a tail?
We denote the conditional probability of event A, given B by P(A|B)
B is now the conditional sample space and since B is certain to occur, P(B|B) = 1
Event A will now occur iff $A \cap B$ occurs.
Definition: Let A and B be two events in a sample space S. If $P(B) \neq 0$, then the conditional
probability of event A given event B is given by
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Notice that $P(\cdot|B)$ is now a probability set function (probability measure) defined for
subsets of B.
For independent events A and B, the conditional and unconditional probabilities are equal:
$$P(A|B) = \frac{P(A)P(B)}{P(B)} = P(A)$$
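The two-coin example can be worked through with the definition. This is a minimal sketch that treats the four outcomes as equally likely; the helper name `cond_prob` is illustrative.

```python
from fractions import Fraction
from itertools import product

S = set(product("HT", repeat=2))  # two fair coins tossed in succession

def prob(event):
    return Fraction(len(event), len(S))

def cond_prob(A, B):
    """P(A|B) = P(A and B) / P(B), defined only when P(B) != 0."""
    if not B:
        raise ValueError("P(B) must be non-zero")
    return prob(A & B) / prob(B)

two_tails = {("T", "T")}
first_head = {s for s in S if s[0] == "H"}
first_tail = {s for s in S if s[0] == "T"}

print(prob(two_tails), cond_prob(two_tails, first_head), cond_prob(two_tails, first_tail))
```

Knowing the first toss is a head rules out two tails entirely, while knowing it is a tail doubles the unconditional probability.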
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Multiplying both sides by P(B), we have the multiplication rule for probabilities:
$$P(A \cap B) = P(A|B)P(B)$$
This is especially useful in cases where an experiment can be interpreted as being
conducted in two stages. In such cases, P(A|B) and P(B) can often be very easily assigned.
Examples:
Two cards are drawn successively, without replacement from an ordinary deck of
playing cards. What is the probability of drawing two aces?
Here the event B is that the first card drawn is an ace and the event A is that the
second card is an ace. P(B) is clearly $\frac{4}{52} = \frac{1}{13}$ and $P(A|B) = \frac{3}{51} = \frac{1}{17}$. The required
probability $P(A \cap B)$ is therefore $(\frac{1}{13})(\frac{1}{17}) = \frac{1}{221}$.
There are two types of candidates, competent and incompetent (C and I). The share of
I-type candidates seeking admission is 0.3. All candidates are interviewed by a
committee and the committee rejects incompetent candidates with probability 0.9.
What is the probability that an incompetent candidate is admitted?
Here we're interested in $P(A \cap I)$ where $P(I) = .3$ and $P(A|I) = .1$, so the required
probability is .03.
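The two-stage reasoning in the card example can be cross-checked by counting ordered pairs of cards. In the sketch below the deck is reduced to a hypothetical encoding with 4 aces and 48 other cards, since only ace versus non-ace matters here.

```python
from fractions import Fraction
from itertools import permutations

# A deck reduced to what matters here: 4 aces ("A") and 48 other cards ("x").
deck = ["A"] * 4 + ["x"] * 48

# Two-stage calculation: P(B) for the first ace, P(A|B) for the second.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_second_ace_given_first * p_first_ace

# Cross-check by counting ordered pairs of distinct card positions.
pairs = list(permutations(range(52), 2))
both_aces = sum(1 for i, j in pairs if deck[i] == "A" and deck[j] == "A")

print(p_two_aces, Fraction(both_aces, len(pairs)))
```

The multiplication rule gives the answer in one line, while the direct count has to go through all 52 × 51 ordered pairs.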
If $P(A_i) > 0$ for all i, then using the multiplication rule derived above, this can be written as:
$$P(B) = \sum_{i=1}^{k} P(A_i)P(B|A_i)$$
$$P(Y = 50) = \sum_{x=1}^{50} \frac{1}{51 - x} \cdot \frac{1}{50} = \frac{1}{50}\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{50}\right) = .09$$
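The sum for P(Y = 50) can be evaluated exactly. The setup of this example is not shown here, so the sketch below assumes a two-stage experiment that is consistent with the terms of the sum: X is drawn uniformly from 1, ..., 50 and then Y is drawn uniformly from X, ..., 50. The function names are illustrative.

```python
from fractions import Fraction

N = 50

# Assumed two-stage experiment (the setup is not shown in the text):
# X is uniform on 1..N, then Y is uniform on X..N.
def p_x(x):
    return Fraction(1, N)

def p_y_given_x(y, x):
    return Fraction(1, N - x + 1) if x <= y <= N else Fraction(0)

def p_y(y):
    """Law of total probability: sum over the partition {X = x}."""
    return sum(p_x(x) * p_y_given_x(y, x) for x in range(1, N + 1))

print(round(float(p_y(50)), 3))
```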
Bayes Theorem
Bayes Theorem: (or Bayes Rule) Let the events $A_1, A_2, \ldots, A_k$ form a partition of S such that
$P(A_j) > 0$ for all $j = 1, 2, \ldots, k$, and let B be any event such that $P(B) > 0$. Then for $i = 1, \ldots, k$,
$$P(A_i|B) = \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{k} P(A_j)P(B|A_j)}$$
Proof:
By the definition of conditional probability,
$$P(A_i|B) = \frac{P(A_i \cap B)}{P(B)}$$
The denominators in these expressions are the same by the law of total probability and the
numerators are the same using the multiplication rule.
In the case where the partition of S consists of only two events,
$$P(A|B) = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|A^c)P(A^c)}$$
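The theorem translates directly into a few lines of code. The sketch below is one possible implementation for a finite partition; the function name `posterior` and the machine and defect numbers are hypothetical and serve only as an illustration.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a partition A_1..A_k.

    priors[i] is P(A_i); likelihoods[i] is P(B|A_i).
    Returns the list of P(A_i|B).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    p_b = sum(joint)  # law of total probability
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return [j / p_b for j in joint]

# Hypothetical numbers: three machines produce 50%, 30%, 20% of output
# with defect rates 1%, 2%, 3%; B is the event that an item is defective.
post = posterior([0.5, 0.3, 0.2], [0.01, 0.02, 0.03])
print([round(p, 3) for p in post])
```

The posteriors always sum to one, because the denominator is the sum of the numerators over the partition.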
Bayes Rule...remarks
Bayes rule provides us with a method of updating the probabilities of events in the partition
based on the new information provided by the occurrence of the event B.
Since P(Aj ) is the probability of event Aj prior to the occurrence of event B, it is referred
to as the prior probability of event Aj .
P(Aj |B) is the updated probability of the same event after the occurrence of B and is called
the posterior probability of event Aj .
Bayes rule is very commonly used in game-theoretic models. For example, in political
economy models a Bayes-Nash equilibrium is a standard equilibrium concept: Players (say
voters) start with beliefs about politicians and update these beliefs when politicians take
actions. Beliefs are constrained to be updated based on Bayes conditional probability
formula.
In Bayesian estimation, prior distributions on population parameters are updated given
information contained in a sample. This is in contrast to more standard procedures where
only the sample information is used. The sample would now lead to different estimates,
depending on the prior distribution of the parameter that is used.
A word about Bayes: He was a non-conformist clergyman (1702-1761), with no formal
mathematics degree. He studied logic and theology at the University of Edinburgh.
$$P(\text{Disease}|\text{Positive}) = \frac{P(\text{Positive}|\text{Disease})P(\text{Disease})}{P(\text{Positive})} = \frac{(.98)(.001)}{(.98)(.001) + (.01)(.999)} = .089$$
So in spite of the test being very effective in catching the disease, we have a large number of
false positives.
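Restating the calculation in terms of head counts makes it easier to see where the false positives come from. The sketch below uses the probabilities from the calculation above; the population size of 100,000 is an arbitrary choice made for illustration.

```python
# Probabilities from the calculation above; the population size is an arbitrary choice.
population = 100_000
p_disease = 0.001
p_pos_given_disease = 0.98
p_pos_given_healthy = 0.01

sick = population * p_disease
healthy = population - sick
true_positives = sick * p_pos_given_disease
false_positives = healthy * p_pos_given_healthy

p_disease_given_positive = true_positives / (true_positives + false_positives)
print(round(true_positives), round(false_positives), round(p_disease_given_positive, 3))
```

Because healthy people vastly outnumber sick ones, even a 1% false positive rate produces about ten times as many false positives as true positives.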
The contestant can therefore double his probability of being correct by switching. The
posterior probability of $A_2$ is $\frac{2}{3}$ while that of $A_1$ remains $\frac{1}{3}$.
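The 1/3 versus 2/3 result can be confirmed by exact enumeration. The statement of the problem is not shown here, so the sketch below assumes the standard three-door game: the prize is equally likely to be behind each door, the contestant picks door 1, and the host then opens a different door that does not hide the prize, choosing at random when he has two options.

```python
from fractions import Fraction

doors = (1, 2, 3)
pick = 1  # the contestant's initial choice

win_stay = Fraction(0)
win_switch = Fraction(0)
for car in doors:                       # prize is equally likely behind each door
    p_car = Fraction(1, 3)
    openable = [d for d in doors if d != pick and d != car]
    for opened in openable:             # host picks at random if he has a choice
        p = p_car / len(openable)
        switched = next(d for d in doors if d not in (pick, opened))
        win_stay += p * (pick == car)
        win_switch += p * (switched == car)

print(win_stay, win_switch)
```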
witness drew on published studies to obtain a figure for the frequency of sudden infant death syndrome
(SIDS, or cot death) in families having some of the characteristics of the defendant's family. He went on
to square this figure to obtain a value of 1 in 73 million for the frequency of two cases of SIDS in such a
family. ... This approach is, in general, statistically invalid. It would only be valid if SIDS cases arose
independently within families, ... there are very strong a priori reasons for supposing that the assumption
will be false. There may well be unknown genetic or environmental factors that predispose families to SIDS,
so that a second case within the family becomes much more likely. The true frequency of families with two
cases of SIDS may be very much less incriminating than the figure presented to the jury at trial.