
Discrete Random Variables and Probability Distributions

Random Variables
Random Variable (RV): A numeric outcome that results from an experiment
For each element of an experiment's sample space, the random variable can take on exactly one value
Discrete Random Variable: An RV that can take on only a finite or countably infinite set of outcomes
Continuous Random Variable: An RV that can take on any value along a continuum (but may be reported discretely)
Random variables are denoted by upper case letters (Y)
Individual outcomes for an RV are denoted by lower case letters (y)

Probability Distributions
Probability Distribution: Table, graph, or formula that describes the values a random variable can take on and their corresponding probability (discrete RV) or density (continuous RV)
Discrete Probability Distribution: Assigns probabilities (masses) to the individual outcomes
Continuous Probability Distribution: Assigns density at individual points; the probability of a range is obtained by integrating the density function
Discrete probabilities are denoted by: p(y) = P(Y = y)
Continuous densities are denoted by: f(y)
Cumulative Distribution Function: F(y) = P(Y \le y)

Discrete Probability Distributions


Probability (Mass) Function:

p(y) = P(Y = y)
p(y) \ge 0 \quad \forall y
\sum_{\text{all } y} p(y) = 1

Cumulative Distribution Function (CDF):

F(y) = P(Y \le y)
F(b) = P(Y \le b) = \sum_{y \le b} p(y)
F(-\infty) = 0 \qquad F(\infty) = 1
F(y) is monotonically increasing in y

Example: Rolling 2 Dice (Red/Green)


Y = sum of the up faces of the two dice. The table gives the value of y for all elements in S.

Red\Green   1   2   3   4   5   6
1           2   3   4   5   6   7
2           3   4   5   6   7   8
3           4   5   6   7   8   9
4           5   6   7   8   9  10
5           6   7   8   9  10  11
6           7   8   9  10  11  12

Rolling 2 Dice Probability Mass Function & CDF


 y    p(y)    F(y)
 2    1/36    1/36
 3    2/36    3/36
 4    3/36    6/36
 5    4/36   10/36
 6    5/36   15/36
 7    6/36   21/36
 8    5/36   26/36
 9    4/36   30/36
10    3/36   33/36
11    2/36   35/36
12    1/36   36/36

p(y) = \frac{\# \text{ of ways 2 dice can sum to } y}{\# \text{ of ways 2 dice can result}} = \frac{\# \text{ of ways 2 dice can sum to } y}{36}

F(y) = \sum_{t=2}^{y} p(t)

[Figure: Rolling 2 Dice Probability Mass Function]

[Figure: Rolling 2 Dice Cumulative Distribution Function]
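The table and figures above can be reproduced with a short script. This is an added sketch (not part of the original slides), using only the Python standard library:

```python
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely (red, green) outcomes and tally each sum.
counts = {}
for red, green in product(range(1, 7), repeat=2):
    y = red + green
    counts[y] = counts.get(y, 0) + 1

# p(y) = (# of ways to sum to y)/36;  F(y) accumulates p(2) + ... + p(y).
F = Fraction(0)
for y in sorted(counts):
    p = Fraction(counts[y], 36)
    F += p
    print(f"y={y:2d}  p(y)={str(p):>5}  F(y)={F}")
```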

Expected Values of Discrete RVs


Mean (aka Expected Value): the long-run average value an RV (or function of an RV) will take on
Variance: the average squared deviation between a realization of an RV (or function of an RV) and its mean
Standard Deviation: the positive square root of the variance (in the same units as the data)
Notation:
Mean: E(Y) = \mu
Variance: V(Y) = \sigma^2
Standard Deviation: \sigma

Expected Values of Discrete RVs


Mean:  E(Y) = \mu = \sum_{\text{all } y} y\,p(y)

Mean of a function g(Y):  E[g(Y)] = \sum_{\text{all } y} g(y)\,p(y)

Variance:
V(Y) = \sigma^2 = E[(Y - E(Y))^2] = E[(Y - \mu)^2]
     = \sum_{\text{all } y} (y - \mu)^2 p(y) = \sum_{\text{all } y} (y^2 - 2\mu y + \mu^2)\,p(y)
     = \sum_{\text{all } y} y^2 p(y) - 2\mu \sum_{\text{all } y} y\,p(y) + \mu^2 \sum_{\text{all } y} p(y)
     = E[Y^2] - 2\mu(\mu) + \mu^2(1) = E[Y^2] - \mu^2

Standard Deviation:  \sigma = +\sqrt{\sigma^2}

Expected Values of Linear Functions of Discrete RVs

Linear functions:  g(Y) = aY + b  (a, b constants)

E[aY + b] = \sum_{\text{all } y} (ay + b)\,p(y) = a \sum_{\text{all } y} y\,p(y) + b \sum_{\text{all } y} p(y) = a\mu + b

V[aY + b] = \sum_{\text{all } y} \big((ay + b) - (a\mu + b)\big)^2 p(y) = \sum_{\text{all } y} (ay - a\mu)^2 p(y)
          = a^2 \sum_{\text{all } y} (y - \mu)^2 p(y) = a^2 \sigma^2

\sigma_{aY+b} = |a|\,\sigma

Example: Rolling 2 Dice


 y    p(y)    y p(y)    y^2 p(y)
 2    1/36      2/36       4/36
 3    2/36      6/36      18/36
 4    3/36     12/36      48/36
 5    4/36     20/36     100/36
 6    5/36     30/36     180/36
 7    6/36     42/36     294/36
 8    5/36     40/36     320/36
 9    4/36     36/36     324/36
10    3/36     30/36     300/36
11    2/36     22/36     242/36
12    1/36     12/36     144/36
Sum  36/36    252/36    1974/36
    = 1.00    = 7.00   = 54.8333

E(Y) = \sum_{y=2}^{12} y\,p(y) = 7.0

E[Y^2] = \sum_{y=2}^{12} y^2 p(y) = 54.8333

\sigma^2 = E[Y^2] - \mu^2 = 54.8333 - (7.0)^2 = 5.8333

\sigma = \sqrt{5.8333} = 2.4152
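As an added illustration (not in the original slides), the linear-function rules from the previous slide can be applied to the dice sum with g(Y) = 2Y + 3:

E(2Y + 3) = 2(7.0) + 3 = 17.0
V(2Y + 3) = 2^2(5.8333) = 23.3333
\sigma_{2Y+3} = |2|(2.4152) = 4.8304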

Tchebysheff's Theorem / Empirical Rule

Tchebysheff: Suppose Y is any random variable with mean \mu and standard deviation \sigma. Then:
P(\mu - k\sigma \le Y \le \mu + k\sigma) \ge 1 - (1/k^2) for k \ge 1
k=1: P(\mu - \sigma \le Y \le \mu + \sigma) \ge 1 - (1/1^2) = 0  (trivial result)
k=2: P(\mu - 2\sigma \le Y \le \mu + 2\sigma) \ge 1 - (1/2^2) = 3/4
k=3: P(\mu - 3\sigma \le Y \le \mu + 3\sigma) \ge 1 - (1/3^2) = 8/9

Note that this is a very conservative bound, but it works for any distribution.

Empirical Rule (mound-shaped distributions):
k=1: P(\mu - \sigma \le Y \le \mu + \sigma) \approx 0.68
k=2: P(\mu - 2\sigma \le Y \le \mu + 2\sigma) \approx 0.95
k=3: P(\mu - 3\sigma \le Y \le \mu + 3\sigma) \approx 1
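A quick numeric check of the bound against the two-dice sum (an added sketch, not from the original slides; it uses the pmf tabulated earlier):

```python
# Compare exact interval probabilities for the two-dice sum (mu = 7,
# sigma^2 = 35/6) with Tchebysheff's lower bound 1 - 1/k^2.
mu, sigma = 7.0, (35 / 6) ** 0.5
counts = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1}
for k in (1, 2, 3):
    exact = sum(c for y, c in counts.items()
                if mu - k * sigma <= y <= mu + k * sigma) / 36
    print(f"k={k}: exact P = {exact:.4f}, bound = {1 - 1 / k**2:.4f}")
```

The exact probabilities sit well above the bound, which is the conservatism noted above.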

Proof of Tchebysheff's Theorem

Break the real line into 3 parts:
i) (-\infty, \mu - k\sigma)   ii) [\mu - k\sigma, \mu + k\sigma]   iii) (\mu + k\sigma, \infty)

Making use of the definition of variance:

V(Y) = \sigma^2 = \sum_{\text{all } y} (y - \mu)^2 p(y)
     = \sum_{y < \mu - k\sigma} (y - \mu)^2 p(y) + \sum_{\mu - k\sigma \le y \le \mu + k\sigma} (y - \mu)^2 p(y) + \sum_{y > \mu + k\sigma} (y - \mu)^2 p(y)

In region i):   y < \mu - k\sigma \Rightarrow (y - \mu)^2 > k^2\sigma^2
In region iii): y > \mu + k\sigma \Rightarrow (y - \mu)^2 > k^2\sigma^2

Dropping the (nonnegative) middle sum and applying these bounds:

\sigma^2 \ge k^2\sigma^2 P(Y < \mu - k\sigma) + k^2\sigma^2 P(Y > \mu + k\sigma)
         = k^2\sigma^2 \big[1 - P(\mu - k\sigma \le Y \le \mu + k\sigma)\big]

Dividing both sides by k^2\sigma^2:

\frac{1}{k^2} \ge 1 - P(\mu - k\sigma \le Y \le \mu + k\sigma)
\Rightarrow P(\mu - k\sigma \le Y \le \mu + k\sigma) \ge 1 - \frac{1}{k^2}

Moment Generating Functions (I)


Consider the series expansion of e^x:

e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots

Note that by taking derivatives with respect to x, we get:

\frac{de^x}{dx} = 0 + 1 + \frac{2x}{2!} + \frac{3x^2}{3!} + \cdots = 1 + x + \frac{x^2}{2!} + \cdots = e^x

\frac{d^2 e^x}{dx^2} = 0 + 1 + \frac{2x}{2!} + \cdots = e^x

Now, replacing x with tY, we get:

e^{tY} = \sum_{i=0}^{\infty} \frac{(tY)^i}{i!} = 1 + tY + \frac{(tY)^2}{2} + \frac{(tY)^3}{6} + \cdots = 1 + tY + \frac{t^2 Y^2}{2} + \frac{t^3 Y^3}{6} + \cdots

Moment Generating Functions (II)


Taking derivatives with respect to t and evaluating at t = 0:

\frac{de^{tY}}{dt}\Big|_{t=0} = \Big[0 + Y + \frac{2tY^2}{2!} + \frac{3t^2Y^3}{3!} + \cdots\Big]_{t=0} = \Big[Y + tY^2 + \frac{t^2Y^3}{2!} + \cdots\Big]_{t=0} = Y + 0 + 0 + \cdots = Y

\frac{d^2 e^{tY}}{dt^2}\Big|_{t=0} = \Big[0 + Y^2 + tY^3 + \cdots\Big]_{t=0} = Y^2 + 0 + \cdots = Y^2

Taking the expected value of e^{tY}, and labelling the function M(t):

M(t) = E\big[e^{tY}\big] = \sum_{\text{all } y} e^{ty} p(y) = \sum_{\text{all } y} \sum_{i=0}^{\infty} \frac{(ty)^i}{i!}\,p(y)

M'(t)\big|_{t=0} = E(Y), \quad M''(t)\big|_{t=0} = E[Y^2], \quad \ldots, \quad M^{(k)}(t)\big|_{t=0} = E[Y^k]

M(t) is called the moment-generating function for Y, and can be used to derive any non-central moments of the random variable (assuming it exists in a neighborhood around t = 0). It is also useful in determining the distributions of functions of random variables.
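As an added sketch (not from the original slides), the moments-from-MGF recipe can be checked symbolically for the two-dice sum using sympy:

```python
import sympy as sp

t = sp.symbols('t')
# pmf of the two-dice sum: p(y) = (6 - |y - 7|)/36 for y = 2, ..., 12
pmf = {y: sp.Rational(6 - abs(y - 7), 36) for y in range(2, 13)}

# M(t) = E[e^{tY}], built directly from the pmf
M = sum(sp.exp(t * y) * p for y, p in pmf.items())

EY = sp.diff(M, t, 1).subs(t, 0)    # E(Y)   -> 7
EY2 = sp.diff(M, t, 2).subs(t, 0)   # E(Y^2) -> 329/6
print(EY, EY2, EY2 - EY**2)         # variance -> 35/6 = 5.8333...
```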

Probability Generating Functions


Consider the function t^Y and its derivatives:

\frac{dt^Y}{dt} = Y t^{Y-1}

\frac{d^2 t^Y}{dt^2} = Y(Y-1)\,t^{Y-2}

\frac{d^k t^Y}{dt^k} = Y(Y-1)\cdots(Y-(k-1))\,t^{Y-k}

Let P(t) = E[t^Y]:

P'(t)\big|_{t=1} = E(Y)
P''(t)\big|_{t=1} = E[Y(Y-1)]
P^{(k)}(t)\big|_{t=1} = E[Y(Y-1)\cdots(Y-(k-1))]

P(t) is the probability generating function for Y.

Discrete Uniform Distribution


Suppose Y can take on any integer value between a and b inclusive, each equally likely (e.g. rolling a die, where a = 1 and b = 6). Then Y follows the discrete uniform distribution.

p(y) = \frac{1}{b - (a-1)} \qquad a \le y \le b

F(y) = 0 for y < a
F(y) = \frac{\text{int}(y) - (a-1)}{b - (a-1)} for a \le y \le b, where int(x) = integer portion of x
F(y) = 1 for y > b

E(Y) = \sum_{y=a}^{b} y\,\frac{1}{b-(a-1)} = \frac{1}{b-(a-1)}\Big[\sum_{y=1}^{b} y - \sum_{y=1}^{a-1} y\Big] = \frac{1}{b-(a-1)}\Big[\frac{b(b+1)}{2} - \frac{(a-1)a}{2}\Big] = \frac{b(b+1) - a(a-1)}{2(b-(a-1))}

E[Y^2] = \sum_{y=a}^{b} y^2\,\frac{1}{b-(a-1)} = \frac{1}{b-(a-1)}\Big[\frac{b(b+1)(2b+1)}{6} - \frac{(a-1)a(2a-1)}{6}\Big] = \frac{b(b+1)(2b+1) - a(a-1)(2a-1)}{6(b-(a-1))}

V(Y) = E[Y^2] - [E(Y)]^2 = \frac{b(b+1)(2b+1) - a(a-1)(2a-1)}{6(b-(a-1))} - \Big[\frac{b(b+1) - a(a-1)}{2(b-(a-1))}\Big]^2

Note: when a = 1 and b = n:

E(Y) = \frac{n+1}{2} \qquad V(Y) = \frac{(n+1)(n-1)}{12} = \frac{n^2 - 1}{12}
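A brief numeric check of the closed forms for a fair die (an added sketch, not from the original slides):

```python
# Discrete uniform on a..b: compare brute-force mean/variance with the
# closed forms above, for a die (a = 1, b = n = 6).
a, b = 1, 6
m = b - (a - 1)                        # number of support points
ys = range(a, b + 1)
mean = sum(ys) / m
var = sum(y * y for y in ys) / m - mean ** 2
print(mean, var)                       # 3.5  2.9166...
print((a + b) / 2, (6**2 - 1) / 12)    # (n+1)/2 and (n^2-1)/12, same values
```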

Bernoulli Distribution
An experiment consists of one trial. It can result in one of
2 outcomes: Success or Failure (or a characteristic being
Present or Absent).
Probability of Success is p (0<p<1)
Y = 1 if Success (Characteristic Present), 0 if not

p(y) = p if y = 1;  p(y) = 1 - p if y = 0

E(Y) = \sum_{y=0}^{1} y\,p(y) = 0(1-p) + 1(p) = p

E[Y^2] = 0^2(1-p) + 1^2(p) = p

V(Y) = E[Y^2] - [E(Y)]^2 = p - p^2 = p(1-p)

\sigma = \sqrt{p(1-p)}

Binomial Experiment
Experiment consists of a series of n identical trials
Each trial can end in one of 2 outcomes: Success or
Failure
Trials are independent (outcome of one has no
bearing on outcomes of others)
Probability of Success, p, is constant for all trials
The random variable Y, the number of Successes in the n trials, is said to follow the Binomial Distribution with parameters n and p
Y can take on the values y = 0, 1, ..., n
Notation: Y ~ Bin(n, p)

Binomial Distribution
Consider the outcomes of an experiment with 3 trials:

SSS: y = 3, P(SSS) = P(Y = 3) = p(3) = p^3
SSF, SFS, FSS: y = 2, P(SSF \cup SFS \cup FSS) = P(Y = 2) = p(2) = 3p^2(1-p)
SFF, FSF, FFS: y = 1, P(SFF \cup FSF \cup FFS) = P(Y = 1) = p(1) = 3p(1-p)^2
FFF: y = 0, P(FFF) = P(Y = 0) = p(0) = (1-p)^3

In general:
1) # of ways of arranging y S's (and (n-y) F's) in a sequence of n positions: \binom{n}{y} = \frac{n!}{y!(n-y)!}
2) Probability of each arrangement of y S's (and (n-y) F's): p^y (1-p)^{n-y}
3) P(Y = y) = p(y) = \binom{n}{y} p^y (1-p)^{n-y}, \quad y = 0, 1, \ldots, n

EXCEL functions:
p(y) is obtained by the function BINOMDIST(y, n, p, 0)
F(y) is obtained by the function BINOMDIST(y, n, p, 1)

Binomial expansion: (a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i}

\Rightarrow \sum_{y=0}^{n} p(y) = \sum_{y=0}^{n} \binom{n}{y} p^y (1-p)^{n-y} = (p + (1-p))^n = 1^n = 1 \quad "legitimate" probability distribution
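For readers working outside EXCEL, the same quantities are available in scipy (an added sketch, not from the original slides):

```python
from scipy.stats import binom

n, p = 10, 0.3
# Equivalents of BINOMDIST(y, n, p, 0) and BINOMDIST(y, n, p, 1):
print(binom.pmf(4, n, p))    # p(4) = C(10,4) 0.3^4 0.7^6 ~ 0.2001
print(binom.cdf(4, n, p))    # F(4) = P(Y <= 4) ~ 0.8497
print(sum(binom.pmf(y, n, p) for y in range(n + 1)))   # ~ 1.0, as shown above
```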

Binomial Distribution Expected Value


p(y) = \frac{n!}{y!(n-y)!} p^y q^{n-y}, \quad y = 0, 1, \ldots, n, \quad q = 1-p

E(Y) = \sum_{y=0}^{n} y\,\frac{n!}{y!(n-y)!} p^y q^{n-y} = \sum_{y=1}^{n} y\,\frac{n!}{y!(n-y)!} p^y q^{n-y} \quad (\text{summand} = 0 \text{ when } y = 0)

E(Y) = \sum_{y=1}^{n} \frac{y\,n!}{y(y-1)!(n-y)!} p^y q^{n-y} = \sum_{y=1}^{n} \frac{n!}{(y-1)!(n-y)!} p^y q^{n-y}

Let y^* = y - 1 \Rightarrow y = y^* + 1. Note: y = 1, \ldots, n \Rightarrow y^* = 0, \ldots, n-1

E(Y) = \sum_{y^*=0}^{n-1} \frac{n(n-1)!}{y^*!\,(n-(y^*+1))!} p^{y^*+1} q^{n-(y^*+1)} = np \sum_{y^*=0}^{n-1} \frac{(n-1)!}{y^*!\,((n-1)-y^*)!} p^{y^*} q^{(n-1)-y^*}

= np\,(p + q)^{n-1} = np\,(p + (1-p))^{n-1} = np\,(1)^{n-1} = np

Binomial Distribution Variance and S.D.


p(y) = \frac{n!}{y!(n-y)!} p^y q^{n-y}, \quad y = 0, 1, \ldots, n, \quad q = 1-p

Note: E[Y^2] is difficult (impossible?) to get directly, but E[Y(Y-1)] = E[Y^2] - E(Y) is not:

E[Y(Y-1)] = \sum_{y=0}^{n} y(y-1)\,\frac{n!}{y!(n-y)!} p^y q^{n-y} = \sum_{y=2}^{n} y(y-1)\,\frac{n!}{y!(n-y)!} p^y q^{n-y} \quad (\text{summand} = 0 \text{ when } y = 0, 1)

E[Y(Y-1)] = \sum_{y=2}^{n} \frac{n!}{(y-2)!(n-y)!} p^y q^{n-y}

Let y^{**} = y - 2 \Rightarrow y = y^{**} + 2. Note: y = 2, \ldots, n \Rightarrow y^{**} = 0, \ldots, n-2

E[Y(Y-1)] = \sum_{y^{**}=0}^{n-2} \frac{n(n-1)(n-2)!}{y^{**}!\,(n-(y^{**}+2))!} p^{y^{**}+2} q^{n-(y^{**}+2)} = n(n-1)p^2 \sum_{y^{**}=0}^{n-2} \frac{(n-2)!}{y^{**}!\,((n-2)-y^{**})!} p^{y^{**}} q^{(n-2)-y^{**}}

= n(n-1)p^2 (p+q)^{n-2} = n(n-1)p^2

E[Y^2] = E[Y(Y-1)] + E(Y) = n(n-1)p^2 + np = np[(n-1)p + 1] = n^2p^2 - np^2 + np = n^2p^2 + np(1-p)

V(Y) = E[Y^2] - [E(Y)]^2 = n^2p^2 + np(1-p) - (np)^2 = np(1-p)

\sigma = \sqrt{np(1-p)}

Binomial Distribution MGF & PGF


M(t) = E\big[e^{tY}\big] = \sum_{y=0}^{n} e^{ty} \binom{n}{y} p^y (1-p)^{n-y} = \sum_{y=0}^{n} \binom{n}{y} (pe^t)^y (1-p)^{n-y} = \big(pe^t + (1-p)\big)^n

M'(t) = n\big(pe^t + (1-p)\big)^{n-1} pe^t

E(Y) = M'(0) = n\big(p(1) + (1-p)\big)^{n-1} p(1) = np

M''(t) = n(n-1)\big(pe^t + (1-p)\big)^{n-2} (pe^t)^2 + n\big(pe^t + (1-p)\big)^{n-1} pe^t

E[Y^2] = M''(0) = n(n-1)p^2 + np = np[(n-1)p + 1] = n^2p^2 + np(1-p)

V(Y) = E[Y^2] - [E(Y)]^2 = n^2p^2 + np(1-p) - (np)^2 = np(1-p)

\sigma = \sqrt{np(1-p)}

P(t) = E\big[t^Y\big] = \sum_{y=0}^{n} \binom{n}{y} (pt)^y (1-p)^{n-y} = \big(pt + (1-p)\big)^n

Geometric Distribution
Used to model the number of Bernoulli trials needed until
the first Success occurs (P(S)=p)
First Success on Trial 1: S, y = 1, p(1) = p
First Success on Trial 2: FS, y = 2, p(2) = (1-p)p
First Success on Trial k: F...FS, y = k, p(k) = (1-p)^{k-1} p

p(y) = (1-p)^{y-1} p, \quad y = 1, 2, \ldots

\sum_{y=1}^{\infty} p(y) = \sum_{y=1}^{\infty} (1-p)^{y-1} p = p \sum_{y=1}^{\infty} (1-p)^{y-1}

Setting y^* = y - 1 and noting that y = 1, 2, \ldots \Rightarrow y^* = 0, 1, \ldots:

\sum_{y=1}^{\infty} p(y) = p \sum_{y^*=0}^{\infty} (1-p)^{y^*} = p\,\frac{1}{1 - (1-p)} = \frac{p}{p} = 1
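scipy's geometric distribution uses the same trials-until-first-success convention as here, so it can be used to check these formulas directly (an added sketch, not from the original slides):

```python
from scipy.stats import geom

p = 0.25
print(geom.pmf(3, p))              # (1-p)^2 p = 0.140625
print(geom.cdf(3, p))              # 1 - (1-p)^3 = 0.578125
print(geom.mean(p), geom.var(p))   # 1/p = 4.0 and (1-p)/p^2 = 12.0
```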

Geometric Distribution - Expectations

With q = 1 - p:

E(Y) = \sum_{y=1}^{\infty} y\,q^{y-1} p = p \sum_{y=1}^{\infty} \frac{dq^y}{dq} = p\,\frac{d}{dq}\Big[\sum_{y=1}^{\infty} q^y\Big] = p\,\frac{d}{dq}\Big[\frac{q}{1-q}\Big]

= p\,\frac{(1-q)(1) - q(-1)}{(1-q)^2} = p\,\frac{1}{(1-q)^2} = \frac{p}{p^2} = \frac{1}{p}

E[Y(Y-1)] = \sum_{y=1}^{\infty} y(y-1)\,q^{y-1} p = pq \sum_{y=1}^{\infty} \frac{d^2 q^y}{dq^2} = pq\,\frac{d^2}{dq^2}\Big[\sum_{y=1}^{\infty} q^y\Big] = pq\,\frac{d^2}{dq^2}\Big[\frac{q}{1-q}\Big]

= pq\,\frac{d}{dq}\Big[\frac{1}{(1-q)^2}\Big] = pq\,\frac{2}{(1-q)^3} = \frac{2pq}{p^3} = \frac{2q}{p^2}

E[Y^2] = E[Y(Y-1)] + E(Y) = \frac{2q}{p^2} + \frac{1}{p} = \frac{2(1-p) + p}{p^2} = \frac{2-p}{p^2}

V(Y) = E[Y^2] - [E(Y)]^2 = \frac{2-p}{p^2} - \frac{1}{p^2} = \frac{1-p}{p^2} = \frac{q}{p^2}

Geometric Distribution MGF & PGF

With q = 1 - p:

M(t) = E\big[e^{tY}\big] = \sum_{y=1}^{\infty} e^{ty} q^{y-1} p = \frac{p}{q} \sum_{y=1}^{\infty} (qe^t)^y = \frac{p}{q}\cdot\frac{qe^t}{1 - qe^t} = \frac{pe^t}{1 - qe^t} = \frac{pe^t}{1 - (1-p)e^t}

P(t) = E\big[t^Y\big] = \sum_{y=1}^{\infty} t^y q^{y-1} p = \frac{p}{q} \sum_{y=1}^{\infty} (tq)^y = \frac{p}{q}\cdot\frac{tq}{1 - tq} = \frac{pt}{1 - tq} = \frac{pt}{1 - (1-p)t}

Negative Binomial Distribution


Used to model the number of trials needed until the rth
Success (extension of Geometric distribution)
Based on there being r-1 Successes in first y-1 trials,
followed by a Success

p(y) = \binom{y-1}{r-1} p^r (1-p)^{y-r}, \quad y = r, r+1, \ldots

E(Y) = \frac{r}{p} \quad (proof given in Chapter 5)

V(Y) = \frac{r(1-p)}{p^2} \quad (proof given in Chapter 5)
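scipy's nbinom counts the number of failures before the r-th Success rather than the total number of trials, so a shift by r links the two conventions (an added sketch, not from the original slides):

```python
from scipy.stats import nbinom

r, p = 3, 0.4
y = 7                            # total trials until the 3rd success
print(nbinom.pmf(y - r, r, p))   # P(Y = 7) = C(6,2) p^3 (1-p)^4 ~ 0.1244
print(nbinom.mean(r, p) + r)     # E(Y) = r/p = 7.5
print(nbinom.var(r, p))          # V(Y) = r(1-p)/p^2 = 11.25 (shifting leaves variance unchanged)
```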

Poisson Distribution
Distribution often used to model the number of
incidences of some characteristic in time or space:
Arrivals of customers in a queue
Numbers of flaws in a roll of fabric
Number of typos per page of text.

Distribution obtained as follows:

Break down the area into many small pieces (n pieces)
Each piece can have only 0 or 1 occurrences (p = P(1))
Let \lambda = np \equiv average number of occurrences over the area
Y \equiv # of occurrences in the area, the sum of the 0s and 1s over the n pieces
Y ~ Bin(n, p) with p = \lambda/n
Take the limit of the Binomial Distribution as n \to \infty with p = \lambda/n (a numeric check of this limit follows below)
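Before the formal derivation, a quick numeric look at the limit (an added sketch, not from the original slides):

```python
from math import comb, exp, factorial

lam, y = 2.0, 3
target = exp(-lam) * lam**y / factorial(y)   # Poisson value, ~ 0.1804
for n in (10, 100, 1000, 10000):
    p = lam / n
    print(n, comb(n, y) * p**y * (1 - p)**(n - y))   # Bin(n, lam/n) pmf at y
print("Poisson limit:", target)
```

The binomial probabilities approach the Poisson value as n grows, which is exactly what the derivation below shows.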

Poisson Distribution - Derivation


p(y) = \frac{n!}{y!(n-y)!} p^y (1-p)^{n-y} = \frac{n!}{y!(n-y)!} \Big(\frac{\lambda}{n}\Big)^y \Big(1 - \frac{\lambda}{n}\Big)^{n-y}

Taking the limit as n \to \infty:

\lim_{n\to\infty} p(y) = \lim_{n\to\infty} \frac{n!}{y!(n-y)!} \Big(\frac{\lambda}{n}\Big)^y \Big(1 - \frac{\lambda}{n}\Big)^{n-y}

= \frac{\lambda^y}{y!} \lim_{n\to\infty} \frac{n(n-1)\cdots(n-y+1)}{n^y} \Big(1 - \frac{\lambda}{n}\Big)^{n-y}

= \frac{\lambda^y}{y!} \lim_{n\to\infty} \frac{n}{n}\cdot\frac{n-1}{n}\cdots\frac{n-y+1}{n}\,\Big(1 - \frac{\lambda}{n}\Big)^{n} \Big(1 - \frac{\lambda}{n}\Big)^{-y}

Note: \lim_{n\to\infty} \frac{n}{n}\cdot\frac{n-1}{n}\cdots\frac{n-y+1}{n} = 1 and \lim_{n\to\infty} \Big(1 - \frac{\lambda}{n}\Big)^{-y} = 1 for all fixed y

\Rightarrow \lim_{n\to\infty} p(y) = \frac{\lambda^y}{y!} \lim_{n\to\infty} \Big(1 - \frac{\lambda}{n}\Big)^n

From calculus, we get: \lim_{n\to\infty} \Big(1 + \frac{a}{n}\Big)^n = e^a

\Rightarrow \lim_{n\to\infty} p(y) = \frac{\lambda^y e^{-\lambda}}{y!}, \quad y = 0, 1, 2, \ldots

Series expansion of the exponential function: e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}

\Rightarrow \sum_{y=0}^{\infty} p(y) = \sum_{y=0}^{\infty} \frac{e^{-\lambda}\lambda^y}{y!} = e^{-\lambda} e^{\lambda} = 1 \quad "legitimate" probability distribution

EXCEL functions:
p(y): POISSON(y, \lambda, 0)
F(y): POISSON(y, \lambda, 1)
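scipy equivalents of the EXCEL calls (an added sketch, not from the original slides):

```python
from scipy.stats import poisson

lam = 2.0
# Equivalents of POISSON(y, lambda, 0) and POISSON(y, lambda, 1):
print(poisson.pmf(3, lam))   # e^-2 2^3 / 3! ~ 0.1804
print(poisson.cdf(3, lam))   # P(Y <= 3) ~ 0.8571
```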

Poisson Distribution - Expectations


p(y) = \frac{e^{-\lambda}\lambda^y}{y!}, \quad y = 0, 1, 2, \ldots

E(Y) = \sum_{y=0}^{\infty} y\,\frac{e^{-\lambda}\lambda^y}{y!} = \sum_{y=1}^{\infty} \frac{e^{-\lambda}\lambda^y}{(y-1)!} = \lambda \sum_{y=1}^{\infty} \frac{e^{-\lambda}\lambda^{y-1}}{(y-1)!} = \lambda(1) = \lambda

E[Y(Y-1)] = \sum_{y=0}^{\infty} y(y-1)\,\frac{e^{-\lambda}\lambda^y}{y!} = \sum_{y=2}^{\infty} \frac{e^{-\lambda}\lambda^y}{(y-2)!} = \lambda^2 \sum_{y=2}^{\infty} \frac{e^{-\lambda}\lambda^{y-2}}{(y-2)!} = \lambda^2

E[Y^2] = E[Y(Y-1)] + E(Y) = \lambda^2 + \lambda

V(Y) = E[Y^2] - [E(Y)]^2 = \lambda^2 + \lambda - [\lambda]^2 = \lambda

Poisson Distribution MGF & PGF

M(t) = E\big[e^{tY}\big] = \sum_{y=0}^{\infty} e^{ty}\,\frac{e^{-\lambda}\lambda^y}{y!} = e^{-\lambda} \sum_{y=0}^{\infty} \frac{(\lambda e^t)^y}{y!} = e^{-\lambda} e^{\lambda e^t} = e^{\lambda(e^t - 1)}

P(t) = E\big[t^Y\big] = \sum_{y=0}^{\infty} t^y\,\frac{e^{-\lambda}\lambda^y}{y!} = e^{-\lambda} \sum_{y=0}^{\infty} \frac{(\lambda t)^y}{y!} = e^{-\lambda} e^{\lambda t} = e^{\lambda(t - 1)}

Hypergeometric Distribution
Finite population generalization of Binomial Distribution
Population:
N elements
k Successes (elements with the characteristic of interest)

Sample:
n elements
Y = # of Successes in the sample (y = 0, 1, ..., min(n, k))

p(y) = \frac{\binom{k}{y}\binom{N-k}{n-y}}{\binom{N}{n}}, \quad y = 0, 1, \ldots, \min(n, k)

E(Y) = n\,\frac{k}{N} \quad (proof in Chapter 5)

V(Y) = n\,\frac{k}{N}\,\frac{N-k}{N}\,\frac{N-n}{N-1} \quad (proof in Chapter 5)
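A scipy check of these formulas; note that scipy's hypergeom takes its arguments in the order (population size, # of population Successes, sample size), which maps to (N, k, n) in this slide's notation (an added sketch, not from the original slides):

```python
from scipy.stats import hypergeom

N, k, n = 20, 8, 5        # population size, successes in population, sample size
rv = hypergeom(N, k, n)   # scipy argument order matches (N, k, n) here
print(rv.pmf(2))          # C(8,2) C(12,3) / C(20,5) ~ 0.3973
print(rv.mean())          # n k / N = 2.0
print(rv.var())           # n (k/N)((N-k)/N)((N-n)/(N-1)) ~ 0.9474
```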
