
Statistical Inference for

Actuarial Science

Lectures 1 and 2
Dr. Md Mushfiqur Rahman
East West University

Lecture Goals

Explain how decisions are often based on incomplete information

Explain key definitions:
  Population vs. Sample
  Parameter vs. Statistic
  Descriptive vs. Inferential Statistics

Describe random sampling

Explain the difference between descriptive and inferential statistics

Dealing with Uncertainty

Everyday decisions are based on incomplete information:
  The price of IBM stock will be higher in six months than it is now.
  The bank interest rates will remain unchanged for the rest of the year.

Because of uncertainty, such statements should be modified:
  The price of IBM stock is likely to be higher in six months than it is now.
  It is probable that the bank interest rates will remain unchanged for the rest of the year.

Key Definitions

A population is the collection of all items of interest or


under investigation

N represents the population size

A sample is an observed subset of the population

n represents the sample size

A parameter is a specific characteristic of a population

A statistic is a specific characteristic of a sample

Population vs. Sample

[Diagram: a large population of items, with a small subset of those items selected as the sample]

Values calculated using population data are called parameters

Values computed from sample data are called statistics

Examples of Populations

All registered voters in Bangladesh

Incomes of all families living in Dhaka city

Annual returns of all stocks traded on the Dhaka Stock Exchange

Grade point averages of all the students in this university

Random Sampling

Simple random sampling is a procedure in which
  each member of the population is chosen strictly by chance,
  each member of the population is equally likely to be chosen, and
  every possible sample of n objects is equally likely to be chosen

The resulting sample is called a random sample

Descriptive and Inferential Statistics

Two branches of statistics:

Descriptive statistics
  Collecting, summarizing, and processing data to transform data into information

Inferential statistics
  Providing the basis for predictions, forecasts, and estimates that are used to transform information into knowledge

Descriptive Statistics

Collect data
  e.g., survey

Present data
  e.g., tables and graphs

Summarize data
  e.g., sample mean = (Σ xᵢ) / n
Inferential Statistics

Estimation
  e.g., estimate the population mean income using the sample mean income

Hypothesis testing
  e.g., test the claim that the population mean family income of Uttara residents is 50,000.00 taka

Inference is the process of drawing conclusions or making decisions about a population based on sample results

The Decision Making Process

Begin here: identify the problem
  Data
  → Information (descriptive statistics, use of computers)
  → Knowledge (experience, theory, literature, inferential statistics)
  → Decision

Types of Data

Data are either categorical or numerical; numerical data are either discrete or continuous.

Categorical (defined categories or groups)
  Examples: marital status, "Are you registered to vote?", eye color

Numerical, discrete (counted items)
  Examples: number of children, defects per hour

Numerical, continuous (measured characteristics)
  Examples: weight, voltage

Measurement Levels

Quantitative data:
  Ratio data: differences between measurements, and a true zero exists
  Interval data: differences between measurements, but no true zero

Qualitative data:
  Ordinal data: ordered categories (rankings, order, or scaling)
  Nominal data: categories with no ordering or direction

Graphical Presentation of Data

Data in raw form are usually not easy to use for decision making

Some type of organization is needed:
  Table
  Graph

The type of graph to use depends on the variable being summarized

Graphical Presentation of Data
(continued)

Categorical variables:
  Frequency distribution
  Bar chart
  Pie chart
  Pareto diagram

Numerical variables:
  Line chart
  Frequency distribution
  Histogram and ogive
  Stem-and-leaf display
  Scatter plot

Introduction to Probability Distributions

Random variable: represents a possible numerical value from a random experiment

Random variables are either discrete or continuous

Discrete Random Variables

Can only take on a countable number of values

Examples:
  Roll a die twice. Let X be the number of times 4 comes up
  (then X could be 0, 1, or 2 times)

  Toss a coin 5 times. Let X be the number of heads
  (then X = 0, 1, 2, 3, 4, or 5)

Discrete Probability Distribution

Experiment: Toss 2 coins. Let X = # heads.

Show P(x), i.e., P(X = x), for all values of x.

The 4 possible outcomes are TT, TH, HT, HH, so the probability distribution is:

  x value    Probability
  0          1/4 = .25
  1          2/4 = .50
  2          1/4 = .25
Probability Distribution Required Properties

P(x) ≥ 0 for any value of x

The individual probabilities sum to 1:

  Σₓ P(x) = 1

(the notation indicates summation over all possible x values)

Cumulative Probability Function

The cumulative probability function, denoted F(x0), shows the probability that X is less than or equal to x0:

  F(x0) = P(X ≤ x0)

In other words,

  F(x0) = Σ P(x), summed over all values x ≤ x0

Expected Value

Expected value (or mean) of a discrete distribution (weighted average):

  E(X) = Σₓ x P(x)

Example: Toss 2 coins, x = # of heads; compute the expected value of x:

  x      0     1     2
  P(x)  .25   .50   .25

  E(x) = (0 × .25) + (1 × .50) + (2 × .25) = 1.0

Variance and Standard Deviation

Variance of a discrete random variable X:

  σ² = E[(X − μ)²] = Σₓ (x − μ)² P(x)

Standard deviation of a discrete random variable X:

  σ = √σ² = √( Σₓ (x − μ)² P(x) )

Standard Deviation Example

Example: Toss 2 coins, X = # heads; compute the standard deviation (recall E(X) = 1)

  σ = √( Σₓ (x − μ)² P(x) )
    = √( (0 − 1)²(.25) + (1 − 1)²(.50) + (2 − 1)²(.25) )
    = √.50 = .707

(Possible number of heads = 0, 1, or 2)
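The calculation above can be reproduced in a few lines of code. Below is a minimal Python sketch (not part of the slides) that computes E(X), the variance, and the standard deviation for the two-coin distribution; the dictionary pmf simply encodes the probability table shown earlier.

```python
# Mean, variance, and standard deviation of a discrete distribution
# (two-coin example: x = number of heads).
from math import sqrt

pmf = {0: 0.25, 1: 0.50, 2: 0.25}          # P(x) for each value x

mean = sum(x * p for x, p in pmf.items())                    # E(X) = sum of x*P(x)
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # sum of (x - mu)^2 * P(x)
std_dev = sqrt(variance)

print(mean, variance, std_dev)   # 1.0  0.5  0.707...
```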

Functions of Random Variables

If P(x) is the probability function of a discrete random variable X, and g(X) is some function of X, then the expected value of the function g is

  E[g(X)] = Σₓ g(x) P(x)

Functions of Random Variables

Let a and b be any constants.

a)  E(a) = a  and  Var(a) = 0
    i.e., if a random variable always takes the value 5, it will have mean 5 and variance 0

b)  E(bX) = b μX  and  Var(bX) = b² σX²
    i.e., the expected value of bX is b·E(X)

Linear Functions of Random Variables

Let random variable X have mean μX and variance σX²
Let a and b be any constants, and let Y = a + bX

Then the mean and variance of Y are

  μY = E(a + bX) = a + b μX

  σY² = Var(a + bX) = b² σX²

so that the standard deviation of Y is

  σY = |b| σX
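As an illustration of these rules, the short Python sketch below (my own example, not from the slides) applies a linear transformation Y = a + bX to the two-coin distribution and checks that the mean and standard deviation behave as stated; the constants a = 3 and b = -2 are chosen only for demonstration.

```python
from math import sqrt

pmf = {0: 0.25, 1: 0.50, 2: 0.25}      # two-coin distribution of X
a, b = 3.0, -2.0                       # arbitrary constants for illustration

mu_x = sum(x * p for x, p in pmf.items())
var_x = sum((x - mu_x) ** 2 * p for x, p in pmf.items())

# Y = a + bX takes the transformed values with the same probabilities
mu_y = sum((a + b * x) * p for x, p in pmf.items())
var_y = sum((a + b * x - mu_y) ** 2 * p for x, p in pmf.items())

print(mu_y, a + b * mu_x)                  # both 1.0:   mu_Y = a + b*mu_X
print(sqrt(var_y), abs(b) * sqrt(var_x))   # both 1.414: sigma_Y = |b|*sigma_X
```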

Probability Distributions

Some probability distributions covered here:

Discrete probability distributions:
  Binomial
  Geometric
  Poisson

Continuous probability distributions:
  Uniform
  Normal
  Exponential

Bernoulli Distribution

Consider only two outcomes: success or failure
Let P denote the probability of success
Let 1 − P be the probability of failure
Define random variable X: x = 1 if success, x = 0 if failure
  Ex: Was the newborn child a girl?

Then the Bernoulli probability function is

  P(0) = (1 − P)  and  P(1) = P

Bernoulli Distribution

The mean is μ = P:

  μ = E(X) = Σₓ x P(x) = (0)(1 − P) + (1)P = P

The variance is σ² = P(1 − P):

  σ² = E[(X − μ)²] = Σₓ (x − μ)² P(x)
     = (0 − P)²(1 − P) + (1 − P)² P = P(1 − P)

Binomial Distribution Formula

  P(x) = [ n! / ( x! (n − x)! ) ] P^x (1 − P)^(n − x)

where
  P(x) = probability of x successes in n trials, with probability of success P on each trial
  x = number of successes in sample (x = 0, 1, 2, ..., n)
  n = sample size (number of trials or observations)
  P = probability of success

Example: Flip a coin four times, let x = # heads:
  n = 4
  P = 0.5
  1 − P = (1 − 0.5) = 0.5
  x = 0, 1, 2, 3, 4

Example: Calculating a Binomial Probability

Ex: What is the probability of one success in five observations if the probability of success is 0.1?
  x = 1, n = 5, and P = 0.1

  P(x = 1) = [ n! / ( x! (n − x)! ) ] P^x (1 − P)^(n − x)
           = [ 5! / ( 1! (5 − 1)! ) ] (0.1)^1 (1 − 0.1)^(5 − 1)
           = (5)(0.1)(0.9)^4
           = .32805
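A quick way to check such a calculation is to code the binomial formula directly. The Python sketch below is illustrative only and uses the standard-library math.comb for the binomial coefficient; it reproduces P(x = 1) for n = 5 and P = 0.1.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(x successes in n trials) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(binom_pmf(1, 5, 0.1))   # 0.32805
```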

Binomial Distribution Mean and Variance

Mean:
  μ = E(x) = nP

Variance and standard deviation:
  σ² = nP(1 − P)
  σ = √( nP(1 − P) )

where n = sample size
      P = probability of success
      (1 − P) = probability of failure

Binomial Characteristics Examples

n = 5, P = 0.1:
  μ = nP = (5)(0.1) = 0.5
  σ = √( nP(1 − P) ) = √( (5)(0.1)(1 − 0.1) ) = 0.6708

n = 5, P = 0.5:
  μ = nP = (5)(0.5) = 2.5
  σ = √( nP(1 − P) ) = √( (5)(0.5)(1 − 0.5) ) = 1.118

[Bar charts of P(x) for the two cases]

The Poisson Distribution

Apply the Poisson distribution when:
  We wish to count the number of times an event occurs in a given continuous interval
  The probability that an event occurs in a very small subinterval is very small
  The numbers of events that occur in nonoverlapping intervals are independent
  The average number of events per unit time is λ (lambda)

Poisson Distribution Formula

  P(x) = ( e^(−λ) λ^x ) / x!

where:
  x = number of successes per unit
  λ = expected number of successes per unit
  e = base of the natural logarithm system (2.71828...)

Poisson Distribution Characteristics

Mean:
  μ = E(x) = λ

Variance and standard deviation:
  σ² = E[(X − μ)²] = λ
  σ = √λ

where λ = expected number of successes per unit

Using Poisson Tables

Example: Find P(X = 2) if λ = .50

  P(X = 2) = ( e^(−λ) λ^x ) / x! = ( e^(−0.50) (0.50)² ) / 2! = .0758
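The same table value can be verified by coding the Poisson formula directly; the Python sketch below is illustrative only and needs nothing beyond the standard math module.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

print(round(poisson_pmf(2, 0.50), 4))   # 0.0758
```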

Joint Probability Functions

A joint probability function is used to express the probability that X takes the specific value x and simultaneously Y takes the value y, as a function of x and y:

  P(x, y) = P(X = x ∩ Y = y)

The marginal probabilities are

  P(x) = Σᵧ P(x, y)

  P(y) = Σₓ P(x, y)

Conditional Probability Functions

The conditional probability function of the random variable Y expresses the probability that Y takes the value y when the value x is specified for X:

  P(y | x) = P(x, y) / P(x)

Similarly, the conditional probability function of X, given Y = y, is:

  P(x | y) = P(x, y) / P(y)

Independence

The jointly distributed random variables X and Y are said to be independent if and only if their joint probability function is the product of their marginal probability functions:

  P(x, y) = P(x) P(y)   for all possible pairs of values x and y

A set of k random variables are independent if and only if

  P(x1, x2, ..., xk) = P(x1) P(x2) ··· P(xk)

Covariance

Let X and Y be discrete random variables with means μX and μY

The expected value of (X − μX)(Y − μY) is called the covariance between X and Y

For discrete random variables:

  Cov(X, Y) = E[(X − μX)(Y − μY)] = Σₓ Σᵧ (x − μX)(y − μY) P(x, y)

An equivalent expression is

  Cov(X, Y) = E(XY) − μX μY = Σₓ Σᵧ x y P(x, y) − μX μY
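To make the double-sum formula concrete, the Python sketch below computes Cov(X, Y) both ways for a small joint probability function; the 2×2 joint pmf is made up purely for illustration and does not come from the slides.

```python
# Covariance from a joint probability function P(x, y).
# The joint pmf below is hypothetical, used only to illustrate the formulas.
joint = {(0, 0): 0.30, (0, 1): 0.20,
         (1, 0): 0.10, (1, 1): 0.40}

mu_x = sum(x * p for (x, y), p in joint.items())   # marginal mean of X
mu_y = sum(y * p for (x, y), p in joint.items())   # marginal mean of Y

# Definition: sum of (x - mu_X)(y - mu_Y) P(x, y)
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Equivalent form: E(XY) - mu_X * mu_Y
cov_alt = sum(x * y * p for (x, y), p in joint.items()) - mu_x * mu_y

print(cov, cov_alt)   # the two expressions agree (0.1 here)
```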

Covariance and Independence


The covariance measures the strength of the
linear relationship between two variables
If two random variables are statistically
independent, the covariance between them
is 0
The converse is not necessarily true

Correlation

The correlation between X and Y is:

  ρ = Corr(X, Y) = Cov(X, Y) / (σX σY)

ρ = 0 : no linear relationship between X and Y
ρ > 0 : positive linear relationship between X and Y
        when X is high (low) then Y is likely to be high (low)
        ρ = +1 : perfect positive linear dependency
ρ < 0 : negative linear relationship between X and Y
        when X is high (low) then Y is likely to be low (high)
        ρ = −1 : perfect negative linear dependency

Portfolio Analysis

Let random variable X be the price for stock A
Let random variable Y be the price for stock B
The market value, W, for the portfolio is given by the linear function

  W = aX + bY

(a is the number of shares of stock A, b is the number of shares of stock B)

Portfolio Analysis
(continued)

The mean value for W is

  μW = E[W] = E[aX + bY] = a μX + b μY

The variance for W is

  σW² = a² σX² + b² σY² + 2ab Cov(X, Y)

or, using the correlation formula,

  σW² = a² σX² + b² σY² + 2ab Corr(X, Y) σX σY

Example: Investment Returns

Return per $1,000 for two types of investments:

  P(xᵢ, yᵢ)   Economic condition     Passive Fund X   Aggressive Fund Y
  .2          Recession              − $25            − $200
  .5          Stable economy         + 50             + 60
  .3          Expanding economy      + 100            + 350

  E(x) = μx = (−25)(.2) + (50)(.5) + (100)(.3) = 50

  E(y) = μy = (−200)(.2) + (60)(.5) + (350)(.3) = 95

Computing the Standard Deviation for Investment Returns

Using the same return table as above:

  σX = √( (−25 − 50)²(0.2) + (50 − 50)²(0.5) + (100 − 50)²(0.3) ) = 43.30

  σY = √( (−200 − 95)²(0.2) + (60 − 95)²(0.5) + (350 − 95)²(0.3) ) = 193.71

Covariance for Investment Returns

Using the same return table as above:

  Cov(X, Y) = (−25 − 50)(−200 − 95)(.2) + (50 − 50)(60 − 95)(.5) + (100 − 50)(350 − 95)(.3)
            = 8250

Portfolio Example

Investment X:  μx = 50,  σx = 43.30
Investment Y:  μy = 95,  σy = 193.71
Cov(X, Y) = σxy = 8250

Suppose 40% of the portfolio (P) is in Investment X and 60% is in Investment Y:

  E(P) = (.4)(50) + (.6)(95) = 77

  σP = √( (.4)²(43.30)² + (.6)²(193.71)² + 2(.4)(.6)(8250) ) ≈ 133.30

The portfolio return and portfolio variability are between the values for investments X and Y considered individually
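The whole investment example can be reproduced numerically. The Python sketch below is illustrative, not part of the slides; it recomputes the means, standard deviations, covariance, and the 40/60 portfolio mean and standard deviation directly from the scenario table.

```python
from math import sqrt

# Investment-returns example: (probability, return of X, return of Y) per scenario
scenarios = [(0.2, -25, -200),
             (0.5,  50,   60),
             (0.3, 100,  350)]

mu_x = sum(p * x for p, x, y in scenarios)                          # 50
mu_y = sum(p * y for p, x, y in scenarios)                          # 95
sd_x = sqrt(sum(p * (x - mu_x) ** 2 for p, x, y in scenarios))      # 43.30
sd_y = sqrt(sum(p * (y - mu_y) ** 2 for p, x, y in scenarios))      # 193.71
cov_xy = sum(p * (x - mu_x) * (y - mu_y) for p, x, y in scenarios)  # 8250

a, b = 0.4, 0.6                          # portfolio weights on X and Y
mu_w = a * mu_x + b * mu_y               # 77
sd_w = sqrt(a**2 * sd_x**2 + b**2 * sd_y**2 + 2 * a * b * cov_xy)   # about 133.3

print(mu_w, round(sd_w, 2))
```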

Interpreting the Results for Investment Returns

The aggressive fund has a higher expected return, but much more risk:

  μy = 95 > μx = 50
  but
  σy = 193.71 > σx = 43.30

The covariance of 8250 indicates that the two investments are positively related and will vary in the same direction

Continuous Probability Distributions

A continuous random variable is a variable that can assume any value in an interval:
  time required to complete a task
  temperature of a solution
  age in years, height in inches

These can potentially take on any value, depending only on the ability to measure accurately.

Cumulative Distribution Function

The cumulative distribution function, F(x), for a continuous random variable X expresses the probability that X does not exceed the value of x:

  F(x) = P(X ≤ x)

Let a and b be two possible values of X, with a < b. The probability that X lies between a and b is

  P(a < X < b) = F(b) − F(a)

Probability Density Function

The probability density function, f(x), of random variable X has the following properties:

1. f(x) > 0 for all values of x
2. The area under the probability density function f(x) over all values of the random variable X is equal to 1.0
3. The probability that X lies between two values is the area under the density function graph between the two values
4. The cumulative distribution function F(x0) is the area under the probability density function f(x) from the minimum x value up to x0:

  F(x0) = ∫ from xm to x0 of f(x) dx

where xm is the minimum value of the random variable x

Probability as an Area

The shaded area under the curve of f(x) between a and b is the probability that X is between a and b:

  P(a ≤ X ≤ b) = P(a < X < b)

(Note that the probability of any individual value is zero)

Expectations for Continuous Random Variables

The mean of X, denoted μX, is defined as the expected value of X:

  μX = E(X)

The variance of X, denoted σX², is defined as the expectation of the squared deviation, (X − μX)², of a random variable from its mean:

  σX² = E[(X − μX)²]

The Uniform Distribution

The uniform distribution is a probability distribution that has equal probabilities for all possible outcomes of the random variable

[Figure: flat density f(x) between xmin and xmax; the total area under the uniform probability density function is 1.0]

The Uniform Distribution
(continued)

The continuous uniform distribution:

  f(x) = 1 / (b − a)   if a ≤ x ≤ b
       = 0             otherwise

where
  f(x) = value of the density function at any x value
  a = minimum value of x
  b = maximum value of x

Properties of the Uniform Distribution

The mean of a uniform distribution is

  μ = (a + b) / 2

The variance is

  σ² = (b − a)² / 12

Uniform Distribution Example

Example: Uniform probability distribution over the range 2 ≤ x ≤ 6:

  f(x) = 1 / (6 − 2) = .25   for 2 ≤ x ≤ 6

  μ = (a + b) / 2 = (2 + 6) / 2 = 4

  σ² = (b − a)² / 12 = (6 − 2)² / 12 = 1.333
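The sketch below (illustrative only) codes the same uniform-distribution formulas; the interval probability P(3 < X < 5) at the end is an extra made-up example showing that probabilities are simply rectangle areas under the flat density.

```python
# Mean, variance, and an interval probability for the uniform distribution on [2, 6].
a, b = 2, 6

density = 1 / (b - a)          # f(x) = 0.25 on [2, 6]
mean = (a + b) / 2             # 4.0
variance = (b - a) ** 2 / 12   # 1.333...

# An interval probability is width times height, e.g. P(3 < X < 5):
prob = (5 - 3) * density       # 0.5

print(density, mean, variance, prob)
```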

Linear Functions of Variables

Let W = a + bX, where X has mean μX and variance σX², and a and b are constants

Then the mean of W is

  μW = E(a + bX) = a + b μX

the variance is

  σW² = Var(a + bX) = b² σX²

and the standard deviation of W is

  σW = |b| σX

Linear Functions of Variables
(continued)

An important special case of the previous results is the standardized random variable

  Z = (X − μX) / σX

which has mean 0 and variance 1

The Normal Distribution

Bell shaped
Symmetrical
Mean, median and mode are equal
Location is determined by the mean, μ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range: −∞ to +∞

[Figure: bell-shaped curve of f(x) with mean = median = mode at the center]

Many Normal Distributions

By varying the parameters μ and σ, we obtain different normal distributions

The Normal Distribution Shape

Changing μ shifts the distribution left or right.

Changing σ increases or decreases the spread.

Given the mean μ and variance σ², we define the normal distribution using the notation

  X ~ N(μ, σ²)

The Normal Probability Density Function

The formula for the normal probability density function is

  f(x) = ( 1 / (σ √(2π)) ) e^( −(x − μ)² / (2σ²) )

where
  e = the mathematical constant approximated by 2.71828
  π = the mathematical constant approximated by 3.14159
  μ = the population mean
  σ = the population standard deviation
  x = any value of the continuous variable, −∞ < x < ∞
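The density formula can be evaluated directly and checked against Python's standard-library statistics.NormalDist, as in the sketch below (not part of the slides); the point x = 200 with μ = 100 and σ = 50 anticipates the example used a few slides further on.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """f(x) = 1 / (sigma * sqrt(2*pi)) * exp( -(x - mu)^2 / (2*sigma^2) )"""
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

print(normal_pdf(200, 100, 50))        # direct use of the formula
print(NormalDist(100, 50).pdf(200))    # standard-library check: same value
```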

Cumulative Normal Distribution

For a normal random variable X with mean μ and variance σ², i.e., X ~ N(μ, σ²), the cumulative distribution function is

  F(x0) = P(X ≤ x0)

[Figure: the area under f(x) to the left of x0 represents P(X ≤ x0)]

Finding Normal Probabilities

The probability for a range of values is measured by the area under the curve:

  P(a < X < b) = F(b) − F(a)

Finding Normal Probabilities
(continued)

  F(b) = P(X < b)
  F(a) = P(X < a)
  P(a < X < b) = F(b) − F(a)

The Standardized Normal

Any normal distribution (with any mean and variance combination) can be transformed into the standardized normal distribution (Z), with mean 0 and variance 1:

  Z ~ N(0, 1)

Need to transform X units into Z units by subtracting the mean of X and dividing by its standard deviation:

  Z = (X − μ) / σ

Example

If X is distributed normally with mean of 100 and standard deviation of 50, the Z value for X = 200 is

  Z = (X − μ) / σ = (200 − 100) / 50 = 2.0

This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100.

Comparing X and Z units

[Figure: the same distribution drawn on the X scale (μ = 100, σ = 50), where X = 200, and on the Z scale (μ = 0, σ = 1), where Z = 2.0]

Note that the distribution is the same, only the scale has changed. We can express the problem in original units (X) or in standardized units (Z)

Finding Normal Probabilities

  P(a < X < b) = P( (a − μ)/σ < Z < (b − μ)/σ )
               = F( (b − μ)/σ ) − F( (a − μ)/σ )

Probability as Area Under the Curve

The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean, half is below:

  P(−∞ < X < μ) = 0.5
  P(μ < X < ∞) = 0.5
  P(−∞ < X < ∞) = 1.0

The Standardized Normal Table

The table gives the probability F(a) for any value a.

Example:
  P(Z < 2.00) = .9772
  (the area under the standard normal curve to the left of Z = 2.00)

The Standardized Normal Table
(continued)

For negative Z-values, use the fact that the distribution is symmetric to find the needed probability:

Example:
  P(Z < −2.00) = 1 − 0.9772 = 0.0228
  (the lower-tail area to the left of Z = −2.00 equals the upper-tail area to the right of Z = +2.00)

General Procedure for Finding Probabilities

To find P(a < X < b) when X is distributed normally:

  Draw the normal curve for the problem in terms of X

  Translate X-values to Z-values

  Use the Cumulative Normal Table

Finding Normal Probabilities

Suppose X is normal with mean 8.0 and standard deviation 5.0
Find P(X < 8.6)
Finding Normal Probabilities
(continued)

Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(X < 8.6):

  Z = (X − μ) / σ = (8.6 − 8.0) / 5.0 = 0.12

So P(X < 8.6) for X with μ = 8, σ = 5 equals P(Z < 0.12) for Z with μ = 0, σ = 1.

Solution: Finding P(Z < 0.12)

Standardized normal probability table (portion):

  z      F(z)
  .10    .5398
  .11    .5438
  .12    .5478
  .13    .5517

  P(X < 8.6) = P(Z < 0.12) = F(0.12) = 0.5478

Upper Tail Probabilities

Suppose X is normal with mean 8.0 and standard deviation 5.0.
Now find P(X > 8.6)
Upper Tail Probabilities
(continued)

Now find P(X > 8.6):

  P(X > 8.6) = P(Z > 0.12) = 1.0 − P(Z ≤ 0.12)
             = 1.0 − 0.5478 = 0.4522
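These lookups can be reproduced without a printed table using Python's standard-library statistics.NormalDist; the sketch below is illustrative only and recomputes P(X < 8.6) and P(X > 8.6), both directly and via the standardized Z value.

```python
from statistics import NormalDist

X = NormalDist(mu=8.0, sigma=5.0)

p_lower = X.cdf(8.6)          # P(X < 8.6) = F(0.12) = 0.5478
p_upper = 1 - X.cdf(8.6)      # P(X > 8.6) = 0.4522

# Equivalent calculation after standardizing to Z ~ N(0, 1):
Z = NormalDist()              # mean 0, standard deviation 1
z = (8.6 - 8.0) / 5.0         # z = 0.12

print(round(p_lower, 4), round(Z.cdf(z), 4), round(p_upper, 4))
```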

Finding the X value for a Known Probability

Steps to find the X value for a known probability:
1. Find the Z value for the known probability
2. Convert to X units using the formula:

  X = μ + Zσ

Finding the X value for a Known Probability
(continued)

Example:
Suppose X is normal with mean 8.0 and standard deviation 5.0.
Now find the X value so that only 20% of all values are below this X.

[Figure: 20% (.2000) of the area lies below the unknown X value, i.e., below the corresponding Z value]

Find the Z value for 20% in the Lower Tail

1. Find the Z value for the known probability

Standardized normal probability table (portion):

  z      F(z)
  .82    .7939
  .83    .7967
  .84    .7995
  .85    .8023

A 20% area in the lower tail is consistent with a Z value of −0.84
(since F(0.84) ≈ .80, symmetry gives F(−0.84) ≈ .20)

Finding the X value

2. Convert to X units using the formula:

  X = μ + Zσ = 8.0 + (−0.84)(5.0) = 3.80

So 20% of the values from a distribution with mean 8.0 and standard deviation 5.0 are less than 3.80
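The inverse lookup can also be done in code. The sketch below is illustrative only, again using statistics.NormalDist; note that the table rounds Z to −0.84, which is why the slides report 3.80 while the unrounded answer is about 3.79.

```python
from statistics import NormalDist

X = NormalDist(mu=8.0, sigma=5.0)

x_20 = X.inv_cdf(0.20)             # X value with 20% of the area below it
z_20 = NormalDist().inv_cdf(0.20)  # about -0.8416 (the printed table rounds to -0.84)

print(round(z_20, 4), round(x_20, 2))   # -0.8416  3.79
print(8.0 + z_20 * 5.0)                 # same X value via  X = mu + Z*sigma
```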

Assessing Normality

Not all continuous random variables are normally distributed

It is important to evaluate how well the data are approximated by a normal distribution

The Normal Probability Plot

Normal probability plot:
  Arrange data from low to high values
  Find cumulative normal probabilities for all values
  Examine a plot of the observed values vs. cumulative probabilities (with the cumulative normal probability on the vertical axis and the observed data values on the horizontal axis)
  Evaluate the plot for evidence of linearity

The Normal Probability Plot
(continued)

A normal probability plot for data from a normal distribution will be approximately linear:

[Figure: cumulative percent (0 to 100) plotted against the data values, forming a roughly straight line]

The Normal Probability Plot
(continued)

[Figures: normal probability plots for left-skewed, right-skewed, and uniform data]

Nonlinear plots indicate a deviation from normality

Normal Distribution Approximation for Binomial Distribution

Recall the binomial distribution:
  n independent trials
  probability of success on any given trial = P

Random variable X:
  Xi = 1 if the ith trial is success
  Xi = 0 if the ith trial is failure

  μ = E(X) = nP
  σ² = Var(X) = nP(1 − P)

Normal Distribution Approximation for Binomial Distribution
(continued)

The shape of the binomial distribution is approximately normal if n is large

The normal is a good approximation to the binomial when nP(1 − P) > 9

Standardize to Z from a binomial distribution:

  Z = ( X − E(X) ) / √Var(X) = ( X − nP ) / √( nP(1 − P) )

Normal Distribution Approximation for Binomial Distribution
(continued)

Let X be the number of successes from n independent trials, each with probability of success P.

If nP(1 − P) > 9,

  P(a ≤ X ≤ b) ≈ P( (a − nP) / √( nP(1 − P) )  ≤  Z  ≤  (b − nP) / √( nP(1 − P) ) )

Binomial Approximation Example

40% of all voters support ballot proposition A. What is the probability that between 76 and 80 voters indicate support in a sample of n = 200?

  E(X) = μ = nP = 200(0.40) = 80
  Var(X) = σ² = nP(1 − P) = 200(0.40)(1 − 0.40) = 48
  (note: nP(1 − P) = 48 > 9)

  P(76 ≤ X ≤ 80) ≈ P( (76 − 80) / √48  ≤  Z  ≤  (80 − 80) / √48 )
                 = P(−0.58 ≤ Z ≤ 0)
                 = F(0) − F(−0.58)
                 = 0.5000 − 0.2810 = 0.2190
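The sketch below (illustrative, standard library only) recomputes the normal approximation without the rounding used above and compares it with the exact binomial probability; small differences from 0.2190 come from rounding Z to −0.58 and from the absence of a continuity correction.

```python
from math import sqrt, comb
from statistics import NormalDist

n, p = 200, 0.40
mu, sigma = n * p, sqrt(n * p * (1 - p))    # 80 and sqrt(48)

# Normal approximation to P(76 <= X <= 80)
Z = NormalDist()
approx = Z.cdf((80 - mu) / sigma) - Z.cdf((76 - mu) / sigma)

# Exact binomial probability for comparison
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(76, 81))

print(round(approx, 4), round(exact, 4))
```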

The Exponential Distribution

Used to model the length of time between two occurrences of an event (the time between arrivals)

Examples:
  Time between trucks arriving at an unloading dock
  Time between transactions at an ATM machine
  Time between phone calls to the main operator

The Exponential Distribution
(continued)

The exponential random variable T (t > 0) has a probability density function

  f(t) = λ e^(−λt)   for t > 0

where
  λ is the mean number of occurrences per unit time
  t is the number of time units until the next occurrence
  e = 2.71828

T is said to follow an exponential probability distribution

The Exponential Distribution
(continued)

Defined by a single parameter, λ (lambda), the mean number of occurrences per unit time

The cumulative distribution function (the probability that an arrival time is less than some specified time t) is

  F(t) = 1 − e^(−λt)

where
  e = mathematical constant approximated by 2.71828
  λ = the population mean number of arrivals per unit
  t = any value of the continuous variable where t > 0

Exponential Distribution Example

Example: Customers arrive at the service counter at the rate of 15 per hour. What is the probability that the arrival time between consecutive customers is less than three minutes?

  The mean number of arrivals per hour is 15, so λ = 15
  Three minutes is .05 hours

  P(arrival time < .05) = 1 − e^(−λt) = 1 − e^(−(15)(.05)) = 0.5276

So there is a 52.76% probability that the arrival time between successive customers is less than three minutes
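The same probability can be computed in one line of code; the Python sketch below is illustrative only.

```python
from math import exp

lam = 15      # mean arrivals per hour
t = 0.05      # three minutes expressed in hours

prob = 1 - exp(-lam * t)      # F(t) = 1 - e^(-lambda * t)
print(round(prob, 4))         # 0.5276
```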

Covariance

Let X and Y be continuous random variables, with means μx and μy

The expected value of (X − μx)(Y − μy) is called the covariance between X and Y:

  Cov(X, Y) = E[(X − μx)(Y − μy)]

An alternative but equivalent expression is

  Cov(X, Y) = E(XY) − μx μy

If the random variables X and Y are independent, then the covariance between them is 0.

Correlation

Let X and Y be jointly distributed random variables. The correlation between X and Y is

  ρ = Corr(X, Y) = Cov(X, Y) / (σX σY)

Sums of Random Variables

Let X1, X2, ..., Xk be k random variables with means μ1, μ2, ..., μk and variances σ1², σ2², ..., σk². Then:

The mean of their sum is the sum of their means:

  E(X1 + X2 + ... + Xk) = μ1 + μ2 + ... + μk

Sums of Random Variables
(continued)

Let X1, X2, ..., Xk be k random variables with means μ1, μ2, ..., μk and variances σ1², σ2², ..., σk². Then:

If the covariance between every pair of these random variables is 0, then the variance of their sum is the sum of their variances:

  Var(X1 + X2 + ... + Xk) = σ1² + σ2² + ... + σk²

However, if the covariances between pairs of random variables are not 0, the variance of their sum is

  Var(X1 + X2 + ... + Xk) = σ1² + σ2² + ... + σk² + 2 Σ(i=1 to K−1) Σ(j=i+1 to K) Cov(Xi, Xj)

Differences Between Two Random Variables

For two random variables, X and Y:

The mean of their difference is the difference of their means; that is,

  E(X − Y) = μX − μY

If the covariance between X and Y is 0, then the variance of their difference is

  Var(X − Y) = σX² + σY²

If the covariance between X and Y is not 0, then the variance of their difference is

  Var(X − Y) = σX² + σY² − 2 Cov(X, Y)

Linear Combinations of Random Variables

A linear combination of two random variables, X and Y, (where a and b are constants) is

  W = aX + bY

The mean of W is

  μW = E[W] = E[aX + bY] = a μX + b μY

Linear Combinations of Random Variables
(continued)

The variance of W is

  σW² = a² σX² + b² σY² + 2ab Cov(X, Y)

or, using the correlation,

  σW² = a² σX² + b² σY² + 2ab Corr(X, Y) σX σY

If X and Y are jointly normally distributed random variables, then the linear combination, W, is also normally distributed

Example

X = minutes to complete task 1;  μx = 20, σx = 5
Y = minutes to complete task 2;  μy = 30, σy = 8

What are the mean and standard deviation for the time to complete both tasks?

  W = X + Y

  μW = μx + μy = 20 + 30 = 50

Since X and Y are independent, Cov(X, Y) = 0, so

  σW² = σx² + σy² + 2 Cov(X, Y) = (5)² + (8)² = 89

The standard deviation is

  σW = √89 = 9.434
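The closing example can be verified with a few lines of Python; the sketch below is illustrative only and simply re-applies the variance rule for the sum of two independent random variables.

```python
from math import sqrt

mu_x, sd_x = 20, 5      # task 1: mean and standard deviation in minutes
mu_y, sd_y = 30, 8      # task 2: mean and standard deviation in minutes
cov_xy = 0              # the tasks are treated as independent

mu_w = mu_x + mu_y                          # 50
var_w = sd_x**2 + sd_y**2 + 2 * cov_xy      # 89

print(mu_w, var_w, round(sqrt(var_w), 3))   # 50 89 9.434
```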
