
Anderson  Sweeney  Williams

CONTEMPORARY BUSINESS STATISTICS
WITH MICROSOFT EXCEL

Slides Prepared by JOHN LOUCKS

2001 South-Western/Thomson Learning

Chapter 7
Sampling and Sampling Distributions

Simple Random Sampling
Point Estimation
Introduction to Sampling Distributions
Sampling Distribution of $\bar{x}$
Sampling Distribution of $\bar{p}$
Other Sampling Methods

Statistical Inference

The purpose of statistical inference is to obtain
information about a population from
information contained in a sample.
A population is the set of all the elements of
interest.
A sample is a subset of the population.
The sample results provide only estimates of
the values of the population characteristics.
A parameter is a numerical characteristic of a
population.
With proper sampling methods, the sample
results will provide good estimates of the
population characteristics.

Simple Random Sampling

Finite Population
A simple random sample from a finite
population of size N is a sample selected
such that each possible sample of size n has
the same probability of being selected.
Replacing each sampled element before
selecting subsequent elements is called
sampling with replacement.
Sampling without replacement is the
procedure used most often.
In large sampling projects, computer-generated
random numbers are often used
to automate the sample selection process.

Simple Random Sampling

Infinite Population
A simple random sample from an infinite
population is a sample selected such that
the following conditions are satisfied.
Each element selected comes from the
same population.
Each element is selected independently.
The population is usually considered infinite
if it involves an ongoing process that makes
listing or counting every element
impossible.
The random number selection procedure
cannot be used for infinite populations.

Point Estimation

In point estimation we use the data from the
sample to compute a value of a sample
statistic that serves as an estimate of a
population parameter.
We refer to $\bar{x}$ as the point estimator of the
population mean $\mu$.
$s$ is the point estimator of the population
standard deviation $\sigma$.
$\bar{p}$ is the point estimator of the population
proportion $p$.

Sampling Error

The absolute difference between an unbiased
point estimate and the corresponding
population parameter is called the sampling
error.
Sampling error is the result of using a subset
of the population (the sample), and not the
entire population to develop estimates.
The sampling errors are:
$|\bar{x} - \mu|$ for sample mean
$|s - \sigma|$ for sample standard deviation
$|\bar{p} - p|$ for sample proportion

Example: St. Edwards


St. Edwards University receives 1,500 applications
annually from prospective students. The application
forms contain a variety of information, including the
individual's scholastic aptitude test (SAT) score and
whether or not the individual is an in-state resident.
The director of admissions would like to know, at
least roughly, the following information:
the average SAT score for the applicants, and
the proportion of applicants that are in-state residents.

Example: St. Edwards

Alternative #1: Take a Census of 1,500 Applicants

SAT Scores
Population Mean
$$\mu = \frac{\sum x_i}{1{,}500} = 990$$
Population Standard Deviation
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{1{,}500}} = 80$$

In-State Applicants
Population Proportion
$$p = \frac{1{,}080}{1{,}500} = .72$$

Example: St. Edwards

Alternative #2: Take a Sample of 50 Applicants

Excel can be used to select a simple random
sample without replacement.
The process is based on random numbers
generated by Excel's RAND function.
The RAND function generates numbers in the
interval from 0 to 1.
Any number in the interval is equally likely.
The numbers are actually values of a
uniformly distributed random variable.

Example: St. Edwards

Using Excel to Select a Simple Random Sample


1,500 random numbers are generated, one
for each applicant in the population.
Then we choose the 50 applicants
corresponding to the 50 smallest random
numbers as our sample.
Each of the 1,500 applicants has the same
probability of being included.
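
The selection procedure described on this slide can be sketched outside of Excel. The following is a minimal Python illustration, not the worksheet itself: the function name `select_simple_random_sample`, the applicant IDs 1 to 1500, and the seed are assumptions made only for this example. It assigns one uniform random number to each element, sorts by that number, and keeps the elements with the smallest values.

```python
import random

def select_simple_random_sample(population, n, seed=None):
    """Pick n elements without replacement by sorting on random keys."""
    rng = random.Random(seed)
    # One uniform random number in [0, 1) per element, like =RAND() in a worksheet column.
    keyed = [(rng.random(), element) for element in population]
    # Sort ascending by the random number and keep the n smallest.
    keyed.sort(key=lambda pair: pair[0])
    return [element for _, element in keyed[:n]]

# Hypothetical applicant IDs 1..1500 stand in for the rows of the worksheet.
applicant_ids = list(range(1, 1501))
sample_ids = select_simple_random_sample(applicant_ids, n=50, seed=7)
print(len(sample_ids), len(set(sample_ids)))
```

Because every applicant receives an independent random number, every set of 50 applicants is equally likely to end up with the 50 smallest values.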


Using Excel to Select a Simple Random Sample

Formula Worksheet

      A           B          C
1     SAT Score   In-State   Random Number
2     1008        Yes        =RAND()
3     1025        No         =RAND()
4     952         Yes        =RAND()
5     1090        Yes        =RAND()
6     1127        Yes        =RAND()
7     1015        No         =RAND()
8     965         Yes        =RAND()
9     1161        No         =RAND()

Note: Rows 10-1501 are not shown.

Using Excel to Select a Simple Random Sample

Value Worksheet

      A           B          C
1     SAT Score   In-State   Random Number
2     1008        Yes        0.38184
3     1025        No         0.08037
4     952         Yes        0.25515
5     1090        Yes        0.82225
6     1127        Yes        0.38700
7     1015        No         0.52999
8     965         Yes        0.27962
9     1161        No         0.28245

Note: Rows 10-1501 are not shown.

Using Excel to Select a Simple Random Sample

Put Random Numbers in Ascending Order

Step 1 Select cells A2:A1501
Step 2 Select the Data pull-down menu
Step 3 Choose the Sort option
Step 4 When the Sort dialog box appears:
    Choose Random Numbers in the Sort by text box
    Choose Ascending
    Click OK

Using Excel to Select a Simple Random Sample

Value Worksheet (Sorted)

      A           B          C
1     SAT Score   In-State   Random Number
2     1107        No         0.00027
3     1043        Yes        0.00192
4     991         Yes        0.00303
5     1008        No         0.00481
6     1127        Yes        0.00538
7     982         Yes        0.00583
8     1163        Yes        0.00649
9     1008        No         0.00667

Note: Rows 10-1501 are not shown.

Example: St. Edwards

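Point Estimates

$\bar{x}$ as Point Estimator of $\mu$
$$\bar{x} = \frac{\sum x_i}{50} = \frac{49{,}850}{50} = 997$$
$s$ as Point Estimator of $\sigma$
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{49}} = \sqrt{\frac{277{,}097}{49}} = 75.2$$
$\bar{p}$ as Point Estimator of $p$
$$\bar{p} = \frac{34}{50} = .68$$

Note: Different random numbers would have
identified a different sample which would have
resulted in different point estimates.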
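
As a quick arithmetic check, the point estimates can be recomputed from the sums given on this slide and compared with the census values from Alternative #1 to obtain the sampling errors. This is only a sketch: the variable names are chosen for the example, and the individual sample observations are not available, so the code starts from the reported totals.

```python
import math

n = 50
sum_x = 49_850          # sum of the 50 sampled SAT scores (from the slide)
sum_sq_dev = 277_097    # sum of squared deviations from the sample mean (from the slide)
in_state_count = 34     # in-state residents in the sample (from the slide)

# Point estimates.
x_bar = sum_x / n
s = math.sqrt(sum_sq_dev / (n - 1))
p_bar = in_state_count / n

# Census values from Alternative #1.
mu, sigma, p = 990, 80, 0.72

# Sampling error = absolute difference between estimate and parameter.
print(f"x_bar = {x_bar:.0f}, sampling error = {abs(x_bar - mu):.0f}")
print(f"s = {s:.1f}, sampling error = {abs(s - sigma):.1f}")
print(f"p_bar = {p_bar:.2f}, sampling error = {abs(p_bar - p):.2f}")
```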

Sampling Distribution of $\bar{x}$

The sampling distribution of $\bar{x}$ is the
probability distribution of all possible values of
the sample mean $\bar{x}$.

Expected Value of $\bar{x}$
$$E(\bar{x}) = \mu$$
where:
$\mu$ = the population mean

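Sampling Distribution of $\bar{x}$

Standard Deviation of $\bar{x}$

Finite Population
$$\sigma_{\bar{x}} = \sqrt{\frac{N - n}{N - 1}} \left( \frac{\sigma}{\sqrt{n}} \right)$$

Infinite Population
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

A finite population is treated as being
infinite if n/N < .05.
$\sqrt{(N - n)/(N - 1)}$ is the finite correction factor.
$\sigma_{\bar{x}}$ is referred to as the standard error of the
mean.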
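
The two formulas for the standard error of the mean can be compared numerically. The sketch below is an illustration under assumptions: the helper name `standard_error_mean` is hypothetical, and the inputs (standard deviation 80, n = 50, N = 1,500) are the St. Edwards values used elsewhere in these slides.

```python
import math

def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean; applies the finite correction factor when N is given."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

sigma, n, N = 80, 50, 1500

print(n / N < 0.05)                                # is the population treated as infinite?
print(round(standard_error_mean(sigma, n), 1))     # infinite-population formula
print(round(standard_error_mean(sigma, n, N), 1))  # finite-population formula
```

With n/N this small, the correction factor is close to 1, which is why the simpler infinite-population formula is used in such cases.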


Sampling Distribution of $\bar{x}$

If we use a large (n > 30) simple random
sample, the central limit theorem enables us
to conclude that the sampling distribution of $\bar{x}$
can be approximated by a normal probability
distribution.
When the simple random sample is small (n < 30),
the sampling distribution of $\bar{x}$ can be
considered normal only if we assume the
population has a normal probability
distribution.
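
A small simulation can make the central limit theorem concrete. The sketch below is an illustration only: the skewed population (exponential values with mean 1), the seed, and the number of repetitions are all assumptions for the example. It draws many simple random samples of size 50 and checks that the sample means center on the population mean, spread out by roughly the standard error, and fall within one standard error about as often as a normal distribution would suggest.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical skewed population (an assumption for illustration only).
population = [rng.expovariate(1.0) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 50
se = sigma / n ** 0.5

# Draw many simple random samples of size n and record each sample mean.
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(2_000)]

# Share of sample means within one standard error of mu (about 0.68 under normality).
within_one_se = sum(abs(m - mu) <= se for m in sample_means) / len(sample_means)

print(round(mu, 3), round(statistics.fmean(sample_means), 3))
print(round(se, 3), round(statistics.stdev(sample_means), 3))
print(round(within_one_se, 3))
```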


Example: St. Edwards

Sampling Distribution of $\bar{x}$ for the SAT Scores

$$E(\bar{x}) = \mu = 990$$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{50}} = 11.3$$

Example: St. Edwards

Sampling Distribution of $\bar{x}$ for the SAT Scores

What is the probability that a simple
random sample of 50 applicants will provide
an estimate of the population mean SAT score
that is within plus or minus 10 of the actual
population mean $\mu$?
In other words, what is the probability that $\bar{x}$
will be between 980 and 1000?

Example: St. Edwards

Sampling Distribution of $\bar{x}$ for the SAT Scores

[Figure: sampling distribution of $\bar{x}$ centered at 990; area = .3106 between 980 and 990 and area = .3106 between 990 and 1000]

Using the standard normal probability table
with z = 10/11.3 = .88, we have area = (.3106)(2)
= .6212.
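
The table lookup can be cross-checked in code. The following sketch assumes the normal approximation used on this slide and computes the standard normal cumulative probability from the error function; the helper name `normal_cdf` is chosen for the example. Because z is not rounded to two decimals here, the result is close to, but not exactly, the table-based .6212.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 990, 80, 50
se = sigma / math.sqrt(n)       # standard error of the mean

z = 10 / se                     # distance of 10 points expressed in standard errors
prob = normal_cdf(z) - normal_cdf(-z)

print(round(z, 2), round(prob, 4))
```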

Sampling Distribution of $\bar{p}$

The sampling distribution of $\bar{p}$ is the
probability distribution of all possible values of
the sample proportion $\bar{p}$.

Expected Value of $\bar{p}$
$$E(\bar{p}) = p$$
where:
p = the population proportion

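Sampling Distribution of $\bar{p}$

Standard Deviation of $\bar{p}$

Finite Population
$$\sigma_{\bar{p}} = \sqrt{\frac{N - n}{N - 1}} \sqrt{\frac{p(1 - p)}{n}}$$

Infinite Population
$$\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}}$$

$\sigma_{\bar{p}}$ is referred to as the standard error of the
proportion.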

Example: St. Edwards

Sampling Distribution of $\bar{p}$ for In-State Residents

$$E(\bar{p}) = p = .72$$
$$\sigma_{\bar{p}} = \sqrt{\frac{.72(1 - .72)}{50}} = .0635$$

The normal probability distribution is an
acceptable approximation since np = 50(.72)
= 36 > 5 and
n(1 - p) = 50(.28) = 14 > 5.
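
The standard error of the proportion and the two conditions for the normal approximation can be checked with a few lines of code. This is a sketch under assumptions: the helper name `standard_error_proportion` is hypothetical, and the optional `N` argument applies the finite-population formula from the earlier slide.

```python
import math

def standard_error_proportion(p, n, N=None):
    """Standard error of the sample proportion; applies the finite correction when N is given."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

p, n = 0.72, 50
se = standard_error_proportion(p, n)

# Conditions stated on the slide for using the normal approximation.
normal_ok = n * p > 5 and n * (1 - p) > 5

print(round(se, 4), normal_ok)
```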

Example: St. Edwards

Sampling Distribution of $\bar{p}$ for In-State Residents

What is the probability that a simple
random sample of 50 applicants will provide
an estimate of the population proportion of
in-state residents that is within plus or minus .05
of the actual population proportion?
In other words, what is the probability that $\bar{p}$
will be between .67 and .77?

Example: St. Edwards

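Sampling Distribution of $\bar{p}$ for In-State Residents

[Figure: sampling distribution of $\bar{p}$ centered at 0.72; area = .2852 between 0.67 and 0.72 and area = .2852 between 0.72 and 0.77]

For z = .05/.0635 = .79, the area = (.2852)(2) = .5704.
The probability is .5704 that the sample
proportion will be within plus or minus .05
of the actual population proportion.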
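
One way to sanity-check this probability is to simulate the sampling itself. The sketch below is an illustration, not part of the original example: it builds a population of 1,500 applicants with 1,080 in-state residents (the census figures from the slides), repeatedly draws simple random samples of 50 without replacement, and counts how often the sample proportion lands between .67 and .77. The seed and the number of trials are assumptions. The simulated fraction should be near, but not identical to, .5704, because of simulation noise, the discreteness of the sample proportion, and the finite population.

```python
import random

rng = random.Random(4)

# Population from the slides: 1,500 applicants, 1,080 of them in-state (True).
population = [True] * 1080 + [False] * 420
n = 50
trials = 5_000

hits = 0
for _ in range(trials):
    sample = rng.sample(population, n)   # simple random sample without replacement
    p_bar = sum(sample) / n              # sample proportion of in-state applicants
    if 0.67 <= p_bar <= 0.77:
        hits += 1

fraction_within = hits / trials
print(round(fraction_within, 3))
```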

Other Sampling Methods

Stratified Random Sampling
Cluster Sampling
Systematic Sampling
Convenience Sampling
Judgment Sampling

Stratified Random Sampling

The population is first divided into groups of
elements called strata.
Each element in the population belongs to one
and only one stratum.
Best results are obtained when the elements
within each stratum are as much alike as
possible (i.e. homogeneous group).
A simple random sample is taken from each
stratum.
Formulas are available for combining the
stratum sample results into one population
parameter estimate.
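
As one illustration of combining stratum results, a common estimate of the population mean weights each stratum's sample mean by that stratum's share of the population. The sketch below is a hedged example rather than the formula set the slides refer to: the strata, their sizes, the simulated values, and the proportional sample sizes are all hypothetical.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population split into strata by department (values are made up).
strata = {
    "sales": [rng.gauss(50, 5) for _ in range(600)],
    "engineering": [rng.gauss(80, 5) for _ in range(300)],
    "support": [rng.gauss(35, 5) for _ in range(100)],
}

def stratified_mean_estimate(strata, sample_sizes, rng):
    """Weight each stratum's sample mean by the stratum's share of the population."""
    total = sum(len(values) for values in strata.values())
    estimate = 0.0
    for name, values in strata.items():
        sample = rng.sample(values, sample_sizes[name])  # simple random sample within the stratum
        estimate += (len(values) / total) * statistics.fmean(sample)
    return estimate

# Sample sizes proportional to stratum size (an assumption for this example).
sample_sizes = {"sales": 30, "engineering": 15, "support": 5}
estimate = stratified_mean_estimate(strata, sample_sizes, rng)
true_mean = statistics.fmean(x for values in strata.values() for x in values)

print(round(estimate, 1), round(true_mean, 1))
```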


Stratified Random Sampling

Advantage: If strata are homogeneous, this
method is as precise as simple random
sampling but with a smaller total sample size.
Example: The basis for forming the strata
might be department, location, age, industry
type, etc.

Cluster Sampling

The population is first divided into separate
groups of elements called clusters.
Ideally, each cluster is a representative
small-scale version of the population (i.e.,
heterogeneous group).
A simple random sample of the clusters is then
taken.
All elements within each sampled (chosen)
cluster form the sample.
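
The two-step procedure (sample clusters, then keep every element in the chosen clusters) can be sketched as follows. The population of 40 blocks with 25 households each, the function name `cluster_sample`, and the seed are hypothetical choices for the example.

```python
import random

rng = random.Random(2)

# Hypothetical population: 40 clusters (blocks) of 25 households each.
clusters = {
    block: [f"block{block}-house{h}" for h in range(25)]
    for block in range(40)
}

def cluster_sample(clusters, num_clusters, rng):
    """Draw a simple random sample of clusters and keep all of their elements."""
    chosen = rng.sample(sorted(clusters), num_clusters)
    sample = [element for c in chosen for element in clusters[c]]
    return chosen, sample

chosen_blocks, sample = cluster_sample(clusters, 4, rng)
print(chosen_blocks, len(sample))
```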

Cluster Sampling

Advantage: The close proximity of elements
can be cost effective (i.e., many sample
observations can be obtained in a short time).
Disadvantage: This method generally requires
a larger total sample size than simple or
stratified random sampling.
Example: A primary application is area
sampling, where clusters are city blocks or
other well-defined areas.


Systematic Sampling

If a sample size of n is desired from a
population containing N elements, we might
sample one element for every N/n elements in
the population.
We randomly select one of the first N/n
elements from the population list.
We then select every (N/n)th element that
follows in the population list.
This method has the properties of a simple
random sample, especially if the list of the
population elements is a random ordering.
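
A minimal sketch of systematic sampling follows, taking the sampling interval as k = N/n (population size divided by sample size, rounded down). The function name `systematic_sample`, the list of 5,000 entries, and the seed are assumptions for the example.

```python
import random

def systematic_sample(population, n, rng):
    """Take every k-th element after a random start among the first k, with k = N // n."""
    k = len(population) // n     # sampling interval N/n (integer part)
    start = rng.randrange(k)     # random position among the first k elements
    return population[start::k][:n]

rng = random.Random(3)
listing = list(range(1, 5001))   # hypothetical list of 5,000 entries
sample = systematic_sample(listing, 50, rng)

print(sample[:3], len(sample))
```

With N = 5,000 and n = 50 the interval is 100, so after the random start every 100th entry is selected.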

Systematic Sampling

Advantage: The sample usually will be easier
to identify than it would be if simple random
sampling were used.
Example: Selecting every 100th listing in a
telephone book after the first randomly
selected listing.


Convenience Sampling

It is a nonprobability sampling technique.


Items are included in the sample without
known probabilities of being selected.
The sample is identified primarily by
convenience.
Advantage: Sample selection and data
collection are relatively easy.
Disadvantage: It is impossible to determine
how representative of the population the
sample is.
Example: A professor conducting research
might use student volunteers to constitute a
sample.

Judgment Sampling

The person most knowledgeable on the subject
of the study selects elements of the population
that he or she feels are most representative of
the population.
It is a nonprobability sampling technique.
Advantage: It is a relatively easy way of
selecting a sample.
Disadvantage: The quality of the sample
results depends on the judgment of the person
selecting the sample.
Example: A reporter might sample three or
four senators, judging them as reflecting the
general opinion of the senate.


End of Chapter 7
