
CHAPTER 2

STAT 505
Why Probability?
• Probability is the study of chance. Historically, probability was
studied to help the wealthy win at gambling games. Today,
however, it has many uses.

• Why do we need probability in a statistics course?

• You can use the techniques of Chapter 1 to describe a population
fully. This means we have all the data.

• For example, if I want to know the average score on the first
test in this course, I can easily calculate it, since there are only
50 students in this class.

• If the mean score is 75, this is the average score for the class.
The branch of statistics that deals with population data was
covered in Chapter 1, and it is called descriptive statistics.

• However, suppose I want the average height of an adult male.
It is impossible to get the data for all adult males, so instead
we construct a sample. Probability lets us judge how well the
sample reflects the whole population.
Probability Experiments

A probability experiment is an action through which specific results
(counts, measurements, or responses) are obtained.

Example:
Rolling a die and observing the number that is rolled is a
probability experiment.
The result of a single trial in a probability experiment is the
outcome.
The set of all possible outcomes for an experiment is the sample
space.
Example:
The sample space when rolling a die has six outcomes.
{1, 2, 3, 4, 5, 6}
Events

An event consists of one or more outcomes and is a subset of the sample
space.

Events are represented by uppercase letters.

Example:
A die is rolled. Event A is rolling an even number.

A simple event is an event that consists of a single outcome.


Example:
A die is rolled. Event A is rolling an even number.
This is not a simple event because the outcomes of event A are
{2, 4, 6}.
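
The definitions above can be sketched in a few lines of Python. This is only an illustration: the names `sample_space`, `event_a`, `is_event`, and `is_simple_event` are made up for this sketch and are not part of the course material.

```python
# Sample space for one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

# Event A: rolling an even number (a subset of the sample space).
event_a = {outcome for outcome in sample_space if outcome % 2 == 0}


def is_event(event, space):
    """An event has one or more outcomes and is a subset of the sample space."""
    return len(event) > 0 and event <= space


def is_simple_event(event, space):
    """A simple event is an event that consists of a single outcome."""
    return is_event(event, space) and len(event) == 1


print(sorted(event_a))                          # [2, 4, 6]
print(is_simple_event(event_a, sample_space))   # False
print(is_simple_event({5}, sample_space))       # True
```

Using Python sets keeps the code close to the set notation on the slides: the `<=` operator is the subset test.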
Classical Probability

• Classical (or theoretical) probability is used when each
outcome in a sample space is equally likely to occur. The classical
probability for event E is given by

$$P(E) = \frac{\text{Number of outcomes in event } E}{\text{Total number of outcomes in sample space}}.$$

Example:
A die is rolled. Find the probability of Event A: rolling a 5.

There is one outcome in Event A: {5}

$$P(A) = \frac{1}{6} \approx 0.167$$

P(A) is read "probability of Event A."
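
As a minimal sketch, the classical probability formula can be computed by counting outcomes. The function name `classical_probability` is hypothetical, and the calculation is only valid under the assumption stated above, that every outcome in the sample space is equally likely.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """Outcomes in the event divided by outcomes in the sample space.

    Assumes every outcome in the sample space is equally likely.
    """
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))


die = {1, 2, 3, 4, 5, 6}
p_five = classical_probability({5}, die)
print(p_five, round(float(p_five), 3))  # 1/6 0.167
```

`Fraction` keeps the exact value 1/6, and rounding happens only when the result is displayed.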
Empirical Probability
• Empirical (or statistical) probability is based on observations
obtained from probability experiments. The empirical probability of
an event E is the relative frequency of event E.

$$P(E) = \frac{\text{Frequency of event } E}{\text{Total frequency}} = \frac{f}{n}$$
Example:
A travel agent determines that in every 50 reservations she makes, 12
will be for a cruise. What is the probability that the next reservation she
makes will be for a cruise?
$$P(\text{cruise}) = \frac{12}{50} = 0.24$$
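
The relative frequency f / n can be sketched from a list of observed outcomes. The `reservations` list below is a made-up booking log that matches the counts in the example (12 cruises out of 50), and `empirical_probability` is a hypothetical helper name.

```python
from collections import Counter


def empirical_probability(observations, event):
    """Relative frequency f / n of `event` among the observed outcomes."""
    n = len(observations)
    if n == 0:
        raise ValueError("need at least one observation")
    f = Counter(observations)[event]
    return f / n


# Hypothetical log matching the example: 12 cruises in 50 reservations.
reservations = ["cruise"] * 12 + ["other"] * 38
print(empirical_probability(reservations, "cruise"))  # 0.24
```

Unlike the classical formula, nothing here assumes equally likely outcomes. The estimate depends entirely on the data that were observed.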
RANDOM VARIABLES

• A random variable is a variable that has a single numeric
value for each outcome of an experiment.

• Consider this example: the experiment is rolling a single
six-sided die. A random variable x can hold the result of one roll of
this die, so x can be 1, 2, 3, 4, 5, or 6. This is an example of a
discrete random variable, since it has a finite number of
values.
RANDOM VARIABLES
• Here is one final example:

• The experiment is to measure the amount of gas in the fuel
tank of a particular car. The random variable x can hold the
result of this measurement. In this case x can be any real
number from 0 to the fuel capacity of the tank. For example, a
Ford Explorer has a fuel capacity of 21 gallons, so x could be
any real number from 0 to 21 gallons. The possible values are
neither finite nor countable, so x is a continuous
random variable.

• A discrete random variable has either a finite number of
values or a countable number of values.

• A continuous random variable has infinitely many values, and
those values can be associated with measurements on a
continuous scale with no interruptions.
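
The two kinds of random variable can be simulated with Python's standard `random` module. This is a sketch under stated assumptions: the function names and the seed are arbitrary, and drawing the fuel level uniformly between 0 and 21 gallons is an assumption made only for illustration, since the slides do not say how fuel levels are distributed.

```python
import random

rng = random.Random(505)  # arbitrary fixed seed so the sketch is repeatable


def roll_die():
    """Discrete random variable: one of the six values 1 to 6."""
    return rng.randint(1, 6)


def fuel_level(capacity_gallons=21.0):
    """Continuous random variable: a real number from 0 to the capacity.

    The uniform distribution here is an illustrative assumption.
    """
    return rng.uniform(0.0, capacity_gallons)


rolls = [roll_die() for _ in range(10)]
levels = [fuel_level() for _ in range(10)]
print(rolls)
print([round(level, 3) for level in levels])
```

The die values can be listed in full, while the fuel levels fall anywhere on a continuous scale. A computer stores them as floating-point approximations of real numbers.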
RANDOM SAMPLES
• We can flip a coin or toss a paper cup an unlimited number of
times, so the set of possible sample values is infinite.

• The U.S. Census attempts to measure various features of the
United States population. Although the U.S. Census is not
completely accurate, it still does a good job of describing
our population.

• A random sample is one where the researcher ensures
(usually through the use of random numbers applied to a
list of the entire population) that each member of that
population has an equal probability of being selected.

• Random samples are an important foundation of statistics.
Almost all of the mathematical theory on which statistics
is based relies on assumptions that are consistent with a
random sample.
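
A simple random sample from a complete population list can be sketched with `random.sample`, which draws without replacement so that every member is equally likely to be chosen. The population of 50 student labels and the helper name `simple_random_sample` are made up for this example.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw `sample_size` distinct members from a full population list."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Hypothetical population list: the 50 students in the class.
students = [f"student_{i:02d}" for i in range(1, 51)]
sample = simple_random_sample(students, 10, seed=1)
print(sample)
```

Passing a seed makes the draw repeatable for checking. In a real survey the seed would normally be left unset.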
Sample Survey
• A sample survey is a survey of a population made by using only a
portion of the population.

• For any sample survey:

 - state the objectives clearly.
 - define the target population carefully (what kind of people to
interview).
 - design the sample selection plan using randomization, so as to
reduce sampling bias.
 - decide on a method of measurement that will minimize
measurement bias.
 - use a pretest to try out the plan.
 - organize the data collection and data management.
 - plan the data analysis carefully.
 - write the conclusions in light of the original objectives.
Sample Survey
• Establish the goals of the project - What you want to learn

• Determine your sample - Whom you will interview

• Choose interviewing methodology - How you will interview

• Create your questionnaire - What you will ask

• Pre-test the questionnaire, if practical - Test the questions

• Conduct interviews and enter data - Ask the questions

• Analyze the data - Produce the reports


EXAMPLE
• Goal: ratings of current products or services (e.g., cable internet).

• Decide what kind of people to interview.

• The next thing to decide is how many people you need to
interview (the representative sample). The larger the
sample, the more precisely it reflects the target group.
However, the rate of improvement in precision
decreases as your sample size increases, because precision grows
roughly with the square root of the sample size. For example,
increasing a sample from 250 to 1,000 only doubles the
precision. You must decide on your sample
size based on factors such as time available, budget, and
the necessary degree of precision.
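
The 250 to 1,000 figure can be checked with a short calculation. This sketch assumes the usual rule that the standard error of an estimate is proportional to 1/sqrt(n), so precision scales with sqrt(n). The function name `relative_precision` is hypothetical.

```python
import math


def relative_precision(n_small, n_large):
    """How many times more precise a sample of n_large is than n_small.

    Assumes the standard error is proportional to 1 / sqrt(n).
    """
    return math.sqrt(n_large / n_small)


print(relative_precision(250, 1000))  # 2.0

# Diminishing returns: each doubling of the sample size improves
# precision by the same factor, sqrt(2), at twice the cost.
for n in (250, 500, 1000, 2000, 4000):
    print(n, round(relative_precision(250, n), 2))
```

Quadrupling the sample size is needed to double the precision, which is why budget and time usually limit the sample size long before precision stops improving.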

• Avoiding a biased sample: a biased sample will produce
biased results.
EXAMPLE
• Probable Bias - Atypical People

• Reason - Limited to people with Internet access.
Internet users are not representative of the general
population, even when matched on age, gender,
etc. This can be a serious problem, unless you are
only interested in people who have Internet
access.

• Once you have decided on your sample, you must
decide on your method of data collection. Each
method has advantages and disadvantages.
Interviewing Methods
• Personal Interviews

• Telephone Surveys

• Mail Surveys

• Computer Direct Interviews

• Email Surveys

• Internet/Intranet (Web Page) Surveys

• Scanning Questionnaires
Question Types
• Question types: multiple choice, numeric open-ended, and text
open-ended.

• 1. Where do you live? (multiple choice)

   North   South   East   West

• 2. How much did you spend on groceries this week? (numeric open-ended)

   ______

• 3. How can we improve our services? (text open-ended)

   ______________


Observational studies are studies in which subjects are
observed in their natural state.

• Subjects may be measured and tested (e.g., total cholesterol
measured, disease status ascertained), but there is no
intervention or treatment (such as allocating patients to different
exercise programs, or to a new drug or a placebo).

• Observational studies include cohort studies, case-control
studies, ecological studies, and cross-sectional studies.
EXAMPLES
• Decide what type of study each of the following three studies is.

• The applicants are interested in the etiology and treatment
of a disease affecting the knee. Briefly, they plan to:

a) Compare leg measurements between subjects with and
without disease (Observational)

b) Compare leg measurements between the symptomatic
and asymptomatic leg of diseased individuals
(Observational)

c) Randomly allocate subjects with disease to treatment or
no treatment and compare change in leg measurements
over a period of 6 months between the two groups
(Experimental)
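
What makes study (c) experimental is the random allocation step, which might be sketched as below. The subject labels, the group sizes, and the helper name `randomize` are illustrative assumptions. The slides do not describe how the applicants would actually allocate subjects.

```python
import random


def randomize(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical list of 20 subjects with the disease.
subjects = [f"subject_{i}" for i in range(1, 21)]
treatment, control = randomize(subjects, seed=42)
print("treatment:", treatment)
print("control:  ", control)
```

Because chance alone decides who is treated, differences between the two groups after 6 months can be attributed to the treatment rather than to how the groups were chosen.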
