INTRODUCTION TO
STATISTICS
What is Statistics?
Statistics
Statistics is the science of conducting studies to collect, organize,
summarize, analyze, and draw conclusions from data.
Types of statistics:
i) Descriptive statistics
Describe a phenomenon
Consists of the collection, organization, summarization, and
presentation of the data.
ii) Inferential statistics
Consists of generalizing from samples to populations, performing
estimations and hypothesis tests, determining relationships among
variables, and making predictions.
Definitions
Population
The collection of all outcomes,
responses, measurements, or
counts that are of interest.
Sample
The collection of data from a subset of
the population.
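The distinction between a population and a sample can be sketched in a few lines of Python. This is only an illustration: the exam scores, the seed, and the sizes (1,000 and 50) are made-up assumptions, not data from these notes.

```python
import random

# Hypothetical population: exam scores for all 1,000 students in a school.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [rng.randint(40, 100) for _ in range(1000)]

# A sample is data collected from a subset of the population.
sample = rng.sample(population, k=50)

# The sample mean is used as an estimate of the population mean.
population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

print(len(population), len(sample))
print(round(population_mean, 1), round(sample_mean, 1))
```

The two means are usually close but not identical, which is why conclusions drawn from a sample are estimates rather than exact statements about the population.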
What is Data?
Data
The responses, counts, measurements, or
observations that have been collected.
Qualitative Data
Qualitative Data:
Variables that can be placed into distinct
categories, according to some characteristic
or attribute.
Non-numerical measurements.
Examples:
gender (Male or Female)
Geographic locations
Eye color
etc
Quantitative Data
Quantitative data:
Numerical measurements that can be ordered
or ranked.
Examples:
Age
Weights
Temperature
Heights
Quantitative Data:
Discrete vs. Continuous
Discrete data:
A finite or countable number of possible data values, such as 0, 1, 2,
3, 4.
Ex: Number of classes a student is taking
Continuous data:
infinite number of possible data values on a
continuous scale.
Often include fractions and decimals.
Ex: Weight of a baby
Measuring Variables
To establish relationships between variables,
researchers must observe the variables and
record their observations. This requires that the
variables be measured.
The process of measuring a variable requires a
set of categories called a scale of
measurement (or measurement scales) and a
process that classifies each individual into one
category.
1. Nominal scale
Classifies data into mutually exclusive (non-overlapping), exhaustive categories in which no order or
ranking can be imposed on the data.
Example:
1. Classified according to subject taught:
History, Mathematics, English, Psychology
2. Classifying survey subjects as male or female
3. Marital status: Single, Married, Divorced, Separated
2. Ordinal scale
Classifies data into categories that can be ordered or
ranked.
Ordinal measurements tell you the direction of
difference between two individuals.
Example:
1. Student evaluation (from 1 to 5)
2. Guest speaker (superior, average or poor)
3. Sample size (small, medium, or large)
3. Interval scale
Classifies data into categories that can be ranked.
An ordered series of equal-sized categories.
Interval measurements identify the direction and magnitude
of a difference.
There is no meaningful zero.
Example:
1. Temperature, since there is a meaningful difference of 1°F
between each unit, such as 72°F and 73°F (hotter). 0°F
does not mean no heat at all.
2. IQ, since there is a meaningful difference of 1 point
between an IQ of 109 and an IQ of 110. IQ tests do not
measure people who have no intelligence.
4. Ratio scale
An interval scale with a true zero: a value of zero indicates none of the
variable.
Ratio measurements identify the direction and magnitude of
differences and allow ratio comparisons of measurements.
Example:
1. Height
2. Weight
3. Number of phone calls received
4. Salary
5. Age
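The practical difference between interval and ratio scales can be checked numerically. The sketch below uses illustrative values chosen here (40°F and 80°F, 40 kg and 80 kg) and an approximate kilogram-to-pound factor. It suggests that a ratio of two temperatures depends on the unit used, while a ratio of two weights does not.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

KG_TO_LB = 2.20462  # approximate conversion factor

# Interval scale (temperature): zero is arbitrary, so the ratio
# of two readings changes when the unit changes.
temp_ratio_f = 80.0 / 40.0
temp_ratio_c = fahrenheit_to_celsius(80.0) / fahrenheit_to_celsius(40.0)

# Ratio scale (weight): zero means "no weight", so the ratio
# of two readings is the same in every unit.
weight_ratio_kg = 80.0 / 40.0
weight_ratio_lb = (80.0 * KG_TO_LB) / (40.0 * KG_TO_LB)

print(temp_ratio_f, round(temp_ratio_c, 2))        # 2.0 6.0
print(weight_ratio_kg, round(weight_ratio_lb, 2))  # 2.0 2.0
```

Saying that 80°F is "twice as hot" as 40°F is therefore not meaningful, whereas saying that 80 kg is twice as heavy as 40 kg is.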
Observational study
Survey
Experiment
Simulation
Experiment
A treatment is applied to part of a population
and responses are observed.
Simulation
Uses a mathematical or physical model to
reproduce the conditions of a situation or
process. Often involves the use of computers.
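As a small illustration of a computer simulation, the sketch below reproduces the process of rolling two fair dice many times and estimates how often the sum is 7. The dice scenario, the function name `estimate_sum_of_seven`, the seed, and the number of trials are all assumptions chosen for this example.

```python
import random

def estimate_sum_of_seven(trials, rng):
    """Simulate rolling two fair dice `trials` times and return
    the fraction of rolls whose sum is 7."""
    hits = 0
    for _ in range(trials):
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    return hits / trials

# Fixed seed so the run is repeatable.
estimate = estimate_sum_of_seven(100_000, random.Random(42))
print(estimate)  # should be close to the exact probability 6/36
```

The exact probability is 6/36, about 0.167, so the simulated estimate can be compared against a known answer. In practice, simulation is most useful when no such exact answer is available.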
Sampling Techniques
Non-Random Sampling
Some members of the population have no
chance of being picked. Often leads to
biased samples.
Convenience Samples
Data that is readily available and easy to get is collected.
Often biased in some way, such as self-selection bias, which occurs
when people choose to participate because they have an interest in
the issue in question.
Systematic Sampling
Choose a starting value or starting point at
random. Then, choose every kth member of
the population.
Example: Select every 3rd patient who enters
the hospital.
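A minimal sketch of systematic sampling in Python, assuming a hypothetical list of 30 patients in order of arrival. The function name `systematic_sample` and the patient labels are made up for illustration.

```python
import random

def systematic_sample(population, k, rng):
    """Pick a random starting point among the first k members,
    then take every kth member after it."""
    start = rng.randrange(k)  # random index from 0 to k - 1
    return population[start::k]

# Hypothetical patients, listed in the order they entered the hospital.
patients = [f"patient_{i}" for i in range(1, 31)]

sample = systematic_sample(patients, k=3, rng=random.Random(1))
print(sample)
```

With 30 patients and k = 3, the sample always contains 10 patients. Only the starting point is random.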
Stratified Sampling
Divide a population into at least 2 different
subgroups (strata) that share the same
characteristics (age, gender, ethnicity,
income, etc) and select a random sample
from each group.
Advantages:
Unbiased
Good random representative sample
Obtain more information
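Stratified sampling can be sketched as follows. The records, the age groups used as strata, and the function name `stratified_sample` are hypothetical. The sketch draws the same number of members from each stratum, which is one of several possible allocation choices.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, n_per_stratum, rng):
    """Group records into strata by `key`, then draw a simple
    random sample of n_per_stratum records from each stratum."""
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for name in sorted(strata):  # fixed order keeps runs repeatable
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

# Hypothetical population of 60 people in three age groups.
groups = ["18-29", "30-49", "50+"]
people = [{"id": i, "age_group": groups[i % 3]} for i in range(60)]

sample = stratified_sample(
    people, key=lambda p: p["age_group"], n_per_stratum=5,
    rng=random.Random(7),
)
print(len(sample))  # 15
```

Every stratum is guaranteed to appear in the sample, which a single simple random sample of the whole population would not guarantee.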
Cluster Sampling
Divide the population into many like
subgroups (clusters); randomly select some
of those clusters, and then select all of the
members of those clusters to be in the
sample.
Advantage: useful for geographically separated
populations
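Cluster sampling can be sketched in the same style. The villages, the households, and the function name `cluster_sample` are hypothetical. Note the contrast with stratified sampling: here whole clusters are chosen at random, and every member of a chosen cluster enters the sample.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose n_clusters clusters, then include all
    members of each chosen cluster in the sample."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(clusters[name])
    return chosen, sample

# Hypothetical population: 6 villages with 5 households each.
clusters = {
    f"village_{c}": [f"village_{c}_household_{i}" for i in range(1, 6)]
    for c in "ABCDEF"
}

chosen, sample = cluster_sample(clusters, n_clusters=2, rng=random.Random(3))
print(chosen)
print(len(sample))  # 10
```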
Exercises
Classify each of the following as nominal-level,
ordinal-level, interval-level, or ratio-level
measurement.
1. Pages in the city of Malaysia telephone book.
2. Rankings of tennis players.
3. Weights of cupboards.
Exercises
Classify each variable as discrete or continuous.
1. Number of doughnuts sold each day by Doughnut
Heaven.
2. Weights of cats.
Exercises
Classify each variable as qualitative or quantitative.
1. Number of bicycles sold in 1 month by a store.
2. Colors of balloons at a party.
3. Time it takes to drive to school.
4. Capacity in cubic feet of six truck beds.