— Florence Nightingale
Statistical Terms Crossword
To behold is to look beyond the fact; to observe, to go
beyond the observation. Look at the world of people,
and you will be overwhelmed by what you see. But
select from that mass of humanity a well-chosen few,
and observe them with insight, and they will tell you
more than all the multitudes together.
— Paul D. Leedy
From his book, “Practical Research,” 1993
Choosing the Appropriate Statistic
Some factors to consider:
• Research design
• Number of groups
• Number of variables
• Level of measurement
(nominal, ordinal, interval/ratio)
Statistical Methods
Multivariate
Descriptive Statistics
Descriptive Methods
spread regression
Inferential Statistics
Inferential Methods
2 groups: t-test
— Fletcher Knebel
Randomization
• Random selection is how you draw the
sample for your study from a population.
• This is related to the external validity, or
generalizability, of your results.
Randomization
• Random assignment is how you assign your
sample to groups or treatments in your study.
• This is related to internal validity.
• Random assignment is a required feature of a
true experimental design.
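The two uses of randomization can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the population of 1,000 numbered members, the sample size of 40, and the seed are all hypothetical.

```python
import random

# Hypothetical sampling frame: 1,000 numbered members of a population.
population = list(range(1, 1001))

rng = random.Random(42)  # fixed seed (arbitrary) so the sketch is reproducible

# Random selection: draw the study sample from the population.
# This is the step that bears on external validity (generalizability).
sample = rng.sample(population, k=40)

# Random assignment: shuffle the sample, then split it into two groups.
# This is the step that bears on internal validity.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:20], shuffled[20:]

print(len(sample), len(treatment), len(control))
```

Selection decides who is in the study; assignment decides which condition each participant receives. A study can use either one without the other.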
Variables
• Nominal
• Ordinal
• Interval
• Ratio
Nominal-Level Variables
• Data are organized into categories
• Categories have no inherent order
• Categories are exclusive
• Categories are exhaustive
• Examples are sex, ethnicity, marital status
Examples of Nominal-Level Questions
Interval level:
What is your age in years? ____
Ordinal level:
What is your age group?
18 years or younger
19-44 years
45 years or older
Importance of Levels of Measurement
— Benjamin Disraeli
Measures of Central Tendency
Level of Measurement    Statistic
15,20,21,20,36,15,25,15
15,15,15,20,20,21,25,36
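As a quick check, the three measures of central tendency for the eight values above can be computed with Python's standard library. This sketch assumes only the data shown on the slide; the variable name `scores` is illustrative.

```python
from statistics import mean, median, mode

# The eight values from the slide, in their original order.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

print(sorted(scores))   # the ordered list used to find the median
print(mode(scores))     # most frequent value
print(median(scores))   # average of the two middle values (n is even)
print(mean(scores))     # sum of the values divided by n
```

For these data the mode is 15, the median is 20, and the mean is 20.875. The single large value (36) pulls the mean above the median.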
Example of Mode
[Figure: bar chart of frequency by Race of Respondent]
Example of Median
Statistics: EDUC Education level
N Valid      24
N Missing    0
Median       6.00
[Figure: box plot of Education level, N = 24]
Example of Mean
[Figure: histogram of Age of Respondent, with the mean marked]
I abhor averages. I like the individual case. A man
may have six meals one day and none the next,
making an average of three meals per day, but that
is not a good way to live.
— Louis D. Brandeis
Measures of Variation
Level of Measurement    Statistic
[Figure: bar chart of frequency by Race of Respondent (white, black, other)]
Example of Range
EDUC Education level
                             Frequency   Percent   Cumulative Percent
4 Some high school                1         4.2          4.2
5 Completed high school           6        25.0         29.2
6 Some college                    6        25.0         54.2
7 Completed college               3        12.5         66.7
8 Some graduate work              4        16.7         83.3
9 A graduate degree               4        16.7        100.0
Total                            24       100.0

Statistics: EDUC Education level
N Valid      24
N Missing    0
Median       6.00
Range        5
Minimum      4
Maximum      9
[Figure: box plot of Education level, N = 24]
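The range, median, and cumulative percentages in this output can be reproduced from the frequency counts alone. The sketch below is a minimal illustration in Python; the dictionary `freq` simply restates the frequencies from the table.

```python
from statistics import median

# Frequency of each education-level code, taken from the table.
freq = {4: 1, 5: 6, 6: 6, 7: 3, 8: 4, 9: 4}

# Expand the table back into one value per respondent.
values = [code for code, count in freq.items() for _ in range(count)]
n = len(values)

value_range = max(values) - min(values)   # maximum minus minimum
middle = median(values)                   # middle of the ordered values

# Percent and cumulative percent for each code.
running = 0
for code in sorted(freq):
    running += freq[code]
    print(code, freq[code],
          round(100 * freq[code] / n, 1),
          round(100 * running / n, 1))

print(n, middle, value_range)
```

With 24 respondents, the 12th and 13th ordered values are both 6, which gives the median of 6.00; the range is 9 - 4 = 5.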
Example of Standard Deviation
[Figure: histogram of Age of Respondent, with the mean and -1 SD and +1 SD marked]
Measures of Relationships
Level of Measurement    Statistic
— Robert Boynton
Example of Spearman Correlation
RINCOM91 Respondent's Income DEGREE RS Highest Degree
Correlations (Spearman's rho)
EDUC Highest Year of School Completed with RINCOM91 Respondent's Income
Correlation Coefficient    .363**
Sig. (2-tailed)            .000
N                          945
**. Correlation is significant at the .01 level (2-tailed).
Scatterplot of Self Esteem By Height
Relationship Between Two Variables
Statistics
                  HEIGHT     ESTEEM
N Valid           24         24
N Missing         0          0
Mean              66.7917    2.7583
Std. Deviation    7.03395    .59558

Correlations: HEIGHT with ESTEEM
Pearson Correlation    .347
Sig. (2-tailed)        .097
N                      24
[Figure: scatterplot of ESTEEM (1.5 to 3.5) by HEIGHT (50 to 90)]
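The significance level reported for this correlation follows from the usual conversion of r to a t statistic with n - 2 degrees of freedom, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below applies it to r = .347 and N = 24 using only the standard library. The helper names `t_pdf` and `two_tailed_p` are illustrative, and the p-value comes from a simple numerical integration, so it is an approximation rather than the routine SPSS uses.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p-value via Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area under the density from 0 to |t|
    return max(0.0, 1 - 2 * area)      # both tails beyond |t|

r, n = 0.347, 24                       # values from the SPSS output
df = n - 2
t = r * math.sqrt(df) / math.sqrt(1 - r * r)
p = two_tailed_p(t, df)

print(round(t, 3), round(p, 3))
```

The result should land close to the Sig. (2-tailed) value of .097 in the output: a moderate sample correlation that is not significant at the .05 level with only 24 cases.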
Example of Chi-Square Test
RACE by SEX (Count and % within SEX)
              1 Male            2 Female          Total
1 white       552    86.1%      705    82.1%      1257    83.8%
2 black        66    10.3%      102    11.9%       168    11.2%
3 other        23     3.6%       52     6.1%        75     5.0%
Total         641   100.0%      859   100.0%      1500   100.0%

Chi-Square Tests
                      Value        df    Asymp. Sig. (2-sided)
Pearson Chi-Square    5.994 (a)    2     .050
N of Valid Cases      1500
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 32.05.
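The Pearson chi-square value can be recomputed from the six cell counts. The sketch below is a minimal Python illustration: each expected count is (row total × column total) / N, and the statistic sums (observed - expected)² / expected over the cells. The closed-form p-value used here, exp(-χ²/2), holds only for 2 degrees of freedom, which happens to be the case for this 3 × 2 table.

```python
import math

# Observed counts: rows are white, black, other; columns are male, female.
observed = [[552, 705],
            [66, 102],
            [23, 52]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
min_expected = float("inf")
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        min_expected = min(min_expected, expected)
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Survival function of the chi-square distribution; valid for df == 2 only.
p = math.exp(-chi2 / 2)

print(round(chi2, 3), df, round(p, 3), round(min_expected, 2))
```

This reproduces the chi-square of 5.994 with 2 degrees of freedom, a significance of about .050, and the minimum expected count of 32.05 noted in the footnote.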
A Statistical Sampler
— H.G. Wells
Some Terminology
• Descriptive statistics
Statistics that allow the researcher to organize or
summarize data to give meaning or facilitate insight.
• Inferential statistics
Methods that allow inferences to be made from a sample
to a population.
• Hypothesis testing
A statistical test of an expected relationship between two
or more variables.
Statistical inference
Statistical inference is the process of estimating
population parameters from sample statistics.
Statistical inference may be used to ascertain whether
differences exist between groups...
[Figure: height in inches (10 to 90) for males and females]
[Figure: scatterplot with AGE (20 to 60) on the horizontal axis, points grouped by GENDER (females, males)]
Level of Measurement
• Reliability analysis
Assesses the consistency of multi-item scales
• Factor Analysis
Examines the relationships among variables and
reveals related sets of variables (constructs)
• Structural Equation Modeling
Methods for testing theories about the
relationships among variables
Hypothesis Testing Decision Chart
1 Strongly agree
2 Agree
3 Neither agree nor disagree
4 Disagree
5 Strongly disagree
[Figure: distribution of responses (1 to 5) for males and females]
Mean (males) = 2.5
Mean (females) = 3.2
We can use the SPSS statistical package to run an
independent samples t-test:
Group Statistics: EXERCISE
GENDER      N     Mean    Std. Deviation    Std. Error Mean
1 male      25    2.56    1.158             .232
2 female    25    3.24    1.012             .202
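The slides show only the group statistics, not the test table itself. As a rough check, the pooled-variance t statistic can be computed from those summary numbers. The sketch below assumes equal variances; the critical value of about 2.01 for 48 degrees of freedom at the two-tailed .05 level is a standard tabled value, not something taken from the slides.

```python
import math

# Summary statistics from the Group Statistics table.
n1, m1, s1 = 25, 2.56, 1.158   # males
n2, m2, s2 = 25, 3.24, 1.012   # females

# Standard error of each group mean: SD / sqrt(N).
se1 = s1 / math.sqrt(n1)
se2 = s2 / math.sqrt(n2)

# Pooled variance and standard error of the difference between means.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t = (m1 - m2) / se_diff

# Assumed tabled critical value for df = 48, two-tailed alpha = .05.
T_CRITICAL = 2.01
print(round(se1, 3), round(se2, 3), round(t, 3), df, abs(t) > T_CRITICAL)
```

The standard errors match the table (.232 and .202). Under these assumptions t is about -2.21, which exceeds the critical value in absolute size, so the difference between the male and female means would be judged significant at the .05 level.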
Education level
9 A graduate degree
8 Some graduate work
7 Completed college
6 Some college
5 Completed high school
4 Some high school
3 Completed grade school
2 Some grade school
1 No formal education
[Figure: box plots of Education level by Gender; Female N = 14, Male N = 10]
Because the dependent variable (education level)
is ordinal-level, we use the Mann-Whitney U Test.
[SPSS output: Ranks and Test Statistics tables]
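The Mann-Whitney U statistic is built from ranks rather than raw scores: the two groups are pooled, ranked (ties share the average rank), and each group's rank sum is converted to a U value. The sketch below illustrates the calculation in Python. The two score lists are made-up education codes for illustration only; they are NOT the study data, which the slides do not list.

```python
def midranks(values):
    """Rank the values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1     # mean of positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks

# HYPOTHETICAL ordinal scores (education-level codes), for illustration only.
group_a = [5, 6, 6, 7, 8, 9]
group_b = [4, 5, 5, 6, 7]

ranks = midranks(group_a + group_b)
rank_sum_a = sum(ranks[:len(group_a)])
rank_sum_b = sum(ranks[len(group_a):])

# U for each group: rank sum minus the smallest possible rank sum.
u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2
u_b = rank_sum_b - len(group_b) * (len(group_b) + 1) / 2
u = min(u_a, u_b)

print(rank_sum_a, rank_sum_b, u_a, u_b, u)
```

The two U values always sum to the product of the group sizes, which is a handy arithmetic check. Because only the ordering of the scores matters, the test suits ordinal variables such as education level.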
Intervention Group    O   X   O   O
Control Group         O       O   O
O = observation, X = treatment/intervention
We can use the SPSS statistical package to perform a
repeated measures ANOVA on the sample data:
[Figure: line chart of Attendance (% of days, 70% to 100%) by TIME (Month 0, Month 1, Month 2) for the Intervention and Control groups]
Factor Analysis Example
Just for the fun of it, I performed a factor analysis on the music
questions to see if we could identify a pattern of underlying
dimensions, or factors, in the data.
MUSIC GENRES
Big Band, Bluegrass, Country/Western, Blues or R & B, Broadway Musicals,
Classical, Folk, Jazz, Opera, Rap, Heavy Metal

I'm going to read you a list of some types of music. Can you tell me which of
the statements on this card comes closest to your feeling about each type of
music? (HAND CARD “B” TO RESPONDENT.)
Let's start with big band music. Do you like it very much, like it, have mixed
feelings, dislike it, dislike it very much, or is this a type of music that you
don't know much about?

RESPONSE CARD “B”
1 Like Very Much
2 Like It
3 Mixed Feelings
4 Dislike It
5 Dislike Very Much
8 DK Much About It
9 NA
Factor Analysis Results
The factor analysis revealed four factors in the music preference items.
The varieties of music were associated with the factors as shown below:
Pattern Matrix (a)
                                        Factor
                                     1       2       3       4
CLASSICL  Classical Music           .844   -.033   -.127    .054
OPERA     Opera                     .715   -.004   -.032    .086
MUSICALS  Broadway Musicals         .663    .109   -.024   -.104
FOLK      Folk Music                .502   -.064    .341   -.005
BIGBAND   Bigband Music             .459    .240    .125   -.171
JAZZ      Jazz Music                .035    .766   -.110    .029
BLUES     Blues or R & B Music     -.024    .714    .106    .057
BLUGRASS  Bluegrass Music           .070    .084    .753    .052
COUNTRY   Country Western Music    -.084   -.034    .596   -.033
HVYMETAL  Heavy Metal Music        -.012   -.016    .020    .602
RAP       Rap Music                 .030    .074   -.004    .559
Extraction Method: Principal Axis Factoring.
Rotation Method: Oblimin with Kaiser Normalization.
a. Rotation converged in 8 iterations.
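One common way to read a pattern matrix is to assign each item to the factor on which it has its largest absolute loading. The sketch below applies that rule to the loadings above. It only interprets the reported output; it does not perform the factor analysis itself, and the "largest loading" rule is a reading convention assumed here, not a step described in the slides.

```python
# Loadings on factors 1-4, copied from the pattern matrix.
loadings = {
    "CLASSICL": (0.844, -0.033, -0.127, 0.054),
    "OPERA":    (0.715, -0.004, -0.032, 0.086),
    "MUSICALS": (0.663, 0.109, -0.024, -0.104),
    "FOLK":     (0.502, -0.064, 0.341, -0.005),
    "BIGBAND":  (0.459, 0.240, 0.125, -0.171),
    "JAZZ":     (0.035, 0.766, -0.110, 0.029),
    "BLUES":    (-0.024, 0.714, 0.106, 0.057),
    "BLUGRASS": (0.070, 0.084, 0.753, 0.052),
    "COUNTRY":  (-0.084, -0.034, 0.596, -0.033),
    "HVYMETAL": (-0.012, -0.016, 0.020, 0.602),
    "RAP":      (0.030, 0.074, -0.004, 0.559),
}

# Group each item under the factor with its largest absolute loading.
groups = {}
for item, row in loadings.items():
    factor = max(range(len(row)), key=lambda k: abs(row[k])) + 1
    groups.setdefault(factor, []).append(item)

for factor in sorted(groups):
    print(f"F{factor}:", ", ".join(groups[factor]))
```

Read this way, factor 1 gathers classical, opera, musicals, folk, and big band; factor 2 jazz and blues; factor 3 bluegrass and country; factor 4 heavy metal and rap. Folk also loads moderately (.341) on factor 3, which a single-assignment rule hides.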
Factor Analysis Results
FACTORS    MEASURED VARIABLES
F1         Classical, Opera, Folk
F2         Jazz, Blues
F3         Bluegrass, Country
F4         Heavy Metal, Rap
Do not put faith in what statistics say until you
have carefully considered what they do not say.
— William W. Watt
More Cool Statistics Web Sites
— Unknown
Statistical Power Analysis
• Power (1 - β)
Relationship between power
and other parameters:
EFFECT SIZE
Type of test                    Measure of effect size    Small    Medium    Large
Independent samples t-test      |μA - μB| / σ              .2       .5        .8
Product moment correlation      rXY                        .10      .30       .50
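To see how effect size drives the sample size needed for a given power, the sketch below uses the familiar normal approximation for a two-group comparison, n per group ≈ 2 × ((z for 1 - α/2) + (z for power))² / d². The function name `n_per_group` and the defaults (two-tailed α = .05, power = .80) are assumptions for illustration; exact t-based calculations give slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate cases per group for an independent samples t-test.

    Normal approximation only; d is the standardized mean difference.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-tailed critical value
    z_power = z.inv_cdf(power)           # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

# Small, medium, and large effects from the table.
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(label, d, n_per_group(d))
```

Under this approximation a small effect needs roughly 390 cases per group, a medium effect about 63, and a large effect about 25, which is why small effects are so costly to detect.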
Testing a mean against a true alternative:
μ1 slightly larger than μ0 (“small effect”)
[Figure: two panels showing the sampling distribution of means when H0 is true (centered at μ0) and when H1 is true (centered at μ1), with areas α, β, and 1 - β marked]
[Figure: power curve, with power = .80 marked]
http://www.asu.edu/graduate/statistics/hotline/
An approximate answer to the right question
is worth a great deal more than a precise
answer to the wrong question.
http://www.public.asu.edu/~eagle/stat_sampler.ppt