Research process data collection

24/02/20
COLLECTING DATA
THE RESEARCH PROCESS
DATA COLLECTION 1: WHAT TO MEASURE?
Hypothesis:
Coca-cola kills sperm.
Independent Variable
The proposed cause
A predictor variable
A manipulated variable (in experiments)
Coca-cola in the hypothesis above
Dependent Variable
The proposed effect

An outcome variable
Measured not manipulated (in experiments)
Sperm in the hypothesis above
LEVELS OF MEASUREMENT
Categorical (entities are divided into distinct categories):

Binary variable: There are only two categories
e.g. dead or alive.
Nominal variable: There are more than two categories, with no inherent order
e.g. eye colour.
Ordinal variable: The same as a nominal variable but the categories have a logical order
e.g. whether people got a fail, a pass, a merit or a distinction in their exam.
Continuous (entities get a distinct score):
Interval variable: Equal intervals on the variable represent equal differences in the property being measured
e.g. the difference between 6 and 8 is equivalent to the difference between 13 and 15.
Ratio variable:The same as an interval variable, but the ratios of scores on the scale must also make sense
e.g. a score of 16 on an anxiety scale means that the person is, in reality, twice as anxious as someone scoring 8.
MEASUREMENT ERROR
Measurement error
The discrepancy between the actual value we're trying to measure, and the number we use to represent that value.
Example:
You (in reality) weigh 80 kg.

You stand on your bathroom scales and they say 83 kg.
The measurement error is 3 kg.
TYPES OF VARIATION
Systematic Variation
Differences in performance created by a specific experimental manipulation.
Unsystematic Variation
Differences in performance created by unknown factors.
Age, gender, IQ, time of day, measurement error, etc.
Randomization
Minimizes unsystematic variation.
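As a rough illustration of randomization, the sketch below shuffles participants before dealing them into conditions, so that unknown factors (age, IQ, time of day) are spread across groups by chance rather than by the researcher's choices. The function name `randomly_assign`, the condition labels and the participant IDs are all hypothetical.

```python
import random


def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to the conditions in turn."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        # Dealing in turn keeps the group sizes as equal as possible.
        groups[conditions[index % len(conditions)]].append(person)
    return groups


# Hypothetical example: 20 participant IDs, two conditions.
groups = randomly_assign(range(1, 21), ["cola", "control"], seed=42)
for condition, members in groups.items():
    print(condition, sorted(members))
```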
THE ONLY EQUATION YOU WILL EVER NEED!!!!
A SIMPLE STATISTICAL MODEL
In statistics we fit models to our data (i.e. we use a statistical model to represent what is happening in the real world).
The mean is a hypothetical value (i.e. it doesn't have to be a value that actually exists in the data set).
As such, the mean is a simple statistical model.
The mean is the sum of all scores divided by the number of scores.
The mean is also the value from which the (squared) scores deviate least (it has the least error).
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$$
THE MEAN: EXAMPLE

Collect some data:
1, 3, 4, 3, 2
Add them up:
$$\sum_{i=1}^{n} x_i = 1 + 3 + 4 + 3 + 2 = 13$$
Divide by the number of scores, n:

$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{13}{5} = 2.6$$
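The same calculation can be sketched in a few lines of Python. The helper `sum_of_squared_deviations` is a hypothetical name, used only to check the claim that the scores deviate least (in squared terms) from the mean; the two alternative guesses, 2.5 and 3.0, are arbitrary.

```python
scores = [1, 3, 4, 3, 2]

total = sum(scores)   # add the scores up
n = len(scores)       # number of scores
mean = total / n      # divide the total by n


def sum_of_squared_deviations(data, candidate):
    """Total squared distance between each score and a candidate value."""
    return sum((x - candidate) ** 2 for x in data)


# Compare the mean with two other guesses at a "typical" score.
error_at_mean = sum_of_squared_deviations(scores, mean)
error_at_2_5 = sum_of_squared_deviations(scores, 2.5)
error_at_3 = sum_of_squared_deviations(scores, 3.0)

print(total, n, mean)
print(round(error_at_mean, 2), round(error_at_2_5, 2), round(error_at_3, 2))
```

The squared error is smallest when the candidate value is the mean itself.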
MEASURING THE FIT OF THE MODEL
The mean is not a perfect representation of the data.

How can we assess how well the mean represents reality?
A PERFECT FIT
[Figure: rating (out of 5) plotted against rater]
CALCULATING ERROR
A deviation is the difference between the mean and an actual data point.
$$\text{Deviation} = x_i - \bar{X}$$
USE THE TOTAL ERROR?
We could just take the error between the mean and the data and add them.

Score   Mean   Deviation
1       2.6    -1.6
2       2.6    -0.6
3       2.6     0.4
3       2.6     0.4
4       2.6     1.4
Total           0

$$\sum (X - \bar{X}) = 0$$
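The zero total is not a quirk of these five scores. A short derivation, using only the definition of the mean (the sum of the scores equals n times the mean), shows that it holds for any data set:

```latex
\sum_{i=1}^{n} (x_i - \bar{X})
  = \sum_{i=1}^{n} x_i - n\bar{X}
  = n\bar{X} - n\bar{X}
  = 0
```

This is why the raw total of the deviations cannot be used as a measure of error.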
SUM OF SQUARED ERRORS
We could add the deviations to find out the total error.
Deviations cancel out because some are positive and others negative.
Therefore, we square each deviation.
If we add these squared deviations we get the Sum of Squared Errors (SS).
Score   Mean   Deviation   Squared Deviation
1       2.6    -1.6        2.56
2       2.6    -0.6        0.36
3       2.6     0.4        0.16
3       2.6     0.4        0.16
4       2.6     1.4        1.96
Total                      5.20

$$SS = \sum (X - \bar{X})^2 = 5.20$$
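The table can be reproduced with a short Python sketch; the variable names are illustrative only.

```python
scores = [1, 2, 3, 3, 4]
mean = sum(scores) / len(scores)

# One row per score: the deviation from the mean and its square.
deviations = [x - mean for x in scores]
squared_deviations = [d ** 2 for d in deviations]

# The raw deviations cancel out; the squared ones do not.
total_error = sum(deviations)
ss = sum(squared_deviations)

for x, d, sq in zip(scores, deviations, squared_deviations):
    print(x, mean, round(d, 2), round(sq, 2))
print("Total error:", round(total_error, 2), "SS:", round(ss, 2))
```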
VARIANCE
The sum of squares is a good measure of overall variability, but is dependent on the number of scores.
We calculate the average variability by dividing by the number of scores (n).
This value is called the variance (s²).
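For the five scores in the running example, where SS = 5.20 and n = 5, the calculation works out as follows (dividing by n, as described here):

```latex
s^2 = \frac{SS}{n}
    = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n}
    = \frac{5.20}{5}
    = 1.04
```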
STANDARD DEVIATION
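The variance has one problem: it is measured in units squared.
This isn't a very meaningful metric, so we take the square root of the variance to get the standard deviation (s).
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n}} = \sqrt{\frac{5.20}{5}} = 1.02$$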
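A quick check of these figures is possible with Python's standard `statistics` module. The sketch assumes the divide-by-n definition used here, which corresponds to `pvariance` and `pstdev`; the module's `variance` and `stdev` functions divide by n - 1 instead and would give larger values.

```python
import statistics

scores = [1, 3, 4, 3, 2]

# Population variance: sum of squared deviations divided by n.
variance = statistics.pvariance(scores)

# Population standard deviation: the square root of that variance.
standard_deviation = statistics.pstdev(scores)

print(round(variance, 2), round(standard_deviation, 2))
```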
IMPORTANT THINGS TO REMEMBER
The Sum of Squares, Variance, and Standard Deviation represent the same thing:
The Fit of the mean to the data
The variability in the data
How well the mean represents the observed data
Error
TEST STATISTICS
A Statistic for which the frequency of particular values is known.

Observed values can be used to test hypotheses.
DATA COLLECTION 2: HOW TO MEASURE
Correlational research:
Observing what naturally goes on in the world without directly interfering with it.
Cross-sectional research:
This term implies that data come from people at different age points, with different people representing each age point.
Experimental research:
One or more variables are systematically manipulated to see their effect (alone or in combination) on an outcome variable.
METHODS OF DATA COLLECTION
Between-groups (between-subject)
Different entities take part in each experimental condition.
Repeated measures (within-subject)
The same entities take part in all experimental conditions.
Economical
Practice effects
Fatigue
CHOOSING THE APPROPRIATE TEST
Any given research question or hypothesis can be tested using statistical analysis.
Even though statistics appear confusing, there are very clear rules that dictate which test you can use. These depend on:
Number of DVs/IVs
Scales of measurement for each of these variables
The levels or number of conditions within each
Whether assumptions are violated
ONE CATEGORICAL IV
Using decision trees can be useful to organise your thinking and work out which test is most appropriate.

One DV (continuous), one IV (categorical):

Levels of IV    Participants   Assumptions upheld (parametric test)   Assumptions violated (nonparametric test)
Two             Different      Independent t-test                     Mann-Whitney U test
Two             Same           Dependent t-test                       Wilcoxon matched-pairs test
More than two   Different      One-way independent ANOVA              Kruskal-Wallis test
More than two   Same           One-way repeated-measures ANOVA        Friedman's ANOVA
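The decision tree for one continuous DV and one categorical IV can be sketched as a simple lookup. This is only an illustration of the branching logic; the function name `choose_test` and its parameter names are hypothetical.

```python
# Keys: (number of levels, same participants in each condition?,
#        parametric assumptions upheld?)
TESTS = {
    ("two", False, True): "Independent t-test",
    ("two", False, False): "Mann-Whitney U test",
    ("two", True, True): "Dependent t-test",
    ("two", True, False): "Wilcoxon matched-pairs test",
    ("more", False, True): "One-way independent ANOVA",
    ("more", False, False): "Kruskal-Wallis test",
    ("more", True, True): "One-way repeated-measures ANOVA",
    ("more", True, False): "Friedman's ANOVA",
}


def choose_test(levels, same_participants, assumptions_upheld):
    """Suggest a test for one continuous DV and one categorical IV."""
    if levels < 2:
        raise ValueError("a categorical IV needs at least two levels")
    size = "two" if levels == 2 else "more"
    return TESTS[(size, same_participants, assumptions_upheld)]


# Two separate groups, assumptions met.
print(choose_test(2, same_participants=False, assumptions_upheld=True))
# Three conditions, same people in each, assumptions violated.
print(choose_test(3, same_participants=True, assumptions_upheld=False))
```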
TWO OR MORE IVS

One DV (continuous), two or more IVs; parametric assumptions upheld:

Measurement of IVs   Participants in each condition   Parametric test
Categorical          Different                        Independent factorial ANOVA / multiple regression
Categorical          Same                             Factorial repeated-measures ANOVA
Categorical          Both                             Factorial mixed ANOVA
Continuous           -                                Multiple regression
Both                 -                                Multiple regression / ANCOVA
ONE CONTINUOUS IV
One DV (continuous), one IV (continuous):

Parametric assumptions upheld?   Test
Yes (parametric test)            Pearson correlation or regression
No (nonparametric test)          Spearman's correlation or Kendall's tau
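To make the difference between the two branches concrete, here is a minimal sketch of Pearson's r and Spearman's correlation in plain Python. Spearman's coefficient is computed as Pearson's r on the ranks of the scores. The function names and the example data are hypothetical, and no significance test is included.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: cross-products scaled by both sums of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson's r applied to the ranked scores."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Because the relationship is perfectly monotonic but curved, the rank-based coefficient is 1 while Pearson's r falls slightly short of 1.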
TWO OR MORE DVS

Two or more DVs (continuous); parametric assumptions upheld:

No. of IVs    Measurement of IVs   Parametric test
One           Categorical          MANOVA
Two or more   Categorical          Factorial MANOVA
Two or more   Both                 MANCOVA
Note: designs using categorical DVs are not included in this decision tree. See Andy Field p.822 for brief review
VALIDITY
Content validity
Evidence that the content of a test corresponds to the content of the construct it was designed to cover
Ecological validity
Evidence that the results of a study, experiment or test can be applied, and allow inferences, to real-world conditions.
RELIABILITY
Reliability
The ability of the measure to produce the same results under the same conditions.
Test-Retest Reliability
The ability of a measure to produce consistent results when the same entities are tested at two different points in time.
