
24/02/20


COLLECTING DATA
THE RESEARCH PROCESS

DATA COLLECTION 1: WHAT TO MEASURE?

Hypothesis:
Coca-cola kills sperm.

Independent Variable
The proposed cause
A predictor variable
A manipulated variable (in experiments)
Coca-cola in the hypothesis above

Dependent Variable

The proposed effect


An outcome variable
Measured not manipulated (in experiments)
Sperm in the hypothesis above

LEVELS OF MEASUREMENT

Categorical (entities are divided into distinct categories):


Binary variable: There are only two categories
e.g. dead or alive.

Ordinal variable: The same as a nominal variable (named categories with no inherent order), but the categories have a logical order
e.g. whether people got a fail, a pass, a merit or a distinction in their exam.

Continuous (entities get a distinct score):

Interval variable: Equal intervals on the variable represent equal differences in the property being measured
e.g. the difference between 6 and 8 is equivalent to the difference between 13 and 15.

Ratio variable: The same as an interval variable, but the ratios of scores on the scale must also make sense
e.g. a score of 16 on an anxiety scale means that the person is, in reality, twice as anxious as someone scoring 8.

MEASUREMENT ERROR

Measurement error

The discrepancy between the actual value we're trying to measure, and the number we use to represent that value.

Example:

You (in reality) weigh 80 kg.


You stand on your bathroom scales and they say 83 kg.
The measurement error is 3 kg.

TYPES OF VARIATION

Systematic Variation

Differences in performance created by a specific experimental manipulation.

Unsystematic Variation

Differences in performance created by unknown factors.


Age, gender, IQ, time of day, measurement error, etc.

Randomization
Minimizes unsystematic variation.
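
As a minimal sketch of what randomization can look like in practice, the snippet below shuffles a list of participants and deals them into conditions, so that unknown factors (age, IQ, time of day) are spread across groups by chance rather than by the researcher's choice. The function name `randomly_assign`, the participant labels, and the condition names `cola` and `control` are hypothetical and chosen only for illustration.

```python
import random


def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them round-robin into conditions."""
    rng = random.Random(seed)      # seeded generator so the split is reproducible
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)

    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        condition = conditions[index % len(conditions)]
        groups[condition].append(person)
    return groups


# Hypothetical participants and condition labels.
participants = [f"P{i}" for i in range(1, 11)]
groups = randomly_assign(participants, ["cola", "control"], seed=42)

for condition, members in groups.items():
    print(condition, members)
```

Dealing round-robin after the shuffle keeps group sizes equal (or within one of each other), which is a design choice of this sketch rather than a requirement of randomization.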

THE RESEARCH PROCESS

THE ONLY EQUATION YOU WILL EVER NEED!!!!

A SIMPLE STATISTICAL MODEL

In statistics we fit models to our data (i.e. we use a statistical model to represent what is happening in the real world).
The mean is a hypothetical value (i.e. it doesn't have to be a value that actually exists in the data set).
As such, the mean is a simple statistical model.

The mean is the sum of all scores divided by the number of scores.
The mean is also the value from which the (squared) scores deviate least (it has the least error).

$$\text{Mean } (\bar{X}) = \frac{\sum_{i=1}^{n} x_i}{n}$$

THE MEAN: EXAMPLE


Collect some data:
1, 3, 4, 3, 2

Add them up:

$$\sum_{i=1}^{n} x_i = 1 + 3 + 4 + 3 + 2 = 13$$

Divide by the number of scores, n:

$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{13}{5} = 2.6$$
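
The same two steps can be checked in a few lines of Python. This is only an illustrative sketch using the five scores from the example; the variable names are not from the slides.

```python
# The five scores from the worked example.
scores = [1, 3, 4, 3, 2]

total = sum(scores)   # add them up
n = len(scores)       # number of scores
mean = total / n      # divide the sum by n

print(total, n, mean)  # 13 5 2.6
```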

MEASURING THE FIT OF THE MODEL

The mean is not a perfect representation of the data.


How can we assess how well the mean represents reality?

A PERFECT FIT

[Figure: rating (out of 5) plotted for each rater.]

CALCULATING ERROR

A deviation is the difference between the mean and an actual data point.

$$\text{Deviation} = x_i - \bar{X}$$

USE THE TOTAL ERROR?

We could just take the error between the mean and the data and add them.

Score   Mean   Deviation
1       2.6    -1.6
2       2.6    -0.6
3       2.6     0.4
3       2.6     0.4
4       2.6     1.4
Total           0

$$\sum_{i=1}^{n} (x_i - \bar{X}) = 0$$
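
A short sketch makes the cancellation concrete. It recomputes each deviation for the example scores and sums them; because of floating-point rounding the total may print as a value extremely close to zero rather than exactly zero.

```python
# Example scores, sorted as in the table.
scores = [1, 2, 3, 3, 4]
mean = sum(scores) / len(scores)

# Deviation of each score from the mean.
deviations = [x - mean for x in scores]
total_error = sum(deviations)

print([round(d, 1) for d in deviations])  # [-1.6, -0.6, 0.4, 0.4, 1.4]
print(abs(total_error) < 1e-9)            # True: the deviations cancel out
```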

SUM OF SQUARED ERRORS

We could add the deviations to find out the total error.
Deviations cancel out because some are positive and others negative.
Therefore, we square each deviation.
If we add these squared deviations we get the Sum of Squared Errors (SS).

Score   Mean   Deviation   Squared Deviation
1       2.6    -1.6        2.56
2       2.6    -0.6        0.36
3       2.6     0.4        0.16
3       2.6     0.4        0.16
4       2.6     1.4        1.96
Total                      5.20

$$SS = \sum_{i=1}^{n} (x_i - \bar{X})^2 = 5.20$$
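
As a quick check of the table, the sketch below squares each deviation and adds the results. The variable names are illustrative only.

```python
scores = [1, 2, 3, 3, 4]
mean = sum(scores) / len(scores)

# Square each deviation so positive and negative errors no longer cancel.
squared_deviations = [(x - mean) ** 2 for x in scores]
ss = sum(squared_deviations)

print([round(sq, 2) for sq in squared_deviations])  # [2.56, 0.36, 0.16, 0.16, 1.96]
print(round(ss, 2))                                 # 5.2
```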

VARIANCE

The sum of squares is a good measure of overall variability, but is dependent on the number of scores.
We calculate the average variability by dividing by the number of scores (n).
This value is called the variance (s^2).
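
A minimal sketch of the calculation for the example scores follows. It divides by n, as described above, and compares the result with `statistics.pvariance` from the Python standard library. As an aside that is not taken from the slides: many textbooks divide by n - 1 when estimating variance from a sample, which is what `statistics.variance` does, so that value is shown for comparison only.

```python
import statistics

scores = [1, 2, 3, 3, 4]
n = len(scores)
mean = sum(scores) / n
ss = sum((x - mean) ** 2 for x in scores)  # sum of squared errors

variance = ss / n                          # average squared error, dividing by n
print(round(variance, 2))                  # 1.04

# Library equivalents: pvariance divides by n, variance divides by n - 1.
print(round(statistics.pvariance(scores), 2))  # 1.04
print(round(statistics.variance(scores), 2))   # 1.3
```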

STANDARD DEVIATION

The variance has one problem: it is measured in units squared.
This isn't a very meaningful metric so we take the square root value.

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n}} = \sqrt{\frac{5.20}{5}} = 1.02$$
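
The square-root step can be sketched the same way. The snippet assumes the variance is obtained by dividing the sum of squares by n, matching the 5.20 / 5 in the calculation above, and cross-checks the result against `statistics.pstdev`.

```python
import math
import statistics

scores = [1, 2, 3, 3, 4]
n = len(scores)
mean = sum(scores) / n

variance = sum((x - mean) ** 2 for x in scores) / n  # 5.20 / 5
sd = math.sqrt(variance)                             # back in the original units

print(round(sd, 2))                         # 1.02
print(round(statistics.pstdev(scores), 2))  # 1.02
```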

IMPORTANT THINGS TO REMEMBER

The Sum of Squares, Variance, and Standard Deviation represent the same thing:
The fit of the mean to the data
The variability in the data
How well the mean represents the observed data
Error

TEST STATISTICS

A statistic for which the frequency of particular values is known.


Observed values can be used to test hypotheses.

DATA COLLECTION 2: HOW TO MEASURE

Correlational research:
Observing what naturally goes on in the world without directly interfering with it.

Cross-sectional research:

This term implies that data come from people at different age points, with different people representing each age point.

Experimental research:

One or more variables are systematically manipulated to see their effect (alone or in combination) on an outcome variable.

METHODS OF DATA COLLECTION

Independent measures (between-subject)

Different entities take part in each experimental condition.

Repeated measures (within-subject)

The same entities take part in all experimental conditions.


Economical
Practice effects
Fatigue

CHOOSING THE APPROPRIATE TEST

Any given research question or hypothesis can be tested using statistical analysis.
Even though statistics appear confusing, there are very clear rules that dictate which test you can use, based on:
Number of DV/IVs
Scales of measurement for each of these variables
The levels or number of conditions within each
Whether assumptions are violated

ONE CATEGORICAL IV

Using decision trees can be useful to organise your own mind and work out which test is most appropriate.

Design: one continuous DV, one categorical IV.

Levels of IV    Participants in each level   Parametric assumptions upheld     Parametric assumptions not upheld
                                             (parametric test)                 (nonparametric test)
Two             Different                    Independent t-test                Mann-Whitney U test
Two             Same                         Dependent t-test                  Wilcoxon matched-pairs (MP) test
More than two   Different                    One-way independent ANOVA         Kruskal-Wallis test
More than two   Same                         One-way repeated-measures ANOVA   Friedman's ANOVA
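
The decision tree for one continuous DV and one categorical IV can also be written as a simple lookup. The sketch below is only an illustration of that tree: the function name `choose_test` and the string keys (`"two"`, `"more than two"`, `"different"`, `"same"`) are hypothetical labels, not part of any statistics library.

```python
# Keys: (number of IV levels, participants in each level, assumptions upheld?)
TESTS = {
    ("two", "different", True): "Independent t-test",
    ("two", "different", False): "Mann-Whitney U test",
    ("two", "same", True): "Dependent t-test",
    ("two", "same", False): "Wilcoxon matched-pairs test",
    ("more than two", "different", True): "One-way independent ANOVA",
    ("more than two", "different", False): "Kruskal-Wallis test",
    ("more than two", "same", True): "One-way repeated-measures ANOVA",
    ("more than two", "same", False): "Friedman's ANOVA",
}


def choose_test(levels, participants, assumptions_upheld):
    """Return the test for one continuous DV and one categorical IV."""
    key = (levels, participants, assumptions_upheld)
    if key not in TESTS:
        raise ValueError(f"No branch in the decision tree for {key!r}")
    return TESTS[key]


# Example: two groups of different people, parametric assumptions violated.
print(choose_test("two", "different", False))  # Mann-Whitney U test
```

A dictionary keeps every branch visible in one place, so the code can be compared against the tree line by line.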

TWO OR MORE IVS


Design: one continuous DV, two or more IVs. Each branch lists the parametric test to use when the parametric assumptions are upheld.

Measurement of IVs   Participants in each level   Parametric test
Categorical          Different                    Independent factorial ANOVA / multiple regression
Categorical          Same                         Factorial repeated-measures ANOVA
Categorical          Both                         Factorial mixed ANOVA
Continuous           -                            Multiple regression
Both                 -                            Multiple regression / ANCOVA

ONE CONTINUOUS IV
Design: one continuous DV, one continuous IV (number of levels and same/different participants do not apply).

Parametric assumptions upheld?   Test
Yes (parametric test)            Pearson correlation or regression
No (nonparametric test)          Spearman's correlation or Kendall's tau

TWO OR MORE DVS


Design: two or more continuous DVs. Each branch lists the parametric test to use when the parametric assumptions are upheld.

No. of IVs    Measurement of IVs   Parametric test
One           Categorical          MANOVA
Two or more   Categorical          Factorial MANOVA
Two or more   Both                 MANCOVA

Note: designs using categorical DVs are not included in this decision tree. See Andy Field p.822 for brief review

VALIDITY

Content validity
Evidence that the content of a test corresponds to the content of the construct it was designed to cover

Ecological validity

Evidence that the results of a study, experiment or test can be applied, and allow inferences, to real-world conditions.

RELIABILITY

Reliability
The ability of the measure to produce the same results under the same conditions.

Test-Retest Reliability

The ability of a measure to produce consistent results when the same entities are tested at two different points in time.
