
What is correlation

• A (Pearson) correlation is a number between -1 and +1 that indicates
to what extent two metric variables are linearly related. It's best
understood by looking at some scatterplots.
• A linear relationship is one that, when plotted on a graph, traces a
straight line. In a linear relationship, any given change in the
independent variable always produces a proportional change in
the dependent variable.
• A correlation of -1 indicates a perfect descending linear
relation: higher scores on one variable are associated with lower
scores on the other variable.
• A correlation of 0 means there's no linear relation between two
variables whatsoever. However, there may still be a (strong)
non-linear relation.
• A correlation of +1 indicates a perfect ascending linear relation: higher
scores on one variable are associated with higher scores on the other
variable.
• Direction of the relationship
• -1 : perfectly negative linear relationship
• 0 : no linear relationship
• +1 : perfectly positive linear relationship
• Strength
• .1 < | r | < .3 … small / weak correlation
• .3 < | r | < .5 … medium / moderate correlation
• .5 < | r | ……… large / strong correlation
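The definitions above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function names `pearson_r` and `strength_label`, the toy data, and the "negligible" label for | r | below .1 are all assumptions. The last example shows a relation that is perfectly predictable but non-linear, so its correlation is 0.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength_label(r):
    """Map |r| onto the rule-of-thumb labels listed above."""
    size = abs(r)
    if size > 0.5:
        return "large / strong"
    if size > 0.3:
        return "medium / moderate"
    if size > 0.1:
        return "small / weak"
    # Assumption: the list above does not name this range.
    return "negligible"

x = [-2, -1, 0, 1, 2]
r_up = pearson_r(x, [2 * v + 1 for v in x])     # points on an ascending line
r_down = pearson_r(x, [-3 * v + 4 for v in x])  # points on a descending line
r_curve = pearson_r(x, [v ** 2 for v in x])     # U-shaped, non-linear relation

print(r_up, r_down, r_curve)
print(strength_label(r_down))
```

The thresholds are strict inequalities, as in the list, so a value of exactly .3 or .5 falls into the lower category in this sketch.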
Bivariate Pearson Correlation
• The bivariate Pearson Correlation is commonly used to measure the
following:
• Correlations among pairs of variables
• Correlations within and between sets of variables

• The bivariate Pearson correlation indicates the following:
• Whether a statistically significant linear relationship exists between two
continuous variables
• The strength of a linear relationship (i.e., how close the relationship is to
being a perfectly straight line)
• The direction of a linear relationship (increasing or decreasing)
Note this
• The bivariate Pearson correlation cannot address non-linear
relationships or relationships among categorical variables. If you wish
to understand relationships that involve categorical variables and/or
non-linear relationships, you will need to choose another measure of
association.
• The bivariate Pearson Correlation only reveals associations among
continuous variables. The bivariate Pearson Correlation does not
provide any inferences about causation, no matter how large the
correlation coefficient is.
• Data often contain just a sample from a (much) larger population: I
surveyed 100 customers (sample) but I'm really interested in all my
100,000 customers (population). Sample outcomes typically differ
somewhat from population outcomes. So finding a nonzero
correlation in my sample does not prove that two variables are
correlated in my entire population; if the population correlation is
really zero, I may easily find a small correlation in my sample.
However, finding a strong correlation in that case is very unlikely, so
a strong sample correlation suggests that my population correlation
isn't zero after all.
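A small simulation can make this point concrete. The sketch below is an illustration under stated assumptions, not something from the original slides: it draws many samples of 100 cases from two independent (hence uncorrelated) normal variables and counts how often the sample correlation looks "small" or "strong" by chance. The cut-offs .1 and .5, the number of repetitions, and the seed are arbitrary choices.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)       # fixed seed so the run is repeatable
n_customers = 100    # sample size, as in the example above
n_samples = 2000     # number of simulated samples (arbitrary)

sample_rs = []
for _ in range(n_samples):
    x = [random.gauss(0, 1) for _ in range(n_customers)]
    y = [random.gauss(0, 1) for _ in range(n_customers)]  # independent of x
    sample_rs.append(pearson_r(x, y))

# Share of samples whose correlation exceeds each cut-off in absolute value,
# even though the population correlation is exactly zero.
share_small = sum(abs(r) > 0.1 for r in sample_rs) / n_samples
share_strong = sum(abs(r) > 0.5 for r in sample_rs) / n_samples

print(f"|r| > 0.1 in {share_small:.1%} of samples")
print(f"|r| > 0.5 in {share_strong:.1%} of samples")
```

The exact shares depend on the seed, but small sample correlations should turn up often and strong ones essentially never.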
Hypothesis
• Two-tailed significance test:
• H0: ρ = 0 ("the population correlation coefficient is 0; there is no association")
H1: ρ ≠ 0 ("the population correlation coefficient is not 0; a nonzero correlation
could exist")
• One-tailed significance test:
• H0: ρ = 0 ("the population correlation coefficient is 0; there is no association")
H1: ρ > 0 ("the population correlation coefficient is greater than 0; a positive
correlation could exist")
OR
H1: ρ < 0 ("the population correlation coefficient is less than 0; a negative
correlation could exist")
• where ρ is the population correlation coefficient.
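For reference, the significance test for these hypotheses is usually based on a t statistic computed from the sample correlation. The formula below is the standard textbook form rather than something stated in the slides; r denotes the sample correlation and N the number of cases, and it assumes the usual conditions for the test (such as approximately bivariate normal data).

```latex
t = \frac{r\,\sqrt{N - 2}}{\sqrt{1 - r^{2}}},
\qquad
\text{df} = N - 2
```

Under H0 this statistic follows a t distribution with N - 2 degrees of freedom; the two-tailed test uses both tails of that distribution, the one-tailed test only the tail named in H1.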
Run the test
• Variables:
• Correlation Coefficients:
• Test of Significance: Click Two-tailed or One-tailed, depending on your
desired significance test. SPSS uses a two-tailed test by default.
• Flag significant correlations:
• Options: Clicking Options will open a window where you can specify
which Statistics to include (i.e., Means and standard deviations, Cross-
product deviations and covariances) and how to address Missing
Values (i.e., Exclude cases pairwise or Exclude cases listwise).
• Note that the pairwise/listwise setting does not affect your computations if
you are only entering two variables, but can make a very large difference if
you are entering three or more variables into the correlation procedure.
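The difference between the two missing-value settings can be illustrated by counting how many cases each one keeps. The sketch below uses made-up data and hypothetical names (`data`, `complete_rows`, `n_pairwise`, `n_listwise`); it mimics the idea of the two options, not SPSS's actual implementation.

```python
# Hypothetical scores for five cases on three variables; None marks a missing value.
data = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, None, 4.0, 5.0, 7.0],
    "c": [5.0, 3.0, None, 2.0, 1.0],
}

def complete_rows(columns):
    """Indices of cases with no missing value in any of the given columns."""
    n_cases = len(next(iter(columns.values())))
    return [i for i in range(n_cases)
            if all(col[i] is not None for col in columns.values())]

def n_pairwise(data, v1, v2):
    """Cases used for one correlation when excluding pairwise:
    only the two variables in that pair have to be present."""
    return len(complete_rows({v1: data[v1], v2: data[v2]}))

def n_listwise(data):
    """Cases used for every correlation when excluding listwise:
    all variables entered in the procedure have to be present."""
    return len(complete_rows(data))

print("pairwise N for a-b:", n_pairwise(data, "a", "b"))
print("pairwise N for a-c:", n_pairwise(data, "a", "c"))
print("pairwise N for b-c:", n_pairwise(data, "b", "c"))
print("listwise N for all pairs:", n_listwise(data))
```

With only two variables entered, both settings keep the same cases, which is why the choice only matters from three variables onward.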
Run the test
• Analyze > Correlate > Bivariate, move the variables into the Variables box, then click OK
• As a rule of thumb, a correlation is statistically significant if its “Sig. (2-
tailed)” < 0.05.
• The strongest correlation is between depression and overall well being:
r = -0.801. It's based on N = 117 children and its 2-tailed significance is
reported as p = 0.000. SPSS rounds to three decimals, so this means p < 0.001:
there's a less than 0.001 probability of finding this sample correlation, or a
more extreme one, if the actual population correlation is zero.
• IQ does not correlate significantly with any other variable. Its strongest
correlation is 0.152 with anxiety, but p = 0.11, so it's not statistically
significantly different from zero. That is, there's a 0.11 probability of
finding a correlation at least this far from zero if the population
correlation is zero. This correlation is too small to reject the null hypothesis.
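As a rough check on the first result, the test statistic can be recomputed by hand from r and N using the usual formula t = r * sqrt(N - 2) / sqrt(1 - r^2). The sketch below is an illustration, not SPSS output. The p-value line uses a normal approximation as a stand-in for the t distribution with N - 2 degrees of freedom; that is an assumption, made only to show the order of magnitude.

```python
import math

r = -0.801  # depression vs. overall well being, as reported above
n = 117     # number of children

# t statistic for H0: population correlation = 0, with n - 2 degrees of freedom.
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Assumption: normal approximation to the two-tailed p-value,
# 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)). A t distribution with 115 df
# would give a somewhat different, but still tiny, value.
p_approx = math.erfc(abs(t) / math.sqrt(2))

print(f"t = {t:.2f} with {n - 2} degrees of freedom")
print(f"approximate two-tailed p = {p_approx:.3g}")
```

The p-value is far below 0.001, which is why SPSS displays it as 0.000; it is not literally zero. The same check cannot be repeated for the IQ correlation here, because its N is not given.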
Decision
• Well being and Depression have a statistically significant linear
relationship (r = -0.801, p < .001).
• The direction of the relationship is negative (i.e., Well being and
Depression are negatively correlated), meaning that as one variable
increases, the other tends to decrease (i.e., greater well being is
associated with lower Depression).
• The magnitude, or strength, of the association is strong
(.5 < | r |: large / strong correlation).
