• A (Pearson) correlation is a number between -1 and +1 that indicates to what extent 2 metric variables are linearly related. It's best understood by looking at some scatterplots.
• Linear relationship: a relationship of direct proportionality that, when plotted on a graph, traces a straight line. In linear relationships, any given change in an independent variable will always produce a corresponding change in the dependent variable.
• A correlation of -1 indicates a perfect descending linear relation: higher scores on one variable imply lower scores on the other variable.
• A correlation of 0 means there's no linear relation between 2 variables whatsoever. However, there may be a (strong) non-linear relation nevertheless.
• A correlation of 1 indicates a perfect ascending linear relation: higher scores on one variable are associated with higher scores on the other variable.

Direction of the relationship
• -1 : perfectly negative linear relationship
• 0 : no linear relationship
• +1 : perfectly positive linear relationship

Strength
• .1 < | r | < .3 … small / weak correlation
• .3 < | r | < .5 … medium / moderate correlation
• .5 < | r | ……… large / strong correlation

Bivariate Pearson Correlation
• The bivariate Pearson Correlation is commonly used to measure the following:
  • Correlations among pairs of variables
  • Correlations within and between sets of variables
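The definition of r and the strength bands listed earlier can be sketched in a few lines of Python. This is only an illustration: the function names `pearson_r` and `strength_label` are made up for this sketch, and because the slides use strict inequalities, the handling of exact boundary values (.1, .3, .5) and the "negligible" label for | r | below .1 are assumptions.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]          # deviations from the mean of x
    dy = [b - mean_y for b in y]          # deviations from the mean of y
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

def strength_label(r):
    """Map |r| to the strength bands from the slides.

    Assumption: boundary values go to the higher band, and anything
    below .1 is called "negligible" (the slides leave both unspecified).
    """
    size = abs(r)
    if size >= 0.5:
        return "large / strong"
    if size >= 0.3:
        return "medium / moderate"
    if size >= 0.1:
        return "small / weak"
    return "negligible"

x = [1, 2, 3, 4, 5]
ascending = [2 * v + 1 for v in x]        # perfect ascending line
descending = [10 - 3 * v for v in x]      # perfect descending line
u_shaped = [(v - 3) ** 2 for v in x]      # strong but non-linear relation

for name, y in [("ascending", ascending),
                ("descending", descending),
                ("u-shaped", u_shaped)]:
    r = pearson_r(x, y)
    print(name, round(r, 3), strength_label(r))
```

The two straight lines give r = 1 and r = -1, while the U-shaped data give r = 0 even though y is completely determined by x, which is the "no linear relation, but possibly a strong non-linear one" case.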
• The bivariate Pearson correlation indicates the following:
  • Whether a statistically significant linear relationship exists between two continuous variables
  • The strength of a linear relationship (i.e., how close the relationship is to being a perfectly straight line)
  • The direction of a linear relationship (increasing or decreasing)

Note this
• The bivariate Pearson Correlation cannot address non-linear relationships or relationships among categorical variables. If you wish to understand relationships that involve categorical variables and/or non-linear relationships, you will need to choose another measure of association.
• The bivariate Pearson Correlation only reveals associations among continuous variables. It does not provide any inferences about causation, no matter how large the correlation coefficient is.
• Data often contain just a sample from a (much) larger population: I surveyed 100 customers (sample) but I'm really interested in all my 100,000 customers (population). Sample outcomes typically differ somewhat from population outcomes. So finding a nonzero correlation in my sample does not prove that 2 variables are correlated in my entire population; if the population correlation is really zero, I may easily find a small correlation in my sample. However, finding a strong correlation in this case is very unlikely and suggests that my population correlation wasn't zero after all.

Hypothesis
• Two-tailed significance test:
  • H0: ρ = 0 ("the population correlation coefficient is 0; there is no linear association")
  • H1: ρ ≠ 0 ("the population correlation coefficient is not 0; a nonzero correlation could exist")
• One-tailed significance test:
  • H0: ρ = 0 ("the population correlation coefficient is 0; there is no linear association")
  • H1: ρ > 0 ("the population correlation coefficient is greater than 0; a positive correlation could exist") OR H1: ρ < 0 ("the population correlation coefficient is less than 0; a negative correlation could exist")
• where ρ is the population correlation coefficient.
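The sampling argument behind the null hypothesis ρ = 0 can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not a replacement for the significance test: it draws two independent standard-normal variables (so the population correlation is zero by construction), uses the sample size of 100 from the customer example, and the function names, the number of repetitions and the seed are arbitrary choices made for this illustration.

```python
import math
import random

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate_null_correlations(sample_size=100, n_samples=2000, seed=1):
    """Sample correlations when the population correlation is exactly 0."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        # x and y are drawn independently, so rho = 0 in the population
        x = [rng.gauss(0, 1) for _ in range(sample_size)]
        y = [rng.gauss(0, 1) for _ in range(sample_size)]
        results.append(pearson_r(x, y))
    return results

def empirical_two_tailed_p(observed_r, null_rs):
    """Share of null-hypothesis samples with |r| at least as large as observed."""
    return sum(abs(r) >= abs(observed_r) for r in null_rs) / len(null_rs)

null_rs = simulate_null_correlations()
share_small = sum(abs(r) > 0.1 for r in null_rs) / len(null_rs)
share_strong = sum(abs(r) > 0.5 for r in null_rs) / len(null_rs)

print("share of samples with |r| > .1:", share_small)
print("share of samples with |r| > .5:", share_strong)
print("empirical two-tailed p for r = .25:", empirical_two_tailed_p(0.25, null_rs))
```

With samples of 100, a sizeable share of the simulated correlations (roughly a third) should exceed .1 in absolute value purely by chance, while correlations beyond .5 should essentially never occur. That is the logic of the test: a strong sample correlation is hard to reconcile with ρ = 0.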
Run the test
• Variables:
• Correlation Coefficients:
• Test of Significance: Click Two-tailed or One-tailed, depending on your desired significance test. SPSS uses a two-tailed test by default.
• Flag significant correlations:
• Options: Clicking Options will open a window where you can specify which Statistics to include (i.e., Means and standard deviations, Cross-product deviations and covariances) and how to address Missing Values (i.e., Exclude cases pairwise or Exclude cases listwise).
• Note that the pairwise/listwise setting does not affect your computations if you are only entering two variables, but can make a very large difference if you are entering three or more variables into the correlation procedure.
• Menu path: Analyze > Correlate > Bivariate > select the variables > OK
• As a rule of thumb, a correlation is statistically significant if its "Sig. (2-tailed)" < 0.05.
• The strongest correlation is between depression and overall well being: r = -0.801. It's based on N = 117 children, and its 2-tailed significance is displayed as p = 0.000, which means p < 0.0005 (conventionally reported as p < .001), not that p is exactly zero. In other words, there is a less than 0.001 probability of finding a sample correlation at least this strong if the actual population correlation is zero.
• IQ does not correlate significantly with any other variable. Its strongest correlation is 0.152 with anxiety, but p = 0.11, so it's not statistically significantly different from zero. That is, there's a 0.11 chance of finding a correlation at least this large if the population correlation is zero. This correlation is too small to reject the null hypothesis.

Decision
• Well being and Depression have a statistically significant linear relationship (p < .001).
• The direction of the relationship is negative (i.e., Well being and Depression are negatively correlated), meaning that as one variable tends to increase, the other tends to decrease (i.e., greater well being is associated with lower Depression).
• The magnitude, or strength, of the association is strong (.5 < | r |, large / strong correlation).
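To see why a p-value can be displayed as 0.000, the reported result (r = -0.801, N = 117) can be checked by hand. The sketch below computes the usual t statistic for H0: ρ = 0, which has N - 2 degrees of freedom. Because Python's standard library has no t distribution, the p-value is approximated with the Fisher z-transformation and a normal tail instead; that approximation is a stand-in chosen for this sketch, not the calculation SPSS performs, and the function names are hypothetical.

```python
import math

def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def approx_two_tailed_p(r, n):
    """Approximate two-tailed p-value via the Fisher z-transformation.

    Under H0, atanh(r) * sqrt(n - 3) is approximately standard normal.
    erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) and stays accurate
    far out in the tail, where 1 - cdf would round to zero.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

# Values reported in the output above: well being vs. depression
r, n = -0.801, 117

t = t_statistic(r, n)
p = approx_two_tailed_p(r, n)

print("t =", round(t, 2), "with", n - 2, "degrees of freedom")
print("approximate two-tailed p =", p)
print("p rounded to three decimals =", round(p, 3))
```

The approximate p-value is many orders of magnitude below 0.0005, so it rounds to 0.000 at three decimals, which is why the result is better reported as p < .001. The same check is not run for the IQ and anxiety correlation because the slides do not state the N for that pair.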