
Copyright 2011 Pearson Education, Inc.

Association between
Quantitative Variables
Chapter 6
6.1 Scatterplots
Is household natural gas consumption
associated with climate?

Annual household natural gas consumption
measured in thousands of cubic feet (MCF)

Climate as measured by the National Weather
Service using heating degree days (HDD)

6.1 Scatterplots
Association between Numerical Variables

A scatterplot is a graph that displays pairs of values as
points on a two-dimensional grid

The explanatory variable is placed on the x-axis

The response variable is placed on the y-axis
6.1 Scatterplots
Scatterplot of Natural Gas Consumption (y)
versus Heating Degree-Days (x)


6.2 Association in Scatterplots
Visual Test for Association

Compare the original scatterplot to others
that randomly match the coordinates

If you can pick the original out as having a
pattern, then there is an association
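
The visual test can be sketched in code by keeping the x values fixed and randomly re-pairing the y values. This is only an illustration: the data, the function name `shuffled_pairings`, and its parameters are hypothetical and not from the slides. Plotting `original` next to each entry of `decoys` gives the line-up described above.

```python
import random

def shuffled_pairings(x, y, copies=3, seed=0):
    """Return `copies` datasets that keep x fixed and randomly re-pair the y values."""
    rng = random.Random(seed)
    decoys = []
    for _ in range(copies):
        y_shuffled = list(y)      # copy so the original order is untouched
        rng.shuffle(y_shuffled)   # random matching destroys any real association
        decoys.append(list(zip(x, y_shuffled)))
    return decoys

# Hypothetical data with an upward trend (not from the slides).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3, 8.9]

original = list(zip(x, y))
decoys = shuffled_pairings(x, y)
```

Each decoy has the same x values and the same y values as the original; only the pairing differs, so any pattern that survives only in the original reflects association.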




6.2 Association in Scatterplots
Describing Association

1. Direction. Does it trend up or down?
2. Curvature. Is the pattern linear or curved?
3. Variation. Are the points tightly clustered
around the trend?
4. Outliers. Is there something unexpected?



6.2 Association in Scatterplots
Gas Consumption vs. Heating Degree Days

1. Direction: Positive.
2. Curvature: Linear.
3. Variation: Considerable scatter.
4. Outliers: None apparent.



6.3 Measuring Association
Covariance



A measure that quantifies the linear
association
Depends on units of measurement and is
therefore difficult to interpret


$$\operatorname{cov}(x, y) = \frac{(x_1 - \bar{x})(y_1 - \bar{y}) + (x_2 - \bar{x})(y_2 - \bar{y}) + \cdots + (x_n - \bar{x})(y_n - \bar{y})}{n - 1}$$
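
A minimal sketch of the covariance calculation, assuming small hypothetical data (the numbers and the function name `covariance` are illustrative, not from the slides). It also shows why covariance depends on units: rescaling x by 1,000 rescales the covariance by 1,000.

```python
def covariance(x, y):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Hypothetical data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = covariance(x, y)                               # 1.5
cov_rescaled = covariance([1000 * xi for xi in x], y)   # 1500.0: same data, different units
```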

6.3 Measuring Association


Correlation (r)



Standardized measure of the strength of
the linear association (has no units)
Always between -1 and +1
Easy to interpret



$$\operatorname{corr}(x, y) = \frac{\operatorname{cov}(x, y)}{s_x s_y}$$
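
A sketch of the correlation as covariance divided by the product of the standard deviations, again on hypothetical data with illustrative function names. Unlike the covariance, the result has no units, so changing the units of x leaves r unchanged.

```python
import math

def covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

def sample_sd(v):
    """Sample standard deviation: square root of the covariance of v with itself."""
    return math.sqrt(covariance(v, v))

def correlation(x, y):
    """corr(x, y) = cov(x, y) / (s_x * s_y)."""
    return covariance(x, y) / (sample_sd(x) * sample_sd(y))

# Hypothetical data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)                                   # about 0.775
r_rescaled = correlation([1000 * xi for xi in x], y)    # same value: r is unit-free
```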

6.3 Measuring Association


Gas Consumption and Heating Degree Days

Cov(HDD, Gas) = 63,357 HDD × MCF

Corr(HDD, Gas) = 0.55

The association is positive and moderate.





6.3 Measuring Association
Scatterplot for r = 1






6.3 Measuring Association
Scatterplot for r = -0.95








6.3 Measuring Association
Scatterplot for r = 0.75






6.3 Measuring Association
Scatterplot for r = -0.50






6.3 Measuring Association
Scatterplot for r = 0






6.3 Measuring Association
Correlation Matrix

A table showing all of the correlations among
a set of numerical variables.
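
A correlation matrix can be sketched as a table of pairwise correlations. The column names and values below are hypothetical, chosen only to illustrate the layout; `correlation_matrix` is an illustrative helper, not a function from the slides.

```python
import math

def correlation(x, y):
    """Pearson correlation; the n - 1 divisors cancel, so they are omitted."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Map each pair of column names to the correlation between those columns."""
    names = list(columns)
    return {a: {b: correlation(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical variables (not from the slides).
columns = {
    "hdd":  [2000, 3500, 5000, 6500, 8000],
    "gas":  [60, 95, 100, 130, 140],
    "size": [1800, 1500, 2200, 1700, 2000],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, [round(value, 2) for value in row.values()])
```

The diagonal entries equal 1 (each variable is perfectly correlated with itself) and the table is symmetric, because corr(x, y) = corr(y, x).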






6.4 Summarizing Association with a Line
Expressed using z-scores


Slope-Intercept Form








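z-score form: $$\hat{z}_y = r\, z_x$$

Slope-intercept form: $$\hat{y} = a + b\,x, \quad \text{with } a = \bar{y} - b\,\bar{x} \text{ and } b = r\, s_y / s_x$$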
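
The slope and intercept can be computed directly from r, the two means, and the two standard deviations. The sketch below uses hypothetical data and an illustrative function name, `correlation_line`; it is one way to carry out the calculation, not the slides' own code.

```python
import math

def correlation_line(x, y):
    """Return (a, b, r) with b = r * s_y / s_x and a = y_bar - b * x_bar."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    r = cov / (s_x * s_y)
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b, r

# Hypothetical data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b, r = correlation_line(x, y)   # a is about 2.2, b is about 0.6
y_hat_at_5 = a + b * 5             # predicted y when x = 5
```

The two forms agree: converting `y_hat_at_5` to a z-score gives r times the z-score of x = 5.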
6.4 Summarizing Association with a Line
Line Relating Gas Consumption (y) to
Heating Degree Days (x)










$$\hat{y} = 42.6 + 0.0126\,x$$


6.4 Summarizing Association with a Line
Lines and Prediction
Use the correlation line to customize an ad for
estimated savings from insulation based on
climate.
For a home in a cold climate (HDD = 8,800), the
predicted gas consumption is 154 MCF.
At $10 / MCF, the predicted cost is $1,540.
Assuming that insulation saves 30% on the gas bill,
the estimated savings are $462.
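
The arithmetic can be checked with a short sketch. It assumes the fitted line's rounded coefficients are an intercept of 42.6 MCF and a slope of 0.0126 MCF per HDD, as read from the preceding slide; all variable names are illustrative. With these rounded coefficients the prediction comes to about 153.5 MCF, which is consistent with the 154 MCF on the slide if that figure was computed from unrounded coefficients (an assumption).

```python
# Rounded coefficients of the fitted line, as read from the slide (assumed).
INTERCEPT_MCF = 42.6
SLOPE_MCF_PER_HDD = 0.0126

def predicted_gas_mcf(hdd):
    """Predicted annual gas consumption (MCF) for a given number of heating degree days."""
    return INTERCEPT_MCF + SLOPE_MCF_PER_HDD * hdd

hdd_cold = 8800
from_rounded_line = predicted_gas_mcf(hdd_cold)   # about 153.5 MCF

# The slide's own figures for the cost calculation.
slide_prediction_mcf = 154
price_per_mcf = 10        # dollars per MCF
savings_rate = 0.30       # assumed share of the gas bill saved by insulation

predicted_cost = slide_prediction_mcf * price_per_mcf   # 1540 dollars
estimated_savings = savings_rate * predicted_cost       # about 462 dollars
```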












6.5 Spurious Correlation
Lurking Variables

Scatterplots and correlation reveal
association, not causation

Spurious correlations result from underlying
lurking variables
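
A small simulation illustrates how a lurking variable can produce correlation without causation. Everything here is hypothetical: z plays the role of the lurking variable, and x and y each depend on z plus independent noise, with no direct link between them.

```python
import math
import random

def correlation(x, y):
    """Pearson correlation; the n - 1 divisors cancel, so they are omitted."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 2000

z = [rng.gauss(0, 1) for _ in range(n)]        # lurking variable
x = [zi + rng.gauss(0, 0.5) for zi in z]       # x depends only on z
y = [zi + rng.gauss(0, 0.5) for zi in z]       # y depends only on z

r_xy = correlation(x, y)                       # large, although x does not cause y

# Remove the lurking variable's contribution and the correlation disappears.
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_without_z = correlation(x_without_z, y_without_z)   # near zero
```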











6.5 Spurious Correlation
Checklist: Covariance and Correlation

Numerical variables
No obvious lurking variables
Linear
Outliers











4M Example 6.1:
LOCATING A NEW STORE
Motivation

Is it better to locate a new retail outlet far
from competing stores?
4M Example 6.1:
LOCATING A NEW STORE
Method

Is there an association between sales at the retail
outlets and distance to nearest competitor? For
55 stores in the chain, data are gathered for total
sales in the prior year and distance in miles from
the nearest competitor.
4M Example 6.1:
LOCATING A NEW STORE
Mechanics





4M Example 6.1:
LOCATING A NEW STORE
Mechanics

The correlation between sales and
distance is r = 0.741.







4M Example 6.1:
LOCATING A NEW STORE
Message

The data show a strong, positive linear association
between distance to the nearest competitor and
sales. It is better to locate a new store far from its
competitors.
Best Practices

To understand the relationship between two
numerical variables, start with a scatterplot.

Look at the plot, look at the plot, look at the plot.

Use clear labels for the scatterplot.



Best Practices (Continued)
Describe a relationship completely.

Consider the possibility of lurking variables.

Use a correlation to quantify the association
between two numerical variables that are linearly
related.



Pitfalls
Don't use the correlation if data are categorical.

Don't treat association and correlation as
causation.

Don't assume that a correlation of zero means
that the variables are not associated.

Don't assume that a correlation near -1 or +1
means near perfect association.
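
The zero-correlation pitfall can be illustrated with a curved pattern. In this hypothetical example y is completely determined by x (y = x squared), yet the correlation is zero because the association is not linear.

```python
import math

def correlation(x, y):
    """Pearson correlation; the n - 1 divisors cancel, so they are omitted."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: x is symmetric around 0 and y is an exact function of x.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r = correlation(x, y)   # 0.0, despite the perfect (curved) association
```

A scatterplot of these points shows a clear U shape, which is why the plot should be examined before relying on r.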
