
Measures the relative strength of the linear relationship between two variables

Unit-less

Ranges between -1 and 1


The closer to -1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship
The closer to 0, the weaker the linear relationship
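As a minimal sketch of these properties, the coefficient can be computed directly from deviations about the means. The function name `pearson_r` and the toy data are illustrative assumptions, not part of the original slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # points on an increasing line
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # points on a decreasing line
# Rescaling X (a change of units) leaves r unchanged, which is why r is unit-less
r_rescaled = pearson_r([10 * v for v in x], [2, 4, 6, 8, 10])
print(r_pos, r_neg, r_rescaled)
```

Points lying exactly on a line give a coefficient at one of the two extremes, and multiplying X by a constant does not move it.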

[Figure: scatter plots of Y against X illustrating r = -1, r = -.6, r = 0, r = +1, r = +.3, and r = 0]

[Figure: scatter plots of Y against X illustrating linear relationships, curvilinear relationships, strong relationships, weak relationships, and no relationship]

In correlation, the two variables are treated as equals. In regression, one variable is considered the independent (predictor) variable, X, and the other the dependent (outcome) variable, Y.

Y=mX+B?

A slope of 2 means that every 1-unit change in X yields a 2-unit change in Y.

P=.22; not significant

The linear regression model:

Love of Math = 5 + .01*math SAT score

Here 5 is the intercept and .01 is the slope.
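To make the roles of the intercept and slope concrete, here is a small sketch that evaluates the Love of Math model at two SAT scores. The function name and the chosen scores (600 and 700) are illustrative assumptions.

```python
INTERCEPT = 5.0   # predicted Love of Math when the math SAT score is 0
SLOPE = 0.01      # change in Love of Math per 1-point change in math SAT score

def predict_love_of_math(math_sat):
    """Evaluate the fitted line: intercept + slope * predictor."""
    return INTERCEPT + SLOPE * math_sat

p600 = predict_love_of_math(600)
p700 = predict_love_of_math(700)
# A 100-point difference in SAT score changes the prediction by 100 * slope
difference = p700 - p600
print(p600, p700, difference)
```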

If you know something about X, this knowledge helps you predict something about Y.

The average baby weight in Mumbai is 3400 grams

Your best guess at a random baby's weight, given no information about the baby, is what?
3400 grams

But, what if you have relevant information? Can you make a better guess?

X = gestation time
Assume that babies that gestate for longer are born heavier, all other things being equal. Pretend (at least for the purposes of this example) that this relationship is linear.
Example: suppose a one-week increase in gestation leads, on average, to a 100-gram increase in birth weight.

[Figure: scatter plot of Y = birth weight (g) against X = gestation time (weeks)]

The best fit line is chosen such that the sum of the squared (why squared?) distances of the points ($Y_i$'s) from the line is minimized.
Or mathematically (max and mins from calculus): $\text{Derivative}\left[\sum_i \left(Y_i - (mX_i + b)\right)^2\right] = 0$
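The minimization can be carried through in closed form. The sketch below uses the slide's symbols m and b for the slope and intercept; the name S for the sum of squares and the sample means $\bar{X}$ and $\bar{Y}$ are notation introduced here for illustration.

```latex
\begin{aligned}
S(m, b) &= \sum_{i=1}^{n} \bigl(Y_i - (mX_i + b)\bigr)^2 \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} \bigl(Y_i - mX_i - b\bigr) = 0
  \quad\Longrightarrow\quad b = \bar{Y} - m\bar{X} \\
\frac{\partial S}{\partial m} &= -2 \sum_{i=1}^{n} X_i \bigl(Y_i - mX_i - b\bigr) = 0
  \quad\Longrightarrow\quad
  m = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}
\end{aligned}
```

Squaring keeps positive and negative distances from cancelling and makes S differentiable everywhere, which is what allows the derivative to be set to zero.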

A new baby is born that had gestated for just 30 weeks. What's your best guess at the birth weight? Are you still best off guessing 3400? NO!

[Figure: scatter plot of Y = birth weight (g) against X = gestation time (weeks), with the point (x, y) = (30, 3000) marked]

The babies that gestate for 30 weeks appear to center around a weight of 3000 grams.

In Math-Speak: $E(Y \mid X = 30\text{ weeks}) = 3000$ grams

Note that not every Y-value ($Y_i$) sits on the line. There's variability.
$Y_i = 3000 + \text{random error}_i$

In fact, babies that gestate for 30 weeks have birth weights that center at 3000 grams, but vary around 3000 with some variance $\sigma^2$

Approximately what distribution do birth weights follow? Normal. $Y \mid X = 30\text{ weeks} \sim N(3000, \sigma^2)$

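[Figure: Y = baby weights (g) against X = gestation times (weeks), with the distribution of weights shown at 20, 30, and 40 weeks]

$Y \mid X = 20\text{ weeks} \sim N(2000, \sigma^2)$
$Y \mid X = 30\text{ weeks} \sim N(3000, \sigma^2)$
$Y \mid X = 40\text{ weeks} \sim N(4000, \sigma^2)$

$E(Y \mid X = 20\text{ weeks}) = 2000$
$E(Y \mid X = 30\text{ weeks}) = 3000$
$E(Y \mid X = 40\text{ weeks}) = 4000$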

$E(Y \mid X) = \mu_{Y \mid X} = 100\ \text{grams/week} \times X\ \text{weeks}$

The Y's are modeled as:
$Y_i = 100 \cdot X_i + \text{random error}_i$

The term $100 \cdot X_i$ is fixed exactly on the line; the random error follows a normal distribution.
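The model can be illustrated by simulating data from it and then recovering the line by least squares. This is a sketch only: the error standard deviation of 200 grams, the 100 babies per gestation week, and the helper name `fit_line` are assumptions made for the example, not values from the slides.

```python
import random

random.seed(0)      # fixed seed so the run is repeatable
SLOPE = 100.0       # grams per week, from the example
SIGMA = 200.0       # ASSUMED error standard deviation in grams (not from the source)

# 100 simulated babies at each gestation time from 20 to 40 weeks
xs = [week for week in range(20, 41) for _ in range(100)]
# Each weight is the point on the line plus a normally distributed error
ys = [SLOPE * x + random.gauss(0.0, SIGMA) for x in xs]

def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope_hat, intercept_hat = fit_line(xs, ys)
# Average simulated weight among babies with exactly 30 weeks of gestation
mean_at_30 = sum(y for x, y in zip(xs, ys) if x == 30) / 100
print(round(slope_hat, 1), round(intercept_hat, 1), round(mean_at_30, 1))
```

With this much data, the fitted slope should land near 100 grams per week and the 30-week average near 3000 grams, while individual simulated weights still scatter around the line.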

Linear regression assumes that

1. The relationship between X and Y is linear
2. Y is distributed normally at each value of X
3. The variance of Y at every value of X is the same (homogeneity of variances)

Why? The math requires it. The mathematical process is called least squares because it fits the regression line by minimizing the squared errors from the line (mathematically easy, but not general; it relies on the above assumptions).

More than one predictor

$\hat{y} = \alpha + \beta_1 X + \beta_2 W + \beta_3 Z$

Each regression coefficient is the amount of change in the outcome variable that would be expected per one-unit change of the predictor, if all other variables in the model were held constant.
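The "held constant" interpretation can be checked numerically. The sketch below fits an intercept and three predictors (called X, W, and Z here) by solving the least-squares normal equations in plain Python. The data are made up and noise-free, generated from coefficients 1, 2, 3, and -1 chosen purely for illustration; the helper names `solve` and `fit_multiple` are likewise hypothetical.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef

def fit_multiple(rows, ys):
    """Least-squares [intercept, coefficient per predictor] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]         # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

# Made-up (X, W, Z) values; outcome generated without noise for illustration
rows = [(1, 2, 0), (2, 1, 3), (3, 5, 1), (4, 2, 2), (5, 7, 4), (6, 3, 1)]
ys = [1 + 2 * x + 3 * w - 1 * z for x, w, z in rows]
alpha, b1, b2, b3 = fit_multiple(rows, ys)

def predict(x, w, z):
    return alpha + b1 * x + b2 * w + b3 * z

# Raise X by one unit while holding W and Z fixed: prediction changes by b1
delta = predict(4, 2, 2) - predict(3, 2, 2)
print(alpha, b1, b2, b3, delta)
```

Because the data contain no noise, the fit recovers the generating coefficients up to rounding, and the one-unit change in X with W and Z fixed moves the prediction by exactly the X coefficient.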

[Figure: example research model with control variables, relating Product Quality, Purchase Satisfaction, and Revisit Intention; measures shown are two 5-item scales and one 10-item scale]

Cluster sampling; sample size: 450
