The central purpose of regression is to create a linear equation relating the independent
variable, X, to the dependent variable, Y. It permits us to answer the following kind of
question:
How much additional income does each additional year of education provide?
Regression assumes that both variables are measured on at least an interval level, and it
should only be used if we think that this assumption is close to being met.
Yi^ = α + βXi
where Y^ is used for Y-hat, the predicted value of Y, and α is the intercept (the value of
Y^ when X = 0).
If the model (equation) is correct for the population, Y^ equals μY|X. This is known as the
conditional mean of Y given X, and is the population mean of Y for the particular value
of X.
Thus the regression line can be considered the path of the mean values of Y as X changes.
The line is produced by plugging values of X into the linear regression formula and
solving for Y^.
β is the regression slope: the amount of change in Y for each unit change in X. (Note
that one must specify the units.)
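To make the prediction line and the slope concrete, here is a minimal sketch using the education and income question from above. The coefficient values (ALPHA, BETA) and the function name predict_income are hypothetical, chosen only for illustration; they are not estimates from any real data.

```python
# Hypothetical coefficients, chosen only for illustration (not real estimates).
ALPHA = 10_000.0  # intercept: predicted income in dollars at 0 years of education
BETA = 2_500.0    # slope: dollars of additional income per additional year of education


def predict_income(years_of_education: float) -> float:
    """Return the predicted value Y^ = ALPHA + BETA * X for one value of X."""
    return ALPHA + BETA * years_of_education


# Plug two values of X into the formula and solve for Y^.
y_hat_12 = predict_income(12)
y_hat_13 = predict_income(13)

# A one-unit change in X (one year) changes Y^ by exactly the slope.
print(y_hat_12, y_hat_13, y_hat_13 - y_hat_12)
```

Under these assumed values, the slope reads as "2,500 dollars per year of education", which is why the units of both X and Y must be stated.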
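Yi = α + βXi + εi
is the population regression model, where εi is the error term for case i.
Yi^ = a + bXi
is the sample prediction equation, where a and b are the sample estimates of α and β.
Yi = a + bXi + ei
is the sample regression model, where ei is the residual, the difference between the
observed Yi and the predicted Yi^.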
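The sample equation says that each observed Y splits into a predicted part and a residual. A minimal sketch of that decomposition follows, assuming hypothetical values a = 2.2 and b = 0.6 and a small made-up data set (none of these come from the notes).

```python
# Made-up data and hypothetical sample coefficients, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
a = 2.2  # assumed sample intercept
b = 0.6  # assumed sample slope

# Predicted values: Yi^ = a + b * Xi
predictions = [a + b * x for x in xs]

# Residuals: ei = Yi - Yi^
residuals = [y - y_hat for y, y_hat in zip(ys, predictions)]

# Each observation is recovered as Yi = a + b * Xi + ei.
for x, y, e in zip(xs, ys, residuals):
    print(f"X={x:.0f}  Y={y:.1f}  Y^={a + b * x:.1f}  e={e:+.1f}")
```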
The best fitting line is defined as the one which minimizes the sum of the squared
vertical distances (the residuals) of all points from the regression line.
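The least-squares criterion can be sketched in a few lines. The notes do not give the computing formulas, so this example assumes the standard closed-form solution for one predictor: b is the sum of cross-products of deviations divided by the sum of squared deviations of X, and a = mean(Y) - b * mean(X). The function names and the data are hypothetical.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the line Y^ = a + b * X that minimizes squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sum of squared deviations of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = fit_least_squares(xs, ys)
best = sum_squared_errors(xs, ys, a, b)

# Any other line, such as one with a slightly different slope, fits worse.
worse = sum_squared_errors(xs, ys, a, b + 0.1)
print(round(a, 3), round(b, 3), best < worse)
```

For this data the fitted values are approximately a = 2.2 and b = 0.6, and moving the slope or intercept away from them only increases the sum of squared errors.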
The procedure for improving best estimates for a dependent variable (Y) by accounting
for its relationship with an independent variable (X) is called simple linear correlation
and regression analysis.
Simple Linear Correlation and Regression Analysis
Simple linear correlation and regression analysis is the use of the formula for a straight
line to improve best estimates of an interval/ratio dependent variable (Y) for all values of
an interval/ratio independent variable (X).
Linear means “straight line”
Scatterplots
A linear regression formula is the formula for a straight line
Simple linear correlation and regression statistics apply only to scatterplots with
coordinates in a linear, cigar-shaped pattern
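One common numeric companion to eyeballing the scatterplot is a correlation coefficient. The notes do not name a specific measure at this point, so the sketch below assumes Pearson's r, which is close to +1 or -1 when the points hug a straight line and close to 0 when they do not. The function name and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: a roughly linear cloud and a perfectly linear one.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_noisy = [2.0, 4.0, 5.0, 4.0, 5.0]
ys_exact = [3.0, 5.0, 7.0, 9.0, 11.0]  # exactly Y = 1 + 2X

print(round(pearson_r(xs, ys_noisy), 3))
print(round(pearson_r(xs, ys_exact), 3))
```

A high absolute value of r does not replace the scatterplot: a curved pattern can still produce a sizeable r, so the plot should be inspected as well.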
The formula for a straight line to estimate Y is: Y^ = a + bX