
Elements of regression theory

Done by:
Borysenko Alina
Dubovik Maksim

Definition
Regression is a statistical measure that
attempts to determine the strength of
the relationship between one dependent
variable (usually denoted by Y) and a
series of other changing variables
(known as independent variables).

Definition

Regression analysis is often used in the business or investment world to attempt to predict the effect of certain INPUTS on an OUTPUT.

For example:

A company may want to see if its sales can be predicted by a movement in the GDP.

A company may want to predict the effect of the price of steel on car sales.

Problems of regression analysis


1. MULTICOLLINEARITY

The case in which two or more explanatory variables in the regression model are highly correlated, making it difficult or impossible to isolate their individual effects on the dependent variable.
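A minimal sketch with made-up data: when one explanatory variable is nearly a linear function of another, their pairwise correlation is close to 1, which signals multicollinearity.

```python
import numpy as np

# Hypothetical data: x2 is almost a linear function of x1,
# so the two explanatory variables are highly correlated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 2.0 * x1 + rng.normal(scale=0.05, size=100)  # near-duplicate of x1

# A pairwise correlation close to 1 signals multicollinearity.
r = np.corrcoef(x1, x2)[0, 1]
print(round(r, 3))
```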

Problems of regression analysis


2. HETEROSCEDASTICITY

The case in which the error variance is not constant: it increases as the independent variable increases.
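A minimal simulation (made-up model) of this pattern: when the error standard deviation grows with x, the residuals fan out, and their spread in the upper half of the x-range is larger than in the lower half.

```python
import numpy as np

# Hypothetical illustration: the error standard deviation grows with x.
rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 500)
y = 3.0 + 2.0 * x + rng.normal(scale=0.5 * x)  # noise scale proportional to x

residuals = y - (3.0 + 2.0 * x)           # true errors, for illustration
low = residuals[x <= 5.0].std()           # spread for small x
high = residuals[x > 5.0].std()           # spread for large x
print(round(low, 2), round(high, 2))
```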

Problems of regression analysis


3. AUTOCORRELATION

The case in which the error term in one time period is correlated with the error term in another time period: a statistical relationship between the values of a sequence taken with a shift in time.
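A minimal sketch with simulated AR(1) errors (a made-up error process): the sample correlation between the error series and its own one-period shift is far from zero.

```python
import numpy as np

# Hypothetical illustration: errors follow an AR(1) process, so each
# error term depends on the error term of the previous period.
rng = np.random.default_rng(2)
n = 500
e = np.empty(n)
e[0] = rng.normal()
for t in range(1, n):
    e[t] = 0.8 * e[t - 1] + rng.normal()

# Sample lag-1 autocorrelation of the error series (roughly 0.8 here).
r1 = np.corrcoef(e[:-1], e[1:])[0, 1]
print(round(r1, 2))
```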

Problems of regression analysis


4. ERRORS IN VARIABLES

The case in which the variables in the regression model are measured with error. Errors in the explanatory (independent) variables lead to biased and inconsistent parameter estimates.

Correlation dependence

Correlation dependence is a statistical relationship between two or more random variables.

The correlation ratio and the correlation coefficient are mathematical measures of the correlation between two random variables.
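As a sketch with hypothetical data, the sample correlation coefficient can be computed directly:

```python
import numpy as np

# Hypothetical data with a nearly exact linear relationship (roughly y = 2x),
# so the correlation coefficient is very close to 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])

r = np.corrcoef(x, y)[0, 1]
print(round(r, 3))
```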

Correlation table

The primary task of statistical processing of experimental data is the systematization of the data. That is where a correlation table will help you:

X takes the values X1, X2, …, Xm
Y takes the values Y1, Y2, …, Yk
n — the frequency of each (X, Y) pair

Correlation table (example)

We studied the relationship between the quality of goods Y (%) and the quantity of goods X (pcs). The observation results are shown in the form of a correlation table:

[Correlation table not recoverable from this copy: X takes the values 70, 75, 80, 85, 90; Y takes the values 18, 22, 26, 30; the cell frequencies together with the row and column totals sum to n = 100.]

Empirical lines of regression

Empirical regression is based on grouped data. It represents the dependence of the group mean values of the dependent variable (Y) on the group mean values of the independent variable (X).

The graphical representation of the empirical regression is a broken line composed of the points (abscissa: group mean value of X; ordinate: group mean value of Y).

Empirical line of regression (example)

The dependence between the amount of sales of a good (Y) and the costs of its advertisement (X) is represented below:

X | 1.5 | 4.0 | 5.0 | 7.0 | 8.5 | 10.0 | 11.0 | 12.5
Y | 5.0 | 4.5 | 7.0 | 6.5 | 9.5 | 9.0  | 11.0 | 9.0

Let's depict the experimental data as points in Cartesian coordinates. The broken line joining these points is called the empirical line of regression:
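The vertices of this broken line are simply the data points ordered by X (a plotting library such as matplotlib would then draw the connecting segments):

```python
# The advertising data from the table above.
X = [1.5, 4.0, 5.0, 7.0, 8.5, 10.0, 11.0, 12.5]
Y = [5.0, 4.5, 7.0, 6.5, 9.5, 9.0, 11.0, 9.0]

# Vertices of the empirical broken line, ordered by X.
vertices = sorted(zip(X, Y))
print(vertices[0], vertices[-1])
```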

The estimation of parameters using the Least Squares Method

A line of best fit is a straight line that is the best approximation of a given set of data. It is used to study the nature of the relation between two variables.

A more accurate way of finding the line of best fit is the least squares method.

The estimation of parameters using the Least Squares Method

Use the following steps to find the equation of the line of best fit for a set of ordered pairs (xi; yi), i = 1, …, n.

STEP 1: Calculate the mean of the X-values (x̄) and the mean of the Y-values (ȳ).

The estimation of parameters using the Least Squares Method

STEP 2: The following formula gives the slope (a) of the line of best fit:

a = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²

The estimation of parameters using the Least Squares Method

STEP 3: Compute the y-intercept (b) of the line by using the formula:

b = ȳ − a·x̄

The estimation of parameters using the Least Squares Method

STEP 4: Use the slope a and the y-intercept b to form the equation of the line:

y = a·x + b
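The four steps above can be sketched in plain Python (the names best_fit_line, xs, ys and the sample data are illustrative):

```python
# The four least-squares steps, with a = slope and b = intercept,
# matching the notation used in these slides.
def best_fit_line(xs, ys):
    n = len(xs)
    x_mean = sum(xs) / n                      # STEP 1: means
    y_mean = sum(ys) / n
    a = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) \
        / sum((x - x_mean) ** 2 for x in xs)  # STEP 2: slope
    b = y_mean - a * x_mean                   # STEP 3: intercept
    return a, b                               # STEP 4: y = a*x + b

# Made-up data lying exactly on y = 2x, so a = 2.0 and b = 0.0.
a, b = best_fit_line([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(a, b)
```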

The estimation of parameters using the Least Squares Method (example)

[Data table of the pairs (xi, yi) not recoverable from this copy.]

1. Calculate the means x̄ and ȳ.

The estimation of parameters using the Least Squares Method (example)

2. Tabulate the deviations:

[Worked table of xi − x̄, yi − ȳ, (xi − x̄)(yi − ȳ) and (xi − x̄)² not recoverable from this copy; its column totals are Σ(xi − x̄)(yi − ȳ) = −131 and Σ(xi − x̄)² = 118.4.]

The estimation of parameters using the Least Squares Method (example)

2. Calculate the slope: a = −131 / 118.4 ≈ −1.11

3. Calculate the y-intercept: b = ȳ − a·x̄

4. Form the equation of the line: y = a·x + b
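Using the column totals from the worked table, Σ(xi − x̄)(yi − ȳ) = −131 and Σ(xi − x̄)² = 118.4, the slope can be checked directly:

```python
# Slope from the column totals of the worked table:
# sum of (x - x̄)(y - ȳ) = -131, sum of (x - x̄)² = 118.4.
sxy = -131.0
sxx = 118.4
a = sxy / sxx
print(round(a, 3))   # ≈ -1.106
```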

Point estimations

Point estimations of the parameters of regression (a and b):

a = (n·Σxiyi − Σxi·Σyi) / (n·Σxi² − (Σxi)²)

b = (Σyi − a·Σxi) / n

Point estimations (example)

[Table of xi, yi, xiyi and xi² not recoverable from this copy; its column totals are Σxi = 93, Σyi = 95, Σxiyi = 779, Σxi² = 1073.]

Point estimations (example)
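From the column totals of the table (Σx = 93, Σy = 95, Σxy = 779, Σx² = 1073), and assuming n = 10 observations as the table suggests, the point estimates work out as follows:

```python
# Point estimates of the slope a and intercept b from the table totals.
# n = 10 is an assumption read off the table; the sums are from the slide.
n, sx, sy, sxy, sxx = 10, 93.0, 95.0, 779.0, 1073.0

a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope
b = (sy - a * sx) / n                           # intercept
print(round(a, 3), round(b, 2))
```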

Regression Slope: Confidence Interval

To construct a confidence interval for the slope of the regression line, we need to know the standard error of the sampling distribution of the slope.

STEP 1: Identify a sample statistic. The sample statistic is the regression slope (a) calculated from the sample data.

STEP 2: Select a confidence level. The confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels.

STEP 3: Find the margin of error.

STEP 4: Specify the confidence interval. The range of the confidence interval is defined by the sample statistic ± margin of error.

Confidence Interval (example)

Regression equation: ŷ = 15 + 0.55x

Predictor | Coef | SE Coef | T    | P
Constant  | 15   | 5.0     |      | 0.00
X         | 0.55 | 0.24    | 2.29 | 0.01

What is the 99% confidence interval for the slope of the regression line?

Confidence Interval (example)

1. 99% confidence level (given).
2. Margin of error:
   1. The standard error is given in the regression output: SE = 0.24.
   2. Critical probability: p* = 1 − α/2 = 1 − 0.01/2 = 0.995.
   3. Degrees of freedom: df = n − 2.
   4. Using the table of probabilities for Student's t-distribution, we find that the critical value is 2.63.
   5. Margin of error: 2.63 × 0.24 ≈ 0.63.
3. The 99% confidence interval is 0.55 ± 0.63, i.e. −0.08 to 1.18.
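The computation can be checked directly from the numbers in the regression output (slope 0.55, standard error 0.24, critical t-value 2.63):

```python
# 99% confidence interval for the slope, using the values from the slide.
slope, se, t_crit = 0.55, 0.24, 2.63

margin = t_crit * se                 # margin of error ≈ 0.63
lo, hi = slope - margin, slope + margin
print(round(lo, 2), round(hi, 2))    # -0.08 to 1.18
```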

Nonlinear regression

Nonlinear regression is a form of regression analysis in which observational data are modeled by a function that is a nonlinear combination of the model parameters and depends on one or more independent variables.

The data are fitted by a method of successive approximations.
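A minimal sketch of such successive approximations, assuming a hypothetical exponential model y = a·e^(bx) and using the Gauss-Newton method with step halving (one common choice of iterative method; the data and starting values are invented):

```python
import numpy as np

# Hypothetical nonlinear model y = a * exp(b * x), synthetic data a = 2, b = 0.3.
x = np.linspace(0.0, 5.0, 6)
y = 2.0 * np.exp(0.3 * x)

def sse(a, b):
    """Sum of squared residuals for the current parameter guess."""
    return float(np.sum((y - a * np.exp(b * x)) ** 2))

a, b = 1.0, 0.1                          # initial approximation
for _ in range(100):
    r = y - a * np.exp(b * x)            # residuals
    # Jacobian of the model with respect to (a, b)
    J = np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])
    step, *_ = np.linalg.lstsq(J, r, rcond=None)
    t = 1.0
    while sse(a + t * step[0], b + t * step[1]) > sse(a, b) and t > 1e-8:
        t *= 0.5                         # halve the step if it overshoots
    a, b = a + t * step[0], b + t * step[1]  # successive approximation

print(round(a, 2), round(b, 2))          # approaches a = 2, b = 0.3
```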

Nonlinear regression

Examples: [example curves not recoverable from this copy]

Multiple Regression

The general purpose of multiple regression is to learn more about the relationship between several independent (predictor) variables and a dependent (criterion) variable.

Personnel professionals use multiple regression procedures to determine equitable compensation. You can determine a number of factors or dimensions, such as "amount of responsibility" (Resp) or "number of people to supervise" (No_Super), that you believe contribute to the value of a job.

The personnel analyst then usually conducts a salary survey among comparable companies in the market, recording the salaries and respective characteristics for different positions. This information can be used in a multiple regression analysis to build a regression equation of the form:

Salary = b0 + b1·Resp + b2·No_Super

Multiple Regression

Once this so-called regression line has been determined, the analyst can easily construct a graph of the expected (predicted) salaries and the actual salaries of job incumbents in his or her company.

RESULT: the analyst is able to determine which positions are underpaid (below the regression line), overpaid (above the regression line), or paid equitably.
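A sketch of this kind of analysis on made-up survey data (the variable names Resp and No_Super follow the slides above; the sample size and coefficients are invented for illustration):

```python
import numpy as np

# Hypothetical salary survey: Resp = amount of responsibility,
# No_Super = number of people to supervise.
rng = np.random.default_rng(3)
resp = rng.uniform(1, 10, size=50)
no_super = rng.uniform(0, 20, size=50)
salary = 20.0 + 0.5 * resp + 0.8 * no_super + rng.normal(scale=0.1, size=50)

# Design matrix with an intercept column; least-squares fit of
# Salary = b0 + b1*Resp + b2*No_Super.
X = np.column_stack([np.ones(50), resp, no_super])
coef, *_ = np.linalg.lstsq(X, salary, rcond=None)
b0, b1, b2 = coef
print(round(b0, 1), round(b1, 2), round(b2, 2))
```

Predicted salaries from this equation can then be compared with actual salaries to flag under- and overpaid positions.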
