
Ordinary Least Squares Regression with Shazam

Data

Data files in the statistics class are usually Excel files stored
on the hard drive or a floppy disk, but data may also be entered directly by
selecting “file”, “new”, and “dataset” on the standard toolbar (the first
toolbar, or second row) of the Shazam screen. Usually it is easier to
create Excel files with your data. Each row of the Excel spreadsheet
should be an observation and each column a variable. Suppose pcenrgy,
popdnsty, pcincome, imptergy, and tropics are five variables in a
cross-section analysis of 31 countries. Each observation, or row, would be a
country, and each column would contain data for one of the variables in
that country. To import the Excel data into Shazam, the first row
should contain the variable names. Shazam will look for these names and
show them as column headings when the data are imported. Choose
variable names of 8 or fewer characters; the first character of each
variable name should be a letter. Also, do not put any special
characters or spaces in a variable name. The Excel file should contain
no blank cells (missing data) if you wish to avoid complications. To
simplify matters, the first column should be the dependent variable.
Shown below is the file in Excel.

pcenrgy popdnsty pcincome imptergy tropics

1525 13 8570 -25 2
5215 2 20540 -98 1
3279 97 27980 68 2
5167 310 26420 78 2
772 19 4720 40 1
7879 3 19290 -50 2
47 19 110 5 1
707 129 860 -2 2
6918 123 32500 24 2
5613 17 24080 55 3
4150 106 26050 47 2
4156 234 28260 58 2
92 75 370 66 1
2454 111 4430 47 2
260 313 390 18 1
3003 269 15810 97 1
3964 333 37850 80 2
109 47 330 82 1
692 61 4680 -88 1
1456 48 3680 -51 1
33 150 210 86 2
4741 456 25820 10 2
4290 13 19480 19 2
265 36 410 74 1
308 12 2010 -141 2
1939 108 10450 90 2
7162 4896 32940 100 1
5736 21 26220 38 3
878 116 2800 63 1
3786 243 20710 -15 2
7905 29 28740 20 2
2158 25 3450 -298 1

Notice that the sheet contains an empty row, which must be deleted
because there is no data in it. Save the file where you can find it, on
a disk or the hard drive.
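
The rules above (names of 8 or fewer characters that start with a letter, no spaces or special characters, no blank cells, no empty rows) can be checked before importing. The following is a minimal Python sketch, not part of Shazam; the function name `check_sheet` is hypothetical, and allowing digits after the first letter is an assumption, since the handout does not say either way.

```python
import re

# Rules from the text: at most 8 characters, first character a letter,
# no spaces or special characters. Allowing digits after the first letter
# is an assumption.
NAME_RULE = re.compile(r"^[A-Za-z][A-Za-z0-9]{0,7}$")


def check_sheet(rows):
    """Return a list of problems found in a sheet given as rows of cell strings."""
    problems = []
    header, data = rows[0], rows[1:]
    for name in header:
        if not NAME_RULE.match(name):
            problems.append(f"bad variable name: {name!r}")
    for number, row in enumerate(data, start=2):  # row 1 is the header
        if all(cell.strip() == "" for cell in row):
            problems.append(f"row {number} is empty and must be deleted")
        elif len(row) != len(header) or any(cell.strip() == "" for cell in row):
            problems.append(f"row {number} has a blank cell")
    return problems


# Header, the empty row, and the first three observations from the sheet above.
sheet = [
    ["pcenrgy", "popdnsty", "pcincome", "imptergy", "tropics"],
    ["", "", "", "", ""],
    ["1525", "13", "8570", "-25", "2"],
    ["5215", "2", "20540", "-98", "1"],
    ["3279", "97", "27980", "68", "2"],
]

for problem in check_sheet(sheet):
    print(problem)
```

Run on these rows, the sketch reports only the empty second row, which is the row the text says must be deleted.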

Shazam
In Shazam, the top row across the screen says Shazam –
Professional Edition etc. The second row is a toolbar with a Shazam
symbol first, then file, edit, project, data etc. to the right along the
toolbar. The third row is a toolbar with “New” and “Open” as the first
two options and a file symbol next to “Open”. When “Open” is
selected with the mouse, Shazam brings up a Windows dialog
that enables you to select the Excel file you stored
earlier.

Be sure to identify the appropriate “type of file” so that
Windows will show all files of that type that are stored on the drive.
The default type of file Shazam looks for is a Shazam file. Change this
to “microsoft excel” or “all files” so that the window will show Excel
files on the drive. Select the file that you have stored and then
select “open”. Shazam will then show a message about variable names and
data and ask “do you want to continue”; you should answer “yes”.
Another menu will appear that will permit you to select a spreadsheet;
normally select sheet 1 and open. Another popup will appear asking if
you wish to add the data set to the current project; you should indicate
“yes”. Then give the data set a name and select “save”, and
when another window appears give the project a name and select “save”.

Check again to be sure the variable names are correctly entered
and that there are no blank cells in the data set. At this point
“load” the data. You are using Shazam to do ordinary least squares
regression and to test whether the assumptions of regression have been
met by testing for multicollinearity, autocorrelation, and
heteroskedasticity. You will want a regression equation, t tests of
significance for the partial regression coefficients (variable
coefficients), a Durbin-Watson statistic, an R squared and adjusted R
squared, an F test for significance of the equation, a variable
correlation matrix to test for multicollinearity, and a White test for
heteroskedasticity.

Shazam will do all these things, some automatically with the OLS
command; others will be obtained as options or as a second command. The
Shazam edition we are using provides “wizards” that assist you in
writing the appropriate commands for ordinary least squares and other
procedures. Although the program has these aids, it continues to
function as a command-driven program. You enter commands in command
windows, and output windows show the results of these commands.

At this point, select Command Editor on the third toolbar of the
Shazam screen. The data window will recede into the background and a
new window will appear with a fourth toolbar. To obtain the variable
correlation coefficient matrix, enter the following command on the blank
command editor screen:
stat pcenrgy popdnsty pcincome imptergy tropics/pcor

Notice that if the abbreviations are correctly entered, Shazam will
recognize them and show them as blue characters.
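
To see what a correlation matrix of the variables contains, here is a minimal Python sketch that computes pairwise Pearson correlation coefficients. It is an illustration only, not Shazam's implementation; the helper names `pearson` and `correlation_matrix` are hypothetical, and only the first five observations of three variables are used to keep the example short. A correlation close to 1 or -1 between two independent variables is the usual warning sign of multicollinearity.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """columns maps variable name -> list of values; returns a nested dict."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}


# First five observations from the data set above (a subset, for brevity).
columns = {
    "pcenrgy": [1525, 5215, 3279, 5167, 772],
    "popdnsty": [13, 2, 97, 310, 19],
    "pcincome": [8570, 20540, 27980, 26420, 4720],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name.ljust(9), " ".join(f"{row[c]:7.3f}" for c in columns))
```

The diagonal entries are 1 because each variable is perfectly correlated with itself, and the matrix is symmetric, so only the entries below the diagonal need to be read.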

Next use the wizard to help write commands for the ordinary least
squares procedure. Select “Wizards” on the second toolbar and a window
appears that describes the purpose of “Wizards”. Use the wizard to
construct commands to complete the multiple regression project. Select
“Next” and a menu of choices will appear. Select Ordinary least
squares regression and “Next”. In the “Tasks to Perform” menu select
all of the boxes except the one for forecasting. Go to the next window
which is a summary of what you have chosen to this point. Move to the
following window by selecting “Next”.

A window now appears that allows you to select the dependent and
independent variables. Shade in per capita consumption of energy and
use the “Add” button to move it to the dependent variable box. Shade in
the other variables (population density, per capita income, imports of
energy, tropics) and add them to the independent variable list. Notice
lags could be introduced at this point. In this practice problem you do
not want to lag anything, however. If you want to use only a part of
the data you could specify the part which you will use at this point. In
this practice problem, you will use the entire sample so make sure the
“use existing” box is marked. Then go on to the next window. This
window gives a number of options that could be used in the regression.
For this regression nothing in the window will be selected. Supposedly,
by not selecting “suppress ANOVA” the program will automatically perform
analysis of variance. This feature does not work as it should: you
will have to put in a command to obtain analysis of variance. Notice
that “Model form” is “Linear”. If you wanted to do a regression in
logarithms, at this point you would change the linear to one of the
other options. In this practice problem, leave it as “Linear”. Go to
the next window that is a menu of diagnostics. Select “print observed,
predicted and residuals” and “heteroskedasticity tests” and move to the
next window. In the practice problem there are no restrictions, so
select “Next”. There are no hypotheses to specify, so move to the next
window. This window provides an opportunity to obtain confidence
intervals for the variable coefficients. Shade in all the variables and
move them to the selected side; go to the next window, which is entitled
Final Step. Be sure to select the “Generate commands and insert into
currently active editor” box. After you select “Finish” the wizard returns you to
the command editor box. You should see the following in the command
editor:

stat pcenrgy popdnsty pcincome imptergy tropics/pcor


ols pcenrgy popdnsty pcincome imptergy tropics
confid popdnsty pcincome imptergy tropics
diagnos / list het

Notice that new commands have been added to the command editor other
than the “stat” command specified earlier. You need to insert a command
to obtain analysis of variance. This is done by adding a slash and
“anova” after the variable list in the ols command. The command editor
window should now look like this:

stat pcenrgy popdnsty pcincome imptergy tropics/pcor


ols pcenrgy popdnsty pcincome imptergy tropics/anova
confid popdnsty pcincome imptergy tropics
diagnos / list het

Select “run” on the fourth toolbar and Shazam will complete all
the tasks you have selected for it. Be sure to look at the bottom of
the window for any errors or warnings. Pay attention to these because
they indicate data problems. You may need to correct your data.

Print
The print command will give you everything in the “Command Editor
(output)” window. A copy of the data is obtained by selecting the
“energy2.xls” (or whatever you have named the data) file on the third
toolbar and then print.

If you have problems with the data, correct them. You will then
have to reload the data. To begin a new regression, it is necessary to
obtain a new “command editor” box. This is done by selecting “New” on
the second toolbar. If you have several different “command editors” and
data sets, the third toolbar becomes filled and an arrow appears at the
right of the third toolbar to allow you to see all the previous command
editors and data sets.

It is wise at this point to consider the output from your commands
to be sure you have everything needed: correlation matrix of variables;
R squared and Adjusted R squared; analysis of variance; confidence
intervals; variable coefficients, t ratios, and p values; Durbin-Watson
statistic; and heteroskedasticity tests.
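
When reading the output, it can help to know how two of these numbers are defined. The Python sketch below uses the standard textbook formulas for the Durbin-Watson statistic and adjusted R squared; it is an illustration, not Shazam's code. The function names are hypothetical, and the residuals and the R squared value of 0.5 are made-up inputs, not results from the practice regression.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Adjust R squared for the number of regressors (intercept not counted)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


# Hypothetical residuals that alternate in sign (negative autocorrelation).
residuals = [1.0, -1.0, 1.0, -1.0]
print(durbin_watson(residuals))  # 3.0

# Hypothetical R squared of 0.5, with the 31 observations stated in the text
# and the four independent variables of the practice problem.
print(adjusted_r_squared(0.5, 31, 4))
```

A Durbin-Watson value near 2 indicates little first-order autocorrelation in the residuals; values toward 0 or 4 indicate positive or negative autocorrelation respectively.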
