E-mail: hazharstat@gmail.com
Research gate: Hazhar Blbas
Cell-phone#: 0750 481 3436
Hazhar Blbas 2
Objective of SPSS
Data Entry
Analysis of Data
Output
Basic structure of SPSS
SPSS has three windows
for working with data:
You must save the Data Editor window (.sav) and the Output Viewer window (.spv)
separately. Make sure to save both if you want to keep your changes to the data and
the analysis.
The Data Editor Window (sav)
Variable view
The place to enter variables
Rows define the variable characteristics:
Name, Type, Width, Decimals, Label, Values, Missing, Columns,
Align, Measure, Role
The Data Editor Window (sav)
Data view
The place to enter data
Rows are cases (records)
Columns are variables
Variable View window: Name
– The first character of the variable name must be alphabetic.
– Variable names must be unique and can be at most 64 characters long.
– Spaces and special characters (e.g., !, ?, ', and *) cannot be used.
– Variable names cannot end with a period.
– Reserved keywords cannot be used as variable names. These are ALL, AND,
BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, and WITH (in lower or upper case).
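The naming rules above can be checked before typing names into Variable View. The following is only a sketch in Python, not part of SPSS: the function name `is_valid_name`, the 64-character limit, the set of characters allowed after the first letter (letters, digits, period, underscore), and the case-insensitive uniqueness check are assumptions made for illustration.

```python
import re

# Reserved keywords from the list above (compared case-insensitively).
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT",
            "NE", "NOT", "OR", "TO", "WITH"}

# Assumption: a letter first, then only letters, digits, periods, or underscores.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9._]*")


def is_valid_name(name, existing=()):
    """Return True if `name` passes the naming rules listed above."""
    if not NAME_PATTERN.fullmatch(name):
        return False  # empty, bad first character, space, or special character
    if len(name) > 64:
        return False  # assumed maximum length
    if name.endswith("."):
        return False  # names cannot end with a period
    if name.upper() in RESERVED:
        return False  # reserved keyword in any letter case
    if name.upper() in {other.upper() for other in existing}:
        return False  # assumed: uniqueness ignores letter case
    return True


for candidate in ["New.height", "2age", "my var", "with", "age."]:
    print(candidate, is_valid_name(candidate))
```

Only the first candidate passes; the others break the first-character, space, reserved-keyword, and trailing-period rules in turn.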
Variable View window: Type
• The Type column shows Numeric for all rows. This means that
numeric values are expected in the dataset for these variables.
– Click on the ‘Type’ box to specify the type of a variable. The two basic
types of variables that you will use are numeric and string.
Variable View window: Width
Width determines the number of characters that
SPSS will allow to be entered for the variable.
Variable View window: Decimals
– Number of decimal places
– It must be less than or equal to 16
Variable View window: Label
– You can describe the variable in more detail
– The label can be up to 256 characters long and may contain spaces
– If you want to specify where a new line appears in a label, type
\n within the text and SPSS will wrap the label at this point.
Variable View window: Value labels
• Value labels indicate which numbers represent
which categories when the variable is categorical.
• The value and the label can each be up to 60 characters long.
• In the Value box enter the number (code), and in the Label box enter the
name of the code.
• After defining each value click Add, and when all values are defined click OK.
Variable View window: Missing Value
• Missing values are used to define user-specified missing
information.
– No response
– Refused to answer
– Data entry mistakes
Replacing missing values
• Click TRANSFORM
• Click REPLACE MISSING VALUES
• Select the variable with missing values and move it to the right using the
arrow
• SPSS will create a new, renamed variable that contains your filled-in data.
• Click METHOD to select the method that SPSS should use when
replacing missing values.
• Click OK and view your new data in Data View
Variable View window: Measurement
• The second-to-last column concerns the measurement
scale of your variable. In statistics, certain procedures
are only appropriate for variables measured on specific scales of
measurement. The measurement levels recognized by
SPSS are as follows:
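Nominal: “non-ordered” categories that cannot be ranked (e.g., eye color)
Ordinal: ordered categories that can be ranked (e.g., level of education)
Scale: numeric values measured on an interval or ratio scale (e.g., age)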
Variable View window: Role of variables
• The final column concerns the role your variable will
take in the analysis. This column is new in version 18. In
statistics, certain procedures are only appropriate for certain types
of variable.
The roles recognized by SPSS are as follows:
• Input: the variable can be used as an independent variable (predictor).
• Target: the variable is the outcome of the analysis.
• Both: the variable can be either a target or an input.
• None: no role is assigned.
• Partition: the variable can be used to partition the data, such as a variable
that defines a test or training data set.
• Split: this role is included for compatibility with other PASW programs.
There are very few procedures in version 18 that require the role to be
defined. We will leave all variables with the default role of Input.
Saving an SPSS data file (Cont.)
Example1: Save the variable that represents the
ages of 30 students to the Desktop in a file called
“Economic1”.
Age of students: 20, 21, 23, 25, 22, 23, 20, 26,
25, 24, 20, 20, 18, 23, 20, 22,
25, 20, 24, 22, 20, 24, 25, 25,
20, 19, 19, 18, 24, 25
Saving an SPSS data file
Solution1:
1. Select File in the Data Editor window and then select Save.
2. Ensure the desired directory is displayed in the Look in box.
3. Type Economic1 in the File name box and then click on Save.
4. The saved data appear in the file Economic1.sav.
Opening a file
Open the data from the previous example.
Solution:
1. Go back into SPSS.
2. Click on File > Open > Data.
3. Click on Economic1.sav in the file list. Then click Open.
Opening a data file that is
supplied with SPSS
Example2: Open the data file named “CATALOG.SAV”
in SPSS.
Solution2:
• Click on FILE > OPEN > DATA
• Click on MY COMPUTER > LOCAL DISK C:/
• Click on PROGRAM FILES (IBM) > SPSS Inc
• Click on PASW Statistics 18 > Samples > English
• Select CATALOG.SAV > Open
Saving an SPSS output file (Cont.)
Example3: Find the mean for Example1 and then
save an output file called Economic1.
Solution3:
1. Click on Analyze > Descriptive Statistics > Frequencies.
2. Transfer the Age of Students variable into the Variable(s) box.
3. Click on Statistics and then choose Mean under Central Tendency.
4. Click on Continue and then OK.
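As a cross-check outside SPSS, the mean that the Frequencies procedure reports for the 30 ages in Example1 can be reproduced with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative.

```python
from statistics import mean

# The 30 student ages listed in Example1.
ages = [20, 21, 23, 25, 22, 23, 20, 26,
        25, 24, 20, 20, 18, 23, 20, 22,
        25, 20, 24, 22, 20, 24, 25, 25,
        20, 19, 19, 18, 24, 25]

age_mean = mean(ages)  # arithmetic mean: sum of the ages divided by their count

print("N =", len(ages))
print("Mean =", round(age_mean, 2))
```

The ages sum to 662, so the mean is 662 / 30, roughly 22.07.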
Saving SPSS output file
5. Select File in the SPSS Viewer window and then select Save.
6. Ensure the desired directory is displayed in the Look in box.
7. Type Economic1 in the File name box and then click on Save.
Example4: How would you put the following information
into SPSS?
Exporting data from SPSS to Excel
• In SPSS, open the SPSS data file “Economic1” and then go to:
• File > Save As
• Select the type of file you want to save to (for example, Excel 2007 (*.xlsx))
and enter the file name.
Example5: For the following information in the Group A data, sort
the cases by the variable called Height.
Group A
Solution5:
How to Sort Cases
Data > Sort Cases... Select the variable named Height
and transfer it to the Sort by box, then tick
Ascending under Sort Order.
Click OK.
Sort Variables: You can sort the variables in the active dataset based
on the values of any of the variable attributes (e.g., variable name,
data type, measurement level). Values can be sorted in ascending or
descending order.
Solution 6:
How to Sort Variables
Data > Sort Variables...
Select Name in the Variable
View Columns list, then
tick Ascending under Sort Order.
Click on OK.
Weight Cases: This option is especially useful when you want to
carry out a chi-square test (see Nonparametric Tests > Chi-Square in
the Analyze menu). Usually, a cell in a data file represents one
observation for a particular case. However, on some occasions you
may want a cell to represent the frequency of occurrence of cases of a
particular variable. It is also useful for finding a weighted mean.
Example 9: Using SPSS, calculate the mean and the weighted mean for data
that represent a student's degrees and credits for his tests.
Solution 9:
1. Click on Analyze > Descriptive Statistics > Frequencies and then
move the Degree variable into the Variable(s) box.
From Statistics, tick Mean and then click on Continue.
Statistics
Degree
N      Valid      4
       Missing    0
Mean              72.50
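To show how the weighted mean differs from the plain mean, here is a small Python sketch. The degree and credit values are hypothetical, since the original data table is not reproduced here, and `weighted_mean` is an illustrative helper rather than an SPSS function. The idea matches Weight Cases: each degree counts as many times as its weight (the credits).

```python
from statistics import mean


def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical data: four test degrees and the credit weight of each test.
degrees = [65, 70, 75, 90]
credit_hours = [2, 3, 3, 2]

print("Mean:", mean(degrees))
print("Weighted mean:", weighted_mean(degrees, credit_hours))
```

With these made-up numbers the plain mean is 75 and the weighted mean is 74.5, because the highest degree carries a smaller weight than the two middle ones.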
Arithmetic operators
+ Addition
- Subtraction
* Multiplication
/ Division
** Exponentiation
Example 10: Using the data from Example 4, add a new variable
named ‘New.height’ that is the natural log (LN) of height.
Name      Gender   Height
JAUNITA   2        5.4
SALLY     2        5.3
DONNA     2        5.6
SABRINA   2        5.7
JOHN      1        5.7
MARK      1        6
ERIC      1        6.4
BRUCE     1        5.9
Value = 1 represents Male and Value = 2 represents Female
Solution 10:
• Click ‘Transform’ and then click ‘Compute Variable’.
• Type New.height in the ‘Target Variable’ box. Then type
‘ln(height)’ in the ‘Numeric Expression’ box. Click OK.
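The values that Compute Variable should produce can be checked by hand. The sketch below applies the natural log to the heights from the table in Example 10 using Python's `math.log`; the dictionary layout and variable names are illustrative only.

```python
import math

# Heights from the table in Example 10, keyed by name.
heights = {
    "JAUNITA": 5.4, "SALLY": 5.3, "DONNA": 5.6, "SABRINA": 5.7,
    "JOHN": 5.7, "MARK": 6, "ERIC": 6.4, "BRUCE": 5.9,
}

# New.height = ln(height): math.log with one argument is the natural log.
new_height = {name: math.log(height) for name, height in heights.items()}

for name, value in new_height.items():
    print(f"{name:8s} {value:.4f}")
```

For example, MARK has height 6, so New.height is ln(6), about 1.7918.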
A new variable ‘New.height’ is added to the table
Example 11: Create a new variable named “Sqrt.height”
which is the square root of height.
Solution 11:
A new variable ‘Sqrt.height’ is added to the table
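A sketch of what the new column should contain, assuming the same height data as the table in Example 10: each row gets a fourth value equal to the square root of its height. The tuple layout is an illustrative choice, not something SPSS requires.

```python
import math

# Rows from the table in Example 10: (name, gender code, height).
rows = [
    ("JAUNITA", 2, 5.4), ("SALLY", 2, 5.3), ("DONNA", 2, 5.6),
    ("SABRINA", 2, 5.7), ("JOHN", 1, 5.7), ("MARK", 1, 6),
    ("ERIC", 1, 6.4), ("BRUCE", 1, 5.9),
]

# Append the square root of height as a fourth column, like a new variable.
with_sqrt = [(name, gender, height, math.sqrt(height))
             for name, gender, height in rows]

for name, gender, height, sqrt_height in with_sqrt:
    print(f"{name:8s} {gender} {height:<4} {sqrt_height:.4f}")
```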
Count Values within Cases
Count creates a variable that counts the occurrences
(frequency) of the same value(s) in a list of
variables for each case. For example, a survey
might contain a list of magazines with yes/no check
boxes to indicate which magazines each respondent
reads. You could count the number of yes responses
for each respondent to create a new variable that
contains the total number of magazines read.
Example 12: The table contains the homework records
and the average of 5 students over 10 weeks in an SPSS
class (1: brought the homework, 0: missed the
homework). Count the number of homework assignments
brought by each student.
Solution 12: Steps to count occurrences of values within cases
In the Data Editor window select Transform.
From the Transform menu select Count Values within Cases....
Enter a target variable name.
Transfer all 10 week variables into the Numeric Variables box.
Click on Define Values, type the number 1 in the Value box, and then click
Add, Continue, and OK in turn.
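The counting step can be pictured with a short Python sketch. The 0/1 records below are hypothetical, because the original table is not reproduced here, and `count_value` is an illustrative helper: for each case (student) it counts how many of the week variables equal the target value 1.

```python
# Hypothetical homework records: one row per student, one 0/1 entry per week.
homework = {
    "Student 1": [1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    "Student 2": [1, 0, 0, 1, 1, 0, 1, 1, 0, 1],
    "Student 3": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Student 4": [0, 0, 1, 0, 1, 0, 0, 1, 0, 0],
    "Student 5": [1, 1, 1, 0, 1, 1, 1, 0, 1, 1],
}


def count_value(row, target=1):
    """Count how many entries in one case equal the target value."""
    return sum(1 for value in row if value == target)


# The new target variable: number of homework assignments brought.
brought = {student: count_value(weeks) for student, weeks in homework.items()}

for student, total in brought.items():
    print(student, total)
```

Changing `target` to 0 would count the missed homework instead, which is the same as entering 0 in the Value box.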
Homework: Find the homework degree by adding 5 degrees for each homework
brought and subtracting 5 degrees for each homework missed. Then calculate the
final average from the existing average column for the 5 students and the
homework degree that you have found.
Modifying variables – Recode
A. Recode into Same Variables
You can recode the values of a variable and still retain this variable in
a data file.
B. Recode into Different Variables
In some cases you may want to recode the values of a variable but
retain its original values. To achieve this, you need to recode the
original variable into a different variable.
Example 13: Sometimes you might want to use data in a
different form, such as looking at the age groups young (30
and under) and old (31 and above) rather than exact age.
Age of students: 30, 31, 33, 35, 32, 33, 30, 36, 35, 34, 30, 30, 28, 33,
30, 32, 35, 30, 34, 32, 30, 34, 35, 35, 30, 29, 29, 28,
34, 35
Solution 13: The values of age could be recoded to 1 and 2,
representing, say, young (1) and old (2).
Use Frequencies to tabulate the New.age variable
and check the results.
You should find that 12 subjects are aged 30 or less and
18 subjects are aged 31 or more.
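The expected counts can be verified outside SPSS. This sketch recodes the 30 ages from Example 13 into a new list, leaving the original values untouched as in Recode into Different Variables, and tabulates the result; the function and variable names are illustrative.

```python
from collections import Counter

# The 30 ages listed in Example 13.
ages = [30, 31, 33, 35, 32, 33, 30, 36, 35, 34, 30, 30, 28, 33,
        30, 32, 35, 30, 34, 32, 30, 34, 35, 35, 30, 29, 29, 28,
        34, 35]


def recode_age(age):
    """Return 1 (young) for ages 30 and under, 2 (old) for 31 and above."""
    return 1 if age <= 30 else 2


# A new variable; the original ages list is not modified.
new_age = [recode_age(age) for age in ages]

frequencies = Counter(new_age)
print("Young (1):", frequencies[1])
print("Old (2):", frequencies[2])
```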
Replace Missing Values
• Missing observations can be problematic in analysis. Some time
series measures cannot be computed if there are missing values in
the series. Sometimes the value for a particular observation is
simply not known. The data should be quantitative for replacing
missing values.
Steps for Replacing Missing Values
in Time Series Variables
In the Data Editor window select Transform.
From the Transform menu select Replace Missing Values...
Transfer the variable whose missing values you want to replace
into the New Variable(s) box.
Select the estimation method that you want, then enter the
new variable name and click on Change.
Finally, click on OK.
Estimation Methods for Replacing Missing Values
• Series mean: Replaces missing values with the mean for the entire
series.
• Mean of nearby points: Replaces missing values with the mean of valid
surrounding values (the mean of one or two values before and after the
missing value). The span of nearby points is the number of valid values
above and below the missing value used to compute the mean.
• Median of nearby points: Replaces missing values with the median of
valid surrounding values. This is especially useful when the data
contain outliers.
• Linear interpolation: Replaces missing values using a linear interpolation.
The last valid value before the missing value and the first valid value
after the missing value are used for the interpolation. If the first or last
case in the series has a missing value, that missing value is not replaced,
so this method cannot be used for it.
• Linear trend at point: Replaces missing values with the linear trend for
that point. The existing series is regressed on an index variable scaled 1
to n. Missing values are replaced with their predicted values.
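To make the estimation methods concrete, here is a minimal Python sketch of three of them (series mean, linear interpolation, and linear trend at point), written from the descriptions above rather than from SPSS itself. The example series is hypothetical, `None` marks a missing value, and SPSS's handling of edge cases may differ.

```python
from statistics import mean


def series_mean(series):
    """Replace each missing value with the mean of all valid values."""
    fill = mean(v for v in series if v is not None)
    return [fill if v is None else v for v in series]


def linear_interpolation(series):
    """Interpolate between the nearest valid neighbours on each side.

    Missing values at the start or end of the series are left as None.
    """
    result = list(series)
    valid = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(valid, valid[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = series[left] + step * (i - left)
    return result


def linear_trend(series):
    """Regress valid values on the index 1..n; fill gaps with predictions."""
    points = [(i + 1, v) for i, v in enumerate(series) if v is not None]
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    intercept = y_bar - slope * x_bar
    return [intercept + slope * (i + 1) if v is None else v
            for i, v in enumerate(series)]


# Hypothetical series with the first, a middle, and the last value missing.
example = [None, 10, 12, None, 16, 18, None]
print(series_mean(example))
print(linear_interpolation(example))
print(linear_trend(example))
```

Note how linear interpolation leaves the first and last values missing, while the series mean and the linear trend fill them in; this is why those two methods suit a variable whose gaps are at the ends.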
Example 14: Estimate the missing values for all variables in the
table below, using a suitable method for each:
Solution 14:
• I will run the Series mean method to fill in the missing value in the age
variable.
• I will run the Median of nearby points method with
span = 1 to fill in the missing value in the Num_Cigs variable.
• I will run the Linear trend at point method to fill in the missing
values in the Cinemas variable because the missing values are the
first and last data points in the variable. The Series mean
method could also be used.
Frequency Tables (Cont.)
Example 15: From the smoker table below, find the number of
males and females and then draw a bar chart for the same variable.
Frequency Tables (Cont.)
Solution 15:
In the Data Editor window select Analyze.
From the Analyze menu select Descriptive Statistics.
From the Descriptive Statistics submenu, select Frequencies
Select Sex of respondent.
Click the right-pointing arrow to move Sex into the Variable(s) box.
Frequency Tables
Producing a bar chart from frequencies
• To create a bar chart for sex using the Frequencies dialog box:
In the Analyze menu, click Descriptive Statistics.
From the Descriptive Statistics submenu, click Frequencies.
Select Sex of respondent [Sex], click on Charts,
choose Bar charts, and then click Continue and OK in turn.
Descriptive Statistics (Cont.)
Example 16: Find mean, median, mode, quartiles, standard
deviation, variance, range, minimum, maximum, standard error of
mean, skewness, and kurtosis for variable age and then draw a
Histogram for the same variable called age.
Solution 16:
Select Analyze > Descriptive Statistics > Frequencies
Select Age last birthday [age] > Click Statistics…
Descriptive Statistics (Cont.)
Displaying histogram
Use the Frequencies dialog box to request a histogram:
Select Analyze > Descriptive Statistics > Frequencies
Select Age last birthday [age] > Click Charts > Histogram(s)
To display a normal curve on the chart:
Select With normal curve > Click Continue and then click OK
Homework:
1- Find the frequency table and pie chart for the
Smoker variable.
2- Find mean, median, mode, quartiles, standard
deviation, variance, range, minimum, maximum,
standard error of mean, skewness, and kurtosis for the
Num_Cigs variable and then draw a histogram for
the same variable.
Testing Normality
Example 17: Is the weight of the students normally distributed or not?
135 119 106 135 180 108 128 160 143 175 170
205 195 185 182 150 175 190 180 195 220 235
Solution 17:
First of all, we have to write the null and alternative hypotheses
for checking normality:
H0: the data are normally distributed
Ha: the data are not normally distributed
Second, the procedure for testing normality:
• Analyze > Descriptive Statistics > Explore
• Put the weight variable into the Dependent List and tick Both
• Click on Plots and then tick Normality plots with tests
The p-value of 0.692 from the Shapiro-Wilk test of normality is
greater than 0.05, which implies that it is acceptable to assume
that the weight of the students is normally distributed (bell-shaped).
One Samples T-test (Cont.)
Null hypothesis is H0: μ = 140.
Alternative hypothesis is H1: μ ≠ 140.
The one-sample t-test statistic is 3.582 and its p-value
is 0.002, which is less than 0.05 (the level of
significance usually used for the test). Such a p-value indicates
that the average weight of the sampled population is statistically
significantly different from 140 lb. The 95% confidence interval
estimate for the difference between the population mean weight
and 140 lb is (11.27, 42.46).
Example 19: Test whether light bulbs have a life of 1000
hours at α = .05
800, 750, 940, 970, 790, 980, 820, 760, 1000, 860
Solution 19:
– Null hypothesis is H0: μ = 1000.
– Alternative hypothesis is H1: μ ≠ 1000.
One-Sample Statistics
           N    Mean       Std. Deviation   Std. Error Mean
BULBLIFE   10   867.0000   96.7299          30.5887
One-Sample Test
Because the p-value (Sig. (2-tailed)) is less than .05, we reject H0. So
the mean bulb life is significantly different from 1000 hours.
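The figures in the One-Sample Statistics table can be reproduced by hand from the bulb data in Example 19. The Python sketch below computes the mean, standard deviation, standard error, and t statistic with the standard library; instead of a p-value it compares |t| with 2.262, the two-tailed critical value from a standard t table for α = .05 and 9 degrees of freedom. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Bulb lifetimes (hours) from Example 19 and the hypothesized mean.
bulb_life = [800, 750, 940, 970, 790, 980, 820, 760, 1000, 860]
mu0 = 1000

n = len(bulb_life)
sample_mean = mean(bulb_life)
sample_sd = stdev(bulb_life)        # sample standard deviation (n - 1)
std_error = sample_sd / sqrt(n)     # standard error of the mean
t_stat = (sample_mean - mu0) / std_error

# Two-tailed critical value for alpha = .05 with df = n - 1 = 9 (t table).
T_CRITICAL = 2.262
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"N = {n}, mean = {sample_mean:.4f}, sd = {sample_sd:.4f}")
print(f"std. error = {std_error:.4f}, t = {t_stat:.3f}, df = {n - 1}")
print("Reject H0:", reject_h0)
```

The mean, standard deviation, and standard error agree with the table above, and |t| is well beyond the critical value, which matches the decision to reject H0.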
Independent Samples T-test (Cont.)
• Independent samples t-tests are used to test whether two groups have the
same mean on a variable that has a continuous
normal distribution.
Example 20: A variable in this data set that has a continuous normal
distribution is age, and the two groups might be the
people who smoke and the people who do not smoke,
as defined by the question ‘Do you smoke?’
We are testing whether the mean of the
variable age is statistically significantly different
between smokers and non-smokers.
Solution 20: First of all, we check the normality of the variable age as we
did for the one-sample t-test. The p-value of 0.269 from the Shapiro-Wilk
test of normality is greater than 0.05, which implies that it is acceptable to
assume that the age distribution is normal (bell-shaped).
Independent Samples T-test (Cont.)
• To perform the independent samples t-test:
– Null hypothesis is H0: μ1 = μ2
– Alternative hypothesis is H1: μ1 ≠ μ2
– Analyze > Compare Means > Independent Samples T-Test
• Put the age variable in the Test Variable box and put the Smoker variable
in the Grouping Variable box
• Select Define Groups: type 1 in the Group 1 box and type 2 in the
Group 2 box
• Select Continue and then OK
Independent Samples T-test (Cont.)
Saving output from SPSS into Word
Make sure you are in the SPSS Viewer window.
• From the File menu select Export.
• In the Objects to Export box, ensure that All Visible objects is selected
• Under Document Type select Word/RTF file (*.doc) from the drop down menu.
• In the File Name box type C:\Training\Stats\wordoutput.
• Click OK.
Paired Samples t-Test (Cont.)
The paired t-test is appropriate for data in which the two
samples are paired in some way. This type of analysis is
appropriate for three separate data collection scenarios:
• Pairs consist of before and after measurements on a single group
of subjects or patients.
• Two measurements on the same subject or entity (right and left
eye, for example) are paired.
• Subjects in one group (e.g., those receiving a treatment) are
paired or matched on a one-to-one basis with subjects in a second
group (e.g., control subjects).
In all cases, the data to be analyzed are the differences within
pairs (e.g., the right eye measurement minus the left eye
measurement). The difference scores are then analyzed with a
one-sample t-test.
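The reduction of a paired test to a one-sample test on the differences can be sketched in a few lines of Python. The before/after weights below are hypothetical and `paired_t` is an illustrative helper, not an SPSS function: it forms the differences, then computes t as the mean difference divided by its standard error, with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, df) for a paired t-test of H0: mean difference = 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for b, a in zip(before, after)]  # d = before - after
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))      # one-sample t on d
    return t, n - 1


# Hypothetical weights (kg) of five subjects before and after a treatment.
before = [82, 90, 77, 95, 88]
after = [80, 87, 78, 91, 85]

t_stat, df = paired_t(before, after)
print(f"t({df}) = {t_stat:.3f}")
```

A positive t here means the before values are larger on average than the after values; whether it is significant still depends on the t table or the p-value that SPSS reports.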
Paired Samples t-Test (Cont.)
Example 21: Does the Diet Work? A developer of
a new diet is interested in showing that it is effective.
He randomly chooses 15 subjects to go on the diet
for 1 month. He weighs each subject before and after
the 1-month period to see whether there is evidence
of weight loss at the end of the month.
Solution 21:
The basic assumption for the paired t-test to be valid
when you have small sample sizes is that the difference
scores (d = before - after, with mean μd) are normally distributed.
There is evidence that the mean weight loss is positive, that is, that
the diet is effective in producing weight loss, t(14) = 2.567, one-tailed
p = 0.01 (the test is one-tailed because the alternative hypothesis is μd > 0).
Useful Shortcut Keys
• Ctrl+Home will take you to the top of the file. In a Data file this will be
the first case and the first variable.
• Ctrl+End will take you to the end of the file. In a Data file this will be the
last case and the last variable.
• Page Up and Page Down will scroll up / down, one page (screen) at a
time.
• Ctrl+Page Up and Ctrl+Page Down will scroll right / left, one page
(screen) at a time.
• In a Data file, Home will take you to the left-hand end of the current row,
i.e. to the first variable and the current case.
• In a Data file, End will take you to the right-hand end of the current row, i.e.
to the last variable and the current case.
• In an Output file, Ctrl+A will select the entire output (tables and graphs).
• At many stages (in the Data Editor or in the Output Viewer), clicking the