Quantitative Techniques in Business (QTB)
Quantitative techniques are the techniques used in business to
Develop effective policies and business-related strategies
Make effective decisions to achieve business goals efficiently
Research is based on QTB
Final thesis is based on QTB
Prof.Muhammad Ilyas ,std.Muhammad Saeed
Research Problem
A research problem is any problem or opportunity that needs to be addressed through the research process of data collection and analysis.
Examples
Human Resource manager wants to develop HR policies regarding
Problem Statement
A problem statement is a clear and concise description
Example
Measure the annual turnover of employees in Higher educational
sector of Pakistan
Does advertisement contribute to the sales of a new product in the market?
Which of the two options, i.e. stock market or real estate, is better for investment?
What is Variable?
Vary + able = Change + able
Variable is a characteristic of anything that can vary
(Change).
Examples
Gender (Male, Female)
Age (20 years, 30 years, 50 years)
Motivation level (High, Medium, Low)
Types of Variable
with respect to relation
Budget
Advertisement
Awareness
Sales
Types of Variable
with respect to data
Variable
  Categorical
    Nominal: Gender (1. Male, 2. Female)
    Ordinal: Motivation (1. Highly Motivated, 2. Moderately Motivated, 3. Less Motivated)
  Numerical
    Discrete: 1. No of students, 2. No of chairs, 3. Collar size
    Continuous: 1. Height, 2. Weight, 3. Speed
Categorical Variable
A variable whose values are not numerical in nature
Variables
Values
Gender
Male, female
Religion
Motivation level
Numerical Variable
A variable whose values are numerical in nature
Variables
Values
Collar size
Height
No of employees
Research Question
A research problem needs to be translated into one or more research questions.
A research question is an interrogative statement that seeks the tentative relationship among variables and clarifies what the researcher wants to answer.
Example
What is the impact of advertisement on sales of a new product in the market?
What is the annual turnover of employees in higher educational institutions of Pakistan?
Does investing in the stock market yield more return on investment as compared to investment in real estate?
Associational:
the market
Difference:
and
Research Hypotheses
Research hypotheses are predictive statements about
1. Null Hypothesis
H0 = There is no relationship between advertising and sales
2. Alternative Hypothesis
H1 = There is a relationship between advertising and sales
Research Question vs Hypothesis
Research question: interrogative statement, non-predictive, non-directional
Hypothesis: simple statement, predictive, directional
Activity
In groups of four, use the variables provided to write:
An associational question
A difference question
A descriptive question
Data
A set of raw facts and figures is called data.
Types of Data
Nature: Qualitative, Quantitative
Time frame: Cross-sectional, Time-series
Thank You!
Lecture #2
I am really thankful to my gorgeous teachers Sir
Dr.Muhammad Ilyas , for that great knowledge.
DATA
A set of raw facts and figures is called data.
Example: Age: 16, 18, 20, 21, 23
Nationality: Pakistani, Indian, American
TYPES OF DATA
Data
  Nature: Quantitative Data, Qualitative Data
  Time Frame: Cross-Sectional Data, Time Series Data, Longitudinal Data
  Source: Primary Data, Secondary Data
TYPES OF DATA
On the basis of Nature
SOURCES OF DATA
Primary Data Source: Primary data is such data
HOW TO COLLECT PRIMARY DATA
The survey method is used to collect primary data.
WHAT IS A SURVEY?
Survey Design
1. Objectives of Survey
2. Survey Design
3. Pilot Test
4. Field work/Data Collection
5. Data Preparation
6. Data Analysis and Interpretation
7. Discussion and Conclusion
8. Report Writing
1: Objectives of Survey
The first step of survey design is to clearly define why we are going to conduct the survey.
2: Survey Design
How to survey
(Method)
How to develop a Questionnaire?
1. Decide what information is required.
2. Draft some questions on each variable to elicit the information.
3. Put them into a meaningful order and format.
4. Pre-test the questionnaire.
5. Go back to Step 1, and continue until the questionnaire is perfect.
3: Pilot Test
It is process of checking/assessing the accuracy of the
4: Fieldwork/conduct a survey
It is the process of actually collecting data from the target sample. It can be done in the following ways:
Postal survey
Online survey
5: Data Preparation
After completing your survey and getting to know the SPSS interface, the next step is to prepare the data for analysis. This process involves four steps.
1. Coding the questionnaire.
2. Defining the variables in SPSS variable view.
3. Entering the data in SPSS data view.
4. Checking for errors.
Descriptive Analysis
Inferential Analysis
Interpretation
Interpretation is a process of making sense of results
Report writing
Clarity of thoughts
Complete and self explanatory
Comprehensive and compact
Accurate in all aspects
Support facts
Suitable format for readers
Proper date and signature
reference
Reliable sources
Logical manner
8: Report Writing
Lecture #3
Introduction to SPSS
Before processing the data any further, we should first get to know the SPSS software.
SPSS
SPSS stands for Statistical Package for the Social Sciences. It is mainly used for the analysis of quantitative data.
Programs > SPSS Inc. > SPSS 16.0
SPSS Interface
Title Bar
Menu Bar
Tool Bar
Variable definition criteria
Serial Number / Cases
Work sheet
SPSS Views
Data Entry
After defining the variables enter the data in data view for each
case (row wise) against each variable (column wise)
Data Processing
After the data are collected, data processing begins. It involves
1. Data coding
2. Defining the variables
3. Data entry in the software
4. Checking for errors
SAMPLE QUESTIONNAIRE
Please circle or supply your answer
ID_________
SD
SA
1 2 3 4
4 5
5
5. My GPA is
_____________
Coding
Coding is the process of assigning numbers to the values or
levels of each variable.
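As a minimal sketch of coding, the snippet below assigns numbers to the levels of two variables. The codebook, its variable names, and the specific code values are illustrative assumptions, not the lecture's own coding scheme.

```python
# Hypothetical codebook: each variable maps its levels to numeric codes.
CODEBOOK = {
    "gender": {"Male": 1, "Female": 2},
    "motivation": {"Low": 1, "Medium": 2, "High": 3},
}


def code_response(response):
    """Return the numeric codes for one questionnaire response.

    Blank or unrecognized answers are coded as None so that they can be
    treated as missing values later.
    """
    return {
        variable: CODEBOOK[variable].get(answer)
        for variable, answer in response.items()
    }


coded = code_response({"gender": "Female", "motivation": "High"})
print(coded)  # {'gender': 2, 'motivation': 3}
```

Keeping the codes in one place makes it easy to enter the same numbers consistently when the variables are later defined in the statistics package.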
Lecture 4
Lecture #5
Session Objectives
After this session the students will be able to analyze the collected data using descriptive statistics by
Producing summaries of data in both tabular and graphical forms
Calculating central tendency using the mean, median and mode
Calculating the dispersion of data using the range, IQR and standard deviation
Checking whether the data are normally distributed using the normal curve
Analyzing Data
Data analysis is the process of breaking down complex data to gain a better understanding of it.
There are two types of statistics
Descriptive statistics
Inferential statistics
In this session we will work on descriptive
statistics
SUPERIOR
GROUP OF COLLEGES
Descriptive statistics
Descriptive statistics are used to Describe, Summarize,
Organize, and Simplify data in quantitative terms. We will
cover
1. Summarizing Numerical Data
2. Measures of Central Tendency
3. Measurement of Dispersion
4. Checking Data Normality
1. Summarizing
Variable
  Categorical: Frequency Distribution Table, Bar chart
  Numerical: Five Figure Summary, Box Plot / Histograms
Frequency Distribution.
A frequency distribution is a tally or count of the number of times each score on a single variable occurs.
Analyze > Descriptive Statistics > Frequencies (with the frequency tables box checked)
                            Percent   Valid Percent   Cumulative Percent
Valid     Muslims             40.0        44.8              44.8
          Christians          30.7        34.3              79.1
          Hindus              18.7        20.9             100.0
          Total               89.3       100.0
Missing   other religion       5.3
          blank                5.3
          Total               10.7
Total (N = 75)               100.0
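The three percentage columns can be reproduced from raw counts, as sketched below. The counts are not printed in the output above; they are inferred here from the percentages and the total of 75 cases, so treat them as an assumption.

```python
# Counts inferred from the table's percentages and N = 75 (an assumption;
# the frequency column itself is not shown in the output).
valid_counts = {"Muslims": 30, "Christians": 23, "Hindus": 14}
missing_counts = {"other religion": 4, "blank": 4}

n_valid = sum(valid_counts.values())
n_total = n_valid + sum(missing_counts.values())

rows = []
cumulative = 0.0
for label, count in valid_counts.items():
    percent = 100 * count / n_total        # share of all cases
    valid_percent = 100 * count / n_valid  # share of non-missing cases
    cumulative += valid_percent            # running total of valid percent
    rows.append(
        (label, round(percent, 1), round(valid_percent, 1), round(cumulative, 1))
    )

for row in rows:
    print(row)
```

Percent uses all 75 cases as the base, while Valid Percent uses only the 67 non-missing cases, which is why the two columns differ.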
Bar Charts
With nominal data, it is better to make a bar graph or chart of the frequency distribution of variables like religion, ethnic group, or other nominal variables; the points that happen to be adjacent in your frequency distribution are not necessarily adjacent.
To get a bar chart select
Graphs > Legacy Dialogs > Interactive > Bar chart > OK
1. Minimum value
2. Maximum value
3. Median
4. Lower Quartile
5. Upper Quartile
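The five figures listed above can be computed with Python's standard library, as in this minimal sketch. The scores are illustrative, and the "exclusive" quartile method is one common convention chosen here as an assumption; other software may place the quartiles slightly differently.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="exclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


# Illustrative scores, used only to show the calculation.
scores = [4, 6, 8, 14, 8, 7, 8, 5, 16, 5, 3, 6, 6, 10, 6, 8, 18, 12]
print(five_number_summary(scores))  # (3, 5.75, 7.5, 10.5, 18)
```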
4 6
8 14
8 7 8
5 16 5
3
6 6
10 6 8 18
12
Graphs > Legacy Dialogs > Interactive > Box plot > OK
Interpretation
The case processing summary table shows valid N=75, with no missing values, for the total sample of 75 on the variable math achievement. The plot shows a box plot for math achievement. The box represents the middle 50% of the cases, and the line inside it marks the median (M=13). The lower end of the box shows the lower quartile (Q1=7.67), and the upper end of the box shows the upper quartile (Q3=17.00). The whiskers indicate the expected range (25.33) of scores from the minimum (Min=-1.67) to the maximum (Max=23.67). Scores outside this range are considered unusually high or low; such scores are called outliers. There are no outliers in this case.
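The "no outliers" conclusion can be checked numerically. The sketch below assumes the common 1.5 x IQR rule for outlier limits (the interpretation itself does not state which rule was used) and plugs in the quartiles and extremes quoted above.

```python
def outlier_limits(q1, q3, k=1.5):
    """Return the (lower, upper) limits beyond which a score counts as an outlier."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles and extremes quoted in the interpretation.
q1, q3 = 7.67, 17.00
minimum, maximum = -1.67, 23.67

lower, upper = outlier_limits(q1, q3)
has_outliers = minimum < lower or maximum > upper
print(round(lower, 1), round(upper, 1), has_outliers)
```

Because both the minimum and the maximum fall inside the limits, no score is flagged, which agrees with the interpretation.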
Histogram
Histograms are just like bar graphs, but there is no space between the boxes, indicating that there is a continuous variable theoretically underlying the scores. Histograms can be used even if the data, as measured, are not continuous, if the underlying variable is conceptualized as continuous.
To draw a histogram select:
Graphs > Legacy Dialogs > Interactive > Histogram > OK
Interpretation
In this histogram, the frequencies (number of students) shown by the bars are for ranges of points (in this case SPSS selected a range of 50: 250-299, 300-349, 350-399, etc). Notice that the largest number of students (about 20) had scores in the middle two bars of the range (450-499 and 500-549). Similarly small numbers of students have very low and very high scores. The bars in the histogram form a distribution (pattern or curve) that is similar to the normal, bell-shaped curve. Thus, the frequency distribution of the SAT math scores is said to be approximately normal.
Exercise
Analyze
Descriptive statistics
Frequencies
click continue
click on statistics
mark
Ok
Statistics
scholastic aptitude test - math
N        Valid        75
         Missing       0
Mean              490.53
Median            490.00
Mode                 500
Measures of Variability
Range: The range (highest minus lowest score) is the crudest measure of variability but does give an indication of the spread in scores if they are ordered.
Interquartile range (IQR): IQR = Q3 - Q1
Standard Deviation: The standard deviation is based on the deviation (x) of each score from the mean of all scores.
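The measures of central tendency and variability described above can be sketched with the standard library as follows. The scores are hypothetical, and the quartiles use Python's default "exclusive" method, which is an assumption rather than the lecture's stated rule.

```python
import statistics

# Hypothetical test scores, used only to illustrate the calculations.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

score_range = max(scores) - min(scores)        # highest minus lowest score
q1, _, q3 = statistics.quantiles(scores, n=4)  # default "exclusive" method
iqr = q3 - q1                                  # IQR = Q3 - Q1
sd = statistics.stdev(scores)                  # sample SD (n - 1 in the denominator)

print(mean, median, mode, score_range, iqr, round(sd, 2))
```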
Analyze
Descriptive statistics
Frequencies
click continue
click on statistics
mark
Ok
Descriptive Statistics
The Normal Curve
The frequency distributions of many of the variables used in the behavioral
sciences are distributed approximately as a normal curve when N is large.
Properties of Normal Curve
1. The mean, median and mode are equal.
2. It has one hump and this hump is in the middle of the distribution.
3. The curve is symmetric. If you fold the normal curve in half, the right side would
fit perfectly with the left side; that is, it is not skewed.
4. The range is infinite.
5. The curve is neither too peaked nor too flat and its tails are neither too short nor
too long.
                          Nominal   Dichotomous   Ordinal   Normal
Frequency Distribution    Yes       Yes           Yes       OK
Bar Chart                 Yes       Yes           Yes       OK
Histogram                 No        No            OK        Yes
Frequency Polygon         No        No            OK        Yes
Box Plot                  No        No            Yes       Yes
Central Tendency
  Mean                    No        OK            OK        Yes
  Median                  No        OK            Yes       OK
  Mode                    Yes       Yes           OK        OK
Variability
  Range                   No        Always 1      Yes       Yes
  Standard Deviation      No        No            OK        Yes
  Interquartile Range     No        No            OK        OK
  Number of categories    Yes       Always 2      OK        No
Shape
  Skewness                No        No            Yes       Yes
Lecture #6
Lesson Objectives
After studying this session you will be able to
Understand and infer results from data in order to answer associational and difference research questions using different parametric and non-parametric tests
Understand, implement and interpret the chi-square, phi and Cramer's V
Understand, implement and interpret the correlation statistics
Understand, implement and interpret the regression statistics
Understand, implement and interpret the t-test statistics
Lesson Outline
1. Non-parametric tests
   1. Chi-square / Fisher exact
   2. Phi and Cramer's V
   3. Kendall tau-b
2. Parametric tests
   1. Correlation
      1. Pearson correlation
      2. Spearman correlation
   2. Regression
      1. Simple regression
      2. Multiple regression
   3. T-test
      1. One-sample t-test
      2. Independent sample t-test
      3. Paired sample t-test
Inferential Statistics
Inferential statistics are used to make inferences
(conclusions) about a population from a sample
based on the statistical relationships or differences
between two or more variables using statistical tests
with the assumption that sampling is random in
order to generalize or make predictions about the
future.
Inferential Statistics
Inferential statistics are used
To test some hypothesis either to check relationship between
Confidence Interval
Confidence interval is a range of values constructed for a
variable of interest so that this range has a specified probability
of including the true value of the variable. The specified
probability is called the confidence level, and the end points of
the confidence interval are called the confidence limits.
It is one of the alternatives to null hypothesis significance testing
(NHST).
No effect
Small effect
Medium/typical effect
No relationship
Weak relationship
Moderate relationship
>0.70 <1
1
Large effect
Maximum effect
Strong relationship
Perfect relationship
or accepted p value
4. State what is the direction of the effect
5. Conclude the results
Chi-Squared Test
Assumptions and Conditions for the Chi-Squared
test
The data of the variables must be independent.
Both the variables should be nominal.
All the expected counts are greater than 1 for chi-square.
At least 80% of the expected frequencies should be greater
than or equal to 5.
Chi-Squared Test
Checking Assumptions and Conditions for the Chi-Squared test
geometry in h.s. * gender Crosstabulation
                                                    gender
                                                male    female    Total
geometry in h.s.   not taken   Count             10       29        39
                               Expected Count    17.7     21.3      39.0
                               % of Total        13.3%    38.7%     52.0%
                   taken       Count             24       12        36
                               Expected Count    16.3     19.7      36.0
                               % of Total        32.0%    16.0%     48.0%
Total                          Count             34       41        75
                               Expected Count    34.0     41.0      75.0
                               % of Total        45.3%    54.7%    100.0%
Case Processing Summary
                               Valid            Missing          Total
                               N    Percent     N    Percent     N    Percent
geometry in h.s. * gender      75   100.0%      0    .0%         75   100.0%

Chi-Square Tests
                               Value     df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square             12.714a   1    .000
Continuity Correction(b)       11.112    1    .001
Likelihood Ratio               13.086    1    .000
Fisher's Exact Test                                                    .000                   .000
Linear-by-Linear Association   12.544    1    .000
N of Valid Cases               75
Symmetric Measures
                                   Value    Approx. Sig.
Nominal by Nominal   Phi           -.412    .000
                     Cramer's V     .412    .000
N of Valid Cases                    75
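The expected counts, the Pearson chi-square value and phi reported in these tables can be reproduced by hand from the observed counts. The sketch below uses only the four cell counts from the crosstabulation; the p value is not computed because the standard library has no chi-square distribution.

```python
import math

# Observed counts from the crosstabulation.
# Rows: geometry not taken, taken. Columns: male, female.
observed = [[10, 29], [24, 12]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for a cell = (row total * column total) / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Signed phi for a 2x2 table: (ad - bc) / sqrt(product of the marginal totals).
(a, b), (c, d) = observed
phi = (a * d - b * c) / math.sqrt(math.prod(row_totals) * math.prod(col_totals))

print([[round(e, 1) for e in row] for row in expected])
print(round(chi_square, 3), round(phi, 3))  # 12.714 -0.412
```

All four expected counts are well above 5, so the conditions listed for the chi-square test are met for this table.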
Interpretation:
To check the association between gender and geometry in h.s., a chi-square test is conducted. The case processing summary table indicates that there is no participant with a missing value. The assumptions are checked through crosstabs. The Crosstabulation table includes the Counts and Expected Counts, and their relative percentages. The result shows that there are 24 males who had taken geometry, which is 71% of the total 34 male students. On the other hand, 12 of 41 females took geometry; that is only 29% of the females. It looks like a higher percentage of males took geometry than female students. The Chi-Square Tests table tells us whether we can be confident that this apparent difference is not due to chance.
Note, in the Crosstabulation table, that the Expected Count of the number of male students who didn't take geometry is 17.7 and the observed or actual Count is 10. Thus, there are 7.7 fewer males who didn't take geometry than would be expected by chance, given the Totals shown in the table. There are similar discrepancies between observed and expected counts in the other three cells of the table. A question answered by the chi-square test is whether these discrepancies between observed and expected counts are bigger than one might expect by chance.
The Chi-Square Tests table is used to determine if there is a statistically significant relationship between two dichotomous or nominal variables. It tells you whether the relationship is statistically significant but does not indicate the strength of the relationship, like phi or a correlation does. In the output, we use the Pearson Chi-Square or (for small samples) the Fisher's exact test to interpret the results of the test. They are statistically significant (p < .001), which indicates that we can be quite certain that males and females are different on whether they take geometry.
Phi is -.412, and like the chi-square, it is statistically significant. Phi is also a measure of effect size for an associational statistic and, in this case, the effect size is moderate according to Cohen (1988).
KENDALL'S TAU-B
If the variables are ordered (i.e. ordinal), you have several other choices.
We will use Kendall's tau-b in this problem.
Case Processing Summary
           N    Percent
Valid      73   97.3%
Missing     2   2.7%
Total      75   100.0%

Crosstabulation
                                                      Total
         Count             43       8        2         53
         Expected Count    35.6     13.1     4.4       53.0
         % of Total        58.9%    11.0%    2.7%      72.6%
         Count              6      10        2         18
         Expected Count    12.1     4.4      1.5       18.0
         % of Total         8.2%    13.7%    2.7%      24.7%
         Count              0       0        2          2
         Expected Count     1.3      .5       .2        2.0
         % of Total          .0%     .0%     2.7%       2.7%
Total    Count             49      18        6         73
         Expected Count    49.0     18.0     6.0       73.0
         % of Total        67.1%    24.7%    8.2%     100.0%
Symmetric Measures
                                         Value   Asymp. Std. Error   Approx. T   Approx. Sig.
Ordinal by Ordinal   Kendall's tau-b     .494    .108                3.846       .000
N of Valid Cases                         73
Interpretation:
To investigate the relationship between father's education and mother's education, Kendall's tau-b was used. The analysis indicated a significant positive association between father's education and mother's education, tau = .572, p < .001. This means that more highly educated fathers were married to more highly educated mothers and less educated fathers were married to less educated mothers. This tau is considered to be a large effect size (Cohen, 1988).
Interpretation
Eta was used to investigate the strength of the association
between gender and number of math courses taken
(eta=.33). This is a weak to medium effect size (Cohen,
1988). Males were more likely to take several or all the math
courses than females.
Lesson Outline
1. Non-parametric tests
   1. Chi-square / Fisher exact
   2. Phi and Cramer's V
   3. Kendall tau-b
   4. Eta
2. Parametric tests
   1. Correlation
      1. Pearson correlation
      2. Spearman correlation
   2. Regression
      1. Simple regression
      2. Multiple regression
   3. T-test
      1. One-sample t-test
      2. Independent sample t-test
      3. Paired sample t-test
Correlation
Correlation is a statistical process that determines the mutual (reciprocal)
value.
The direction of relationship that is defined by the sign (+,-) of the test value
The strength of relationship that is defined by the test value
Correlation Coefficient (r)
The correlation coefficient measures the strength of the linear relationship between two numerical variables. The value of the correlation coefficient can vary from -1.0 (a perfect negative correlation or association) through 0.0 (no correlation) to +1.0 (a perfect positive correlation). Note that +1 and -1 are equally high or strong.
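As a minimal sketch of how the coefficient is obtained, the function below computes Pearson's r directly from its definition. The advertising and sales figures are hypothetical and serve only to show a positive r between 0 and +1.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: advertising spend and sales for five periods.
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
print(round(pearson_r(advertising, sales), 3))  # 0.775
```

The sign of the result gives the direction of the relationship and its absolute size gives the strength.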
Correlation
Scores on one variable are normally distributed for each value of the
Correlation
Checking the assumptions for Pearson Correlation
The assumptions for correlation test are checked through
normal curve (normality assumption) and the scatter plot
(linearity assumption)
Statistics
                           math achievement test   scholastic aptitude test - math
N    Valid                 75                      75
     Missing               0                       0
Skewness                   .044                    .128
Std. Error of Skewness     .277                    .277
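The standard error of skewness in this table depends only on the sample size, so it can be checked directly. The sketch below uses the usual formula for the standard error of sample skewness; the cut-off of about 2 for the ratio skewness / SE is a common rule of thumb adopted here as an assumption, not a rule stated in the lecture.

```python
import math


def skewness_standard_error(n):
    """Standard error of sample skewness for a sample of size n (n > 2)."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


n = 75
se = skewness_standard_error(n)
print(round(se, 3))  # 0.277

# Skewness values from the Statistics table.
for name, skew in [
    ("math achievement test", 0.044),
    ("scholastic aptitude test - math", 0.128),
]:
    ratio = skew / se
    # Assumed rule of thumb: |skewness / SE| below about 2 is treated as
    # consistent with an approximately normal shape.
    print(name, round(ratio, 2), abs(ratio) < 2)
```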
Correlations
                                                         math achievement test   scholastic aptitude test - math
math achievement test             Pearson Correlation    1                       .788**
                                  Sig. (2-tailed)                                .000
                                  N                      75                      75
scholastic aptitude test - math   Pearson Correlation    .788**                  1
                                  Sig. (2-tailed)        .000
                                  N                      75                      75
**. Correlation is significant at the 0.01 level (2-tailed).
Interpretation
To investigate if there was a statistically significant association between the scholastic aptitude test and math achievement, a correlation was computed. Both variables were approximately normal and there is a linear relationship between them, hence fulfilling the assumptions for Pearson's correlation. Thus, Pearson's r was calculated, r = .79, p < .001, indicating that there is a highly significant relationship between the variables. The positive sign of the Pearson's test value shows that the relationship is positive, which means that students who have high scores on the math achievement test tend to have high scores on the scholastic aptitude test and vice versa. Using Cohen's (1988) guidelines, the effect size is large, indicating a strong relationship between math achievement and the scholastic aptitude test.
Correlation
Correlations
                                                                     mother's education   math achievement test
Spearman's rho   mother's education      Correlation Coefficient     1.000                .315**
                                         Sig. (2-tailed)                                  .006
                 math achievement test   Correlation Coefficient     .315**               1.000
                                         Sig. (2-tailed)             .006
**. Correlation is significant at the 0.01 level (2-tailed).
Interpretation
To investigate if there was a statistically significant association between mother's education and math achievement, a correlation was computed. Mother's education was skewed (skewness = 1.13), which violated the assumption of normality. Thus, the Spearman rho statistic was calculated, rs(73) = .32, p = .006. The direction of the correlation was positive, which means that students who have highly educated mothers tend to have higher math achievement test scores and vice versa. Using Cohen's (1988) guidelines, the effect size is medium for studies in this area. The r² indicates that approximately 10% of the variance in
REGRESSION ANALYSIS
Regression analysis is used to measure the relationship between two or
Regression Equation
Y = a + bx
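To make the equation concrete, the sketch below estimates a (the intercept) and b (the slope) by ordinary least squares. The data are hypothetical and were chosen to lie exactly on the line Y = 1 + 2X so that the result is easy to verify.

```python
def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: covariation of X and Y divided by the variation of X.
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    # Intercept: the line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data lying exactly on Y = 1 + 2X.
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]
a, b = fit_simple_regression(x, y)
print(a, b)  # 1.0 2.0
```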
REGRESSION ANALYSIS
Simple Regression
Simple regression is used to check the contribution of the independent variable to the dependent variable when there is only one independent variable.
Assumptions and conditions of simple regression
Dependent variable should be scale
The relationship of variables should be linear
Data should be independent
REGRESSION ANALYSIS
Commands
Analyze > Regression > Linear
REGRESSION ANALYSIS
Coefficients(a)
                        Unstandardized Coefficients   Standardized Coefficients
Model                   B        Std. Error           Beta                        t       Sig.
1   (Constant)          .397     2.530                                            .157    .876
    grades in h.s.      2.142    .430                 .504                        4.987   .000
a. Dependent Variable: math achievement test
Interpretation
Simple regression was conducted to investigate how well grades in high school predict math achievement scores. The results were statistically significant, F(1, 73) = 24.87, p < .001. The identified equation to understand this relationship was math achievement = .40 + 2.14 * (grades in high school). The adjusted R² value was .244. This indicates that 24% of the variance in math achievement was explained by the grades in high school. According to Cohen (1988), this is a large effect.
Regression equation is
Y = 0.40 + 2.14X
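The fitted equation can be used to predict a score, and the reported adjusted R² can be checked from the standardized coefficient. In the sketch below, the function names and the example grade value of 5 are illustrative, and n = 75 is inferred from the degrees of freedom in F(1, 73) rather than stated in the output.

```python
def predict_math_achievement(grades, intercept=0.397, slope=2.142):
    """Predicted math achievement from the coefficients reported in the output."""
    return intercept + slope * grades


def adjusted_r_squared(r_squared, n, predictors):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - predictors - 1)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - predictors - 1)


# With one predictor, R^2 equals the squared standardized coefficient (Beta = .504).
r_squared = 0.504 ** 2
# n = 75 is an inference from the error degrees of freedom (73) plus 2.
adj = adjusted_r_squared(r_squared, n=75, predictors=1)

print(round(predict_math_achievement(5), 2))  # grade value 5 is an arbitrary example
print(round(adj, 3))  # 0.244
```

With a single predictor, F is also the square of the t value for the slope: 4.987 squared is about 24.87, matching the reported F.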
REGRESSION ANALYSIS
Multiple Regression
Multiple regression is used to check the contribution of the independent variables to the dependent variable when there is more than one independent variable.
Assumptions and conditions of multiple regression
Dependent variable should be scale.
REGRESSION ANALYSIS
Commands
Analyze > Regression > Linear
Coefficients
                          Unstandardized Coefficients   Standardized Coefficients
Model                     B         Std. Error          Beta                        t        Sig.
1   (Constant)            1.047     2.526                                           .415     .680
    grades in h.s.        1.946     .427                .465                        4.560    .000
    father's education    .191      .313                .083                        .610     .544
    mother's education    .406      .375                .141                        1.084    .282
    gender                -3.759    1.321               -.290                       -2.846   .006
Interpretation
Simultaneous multiple regression was conducted to investigate the best predictors of math achievement test scores. The means, standard deviations, and intercorrelations can be found in the table. The combination of variables to predict math achievement from grades in high school, father's education, mother's education and gender was statistically significant, F = 10.40, p < .05. The beta coefficients are presented in the last table. Note that high grades and male gender significantly predict math achievement when all four variables are included. The adjusted R² value was .343. This indicates that 34% of the variance in math achievement was explained by the model. According to Cohen (1988), this is a large effect.
T-TEST Statistics
The t test is used to compare two groups to answer the differential
T-TEST Statistics
One sample t-test
The one-sample t-test is used to determine if there is a difference between the population mean (Test value) and the sample mean (X).
population
The data are independent (the scores of one participant do not depend on the scores of another; participants are independent of one another).
T-TEST Statistics
One-Sample Statistics
                                    N    Mean     Std. Deviation   Std. Error Mean
scholastic aptitude test - math     75   490.53   94.553           10.918

One-Sample Test (Test Value = 500)
                                                                                  95% Confidence Interval of the Difference
                                    t       df   Sig. (2-tailed)   Mean Difference   Lower     Upper
scholastic aptitude test - math     -.867   74   .389              -9.467            -31.22    12.29
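The figures in the two tables can be reproduced from the sample size, mean and standard deviation. In this sketch the critical t value of 1.993 for df = 74 is taken from a t table as an assumption, because the standard library has no t distribution; the interval therefore matches the printed limits only up to rounding.

```python
import math

# Values from the One-Sample Statistics table.
n = 75
sample_mean = 490.53
sample_sd = 94.553
test_value = 500

standard_error = sample_sd / math.sqrt(n)
mean_difference = sample_mean - test_value
t = mean_difference / standard_error
df = n - 1

# Two-tailed 5% critical value of t for df = 74, read from a t table (assumption).
t_critical = 1.993
margin = t_critical * standard_error
ci = (mean_difference - margin, mean_difference + margin)

print(round(standard_error, 3), round(t, 3), df)  # 10.918 -0.867 74
print(tuple(round(limit, 1) for limit in ci))
```

Because the interval runs from a negative to a positive value, it contains zero, which is consistent with the non-significant p value.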
Interpretation:
To investigate the difference between the population and the sample, a one-sample t-test is conducted. The One-Sample Statistics table provides basic descriptive statistics for the variable under consideration. The mean SAT math score for the students in the sample is compared to the hypothesized population mean, displayed as the Test Value in the One-Sample Test table. On the bottom line of this table are the t value, df, and the two-tailed sig. (p) value. Note that p = .389, so we can say that the sample mean (490.53) is not significantly different from the population mean of 500. The table also provides the difference (-9.47) between the sample and population mean and the 95% Confidence Interval. The difference between the sample and the population mean is likely to be between +12.29 and -31.22 points. Notice that this range includes the value of zero, so it is possible that there is
T-TEST Statistics
Independent sample t-test
T-TEST Statistics
The first table, Group Statistics, shows descriptive statistics for the two groups (males and females) separately. Note that the means within each of the three pairs look somewhat different. This might be due to chance, so we will check the t test in the next table.
The second table, Independent Samples Test, provides two statistical tests. The left two columns of numbers are the Levene's test for the assumption that the variances of the two groups are equal. This is not the t test; it only assesses an assumption! If this F test is not significant (as in the case of math achievement and grades in high school), the assumption is not violated, and one uses the Equal variances assumed line for the t test and related statistics. However, if Levene's F is statistically significant (Sig. < .05), as is true for visualization, then the variances are significantly different and the assumption of equal variances is violated. In that case, the Equal variances not assumed line is used, and SPSS adjusts t, df, and Sig. The appropriate lines are circled.
Thus, for visualization, the appropriate t = 2.39, degrees of freedom (df) = 57.15, p = .020. This t is statistically significant so, based on examining the means, we can say that boys have higher visualization scores than girls. We used visualization to provide an example where the assumption of equal variances was violated (Levene's test was significant). Note that for grades in high school, the t is not statistically significant (p = .369), so we conclude that there is no evidence of a systematic difference between boys and girls on grades. On the other hand, math achievement is statistically significant because p < .05; males have higher means.
The 95% Confidence Interval of the Difference is shown in the two right-hand columns of the output. The confidence interval tells us that if we repeated the study 100 times, about 95 of the resulting intervals would contain the true (population) difference; for math achievement the interval is between 1.05 points and 6.97 points. Note that if the Upper and Lower bounds have the same sign (either + and + or - and -), we know that the difference is statistically significant because this means that the null finding of zero difference lies outside of the confidence interval. On the other hand, if zero lies between the upper and lower limits, there could be no difference, as is the case for grades in h.s. The lower limit of the confidence interval on math achievement tells us that the difference between males and females could be as small as 1.05 points out of 25, which is the maximum possible score.
Effect size measures for t tests are not provided in the printout but can be estimated relatively easily. For math achievement, the difference between the means (4.01) would be divided by about 6.4, an estimate of the pooled (weighted average) standard deviation. Thus, d would be approximately .60, which is, according to Cohen (1988), a medium to large sized effect. Because you need means and standard deviations to compute the effect size, you should include a table with means and standard deviations in your results section for a full interpretation of t tests.
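The effect size calculation described above is a single division, sketched below with the two figures quoted in the text. The function name is illustrative. The exact quotient is about .63, which the text describes as approximately .60.

```python
def cohens_d(mean_difference, pooled_sd):
    """Cohen's d: the difference between two group means in pooled-SD units."""
    return mean_difference / pooled_sd


# Figures quoted in the text for math achievement.
d = cohens_d(mean_difference=4.01, pooled_sd=6.4)
print(round(d, 2))  # 0.63
```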
T-TEST Statistics
Paired sample t-test
The paired sample t-test is used to compare two paired groups (e.g. mothers and fathers) with respect to their effect on the same dependent variable.
Assumptions and conditions of Paired sample T-test
The independent variable is dichotomous and its levels (or groups) are
education?
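Paired Samples Statistics
                                Mean   N    Std. Deviation   Std. Error Mean
Pair 1   father's education     4.73   73   2.830            .331
         mother's education     4.14   73   2.263            .265

Paired Samples Correlations
                                                     Correlation   Sig.
Pair 1   father's education & mother's education     .681          .000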
The first table shows the descriptive statistics used to compare mothers' and fathers' education levels. The second table, Paired Samples Correlations, provides correlations between the two paired scores. The correlation (r = .68) between mothers' and fathers' education indicates that highly educated men tend to marry highly educated women and vice versa. It doesn't tell you whether men or women have more education. That is what t in the third table tells you.
The last table shows the Paired Samples t Test. The Sig. for the comparison of the average education level of the students' mothers and fathers was p = .019. Thus, the difference in educational level is statistically significant, and we can tell from the means in the first table that fathers have more education; however, the effect size is small (d = .28), which is computed by dividing the mean of the paired differences (.59) by the standard deviation (2.1) of the paired differences. Also, we can tell from the confidence interval that the difference in the means could be as small as .10 of a point or as large as 1.08 points on the 2 to 10 scale.
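The effect size quoted above, and a rough value for the paired t statistic, follow from the mean and standard deviation of the paired differences. This is a sketch from the rounded figures in the text, so the t value it prints is only an approximation of what the output table would show.

```python
import math

# Figures quoted in the interpretation.
mean_diff = 0.59   # mean of the paired differences (father minus mother)
sd_diff = 2.1      # standard deviation of the paired differences
n_pairs = 73

d = mean_diff / sd_diff                  # effect size for paired scores
se_diff = sd_diff / math.sqrt(n_pairs)   # standard error of the mean difference
t = mean_diff / se_diff                  # paired t statistic, df = n_pairs - 1

print(round(d, 2), round(t, 1))  # 0.28 2.4
```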
Thank you!