Quantitative Techniques

Quantitative Techniques in
Business (QTB)
Quantitative Techniques are the techniques used to
gather, sort, analyze and interpret numerical data in

order to improve business decisions.
Numerical data
Numerical data (or quantitative data) is data measured or identified on a
numerical scale. Numerical data can be analysed using statistical methods, and
results can be displayed using tables, charts, histograms and graphs
Examples
Company sales (millions)
18, 12, 20, etc
Number of employees in company(hundreds)
15, 8, 5, etc
Prof.Muhammad Ilyas ,std.Muhammad Saeed
Why Study QTB

Studying QTB is essential as it enables us to
Gather, sort, analyze and interpret the data
Have the needed, timely, accurate and relevant information
Understand and compare different types of situations
Predict and forecast the future needs of the
business
Develop effective policies and business related strategies
Make effective decisions to achieve business goals
efficiently
Research is based on QTB
Final thesis is based on QTB
Research Problem
Any problem or opportunity that needs to be addressed through
the research process of data collection and analysis is called a
Research Problem
Examples
A Human Resource manager wants to develop HR policies regarding
employee turnover in order to reduce it.
A Marketing manager wants to launch a new product successfully
using advertisement as a promotional tool.
A Finance manager needs to invest surplus money profitably.
Problem Statement
A problem statement is a clear and concise description
of a business issue that seeks the description,
association or difference of two or more variables.
Example
Measure the annual turnover of employees in the higher education
sector of Pakistan.
Does advertisement contribute to the sales of a new product in the
market?
Which of the two options, i.e. stock market or real estate, is better for
investment?
What is Variable?
Vary + able = Change + able
A variable is a characteristic of anything that can vary
(change).
Examples
Gender            (Male, Female)
Age               (20 years, 30 years, 50 years)
Motivation level  (High, Medium, Low)
A constant is a characteristic that does not vary,
e.g. if all students in a class are male, then Gender is a constant.
Types of Variable
with respect to relation
Budget
Advertisement
Awareness
Sales
Competitors' product, price, packaging, placement
Types of Variable
with respect to data
Variable
  Categorical
    Nominal, e.g. Gender
      1. Male
      2. Female
    Ordinal, e.g. Motivation
      1. Highly Motivated
      2. Moderately Motivated
      3. Less Motivated
  Numerical
    Discrete
      1. No. of students
      2. No. of chairs
      3. Collar size
    Continuous
      1. Height
      2. Weight
      3. Speed
Categorical Variable
A variable whose values are not numerical in nature
Variables          Values
Gender             Male, Female
Religion           Islam, Christianity, Judaism, etc.
Motivation level   High, Medium, Low
Types of Categorical variable

1. Nominal variable
A categorical variable whose values are not ordered
Example
Gender
Male, Female
2. Ordinal variable
A categorical variable whose values are ordered
Example
Education
Matric, Intermediate, Graduation
Numerical Variable
A variable whose values are numerical in nature
Variables          Values
Collar size        14, 14.5, 15, 15.5
Height             5.7, 5.8, 5.3
No. of employees   23, 45, 69, 100
Types of Numerical variable

1. Discrete variable
A numerical variable that can take only separate, countable values, usually at a fixed interval
Example
Collar size
14.5, 15, 15.5, ...
2. Continuous variable
A numerical variable that can take any value within a range, so its values do not fall at a fixed interval
Example
Speed
40.1, 45, 67
Research Question
A research problem needs to be translated into one or
more research questions.
A research question is an interrogative statement that
seeks the tentative relationship among variables
and clarifies what the researcher wants to answer.
Example
What is the impact of advertisement on sales of a new product in
the market?
What is the annual turnover of employees in higher educational
institutions of Pakistan?
Does investing in the stock market yield more return on investment
compared to investment in real estate?
Type of Research Question

Descriptive:
A question that is answered by summarising data about a single variable
E.g.: What is the annual turnover of employees in higher
educational institutions of Pakistan?
Associational:
A question that is answered by determining the strength and direction of the
relationship between two or more variables
E.g.: What is the impact of advertisement on sales of a new product in
the market?
Difference:
A question that is answered by comparing and contrasting two groups or variables
E.g.: Does investing in the stock market yield more return on investment
compared to investment in real estate?
Research Hypotheses
Research hypotheses are predictive statements about
the relationship between two variables.
Types of Hypothesis
There are two types of hypothesis
1. Null Hypothesis
Ho = There is no relationship between advertising and sales
2. Alternative Hypothesis
H1 = There is a relationship between advertising and sales
Research Question Vs. Hypothesis

Research question           Hypothesis
Interrogative statement     Simple statement
Non-Predictive              Predictive
Non-Directional             Directional
Activity
In groups of four, use the variables provided to write:
An associational question
A difference question
A descriptive question
Data
A set of raw facts and figures is called data.
Example:
Age: 16, 18, 20, 21, 23
Nationality: Pakistani, Indian, American
Types of Data
Data
  Nature
    Qualitative
    Quantitative
  Time frame
    Cross-sectional
    Time-Series
Thank You!
Lecture #2
I am really thankful to my gorgeous teacher, Sir
Dr. Muhammad Ilyas, for that great knowledge.
Primary and Secondary Data
DATA
A set of raw facts and figures is called data.
Example: Age: 16, 18, 20, 21, 23
Nationality: Pakistani, Indian, American
TYPES OF DATA
Data
  Nature
    Quantitative Data
    Qualitative Data
  Time Frame
    Cross-Sectional Data
    Time Series Data
    Longitudinal Data
  Source
    Primary Data
    Secondary Data
TYPES OF DATA
On the basis of Nature
By nature, data can be of two types:
1) Quantitative Data: Data that consists of numbers.
For example, data about age consists of values like 16,
18, 20, 21, 23.
2) Qualitative Data: Data that consists of words
rather than numbers. For example, nationality
includes Pakistani, Indian, American.
TYPES OF DATA Cont

On the basis of Time Frame:
By time frame, data can be categorized into three types:
1) Cross-Sectional Data: Data that is collected from
different units at a single point in time.
2) Time Series Data: Data that is collected from the same
units at different times, with the same time interval.
3) Longitudinal Data: A dataset is longitudinal if it
tracks the same type of information on the same
subjects at multiple points in time.
SOURCES OF DATA
Primary Data Source: Primary data is data
that comes from an original source and is collected
with a specific research question in mind.
For example: You want to collect data on
Employee Motivation.
Secondary Data Source: Secondary data is
previously recorded data that was collected for another
purpose.
For example: You want to collect data on the profit
of MCB Bank for 5 years.
HOW TO COLLECT PRIMARY DATA
The survey method is used to collect primary data.
WHAT IS A SURVEY?
A survey is a quantitative research strategy that involves
the structured collection of data from a predetermined sample.
Survey Design
1. Objectives of Survey
2. Survey Design
3. Pilot Test
4. Field work/Data Collection
5. Data Preparation
6. Data Analysis and Interpretation
7. Discussion and Conclusion
8. Report Writing
1: Objectives of Survey
The first step of survey design is to clearly define
why we are going to conduct the survey.
Example: The basic aim of the survey is to collect updated, accurate and
relevant data in order to answer a research problem.
2: Survey Design
After setting the objectives of the survey, we develop the plan
(design) of the survey, deciding:
Whom to survey (Sample Selection)
Where to survey (Site Selection)
How to survey (Method)
What to survey (Questions for required information)
How to develop a Questionnaire?
1. Decide what information is required.
2. Draft some questions on each variable to
elicit the information.
3. Put them into a meaningful order and format.
4. Pre-test the questionnaire.
5. Go back to Step 1, and continue until the
questionnaire is perfect.
3: Pilot Test
It is the process of checking the accuracy of the
wording, the sequence of the questions and how well
respondents understand them, by conducting the survey
with one or two respondents as a trial in order to refine
the questionnaire.
4: Fieldwork/conduct a survey
It is the process of actually collecting data from the
target sample. It can be done in the following ways:
Self-administered survey
Postal survey
Online survey
5: Data Preparation
After completing your survey and getting to know the interface
of SPSS, the next step is to prepare the data for analysis.
This process involves four steps.
1. Coding the questionnaire.
2. Defining the variables in SPSS variable view.
3. Entering the data in SPSS data view.
4. Checking the data for errors.
6: Data Analysis and Interpretation

It is the process of summarizing, organizing and transforming data with
the goal of highlighting useful information and suggesting conclusions in
order to answer the research question and support good decision
making.
Data can be analyzed in two ways:
Descriptive Analysis
Inferential Analysis
Interpretation
Interpretation is a process of making sense of results
by explaining and assigning meaning to them.
Report writing
Clarity of thoughts
Complete and self explanatory
Comprehensive and compact
Accurate in all aspects
Support facts
Suitable format for readers
Proper date and signature
reference
Reliable sources
Logical manner
7: Discussion and Conclusion

1. Discussion: (whether the result is the same as previous results or
different, discussed with proper
background)
2. Conclusion: (what the result of the study is and
what was achieved)
8: Report Writing
HOW TO COLLECT SECONDARY DATA
Sources that are used to collect secondary data can be:
Documentary
Government survey
Academic survey
Company's financial statements
Bank reports
Important links of Secondary Data

www.wdi.com
www.pwt.com
www.ifs.com
www.fbs.com
www.sbp.com
Step 1: Search WDI
Step 2: Click Data Bank on the WDI Web Page
Step 3: Select your Desired Country for the Extraction of Data
Step 4: Select Required Variables from the list
Step 5: Select Years
Step 6: View Data
Step 7: Click on Excel to Export Data
Lecture #3
In the name of Allah Kareem,

Most Beneficent, Most Gracious,
the Most Merciful !
Introduction to SPSS
Before further processing of the data, we should first get to
know the SPSS software.
SPSS
SPSS stands for Statistical Package for the Social Sciences. It
is basically used for the analysis of quantitative data.
How to open SPSS
Start Menu -> Programs -> SPSS Inc. -> SPSS 16.0
Welcome window SPSS 16.0
SPSS Interface
Title Bar
Menu Bar
Tool Bar
Variable definition criteria
Serial Number / Cases
Work sheet
SPSS Views
Data Entry
After defining the variables enter the data in data view for each
case (row wise) against each variable (column wise)
Data Processing
After the data is collected, data processing
starts. It involves
1. Data coding
2. Defining the variables
3. Data entry in the software
4. Checking for errors
SAMPLE QUESTIONNAIRE
Please circle or supply your answer                      ID_________
                                                         SD        SA
1. I would recommend this course to other students       1 2 3 4 5
2. I worked very hard in this course                     1 2 3 4 5
3. My college is: Arts & sciences____ Business____ Engineering____
4. My gender is
5. My GPA is _____________
6. For this class, I did: (Check all that apply)
   The reading
   The homework
   Extra credit
Coding
Coding is the process of assigning numbers to the values or
levels of each variable.
Rules of Coding
1. All data should be numeric.
2. Each variable for each case or participant must occupy the same
column.
3. All values (codes) for a variable must be mutually exclusive.
4. Each variable should be coded to give maximum information.
5. For each participant, there must be a code or value for each
variable.
6. Apply any coding rule consistently for all participants.
7. Use high numbers (values or codes) for the agree, good, or
positive end of a variable that is ordered.
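The coding rules above can be sketched in a few lines of Python. This is only an illustration: the codebook, the intermediate scale labels (D, N, A) and the missing-value code 9 are assumptions made for the example, not part of the sample questionnaire.

```python
# Hypothetical codebook: every answer category gets one numeric code (rules 1 and 3).
CODEBOOK = {
    "gender": {"male": 0, "female": 1},
    "college": {"arts & sciences": 1, "business": 2, "engineering": 3},
    # Ordered item: the high number is the "agree" end of the scale (rule 7).
    "recommend": {"SD": 1, "D": 2, "N": 3, "A": 4, "SA": 5},
}
MISSING = 9  # assumed code for blank or invalid answers (rule 5)


def code_response(variable, answer):
    """Return the numeric code for one answer, or MISSING if it is blank or unknown."""
    if answer is None:
        return MISSING
    return CODEBOOK[variable].get(answer.strip(), MISSING)


def code_case(raw_case):
    """Code every variable of one participant with the same rules (rule 6)."""
    return {var: code_response(var, raw_case.get(var)) for var in CODEBOOK}


coded = code_case({"gender": "female", "college": "business", "recommend": "SA"})
incomplete = code_case({"gender": "male"})
print(coded)
print(incomplete)
```

Keeping the codebook in one place makes it easy to reuse the same codes later as value labels when the variables are defined.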
Defining the variables

In SPSS, the variables are first defined in variable view.
This includes
1. Name (short name of the variable, without spaces)
2. Type (Numeric, String)
3. Width (8, 10, etc.)
4. Decimals (2, 3, 5, etc.)
5. Label (full name of the variable)
6. Values (answer categories with codes)
7. Missing (blank, multiple, wrong answers)
8. Columns (6, 8, 10, etc.)
9. Align (Left, right, centre)
10. Measure (Nominal, ordinal, scale)
SUPERIOR GROUP OF COLLEGES
Lecture 4
Data File Management
After this lecture you will:

Learn four useful data transformation techniques:
Count
Recode (Revise and Reverse)
Compute a new variable
Problem 5.1: Count Math Courses Taken

How many math courses (algebra1, algebra2, geometry,
trigonometry and calculus) did each of the 75
participants take in high school? Label your new
variable
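The Count transformation in Problem 5.1 can be sketched outside SPSS as follows. The course names come from the problem; the coding (1 = taken, 0 = not taken) and the new variable name `mathcrs` are assumptions for the example.

```python
MATH_COURSES = ["algebra1", "algebra2", "geometry", "trigonometry", "calculus"]


def count_math_courses(participant, taken_code=1):
    """Count how many of the five course variables carry the 'taken' code."""
    return sum(1 for course in MATH_COURSES if participant.get(course) == taken_code)


# One hypothetical participant; in the real data file there are 75 such rows.
participant = {"algebra1": 1, "algebra2": 1, "geometry": 0,
               "trigonometry": 0, "calculus": 0}
participant["mathcrs"] = count_math_courses(participant)
print(participant["mathcrs"])
```

The new variable is stored next to the original ones, so the original answers are never overwritten.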
Problem 5.2: Recode and Relabel Mother's and Father's Education
Recode mother's and father's education so that those
with no postsecondary education have a value of 1,
those with some postsecondary education have a value of 2, and
those with a bachelor's degree or more have a value of
3. Label the new variables and values.
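A minimal sketch of the recode in Problem 5.2 is shown here. The original education codes are not listed in these slides, so the cut-offs used below (codes up to 3 = no postsecondary education, 4 to 7 = some postsecondary, 8 and above = bachelor's degree or more) and the label texts are purely hypothetical; replace them with the codes of the actual data file.

```python
# Hypothetical value labels for the recoded variable.
VALUE_LABELS = {1: "No postsecondary education",
                2: "Some postsecondary education",
                3: "Bachelor's degree or more"}


def recode_education(code):
    """Collapse a detailed education code into three ordered groups.

    The cut-offs 3 and 7 are assumptions made for this example.
    """
    if code is None:      # keep missing values missing
        return None
    if code <= 3:
        return 1
    if code <= 7:
        return 2
    return 3


father_education = [2, 5, 9, None, 3]
father_education_revised = [recode_education(c) for c in father_education]
print(father_education_revised)
```

The recoded values are written to a new variable, which matches the advice to recode "into different variables" so the original codes stay available.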
Problem 5.3: Recode and Compute Pleasure Scale Score
Compute the average pleasure scale from item02,

item06, item10 and item14 after reversing (use the
Recode command) item06 and item10. Name the new
computed variable pleasure and label its highest and
lowest values.
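The reverse-and-compute step of Problem 5.3 can be sketched as below. The item names come from the problem; the 1 to 4 answer scale is an assumption for the example, so adjust `low` and `high` to the real scale.

```python
def reverse(score, low=1, high=4):
    """Reverse an item score: on an assumed 1-4 scale, 1 becomes 4 and 4 becomes 1."""
    return low + high - score


def pleasure_scale(case):
    """Average of item02, reversed item06, reversed item10 and item14."""
    items = [case["item02"], reverse(case["item06"]),
             reverse(case["item10"]), case["item14"]]
    return sum(items) / len(items)


# One hypothetical participant.
case = {"item02": 4, "item06": 1, "item10": 2, "item14": 3}
case["pleasure"] = pleasure_scale(case)
print(case["pleasure"])
```

Reversing before averaging makes a high score mean the same thing on every item, so the average is interpretable as a single scale.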
Lecture #5
Session Objectives
After this session the students will be able to analyze the
collected data using descriptive statistics by
Producing summaries of data in both tabular and
graphical forms
Calculating the central tendency using mean, median
and mode
Calculating the dispersion of data using range, IQR and
standard deviation
Checking whether the data is normally distributed using
the normal curve
Analyzing Data
The process of breaking down complex
data to gain a better understanding of it.
There are two types of statistics
Descriptive statistics
Inferential statistics
In this session we will work on descriptive
statistics
Descriptive statistics are used to Describe, Summarize,
Organize, and Simplify data in quantitative terms. We will
cover
1. Summarizing Numerical Data
2. Measures of Central Tendency
3. Measurement of Dispersion
4. Checking Data Normality
1. Summarizing
Variable
  Categorical
    Frequency Distribution Table
    Bar chart
  Numerical
    Five Figure Summary
    Box Plot / Histograms
Summarizing categorical data
Frequency Distribution.
A frequency distribution is a tally or count of the number of times each score on a
single variable occurs.
Analyze -> Descriptive Statistics -> Frequencies -> move religion to the variable box -> OK
(make sure that the Display frequency tables box is checked)
Frequency table for religion

religion
                            Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Muslims              30        40.0        44.8              44.8
          Christians           23        30.7        34.3              79.1
          Hindus               14        18.7        20.9             100.0
          Total                67        89.3       100.0
Missing   other religion        4         5.3
          blank                 4         5.3
          Total                 8        10.7
Total                          75       100.0
Interpretation:
In this example, there is a Frequency column that shows the
number of students who marked each type of religion (e.g., 30
said Muslim, 23 Christian, 14 Hindu, 4 gave another religion and 4 left
it blank). Notice that there is a total of 67 for the three
responses considered Valid and a total of 8 for the two types of
responses considered to be Missing, as well as an overall total
of 75. The Percent column indicates that 40.0% are Muslims,
30.7% are Christians, 18.7% are Hindus, 5.3% had one of several
other religions, and 5.3% left the question blank. The Valid
Percent column excludes the eight missing cases and is often
the column that you would use. Given this data set, it would be
accurate to say that of those not coded as missing, 44.8% were
Muslims, 34.3% were Christians and 20.9% were Hindus.
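The Percent, Valid Percent and Cumulative Percent columns can be reproduced by hand. The sketch below uses the counts reported for religion (30, 23 and 14 valid answers, plus 4 other and 4 blank treated as missing); the function name and the row layout are illustrative choices, not SPSS output.

```python
valid_counts = {"Muslims": 30, "Christians": 23, "Hindus": 14}
missing_counts = {"other religion": 4, "blank": 4}


def frequency_table(valid, missing):
    """Return rows of (category, frequency, percent, valid percent, cumulative percent)."""
    n_valid = sum(valid.values())
    n_total = n_valid + sum(missing.values())
    rows = []
    cumulative = 0.0
    for category, freq in valid.items():
        percent = 100 * freq / n_total        # share of all 75 cases
        valid_percent = 100 * freq / n_valid  # share of the 67 valid cases
        cumulative += valid_percent
        rows.append((category, freq, round(percent, 1),
                     round(valid_percent, 1), round(cumulative, 1)))
    return rows


table = frequency_table(valid_counts, missing_counts)
for row in table:
    print(row)
```

The only difference between Percent and Valid Percent is the denominator: all cases versus the cases that are not coded as missing.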
Summarizing categorical data
Bar Charts
With nominal data, it is better to make a bar graph or chart of the frequency distribution of
variables like religion, ethnic group, or other nominal variables; the points that happen to
be adjacent in your frequency distribution are not necessarily adjacent.
To get a bar chart select
Graphs -> Legacy Dialogs -> Interactive -> Bar chart -> move variable to the box -> OK
Summarizing Numerical data
Five Figure Summary
It is used to summarize numerical data. The five figures are found by
locating the following values in the data:
1. Minimum value
2. Maximum value
3. Median
4. Lower Quartile
5. Upper Quartile
Exercise: Calculate Five Figure Summary

Department B: 20 employees
Raw data: 2 1 3 2 1 4 3 5 8 8 7 7 4 5 6 2 6 6 6 6
Sorted data:
Rank:   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Value:  1  1  2  2  2  3  3  4  4  5  5  6  6  6  6  6  7  7  8  8
Min = 1
Max = 8
Median = 5
Lower Quartile = 2.5
Upper Quartile = 6
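The five figure summary of the 20-employee exercise can be checked with a short script. This sketch assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data, which reproduces the values on the slide; other quartile conventions can give slightly different numbers.

```python
from statistics import median


def five_figure_summary(values):
    """Min, lower quartile, median, upper quartile and max (median-of-halves method)."""
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]
    # With an odd number of values the middle value belongs to neither half.
    upper_half = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return {
        "min": data[0],
        "lower_quartile": median(lower_half),
        "median": median(data),
        "upper_quartile": median(upper_half),
        "max": data[-1],
    }


employees = [2, 1, 3, 2, 1, 4, 3, 5, 8, 8, 7, 7, 4, 5, 6, 2, 6, 6, 6, 6]
summary = five_figure_summary(employees)
print(summary)
```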
Exercise: Calculate Five Figure Summary

Department B: 30 employees
4 6
8 14
8 7 8
5 16 5
3
6 6
10 6 8 18
12
box & whisker plot

For ordinal and normal data, the box and whisker plot is useful. The box and whisker
plot is a graphical representation of the distribution of scores and is helpful in distinguishing
between ordinal and normally distributed data.
Graphs -> Legacy Dialogs -> Interactive -> Boxplot -> move gender to the x-axis and move SAT math to the y-axis -> OK
Interpretation
The case processing summary table shows the valid N=75,
with no missing values in the total sample of 75 for the variable
math achievement. The plot shows a box plot for math
achievement. The box represents the middle 50% of the
cases, with the median at 13; the lower end of the box shows the lower quartile
(Q1=7.67) and the upper end of the box shows the upper
quartile (Q3=17.00). The whiskers indicate the expected range
(25.33) of scores from minimum (Min=-1.67) to maximum
(Max=23.67). Scores outside of this range are considered
unusually high or low; such scores are called outliers. There
are no outliers in this case.
Histogram
Histograms are just like bar graphs but there is no space between the boxes, indicating that there
is a continuous variable theoretically underlying the scores. Histograms can be used even if the data,
as measured, are not continuous, if the underlying variable is conceptualized as continuous.
To draw a histogram select:
Graphs -> Legacy Dialogs -> Interactive -> Histogram -> move variable to the box -> OK
Interpretation
In this histogram, the frequencies (number of students) shown by
the bars are for a range of points (in this case SPSS
selected a range of 50: 250-299, 300-349, 350-399,
etc). Notice that the largest number of students
(about 20) had scores in the middle two bars of the
range (450-499 and 500-549). Similar small
numbers of students have very low and very high
scores. The bars in the histogram form a distribution
(pattern or curve) that is similar to the normal, bell
shaped curve. Thus, the frequency distribution of
the SAT math scores is said to be approximately
normal.
MEASUREMENT OF CENTRAL TENDENCY
Mean. The arithmetic average or mean takes into
account all of the available information in computing
the central tendency of a frequency distribution.
Median. The middle score or median is the
appropriate measure of central tendency for ordinal
level raw data.
Mode. The most common category, or mode, can be used
with any kind of data but generally provides the least precise
information about central tendency.
Measure of Central Tendency
Exercise
Analyze -> Descriptive Statistics -> Frequencies -> put SAT Math into variable box -> click on Statistics -> mark mean, median and mode -> click Continue -> OK

Statistics
scholastic aptitude test - math
N        Valid      75
         Missing    0
Mean                490.53
Median              490.00
Mode                500
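The three measures of central tendency can also be computed with Python's standard `statistics` module. The SAT math scores themselves are not listed in the slides, so this sketch reuses the 20-employee data from the five figure summary exercise.

```python
from statistics import mean, median, mode

employees = [2, 1, 3, 2, 1, 4, 3, 5, 8, 8, 7, 7, 4, 5, 6, 2, 6, 6, 6, 6]

average = mean(employees)        # uses every value in the data
middle = median(employees)       # middle of the sorted values
most_common = mode(employees)    # value that occurs most often

print(average, middle, most_common)
```

For these data the mean (4.6) is a little below the median (5), and both are below the mode (6), which hints that the distribution is not perfectly symmetric.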
Measures of Variability
Range: The range (highest minus lowest score)
is the crudest measure of variability but does
give an indication of the spread in scores if they
are ordered.
Inter quartile range (IQR): IQR = Q3 - Q1
Standard Deviation: The standard deviation is
based on the deviation (x) of each score from
the mean of all scores.
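A small sketch of the three measures of variability follows, again on the 20-employee exercise data. Two assumptions are made: the quartiles are the medians of the lower and upper halves (as in the five figure summary exercise), and the standard deviation is the sample version with n - 1 in the denominator, which is what `statistics.stdev` computes.

```python
from statistics import median, stdev

employees = [2, 1, 3, 2, 1, 4, 3, 5, 8, 8, 7, 7, 4, 5, 6, 2, 6, 6, 6, 6]
data = sorted(employees)
half = len(data) // 2            # the data set has an even number of values

value_range = data[-1] - data[0]         # highest minus lowest score
q1 = median(data[:half])                 # lower quartile
q3 = median(data[half:])                 # upper quartile
iqr = q3 - q1                            # spread of the middle 50%
sd = stdev(data)                         # sample standard deviation

print(value_range, iqr, round(sd, 3))
```

The range depends only on the two extreme scores, while the IQR and the standard deviation use more of the data and are less affected by a single unusual value.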
Analyze -> Descriptive Statistics -> Frequencies -> put SAT Math into variable box -> click on Statistics -> mark range and std. deviation -> click Continue -> OK
The Normal Curve
The frequency distributions of many of the variables used in the behavioral
sciences are distributed approximately as a normal curve when N is large.
Properties of Normal Curve
1. The mean, median and mode are equal.
2. It has one hump and this hump is in the middle of the distribution.
3. The curve is symmetric. If you fold the normal curve in half, the right side would
fit perfectly with the left side; that is, it is not skewed.
4. The range is infinite.
5. The curve is neither too peaked nor too flat and its tails are neither too short nor
too long.
                          Nominal   Dichotomous   Ordinal   Normal
Frequency Distribution    Yes       Yes           Yes       OK
Bar Chart                 Yes       Yes           Yes       OK
Histogram                 No        No            OK        Yes
Frequency Polygon         No        No            OK        Yes
Box & Whisker Plot        No        No            Yes       Yes
Central Tendency
  Mean                    No        OK            OK        Yes
  Median                  No        OK            Yes       OK
  Mode                    Yes       Yes           OK        OK
Variability
  Range                   No        Always 1      Yes       Yes
  Standard Deviation      No        No            OK        Yes
  Interquartile Range     No        No            OK        OK
  How many categories     Yes       Always 2      OK        No
Shape
  Skewness                No        No            Yes       Yes
Lecture #6
Lesson Objectives
After studying this session you will be able to
Understand and infer results from data in order to answer the
associational and differential research questions using different
parametric and non parametric tests.
Understand, implement and interpret the chi-square, phi and
Cramer's V
Understand, implement and interpret the correlation statistics
Understand, implement and interpret the regression statistics
Understand, implement and interpret the T-test statistics
Lesson Outline
1. Non parametric test
   1. Chi square / Fisher exact
   2. Phi and Cramer's V
   3. Kendall tau-b
2. Parametric test
   1. Correlation
      1. Pearson correlation
      2. Spearman correlation
   2. Regression
      1. Simple regression
      2. Multiple regression
   3. T-Test
      1. One-sample T-test
      2. Independent sample T-test
      3. Paired sample T-test
Inferential Statistics
Inferential statistics are used to make inferences
(conclusions) about a population from a sample
based on the statistical relationships or differences
between two or more variables using statistical tests
with the assumption that sampling is random in
order to generalize or make predictions about the
future.
Inferential Statistics
Inferential statistics are used
To test some hypothesis either to check relationship between
variables (two/more) or to compare two groups to measure the

differences among them.
To generalize the results about a population from a sample
To make predictions about the future.
To make conclusions
Some basics about inferential statistics!

Statistical significance (the p value)
A statistical significance test is a test of the null
hypothesis Ho, which is the hypothesis that we attempt
to reject or nullify, i.e.
Ho = There is no relationship/difference between
variable 1 and variable 2
p value > 0.05: Ho is accepted and H1 is rejected.
p value < 0.05: Ho is rejected and H1 is accepted.
Confidence Interval
Confidence interval is a range of values constructed for a
variable of interest so that this range has a specified probability
of including the true value of the variable. The specified
probability is called the confidence level, and the end points of
the confidence interval are called the confidence limits.
It is one of the alternatives to null hypothesis significance testing
(NHST).
The effect size (weak, moderate or strong)

Effect size is the strength of the relationship between the
independent variable and the dependent variable, and/or the
magnitude of the difference between levels of the independent
variable with respect to the dependent variable.
Value            Effect size              Relationship
0                No effect                No relationship
>0 to 0.33       Small effect             Weak relationship
>0.33 to 0.70    Medium/typical effect    Moderate relationship
>0.70 to <1      Large effect             Strong relationship
1                Maximum effect           Perfect relationship
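The effect size bands can be turned into a small lookup. This is a sketch of the cut-offs used in these slides (0, 0.33, 0.70 and 1); the function name is made up for the example, and the sign of the statistic is ignored because it only shows the direction of the relationship.

```python
def describe_effect_size(value):
    """Label the strength of a relationship from a statistic between -1 and +1."""
    size = abs(value)  # the sign gives direction, not strength
    if size > 1:
        raise ValueError("effect size must lie between -1 and +1")
    if size == 0:
        return "No relationship"
    if size <= 0.33:
        return "Weak relationship"
    if size <= 0.70:
        return "Moderate relationship"
    if size < 1:
        return "Strong relationship"
    return "Perfect relationship"


for statistic in (0, 0.2, -0.412, 0.788, 1):
    print(statistic, describe_effect_size(statistic))
```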
Steps in interpreting inferential statistics

1. Relate why a test is applied
2. Discuss for which variables the test is applied
3. Elaborate whether the null hypothesis is rejected
or accepted, based on the p value
4. State the direction of the effect
5. Conclude the results
Types of test used in Inferential Statistics
Inferential statistics include a wide variety of tests to infer

the results. This variety of tests can be classified in two
broader categories that are
1. Non parametric tests
2. Parametric tests
Non parametric tests are the statistical tests that are
used
When the level of measurement is nominal or ordinal, e.g. chi-square test or
Kendall's tau-b.
When the assumption of a normal distribution in the population is not met,
e.g. Spearman correlation.
http://www.cliffsnotes.com/WileyCDA/Section/Statistics-Glossary.id-305499,articleId30041.html#ixzz0c38lKKZC (retrieved: 07/01/10)
Non parametric tests include
Chi-Square test
Kendall's tau-b
Spearman correlation (will be discussed in correlation section)
Non parametric test

Chi-Square Statistics
The Chi-Square test is the most commonly used non-parametric
test. It checks the association between two nominal variables in
order to accept or reject the null hypothesis.
Hypothesis for Chi-Square Test
Ho = There is no association between gender and geometry in h.s.
H1 = There is an association between gender and geometry in h.s.
Chi-Squared Test
Assumptions and Conditions for the Chi-Squared
test
The data of the variables must be independent.
Both the variables should be nominal.
All the expected counts are greater than 1 for chi-square.
At least 80% of the expected frequencies should be greater
than or equal to 5.
Chi-Squared Test
Checking Assumptions and Conditions for the Chi-Squared test
geometry in h.s. * gender Crosstabulation
                                                  gender
                                                male     female    Total
geometry in h.s.   not taken   Count             10        29        39
                               Expected Count    17.7      21.3      39.0
                               % of Total        13.3%     38.7%     52.0%
                   taken       Count             24        12        36
                               Expected Count    16.3      19.7      36.0
                               % of Total        32.0%     16.0%     48.0%
Total                          Count             34        41        75
                               Expected Count    34.0      41.0      75.0
                               % of Total        45.3%     54.7%     100.0%
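The expected counts and the Pearson chi-square value can be recomputed from the four observed counts of the geometry by gender table. This is a hand calculation sketch, not the SPSS procedure; the p value line uses the complementary error function, a shortcut that is valid only for 1 degree of freedom (a 2 x 2 table).

```python
import math


def chi_square_2x2(observed):
    """Expected counts, Pearson chi-square and p value for a 2 x 2 table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail area for df = 1 only
    return expected, chi2, p_value


# Rows: geometry not taken / taken; columns: male / female
observed = [[10, 29],
            [24, 12]]
expected, chi2, p_value = chi_square_2x2(observed)
print([[round(e, 1) for e in row] for row in expected])
print(round(chi2, 3), p_value < 0.001)
```

All four expected counts are well above 5, so the conditions listed for the chi-square test are met for this table.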
Non parametric test

Case Processing Summary
                                            Cases
                              Valid            Missing           Total
                              N    Percent     N    Percent      N    Percent
geometry in h.s. * gender     75   100.0%      0    .0%          75   100.0%

Chi-Square Tests
                               Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                              (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             12.714a   1    .000
Continuity Correctionb         11.112    1    .001
Likelihood Ratio               13.086    1    .000
Fisher's Exact Test                                         .000         .000
Linear-by-Linear Association   12.544    1    .000
N of Valid Casesb              75
Symmetric Measures
                                    Value    Approx. Sig.
Nominal by Nominal   Phi            -.412    .000
                     Cramer's V      .412    .000
N of Valid Cases                    75
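Phi and Cramer's V can be checked by hand from the same 2 x 2 counts (10, 29, 24, 12). The sketch below uses the usual textbook formulas; the function names are illustrative, and the chi-square value 12.714 is taken from the Chi-Square Tests output.

```python
import math


def phi_2x2(table):
    """Phi for a 2 x 2 table [[a, b], [c, d]]; the sign shows the direction."""
    (a, b), (c, d) = table
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V from a chi-square value; always between 0 and 1."""
    return math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))


observed = [[10, 29],   # geometry not taken: male, female
            [24, 12]]   # geometry taken:     male, female
phi = phi_2x2(observed)
v = cramers_v(12.714, 75, 2, 2)
print(round(phi, 3), round(v, 3))
```

For a 2 x 2 table Cramer's V equals the absolute value of phi, which is why the two rows of the output differ only in sign.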
Interpretation:
To check the association between gender and geometry in h.s., a chi-square test was conducted. The
Case Processing Summary table indicates that no participant has a missing value. The
assumptions are checked through crosstabs. The Crosstabulation table includes the Counts and
Expected Counts, and their percentages of the total. The result shows that 24
males took geometry, which is 71% of the 34 male students. On the other hand, 12 of the 41
females took geometry; that is only 29% of the females. It looks like a higher percentage of males
than females took geometry. The Chi-Square Tests table tells us whether we can be confident
that this apparent difference is not due to chance.
Note, in the Crosstabulation table, that the Expected Count of male students who
didn't take geometry is 17.7 and the observed or actual Count is 10. Thus, there are 7.7 fewer
males who didn't take geometry than would be expected by chance, given the Totals shown in the
table. The same discrepancy (7.7) between observed and expected counts appears in the other
three cells of the table. A question answered by the chi-square test is whether these discrepancies
between observed and expected counts are bigger than one might expect by chance.
The Chi-Square Tests table is used to determine if there is a statistically significant relationship
between two dichotomous or nominal variables. It tells you whether the relationship is statistically
significant but does not indicate the strength of the relationship, like phi or a correlation does. In the
output, we use the Pearson Chi-Square or (for small samples) Fisher's exact test to interpret
the results of the test. They are statistically significant (p < .001), which indicates that we can be
quite certain that males and females differ on whether they take geometry.
Phi is -.412 and, like the chi-square, it is statistically significant. Phi is also a measure of effect size
for an associational statistic and, in this case, the effect size is moderate according to Cohen (1988).
Other Nonparametric Associational Statistics
KENDALL'S TAU-B
If the variables are ordered (i.e. ordinal), you have several other choices.
We will use Kendall's tau-b in this problem.
Example: What is the relationship or association between
father's education and mother's education?

Case Processing Summary
                                                         Cases
                                           Valid            Missing          Total
                                           N    Percent     N    Percent     N    Percent
mother education revised * father
education revised                          73   97.3%       2    2.7%        75   100.0%

mother education revised * father education revised Crosstabulation
                                                  father education revised
                                                  1        2        3        Total
mother education revised   1   Count              43       8        2        53
                               Expected Count     35.6     13.1     4.4      53.0
                               % of Total         58.9%    11.0%    2.7%     72.6%
                           2   Count              6        10       2        18
                               Expected Count     12.1     4.4      1.5      18.0
                               % of Total         8.2%     13.7%    2.7%     24.7%
                           3   Count              0        0        2        2
                               Expected Count     1.3      .5       .2       2.0
                               % of Total         .0%      .0%      2.7%     2.7%
Total                          Count              49       18       6        73
                               Expected Count     49.0     18.0     6.0      73.0
                               % of Total         67.1%    24.7%    8.2%     100.0%
Symmetric Measures
                                       Value   Asymp. Std. Errora   Approx. Tb   Approx. Sig.
Ordinal by Ordinal   Kendall's tau-b   .494    .108                 3.846        .000
N of Valid Cases                       73
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
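Kendall's tau-b can be recomputed from a crosstabulation by counting concordant and discordant pairs. In the sketch below the cell counts of the mother by father education table are an assumption: they were recovered from the % of Total figures of the output (count = percent x 73) and agree with the row and column totals. The function itself is a plain textbook implementation, not the SPSS routine.

```python
import math


def kendall_tau_b(table):
    """Kendall's tau-b for an ordered r x c table of counts."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):        # only rows below row i
                for l in range(n_cols):
                    if l > j:                     # below and to the right
                        concordant += table[i][j] * table[k][l]
                    elif l < j:                   # below and to the left
                        discordant += table[i][j] * table[k][l]
    n = sum(sum(row) for row in table)
    all_pairs = n * (n - 1) // 2
    row_ties = sum(t * (t - 1) // 2 for t in (sum(row) for row in table))
    col_ties = sum(t * (t - 1) // 2 for t in (sum(col) for col in zip(*table)))
    return (concordant - discordant) / math.sqrt(
        (all_pairs - row_ties) * (all_pairs - col_ties))


# Rows: mother education revised (1, 2, 3); columns: father education revised (1, 2, 3)
education = [[43, 8, 2],
             [6, 10, 2],
             [0, 0, 2]]
tau_b = kendall_tau_b(education)
print(round(tau_b, 3))
```

A positive value means that pairs of parents ordered the same way on both variables (concordant pairs) outnumber pairs ordered in opposite ways (discordant pairs).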
Interpretation:
To investigate the relationship between father's education
and mother's education, Kendall's tau-b was used. The
analysis indicated a significant positive association between
father's education and mother's education, tau = .494,
p < .001. This means that more highly educated fathers were
married to more highly educated mothers and less educated
fathers were married to less educated mothers. This tau is
considered to be a large effect size (Cohen, 1988).
Interpretation
Eta was used to investigate the strength of the association
between gender and number of math courses taken
(eta=.33). This is a weak to medium effect size (Cohen,
1988). Males were more likely to take several or all the math
courses than females.
Lecture #8 Quantitative Techniques
Lesson Objectives
After studying this session you will be able to
Understand and infer results from data in order to answer the
associational and differential research questions using different
parametric and non parametric tests.
Understand, implement and interpret the chi-square, phi and
Cramer's V
Understand, implement and interpret the correlation statistics
Understand, implement and interpret the regression statistics
Understand, implement and interpret the T-test statistics
Lesson Outline
1. Non parametric test
   1. Chi square / Fisher exact
   2. Phi and Cramer's V
   3. Kendall tau-b
   4. Eta
2. Parametric test
   1. Correlation
      1. Pearson correlation
      2. Spearman correlation
   2. Regression
      1. Simple regression
      2. Multiple regression
   3. T-Test
      1. One-sample T-test
      2. Independent sample T-test
      3. Paired sample T-test
Correlation
Correlation is a statistical process that determines the mutual (reciprocal)
relationship between two (or more) variables that are thought to be
related in such a way that systematic changes in the value of one
variable are accompanied by systematic changes in the other, and vice versa.
It is used to determine
The existence of a mutual relationship, which is indicated by the significance (p)
value.
The direction of the relationship, which is indicated by the sign (+, -) of the test value.
The strength of the relationship, which is indicated by the magnitude of the test value.
Correlation Coefficient (r)
The correlation coefficient measures the strength of the linear relationship between two
numerical variables. Its value can vary from -1.0
(a perfect negative correlation or association) through 0.0 (no correlation) to +1.0 (a
perfect positive correlation). Note that +1 and -1 are equally strong; only the
direction of the relationship differs.
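The definition of r can be made concrete with a short sketch. The function below computes the Pearson coefficient from its textbook formula (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations); the two small data sets are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores: y rises with x in the first case and falls in the second
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])
r_negative = pearson_r(x, [10, 8, 6, 4, 2])
print(round(r_positive, 3), round(r_negative, 3))
```

The two results have the same magnitude and opposite signs, which illustrates that +1 and -1 are equally strong.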
Correlation
Assumptions and conditions for Pearson

The two variables have a linear relationship.
Scores on one variable are normally distributed for each value of the
other variable and vice versa.

Outliers (i.e. extreme scores) can have a big effect on the correlation.
Correlation
Checking the assumptions for Pearson Correlation
The assumptions for correlation test are checked through
normal curve (normality assumption) and the scatter plot
(linearity assumption)
Statistics
                            math achievement test   scholastic aptitude test - math
N   Valid                   75                      75
    Missing                 0                       0
Skewness                    .044                    .128
Std. Error of Skewness      .277                    .277
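The standard error of skewness in the table depends only on the sample size. The sketch below uses the standard large-sample formula and compares each skewness value with its standard error. The cut-off of 2 for the ratio is a common rule of thumb and is an assumption here; the slides do not state one.

```python
from math import sqrt

def skewness_std_error(n):
    """Standard error of sample skewness for a sample of size n."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

se_skew = skewness_std_error(75)  # N = 75 valid cases in the table

# Skewness values copied from the Statistics table
skewness = {
    "math achievement test": 0.044,
    "scholastic aptitude test - math": 0.128,
}

# Assumed rule of thumb: |skewness / SE| below 2 suggests approximate normality
approximately_normal = {
    name: abs(value / se_skew) < 2 for name, value in skewness.items()
}
print(round(se_skew, 3), approximately_normal)
```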
Correlations
                                                          math achievement   scholastic aptitude
                                                          test               test - math
math achievement test             Pearson Correlation     1                  .788**
                                  Sig. (2-tailed)                            .000
                                  N                       75                 75
scholastic aptitude test - math   Pearson Correlation     .788**             1
                                  Sig. (2-tailed)         .000
                                  N                       75                 75
**. Correlation is significant at the 0.01 level (2-tailed).
Interpretation
To investigate whether there was a statistically significant association between scholastic aptitude
test and math achievement, a correlation was computed. Both variables were
approximately normal and there is a linear relationship between them, which fulfils the
assumptions for Pearson's correlation. Thus, Pearson's r was calculated, r = .79, p < .001,
indicating a highly significant relationship between the variables. The positive sign
of the test value shows that the relationship is positive, which means that
students who have high scores on the math achievement test also tend to have high scores on the
scholastic aptitude test, and vice versa. Using Cohen's (1988) guidelines, the effect size is large,
indicating a strong relationship between math achievement and scholastic aptitude test.
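The link between r, the shared variance and the significance test can be sketched as follows. The code squares the r from the Correlations table and converts it to a t statistic with n - 2 degrees of freedom using the standard formula; the resulting t is derived here for illustration and does not appear in the SPSS output.

```python
from math import sqrt

r = 0.788  # Pearson correlation from the Correlations table
n = 75     # number of cases

r_squared = r ** 2  # proportion of variance the two tests share
degrees_of_freedom = n - 2
t_statistic = r * sqrt(degrees_of_freedom / (1 - r_squared))

print(round(r_squared, 2), degrees_of_freedom, round(t_statistic, 2))
```

A t this far from zero on 73 degrees of freedom is consistent with the reported Sig. of .000.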
Correlation
Spearman Correlation: If the assumptions for Pearson
correlation are not fulfilled, consider the Spearman
correlation, which assumes only that the relationship between the
two variables is monotonic (not necessarily linear).
Example: What is the association between mother's education
and math achievement?
Correlation
Correlations
                                                                   mother's    math achievement
                                                                   education   test
Spearman's rho   mother's education      Correlation Coefficient   1.000       .315**
                                         Sig. (2-tailed)                       .006
                 math achievement test   Correlation Coefficient   .315**      1.000
                                         Sig. (2-tailed)           .006
**. Correlation is significant at the 0.01 level (2-tailed).
Interpretation
To investigate whether there was a statistically significant association between mother's education
and math achievement, a correlation was computed. Mother's education was skewed
(skewness = 1.13), which violated the assumption of normality. Thus, the Spearman rho
statistic was calculated, rs(73) = .32, p = .006. The direction of the correlation was positive,
which means that students who have highly educated mothers tend to have higher math
achievement test scores, and vice versa. Using Cohen's (1988) guidelines, the effect size is
medium for studies in this area. The r² indicates that approximately 10% of the variance in
math achievement test scores can be predicted from mother's education.
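Spearman's rho is the Pearson correlation applied to ranks, which is why it only needs a monotonic relationship. The sketch below ranks the data (tied values share their average rank) and then correlates the ranks. The data are hypothetical; only the value .315 is taken from the table above.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical monotonic but non-linear relationship
rho = spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])

# Squaring the reported coefficient gives the share of variance (about 10%)
variance_explained = 0.315 ** 2
print(round(rho, 3), round(variance_explained, 2))
```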

REGRESSION ANALYSIS
Regression analysis is used to measure the relationship between two or
more variables. One variable is called the dependent (response, or outcome)
variable and the others are called independent (explanatory, or predictor)
variables.
Regression Equation
Y = a + bx                        (simple regression)
Y = a + bx1 + cx2 + dx3 + ex4     (multiple regression)
Y = dependent variable
a = constant
b, c, d, e = slope coefficients
x, x1, x2, x3, x4 = independent variables
Types of regression analysis
Simple regression
Multiple regression
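For the simple equation Y = a + bx, the constant a and slope b can be estimated by ordinary least squares. The following is a minimal sketch with hypothetical data; SPSS is the tool used in the lecture, and this code only illustrates the calculation behind its output.

```python
def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line Y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope coefficient
    a = mean_y - b * mean_x  # constant (intercept)
    return a, b

# Hypothetical data lying exactly on the line Y = 1 + 2x
a, b = fit_simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```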
REGRESSION ANALYSIS
Simple Regression
Simple regression is used to check the contribution of a single
independent variable to the dependent variable.
Assumptions and conditions of simple regression
The dependent variable should be scale
The relationship between the variables should be linear
The data should be independent
Example: Can we predict math achievement from grades in
high school?
REGRESSION ANALYSIS
Commands
Analyze > Regression > Linear
Coefficients(a)
                      Unstandardized Coefficients   Standardized Coefficients
Model                 B         Std. Error          Beta                        t       Sig.
1   (Constant)        .397      2.530                                           .157    .876
    grades in h.s.    2.142     .430                .504                        4.987   .000
a. Dependent Variable: math achievement test
Interpretation
Simple regression was conducted to investigate how well grades in high school predict
math achievement scores. The results were statistically significant, F(1, 73) = 24.87,
p < .001. The identified equation for this relationship was math achievement =
.40 + 2.14 * (grades in high school). The adjusted R² value was .244. This indicates that 24%
of the variance in math achievement was explained by the grades in high school.
According to Cohen (1988), this is a large effect.
Regression equation is
Y = 0.40 + 2.14X
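The figures in the interpretation can be cross-checked against the Coefficients table. The sketch below relies on two facts that hold for simple regression with one predictor: F equals the square of the slope's t value, and R² equals the square of the standardized Beta. The sample size n = 75 is inferred from the degrees of freedom (1, 73), and the grade value passed to the prediction function is hypothetical.

```python
constant, slope = 0.397, 2.142  # unstandardized B values from the table
t_slope = 4.987                 # t for grades in h.s.
beta = 0.504                    # standardized coefficient
n = 75                          # inferred from F(1, 73)

def predict_math_achievement(grades):
    """Predicted math achievement from the fitted equation Y = a + bX."""
    return constant + slope * grades

f_statistic = t_slope ** 2
r_squared = beta ** 2
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

print(round(f_statistic, 2), round(adjusted_r_squared, 3))
print(round(predict_math_achievement(6), 2))  # hypothetical grade value of 6
```

Both derived values agree with the reported F of 24.87 and adjusted R² of .244.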
REGRESSION ANALYSIS
Multiple Regression
Multiple regression is used to check the contribution of the
independent variables to the dependent variable when there is more than one
independent variable.
Assumptions and conditions of multiple regression
The dependent variable should be scale.
Example: How well can you predict math achievement from a
combination of four variables: grades in high school, father's
education, mother's education and gender?
REGRESSION ANALYSIS
Commands
Analyze > Regression > Linear
Coefficients
                          Unstandardized Coefficients   Standardized Coefficients
Model                     B         Std. Error          Beta                        t        Sig.
1   (Constant)            1.047     2.526                                           .415     .680
    grades in h.s.        1.946     .427                .465                        4.560    .000
    father's education    .191      .313                .083                        .610     .544
    mother's education    .406      .375                .141                        1.084    .282
    gender                -3.759    1.321               -.290                       -2.846   .006
Interpretation
Simultaneous multiple regression was conducted to investigate the best predictors of
math achievement test scores. The means, standard deviations, and intercorrelations
can be found in the table. The combination of variables used to predict math achievement
(grades in high school, father's education, mother's education and gender) was
statistically significant, F = 10.40, p < .05. The beta coefficients are presented in the last
table. Note that high grades and male gender significantly predict math achievement
when all four variables are included. The adjusted R² value was .343. This indicates
that 34% of the variance in math achievement was explained by the model. According
to Cohen (1988), this is a large effect.
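The t column of the Coefficients table is each B divided by its standard error, and the Sig. column shows which predictors contribute once the others are in the model. The sketch below recomputes the t values from the rounded table entries and lists the predictors with Sig. below .05. The .05 threshold follows the interpretation above; small differences from the printed t values are due to rounding.

```python
# (B, Std. Error, Sig.) copied from the Coefficients table
coefficients = {
    "(Constant)": (1.047, 2.526, 0.680),
    "grades in h.s.": (1.946, 0.427, 0.000),
    "father's education": (0.191, 0.313, 0.544),
    "mother's education": (0.406, 0.375, 0.282),
    "gender": (-3.759, 1.321, 0.006),
}

# t = B / Std. Error for every row
t_values = {name: b / se for name, (b, se, _) in coefficients.items()}

# Predictors (excluding the constant) that are significant at the .05 level
significant = [
    name
    for name, (_, _, sig) in coefficients.items()
    if name != "(Constant)" and sig < 0.05
]

for name, t in t_values.items():
    print(f"{name:20s} t = {t:6.2f}")
print(significant)
```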
Last lecture #11
T-TEST Statistics
The t test is used to compare two groups in order to answer differential
research questions. Its value indicates whether a difference exists by
comparing means.
Hypotheses for the t-test
H0: There is no difference
H1: There is a difference
Types of t-test
There are three types of t-test
One-sample t-test
Independent-samples t-test
Paired-samples t-test
T-TEST Statistics
One sample t-test
The one-sample t-test is used to determine whether there is a difference between the
population mean (test value) and the sample mean (X̄).
Assumptions and conditions of the one-sample t-test
The dependent variable should be normally distributed within the
population.
The data are independent (the scores of one participant do not depend on the
scores of the others; participants are independent of one another).
Example: Is the mean SAT-Math score in the modified HSB
data set significantly different from the presumed population
mean of 500?
T-TEST Statistics
One-Sample Statistics
                                  N     Mean     Std. Deviation   Std. Error Mean
scholastic aptitude test - math   75    490.53   94.553           10.918

One-Sample Test (Test Value = 500)
                                                                                    95% Confidence Interval
                                                                                    of the Difference
                                  t       df    Sig. (2-tailed)   Mean Difference   Lower     Upper
scholastic aptitude test - math   -.867   74    .389              -9.467            -31.22    12.29
Interpretation:
To investigate the difference between the population and the sample, a one-sample
t-test was conducted. The One-Sample Statistics table provides basic
descriptive statistics for the variable under consideration. The mean SAT-Math score
for the students in the sample is compared to the hypothesized population
mean, displayed as the Test Value in the One-Sample Test table. On the
bottom line of this table are the t value, df, and the two-tailed Sig. (p) value.
Note that p = .389, so we can say that the sample mean
(490.53) is not significantly different from the population mean of 500. The
table also provides the difference (-9.47) between the sample and population
means and the 95% Confidence Interval. The difference between the sample
and the population mean is likely to be between -31.22 and +12.29 points.
Notice that this range includes the value of zero, so it is possible that there is
no difference. Thus, the difference is not statistically significant.
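The numbers in the two tables follow from the sample size, mean and standard deviation alone. The sketch below recomputes the standard error, t and the confidence interval from those summary values. The critical value 1.993 is an assumption read from a t table for df = 74 (two-tailed, 5%), and because the printed mean is rounded, the lower limit differs from the SPSS output in the second decimal.

```python
from math import sqrt

n = 75
sample_mean = 490.53
std_deviation = 94.553
test_value = 500  # presumed population mean

std_error = std_deviation / sqrt(n)
mean_difference = sample_mean - test_value
t_statistic = mean_difference / std_error
degrees_of_freedom = n - 1

# Assumption: two-tailed 5% critical t for df = 74, taken from a t table
t_critical = 1.993
ci_lower = mean_difference - t_critical * std_error
ci_upper = mean_difference + t_critical * std_error

# The difference is significant only if zero lies outside the interval
zero_inside_interval = ci_lower < 0 < ci_upper

print(round(std_error, 3), round(t_statistic, 3), degrees_of_freedom)
print(round(ci_lower, 2), round(ci_upper, 2), zero_inside_interval)
```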
T-TEST Statistics
Independent sample t-test
The independent-samples t-test is used to compare two
independent groups (e.g. male and female) with respect to their
effect on the same dependent variable.
Assumptions and conditions of the independent-samples t-test
The variances of the dependent variable for the two categories of the
independent variable should be equal.
The data on the dependent variable should be independent.
Example: Do male and female students differ significantly in
regard to their average math achievement scores?
T-TEST Statistics
The first table, Group Statistics, shows descriptive statistics for the two groups (males and females)
separately. Note that the means within each of the three pairs look somewhat different. This might be due
to chance, so we will check the t test in the next table.
The second table, Independent Samples Test, provides two statistical tests. The left two columns of
numbers are Levene's test for the assumption that the variances of the two groups are equal. This is
not the t test; it only assesses an assumption! If this F test is not significant (as in the case of math
achievement and grades in high school), the assumption is not violated, and one uses the Equal variances
assumed line for the t test and related statistics. However, if Levene's F is statistically significant (Sig. < .05),
as is true for visualization, then the variances are significantly different and the assumption of equal variances
is violated. In that case, the Equal variances not assumed line is used, and SPSS adjusts t, df, and Sig. The
appropriate lines are circled.
Thus, for visualization, the appropriate t = 2.39, degrees of freedom (df) = 57.15, p = .020. This t is statistically
significant so, based on examining the means, we can say that boys have higher visualization scores than
girls. We used visualization to provide an example where the assumption of equal variances was violated
(Levene's test was significant). Note that for grades in high school, the t is not statistically significant
(p = .369), so we conclude that there is no evidence of a systematic difference between boys and girls on
grades. On the other hand, math achievement is statistically significant because p < .05; males have higher
means.
The 95% Confidence Interval of the Difference is shown in the two right-hand columns of the output. The
confidence interval tells us that if we repeated the study 100 times, about 95 of the resulting intervals would
contain the true (population) difference; for math achievement the interval is between 1.05 points
and 6.97 points. Note that if the Upper and Lower bounds have the same sign (either + and + or - and -),
we know that the difference is statistically significant because this means that the null finding of zero
difference lies outside of the confidence interval. On the other hand, if zero lies between the upper and lower
limits, there could be no difference, as is the case for grades in h.s. The lower limit of the confidence
interval on math achievement tells us that the difference between males and females could be as small as
1.05 points out of 25, which is the maximum possible score.
Effect size measures for t tests are not provided in the printout but can be estimated relatively easily. For
math achievement, the difference between the means (4.01) would be divided by about 6.4, an estimate
of the pooled (weighted average) standard deviation. Thus, d would be approximately .60, which is,
according to Cohen (1988), a medium to large sized effect. Because you need means and standard
deviations to compute the effect size, you should include a table with means and standard deviations in
your results section for a full interpretation of t tests.
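The effect size calculation described above can be sketched as follows. The mean difference (4.01) and the pooled standard deviation estimate (about 6.4) come from the text; the group sizes and standard deviations passed to the pooling helper are hypothetical, since the Group Statistics table is not reproduced here.

```python
from math import sqrt

def pooled_std(n1, sd1, n2, sd2):
    """Pooled (weighted average) standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))

def cohens_d(mean_difference, pooled_sd):
    """Cohen's d: the mean difference expressed in standard deviation units."""
    return mean_difference / pooled_sd

# Values quoted in the text for math achievement
d_math = cohens_d(4.01, 6.4)

# Hypothetical group sizes and standard deviations, for illustration only
example_pooled = pooled_std(30, 6.0, 40, 7.0)

print(round(d_math, 2), round(example_pooled, 2))
```

With the rounded inputs the ratio comes out at about .63, in the same medium-to-large range as the approximate .60 quoted above.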
T-TEST Statistics
Paired sample t-test
The paired-samples t-test is used to compare two paired groups (e.g. mothers
and fathers) with respect to their effect on the same dependent variable.
Assumptions and conditions of the paired-samples t-test
The independent variable is dichotomous and its levels (or groups) are
paired, or matched, in some way (husband-wife, pre-post, etc.)
The dependent variable is normally distributed in the two conditions
Example: Do students' fathers or mothers have more
education?
Paired Samples Statistics
                               Mean    N     Std. Deviation   Std. Error Mean
Pair 1   father's education    4.73    73    2.830            .331
         mother's education    4.14    73    2.263            .265

Paired Samples Correlations
                                                    N     Correlation   Sig.
Pair 1   father's education & mother's education    73    .681          .000
The first table shows the descriptive statistics used to compare mothers' and
fathers' education levels. The second table, Paired Samples Correlations, provides
the correlation between the two paired scores. The correlation (r = .68) between
mothers' and fathers' education indicates that highly educated men tend to marry
highly educated women and vice versa. It doesn't tell you whether men or women
have more education. That is what t in the third table tells you.
The last table shows the Paired Samples t Test. The Sig. for the comparison of the
average education level of the students' mothers and fathers was p = .019. Thus, the
difference in educational level is statistically significant, and we can tell from the
means in the first table that fathers have more education; however, the effect size is
small (d = .28), which is computed by dividing the mean of the paired differences
(.59) by the standard deviation (2.1) of the paired differences. Also, we can tell from
the confidence interval that the difference in the means could be as small as .10 of a
point or as large as 1.08 points on the 2 to 10 scale.
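The paired-samples results can be reconstructed from the two tables above, because the standard deviation of paired differences follows from the two standard deviations and their correlation. The sketch below derives it, then computes t, d and the confidence interval. The critical value 1.993 is an assumption read from a t table for df = 72 (two-tailed, 5%), and the t value is derived here because the Paired Samples Test table is not reproduced in the text.

```python
from math import sqrt

n = 73
mean_father, sd_father = 4.73, 2.830
mean_mother, sd_mother = 4.14, 2.263
correlation = 0.681  # from Paired Samples Correlations

mean_difference = mean_father - mean_mother

# Variance of a difference of two correlated variables:
# var(F - M) = var(F) + var(M) - 2 * r * sd(F) * sd(M)
sd_difference = sqrt(
    sd_father ** 2 + sd_mother ** 2 - 2 * correlation * sd_father * sd_mother
)
se_difference = sd_difference / sqrt(n)

t_statistic = mean_difference / se_difference
cohens_d = mean_difference / sd_difference

# Assumption: two-tailed 5% critical t for df = 72, taken from a t table
t_critical = 1.993
ci_lower = mean_difference - t_critical * se_difference
ci_upper = mean_difference + t_critical * se_difference

print(round(sd_difference, 1), round(t_statistic, 2), round(cohens_d, 2))
print(round(ci_lower, 2), round(ci_upper, 2))
```

The derived standard deviation (2.1), effect size (.28) and interval (.10 to 1.08) match the values quoted in the interpretation.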
Thank you!