
Quantitative Techniques in Business (QTB)
Quantitative techniques are the techniques used to gather, sort, analyze and interpret numerical data in order to improve business decisions.
Numerical data
Numerical data (or quantitative data) is data measured or identified on a numerical scale. Numerical data can be analysed using statistical methods, and results can be displayed using tables, charts, histograms and graphs.
Examples
Company sales (millions): 18, 12, 20, etc.
Number of employees in company (hundreds): 15, 8, 5, etc.
Prof.Muhammad Ilyas ,std.Muhammad Saeed

Why Study QTB


Studying QTB is essential because it enables us to:
Gather, sort, analyze and interpret data
Obtain needed, timely, accurate and relevant information
Understand and compare different types of situations
Predict and forecast the future needs of the business
Develop effective policies and business-related strategies
Make effective decisions to achieve business goals efficiently
Research is based on QTB
The final thesis is based on QTB

Research Problem
Any problem or opportunity that needs to be addressed through the research process of data collection and analysis is called a research problem.

Examples
A Human Resource manager wants to develop HR policies on employee turnover in order to reduce it.
A Marketing manager wants to launch a new product successfully using advertisement as a promotional tool.
A Finance manager needs to invest surplus money profitably.

Problem Statement
A problem statement is a clear and concise description of a business issue that seeks the description of a variable, or the association or difference between two or more variables.

Examples
Measure the annual turnover of employees in the higher education sector of Pakistan.
Does advertisement contribute to the sales of a new product in the market?
Which of the two options, stock market or real estate, is better for investment?

What is Variable?
Vary + able = Change + able
A variable is a characteristic of anything that can vary (change).

Examples
Gender (Male, Female)
Age (20 years, 30 years, 50 years)
Motivation level (High, Medium, Low)

A constant is a characteristic that does not vary, e.g. if all students in a class are male, then gender is a constant.

Types of Variable
with respect to relation

[Diagram: Budget; Advertisement; Awareness; Sales; Competitors' product, price, packaging, placement]

Types of Variable
with respect to data
Variable
  Categorical
    Nominal, e.g. Gender
      1. Male
      2. Female
    Ordinal, e.g. Motivation
      1. Highly Motivated
      2. Moderately Motivated
      3. Less Motivated
  Numerical
    Discrete
      1. No. of students
      2. No. of chairs
      3. Collar size
    Continuous
      1. Height
      2. Weight
      3. Speed
Categorical Variable
A variable whose values are not numerical in nature.

Variables and their values
Gender: Male, Female
Religion: Islam, Christianity, Judaism, etc.
Motivation level: High, Medium, Low

Types of Categorical Variable
1. Nominal variable
A categorical variable whose values are not ordered.
Example: Gender (Male, Female)
2. Ordinal variable
A categorical variable whose values are ordered.
Example: Education (Matric, Intermediate, Graduation)

Numerical Variable
A variable whose values are numerical in nature.

Variables and their values
Collar size: 14, 14.5, 15, 15.5
Height: 5.7, 5.8, 5.3
No. of employees: 23, 45, 69, 100

Types of Numerical Variable
1. Discrete variable
A numerical variable that can take only separate, countable values with fixed gaps between them.
Example: collar size (14.5, 15, 15.5, ...)
2. Continuous variable
A numerical variable that can take any value within a range, so its values are not separated by fixed gaps.
Example: speed (40.1, 45, 67, ...)

Research Question
A research problem needs to be translated into one or more research questions. A research question is an interrogative statement that asks about the tentative relationship among variables and clarifies what the researcher wants to answer.

Examples
What is the impact of advertisement on sales of a new product in the market?
What is the annual turnover of employees in higher educational institutions of Pakistan?
Does investing in the stock market yield more return on investment than investing in real estate?

Types of Research Question

Descriptive:
A question that is answered by summarising data about a single variable.
E.g.: What is the annual turnover of employees in higher educational institutions of Pakistan?

Associational:
A question that is answered by determining the strength and direction of the relationship between two or more variables.
E.g.: What is the impact of advertisement on sales of a new product in the market?

Difference:
A question that is answered by comparing and contrasting two groups or variables.
E.g.: Does investing in the stock market yield more return on investment than investing in real estate?

Research Hypotheses
Research hypotheses are predictive statements about the relationship between two variables.

Types of Hypothesis
There are two types of hypothesis:

1. Null Hypothesis
H0: There is no relationship between advertising and sales.

2. Alternative Hypothesis
H1: There is a relationship between advertising and sales.

Research Question Vs. Hypothesis


Research question: interrogative statement, non-predictive, non-directional
Hypothesis: simple statement, predictive, directional

Activity
In groups of four, use the variables provided to write:
An associational question
A difference question
A descriptive question


Data
A set of raw facts and figures is called data.
Example:
Age: 16, 18, 20, 21, 23
Nationality: Pakistani, Indian, American

Types of Data
Data
  Nature: Qualitative, Quantitative
  Time frame: Cross-sectional, Time-series


Lecture #2
I am really thankful to my teacher, Dr. Muhammad Ilyas, for this great knowledge.

Primary and Secondary Data


DATA
A set of raw facts and figures is called data.
Example:
Age: 16, 18, 20, 21, 23
Nationality: Pakistani, Indian, American

TYPES OF DATA
Data
  Nature: Quantitative Data, Qualitative Data
  Time Frame: Cross-Sectional Data, Time Series Data, Longitudinal Data
  Source: Primary Data, Secondary Data

TYPES OF DATA
On the basis of Nature

By nature, data can be of two types:

1) Quantitative Data: data that consists of numbers. For example, data about age consists of values like 16, 18, 20, 21, 23.
2) Qualitative Data: data that consists of words rather than numbers. For example, nationality includes Pakistani, Indian, American.

TYPES OF DATA (cont.)

On the basis of Time Frame
By time frame, data can be categorized into three types:

1) Cross-Sectional Data: data that is collected from different units at a single point in time.
2) Time Series Data: data that is collected from the same units at different times, at equal time intervals.
3) Longitudinal Data: a dataset is longitudinal if it tracks the same type of information on the same subjects at multiple points in time.

SOURCES OF DATA
Primary Data Source: Primary data is data that comes from an original source and is collected with a specific research question in mind.
For example: you want to collect data on employee motivation.
Secondary Data Source: Secondary data is previously recorded data that was collected for another purpose.
For example: you want to collect data on the profit of MCB Bank for 5 years.

HOW TO COLLECT PRIMARY DATA
The survey method is used to collect primary data.
WHAT IS A SURVEY?
A survey is a quantitative research strategy that involves the structured collection of data from a predetermined sample.

Survey Design
1. Objectives of Survey
2. Survey Design
3. Pilot Test
4. Field work/Data Collection
5. Data Preparation
6. Data Analysis and Interpretation
7. Discussion and Conclusion
8. Report Writing

1: Objectives of Survey
The first step of survey design is to define clearly why we are going to conduct the survey.

Example: The basic aim of the survey is to collect updated, accurate and relevant data in order to answer a research problem.

2: Survey Design

After setting the objectives of the survey, we develop the plan (design) of the survey, deciding:
Whom to survey (Sample Selection)
Where to survey (Site Selection)
How to survey (Method)
What to survey (Questions for required information)

How to develop a Questionnaire?
1. Decide what information is required.
2. Draft some questions on each variable to elicit the information.
3. Put them into a meaningful order and format.
4. Pre-test the questionnaire.
5. Go back to Step 1, and continue until the questionnaire is perfect.

3: Pilot Test
It is the process of checking the accuracy of the wording, the sequence of the questions and how well they are understood, by surveying one or two respondents as a trial in order to refine the questionnaire.

4: Fieldwork/conduct a survey
It is the process of actually collecting data from the target sample. It can be done in the following ways:
Self-administered survey
Postal survey
Online survey

5: Data Preparation
After completing your survey and getting to know the SPSS interface, the next step is to prepare the data for analysis. This process involves four steps:
1. Coding the questionnaire.
2. Defining the variables in SPSS Variable View.
3. Entering the data in SPSS Data View.
4. Checking the data for errors.

6: Data Analysis and Interpretation


It is the process of summarizing, organizing and transforming data with the goal of highlighting useful information and suggesting conclusions, in order to answer the research question and support good decision making.

Data can be analyzed in two ways:
Descriptive Analysis
Inferential Analysis

Interpretation
Interpretation is the process of making sense of results by explaining them and assigning meaning to them.

Report writing

Clarity of thoughts
Complete and self explanatory
Comprehensive and compact
Accurate in all aspects
Support facts
Suitable format for readers
Proper date and signature
reference
Reliable sources
Logical manner

7: Discussion and Conclusion


1. Discussion: whether the results agree with or differ from previous results, discussed with proper background.
2. Conclusion: what the result of the study is and what it achieved.

8: Report Writing


HOW TO COLLECT SECONDARY DATA
Sources that are used to collect secondary data can be:
Documentary
Government survey
Academic survey
Company's financial statements
Bank reports

Important links of Secondary Data


www.wdi.com
www.pwt.com
www.ifs.com
www.fbs.com
www.sbp.com


Step 1: Search WDI
Step 2: Click Data Bank on the WDI web page
Step 3: Select your desired country for the extraction of data
Step 4: Select the required variables from the list
Step 5: Select years
Step 6: View data
Step 7: Click on Excel to export the data

Lecture #3

In the name of Allah Kareem, Most Beneficent, Most Gracious, the Most Merciful!

Introduction to SPSS
Before processing the data further, we should first get to know the SPSS software.

SPSS
SPSS stands for Statistical Package for the Social Sciences. It is mainly used for the analysis of quantitative data.

How to open SPSS
Start Menu > Programs > SPSS Inc. > SPSS 16.0

Welcome window SPSS 16.0


SPSS Interface

Title Bar
Menu Bar
Tool Bar
Variable definition criteria
Serial Number / Cases
Work sheet
SPSS Views


Data Entry
After defining the variables, enter the data in Data View: one row per case and one column per variable.

Data Processing
After the data has been collected, data processing begins. It involves:
1. Data coding
2. Defining the variables
3. Data entry in the software
4. Checking for errors

SAMPLE QUESTIONNAIRE
Please circle or supply your answer                      ID_________

                                                      SD           SA
1. I would recommend this course to other students     1  2  3  4  5
2. I worked very hard in this course                   1  2  3  4  5
3. My college is: Arts & Sciences____ Business____ Engineering____
4. My gender is
5. My GPA is _____________
6. For this class, I did: (check all that apply)
   The reading
   The homework
   Extra credit

Coding
Coding is the process of assigning numbers to the values or levels of each variable.
Rules of Coding
1. All data should be numeric.
2. Each variable for each case or participant must occupy the same column.
3. All values (codes) for a variable must be mutually exclusive.
4. Each variable should be coded to give maximum information.
5. For each participant, there must be a code or value for each variable.
6. Apply any coding rule consistently for all participants.
7. Use high numbers (values or codes) for the agree, good, or positive end of a variable that is ordered.
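The coding rules above can be sketched in a few lines of Python. This is only an illustration, not part of the SPSS workflow in the lecture: the codebooks, the intermediate rating labels (D, N, A), the missing-value code 9 and the function names are all assumptions made for the example.

```python
# Hypothetical codebooks: every answer category gets exactly one numeric code (rules 1 and 3).
LIKERT = {"SD": 1, "D": 2, "N": 3, "A": 4, "SA": 5}  # high number = agree end (rule 7)
COLLEGE = {"Arts & Sciences": 1, "Business": 2, "Engineering": 3}
MISSING = 9  # assumed code for blank or unusable answers, so every case has a value (rule 5)


def code_answer(answer, codebook):
    """Return the numeric code for an answer, or MISSING if it is not in the codebook."""
    return codebook.get(answer, MISSING)


def code_checklist(ticked, options):
    """Code a 'check all that apply' item as one 0/1 variable per option (rule 4)."""
    return {option: int(option in ticked) for option in options}


# One hypothetical completed questionnaire.
response = {"recommend": "SA", "college": "Business",
            "did": ["The reading", "Extra credit"]}

coded = {
    "recommend": code_answer(response["recommend"], LIKERT),
    "college": code_answer(response["college"], COLLEGE),
    **code_checklist(response["did"], ["The reading", "The homework", "Extra credit"]),
}
print(coded)
```

Each "check all that apply" option becomes its own 0/1 variable, so that every variable still holds one mutually exclusive code per participant.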

Defining the variables


In SPSS, the variables are first defined in Variable View. This includes:
1. Name of the variable (short, without spaces)
2. Type (Numeric, String)
3. Width (8, 10, etc.)
4. Decimals (2, 3, 5, etc.)
5. Label (full name of the variable)
6. Values (answer categories with codes)
7. Missing (blank, multiple, wrong answers)
8. Columns (6, 8, 10, etc.)
9. Align (left, right, centre)
10. Measure (nominal, ordinal, scale)

SUPERIOR GROUP OF COLLEGES

Lecture 4

Data File Management

After this lecture you will have learned four useful data transformation techniques:
Count
Recode (revise)
Recode (reverse)
Compute a new variable

Problem 5.1: Count Math Courses Taken


How many math courses (algebra 1, algebra 2, geometry, trigonometry and calculus) did each of the 75 participants take in high school? Label your new variable.

Problem 5.2: Recode and Relabel Mother's and Father's Education

Recode mother's and father's education so that those with no postsecondary education have a value of 1, those with some postsecondary education have a value of 2, and those with a bachelor's degree or more have a value of 3. Label the new variables and values.

Problem 5.3: Recode and Compute Pleasure Scale Score

Compute the average pleasure scale from item02, item06, item10 and item14 after reversing (use the Recode command) item06 and item10. Name the new computed variable pleasure and label its highest and lowest values.
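The three problems above are meant to be solved with the SPSS Count, Recode and Compute commands. As a rough sketch of what those transformations do, here is the same logic in plain Python for one participant. The record, the variable names other than item02, item06, item10, item14 and pleasure, the education scale and its cut-offs, and the 1 to 4 rating scale are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical record for one participant (1 = course taken, 0 = not taken).
participant = {
    "alg1": 1, "alg2": 1, "geo": 1, "trig": 0, "calc": 0,
    "maed": 7,  # mother's education on an assumed 2-10 scale
    "item02": 4, "item06": 2, "item10": 1, "item14": 3,  # assumed 1-4 ratings
}

# Count: number of math courses taken.
math_courses = ["alg1", "alg2", "geo", "trig", "calc"]
participant["mathcrs"] = sum(participant[course] for course in math_courses)


def revise_education(value):
    """Recode (revise): collapse education into three ordered groups.

    The cut-offs are assumptions chosen only for this example.
    """
    if value <= 3:
        return 1  # no postsecondary education
    if value <= 7:
        return 2  # some postsecondary education
    return 3      # bachelor's degree or more


participant["maedrev"] = revise_education(participant["maed"])


def reverse(value, low=1, high=4):
    """Recode (reverse): on a 1-4 scale, swap 1 with 4 and 2 with 3."""
    return low + high - value


# Compute: average of the four items after reversing item06 and item10.
items = [participant["item02"], reverse(participant["item06"]),
         reverse(participant["item10"]), participant["item14"]]
participant["pleasure"] = mean(items)

print(participant["mathcrs"], participant["maedrev"], participant["pleasure"])
```

Reversing before averaging matters: without it, a high score on item06 or item10 would pull the scale in the opposite direction from item02 and item14.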

Lecture #5


Session Objectives
After this session the students will be able to analyze the collected data using descriptive statistics by:
Producing summaries of data in both tabular and graphical forms
Calculating the central tendency using mean, median and mode
Calculating the dispersion of data using range, IQR and standard deviation
Checking whether the data is normally distributed using the normal curve

Analyzing Data
Analyzing data is the process of breaking down complex data to gain a better understanding of it.
There are two types of statistics:
Descriptive statistics
Inferential statistics
In this session we will work on descriptive statistics.

Descriptive statistics
Descriptive statistics are used to describe, summarize, organize and simplify data in quantitative terms. We will cover:
1. Summarizing Numerical Data
2. Measures of Central Tendency
3. Measures of Dispersion
4. Checking Data Normality

1. Summarizing
Variable
  Categorical: Frequency Distribution Table, Bar chart
  Numerical: Five Figure Summary, Box Plot / Histograms

Summarizing categorical data

Frequency Distribution
A frequency distribution is a tally or count of the number of times each score on a single variable occurs.
Analyze > Descriptive Statistics > Frequencies > move religion to the variable box > OK (make sure that the Display frequency tables box is checked)

Frequency table for religion

religion
                            Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Muslims               30        40.0        44.8              44.8
          Christians            23        30.7        34.3              79.1
          Hindus                14        18.7        20.9             100.0
          Total                 67        89.3       100.0
Missing   other religion         4         5.3
          blank                  4         5.3
          Total                  8        10.7
Total                           75       100.0

Interpretation:

In this example, the Frequency column shows the number of students who marked each type of religion (30 said Muslims, 23 Christians and 14 Hindus; 4 gave another religion and 4 left the question blank). Notice that there is a total of 67 for the three responses considered Valid, a total of 8 for the two types of responses considered Missing, and an overall total of 75. The Percent column indicates that 40.0% are Muslims, 30.7% are Christians, 18.7% are Hindus, 5.3% had one of several other religions, and 5.3% left the question blank. The Valid Percent column excludes the eight missing cases and is often the column that you would use. Given this data set, it would be accurate to say that, of those not coded as missing, 44.8% were Muslims, 34.3% were Christians and 20.9% were Hindus.
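The Percent, Valid Percent and Cumulative Percent columns can be reproduced by hand. The following Python sketch rebuilds the 75 responses from the counts reported above and recomputes the valid rows. It is an illustration only: the function name is hypothetical, and both kinds of missing answer are merged into a single None marker for simplicity.

```python
from collections import Counter

# Rebuild the 75 responses from the reported counts; None marks a missing answer.
responses = ["Muslims"] * 30 + ["Christians"] * 23 + ["Hindus"] * 14 + [None] * 8


def frequency_table(values):
    """Return rows of (category, frequency, percent, valid percent, cumulative percent)."""
    total = len(values)
    valid = [v for v in values if v is not None]
    rows, cumulative = [], 0.0
    for category, freq in Counter(valid).most_common():
        valid_pct = 100 * freq / len(valid)   # share of the non-missing cases
        cumulative += valid_pct               # running total of valid percent
        rows.append((category, freq, round(100 * freq / total, 1),
                     round(valid_pct, 1), round(cumulative, 1)))
    return rows


table = frequency_table(responses)
for row in table:
    print(row)
```

Percent divides by all 75 cases, while Valid Percent divides by the 67 non-missing cases, which is why the two columns differ.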

Summarizing categorical data

Bar Charts
With nominal data, it is better to make a bar graph or chart of the frequency distribution of variables like religion, ethnic group, or other nominal variables, because the points that happen to be adjacent in your frequency distribution are not necessarily adjacent.
To get a bar chart select:
Graphs > Legacy Dialogs > Interactive > Bar chart > move variable to the box > OK

Summarizing Numerical data

Five Figure Summary

It is used to summarize numerical data. The five figures are found by locating the following values in the data:
1. Minimum value
2. Maximum value
3. Median
4. Lower quartile
5. Upper quartile

Exercise: Calculate Five Figure Summary


Department B: 20 employees
Raw data: 2 1 3 2 1 4 3 5 8 8 7 7 4 5 6 2 6 6 6 6
Sorted data:
Rank:   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Value:  1  1  2  2  2  3  3  4  4  5  5  6  6  6  6  6  7  7  8  8
Min = 1
Max = 8
Median = 5
Lower Quartile = 2.5
Upper Quartile = 6
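The exercise above can be checked with a short Python sketch. It assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data, which reproduces the values on the slide; other quartile conventions, including the ones statistical packages use, can give slightly different results. The function name is hypothetical.

```python
from statistics import median


def five_figure_summary(values):
    """Return (min, lower quartile, median, upper quartile, max).

    Quartiles are the medians of the lower and upper halves of the sorted
    data. Assumes at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[-half:]  # skips the middle value when the count is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])


scores = [2, 1, 3, 2, 1, 4, 3, 5, 8, 8, 7, 7, 4, 5, 6, 2, 6, 6, 6, 6]
print(five_figure_summary(scores))
```

With 20 values, the median is the average of the 10th and 11th sorted values, and each quartile is the average of the two middle values of its half.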

Exercise: Calculate Five Figure Summary


Department B: 30 employees
4 6 8 14 8 7 8 5 16 5 3 6 6 10 6 8 18 12

box & whisker plot


For ordinal and normal data, the box and whisker plot is useful. It is a graphical representation of the distribution of scores and is helpful in distinguishing between ordinal and normally distributed data.
Graphs > Legacy Dialogs > Interactive > Boxplot > move gender to the x-axis and move SAT math to the y-axis > OK

Interpretation
The case processing summary table shows valid N = 75, with no missing values, for the total sample of 75 on the variable math achievement. The plot shows a box plot for math achievement. The box represents the middle 50% of the cases, with median M = 13; the lower end of the box shows the lower quartile (Q1 = 7.67) and the upper end of the box shows the upper quartile (Q3 = 17.00). The whiskers indicate the expected range (25.33) of scores from minimum (Min = -1.67) to maximum (Max = 23.67). Scores outside this range are considered unusually high or low; such scores are called outliers. There are no outliers in this case.

Histogram
Histograms are like bar graphs, but there is no space between the bars, indicating that there is a continuous variable theoretically underlying the scores. Histograms can be used even if the data, as measured, are not continuous, provided the underlying variable is conceptualized as continuous.
To draw a histogram select:
Graphs > Legacy Dialogs > Interactive > Histogram > move variable to the box > OK

Interpretation
In this histogram, the frequencies (number of students) shown by the bars are for a range of points (in this case SPSS selected a range of 50: 250-299, 300-349, 350-399, etc.). Notice that the largest number of students (about 20) had scores in the middle two bars of the range (450-499 and 500-549). Similarly small numbers of students have very low and very high scores. The bars in the histogram form a distribution (pattern or curve) that is similar to the normal, bell-shaped curve. Thus, the frequency distribution of the SAT math scores is said to be approximately normal.

MEASUREMENT OF CENTRAL TENDENCY

Mean. The arithmetic average or mean takes into account all of the available information in computing the central tendency of a frequency distribution.

Median. The middle score or median is the appropriate measure of central tendency for ordinal-level raw data.

Mode. The most common category, or mode, can be used with any kind of data but generally provides the least precise information about central tendency.

Measure of Central Tendency

Exercise
Analyze > Descriptive Statistics > Frequencies > put SAT Math into the variable box > click on Statistics > mark mean, median and mode > click Continue > OK

Statistics
scholastic aptitude test - math
N        Valid      75
         Missing     0
Mean                490.53
Median              490.00
Mode                500
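For comparison with the SPSS output, the same three measures can be computed with Python's standard `statistics` module. The scores below are hypothetical and are not the 75 SAT math scores used in the lecture, so the results differ from the table above.

```python
from statistics import mean, median, mode

# Hypothetical SAT math scores, not the lecture's data set.
scores = [400, 450, 480, 500, 500, 520, 600]

average = mean(scores)      # uses every score
middle = median(scores)     # middle score of the sorted list
most_common = mode(scores)  # score that occurs most often

print("Mean:  ", round(average, 2))
print("Median:", middle)
print("Mode:  ", most_common)
```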

Measures of Variability
Range: The range (highest minus lowest score) is the crudest measure of variability but does give an indication of the spread in scores if they are ordered.
Interquartile range (IQR): IQR = Q3 - Q1
Standard Deviation: The standard deviation is based on the deviation (x) of each score from the mean of all scores.
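The three measures of variability above can be sketched in Python as follows. The scores are hypothetical, the function name is illustrative, the quartiles are assumed to be the medians of the lower and upper halves of the sorted data, and `stdev` is the sample standard deviation (it divides the sum of squared deviations by n - 1).

```python
from statistics import median, stdev


def variability(values):
    """Return (range, interquartile range, sample standard deviation)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])    # lower quartile: median of the lower half
    q3 = median(data[-half:])   # upper quartile: median of the upper half
    return (data[-1] - data[0], q3 - q1, stdev(data))


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
value_range, iqr, sd = variability(scores)
print(value_range, iqr, round(sd, 3))
```

The range depends only on the two extreme scores, the IQR only on the middle half, and the standard deviation on every score, which is why the three can disagree about how spread out a data set is.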

Analyze > Descriptive Statistics > Frequencies > put SAT Math into the variable box > click on Statistics > mark Range and Std. deviation > click Continue > OK

Descriptive Statistics
The Normal Curve
The frequency distributions of many of the variables used in the behavioral
sciences are distributed approximately as a normal curve when N is large.
Properties of Normal Curve
1. The mean, median and mode are equal.
2. It has one hump and this hump is in the middle of the distribution.
3. The curve is symmetric. If you fold the normal curve in half, the right side would
fit perfectly with the left side; that is, it is not skewed.
4. The range is infinite.
5. The curve is neither too peaked nor too flat and its tails are neither too short nor
too long.


Statistic / plot          Nominal   Dichotomous   Ordinal   Normal
Frequency Distribution    Yes       Yes           Yes       OK
Bar Chart                 Yes       Yes           Yes       OK
Histogram                 No        No            OK        Yes
Frequency Polygon         No        No            OK        Yes
Box & Whisker Plot        No        No            Yes       Yes
Central Tendency
  Mean                    No        OK            OK        Yes
  Median                  No        OK            Yes       OK
  Mode                    Yes       Yes           OK        OK
Variability
  Range                   No        Always 1      Yes       Yes
  Standard Deviation      No        No            OK        Yes
  Interquartile Range     No        No            OK        OK
  How many categories     Yes       Always 2      OK        No
Shape
  Skewness                No        No            Yes       Yes

Lecture #6


Lesson Objectives
After studying this session you will be able to:
Understand and infer results from data in order to answer associational and difference research questions using different parametric and non-parametric tests
Understand, implement and interpret the chi-square, phi and Cramer's V statistics
Understand, implement and interpret the correlation statistics
Understand, implement and interpret the regression statistics
Understand, implement and interpret the T-test statistics

Lesson Outline
1. Non-parametric tests
   1. Chi-square / Fisher exact
   2. Phi and Cramer's V
   3. Kendall's tau-b
2. Parametric tests
   1. Correlation
      1. Pearson correlation
      2. Spearman correlation
   2. Regression
      1. Simple regression
      2. Multiple regression
   3. T-test
      1. One-sample T-test
      2. Independent-samples T-test
      3. Paired-samples T-test

Inferential Statistics
Inferential statistics are used to make inferences
(conclusions) about a population from a sample
based on the statistical relationships or differences
between two or more variables using statistical tests
with the assumption that sampling is random in
order to generalize or make predictions about the
future.


Inferential Statistics
Inferential statistics are used:
To test a hypothesis, either to check the relationship between two or more variables or to compare two groups and measure the differences between them
To generalize results from a sample to a population
To make predictions about the future
To draw conclusions

Some basics about inferential statistics!


Statistical significance (the p value)

A statistical significance test is a test of a null hypothesis H0, which is a hypothesis that we attempt to reject or nullify, i.e.
H0: There is no relationship/difference between variable 1 and variable 2
p value > 0.05: H0 is not rejected (commonly described as H0 accepted and H1 rejected).
p value < 0.05: H0 is rejected and H1 is accepted.
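The decision rule above can be written as a tiny Python function. This is only a sketch: the function name and the default significance level parameter `alpha` are illustrative, and the wording "fail to reject" is used for the case the slide describes as H0 accepted.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule for the null hypothesis H0 at significance level alpha."""
    if p_value < alpha:
        return "Reject H0 in favour of H1"
    return "Fail to reject H0"


print(decide(0.001))
print(decide(0.20))
```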

Confidence Interval
Confidence interval is a range of values constructed for a
variable of interest so that this range has a specified probability
of including the true value of the variable. The specified
probability is called the confidence level, and the end points of
the confidence interval are called the confidence limits.
It is one of the alternatives to null hypothesis significance testing
(NHST).
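As a rough illustration of confidence limits, the sketch below computes an approximate 95% confidence interval for a mean in Python. It assumes a normal approximation (a z value of about 1.96), which is reasonable for large samples; for a small sample like this hypothetical one, a t-based interval would be wider. The data and the function name are made up for the example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(values, level=0.95):
    """Approximate confidence limits for the mean, using the normal distribution."""
    n = len(values)
    centre = mean(values)
    standard_error = stdev(values) / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    return (centre - z * standard_error, centre + z * standard_error)


scores = [400, 450, 480, 500, 500, 520, 600]  # hypothetical sample
lower, upper = confidence_interval(scores)
print(round(lower, 1), round(upper, 1))
```

Here `level` plays the role of the confidence level, and the two returned values are the confidence limits.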


The effect size (weak, moderate or strong)


Effect size is the strength of the relationship between the independent variable and the dependent variable, and/or the magnitude of the difference between levels of the independent variable with respect to the dependent variable.

Value            Effect                  Relationship
0                No effect               No relationship
>0 to 0.33       Small effect            Weak relationship
>0.33 to 0.70    Medium/typical effect   Moderate relationship
>0.70 to <1      Large effect            Strong relationship
1                Maximum effect          Perfect relationship
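The cut-offs listed above translate directly into a small lookup function. The Python sketch below is illustrative: the function name is hypothetical, and taking the absolute value (so that a negative coefficient is judged by its size, with the sign read as direction only) is an assumption not stated on the slide.

```python
def describe_effect(value):
    """Label an effect size using the cut-offs 0, 0.33, 0.70 and 1."""
    size = abs(value)  # assumption: the sign gives direction, not strength
    if size > 1:
        raise ValueError("effect size must lie between -1 and 1")
    if size == 0:
        return "No effect / no relationship"
    if size <= 0.33:
        return "Small effect / weak relationship"
    if size <= 0.70:
        return "Medium effect / moderate relationship"
    if size < 1:
        return "Large effect / strong relationship"
    return "Maximum effect / perfect relationship"


print(describe_effect(0.25))
print(describe_effect(-0.85))
```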

Steps in interpreting inferential statistics


1. Relate why the test is applied
2. Discuss for which variables the test is applied
3. Elaborate whether the null hypothesis is rejected or accepted, based on the p value
4. State the direction of the effect
5. Conclude the results

Types of test used in Inferential Statistics

Inferential statistics include a wide variety of tests for inferring results. These tests can be classified into two broad categories:
1. Non-parametric tests
2. Parametric tests

Non-parametric tests are the statistical tests that are used:
When the level of measurement is nominal or ordinal, e.g. the chi-square test or Kendall's tau-b.
When the assumption of a normal distribution in the population is not met, e.g. Spearman correlation.
http://www.cliffsnotes.com/WileyCDA/Section/Statistics-Glossary.id-305499,articleId30041.html#ixzz0c38lKKZC retrieval date: 07/01/10

Non-parametric tests include:
Chi-Square test
Kendall's tau-b
Spearman correlation (will be discussed in the correlation section)

Non parametric test


Chi-Square Statistics
The chi-square test is the most commonly used non-parametric test. It checks the association between two nominal variables in order to decide whether to reject the null hypothesis.

Hypothesis for Chi-Square Test
H0: There is no association between gender and geometry in h.s.
H1: There is an association between gender and geometry in h.s.

Chi-Squared Test
Assumptions and Conditions for the Chi-Squared Test
The data of the variables must be independent.
Both variables should be nominal.
All the expected counts should be greater than 1.
At least 80% of the expected frequencies should be greater than or equal to 5.

Chi-Squared Test
Checking Assumptions and Conditions for the Chi-Squared test
geometry in h.s. * gender Crosstabulation
                                        gender
geometry in h.s.                     male     female    Total
not taken    Count                    10        29        39
             Expected Count          17.7      21.3      39.0
             % of Total              13.3%     38.7%     52.0%
taken        Count                    24        12        36
             Expected Count          16.3      19.7      36.0
             % of Total              32.0%     16.0%     48.0%
Total        Count                    34        41        75
             Expected Count          34.0      41.0      75.0
             % of Total              45.3%     54.7%    100.0%

Non parametric test


Case Processing Summary
                                                Cases
                               Valid           Missing           Total
                             N   Percent     N   Percent      N   Percent
geometry in h.s. * gender   75   100.0%      0     .0%       75   100.0%

Chi-Square Tests
                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              12.714a    1      .000
Continuity Correctionb          11.112     1      .001
Likelihood Ratio                13.086     1      .000
Fisher's Exact Test                                             .000         .000
Linear-by-Linear Association    12.544     1      .000
N of Valid Casesb               75

Symmetric Measures
                                    Value    Approx. Sig.
Nominal by Nominal   Phi            -.412       .000
                     Cramer's V      .412       .000
N of Valid Cases                      75

Interpretation:
To check the association between gender and geometry in h.s., a chi-square test was conducted. The
Case Processing Summary table indicates that no participant has a missing value. The assumptions
are checked through the crosstabs: the smallest expected count is 16.3, so every expected count is
well above 5. The Crosstabulation table includes the Counts, the Expected Counts, and each cell's
percentage of the total; the within-gender percentages quoted here are computed from the counts.
The result shows that 24 males took geometry, which is 71% of the 34 male students. On the other
hand, 12 of the 41 females took geometry; that is only 29% of the females. It looks like a higher
percentage of males than females took geometry. The Chi-Square Tests table tells us whether we can
be confident that this apparent difference is not due to chance.
Note, in the Crosstabulation table, that the Expected Count of male students who didn't take
geometry is 17.7 and the observed or actual Count is 10. Thus, there are 7.7 fewer males who didn't
take geometry than would be expected by chance, given the Totals shown in the table. The
discrepancy between observed and expected counts is the same size (7.7) in the other three cells
of the table. The question answered by the chi-square test is whether these discrepancies between
observed and expected counts are bigger than one might expect by chance.
The Chi-Square Tests table is used to determine whether there is a statistically significant
relationship between two dichotomous or nominal variables. It tells you whether the relationship is
statistically significant but does not indicate the strength of the relationship, as phi or a
correlation does. In the output, we use the Pearson Chi-Square or (for small samples) Fisher's
exact test to interpret the results of the test. Both are statistically significant (p < .001),
which indicates that we can be quite confident that males and females differ in whether they take
geometry.
Phi is -.412 and, like the chi-square, it is statistically significant. Phi is also a measure of
effect size for an associational statistic and, in this case, the effect size is medium according
to Cohen (1988).
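
The comparison of observed and expected counts described above can be reproduced by hand. The following is a minimal sketch using only the four observed counts from the crosstabulation; the variable names are illustrative, and the `erfc` shortcut for the p value is valid only because a 2x2 table has one degree of freedom.

```python
import math

# Observed counts from the crosstabulation
# (rows: not taken, taken; columns: male, female).
observed = [[10, 29], [24, 12]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under no association: row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 1 the chi-square upper-tail probability equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

# Phi for a 2x2 table: (ad - bc) / sqrt(product of the four marginal totals).
(a, b), (c, d) = observed
phi = (a * d - b * c) / math.sqrt(
    row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1]
)

print(round(chi_square, 3), df, round(phi, 3), p_value < 0.001)
```

The rounded results agree with the Pearson Chi-Square (12.714) and Phi (-.412) values in the output tables. The sign of phi depends only on how the categories are ordered; its magnitude is what Cohen's guidelines refer to.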

Other Nonparametric Associational Statistics

KENDALL'S TAU-B
If the variables are ordered (i.e. ordinal), several other statistics are available.
We will use Kendall's tau-b in this problem.

Example: What is the relationship or association between father's education and
mother's education?


Other Nonparametric Associational Statistics

Case Processing Summary

                                                 Cases
                              Valid            Missing          Total
                              N    Percent     N    Percent     N    Percent
mother education revised *
father education revised      73   97.3%       2    2.7%        75   100.0%

mother education revised * father education revised Cross tabulation

                                       father education revised
mother education revised                 1         2         3     Total
1        Count                          43         8         2        53
         Expected Count               35.6      13.1       4.4      53.0
         % of Total                  58.9%     11.0%      2.7%     72.6%
2        Count                           6        10         2        18
         Expected Count               12.1       4.4       1.5      18.0
         % of Total                   8.2%     13.7%      2.7%     24.7%
3        Count                           0         0         2         2
         Expected Count                1.3        .5        .2       2.0
         % of Total                    .0%       .0%      2.7%      2.7%
Total    Count                          49        18         6        73
         Expected Count               49.0      18.0       6.0      73.0
         % of Total                  67.1%     24.7%      8.2%    100.0%

Other Nonparametric Associational Statistics

Symmetric Measures

                                          Value   Asymp. Std. Errora   Approx. Tb   Approx. Sig.
Ordinal by Ordinal    Kendall's tau-b     .494    .108                 3.846        .000
N of Valid Cases                          73

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.


Interpretation:
To investigate the relationship between father's education
and mother's education, Kendall's tau-b was used. The
analysis indicated a significant positive association between
father's education and mother's education, tau = .494 (the value
reported in the Symmetric Measures table), p < .001. This means
that more highly educated fathers were married to more highly
educated mothers and less educated fathers were married to less
educated mothers. This tau is close to a large effect size
(Cohen, 1988).
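
Kendall's tau-b can be computed directly from a crosstabulation by counting concordant and discordant pairs and correcting for ties. The following is a minimal sketch; the cell counts are an assumption, recovered here from the "% of Total" values of the cross tabulation with N = 73, and the helper name `pairs` is illustrative.

```python
import math

# Cell counts (rows: mother education revised 1-3, columns: father education
# revised 1-3). Assumption: recovered from the "% of Total" values with N = 73.
table = [[43, 8, 2], [6, 10, 2], [0, 0, 2]]

n_rows, n_cols = len(table), len(table[0])
concordant = discordant = 0
for i in range(n_rows):
    for j in range(n_cols):
        for k in range(i + 1, n_rows):      # only cells in a higher row
            for m in range(n_cols):
                if m > j:                   # higher on both variables
                    concordant += table[i][j] * table[k][m]
                elif m < j:                 # higher on one, lower on the other
                    discordant += table[i][j] * table[k][m]


def pairs(count):
    """Number of distinct pairs that can be formed from `count` cases."""
    return count * (count - 1) // 2


n = sum(sum(row) for row in table)
tied_on_rows = sum(pairs(sum(row)) for row in table)        # same mother level
tied_on_cols = sum(pairs(sum(col)) for col in zip(*table))  # same father level

tau_b = (concordant - discordant) / math.sqrt(
    (pairs(n) - tied_on_rows) * (pairs(n) - tied_on_cols)
)
print(round(tau_b, 3))
```

With these counts the result rounds to the Kendall's tau-b value shown in the Symmetric Measures table (.494).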


Other Nonparametric Associational Statistics


Interpretation
Eta was used to investigate the strength of the association
between gender and number of math courses taken
(eta = .33). This is a weak to medium effect size (Cohen,
1988). Males were more likely than females to take several
or all of the math courses.


SUPERIOR GROUP OF COLLEGES


Lecture#8 Quantitative Technique


Descriptive Statistics


Lesson Objectives
After this session you will be able to:
Understand and infer results from data in order to answer
associational and differential research questions using different
parametric and non-parametric tests
Understand, implement and interpret the chi-square, phi and
Cramer's V statistics
Understand, implement and interpret the correlation statistics
Understand, implement and interpret the regression statistics
Understand, implement and interpret the t-test statistics


Lesson Outline
1. Non-parametric tests
   1. Chi-square / Fisher's exact
   2. Phi and Cramer's V
   3. Kendall's tau-b
   4. Eta
2. Parametric tests
   1. Correlation
      1. Pearson correlation
      2. Spearman correlation
   2. Regression
      1. Simple regression
      2. Multiple regression
   3. T-test
      1. One-sample t-test
      2. Independent-sample t-test
      3. Paired-sample t-test

Correlation
Correlation is a statistical technique that measures the mutual (reciprocal)
relationship between two variables that are thought to be related, such that
systematic changes in the value of one variable are accompanied by systematic
changes in the other, and vice versa.
It is used to determine
The existence of a relationship, which is indicated by the significance (p) value
The direction of the relationship, which is given by the sign (+, -) of the test value
The strength of the relationship, which is given by the magnitude of the test value
Correlation Coefficient (r)
The correlation coefficient measures the strength of the linear relationship between two
numerical variables. The value of the correlation coefficient can vary from -1.0
(a perfect negative correlation or association) through 0.0 (no correlation) to +1.0 (a
perfect positive correlation). Note that +1 and -1 are equally high or strong.
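
To make the range of r concrete, the following is a minimal sketch of the Pearson correlation coefficient computed from its definition. The function name and all data values are hypothetical and serve only to illustrate the three cases (perfect positive, perfect negative, and imperfect positive).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases exactly linearly with hours
falling = [10, 8, 6, 4, 2]   # decreases exactly linearly with hours
noisy = [2, 5, 4, 9, 8]      # increases with hours, but not exactly

print(pearson_r(hours, rising))
print(pearson_r(hours, falling))
print(round(pearson_r(hours, noisy), 3))
```

The first two results sit at the two ends of the scale and are equally strong; only the sign, and therefore the direction, differs.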


Correlation

Assumptions and conditions for Pearson correlation
The two variables have a linear relationship.
Scores on one variable are normally distributed for each value of the
other variable, and vice versa.
Outliers (i.e. extreme scores) can have a big effect on the correlation.


Correlation
Checking the assumptions for Pearson Correlation
The assumptions for the correlation test are checked with a
normal curve or the skewness statistic (normality assumption)
and a scatter plot (linearity assumption).

Statistics

                             math achievement test    scholastic aptitude test - math
N    Valid                   75                       75
     Missing                 0                        0
Skewness                     .044                     .128
Std. Error of Skewness       .277                     .277

Correlations

                                                          math achievement    scholastic aptitude
                                                          test                test - math
math achievement test              Pearson Correlation    1                   .788**
                                   Sig. (2-tailed)                            .000
                                   N                      75                  75
scholastic aptitude test - math    Pearson Correlation    .788**              1
                                   Sig. (2-tailed)        .000
                                   N                      75                  75
**. Correlation is significant at the 0.01 level (2-tailed).

Interpretation
To investigate whether there was a statistically significant association between scholastic
aptitude test and math achievement, a correlation was computed. Both variables were
approximately normally distributed and the relationship between them was linear, fulfilling the
assumptions for Pearson's correlation. Thus, Pearson's r was calculated: r = .79, p < .001,
indicating a highly significant relationship between the variables. The positive sign of
Pearson's r shows that the relationship is positive, which means that students who have high
scores on the math achievement test tend to have high scores on the scholastic aptitude test,
and vice versa. Using Cohen's (1988) guidelines, the effect size is large, indicating that there
is a strong relationship between math achievement and scholastic aptitude test.

Correlation

Spearman Correlation: If the assumptions for Pearson correlation are not
fulfilled, consider the Spearman correlation, which assumes only that the
relationship between the two variables is monotonic (consistently increasing
or decreasing), not necessarily linear.
Example: What is the association between mother's education
and math achievement?


Correlation
Correlations

                                                                     mother's     math achievement
                                                                     education    test
Spearman's rho    mother's education       Correlation Coefficient   1.000        .315**
                                           Sig. (2-tailed)                        .006
                  math achievement test    Correlation Coefficient   .315**       1.000
                                           Sig. (2-tailed)           .006
**. Correlation is significant at the 0.01 level (2-tailed).

Interpretation
To investigate whether there was a statistically significant association between mother's
education and math achievement, a correlation was computed. Mother's education was skewed
(skewness = 1.13), which violated the assumption of normality. Thus, the Spearman rho
statistic was calculated, rs(73) = .32, p = .006. The direction of the correlation was positive,
which means that students who have highly educated mothers tend to have higher math
achievement test scores, and vice versa. Using Cohen's (1988) guidelines, the effect size is
medium for studies in this area. The r² indicates that approximately 10% of the variance in
math achievement test scores can be predicted from mother's education.
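
Spearman's rho is the Pearson correlation applied to ranks, which is why it only requires a monotonic relationship. The following is a minimal sketch; the function names and the example data are hypothetical, and only the .315 coefficient is taken from the output table.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always rises with x, but along a curve.
x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 36]
print(spearman_rho(x, y), round(pearson_r(x, y), 3))

# Share of variance associated with the reported coefficient of .315.
shared_variance = 0.315 ** 2
print(round(shared_variance, 3))
```

For the curved example the rank-based coefficient is perfect while Pearson's r is not, and squaring .315 gives the "approximately 10%" quoted in the interpretation.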



REGRESSION ANALYSIS
Regression analysis is used to measure the relationship between two or
more variables. One variable is called the dependent (response, or outcome)
variable and the others are called independent (explanatory or predictor)
variables.

Regression Equation
Y = a + bx                           (simple regression)
Y = a + bx1 + cx2 + dx3 + ex4        (multiple regression)

Y = dependent variable
a = constant
b, c, d, e = slope coefficients
x, x1, x2, x3, x4 = independent variables

Types of regression analysis
Simple regression
Multiple regression
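
For the simple form Y = a + bx, the constant a and the slope b are usually estimated by least squares. The following is a minimal sketch of that estimate, assuming ordinary least squares (the method behind SPSS linear regression); the function name and the data are hypothetical.

```python
def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line Y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Slope: cross-products of deviations divided by squared deviations of x.
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    # Constant: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data generated exactly from Y = 1 + 2x.
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]
a, b = fit_simple_regression(x, y)
print(a, b)
```

Because the example data lie exactly on a line, the fit recovers the constant and slope that generated them; with real data the line is the one that minimizes the squared prediction errors.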


REGRESSION ANALYSIS

Simple Regression
Simple regression is used to check the contribution of the
independent variable to the dependent variable when there is
only one independent variable.
Assumptions and conditions of simple regression
The dependent variable should be scale
The relationship between the variables should be linear
The data should be independent

Example: Can we predict math achievement from grades in high school?

REGRESSION ANALYSIS

Commands
Analyze > Regression > Linear


REGRESSION ANALYSIS

Coefficientsa

                         Unstandardized Coefficients    Standardized Coefficients
Model                    B         Std. Error           Beta                         t        Sig.
1   (Constant)           .397      2.530                                             .157     .876
    grades in h.s.       2.142     .430                 .504                         4.987    .000
a. Dependent Variable: math achievement test

Interpretation
Simple regression was conducted to investigate how well grades in high school predict
math achievement scores. The results were statistically significant, F(1, 73) = 24.87,
p < .001. The identified equation for this relationship was math achievement =
.40 + 2.14 * (grades in high school). The adjusted R² value was .244. This indicates that 24%
of the variance in math achievement was explained by grades in high school.
According to Cohen (1988), this is a large effect.
Regression equation is

Y = 0.40 + 2.14X
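
The figures in this interpretation can be traced back to the Coefficients table. The following is a minimal sketch, assuming n = 75 (implied by the F(1, 73) degrees of freedom) and using two identities that hold for simple regression: F equals the squared t of the slope, and R² equals the squared standardized coefficient. The function name and the grade value passed to it are hypothetical.

```python
# Values taken from the Coefficients table.
constant, slope = 0.397, 2.142
slope_std_error = 0.430
reported_t = 4.987
beta = 0.504   # standardized coefficient
n = 75         # assumption: F(1, 73) implies n - 2 = 73


def predict_math_achievement(grades):
    """Predicted math achievement for a given grades-in-high-school value."""
    return constant + slope * grades


# t for the slope is the coefficient divided by its standard error.
t_slope = slope / slope_std_error

# In simple regression, F is the square of the slope's t value.
f_from_t = reported_t ** 2

# R squared is the squared standardized coefficient; adjust for one predictor.
r_squared = beta ** 2
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

print(round(predict_math_achievement(5), 2))  # hypothetical grades value of 5
print(round(t_slope, 2), round(f_from_t, 2), round(adjusted_r_squared, 3))
```

Small differences from the printed output (for example in t) come from working with coefficients that are already rounded to three decimals.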

REGRESSION ANALYSIS

Multiple Regression
Multiple regression is used to check the contribution of the
independent variables to the dependent variable when there is
more than one independent variable.
Assumptions and conditions of multiple regression
The dependent variable should be scale.

Example: How well can you predict math achievement from a
combination of four variables: grades in high school, father's
education, mother's education and gender?


REGRESSION ANALYSIS

Commands
Analyze > Regression > Linear

Coefficients

                           Unstandardized Coefficients    Standardized Coefficients
Model                      B          Std. Error          Beta                         t         Sig.
1   (Constant)             1.047      2.526                                            .415      .680
    grades in h.s.         1.946      .427                .465                         4.560     .000
    father's education     .191       .313                .083                         .610      .544
    mother's education     .406       .375                .141                         1.084     .282
    gender                 -3.759     1.321               -.290                        -2.846    .006

Interpretation
Simultaneous multiple regression was conducted to investigate the best predictors of
math achievement test scores. The means, standard deviations, and intercorrelations
can be found in the table. The combination of variables used to predict math achievement
(grades in high school, father's education, mother's education and gender) was
statistically significant, F = 10.40, p < .05. The beta coefficients are presented in the last
table. Note that high grades and male gender significantly predict math achievement
when all four variables are included. The adjusted R² value was .343. This indicates
that 34% of the variance in math achievement was explained by the model. According
to Cohen (1988), this is a large effect.
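
The Coefficients table can be read as a prediction equation with one slope per predictor. The following is a minimal sketch using the B, Std. Error and Sig. values from that table; the function name, the example predictor values, and the gender coding (0 = male, 1 = female, inferred from the negative coefficient together with the statement that males score higher) are assumptions.

```python
# (B, Std. Error, Sig.) for each row of the Coefficients table.
coefficients = {
    "constant": (1.047, 2.526, 0.680),
    "grades in h.s.": (1.946, 0.427, 0.000),
    "father's education": (0.191, 0.313, 0.544),
    "mother's education": (0.406, 0.375, 0.282),
    "gender": (-3.759, 1.321, 0.006),
}


def predict_math_achievement(grades, father_education, mother_education, gender):
    """Predicted score; gender is assumed to be coded 0 = male, 1 = female."""
    b = {name: values[0] for name, values in coefficients.items()}
    return (
        b["constant"]
        + b["grades in h.s."] * grades
        + b["father's education"] * father_education
        + b["mother's education"] * mother_education
        + b["gender"] * gender
    )


# Each t value is the coefficient divided by its standard error.
t_values = {name: b / se for name, (b, se, _) in coefficients.items()}

# Predictors whose Sig. value is below .05.
significant = [
    name
    for name, (_, _, sig) in coefficients.items()
    if name != "constant" and sig < 0.05
]

print({name: round(t, 2) for name, t in t_values.items()})
print(significant)
print(round(predict_math_achievement(6, 5, 4, 0), 2))  # hypothetical student
```

Holding the other three predictors fixed, the two gender codes differ by the gender coefficient (3.759 points), which is what "male gender significantly predicts math achievement" means in this model.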


Last lecture #11


T-TEST Statistics
The t test is used to compare two groups in order to answer differential
research questions. Its value indicates whether the groups differ, by
comparing their means.
Hypothesis for T-test
H0: There is no difference
H1: There is a difference
Types of T-test
There are three types of T-test
One-sample t-test
Independent-sample t-test
Paired-sample t-test

T-TEST Statistics
One sample t-test
The one-sample t-test is used to determine whether there is a difference between
the population mean (Test Value) and the sample mean (X).

Assumptions and conditions of the one-sample t-test
The dependent variable should be normally distributed within the population.
The data are independent (the scores of one participant do not depend on the
scores of the others; participants are independent of one another).

Example: Is the mean SAT-Math score in the modified HSB data set significantly
different from the presumed population mean of 500?

T-TEST Statistics

One-Sample Statistics

                                   N     Mean      Std. Deviation    Std. Error Mean
scholastic aptitude test - math    75    490.53    94.553            10.918

One-Sample Test

                                   Test Value = 500
                                                                                     95% Confidence Interval
                                                                                     of the Difference
                                   t       df    Sig. (2-tailed)   Mean Difference   Lower      Upper
scholastic aptitude test - math    -.867   74    .389              -9.467            -31.22     12.29

Interpretation:
To investigate the difference between the population and the sample, a one-sample
t-test was conducted. The One-Sample Statistics table provides basic
descriptive statistics for the variable under consideration. The mean SAT-Math
for the students in the sample is compared to the hypothesized population
mean, displayed as the Test Value in the One-Sample Test table. On the
bottom line of this table are the t value, df, and the two-tailed Sig. (p) value.
Note that p = .389, so we can say that the sample mean
(490.53) is not significantly different from the population mean of 500. The
table also provides the difference (-9.47) between the sample and population
means and the 95% Confidence Interval. The difference between the sample
and the population mean is likely to be between -31.22 and +12.29 points.
Notice that this range includes the value of zero, so it is possible that there is
no difference. Thus, the difference is not statistically significant.
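
The t value and confidence interval in the One-Sample Test table follow from the three numbers in the One-Sample Statistics table. The following is a minimal sketch; the critical value 1.9925 (two-tailed, 5%, df = 74) is an assumption taken from a t table rather than from the output, and the variable names are illustrative.

```python
import math

# Values from the One-Sample Statistics table.
n, sample_mean, sample_sd = 75, 490.53, 94.553
test_value = 500

# Assumption: two-tailed 5% critical t for df = 74, read from a t table.
t_critical = 1.9925

std_error = sample_sd / math.sqrt(n)       # Std. Error Mean
mean_difference = sample_mean - test_value
t = mean_difference / std_error
df = n - 1

# 95% confidence interval of the difference.
lower = mean_difference - t_critical * std_error
upper = mean_difference + t_critical * std_error

print(round(std_error, 3), round(t, 3), df)
print(round(lower, 2), round(upper, 2))
print(lower < 0 < upper)  # zero inside the interval: not significant
```

The results match the output up to rounding of the sample mean; the last line expresses the rule used in the interpretation, that an interval containing zero goes with a non-significant difference.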


T-TEST Statistics
Independent sample t-test

The independent-sample t-test is used to compare two independent groups
(e.g. male and female) with respect to their scores on the same dependent
variable.
Assumptions and conditions of Independent T-test
The variances of the dependent variable for the two categories of the
independent variable should be equal to each other
The dependent variable should be scale
Data on the dependent variable should be independent.

Example: Do male and female students differ significantly in
regard to their average math achievement scores?

T-TEST Statistics

The first table, Group Statistics, shows descriptive statistics for the two groups (males and females)
separately. Note that the means within each of the three pairs look somewhat different. This might be due
to chance, so we will check the t test in the next table.
The second table, Independent Samples Test, provides two statistical tests. The left two columns of
numbers are Levene's test for the assumption that the variances of the two groups are equal. This is
not the t test; it only assesses an assumption! If this F test is not significant (as in the case of math
achievement and grades in high school), the assumption is not violated, and one uses the Equal variances
assumed line for the t test and related statistics. However, if Levene's F is statistically significant
(Sig. < .05), as is true for visualization, then the variances are significantly different and the assumption
of equal variances is violated. In that case, the Equal variances not assumed line is used, and SPSS adjusts
t, df, and Sig.

Thus, for visualization, the appropriate values are t = 2.39, degrees of freedom (df) = 57.15, p = .020. This t
is statistically significant so, based on examining the means, we can say that boys have higher visualization
scores than girls. We used visualization to provide an example where the assumption of equal variances was
violated (Levene's test was significant). Note that for grades in high school, the t is not statistically
significant (p = .369), so we conclude that there is no evidence of a systematic difference between boys and
girls on grades. On the other hand, math achievement is statistically significant because p < .05; males have
higher means.
The 95% Confidence Interval of the Difference is shown in the two right-hand columns of the output. The
confidence interval tells us that if we repeated the study 100 times, about 95 of the resulting intervals
would contain the true (population) difference; for math achievement the interval runs from 1.05 points
to 6.97 points. Note that if the Upper and Lower bounds have the same sign (either + and + or - and -),
we know that the difference is statistically significant because this means that the null finding of zero
difference lies outside of the confidence interval. On the other hand, if zero lies between the upper and
lower limits, there could be no difference, as is the case for grades in h.s. The lower limit of the
confidence interval on math achievement tells us that the difference between males and females could be as
small as 1.05 points out of 25, which is the maximum possible score.
Effect size measures for t tests are not provided in the printout but can be estimated relatively easily. For
math achievement, the difference between the means (4.01) would be divided by about 6.4, an estimate
of the pooled (weighted average) standard deviation. Thus, d would be approximately .6, which is,
according to Cohen (1988), a medium to large sized effect. Because you need means and standard
deviations to compute the effect size, you should include a table with means and standard deviations in
your results section for a full interpretation of t tests.
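
The two lines of the Independent Samples Test output and the effect size d can be sketched from group summary statistics. The following is a minimal illustration, assuming the standard pooled-variance t for "Equal variances assumed" and the Welch-Satterthwaite adjustment for "Equal variances not assumed"; the function names and the group summaries in the last two calls are hypothetical, and only 4.01 and 6.4 come from the text.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Weighted average (pooled) standard deviation of two groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(mean_difference, sd):
    """Effect size d: difference between means in standard deviation units."""
    return mean_difference / sd


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """t and df for the 'Equal variances assumed' line."""
    sp = pooled_sd(sd1, n1, sd2, n2)
    t = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t and adjusted df for the 'Equal variances not assumed' line."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Effect size for math achievement, using the two figures quoted in the text.
print(round(cohens_d(4.01, 6.4), 2))

# Hypothetical group summaries (mean, SD, n for each group).
print(pooled_t(14.0, 6.0, 30, 10.0, 7.0, 40))
print(welch_t(14.0, 6.0, 30, 10.0, 7.0, 40))
```

When the two groups have equal sizes and equal standard deviations the two lines coincide; the adjusted df becomes fractional (as with df = 57.15 for visualization) once the variances differ.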

T-TEST Statistics
Paired sample t-test
The paired-sample t-test is used to compare two paired groups (e.g. mothers
and fathers) with respect to their scores on the same dependent variable.
Assumptions and conditions of Paired sample T-test
The independent variable is dichotomous and its levels (or groups) are
paired, or matched, in some way (husband-wife, pre-post, etc.)
The dependent variable is normally distributed in the two conditions

Example: Do students' fathers or mothers have more education?


Paired Samples Statistics

                                 Mean    N     Std. Deviation    Std. Error Mean
Pair 1    father's education     4.73    73    2.830             .331
          mother's education     4.14    73    2.263             .265

Paired Samples Correlations

                                                      N     Correlation    Sig.
Pair 1    father's education & mother's education     73    .681           .000

The first table shows the descriptive statistics used to compare mothers' and
fathers' education levels. The second table, Paired Samples Correlations, provides
the correlation between the two paired scores. The correlation (r = .68) between
mothers' and fathers' education indicates that highly educated men tend to marry
highly educated women, and vice versa. It doesn't tell you whether men or women
have more education. That is what t in the third table tells you.
The last table shows the Paired Samples t Test. The Sig. for the comparison of the
average education level of the students' mothers and fathers was p = .019. Thus, the
difference in educational level is statistically significant, and we can tell from the
means in the first table that fathers have more education; however, the effect size is
small (d = .28), which is computed by dividing the mean of the paired differences
(.59) by the standard deviation (2.1) of the paired differences. Also, we can tell from
the confidence interval that the difference in the means could be as small as .10 of a
point or as large as 1.08 points on the 2 to 10 scale.
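
The effect size and confidence interval quoted above can be derived from the mean and standard deviation of the paired differences. The following is a minimal sketch using the rounded figures from the text (.59 and 2.1) with N = 73; the critical value 1.9935 (two-tailed, 5%, df = 72) is an assumption taken from a t table, and the t value it prints is derived here rather than copied from the output.

```python
import math

n = 73                  # number of pairs
mean_difference = 0.59  # mean of the paired differences (4.73 - 4.14)
sd_difference = 2.1     # standard deviation of the paired differences

# Assumption: two-tailed 5% critical t for df = 72, read from a t table.
t_critical = 1.9935

# Effect size d for paired data: mean difference over SD of the differences.
d = mean_difference / sd_difference

# Paired t: mean difference divided by its standard error.
std_error = sd_difference / math.sqrt(n)
t = mean_difference / std_error

# 95% confidence interval of the mean difference.
lower = mean_difference - t_critical * std_error
upper = mean_difference + t_critical * std_error

print(round(d, 2), round(t, 2))
print(round(lower, 2), round(upper, 2))
```

Both bounds are positive, so zero lies outside the interval, which is consistent with the significant result (p = .019) reported for the paired comparison.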


Thank you!
