QTA Interpretation

Tests explained
1- Correlation analysis
2- ANOVA test
3- Independent t-test
4- Paired t-test
5- One sample t-test
6- Regression analysis
7- Binary logistic regression
8- Randomness
9- Normality
10- Multinomial logistic regression
Correlation analysis (metric-metric)
Correlations
(Pearson Correlation; Sig. (2-tailed) = .000 and N = 474 for every pair)

                            Educational     Employment   Current   Beginning
                            Level (years)   Category     Salary    Salary
Educational Level (years)   1               .514**       .661**    .633**
Employment Category         .514**          1            .780**    .755**
Current Salary              .661**          .780**       1         .880**
Beginning Salary            .633**          .755**       .880**    1
**. Correlation is significant at the 0.01 level (2-tailed).
A double star (**) shows that the correlation is significant at the 0.01 level: we are more than 99% sure that the relation is not due to chance.
A single star (*) shows that the correlation is significant at the 0.05 level: we are 95% sure that the relation is not due to chance.
No star means the correlation is insignificant, so no relation between the two variables can be claimed.
Example: r = .514** shows a significant, moderate and positive relation between educational level and employment category.
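The stars can be checked by hand. The sketch below converts a Pearson r into its t statistic, using r = .514 and N = 474 from the Correlations table. The function name is hypothetical, and the cut-off of about 2.58 (two-tailed, 0.01 level, large sample) is quoted from standard tables rather than from the SPSS output.

```python
import math

def correlation_t(r, n):
    """t statistic for testing Ho: no correlation, given Pearson r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# r and N are taken from the Correlations table (education vs. employment category)
t_value = correlation_t(0.514, 474)
print(round(t_value, 2))

# A |t| above roughly 2.58 corresponds to the double star (0.01 level, 2-tailed)
print(abs(t_value) > 2.58)
```

With 474 cases even a moderate r gives a very large t, which is why every pair in the table carries a double star.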
To purify the relation between beginning salary and current salary, use partial correlation.
Put beginning salary and current salary into Variables and all the other variables into Controlling for.
Correlations (partial)
Control Variables: Employee Code & Date of Birth & Educational Level (years) & Employment Category & Months since Hire & Previous Experience (months) & Minority Classification

                                             Current Salary   Beginning Salary
Current Salary     Correlation               1.000            .674
                   Significance (2-tailed)                    .000
                   df                                         464
Beginning Salary   Correlation               .674             1.000
                   Significance (2-tailed)   .000
                   df                        464

The pure (partial) correlation between beginning salary and current salary, after controlling for the other variables, is .674. It is lower than the simple correlation of .880 and is still significant (sig. .000).
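To see how controlling works, here is a minimal sketch of the first-order partial correlation formula, which controls for one variable only. It uses the simple correlations from the first Correlations table and controls for educational level alone, so it will not reproduce the .674 above (SPSS controlled for seven variables at once). The function name is hypothetical.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of a single control z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# x = Current Salary, y = Beginning Salary, z = Educational Level (years)
r_partial = partial_corr(r_xy=0.880, r_xz=0.661, r_yz=0.633)
print(round(r_partial, 3))
```

Controlling for education alone already pulls the correlation below .880. Adding more control variables, as SPSS did, can pull it down further.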
ANOVA test (groups are non-metric)
ANOVA
Current Salary
                 Sum of Squares   df    Mean Square   F         Sig.
Between Groups   8.944E10         2     4.472E10      434.481   .000
Within Groups    4.848E10         471   1.029E8
Total            1.379E11         473
As the sig value is .000 (less than 0.05), Ho is rejected: the mean current salaries of the three groups are not all the same. But which groups are the same?
To answer that, check the homogeneous subsets table.
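Current Salary
Tukey HSD(a,b)
                                Subset for alpha = 0.05
Employment Category   N         1             2
Clerical              363       $27,838.54
Custodial             27        $30,938.89
Manager               84                      $63,977.80
Sig.                            .227          1.000
Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 58.031.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
Clerical and Custodial fall in the same subset (sig. .227, greater than 0.05), so their average salaries are the same. Manager is alone in the second subset, so the managers' average salary is significantly different from the other two groups.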
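The F value in the ANOVA table for current salary can be rebuilt from its sums of squares. This is a small sketch with a hypothetical function name. The between-groups df of 2 is the number of groups (3) minus 1.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """F = mean square between groups / mean square within groups."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within

# Sums of squares and df from the one-way ANOVA table for Current Salary
f_value = f_ratio(8.944e10, 2, 4.848e10, 471)
print(round(f_value, 1))  # close to the reported 434.481 (inputs are rounded)
```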
Independent t-test
Levene's test:
Ho: variances are equal
H1: variances are not equal
Independent Samples Test (Weight)
                              Levene's Test for
                              Equality of Variances   t-test for Equality of Means
                              F      Sig.             t      df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       .221   .646             .542   13       .597              2.57143           4.74714                 -7.68414       12.82700
Equal variances not assumed                           .535   11.880   .602              2.57143           4.80380                 -7.90694       13.04980
As the Levene's sig value (.646) is greater than 0.05, accept Ho and conclude that the variances are equal, so read the "Equal variances assumed" row. The difference in means is insignificant (sig. .597, greater than 0.05); therefore it is concluded that the average weight of males is higher than that of females, but insignificantly (not by much).
Association analysis (cross tab or cross measuring, test of independence)
For a non-metric/non-metric relation
Symmetric Measures
                                                Value   Approx. Sig.
Nominal by Nominal   Contingency Coefficient    .229    .000
N of Valid Cases                                474
The value of the contingency coefficient is .229, which is weak. It is significant because the sig value (.000) is below 0.05. The coefficient cannot be negative, so it shows only the strength of the relation, not a direction.
Reporting:
There exists a significant but weak relation between employment category and minority classification.
Employment Category * Minority Classification Crosstabulation
                                                  Minority Classification
Employment Category                               No       Yes      Total
Clerical    Count                                 276      87       363
            % within Employment Category          76.0%    24.0%    100.0%
            % within Minority Classification      74.6%    83.7%    76.6%
            % of Total                            58.2%    18.4%    76.6%
Custodial   Count                                 14       13       27
            % within Employment Category          51.9%    48.1%    100.0%
            % within Minority Classification      3.8%     12.5%    5.7%
            % of Total                            3.0%     2.7%     5.7%
Manager     Count                                 80       4        84
            % within Employment Category          95.2%    4.8%     100.0%
            % within Minority Classification      21.6%    3.8%     17.7%
            % of Total                            16.9%    .8%      17.7%
Total       Count                                 370      104      474
            % within Employment Category          78.1%    21.9%    100.0%
            % within Minority Classification      100.0%   100.0%   100.0%
            % of Total                            78.1%    21.9%    100.0%
If employment category is the banner (read "% within Employment Category"), report:
Of the clerical employees, 76.0% do not belong to a minority and 24.0% belong to a minority.
Of the custodial employees, 51.9% do not belong to a minority and 48.1% belong to a minority. And so on.
If minority classification is the banner (read "% within Minority Classification"), report:
Of the employees who do not belong to a minority, 74.6% are in the clerical employment category.
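The contingency coefficient reported under Symmetric Measures can be recomputed from the counts in the crosstabulation. The sketch below first computes the chi-square statistic from the observed and expected counts and then applies C = sqrt(chi-square / (chi-square + N)). The function name is hypothetical.

```python
import math

def chi_square(table):
    """Pearson chi-square for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

# Rows: Clerical, Custodial, Manager; columns: minority No, Yes
counts = [[276, 87], [14, 13], [80, 4]]
chi2, n = chi_square(counts)
contingency = math.sqrt(chi2 / (chi2 + n))
print(round(contingency, 3))
```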
Regression analysis
Linear regression
Things to report:
1- Acceptance/rejection of the hypothesis
2- R, Adjusted R-square, S.E.
3- Presence of multicollinearity, heteroscedasticity, autocorrelation
4- Model (regression model)
5- Model description
First of all check the ANOVA table.
If the sig value is less than 0.05 then the regression is fitted.
ANOVA(b)
Model 1       Sum of Squares   df    Mean Square   F         Sig.
Regression    1.148E11         4     2.869E10      581.575   .000(a)
Residual      2.314E10         469   4.934E7
Total         1.379E11         473
a. Predictors: (Constant), Previous Experience (months), Beginning Salary, Educational Level (years), Employment Category
b. Dependent Variable: Current Salary
Interpretation:
As the sig value is less than 0.05, reject Ho and conclude that at least one of BS, JC, EL, PE affects the current salary of the employees.
BS = Beginning Salary
JC = Job Category (Employment Category)
EL = Educational Level
PE = Previous Experience
Next, the Model Summary table:
R shows the strength of the relationship between the dependent variable and the selected independent variables.
Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .912(a)   .832       .831                $7,024.152                   1.753
a. Predictors: (Constant), Previous Experience (months), Beginning Salary, Educational Level (years), Employment Category
b. Dependent Variable: Current Salary
R = 0.912: there exists a strong relationship (do not focus on sign) between current salary and BS, JC, EL, PE.
Adjusted R-square (used because it corrects for the number of predictors, so it is the less biased measure):
Adjusted R-square is .831, which means the model explains 83.1% of the variation in current salary.
S.E. is the typical size of the prediction error (allowed margin of error).
Go to the data and check one case: the actual salary is 57,000 and the model predicted 57,226. The difference of -226 is well under 7,024 (S.E.), which means the model predicts this case well.
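The Model Summary figures follow directly from the regression ANOVA table. The sketch below recomputes R-square, adjusted R-square and the standard error of the estimate from the sums of squares, with n = 474 cases and k = 4 predictors. The variable names are chosen here for illustration.

```python
import math

# Values from the regression ANOVA table
ss_regression, ss_residual, ss_total = 1.148e11, 2.314e10, 1.379e11
n_cases, n_predictors = 474, 4
df_residual = n_cases - n_predictors - 1  # 469

r_square = ss_regression / ss_total
adj_r_square = 1 - (1 - r_square) * (n_cases - 1) / df_residual
std_error = math.sqrt(ss_residual / df_residual)

print(round(r_square, 3), round(adj_r_square, 3), round(std_error))
```

Small differences from the SPSS output come from the rounding of the sums of squares in the printed table.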
To check multicollinearity go to the Coefficients table.
VIF greater than 10 means multicollinearity exists; VIF less than 10 means it does not exist.
Coefficients(a)
                               Unstandardized Coefficients   Standardized                     Collinearity Statistics
Model 1                        B            Std. Error       Beta           t        Sig.    Tolerance   VIF
(Constant)                     -3068.271    1782.508                        -1.721   .086
Educational Level (years)      601.303      155.934          .102           3.856    .000    .515        1.940
Employment Category            5930.283     640.029          .269           9.266    .000    .426        2.348
Beginning Salary               1.342        .070             .618           19.035   .000    .339        2.950
Previous Experience (months)   -19.031      3.327            -.117          -5.720   .000    .861        1.161
a. Dependent Variable: Current Salary
All of the VIF values are less than 10, which shows that no multicollinearity exists and all the coefficients show the pure effect of their corresponding variable on the target variable.
Or
A Tolerance value greater than 0.1 means no multicollinearity exists.
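The two rules are the same rule, because VIF is simply 1 / Tolerance, so VIF below 10 is equivalent to Tolerance above 0.1. A short check with the Tolerance values from the Coefficients table (the dictionary keys are the abbreviations used in these notes):

```python
# Tolerance values from the Coefficients table
tolerance = {"EL": 0.515, "JC": 0.426, "BS": 0.339, "PE": 0.861}

# VIF is the reciprocal of Tolerance
vif = {name: 1 / tol for name, tol in tolerance.items()}
has_multicollinearity = any(value > 10 for value in vif.values())

for name, value in vif.items():
    print(name, round(value, 3))
print("multicollinearity:", has_multicollinearity)
```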
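For autocorrelation use the Durbin-Watson (D.W) value from the Model Summary table.
D.W close to 2: no autocorrelation.
D.W far from 2 (towards 0 or 4): autocorrelation exists. A common rule of thumb accepts values between about 1.5 and 2.5.
For example, 1.453 is not close enough to 2, so autocorrelation would exist.
Our value 1.753 is approximately equal to 2, so there is no autocorrelation.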
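For reference, the Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below uses two made-up residual series (not taken from the employee data) to show how the value moves away from 2.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residual series, for illustration only
alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips every step (negative autocorrelation)
clustered = [1.0, 1.0, -1.0, -1.0]    # neighbours look alike (positive autocorrelation)

print(durbin_watson(alternating))  # above 2
print(durbin_watson(clustered))    # below 2
```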
Model:
Y = α + β1X1 + β2X2 + β3X3 + β4X4 + E
CS = α + β1 (BS) + β2 (JC) + β3 (EL) + β4 (PE) + E
CS = -3068 + 1.342 (BS) + 5930 (JC) + 601 (EL) - 19 (PE)
(the error term is removed in the final model).
Now check the sig values: if a sig value is greater than 0.05, exclude that term from the model.
As the sig value of the constant (.086) is greater than 0.05, remove the constant from the model (strictly, the model should then be re-estimated without the constant, because the other coefficients change):
CS = 1.342 (BS) + 5930 (JC) + 601 (EL) - 19 (PE)
CS will increase by $1.34 if the beginning salary increases by $1, keeping the other variables constant, because salary is measured in units of $1.
CS will increase by $19 if the previous experience decreases by 1 month.
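As a sketch of how the fitted model is used, the function below plugs values into the equation using the B column of the Coefficients table. The function name and the input values (an employee with beginning salary 27,000, job category 3, 15 years of education and 144 months of experience) are assumptions made only for illustration.

```python
def predict_current_salary(bs, jc, el, pe, include_constant=True):
    """Predicted current salary from the unstandardized coefficients (B column)."""
    constant = -3068.271 if include_constant else 0.0
    return constant + 1.342 * bs + 5930.283 * jc + 601.303 * el - 19.031 * pe

# Hypothetical employee, values assumed for illustration
predicted = predict_current_salary(bs=27000, jc=3, el=15, pe=144)
print(round(predicted, 2))

# One extra month of previous experience lowers the prediction by about 19
one_month_more = predict_current_salary(bs=27000, jc=3, el=15, pe=145)
print(round(one_month_more - predicted, 3))
```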
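From the standardized coefficients (Beta):
They show the relative strength of each variable and tell us which one is more effective (read them as absolute values):
BS (.618, about 62%)
JC (.269, about 27%)
EL (.102, about 10%)
PE (-.117, about 12%)
BS has the most effect on current salary because its standardized coefficient is the largest (.618).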
Binary Logistic Regression

Effect of income, age, education, years at current address and years with current employer on the default status of a customer. File name: bankloan.sav
Dependent: default status (yes/no, binary)
Things to report:
1- Goodness of fit
2- Omnibus test of model coefficients (Block 1)
3- Block 0
4- Model Summary
5- Block 1
Goodness of fit comes from the Hosmer and Lemeshow test.
If the sig value is greater than 0.05 then the model is fit.
Hosmer and Lemeshow Test
Step   Chi-square   df   Sig.
1      11.297       8    .185
The fit of the model is approved by the test because the sig value is 0.185, which is greater than 0.05. If the value comes out less than 0.05, the model does not fit the data and the analysis should stop here.
Omnibus Tests of Model Coefficients
                 Chi-square   df   Sig.
Step 1   Step    252.214      4    .000
         Block   252.214      4    .000
         Model   252.214      4    .000
The sig value is the same for all three (step, block and model) when all the variables are entered together in one step.
In this case the covariates have a significant ability to predict the target variable because the sig value is less than 0.05.
Block 0
Classification Table(a,b,c)
                                         Predicted
                                         Previously defaulted
Observed                                 No     Yes    Percentage Correct
Step 0   Previously defaulted   No       0      517    .0
                                Yes      0      183    100.0
         Overall Percentage                            26.1
a. No terms in the model.
b. Initial Log-likelihood Function: -2 Log Likelihood = 970.406
c. The cut value is .500
The information in this block refers to the fluke (the model with no predictors).
The cut-off value is .500.
The result shows that the fluke can predict the default status correctly in only 26.1% of cases.
Variables not in the Equation
                                   Score     df   Sig.
Step 0   Variables    employ       195.424   1    .000
                      address      148.225   1    .000
                      income       117.972   1    .000
                      age          169.521   1    .000
         Overall Statistics        215.271   4    .000
Variables with a sig value less than 0.05 must be included in the model in order to better predict the default status.
Model Summary
Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      718.192(a)          .303                   .403
a. Estimation terminated at iteration number 4 because parameter estimates changed by less than .001.
Nagelkerke R Square
It shows that the prediction of defaulters may improve by approximately 40.3% when their age, income, address and current employer are considered.
Cox & Snell R Square
Cox & Snell R Square is always less than Nagelkerke R Square.
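Both pseudo R-squares can be rebuilt from the two -2 Log Likelihood values in the output (970.406 in Block 0 and 718.192 in the Model Summary) and the number of cases (517 + 183 = 700). The sketch below uses the standard Cox & Snell and Nagelkerke formulas. The function name is hypothetical.

```python
import math

def pseudo_r_squares(neg2ll_null, neg2ll_model, n):
    """Cox & Snell and Nagelkerke R-square from the two -2 log likelihood values."""
    cox_snell = 1 - math.exp(-(neg2ll_null - neg2ll_model) / n)
    max_cox_snell = 1 - math.exp(-neg2ll_null / n)  # upper limit of Cox & Snell
    return cox_snell, cox_snell / max_cox_snell

cox_snell, nagelkerke = pseudo_r_squares(970.406, 718.192, 700)
print(round(cox_snell, 3), round(nagelkerke, 3))
```

Nagelkerke divides Cox & Snell by its maximum possible value, which is below 1, so Cox & Snell always comes out smaller. The difference 970.406 - 718.192 = 252.214 is also the omnibus chi-square.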
Variables in the Equation
                       B       S.E.   Wald     df   Sig.   Exp(B)
Step 1(a)   employ     -.164   .023   51.984   1    .000   .849
            address    -.051   .018   8.242    1    .004   .950
            income     .013    .004   13.148   1    .000   1.013
            age        -.001   .006   .051     1    .821   .999
a. Variable(s) entered on step 1: employ, address, income, age.
Using the B column of the table:
Lf = -0.164 (emp) - 0.051 (add) + 0.013 (income) - 0.001 (age)
On the basis of the sig values include only the significant variables (age, sig .821, is excluded):
Lf = -0.164 (emp) - 0.051 (add) + 0.013 (income)
Classification Table(a)
                                         Predicted
                                         Previously defaulted
Observed                                 No     Yes    Percentage Correct
Step 1   Previously defaulted   No       487    30     94.2
                                Yes      153    30     16.4
         Overall Percentage                            73.9
a. The cut value is .500
This model can predict the default status of a customer 73.9% accurately.
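The percentages in the Step 1 classification table are simple ratios of the counts. A minimal sketch (variable names chosen here for illustration):

```python
# Rows = observed (No, Yes); columns = predicted (No, Yes), from the Step 1 table
confusion = [[487, 30],
             [153, 30]]

total = sum(sum(row) for row in confusion)
overall = (confusion[0][0] + confusion[1][1]) / total * 100
correct_no = confusion[0][0] / sum(confusion[0]) * 100   # non-defaulters classified correctly
correct_yes = confusion[1][1] / sum(confusion[1]) * 100  # defaulters classified correctly

print(round(overall, 1), round(correct_no, 1), round(correct_yes, 1))
```

The overall figure hides the fact that only 16.4% of the actual defaulters are identified, so the per-row percentages are worth reporting as well.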
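A + sign on B means the chances of default increase and a - sign means the chances decrease.
The odds of default are reduced by about 5% (Exp(B) = .950) for each additional year at the current address.
The odds of default are increased by about 1.3% (Exp(B) = 1.013) for each additional thousand of income.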
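The Exp(B) column is the exponential of B, and (Exp(B) - 1) x 100 is the percentage change in the odds of default for a one-unit increase in the variable. A short check against the Variables in the Equation table:

```python
import math

# B values from the Variables in the Equation table
coefficients = {"employ": -0.164, "address": -0.051, "income": 0.013, "age": -0.001}

odds_ratio = {name: math.exp(b) for name, b in coefficients.items()}
pct_change = {name: (ratio - 1) * 100 for name, ratio in odds_ratio.items()}

for name in coefficients:
    print(name, round(odds_ratio[name], 3), round(pct_change[name], 1))
```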
Randomness
Ho: data are random
H1: data are not random
Runs Test
                         Italy    South Korea   Romania   France   China    United States   Russia   Enthusiast
Test Value(a)            8.4857   8.8953        8.1063    8.9553   8.0387   8.8367          8.1533   8.5050
Cases < Test Value       140      125           171       133      163      131             163      152
Cases >= Test Value      160      175           129       167      137      169             137      148
Total Cases              300      300           300       300      300      300             300      300
Number of Runs           158      154           150       152      154      146             153      148
Z                        .891     .853          .229      .343     .481     -.305           .364     -.344
Asymp. Sig. (2-tailed)   .373     .394          .819      .732     .631     .760            .716     .731
a. Mean
Italy: the sig value is 0.373, which is greater than 0.05, therefore accept Ho: the data are random.
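The Z and Asymp. Sig. values of the runs test can be reproduced from the three counts in each column. The sketch below applies the large-sample normal approximation (no continuity correction) to the Italy column. The function name is hypothetical.

```python
import math

def runs_test(n_below, n_above, runs):
    """Z and two-tailed p-value of the runs test (normal approximation)."""
    n = n_below + n_above
    expected = 2 * n_below * n_above / n + 1
    variance = (2 * n_below * n_above * (2 * n_below * n_above - n)) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed area under the normal curve
    return z, p

# Italy: 140 cases below the mean, 160 at or above it, 158 runs
z_italy, p_italy = runs_test(140, 160, 158)
print(round(z_italy, 3), round(p_italy, 3))
```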
Normality
Only for metric variables
Tests of Normality
                 Kolmogorov-Smirnov(a)        Shapiro-Wilk
                 Statistic   df    Sig.       Statistic   df    Sig.
Current Salary   .208        474   .000       .771        474   .000
a. Lilliefors Significance Correction
If the sig value is greater than alpha (0.05) then the data are normal. In this case it is less than 0.05, which means the data are not normal.
Descriptives
Current Salary                                      Statistic     Std. Error
Mean                                                $34,419.57    $784.311
95% Confidence Interval for Mean    Lower Bound     $32,878.40
                                    Upper Bound     $35,960.73
5% Trimmed Mean                                     $32,455.19
Median                                              $28,875.00
Variance                                            2.916E8
Std. Deviation                                      $17,075.661
Minimum                                             $15,750
Maximum                                             $135,000
Range                                               $119,250
Interquartile Range                                 $13,163
Skewness                                            2.125         .112
Kurtosis                                            5.378         .224
If the skewness is close to 0 then the data are approximately normal, otherwise not. Here the skewness is 2.125, far from 0, so the data are positively skewed and not normal.
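"Close to 0" can be judged by dividing the skewness by its standard error. The sketch below recomputes the standard error of skewness for n = 474 with the usual formula and forms the ratio. The cut-off of about 1.96 is the common 5% rule of thumb and is an assumption here, not something shown in the SPSS output.

```python
import math

def skewness_std_error(n):
    """Standard error of the sample skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

se_skew = skewness_std_error(474)
z_skew = 2.125 / se_skew  # skewness from the Descriptives table

print(round(se_skew, 3))  # matches the Std. Error column for Skewness
print(z_skew > 1.96)      # far beyond the rule-of-thumb limit, so not normal
```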

One sample t-test
The claim was that the average salary is 30,000.
One-Sample Test
                 Test Value = 30000
                                                                     95% Confidence Interval of the Difference
                 t       df    Sig. (2-tailed)   Mean Difference     Lower        Upper
Current Salary   5.635   473   .000              $4,419.568          $2,878.40    $5,960.73
The t value (5.635) is greater than 2, therefore reject the claim.
After applying the one sample t-test, it is identified that there exists a difference of $4,419 between the claim value and the sample average, which appears significant (sig. 0.000); therefore it is concluded that the average salary of the employees is significantly higher than the claim value. Therefore the claim is considered wrong.
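The t value is the mean difference divided by the standard error of the mean. The sketch below rebuilds it from the standard deviation and N reported in the Descriptives table for current salary. The variable names are chosen here for illustration.

```python
import math

claim = 30000
sample_mean = 34419.568   # claim + mean difference from the One-Sample Test table
std_deviation = 17075.661
n_cases = 474

std_error = std_deviation / math.sqrt(n_cases)
t_value = (sample_mean - claim) / std_error

print(round(std_error, 3), round(t_value, 3))
print(abs(t_value) > 2)  # rule of thumb used in these notes
```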
Paired t-test
Paired Samples Correlations
                               N   Correlation   Sig.
Pair 1   VAR00001 & VAR00002   8   .946          .000
.946 is a strong correlation and the sig value is 0.000, which is significant.
Paired Samples Test
                                Paired Differences
                                                                       95% Confidence Interval of the Difference
                                Mean      Std. Deviation   Std. Error Mean   Lower     Upper     t       df   Sig. (2-tailed)
Pair 1   VAR00001 - VAR00002    1.87500   3.09089          1.09279           -.70904   4.45904   1.716   7    .130
Std. error of the mean difference: 1.09279.
The t value (1.716) is less than 2, so accept Ho.
It is identified that the before average is insignificantly higher than the after average (sig. .130, greater than 0.05); therefore we can say that both averages are insignificantly different from each other, whereas both variables have a significant strong relation between them.
The 95% confidence interval of the difference (-.70904 to 4.45904) includes 0, which confirms that the mean difference of 1.87500 is not significant.
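The standard error, t value and confidence interval of the paired test can be rebuilt from the mean and standard deviation of the differences. In the sketch below N = 8 is assumed from df = 7, and the critical value 2.3646 (two-tailed, 5%, df = 7) is quoted from a t table rather than computed.

```python
import math

mean_diff, sd_diff = 1.875, 3.09089  # from the Paired Samples Test table
n_pairs = 8                          # assumed from df = 7
t_critical = 2.3646                  # t table value for df = 7, 5% two-tailed

std_error = sd_diff / math.sqrt(n_pairs)
t_value = mean_diff / std_error
lower = mean_diff - t_critical * std_error
upper = mean_diff + t_critical * std_error

print(round(std_error, 5), round(t_value, 3))
print(round(lower, 3), round(upper, 3))
print("interval includes 0:", lower < 0 < upper)
```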
Multinomial logistic regression

This regression is used to predict/model the dependent variable when it is non-metric with more than 2 categories (multichotomous).
As it is a logistic regression, it is based upon the assumptions of the GLM (exponential family); therefore there is no restriction of normality, size of data, or number and type of independent variables.
That means autocorrelation and heteroscedasticity need not be checked in logistic regression.
The target variable must be non-metric.
Dependent: diagnosis/disease, with 4 categories, therefore use multinomial logistic regression.

Analyze > Regression > Multinomial Logistic
For the dependent variable SPSS shows the reference category; by default it takes the last category, but you can choose any of the four categories of your target variable.
All the independent variables should be entered as covariates.
The likelihood ratio test is used to check whether the model with predictors is better than the model without them (the fluke) or not. If it is found significant (sig < 0.05), the M-logit is better than the fluke.
The goodness of fit test shows whether the M-logit model is fitted (sig > 0.05 means fitted, that is, compatible with this data).
In the pseudo R-square table check the Nagelkerke value. Nagelkerke R-square shows the approximate betterment in accuracy that can be achieved by using the M-logit model. In this case approximately 100% accuracy may be achieved.
Only include those variables which are significant and exclude the insignificant ones. Significant variables help to predict the desired category, while insignificant variables are useless predictors.
In this case only two variables (tidi, time) are the differentiating variables among all the diagnoses.
Findings: there is no significant difference in time and tidi in AN with reference to AED. This means that time and tidi are approximately the same for AN and AED.
In case of significance, the model for AN would be:
Lf (Diagnosis = AN) = -113.28 (tidi) + 453 (time)
Showing: the log-odds of AN, with reference to AED, decrease by 113.28 for each unit increase in tidi.
The log-odds of AN, with reference to AED, increase by 453 for each unit increase in time.
These models are K-1 in number (K stands for the number of categories; with four categories there are three models).
These models are used to compute the probabilities of occurrence of each category.
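A minimal sketch of how the K-1 models turn into K probabilities: each non-reference category has its own Lf value, the reference category (AED) is fixed at 0, and every exponential is divided by the sum of all of them. The Lf values below are made up for illustration and are not taken from the SPSS output.

```python
import math

def category_probabilities(lf_values, reference="AED"):
    """Probabilities for all K categories from the K-1 logit values.

    lf_values maps each non-reference category to its Lf value.
    """
    exp_terms = {name: math.exp(value) for name, value in lf_values.items()}
    denominator = 1 + sum(exp_terms.values())  # the 1 is exp(0) for the reference
    probabilities = {name: term / denominator for name, term in exp_terms.items()}
    probabilities[reference] = 1 / denominator
    return probabilities

# Hypothetical Lf values for one patient (three models for four categories)
probs = category_probabilities({"AN": 0.5, "ABN": -1.0, "BNA": 0.0})
for name, p in probs.items():
    print(name, round(p, 3))
```

The patient is assigned to the category with the highest probability. A category with Lf = 0 gets the same probability as the reference category.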
Classification
                                          Predicted
Observed                                  AN      ABN     BNA     AED     Percent Correct
Anorexia Nervosa (AN)                     97      0       0       0       100.0%
Anorexia with Bulimia Nervosa (ABN)       0       36      0       0       100.0%
Bulimia Nervosa after Anorexia (BNA)      0       0       56      0       100.0%
Atypical Eating Disorder (AED)            0       0       0       28      100.0%
Overall Percentage                        44.7%   16.6%   25.8%   12.9%   100.0%
All cases are classified correctly, so these models are correct.

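For practice, run the multinomial test in SPSS on the employee data file and the breakfast.sav file. Also check the telecom data file, in which you predict customer category (dependent) from some independent variables.
Abbreviations: AN = Anorexia Nervosa, ABN = Anorexia with Bulimia Nervosa, BNA = Bulimia Nervosa after Anorexia, AED = Atypical Eating Disorder (reference category).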