Submitted to :
Prof. Shailaja Rego
Submitted By:
Deepak Joshi (I027)
Hardik Ranka (I048)
Research Objective: To map the profile of individuals based on internet use activities
Data Source: Textbook named SPSS 17.0 for Researchers by Dr. S.L. Gupta
Cluster Analysis
ABOUT THE DATA:
We used data from 31 respondents to map their profiles based on internet use activities. The
respondents answered 16 questions on a rating scale of 1-5.
Rating scale: Never-1, Occasionally-2, Considerably-3, Almost always-4, Always-5
Variables used:
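1. Collecting Product/service information
2. Collecting information of current vendor
3. Searching and collecting information of new vendor
4. Collecting competitive and other information for purchase
5. Cost/price comparison
6. Email
7. Web conferencing
8. Electronic Data Interchange
9. Discussion Groups
10. Just-in-time Inventory planning
11. Online negotiation
12. Online bidding
13. Online payment
14. Online ordering
15. Online status checking
16. Online product/service support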
Step 1:
Hierarchical Clustering: Determining the number of clusters
Cases        N    Percent
Valid       31    100.0
Missing      0       .0
Total       31    100.0
From this table, we find that all the 31 cases are valid.
Agglomeration Schedule
Stage  Coefficients      Stage  Coefficients      Stage  Coefficients
  1        1.000          11       21.000          21       63.450
  2        2.000          12       24.000          22       70.700
  3        3.500          13       27.000          23       79.075
  4        5.000          14       30.500          24       87.589
  5        6.500          15       34.500          25       96.389
  6        8.000          16       38.500          26      106.681
  7       10.500          17       42.750          27      117.798
  8       13.000          18       47.250          28      135.798
  9       15.500          19       52.200          29      166.756
 10       18.000          20       57.450          30      220.452
From the agglomeration schedule, we find that there is a sudden large jump in the coefficient at
stage 28, from 117.798 to 135.798.
Hence,
No. of clusters = Total sample size - 28 = 31 - 28 = 3
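The rule above can be checked with a short sketch. It lists the stage-to-stage increase in the agglomeration coefficient for the last few stages and then applies the report's formula (sample size minus the stage of the jump). The helper names are illustrative, and choosing stage 28 as the jump remains a judgment made by reading the increments, not something the code decides.

```python
# Agglomeration coefficients for stages 1..30, taken from the schedule.
coefficients = [
    1.000, 2.000, 3.500, 5.000, 6.500, 8.000, 10.500, 13.000, 15.500, 18.000,
    21.000, 24.000, 27.000, 30.500, 34.500, 38.500, 42.750, 47.250, 52.200, 57.450,
    63.450, 70.700, 79.075, 87.589, 96.389, 106.681, 117.798, 135.798, 166.756, 220.452,
]
n_cases = 31


def increments(coefs):
    """Map each stage (2..len(coefs)) to its increase over the previous stage."""
    return {
        stage: round(coefs[stage - 1] - coefs[stage - 2], 3)
        for stage in range(2, len(coefs) + 1)
    }


def clusters_from_jump_stage(sample_size, jump_stage):
    """Report's rule: number of clusters = sample size - stage of the jump."""
    return sample_size - jump_stage


inc = increments(coefficients)
for stage in range(25, 31):
    print(f"stage {stage}: coefficient {coefficients[stage - 1]:.3f}, increase {inc[stage]:.3f}")

n_clusters = clusters_from_jump_stage(n_cases, 28)
print("clusters:", n_clusters)
```

Printing the increments side by side makes it easier to see where the coefficient starts rising sharply before settling on a stage.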
[Dendrogram, rescaled distance axis from 0 to 25. Cases in order from top to bottom:
19, 20, 8, 9, 21, 12, 18, 10, 17, 14, 16, 1, 29, 30, 28, 11, 22, 25, 31, 27, 24, 26, 15, 23,
2, 13, 6, 7, 4, 5, 3]
The above dendrogram shows that the longest horizontal lines correspond to the 3-cluster solution,
marked by the thick dotted line (the dotted line intersects three horizontal lines). The three
clusters are:
Cluster 1: cases 19, 20, 8, 9, 21, 12, 18, 10, 17, 14, 16, 1
Cluster 2: cases 29, 30, 28, 11, 22, 25, 31, 27, 24, 26, 15, 23
Cluster 3: cases 2, 13, 6, 7, 4, 5, 3
The cluster membership table shows similar results.
Descriptives

Cluster     N   Mean   Std. Deviation   Std. Error   95% CI Lower   95% CI Upper   Min   Max

Collecting Product/service information
  1        12   4.00       .000            .000          4.00           4.00         4     4
  2         7   4.14       .378            .143          3.79           4.49         4     5
  3        12   4.00       .426            .123          3.73           4.27         3     5
  Total    31   4.03       .315            .056          3.92           4.15         3     5
Collecting information of current vendor
  1        12   3.00       .603            .174          2.62           3.38         2     4
  2         7   4.43       .535            .202          3.93           4.92         4     5
  3        12   2.67       .778            .225          2.17           3.16         2     4
  Total    31   3.19       .946            .170          2.85           3.54         2     5
Searching and collecting information of new vendor
  1        12   4.42       .515            .149          4.09           4.74         4     5
  2         7   4.57       .535            .202          4.08           5.07         4     5
  3        12   4.67       .492            .142          4.35           4.98         4     5
  Total    31   4.55       .506            .091          4.36           4.73         4     5
Collecting competitive and other information for purchase
  1        12   4.08       .669            .193          3.66           4.51         3     5
  2         7   4.29       .756            .286          3.59           4.98         3     5
  3        12   4.00       .000            .000          4.00           4.00         4     4
  Total    31   4.10       .539            .097          3.90           4.29         3     5
Cost/price comparison
  1        12   4.00       .000            .000          4.00           4.00         4     4
  2         7   3.71       .756            .286          3.02           4.41         3     5
  3        12   3.33       .651            .188          2.92           3.75         2     4
  Total    31   3.68       .599            .108          3.46           3.90         2     5
Email
  1        12   4.58       .515            .149          4.26           4.91         4     5
  2         7   4.86       .378            .143          4.51           5.21         4     5
  3        12   4.25       .452            .131          3.96           4.54         4     5
  Total    31   4.52       .508            .091          4.33           4.70         4     5
Web conferencing
  1        12   3.42       .669            .193          2.99           3.84         3     5
  2         7   3.29       .488            .184          2.83           3.74         3     4
  3        12   2.92       .515            .149          2.59           3.24         2     4
  Total    31   3.19       .601            .108          2.97           3.41         2     5
Electronic Data Interchange
  1        12   3.00       .739            .213          2.53           3.47         2     4
  2         7   4.00       .000            .000          4.00           4.00         4     4
  3        12   2.58       .793            .229          2.08           3.09         2     4
  Total    31   3.06       .854            .153          2.75           3.38         2     4
Discussion Groups
  1        12   2.83       .577            .167          2.47           3.20         2     4
  2         7   3.00       .000            .000          3.00           3.00         3     3
  3        12   1.25       .452            .131           .96           1.54         1     2
  Total    31   2.26       .930            .167          1.92           2.60         1     4
Just-in-time Inventory planning
  1        12   3.42       .996            .288          2.78           4.05         2     5
  2         7   3.86       .378            .143          3.51           4.21         3     4
  3        12   2.75       .452            .131          2.46           3.04         2     3
  Total    31   3.26       .815            .146          2.96           3.56         2     5
Online negotiation
  1        12   4.25       .754            .218          3.77           4.73         3     5
  2         7   2.71       .488            .184          2.26           3.17         2     3
  3        12   4.00       .000            .000          4.00           4.00         4     4
  Total    31   3.81       .792            .142          3.52           4.10         2     5
Online bidding
  1        12   1.83       .835            .241          1.30           2.36         1     4
  2         7   2.86       .900            .340          2.03           3.69         2     4
  3        12   1.50       .522            .151          1.17           1.83         1     2
  Total    31   1.94       .892            .160          1.61           2.26         1     4
Online payment
  1        12   4.25       .622            .179          3.86           4.64         3     5
  2         7   4.43       .535            .202          3.93           4.92         4     5
  3        12   3.50       .522            .151          3.17           3.83         3     4
  Total    31   4.00       .683            .123          3.75           4.25         3     5
Online ordering
  1        12   4.08       .289            .083          3.90           4.27         4     5
  2         7   4.14       .378            .143          3.79           4.49         4     5
  3        12   3.92       .289            .083          3.73           4.10         3     4
  Total    31   4.03       .315            .056          3.92           4.15         3     5
Online status checking
  1        12   4.17       .577            .167          3.80           4.53         3     5
  2         7   4.14       .378            .143          3.79           4.49         4     5
  3        12   3.75       .622            .179          3.36           4.14         3     5
  Total    31   4.00       .577            .104          3.79           4.21         3     5
Online product/service support
  1        12   4.17       .389            .112          3.92           4.41         4     5
  2         7   4.57       .535            .202          4.08           5.07         4     5
  3        12   3.83       .389            .112          3.59           4.08         3     4
  Total    31   4.13       .499            .090          3.95           4.31         3     5
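As a quick check on how the Std. Error and confidence-interval columns relate to the mean and standard deviation, the sketch below recomputes them for cluster 1 of "Collecting information of current vendor" (N = 12, mean 3.00, Std. Deviation .603). The function name is illustrative, and the critical value 2.201 is the two-sided 95% Student's t value for 11 degrees of freedom read from a standard t table, not something taken from the SPSS output.

```python
import math

# Two-sided 95% critical value of Student's t for df = 11 (from a t table).
T_CRIT_DF11 = 2.201


def mean_ci(n, mean, sd, t_crit):
    """Return (standard error, lower bound, upper bound) for a group mean."""
    se = sd / math.sqrt(n)      # standard error of the mean
    margin = t_crit * se        # half-width of the confidence interval
    return se, mean - margin, mean + margin


se, lower, upper = mean_ci(n=12, mean=3.00, sd=0.603, t_crit=T_CRIT_DF11)
print(f"SE = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The result agrees with the reported .174 and (2.62, 3.38) up to rounding.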
ANOVA

Variable / Source                     Sum of Squares   df   Mean Square        F     Sig.

Collecting Product/service information
  Between Groups                            .111        2       .055         .542    .588
  Within Groups                            2.857       28       .102
  Total                                    2.968       30
Collecting information of current vendor
  Between Groups                          14.458        2      7.229       16.348    .000
  Within Groups                           12.381       28       .442
  Total                                   26.839       30
Searching and collecting information of new vendor
  Between Groups                            .380        2       .190         .729    .491
  Within Groups                            7.298       28       .261
  Total                                    7.677       30
Collecting competitive and other information for purchase
  Between Groups                            .364        2       .182         .611    .550
  Within Groups                            8.345       28       .298
  Total                                    8.710       30
Cost/price comparison
  Between Groups                           2.679        2      1.339        4.633    .018
  Within Groups                            8.095       28       .289
  Total                                   10.774       30
Email
  Between Groups                           1.718        2       .859        3.993    .030
  Within Groups                            6.024       28       .215
  Total                                    7.742       30
Web conferencing
  Between Groups                           1.577        2       .788        2.383    .111
  Within Groups                            9.262       28       .331
  Total                                   10.839       30
Electronic Data Interchange
  Between Groups                           8.954        2      4.477        9.705    .001
  Within Groups                           12.917       28       .461
  Total                                   21.871       30
Discussion Groups
  Between Groups                          20.019        2     10.009       47.368    .000
  Within Groups                            5.917       28       .211
  Total                                   25.935       30
Just-in-time Inventory planning
  Between Groups                           5.912        2      2.956        5.902    .007
  Within Groups                           14.024       28       .501
  Total                                   19.935       30
Online negotiation
  Between Groups                          11.160        2      5.580       20.348    .000
  Within Groups                            7.679       28       .274
  Total                                   18.839       30
Online bidding
  Between Groups                           8.347        2      4.174        7.528    .002
  Within Groups                           15.524       28       .554
  Total                                   23.871       30
Online payment
  Between Groups                           5.036        2      2.518        7.865    .002
  Within Groups                            8.964       28       .320
  Total                                   14.000       30
Online ordering
  Between Groups                            .277        2       .139        1.443    .253
  Within Groups                            2.690       28       .096
  Total                                    2.968       30
Online status checking
  Between Groups                           1.226        2       .613        1.957    .160
  Within Groups                            8.774       28       .313
  Total                                   10.000       30
Online product/service support
  Between Groups                           2.436        2      1.218        6.757    .004
  Within Groups                            5.048       28       .180
  Total                                    7.484       30

The above ANOVA table tests the difference between the cluster means for each variable. The null
hypothesis states that there is no difference between the clusters for the given variable. Variables
whose significance level is greater than 5% do not vary significantly across clusters. These are
Collecting Product/service information (.588), Searching and collecting information of new vendor
(.491), Collecting competitive and other information for purchase (.550), Web conferencing (.111),
Online ordering (.253) and Online status checking (.160).
We therefore repeat the hierarchical cluster analysis excluding these six variables.
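To make the link between the Descriptives and ANOVA outputs concrete, the following sketch recomputes the one-way ANOVA F statistic for "Collecting information of current vendor" from the per-cluster N, mean and standard deviation alone. The function name is illustrative. Because the inputs are the rounded values printed by SPSS, the result only approximates the reported sums of squares (14.458 and 12.381) and F (16.348).

```python
def one_way_anova_from_summary(groups):
    """groups: list of (n, mean, sd) per cluster.

    Returns (ss_between, ss_within, f_statistic).
    """
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    # Between-groups sum of squares: spread of cluster means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-groups sum of squares: pooled spread inside each cluster.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_statistic


# (N, mean, std. deviation) for clusters 1, 2 and 3.
current_vendor = [(12, 3.00, 0.603), (7, 4.43, 0.535), (12, 2.67, 0.778)]
ss_b, ss_w, f_stat = one_way_anova_from_summary(current_vendor)
print(f"SS between = {ss_b:.3f}, SS within = {ss_w:.3f}, F = {f_stat:.3f}")
```

The degrees of freedom come out as 2 and 28, matching the ANOVA output for three clusters and 31 cases.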
[Dendrogram after excluding the six variables, rescaled distance axis from 0 to 25. Cases in order
from top to bottom: 24, 26, 23, 27, 9, 15, 22, 25, 31, 29, 30, 28, 11, 8, 20, 19, 14, 16, 10, 18,
17, 12, 1, 7, 13, 6, 4, 21, 3, 5, 2]
ANOVA

Variable / Source                     Sum of Squares   df   Mean Square        F     Sig.

Collecting information of current vendor
  Between Groups                          15.294        2      7.647       18.548    .000
  Within Groups                           11.544       28       .412
  Total                                   26.839       30
Cost/price comparison
  Between Groups                           2.197        2      1.099        3.587    .041
  Within Groups                            8.577       28       .306
  Total                                   10.774       30
Online negotiation
  Between Groups                          13.916        2      6.958       39.573    .000
  Within Groups                            4.923       28       .176
  Total                                   18.839       30
Online bidding
  Between Groups                           5.865        2      2.933        4.560    .019
  Within Groups                           18.006       28       .643
  Total                                   23.871       30
Online payment
  Between Groups                           6.494        2      3.247       12.113    .000
  Within Groups                            7.506       28       .268
  Total                                   14.000       30
Online product/service support
  Between Groups                           1.661        2       .830        3.993    .030
  Within Groups                            5.823       28       .208
  Total                                    7.484       30
Email
  Between Groups                           1.073        2       .536        2.252    .124
  Within Groups                            6.669       28       .238
  Total                                    7.742       30
Electronic Data Interchange
  Between Groups                           9.894        2      4.947       11.565    .000
  Within Groups                           11.977       28       .428
  Total                                   21.871       30
Discussion Groups
  Between Groups                          20.291        2     10.146       50.331    .000
  Within Groups                            5.644       28       .202
  Total                                   25.935       30
Just-in-time Inventory planning
  Between Groups                           5.628        2      2.814        5.507    .010
  Within Groups                           14.308       28       .511
  Total                                   19.935       30
Now all the variables are significant (the cluster means differ) except Email (Sig. = .124).
Hence, the 3-cluster solution is a good solution.
Descriptives

Cluster     N   Mean   Std. Deviation   Std. Error   95% CI Lower   95% CI Upper

Collecting information of current vendor
  1        10   2.90       .568            .180          2.49           3.31
  2         8   4.38       .518            .183          3.94           4.81
  3        13   2.69       .751            .208          2.24           3.15
  Total    31   3.19       .946            .170          2.85           3.54
Cost/price comparison
  1        10   4.00       .000            .000          4.00           4.00
  2         8   3.75       .707            .250          3.16           4.34
  3        13   3.38       .650            .180          2.99           3.78
  Total    31   3.68       .599            .108          3.46           3.90
Online negotiation
  1        10   4.50       .527            .167          4.12           4.88
  2         8   2.75       .463            .164          2.36           3.14
  3        13   3.92       .277            .077          3.76           4.09
  Total    31   3.81       .792            .142          3.52           4.10
Online bidding
  1        10   1.90       .876            .277          1.27           2.53
  2         8   2.63      1.061            .375          1.74           3.51
  3        13   1.54       .519            .144          1.22           1.85
  Total    31   1.94       .892            .160          1.61           2.26
Online payment
  1        10   4.40       .516            .163          4.03           4.77
  2         8   4.38       .518            .183          3.94           4.81
  3        13   3.46       .519            .144          3.15           3.78
  Total    31   4.00       .683            .123          3.75           4.25
Online product/service support
  1        10   4.10       .316            .100          3.87           4.33
  2         8   4.50       .535            .189          4.05           4.95
  3        13   3.92       .494            .137          3.62           4.22
  Total    31   4.13       .499            .090          3.95           4.31
Email
  1        10   4.60       .516            .163          4.23           4.97
  2         8   4.75       .463            .164          4.36           5.14
  3        13   4.31       .480            .133          4.02           4.60
  Total    31   4.52       .508            .091          4.33           4.70
Electronic Data Interchange
  1        10   2.90       .738            .233          2.37           3.43
  2         8   4.00       .000            .000          4.00           4.00
  3        13   2.62       .768            .213          2.15           3.08
  Total    31   3.06       .854            .153          2.75           3.38
Discussion Groups
  1        10   3.00       .471            .149          2.66           3.34
  2         8   2.88       .354            .125          2.58           3.17
  3        13   1.31       .480            .133          1.02           1.60
  Total    31   2.26       .930            .167          1.92           2.60
Just-in-time Inventory planning
  1        10   3.50      1.080            .342          2.73           4.27
  2         8   3.75       .463            .164          3.36           4.14
  3        13   2.77       .439            .122          2.50           3.03
  Total    31   3.26       .815            .146          2.96           3.56
Step 2:
K-Means Approach: To find the cluster membership of each case
K-means clustering was then run with the number of clusters fixed at 3, the value found in Step 1,
to obtain stable clusters.
Iteration History
              Change in Cluster Centers
Iteration       1        2        3
    1         2.733    2.421    1.974
    2          .271     .000     .194
    3          .000     .000     .000
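The Iteration History reports how far each cluster center moved in every iteration until the change reaches zero. A minimal sketch of that loop is given here, using plain Lloyd's algorithm, which may differ in detail from the SPSS implementation. The six two-variable respondents and the starting centers are hypothetical and are not taken from the survey data.

```python
import math


def kmeans(points, centers, max_iter=10):
    """Plain Lloyd's k-means.

    Returns (labels, centers, history), where history[i] lists how far
    each center moved during iteration i + 1.
    """
    history = []
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        shifts = [math.dist(a, b) for a, b in zip(centers, new_centers)]
        history.append(shifts)
        centers = new_centers
        if max(shifts) == 0.0:  # converged: no center moved
            break
    return labels, centers, history


# Hypothetical ratings of six respondents on two activities.
points = [(1, 1), (2, 1), (4, 4), (5, 4), (1, 5), (2, 5)]
initial_centers = [(1, 1), (5, 4), (1, 5)]

labels, centers, history = kmeans(points, initial_centers)
for i, shifts in enumerate(history, start=1):
    print(i, [round(s, 3) for s in shifts])
print(labels)
```

On this toy input the centers each move by 0.5 in the first iteration and stop moving in the second, which is the same convergence pattern that the Iteration History shows at iteration 3.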
ANOVA
                                                      Cluster             Error
Variable                                          Mean Square   df   Mean Square   df        F     Sig.
Collecting Product/service information                .046       2       .103      28      .452    .641
Collecting information of current vendor             7.609       2       .415      28    18.333    .000
Searching and collecting information of new vendor    .120       2       .266      28      .454    .640
Collecting competitive and other information
  for purchase                                        .141       2       .301      28      .467    .632
Cost/price comparison                                 .443       2       .353      28     1.253    .301
Email                                                 .403       2       .248      28     1.626    .215
Web conferencing                                     1.812       2       .258      28     7.034    .003
Electronic Data Interchange                          6.443       2       .321      28    20.082    .000
Discussion Groups                                    7.336       2       .402      28    18.235    .000
Just-in-time Inventory planning                      4.833       2       .367      28    13.176    .000
Online negotiation                                   6.205       2       .230      28    27.027    .000
Online bidding                                       3.839       2       .578      28     6.639    .004
Online payment                                       2.348       2       .332      28     7.067    .003
Online ordering                                       .138       2       .096      28     1.431    .256
Online status checking                                .606       2       .314      28     1.931    .164
Online product/service support                        .742       2       .214      28     3.462    .045
The F tests should be used only for descriptive purposes because the clusters have been chosen to maximize the
differences among cases in different clusters. The observed significance levels are not corrected for this and thus
cannot be interpreted as tests of the hypothesis that the cluster means are equal.
The ANOVA indicates that the clusters differ only for the following nine activities, as the
significance is less than 0.05 only for these variables:

Variable                                      Sig.
Collecting information of current vendor     0.000
Web conferencing                             0.003
Electronic Data Interchange                  0.000
Discussion Groups                            0.000
Just-in-time Inventory planning              0.000
Online negotiation                           0.000
Online bidding                               0.004
Online payment                               0.003
Online product/service support               0.045

Number of Cases in each Cluster
Cluster 1     9.000
Cluster 2     8.000
Cluster 3    14.000
Valid        31.000
Missing        .000
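The selection rule used here (keep a variable only if its significance is below 0.05) can be written as a small filter. The sketch below applies it to the Sig. column of the K-means ANOVA. The dictionary and variable names are illustrative, and the label of the .164 row is inferred from the order of the variables in the questionnaire.

```python
ALPHA = 0.05  # significance threshold used in the report

# Sig. values from the K-means ANOVA output, in table order.
kmeans_sig = {
    "Collecting Product/service information": 0.641,
    "Collecting information of current vendor": 0.000,
    "Searching and collecting information of new vendor": 0.640,
    "Collecting competitive and other information for purchase": 0.632,
    "Cost/price comparison": 0.301,
    "Email": 0.215,
    "Web conferencing": 0.003,
    "Electronic Data Interchange": 0.000,
    "Discussion Groups": 0.000,
    "Just-in-time Inventory planning": 0.000,
    "Online negotiation": 0.000,
    "Online bidding": 0.004,
    "Online payment": 0.003,
    "Online ordering": 0.256,
    "Online status checking": 0.164,  # label inferred from variable order
    "Online product/service support": 0.045,
}

# Variables on which the three clusters differ at the 5% level.
significant = [name for name, sig in kmeans_sig.items() if sig < ALPHA]

for name in significant:
    print(f"{name}: {kmeans_sig[name]:.3f}")
print(len(significant), "of", len(kmeans_sig), "variables separate the clusters")
```

As the SPSS note on these F tests states, the values are descriptive only, so the filter is a screening aid and not a formal hypothesis test.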