Data Mining
Data Mining
.
.
"
:
.
:
.1
.2
.3
.4
.5
40
/13 / 48 2007
"
.1
.2
.3
.4
DM
"
1996 .
"
" .
.
.
"
.1
.2
.3
.4
.
.
.
.
"
"
Decision Tree Clustering
"
" -
" -
:
.1 "
.
.2
.
" -
277" 1530" %18
.
" -
19
. 2003 - 1985
2001 - 1997 .
" -
"
.
" -
:
: Microsoft SQL Server 2000
Decision Tree Clustering
.
:
.
-
.1
.2
.3
.4
.5
.6
:
.
.
.
.
.
.
.7
19
3
.
/ Data Mining
Data
Information Knowledge
(Wu, 2000: 1).
(Seiner, 2002: 2)
(Daft, 2001: 258).
" "
DM Gold Mining
.
(Noonan, 2000: 6)
" "
(Soni & Tang & Yang, 2002: 1).
Data Mining
Knowledge Discovery in Databases (KDD)
Fayyad
.
(Zaiane, 1999: 3) (Houston & Others, 1999: 438).
DM :
(Information Discovery, Inc., 2000: 4) (Ramachandran, 2001: 2)
-1 Executives
" .
-2 End Users
.
-3 Analysts
.
(Saarenvirta, 2001: 1)
Decision Support System (DSS)
(Reactive)
(Proactive)
.
(Rob & Coronel, 2000: 609).
DM
Data Warehouse (DW)
(Romney & Steinbart, 2000: 599).
Artificial Intelligence (AI)
"
(Avison & Shah, 1997: 327).
.
:
(Information Discovery, Inc., 2000: 2-3) (Ramachandran, 2001: 1)
-1 Discovery
-2 Predictive Modeling
-3 Forensic Analysis
.
:
(Ahola & Runsala, 2001: 3)
-1 Exploratory Analysis
-2 Predictive Analysis
.
(Lehman, 2001: 7).
:
(Information Discovery, Inc., 2000: 5)
(Ramachandran, 2001: 2)
-1 Episodic Mining
-2 Strategic Mining
-3 Continuous Mining
:
(Brand & Gerritsen, 1998: 1-3)
(Edelstein, 1997: 3)
(Ramachandran, 2001: 3-5) (Atre, 2001: 2) (Two Crows, 1999: 6-15)
-1 Classification
.
Decision Tree
Nearest Neighbor . Regression
-2 Association
. .
. Market Basket Analysis
-3 Sequential Analysis
Link Analysis
.
-4 Clustering
K-Means.
. Neural Networks
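The clustering task named above can be illustrated with a small K-Means sketch. This is only a minimal illustration and not the procedure used in the study: the function name `kmeans`, the seed, and the toy points are all assumptions made for the example.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain K-Means: returns (labels, centroids) for a list of numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points

    def nearest(p):
        # Index of the centroid with the smallest squared Euclidean distance.
        return min(range(k),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))

    for _ in range(iterations):
        labels = [nearest(p) for p in points]
        updated = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                # New centroid is the coordinate-wise mean of the cluster members.
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep the old centroid if a cluster ends up empty.
                updated.append(centroids[i])
        if updated == centroids:
            break  # assignments can no longer change
        centroids = updated
    labels = [nearest(p) for p in points]
    return labels, centroids

# Toy data (invented for illustration): two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

On this toy input the first three points share one label and the last three share the other; which group receives label 0 depends on the random starting centroids.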
(Two Crows, 1999: 10-15):
- Decision Trees
.
- Neural Networks
" "
.
Input Layer
Hidden Layer
Output Layer
.
- Regression
.
- Time Series
".
- Rule Induction
. .
- K-Nearest Neighbor (KNN)
.
- Discriminant Analysis
.
- Boosting
" .
- Genetic Algorithms
) ( .
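To make the decision tree technique in the list above more concrete, the following sketch chooses a split attribute by information gain, which is one common splitting criterion. The source does not say which criterion the tool in the study applies, so this is an assumption, as are the function names, the field names (`branch`, `level`, `risk`), and the toy rows.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def best_split(rows, attributes, target):
    """Attribute with the highest information gain (the root of a simple tree)."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Toy records (invented for illustration, not data from the study).
rows = [
    {"branch": "A", "level": "low", "risk": "good"},
    {"branch": "B", "level": "low", "risk": "good"},
    {"branch": "A", "level": "high", "risk": "bad"},
    {"branch": "B", "level": "high", "risk": "bad"},
]
print(best_split(rows, ["branch", "level"], "risk"))
```

In these toy rows `level` separates the two risk classes completely while `branch` carries no information, so the sketch selects `level` as the first split.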
" -
D M" "
" :
(Brand & Gerritsen, 1998: 3)
(Two Crows, 1999: 22)
(Saarenvirta, 2001: 6)
(Skalak, 2001: 1)
-1
" .
.
DW
Data Mart .
" "
. 90 % - 50 %
-3 Explore data
-5 Build model D M
) (
.
.
.
-6 Evaluate model
.
.
-7
:
.
) (1
.
.
DW
(1) Data Mining
Source: Rob, Peter & Coronel, Carlos, Data Base Systems Design,
" -
:
(Avison & Shah, 1997: 328) (Ramachandran, 2001: 3)
(Wu, 2002: 2)
(Two Crows, 2002: 1)
-1 Banking
-2 Financial
-3 Telecommunications
-4 Marketing
-5 Insurance and Health Care
-6 Medicine
-7 Transportation
-8 Retailing
-9 Customer Relationship Management
-10 Quality Control or Error Analysis
-11 Hiring
-12 Electronic Commerce
-13 Food Service Menu Analysis
-14 Warranty Analysis
-15 Student Recruiting and Retention
:
(Noonan, 2000: 4)
-1 (Skalak, 2001: 2)
-2 (Hermiz, 1999: 3). (Small, 1997: 6).
-3 (Small, 1997: 7).
-4 (Skalak, 2001: 2) (Noonan, 2000: 3).
-5 (White Cross, 2001: 5)
-6 (Hermiz, 1999: 4). (Skalak, 2001: 2).
-7 (Smyth, 2001: 5).
"
/ Performance
) . (330 : 2000
:
-1
) . ( 211 : 1999
-2 productivity
) ..( 42 : 2001
-3 Effectiveness
.
.
"
:
) (211-208 :1999 ) (50-48:2001
-1
.
-2
" .
-3
.
-4
.
"
:
) ) ( Rambaldi & Bautista,2000:14) (77 :1986 (52-50 :2001
-1
.
-2 .
-3 .
-4 .
-5
".
"
) .
(76:2001
(Kunstelj & Others, 2001: 10).
) ( 1 .
) : (:1995 77
) (79-54 :2000
-1 Historical Standard
.
.
.
-2 Industry Standard
.
.
.
(Saunders, 1997: 264)
.
" -
" .
.
SQL Server 2000 (Structured Query Language Server 2000)
Database Management System
(Gunderloy & Jorden, 2000: 263).
:
(Tiedrich, 2000: 14) (Soni & Others, 2001: 2) (Bloor Research, 2001: 102)
-1 Microsoft Decision Tree (MDT)
-2 Microsoft Clustering
Data Mining.
.
"
"
(Seidman, 2001: 114)
.
Risk ) (2
Level
Low
Branch
Babil
.
Content Navigator
.
) (2 Risk
" Attributes
) (2
97 bad
% 35 180 % 64.64
.% 0.36
Node Path
. .
.
" -
"
(Seidman, 2001: 146) .
.
Node Attribute Set .
) Cluster 2 Cluster 1 (
Node path .
Risk Cluster 1
bad ) (3
39 12
bad % 31.18 27 good
% 68.82 0 . % 0
Node path
"
.
.
) (3 Risk
" -
" " :
-1 :
-
-
-2 :
-
- /
- /
-3 :
- /
- / ) (
- /
" (Rose, 1999:159) .
1997 . 2001
"
%50.7 2001
%18.8 . 2000
" "
"
" .
.
.
.
.
.
.
" -
.1
.2
.3
.4
.5
.6
.7
.
.
.
.
.
.
.
" -
.1
.2
.3
.4
.5
.6
.
SQL Server 2000
.
.
.
.
.
.
Level
"
. Branch
.
.
.7
28 2
.
.
.8
/ 2001 %699
.% 50.7
" -
.1
.2
.3
.4
.5
.6
.
.
)(DW
.
.
" "
.
.
.
" -
.1
.
.2
.
.3 " "
). Insightful Miner (I Miner
References
" -
.1
.2
.3
.4
.5
.6
" "
.2000
" "
.2001
"
" 1983-1979 .1986
" "
.1995
"
" . 2000
"
"
.1999
"
Journals
Internet
Books
) (1
) (
-1
-2
- -1
-2
-1
-2
-3
-4
-1
-2
-3
-4
-5
-6
:
(Revsine & Others, 1999: 160-174)
) ( 78:1995
) (Hempel & Simonson ,1999: 67) (79-54 :2000