Why Data Mining?
The explosive growth of data: from terabytes to petabytes
Data collection and data availability
Automated data collection tools, database systems
Major sources of abundant data
Business: Web, e-commerce, transactions
Science: Remote sensing, bioinformatics
Society and everyone: news, digital cameras
We are drowning in data, but starving for knowledge!
Data mining—Automated analysis of massive data sets
What Is Data Mining?
Alternative name
Knowledge discovery in databases (KDD)
Data Mining—Potential Applications
Other Applications
Text mining (newsgroups, email, documents) and Web mining
Stream data mining
Bioinformatics and bio-data analysis
Market Analysis and Management
Data for data mining
Credit card transactions, discount coupons, customer complaint calls
Target marketing
Find customers who share the same characteristics: interest, income level, spending habits, etc.
Determine customer purchasing patterns over time
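Finding customers who share the same characteristics can be sketched as a simple grouping step. The example below is a minimal illustration, not a method prescribed by these slides: the customer records, the field names (interest, income_level) and the helper segment_customers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical customer records; field names and values are illustrative only.
customers = [
    {"id": 1, "interest": "sports", "income_level": "high"},
    {"id": 2, "interest": "sports", "income_level": "high"},
    {"id": 3, "interest": "books", "income_level": "medium"},
    {"id": 4, "interest": "books", "income_level": "medium"},
    {"id": 5, "interest": "sports", "income_level": "low"},
]


def segment_customers(records, keys):
    """Group customer ids by the values they share on the given keys."""
    segments = defaultdict(list)
    for record in records:
        # The tuple of shared characteristics identifies the segment.
        profile = tuple(record[key] for key in keys)
        segments[profile].append(record["id"])
    return dict(segments)


segments = segment_customers(customers, ["interest", "income_level"])
for profile, ids in segments.items():
    print(profile, ids)
```

Each resulting segment is a candidate audience for a targeted campaign. Real data rarely matches exactly on every attribute, so a clustering algorithm would usually replace the exact-match grouping used here.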
Market Analysis and Management
Cross-market analysis
Associations and correlations between product sales, and prediction based on such associations
Customer profiling
What types of customers buy what products
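One common way to quantify associations between product sales is with support and confidence over pairs of items. The sketch below assumes a handful of made-up transactions and a hypothetical helper pair_rules; the slides do not specify an algorithm, so treat this as one possible illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets; the items are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of baskets containing both items
        if support < min_support:
            continue
        # Confidence of a -> b: how often b appears when a is present.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules


rules = pair_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high confidence, such as butter implying bread in this toy data, is the kind of association that could support a prediction about related product sales.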
Fraud Detection & Mining Unusual Patterns
Data Mining Process
[Figure: data mining process. Surviving labels: Databases, Data Integration, Data Cleaning, Task-relevant Data]
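The stages named on this slide (databases, data integration, data cleaning, task-relevant data) can be sketched as a small pipeline. This is a minimal sketch under stated assumptions: the two source tables, the field names, the order of the steps and the helper functions integrate, clean and select_task_relevant are all hypothetical.

```python
# Two hypothetical source databases describing the same kind of transaction.
store_db = [
    {"id": 1, "amount": 20.0, "region": "north"},
    {"id": 2, "amount": None, "region": "south"},  # missing value
]
web_db = [
    {"id": 1, "amount": 20.0, "region": "north"},  # duplicate of a store record
    {"id": 3, "amount": 35.5, "region": "east"},
]


def integrate(*sources):
    """Data integration: combine records from several sources into one list."""
    return [dict(record) for source in sources for record in source]


def clean(records, required, key="id"):
    """Data cleaning: drop records with missing required fields or a repeated key."""
    seen = set()
    cleaned = []
    for record in records:
        if any(record.get(field) is None for field in required):
            continue
        if record[key] in seen:
            continue
        seen.add(record[key])
        cleaned.append(record)
    return cleaned


def select_task_relevant(records, fields):
    """Selection: keep only the fields needed for the mining task."""
    return [{field: record[field] for field in fields} for record in records]


integrated = integrate(store_db, web_db)
cleaned = clean(integrated, required=["id", "amount"])
task_data = select_task_relevant(cleaned, ["id", "amount"])
print(task_data)
```

The output of the last step is what a mining algorithm would consume. In practice cleaning and integration are often interleaved rather than run strictly in this order.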
Steps of a KDD Process
Architecture: Typical Data Mining System
[Figure: architecture of a typical data mining system. Surviving labels: Pattern evaluation, Databases, Data Warehouse]
Data Mining: On What Kinds of Data?
Relational database
Data warehouse
Transactional database
Advanced database and information repository
Spatial and temporal data
Time-series data
Stream data
Multimedia database
Data Mining Functionalities
Cluster analysis
Class label is unknown: group data to form new classes
Outlier analysis
Outlier: a data object that does not comply with the general
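Outlier analysis can be illustrated with a simple robust test. The sketch below assumes an outlier is a value far from the bulk of the data and flags it with a modified z-score based on the median and the median absolute deviation (MAD). The transaction amounts, the 3.5 threshold and the helper find_outliers are assumptions made for illustration, not something the slides define.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical transaction amounts with one unusual value.
amounts = [42, 38, 45, 40, 41, 39, 950]
outliers = find_outliers(amounts)
print(outliers)
```

The median-based score is used here instead of the mean and standard deviation because a single extreme value inflates the standard deviation and can hide itself in small samples.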
Data Mining: Combination of Multiple Disciplines
[Figure: data mining at the center of several disciplines. Surviving labels: Database Systems, Statistics, Machine Learning, Visualization, Algorithm, Other Disciplines]