Introduction
Explanation of Data Mining Techniques
Advantages
Applications
Privacy
Data Mining
What is Data Mining?
The process of semi-automatically analyzing large
Classification
Classification: Given a set of items that have several classes,
Technique for Classification
Decision-Tree Classifiers
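[Figure: decision tree. The root splits on Job (Engineer, Carpenter, Doctor); each branch then splits on Income and ends in a leaf labeled Bad or Good.]
Engineer: Income <30K is Bad, Income >50K is Good
Carpenter: Income <40K is Bad, Income >90K is Good
Doctor: Income <50K is Bad, Income >100K is Good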
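The decision tree in the figure can be read as a set of nested rules. The sketch below is one minimal way to write it down, not the presenter's implementation. The thresholds come from the figure; the function name `classify`, the `RULES` table, and the pairing of the Doctor thresholds (<50K with Bad, >100K with Good) are assumptions. The figure shows no leaf for incomes that fall between the two thresholds, so the sketch returns None there.

```python
# Thresholds in thousands, read off the slide's decision-tree figure.
# Assumption: for Doctor, <50K pairs with Bad and >100K with Good.
RULES = {
    "Engineer": (30, 50),    # (Bad below, Good above)
    "Carpenter": (40, 90),
    "Doctor": (50, 100),
}


def classify(job, income_k):
    """Return 'Bad', 'Good', or None when the figure gives no leaf."""
    if job not in RULES:
        return None  # the tree has no branch for this job
    bad_below, good_above = RULES[job]
    # First test: the Job branch was chosen above; now test Income.
    if income_k < bad_below:
        return "Bad"
    if income_k > good_above:
        return "Good"
    return None  # income lies between the two thresholds shown


if __name__ == "__main__":
    print(classify("Engineer", 25))
    print(classify("Carpenter", 95))
```

Each call walks one path from the root (Job) through an Income test to a leaf, which is how a decision-tree classifier assigns a class to a new item.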
Clustering
Group Data into Clusters
Similar data is grouped in the same cluster
Dissimilar data is grouped in different clusters
Hierarchical
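As an illustration of hierarchical clustering, the sketch below uses a bottom-up (agglomerative) approach: every point starts in its own cluster, and the two closest clusters are merged repeatedly until the requested number remains. The slides do not say which variant is meant, so the single-linkage distance, the one-dimensional sample values, and the names `single_linkage` and `agglomerate` are all assumptions made for this example.

```python
def single_linkage(a, b):
    """Distance between two clusters: the closest pair of members."""
    return min(abs(x - y) for x in a for y in b)


def agglomerate(points, k):
    """Merge the two nearest clusters until only k clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > max(k, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical sample values, chosen only to show the grouping.
groups = agglomerate([1, 2, 3, 10, 11, 25], 3)
print(groups)
```

Similar values (1, 2, 3) end up in the same cluster, while dissimilar values such as 25 stay in a cluster of their own.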
Advantages of Data Mining
Provides new knowledge from existing data
Public databases
Government sources
Company Databases
Uses of Data Mining
Sales/Marketing
Diversify target market
Identify clients' needs to increase response rates
Risk Assessment
Identify customers that pose a high credit risk
Fraud Detection
Identify people misusing the system, e.g. people who have two Social Security numbers
Customer Care
Identify customers likely to change providers
Identify customer needs
Privacy Concerns
Effective Data Mining requires large sources of data
To achieve a wide spectrum of data, link multiple data
sources
Linking sources can be problematic for privacy, for example when the following histories of a customer are linked:
Shopping History
Credit History
Bank History
Employment History
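To make the concern concrete, the sketch below shows how little is needed to link separate sources once they share a customer identifier. Everything in it is hypothetical: the customer ID `c1`, the record contents, and the helper name `link` are invented for illustration and do not come from the slides.

```python
# Hypothetical records from four separate sources, keyed by customer ID.
shopping = {"c1": ["garden tools", "paint"]}
credit = {"c1": "two open credit cards"}
bank = {"c1": "monthly salary deposit"}
employment = {"c1": "carpenter since 2015"}


def link(customer_id, **sources):
    """Collect one customer's records from every source that has them."""
    return {
        name: records[customer_id]
        for name, records in sources.items()
        if customer_id in records
    }


profile = link(
    "c1",
    shopping=shopping,
    credit=credit,
    bank=bank,
    employment=employment,
)
print(profile)
```

Each source on its own reveals little, but the combined profile describes the customer in far more detail than any single holder of the data could.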
Thank you