
Overview

Introduction
Explanation of Data Mining Techniques
Advantages
Applications
Privacy

Data Mining
What is Data Mining?
The process of semi-automatically analyzing large databases to find useful patterns
Attempts to discover rules and patterns from data
Areas of Use
Internet: Discover needs of customers
Economics: Predict stock prices
Science: Predict environmental change
Medicine: Match patients with similar problems to find a cure

Example of Data Mining


A credit card company wants to discover information about its clients from its databases. It wants to find:
Clients who respond to promotions in junk mail
Clients who are likely to switch to a competitor
Clients who are likely not to pay
Services that clients use, in order to promote services affiliated with the credit card company
Anything else that may help the company provide or promote services that help its clients and ultimately make more money

Data Mining & Data Warehousing


Data Warehouse: a repository (or archive) of information gathered from multiple sources, stored under a unified schema, at a single site.
Collect data, then store it in a single repository
Allows for easier query development, as a single repository can be queried.
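
As a rough illustration of the unified-schema idea, the sketch below loads records from two source systems into one in-memory SQLite table and then answers a question with a single query. The source systems, field names, and the activity table are all hypothetical and chosen only for this example.

```python
import sqlite3

# Two hypothetical source systems that name their fields differently
# (assumed for illustration; not taken from the slides).
card_system = [{"cust": 1, "spend": 120.0}, {"cust": 2, "spend": 45.5}]
loan_system = [{"customer_id": 1, "amount": 300.0}]


def build_warehouse(card_rows, loan_rows):
    """Load both sources into one in-memory table with a single schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE activity (customer_id INTEGER, source TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO activity VALUES (?, 'card', ?)",
        [(r["cust"], r["spend"]) for r in card_rows],
    )
    conn.executemany(
        "INSERT INTO activity VALUES (?, 'loan', ?)",
        [(r["customer_id"], r["amount"]) for r in loan_rows],
    )
    return conn


def total_per_customer(conn):
    """One query over the single repository covers every source."""
    rows = conn.execute(
        "SELECT customer_id, SUM(amount) FROM activity "
        "GROUP BY customer_id ORDER BY customer_id"
    ).fetchall()
    return dict(rows)


warehouse = build_warehouse(card_system, loan_system)
totals = total_per_customer(warehouse)
```

Without the shared table, the same question would need one query per source system plus code to reconcile the differing field names.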

Data Mining Techniques


Classification
Clustering
Regression
Association Rules

Classification
Classification: Given a set of items that each belong to one of several classes, and given past instances (training instances) with their associated classes, classification is the process of predicting the class of a new item.
The goal is therefore to classify the new item, that is, to identify the class to which it belongs.
Example: A bank wants to classify its home loan customers into groups according to their response to bank advertisements. The bank might use the classes Responds Rarely, Responds Sometimes, and Responds Frequently.
The bank will then attempt to find rules about the customers who respond Frequently and Sometimes.
The rules could be used to predict the needs of potential customers.

Technique for Classification
Decision-Tree Classifiers
Example: predicting the credit risk of a person from job and income

Job
    Engineer -> Income
        <30K: Bad
        >50K: Good
    Carpenter -> Income
        <40K: Bad
        >90K: Good
    Doctor -> Income
        <50K: Bad
        >100K: Good
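
A decision tree like this one can be read as nested rules: first branch on job, then compare income against that job's thresholds. The minimal sketch below assumes the income nodes pair with the jobs in the order they appear in the figure (Engineer <30K / >50K, Carpenter <40K / >90K, Doctor <50K / >100K). The figure does not say what happens between the two thresholds or for other jobs, so the sketch returns "Unknown" there. The names CREDIT_TREE and credit_risk are hypothetical.

```python
# Thresholds in thousands, as read from the figure. The job-to-threshold
# pairing and the "Unknown" fallback are assumptions of this sketch.
CREDIT_TREE = {
    "Engineer": (30, 50),
    "Carpenter": (40, 90),
    "Doctor": (50, 100),
}


def credit_risk(job, income_k):
    """Walk the tree: branch on job first, then on income."""
    if job not in CREDIT_TREE:
        return "Unknown"  # the figure has no branch for this job
    bad_below, good_above = CREDIT_TREE[job]
    if income_k < bad_below:
        return "Bad"
    if income_k > good_above:
        return "Good"
    return "Unknown"  # the figure leaves this income range unspecified
```

For example, credit_risk("Engineer", 25) follows the Engineer branch and then the <30K branch, giving "Bad".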

Clustering

Clustering algorithms find groups of items that are similar. They divide a data set so that records with similar content are in the same group, and groups are as different as possible from each other. (2)

Example: An insurance company could use clustering to group clients by their age, location and types of insurance purchased.

The categories are unspecified, and this is referred to as unsupervised learning.

Clustering
Group Data into Clusters
Similar data is grouped in the same cluster
Dissimilar data is grouped in different clusters

How is this achieved?

K-Nearest Neighbor

A classification method that classifies a point by calculating the distances between the point and the points in the training data set. It then assigns the point to the class that is most common among its k nearest neighbors (where k is an integer). (2)
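
The description above translates almost directly into code. The following is a minimal sketch, assuming Euclidean distance and numeric feature tuples; the slide names neither, and the function name knn_classify and the sample data are hypothetical. When two classes tie, this version returns the one whose vote was seen first, which here means the class of the nearer neighbor.

```python
import math
from collections import Counter


def knn_classify(point, training, k=3):
    """Classify point by majority vote among its k nearest training items.

    training is a list of (features, label) pairs, where features is a
    tuple of numbers with the same length as point.
    """
    # Order the training instances by Euclidean distance to the query point.
    by_distance = sorted(training, key=lambda item: math.dist(point, item[0]))
    # Count the labels of the k closest instances and return the most common.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up training data: two well-separated groups labelled "A" and "B".
training = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((8.0, 8.0), "B"), ((9.0, 8.5), "B"), ((8.5, 9.0), "B"),
]
label = knn_classify((1.2, 1.4), training, k=3)
```

Note that the method needs labelled training data, so it is a supervised technique, unlike the unsupervised grouping described for clustering.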

Hierarchical

Group data into a tree of nested clusters
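
The slide gives no algorithm for the hierarchical approach. One common bottom-up variant starts with every point as its own cluster and repeatedly merges the two closest clusters until a single tree remains. The sketch below assumes that variant with single-link distance (the distance between the closest pair of points); both choices, and the names single_link and build_hierarchy, are assumptions made for illustration.

```python
import math
from itertools import combinations


def single_link(points_a, points_b):
    """Distance between two clusters: that of their closest pair of points."""
    return min(math.dist(p, q) for p in points_a for q in points_b)


def build_hierarchy(points):
    """Merge the two closest clusters repeatedly until one tree remains.

    Each cluster is a (tree, members) pair. A tree is either a point (tuple)
    or a two-element list [left, right] of subtrees; members is the flat
    list of points under that tree.
    """
    clusters = [(p, [p]) for p in points]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-link distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_link(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        (tree_i, members_i), (tree_j, members_j) = clusters[i], clusters[j]
        merged = ([tree_i, tree_j], members_i + members_j)
        # j > i, so remove j first to keep index i valid.
        clusters.pop(j)
        clusters.pop(i)
        clusters.append(merged)
    return clusters[0][0]


# Four made-up one-dimensional points forming two obvious groups.
tree = build_hierarchy([(1.0,), (1.2,), (5.0,), (5.1,)])
```

The resulting tree pairs (5.0,) with (5.1,) and (1.0,) with (1.2,) before joining the two pairs at the root, so cutting the tree at any level yields a grouping of the data.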

Advantages of Data Mining
Provides new knowledge from existing data
Public databases
Government sources
Company Databases

Old data can be used to develop new knowledge


New knowledge can be used to improve services or products
Improvements lead to:
Bigger profits
More efficient service

Uses of Data Mining
Sales/Marketing
Diversify target market
Identify clients' needs to increase response rates

Risk Assessment
Identify Customers that pose high credit risk
Fraud Detection
Identify people misusing the system. E.g. People who
have two Social Security Numbers
Customer Care
Identify customers likely to change providers
Identify customer needs

Privacy Concerns
Effective data mining requires large sources of data
To achieve a wide spectrum of data, multiple data sources are linked
Linking sources can be problematic for privacy, for example if the following histories of a customer were linked:
Shopping History
Credit History
Bank History
Employment History

Thank you
