
Data mining is the process of extracting valid, previously unknown, actionable information from large databases.

Data mining, also known as Knowledge Discovery in Databases (KDD), is the process of automatically searching large volumes of data for patterns.

[Figure: the KDD process, from Databases through Data Cleaning and Data Integration, Data Warehouse, Selection, Task-relevant Data, Data Mining, and Pattern Evaluation to Knowledge]

[Figure: decision trees, rule induction, nearest neighbor, dashboards]

Data Mining Techniques


Descriptive: Clustering, Association, Sequential Analysis
Predictive: Classification (Decision Tree, Rule Induction, Neural Networks, Nearest Neighbor Classification), Regression

Classification

The classification technique assigns each record to one of a finite set of possible classes. The model produced is usually a decision tree or a set of rules.

Decision Tree

A tree where internal nodes are simple decision rules on one or more attributes and leaf nodes are predicted class labels. E.g.:

[Figure: decision tree with decision nodes "Customer renting property > 2 years?" and "Customer age > 25 years?" and leaf nodes "Rent property", "Rent property", and "Buy property"]
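A minimal sketch of how such a tree classifies a customer, written as nested tests. The example above does not say which branch leads to which leaf, so the branch directions here are an assumption, and the function and parameter names are hypothetical.

```python
def predict_property_choice(years_renting: float, age: float) -> str:
    """Walk a two-level decision tree and return the predicted class label."""
    # Root decision node: has the customer rented property for more than 2 years?
    if years_renting > 2:
        # Second decision node: is the customer older than 25 years?
        if age > 25:
            return "Buy property"   # leaf (assumed "yes" branch)
        return "Rent property"      # leaf (assumed "no" branch)
    return "Rent property"          # leaf (assumed "no" branch of the root)
```

Each `if` corresponds to an internal node, and each `return` corresponds to a leaf holding a class label.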

Neural Networks

A set of nodes connected by directed weighted edges.

[Figure: inputs x1, x2, x3 connected to a node by edges with weights w1, w2, w3]
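A minimal sketch of a single node in such a network: each input x is multiplied by the weight w on its incoming edge, and the products are summed. The sigmoid activation and the bias term are assumptions for illustration; the text above does not name them.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Return the output of one node given its inputs and edge weights."""
    # Weighted sum over the incoming edges: x1*w1 + x2*w2 + x3*w3 (+ bias).
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Assumed activation: the logistic sigmoid squashes the sum into (0, 1).
    return 1.0 / (1.0 + math.exp(-weighted_sum))
```

A full network chains many such nodes, feeding the outputs of one layer into the next as inputs.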

Rule Induction

Try to find rules of the form

IF <left-hand-side> THEN <right-hand-side>

This is the reverse of a rule-based agent, where the rules are given and the agent must act. Here the actions are given and we have to discover the rules!

Regression is a data mining technique used to fit an equation to a dataset. The simplest form of regression, linear regression, uses the formula of a straight line (y = mx + b) and determines the appropriate values of m and b.
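As a minimal sketch, the slope and intercept of a straight line y = m*x + b can be determined by ordinary least squares. The least-squares criterion and the name `fit_line` are assumptions; the text does not say how the values are determined.

```python
def fit_line(xs, ys):
    """Fit y = m*x + b to paired data by ordinary least squares (assumed criterion)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                      # m
    intercept = mean_y - slope * mean_x    # b
    return slope, intercept
```

Once fitted, the line predicts y for a new x as `slope * x + intercept`.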

Clustering

The art of finding groups in data.
Objective: gather items from a database into sets according to (unknown) common characteristics.
Much more difficult than classification, since the classes are not known in advance (no training).
Technique: unsupervised learning.
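One common unsupervised technique for finding groups is k-means, which the text above does not name; it is used here only as an illustration. The sketch assumes the number of groups k is chosen in advance and, to stay deterministic, takes the first k points as starting centroids.

```python
def k_means(points, k, iterations=10):
    """Group points (tuples of floats) into k clusters with plain k-means."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: put each point in the cluster of its nearest centroid.
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centroids, clusters
```

No class labels are passed in: the groups emerge only from distances between the items, which is what distinguishes this from classification. The result depends on the starting centroids, so a poor start can give poor groups.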

Association

Purpose: to provide rules that correlate the presence of one set of items with the presence of another set of items.
Examples:

Performing basket analysis
Multidimensional analysis of sales
Database marketing
Cardholder pricing and profitability
Fraud detection
Predictive life-cycle management
Call detail record multidimensional analysis
Fraud pattern analysis
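Rules that correlate one set of items with another, as in basket analysis, are commonly scored by support and confidence. Those two measures are not named in the text above, so treat this as a sketch under that assumption; the transactions and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Of the transactions containing lhs, the fraction that also contain rhs."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)


# Hypothetical shopping baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
bread_milk_support = support(baskets, {"bread", "milk"})
bread_to_milk_confidence = confidence(baskets, {"bread"}, {"milk"})
```

A rule such as IF bread THEN milk is usually kept only when both its support and its confidence exceed chosen thresholds.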

Biological data mining has become an essential part of a new research field called bioinformatics.
DNA analysis

An intrusion detection system is an important part of the security management system for computers and networks. It tries to detect break-ins or break-in attempts.

Types of Intrusion Detection

Network-based
Host-based

Data mining is a decision support process in which we search for patterns of information in data.
This technique can be used on many types of data.
Overlaps with machine learning, statistics, artificial intelligence, databases, and visualization.
