DATA MINING

UNIT-5

G.Kamal

Data Mining
Data mining is about finding new information in a large amount of data. It is also called knowledge discovery in databases (KDD). Data mining is the extraction of useful patterns from data sources such as databases, text, the web, and images.

DATA MINING: COMPONENTS OF DATA MINING


1) Database, Data Warehouse, or Other Information Repository
2) Database or Data Warehouse Server
3) Knowledge Base
4) Data Mining Engine
5) Pattern Evaluation Module
6) Graphical User Interface

Architecture of a Typical Data Mining System

[Figure: architecture diagram. Labels, in order of appearance: Graphical User Interface; Pattern Evaluation; Knowledge Base; Data Mining Engine; Database or Data Warehouse Server; Database; Data Warehouse.]

Database, Data Warehouse, or Other Information Repository:


This is one or a set of databases, data warehouses, spreadsheets, or other kinds of information repositories. Data cleaning and data integration techniques may be performed on the data.

Database or Data Warehouse Server:


The database or data warehouse server is responsible for fetching the relevant data, based on the user's data mining request.

Knowledge Base:

This is the domain knowledge that is used to guide the search or to evaluate the interestingness of the resulting patterns.

Data Mining Engine:

This is essential to the data mining system and ideally consists of a set of functional modules for tasks such as characterization, association, classification, cluster analysis, and evolution and deviation analysis.

Pattern Evaluation:

This component typically employs interestingness measures and interacts with the data mining modules so as to focus the search towards interesting patterns.

Graphical User Interface:

This module communicates between users and the data mining system, allowing the user to interact with the system by specifying a data mining query or task, providing information to help focus the search, and performing exploratory data mining based on the intermediate data mining results.
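The interestingness measures used by the pattern evaluation component can be illustrated with a small sketch. The notes do not name specific measures, so support and confidence for an association rule are assumed here as common examples. The transactions, the function names, and the thresholds are all hypothetical.

```python
# Hypothetical toy transactions; not taken from the notes.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base


def is_interesting(antecedent, consequent, transactions,
                   min_support=0.5, min_confidence=0.6):
    """Keep a rule only if it passes both (assumed) thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(both, transactions) >= min_support
            and confidence(antecedent, consequent, transactions) >= min_confidence)


rule_ok = is_interesting({"bread"}, {"milk"}, transactions)
```

A pattern evaluation module of this kind would discard rules that fall below the thresholds, so that only the more interesting patterns reach the user.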

Data Mining Techniques


1) Cluster Analysis
2) Induction
   a) Decision Tree
   b) Rule Induction
3) Neural Networks
4) Online Analytical Processing
5) Data Visualization

1) Cluster Analysis
Cluster analysis, or clustering, is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).
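As one possible illustration of clustering, the sketch below implements a plain k-means procedure on 2-D points. The notes do not name a specific clustering algorithm, so k-means is an assumption, as are the sample points, the Euclidean distance used as the similarity measure, and the choice of the first k points as starting centroids.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points.

    Assumption for determinism: the first k points are the initial centroids.
    """
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical sample data: two visually separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
```

With these sample points, the three points near the origin end up in one cluster and the three points near (8.5, 9) in the other, which matches the definition above: members of a cluster are closer to each other than to members of the other cluster.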

Induction
Induction is a technique for inferring generalized information from the database. This is higher-level information or knowledge, in that it is a general statement about objects in the database.

Induction has been used in the following ways within data mining:

a) Decision Tree
b) Rule Induction
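To make rule induction concrete, here is a minimal sketch of the OneR ("one rule") method, which picks the single attribute whose value-to-class rules make the fewest errors on the data. The notes do not specify a rule induction algorithm, so OneR is a stand-in chosen for brevity. The records, attribute names, and the `one_r` function are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy records; not taken from the notes.
records = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
]


def one_r(records, target):
    """Return (attribute, rules, errors) for the best single-attribute rule set."""
    best = None
    attributes = [a for a in records[0] if a != target]
    for attr in attributes:
        # Count target classes for each value of this attribute.
        counts = defaultdict(Counter)
        for r in records:
            counts[r[attr]][r[target]] += 1
        # One rule per value: predict the majority class for that value.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors are the records that the majority-class rules misclassify.
        errors = sum(sum(c.values()) - c[rules[value]]
                     for value, c in counts.items())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


attribute, rules, errors = one_r(records, target="play")
```

The induced rules are general statements about the records, for example "if outlook is sunny then play is yes", which is the kind of higher-level knowledge that induction aims to produce.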

Neural Network

Neural Network Uses:

1) Sales Forecasting
2) Industrial Process Control
3) Customer Research
4) Data Validation
5) Risk Management

Online Analytical Processing

Data Visualization