
Introduction - Data Mining
Chapter 1, Topic 2: What is Data Mining?

Data mining refers to extracting knowledge from data. It is also called knowledge mining or knowledge discovery. Other names for data mining include:
o Knowledge Mining from Data
o Knowledge Extraction
o Data Pattern Analysis
o Data Archaeology
o Data Dredging

Knowledge Discovery

Knowledge discovery as a process consists of an iterative sequence of the following steps:

1. Data Cleaning: removing noise and inconsistent data
2. Data Integration: multiple data sources may be combined
3. Data Selection: data relevant to the analysis task are retrieved from the database
4. Data Transformation: consolidating the data selected from the database into a form appropriate for mining
5. Data Mining: the process of extracting data patterns
6. Pattern Evaluation: identifying the useful patterns (models, blueprints, or guides) that represent knowledge
7. Knowledge Presentation: representation techniques are used to present the mined knowledge to the user

Steps 1 to 4 are data preprocessing, where the data are prepared for mining. Steps 5 to 7 extract patterns from the data and present them to the user.
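The seven steps can be sketched as a small pipeline. This is only an illustrative sketch: the toy purchase records, the function names, and the choice of "pairs of items bought together" as the pattern are all assumptions made for the example, not part of the notes.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: two sources of (customer_id, item) purchase records.
store_a = [(1, "bread"), (1, "milk"), (2, "bread"), (2, None), (3, "milk")]
store_b = [(3, "bread"), (4, "Bread "), (4, "milk"), (5, "eggs"), (5, "milk")]


def clean(records):
    """Step 1, data cleaning: drop records with a missing item, normalize names."""
    return [(cid, item.strip().lower()) for cid, item in records if item]


def integrate(*sources):
    """Step 2, data integration: combine multiple sources into one list."""
    return [rec for source in sources for rec in source]


def select(records, items_of_interest):
    """Step 3, data selection: keep only records relevant to the analysis task."""
    return [rec for rec in records if rec[1] in items_of_interest]


def transform(records):
    """Step 4, data transformation: consolidate records into one basket per customer."""
    baskets = {}
    for cid, item in records:
        baskets.setdefault(cid, set()).add(item)
    return baskets


def mine(baskets):
    """Step 5, data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for items in baskets.values():
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts


def evaluate(pair_counts, n_baskets, min_support):
    """Step 6, pattern evaluation: keep pairs whose support meets the threshold."""
    supports = {pair: count / n_baskets for pair, count in pair_counts.items()}
    return {pair: s for pair, s in supports.items() if s >= min_support}


def present(patterns):
    """Step 7, knowledge presentation: format patterns as readable lines."""
    return [
        f"{a} and {b} appear together in {support:.0%} of baskets"
        for (a, b), support in sorted(patterns.items())
    ]


# Steps 1 to 4 prepare the data; steps 5 to 7 extract and display patterns.
records = integrate(clean(store_a), clean(store_b))
baskets = transform(select(records, {"bread", "milk"}))
patterns = evaluate(mine(baskets), len(baskets), min_support=0.5)
report = present(patterns)
```

In practice the sequence is iterative: a weak result at the evaluation step would typically send the analyst back to selection or transformation with different settings.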

Based on the knowledge discovery process, we can view a data mining system as having the following major components:

1. Database, Data Warehouse, WWW, or other Information Repository
2. Database or Data Warehouse Server
3. Knowledge Base
4. Data Mining Engine
5. Pattern Evaluation Module
6. User Interface

1. Database, Data Warehouse, WWW, or other Information Repository
This is one or a set of databases, data warehouses, spreadsheets, or other kinds of information repositories. Data cleaning and data integration techniques may be performed on the data.

2. Database or data warehouse server
The database or data warehouse server is responsible for fetching the relevant data, based on the user's data mining request.

3. Knowledge base
This is the domain knowledge that is used to guide the search or to evaluate the interestingness of resulting patterns. Such knowledge can include concept hierarchies, used to organize attributes or attribute values into different levels of abstraction.

4. Data mining engine
This is essential to the data mining system and ideally consists of a set of functional modules for tasks such as characterization, association analysis, classification, and evolution and deviation analysis.

5. Pattern evaluation module
This component typically employs interestingness measures and interacts with the data mining modules so as to focus the search towards interesting patterns. It may access interestingness thresholds stored in the knowledge base. Alternatively, the pattern evaluation module may be integrated with the mining module, depending on the implementation of the data mining method used.

6. Graphical user interface
This module communicates between users and the data mining system, allowing the user to interact with the system by specifying a data mining query or task, providing information to help focus the search, and performing exploratory data mining based on the intermediate data mining results.
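How the knowledge base, the mining engine, and the pattern evaluation module interact can be sketched in a few lines. This is a minimal sketch under stated assumptions: the class and function names, the city-to-country concept hierarchy, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Domain knowledge: a concept hierarchy plus an interestingness threshold."""
    # Hypothetical concept hierarchy: city (low level) -> country (higher level).
    concept_hierarchy: dict = field(default_factory=lambda: {
        "Chennai": "India", "Mumbai": "India", "Toronto": "Canada",
    })
    min_support: float = 0.5  # assumed interestingness threshold


def generalize(values, kb):
    """Lift attribute values one level up the concept hierarchy."""
    return [kb.concept_hierarchy.get(v, v) for v in values]


def mining_engine(values):
    """Toy characterization module: relative frequency of each value."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return {v: c / len(values) for v, c in counts.items()}


def pattern_evaluation(patterns, kb):
    """Keep only patterns meeting the threshold stored in the knowledge base."""
    return {p: s for p, s in patterns.items() if s >= kb.min_support}


kb = KnowledgeBase()
cities = ["Chennai", "Mumbai", "Toronto", "Chennai"]
generalized = generalize(cities, kb)
interesting = pattern_evaluation(mining_engine(generalized), kb)
```

Here the evaluation step is a separate function that reads its threshold from the knowledge base. As the notes point out, an implementation may instead fold that check into the mining module itself.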

Data mining is considered one of the most important frontiers in database and information systems. DM involves the integration of techniques from multiple disciplines such as:
o Database and data warehousing
o Statistics
o Machine learning
o Data visualization
o Image Processing
o Data Analysis

[Figure: diagram labels recovered from extraction: "World Wide Web", "Other Info Repositories"]
