Data mining, also known as Knowledge Discovery in Databases (KDD), is the process of automatically searching large volumes of data for patterns.
KDD process (figure): Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation → Knowledge
Decision trees
Rule induction
Nearest neighbor
Dashboards
Predictive
Sequential Analysis
Rule Induction
Neural Networks
Nearest Neighbor Classification
Regression
Classification
Classification is a technique used to assign records to one of a finite set of possible classes. The model that is produced is usually in the form of a decision tree or a set of rules.
Decision tree: a tree in which internal nodes are simple decision rules on one or more attributes and leaf nodes are predicted class labels. E.g.: has the customer been renting property for more than 2 years?
Buy property
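A decision tree like the one in this example can be sketched as a small data structure. This is only an illustration: the `Node` and `Leaf` classes, the field names `years_renting` and `age`, the second rule on age, and the "Rent property" label are assumptions made up here and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    label: str  # predicted class label


@dataclass
class Node:
    question: str                  # human-readable form of the decision rule
    test: Callable[[dict], bool]   # the decision rule applied to one record
    if_true: "Tree"
    if_false: "Tree"


Tree = Union[Leaf, Node]


def classify(tree: Tree, record: dict) -> str:
    """Walk from the root to a leaf, following the outcome of each rule."""
    while isinstance(tree, Node):
        tree = tree.if_true if tree.test(record) else tree.if_false
    return tree.label


# Hypothetical tree: only the first question appears in the source example.
tree = Node(
    question="Renting property > 2 years?",
    test=lambda r: r["years_renting"] > 2,
    if_true=Node(
        question="Age > 25?",  # assumed second rule, for illustration only
        test=lambda r: r["age"] > 25,
        if_true=Leaf("Buy property"),
        if_false=Leaf("Rent property"),
    ),
    if_false=Leaf("Rent property"),
)

print(classify(tree, {"years_renting": 3, "age": 30}))
```

In practice the tree is learned from labelled records rather than written by hand; the sketch only shows how a finished tree classifies a record.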
This is the reverse of a rule-based agent, where the rules are given and the agent must act. Here the actions are given and we have to discover the rules!
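One very simple way to discover rules from observed actions is the 1R (OneR) idea: for each attribute, map every value to its most frequent action, then keep the attribute whose rules make the fewest mistakes. The sketch below assumes this technique purely for illustration; the slides do not say which induction method they have in mind, and the records and attribute names are invented.

```python
from collections import Counter, defaultdict


def one_r(records, target):
    """Return (attribute, rules, errors) for the single best attribute."""
    best = None
    attributes = [key for key in records[0] if key != target]
    for attr in attributes:
        # Count how often each action occurs for each value of this attribute.
        counts = defaultdict(Counter)
        for record in records:
            counts[record[attr]][record[target]] += 1
        # One rule per value: predict the most frequent action.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(record[target] != rules[record[attr]] for record in records)
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


# Hypothetical observations: the actions are given, the rules are unknown.
records = [
    {"renting_over_2_years": "yes", "has_children": "no", "action": "buy"},
    {"renting_over_2_years": "yes", "has_children": "yes", "action": "buy"},
    {"renting_over_2_years": "no", "has_children": "yes", "action": "rent"},
    {"renting_over_2_years": "no", "has_children": "no", "action": "rent"},
    {"renting_over_2_years": "yes", "has_children": "no", "action": "buy"},
]

attribute, rules, errors = one_r(records, target="action")
print(attribute, rules, errors)
```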
Regression is a data mining technique used to fit an equation to a dataset. The simplest form of regression, linear regression, uses the formula of a straight line (y = mx + b) and determines the appropriate values for m and b to predict y from a given value of x.
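Fitting a straight line can be sketched with ordinary least squares, which picks the slope and intercept that minimise the squared vertical distances to the data. The function name `fit_line` and the sample data are illustrative assumptions, not part of the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With real, noisy data the points will not fall exactly on the line, and the fitted values are the best compromise in the least-squares sense.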
Clustering: the art of finding groups in data
Objective: gather items from a database into sets according to (unknown) common characteristics
Much more difficult than classification since the classes are not known in advance (no training)
Technique: unsupervised learning
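As one concrete instance of unsupervised grouping, here is a minimal k-means sketch. The slides do not name an algorithm, so k-means is an assumption chosen for illustration, as are the sample points and the deterministic choice of the first k points as starting centroids.

```python
import math


def k_means(points, k, iterations=10):
    """Group points into k clusters by repeated assign-and-update steps."""
    centroids = list(points[:k])  # simplistic, deterministic initialisation
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, clusters = k_means(points, k=2)
print(clusters)
```

No class labels are supplied anywhere: the groups emerge only from the distances between points, which is what distinguishes this from classification.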
Purpose: to provide rules that correlate the presence of one set of items with the presence of another set of items.
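Rules of this kind are commonly called association rules, and their strength is usually measured with support and confidence. Those two measures are standard terminology rather than something stated in the slides, and the shopping baskets below are invented for the sketch.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the share that
    also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical market baskets.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "butter"})
rule_confidence = confidence(transactions, {"bread"}, {"butter"})
print(rule_support, rule_confidence)
```

Here the rule "bread implies butter" holds in half of all baskets (support) and in two thirds of the baskets that contain bread (confidence).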
Performing
Multidimensional
Database marketing
Cardholder
Fraud
Predictive
Biological data mining has become an essential part of a new research field called bioinformatics.
DNA Analysis
An Intrusion Detection System is an important part of the security management system for computers and networks; it tries to detect break-ins or break-in attempts.
Data mining is a decision support process in which we search for patterns of information in data. It overlaps with machine learning, statistics, artificial intelligence, databases, and visualization.