Clustering
It is the task of assigning a set of objects into groups (called clusters) so that the objects in the same cluster are more similar (in some sense or another) to each other than to those in other clusters.
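As a rough illustration of this definition, the sketch below groups a handful of 2-D points with a bare-bones k-means loop. K-means is only one of many clustering techniques and is chosen here as an assumption, not because the slides name it. The sample points, the function name `kmeans`, and the way starting centroids are picked (evenly spaced positions in the input list) are all hypothetical choices made for the example.

```python
import math

def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns one cluster index per point."""
    # Assumption: starting centroids are taken at evenly spaced positions
    # in the input list, which keeps the example deterministic.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels

# Two visibly separate groups of points (made-up data).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Points that end up with the same label are closer to each other than to points carrying the other label, which is the "more similar within a cluster" idea in miniature.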
Difference
- Freeware
- Shareware
- Commercial software
WEKA
- Waikato Environment for Knowledge Analysis
Advantages of Weka
- Free availability under the GNU General Public License.
- Portability, since it is fully implemented in the Java programming language and thus runs on almost any modern computing platform.
- A comprehensive collection of data processing and modeling techniques.
- Ease of use due to its graphical user interfaces.
Visualization
Feature selection
Access to SQL databases using Java Database Connectivity (JDBC). Weka is not capable of multi-relational data mining, but separate software exists for converting a collection of linked database tables into a single table suitable for processing with Weka.
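To make the idea of flattening linked tables concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the separate conversion software mentioned above, and it does not use Weka or JDBC. The tables `customers` and `orders` and their columns are hypothetical. The sketch only shows how a join plus aggregation can turn two related tables into one row per object.

```python
import sqlite3

# In-memory database with two linked tables (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'north'), (2, 'south');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# Join the linked tables and aggregate so that each customer becomes
# exactly one row, the single-table shape a flat-file learner expects.
flat_rows = conn.execute("""
    SELECT c.id, c.region,
           COUNT(o.id) AS n_orders,
           SUM(o.amount) AS total_amount
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.region
    ORDER BY c.id
""").fetchall()
conn.close()

for row in flat_rows:
    print(row)
```

Each resulting row could then be written out as one instance in a single data file. Which aggregates to compute (count, sum, and so on) is a modeling decision that this sketch does not try to settle.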
Weka product: open source and freely available, easy to use, platform-independent.
- Knowledge Flow: This environment supports essentially the same functions as the Explorer but with a drag-and-drop interface. One advantage is that it supports incremental learning.
- Simple CLI: Provides a simple command-line interface that allows direct execution of WEKA commands for operating systems that do not provide their own command-line interface.
ARFF FILE
Attribute-Relation File Format (ARFF) is the plain-text file format Weka uses to store datasets. An ARFF file contains two sections: the header and the data section. The first line of the header gives the relation name (@relation ...). It is followed by the list of attributes (@attribute ...), and the data section begins with @data.
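The two-section layout can be seen in a small example. The sketch below embeds a made-up ARFF file (the relation `weather` and its attributes are illustrative, not taken from the slides) and reads it with a deliberately simplified parser. The function `parse_arff` is hypothetical and handles only the basic case: unquoted names, comma-separated dense rows, and `%` comment lines.

```python
SAMPLE_ARFF = """\
% Header section: relation name, then the attribute list
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute play {yes, no}
% Data section: one comma-separated row per instance
@data
sunny,85,no
overcast,83,yes
rainy,70,yes
"""

def parse_arff(text):
    """Split ARFF text into (relation, attributes, rows). Simplified sketch."""
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
            continue
        keyword = line.split(None, 1)[0].lower()
        if keyword == "@relation":
            relation = line.split(None, 1)[1]
        elif keyword == "@attribute":
            # The name comes first; the rest of the line is the type.
            _, name, kind = line.split(None, 2)
            attributes.append((name, kind))
        elif keyword == "@data":
            in_data = True  # everything after this line is instance data
    return relation, attributes, rows

relation, attributes, rows = parse_arff(SAMPLE_ARFF)
print(relation)       # weather
print(attributes[1])  # ('temperature', 'numeric')
print(rows[0])        # ['sunny', '85', 'no']
```

Values are kept as strings here. A fuller reader would convert them according to each attribute's declared type and would also handle quoted names and sparse rows.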
CLUSTER ATTRIBUTES
THANK YOU