Dept. of Information Technology, Pimpri Chinchwad College of Engineering, Nigdi, Pune University, Pune, India.
Abstract: In this paper, we present a new approach for extracting classification rules from feed-forward neural networks (NNs) that have been trained on data sets having both discrete and continuous attributes. The proposed approach first generates rules using a single pass Apriori algorithm and then applies a Bayesian algorithm to filter the generated rules.
Index Terms: Data mining, Artificial Neural Networks, Classification, Rule Extraction, Normalization.
I INTRODUCTION
One important task in data mining is the classification of data into predefined groups or classes. There are a number of techniques for classification: some are knowledge based, some are rule based, and some are probabilistic. The neural network is one of the most popular approaches used for classification because it provides good accuracy. However, the results of an ANN are incomprehensible and its output is complex to understand. The decision tree is another popular approach for classification, but its complexity increases as the size of the dataset increases. Although various decision tree algorithms such as ID3, C4.5 and C5.0 have evolved, reducing the size of the tree and extracting optimal rules remains quite complex.
II PROPOSED SYSTEM
This paper proposes a system that takes a new approach to extracting rules from the results of an ANN and provides a simple method with satisfactory accuracy.
2.1 Dataset:
The dataset is a collection of historical data. It consists of a number of attributes which determine the membership in a class. This data needs to be preprocessed before it is given as input to the neural network.
2.2 Normalization
Normalization is a preprocessing technique in which the real-time data is converted into a specific range. Min-Max is an efficient preprocessing technique used for normalization. In this technique every instance is normalized using the following formula:
B = ((A - min(A)) / (max(A) - min(A))) * (D - C) + C
where
B is the normalized value,
C and D are the lower and upper bounds of the range into which the values are mapped,
A is the attribute in the dataset.
This normalized data is then given as input to the ANN.
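The Min-Max formula above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `min_max_normalize` and the handling of a constant attribute (mapped to the lower bound C) are assumptions that do not appear in the paper.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float], c: float = 0.0, d: float = 1.0) -> List[float]:
    """Map one attribute column A into the range [c, d] with Min-Max scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant attribute carries no information,
        # so every value is mapped to the lower bound c.
        return [c for _ in values]
    # B = (A - min(A)) / (max(A) - min(A)) * (D - C) + C
    return [(v - lo) / (hi - lo) * (d - c) + c for v in values]


if __name__ == "__main__":
    print(min_max_normalize([65, 85, 105]))             # range [0, 1]
    print(min_max_normalize([65, 85, 105], -1.0, 1.0))  # range [-1, 1]
```

Each attribute column would be normalized separately, so that attributes with large numeric ranges do not dominate the network input.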
2.3 Neural Network:
An ANN is a mathematical or computational model that consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase.
Training ANN:
The proposed system uses a supervised training method. Supervised learning is an approach in which every input pattern is associated with an output pattern, which is the target or desired class. The normalized dataset is given as input to the NN. The major task is to decide the architecture of the NN. The following algorithm can be used for deciding the architecture.
Algorithm for training:
Step 1: Initially set the number of neurons in the input layer equal to the number of attributes, use one hidden layer with one neuron, and use an output layer with either one neuron or as many neurons as there are target classes.
Step 2: Train the neural network with the predefined architecture and dataset.
Step 3: After training, compare the mean square error (MSE) with the expected value.
Step 4: If the expected value has not been reached, add another neuron to the hidden layer and go back to Step 2.
Step 5: Repeat Steps 2, 3 and 4 while the number of neurons in the hidden layer is less than or equal to 2i (i = number of input nodes).
Step 6: If the desired accuracy is not achieved and the number of neurons in the hidden layer reaches 2i, analyze the previous results and select the number of neurons with the least MSE for that layer. Then add another hidden layer with one neuron and follow the above steps until the 2i condition is met again.
Step 7: Follow the above steps until the desired result is achieved.
Now let D be the set of samples correctly classified by the neural network.
2.4 Continuous to discrete conversion:
The triangular curve is a function of a vector x [4] and depends on three scalar parameters a, b, and c.
2.5 Inference engine:
The discrete data acts as input for the inference engine, which determines the classification rules.
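The continuous to discrete conversion can be sketched as below. The paper does not show how the triangular curve is turned into discrete levels, so two assumptions are made here: the curve takes the standard triangular membership form f(x; a, b, c) = max(min((x - a)/(b - a), (c - x)/(c - b)), 0) with a < b < c, and each value is assigned the level whose triangle gives the highest membership. The names `triangular`, `discretize` and `LEVELS`, and the three example triangles over the normalized range [0, 1], are hypothetical.

```python
from typing import List, Tuple


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Standard triangular membership: 0 outside (a, c), rising to 1 at b (needs a < b < c)."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)


def discretize(x: float, triangles: List[Tuple[float, float, float]]) -> int:
    """Return the index of the triangle with the highest membership for x.

    Assumption: ties are resolved in favour of the lower level.
    """
    memberships = [triangular(x, a, b, c) for a, b, c in triangles]
    return memberships.index(max(memberships))


# Hypothetical levels (low, medium, high) for an attribute normalized to [0, 1].
LEVELS = [(-0.5, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.5)]

if __name__ == "__main__":
    for value in (0.1, 0.5, 0.9):
        print(value, "-> level", discretize(value, LEVELS))
```

Overlapping triangles give every normalized value a non-zero membership in at least one level, which is why the outer triangles extend slightly beyond the [0, 1] range.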
Single pass Apriori algorithm:
Classification algorithms generally generate decision trees or classifiers according to a pre-determined target; therefore, they need to be tuned to produce human-readable rules that can be used in decision support. In this study, an integrated approach is proposed to produce association rules that can be used as classifiers. The Apriori [2] algorithm is used as the base model and is modified to generate human-readable classification association rules.
Input: Dataset with discrete values.
Output: Rule set.
Algo_SinglepassApriori ()
{
Step 1: For every record in the data set: if the key is present in the hash table, increment the key value (association count); else insert the record in the hash table and initialize its association count to one.
Step 2: Perform Step 1 until all records in the input dataset are scanned.
Step 3: Check for conflicting (overlapping) rules and eliminate those cases.
Step 4: The contents of the hash table are the rules.
}
Naive Bayesian Classification:
It is based on Bayes' theorem. It is particularly suited to cases where the dimensionality of the inputs is high. Parameter estimation for naive Bayes models uses the method of maximum likelihood. In spite of its over-simplified assumptions, it often performs well in many complex real-world situations. The Bayesian algorithm is used to validate the rules extracted using the Apriori algorithm.
Advantage: It requires only a small amount of training data to estimate the parameters.
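The two stages described above can be sketched together in Python. This is an illustrative reading rather than the authors' code, and it makes several assumptions: a record is a pair (tuple of discretized levels, class label); a "conflicting" rule is an antecedent that occurs with more than one class, and such antecedents are dropped; and a rule passes the Bayesian check when a maximum-likelihood naive Bayes classifier predicts the same class for its antecedent. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Tuple

Levels = Tuple[int, ...]
Record = Tuple[Levels, Hashable]          # (discretized levels, class label)
Rules = Dict[Levels, Tuple[Hashable, int]]  # antecedent -> (class, association count)


def single_pass_apriori(records: List[Record]) -> Rules:
    """Steps 1-2: one scan that counts identical records in a hash table."""
    counts = Counter(records)
    classes_for: Dict[Levels, Dict[Hashable, int]] = defaultdict(dict)
    for (levels, label), n in counts.items():
        classes_for[levels][label] = n
    # Step 3 (assumed reading): drop antecedents that map to several classes.
    return {levels: next(iter(by_class.items()))
            for levels, by_class in classes_for.items() if len(by_class) == 1}


def naive_bayes_predict(records: List[Record], levels: Levels) -> Hashable:
    """Most probable class for `levels`, using maximum-likelihood estimates."""
    class_counts = Counter(label for _, label in records)
    value_counts: Counter = Counter()
    for lv, label in records:
        for i, v in enumerate(lv):
            value_counts[(i, v, label)] += 1
    best_label, best_score = None, -1.0
    for label, c in class_counts.items():
        score = c / len(records)                      # prior P(class)
        for i, v in enumerate(levels):
            score *= value_counts[(i, v, label)] / c  # P(attribute value | class)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def filter_rules(records: List[Record], rules: Rules) -> Rules:
    """Keep only the rules whose class agrees with the naive Bayes prediction."""
    return {lv: (label, n) for lv, (label, n) in rules.items()
            if naive_bayes_predict(records, lv) == label}


if __name__ == "__main__":
    data = [((1, 1), "a")] * 3 + [((2, 2), "a")] * 2 + [((1, 2), "b"), ((3, 3), "b")]
    print(filter_rules(data, single_pass_apriori(data)))
```

Because the hash table is keyed on whole records, the rule set is built in a single scan of the data, and the association count of each rule is available for later ranking.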
III EXPERIMENTAL RESULTS:
In this section the results of the proposed system on the Bupa dataset are described. Bupa has 6 attributes related to liver disorders, plus a selector field, and 345 instances.
Attribute information:
1. Mcv: mean corpuscular volume
2. Alkphos: alkaline phosphatase
3. Sgpt: alanine aminotransferase
4. Sgot: aspartate aminotransferase
5. Gammagt: gamma-glutamyl transpeptidase
6. Drinks: number of half-pint equivalents of alcoholic beverages drunk per day
7. Selector: field used to split the data into two sets
Training results:
Sr. no.  Layers  Neurons  Regression  MSE
1        2       7,3      0.87        0.230
2        2       7,4      0.82        0.306
3        2       7,1      0.7         0.444
4        2       8,6      0.805       3.45
5        3       7,3,3    0.6         3.85
6        2       9,9      0.933       0.128
7        3       9,10     0.939       0.116
8        3       9,11     0.927       0.140
9        3       9,12     0.922       0.157
10       3       9,10,4   0.94        0.099
11       3       9,10,5   0.94        0.10
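Candidate architectures such as those listed above come out of the architecture-selection steps of Section 2.3, which can be sketched as a search loop. This is a sketch under stated assumptions, not the authors' procedure: a neuron is added while the MSE is still above the expected value, the search stops as soon as the MSE is at or below it, and the best layer size found so far is frozen before a new layer is added. The callable `train_and_score` (which would train a network with the given hidden layer sizes and return its MSE) and the cap `max_hidden_layers` are hypothetical.

```python
from typing import Callable, List, Tuple


def search_architecture(
    n_inputs: int,
    train_and_score: Callable[[List[int]], float],
    target_mse: float,
    max_hidden_layers: int = 3,
) -> Tuple[List[int], float]:
    """Grow the hidden layers one neuron at a time, up to 2i neurons per layer."""
    limit = 2 * n_inputs                      # Step 5: at most 2i neurons per layer
    frozen: List[int] = []                    # sizes of hidden layers already fixed
    best: Tuple[float, List[int]] = (float("inf"), [])
    for _ in range(max_hidden_layers):
        layer_best: Tuple[float, List[int]] = (float("inf"), [])
        for n in range(1, limit + 1):         # Steps 2-4: train, check MSE, add a neuron
            hidden = frozen + [n]
            mse = train_and_score(hidden)
            if mse < layer_best[0]:           # strict '<' keeps the fewest neurons on ties
                layer_best = (mse, hidden)
            if mse <= target_mse:             # Step 7: desired result achieved
                return hidden, mse
        frozen = layer_best[1]                # Step 6: keep the best size, add a layer
        if layer_best[0] < best[0]:
            best = layer_best
    return best[1], best[0]                   # target not reached: best network found


if __name__ == "__main__":
    # Stand-in scorer for demonstration only; a real one would train a network.
    fake = lambda hidden: 1.0 / (1 + sum(hidden))
    print(search_architecture(3, fake, target_mse=0.105))
```

Keeping the training routine behind a callable lets the same loop be used with any network library, and the loop itself only encodes the growth policy.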
Inference engine
The inference engine consists of the single pass Apriori algorithm and Bayesian classification. Both algorithms work in a coordinated manner. Initially Apriori is applied to extract classification rules from the input data; then the Bayesian algorithm is used to check, validate and correct the rules generated by the Apriori algorithm [2] [5].
Fig. Regression plot after training
The rules generated by the inference engine are of the following form:
Ai Dj, Ai+1 Dj, Ai+2 Dj, ..., An Dj => Ci
where Ai denotes the attribute number, Dj denotes the discretized level, and Ci denotes the class.
Fig. Architecture of ANN
IV CONCLUSION
This approach provides a new way towards rule extraction for classification. The rules generated by this approach provide satisfactory accuracy, but the number of rules generated is large. This can be overcome by using Apriori in an incremental manner, i.e. by eliminating the least significant attributes (based on entropy) and thus reducing the number of rules.
Neural Network Training Results:
Total Instances  Correctly Trained  Incorrectly Trained
345              337                8