
International Journal of Scientific Research Engineering & Technology (IJSRET), ISSN 2278 0882

Volume 4, Issue 5, May 2015

Students Performance Prediction Using Weighted Modified ID3 Algorithm


Sonia Joseph¹, Laya Devadas²
¹ Department of CSE, College of Engineering Munnar, Kerala
² Department of CSE, College of Engineering Munnar, Kerala

ABSTRACT
The success of an educational institution can be measured in terms of the quality of education it provides to its students. In an education system, the highest level of quality can be achieved by exploring knowledge regarding the prediction of students' performance. Data mining techniques play an important role in data analysis. The application of data mining to educational data allows educators to discover useful knowledge about students. This paper presents a classification model based on a decision tree approach to predict students' academic performance. This method is useful in identifying those students who are likely to fail in the semester examinations, allowing teachers to provide appropriate assistance in a timely manner. It also helps weaker students to improve their results.

Keywords: Data Mining, Decision Tree Algorithm, ID3 Algorithm, Modified ID3 Algorithm

1. INTRODUCTION
In the modern world, a large amount of data is available that can be used effectively to produce necessary information. The information obtained can be used in fields such as medicine, education, banking, and business. As huge amounts of data are collected and stored in databases, traditional statistical techniques and database management tools are no longer sufficient for analysing them. Knowledge Discovery and Data Mining (KDD) is an interdisciplinary area focusing on methodologies for discovering knowledge from large amounts of data. Within the KDD process, different means of data mining analysis allow extracting important information from the database, such as classification, clustering, association, and neural networks [1]. Educational Data Mining (EDM) is an emerging discipline concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students and the settings in which they learn [2]. Knowledge is hidden in educational data sets and is extractable through data mining techniques. The EDM process converts raw data coming from educational systems into useful information that could potentially have a great impact on educational research and practice. Knowledge discovered by EDM algorithms can be used to help teachers manage their classes, understand their students' learning processes, and reflect on their own teaching methods. Evaluation is a systematic process of collecting, analysing, and interpreting evidence of students' progress and achievement in cognitive areas of learning for the purpose of taking a variety of decisions. Evaluation thus involves gathering and processing information and decision-making. This paper tries to extract useful information from graduate student data collected from College of Engineering Munnar. The system takes into account a number of factors by gathering data about students' semester marks, term test marks, attendance, and various other factors. A weighted modified ID3 algorithm is used to evaluate students' performance. Using this algorithm, it is possible to extract knowledge that describes students' performance at the end of the semester examination.

2. RELATED WORKS

In 2006, Qasem A. Al-Radaideh et al. [3] made an attempt to use data mining techniques to analyse and evaluate students' academic data for improving the quality of the higher educational system. The paper used data mining processes, particularly classification, to predict the final grade in a course under study by evaluating student data to find the main attributes that may greatly affect student performance in courses. The WEKA toolkit was used to measure the accuracy of the classification models. Three different classification methods were tested: ID3, C4.5, and Naïve Bayes, and the evaluation results were reported as the percentage of correctly classified instances for each algorithm. The obtained results show that the classification accuracy of the three algorithms is not very high, which may indicate that the collected samples and attributes are not sufficient to generate a classification model of better quality.
In 2013, Suchita Borkar et al. [4] presented the potential use of educational data mining with an association rule mining algorithm for enhancing the quality of education and predicting students' performance. The analysis revealed that students' university performance depends on unit test marks, assignments, attendance, and graduation percentage. The results show that students' performance in university results can be improved by identifying students who are weak in unit tests, attendance, assignments, or graduation percentage, and giving them additional guidance.
In 2012, Er. Rimmy Chuchra et al. [5] introduced the concept of extracting information from large volumes of data. The authors use marks obtained by students of Sri Sai University, Palampur in their postgraduate examinations, along with other factors, and introduce various techniques to improve postgraduate students' performance and identify students with low grades. The data cover a period of one and a half years. The authors use clustering, decision trees, and neural networks to evaluate students' performance; the approach also helps in identifying dropouts and students who need special assistance.
In 2011, Umesh Kumar Pandey and S. Pal [6] applied a data mining technique, Bayes classification, to a students' database to predict students' results on the basis of the previous year's data. An advantage of the naive Bayes classifier is that it requires only a small amount of training data to estimate the parameters necessary for classification. Because the variables are assumed independent, only the variances of the variables for each class need to be determined, not the entire covariance matrix. In spite of their naive design and apparently over-simplified assumptions, naive Bayes classifiers have worked quite well in many complex real-world situations. In this study, data were gathered from different degree colleges affiliated with Dr. R. M. L. Awadh University, Faizabad.
In 2013, Kalpesh Adhatrao et al. [7] developed a system which can predict the performance of students from their previous performance using classification-based data mining techniques. They analysed a data set containing information about students such as gender, marks scored in the board examinations of classes X and XII, marks and rank in entrance examinations, and first-year results of the previous batch of students. By applying the ID3 and C4.5 classification algorithms to this data, they predicted the general and individual performance of freshly admitted students in future examinations. For a total of 182 students, the average accuracy achieved in bulk and singular evaluations was approximately 75.275% for both ID3 and C4.5.
In 2013, Ramanathan L et al. [8] proposed a modified weighted decision tree algorithm that overcomes the shortcomings of the simple ID3 algorithm. One of the main drawbacks of ID3 is that it is biased towards attributes with more values. This can lead to a wrong selection and, as a result, the generated tree may not be efficient. To remove this bias of the traditional ID3 algorithm, an improved weighted ID3 (wID3) algorithm was proposed. In it, the attribute with the highest gain ratio is multiplied by a weight, giving it a new value, and among the new values the attribute with the highest gain ratio is selected as a node of the tree. Information gain is also replaced by gain ratio, which is more normalized. The data set for the study was collected from the CSE branch of VIT University and consists of 304 instances, each with 10 attributes. wID3's performance was compared with the J48 algorithm and the Naïve Bayes classification algorithm. From the performance analysis, the authors concluded that the wID3 algorithm is more efficient than the other two, correctly classifying 93% of the records.

3. DATA MINING PROCESS

In the present-day educational system, a student's performance is determined by internal assessment and the end semester examination. The internal assessment is carried out by the teacher based on the student's performance in educational activities such as class tests, seminars, assignments, attendance, and lab work. The end semester mark is the score obtained by the student in the semester examination. Each student has to obtain minimum marks in both the internal assessment and the end semester examination to pass a semester.
The major objective of using data mining techniques is to discover knowledge from the existing data available in the college database. Before applying data mining techniques to the data set, a methodology is needed to govern this work. Fig. 1 depicts the methodology used.
[Fig. 1 workflow: students' records → student data collection → data pre-processing (data cleaning, data selection, data transformation) → decision tree → result evaluation → knowledge representation]


Fig.1 Data Mining Work Methodology


3.1 Preparation of Data
The initial data set is prepared from data obtained from the first batch of students (2000 admission) of the Dept. of CSE, College of Engineering Munnar. The initial data set contains 56 records. In this step, data stored in different records, such as the attendance record and mark sheet record, are joined to form a single table. After the joining process, errors are removed. Table 1 shows a sample of the data set.
Table 1 Data Set
Sl no | PSM    | CTM     | ASS | ATT     | LW  | ESM
1     | first  | good    | yes | good    | yes | first
2     | first  | good    | yes | good    | yes | first
3     | first  | good    | yes | good    | yes | first
4     | first  | good    | yes | good    | no  | second
5     | third  | poor    | no  | poor    | yes | first
6     | second | average | yes | average | yes | first
7     | first  | good    | yes | good    | yes | first
8     | first  | good    | yes | average | no  | first
9     | first  | good    | no  | poor    | yes | first
10    | first  | good    | yes | good    | yes | first

3.2 Data Selection and Transformation
In this step, only those fields required for data mining are selected. Most of the attributes reflect the past performance of the students. The reasons for concentrating on past performance data are:
1. The data are easily available in the administrative department of the institute.
2. If a student has performed well in the past, it is likely that he or she will perform well in subsequent exams as well.
All the predictor and response variables derived from the database are given in Table 2.
Table 2 Student Related Variables
Variable | Description            | Possible Values
PSM      | Previous Semester Mark | Distinction >75%, First >60% and <75%, Second >45% and <60%, Third >36% and <45%, Fail <36%
CTM      | Class Test Mark        | Poor, average, good
ASS      | Assignment             | Yes, No
ATT      | Attendance             | Poor, average, good
LW       | Lab Work               | Yes, No
ESM      | End Semester Mark      | Distinction >75%, First >60% and <75%, Second >45% and <60%, Third >36% and <45%, Fail <36%

PSM - Previous Semester Marks. Split into five class values: Distinction >75%, First >60% and <75%, Second >45% and <60%, Third >36% and <45%, Fail <36%.
CTM - Class Test Mark obtained. In each semester two class tests are conducted, and the average of the two is used to calculate sessional marks. CTM is split into three classes: Poor <40%, Average >40% and <60%, Good >60%.
ASS - Assignment performance. In each semester two assignments are given to students. Assignment performance is divided into two classes: Yes - student submitted the assignments, No - student did not submit the assignments.
ATT - Attendance of the student. A minimum of 70% attendance is compulsory for appearing in the End Semester Examination, though in some cases students with low attendance are also permitted to appear on genuine grounds. Attendance is divided into three classes: Poor <60%, Average >60% and <80%, Good >80%.
LW - Lab Work. Divided into two classes: Yes - student completed the lab work, No - student did not complete the lab work.
ESM - End Semester Marks obtained in the B.Tech semester examination; this is the response variable. Split into five class values: Distinction >75%, First >60% and <75%, Second >45% and <60%, Third >36% and <45%, Fail <36%.
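The discretization described above amounts to simple threshold rules. A minimal sketch in Python (the paper's implementation used Java/NetBeans; the function names below are illustrative, not from the paper):

```python
def grade_class(pct):
    """Map a percentage mark to the PSM/ESM class labels of Table 2."""
    if pct > 75:
        return "distinction"
    if pct > 60:
        return "first"
    if pct > 45:
        return "second"
    if pct > 36:
        return "third"
    return "fail"

def ctm_class(pct):
    """Class Test Mark: poor < 40%, average 40-60%, good > 60%."""
    if pct < 40:
        return "poor"
    if pct <= 60:
        return "average"
    return "good"

def att_class(pct):
    """Attendance: poor < 60%, average 60-80%, good > 80%."""
    if pct < 60:
        return "poor"
    if pct <= 80:
        return "average"
    return "good"
```

For example, a student with 62% in the previous semester, 55% in class tests, and 85% attendance would be encoded as ("first", "average", "good").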
3.3 Decision Tree
A decision tree is a tree in which each leaf node represents a decision and each branch node represents a choice between a number of alternatives. Decision trees are commonly used for gathering information for the purpose of decision-making. A decision tree starts with a root node on which users take action. From this node, the tree splits each node recursively according to the decision tree learning algorithm. The final result is a decision tree in which each branch represents a possible scenario of a decision and its outcome.
Decision trees classify examples by sorting them based on their attribute values. The attributes here are Previous Semester Marks, Class Test Mark, Assignment, Attendance, Lab Work, and End Semester Marks. Each node in the decision tree represents one of these attributes, and each branch represents a value that the node can take. Decision trees are constructed using a top-down greedy search algorithm which recursively subdivides the training examples based on the attribute that best classifies the training data. The attribute that best divides the training examples becomes the root node of the tree. The algorithm is then repeated on each partition of the divided data, creating subtrees



until the training data is divided into subsets of the same class. At each level of the partitioning process, a statistical property known as information gain is used to determine which attribute best divides the training data.
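The entropy and information gain computation used in this partitioning step can be sketched as follows; a minimal Python illustration over attribute-value records (function names are illustrative, not from the paper):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels: -sum_j p_j * log2(p_j)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Gain(S, A): entropy reduction from partitioning rows by attribute attr."""
    n = len(rows)
    base = entropy(labels)
    remainder = 0.0
    for value in set(r[attr] for r in rows):
        # Labels of the subset of rows taking this attribute value.
        subset = [lbl for r, lbl in zip(rows, labels) if r[attr] == value]
        remainder += len(subset) / n * entropy(subset)
    return base - remainder
```

The attribute maximizing `information_gain` over the training set becomes the split at the current node.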
Strengths of Decision Tree Methods:
Ability to Generate Understandable Rules: The ability of decision trees to generate rules that can be translated into comprehensible English is the greatest strength of this technique.
Ease of Calculation at Classification Time: Although a decision tree can take many forms, in practice the algorithms used to produce decision trees generally yield trees with a low branching factor and simple tests performed at each node. Typical tests include numeric comparisons, set membership, and simple conjunctions. When implemented on a computer, these tests translate into simple Boolean and integer operations that are computationally inexpensive and fast.
Ability to Handle Both Continuous and Categorical Variables: Decision-tree methods are equally adept at handling continuous and categorical variables. Categorical variables, which pose problems for neural networks and statistical techniques, come ready-made with their own splitting criteria: one branch for each category. Continuous variables are easy to split by picking a number somewhere in their range of values.
Ability to Clearly Indicate Best Fields: Decision-tree building algorithms put the field that does the best job of splitting the training records at the root node of the tree.
3.4 Weighted Modified ID3 Algorithm
The ID3 algorithm is one of the best-known algorithms for generating decision trees. However, it has a major shortcoming: it is biased towards attributes with many values. This shortcoming can be overcome by using gain ratio (instead of information gain) and by assigning weights to attributes at every decision-making point. To remove the bias of the traditional ID3 algorithm towards attributes with many values, an improved weighted ID3 (wID3) algorithm is proposed. In it, the attribute with the highest gain ratio (not information gain) is multiplied by a weight, giving it a new value, and among the new values the attribute with the highest gain ratio is selected as a node of the tree. Replacing information gain with the more normalized gain ratio, together with the weighting, overcomes the bias problem.

GainRatio(A) = Gain(S, A) / Entropy(S)          (1)

Entropy(S) = -Σ_j p_j log p_j          (2)

To calculate the weights, the relevance between condition attributes and decision attributes is checked using a correlation function. The value between a condition attribute A and the decision attribute D is given by:

AF_D(A) = Σ_{i=1}^{v} (|A_i1| - |A_i2|) / v          (3)

where v is the number of different values of A. The modified weight of condition attribute A is then:

w'_A = AF_D(A) / Σ_{j=1}^{n} AF_D(j)          (4)

if the gain ratio of A is the maximum; otherwise w'_A = 1.


The modified gain ratio of condition attribute A is then:

GainRatio'(A) = w'_A × GainRatio(A)          (5)

Note that only the weight of the attribute with the maximum gain ratio is modified; the rest keep the same gain ratio as before. In this way, the gain ratio of attributes satisfying the condition is reduced, which overcomes the bias problem.
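Equations (1)-(5) can be sketched in Python as follows. This is an illustrative reading, not the paper's implementation: in particular, eq. (3) is assumed here to take |A_i1| and |A_i2| as the counts of records with the i-th value of A falling into the first and second decision class, and the lost operator between them is assumed to be subtraction.

```python
import math
from collections import Counter

def entropy(labels):
    """-sum_j p_j * log2(p_j), eq. (2)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    n, base = len(rows), entropy(labels)
    rem = 0.0
    for v in set(r[attr] for r in rows):
        sub = [l for r, l in zip(rows, labels) if r[attr] == v]
        rem += len(sub) / n * entropy(sub)
    return base - rem

def af(rows, labels, attr):
    """Association function AF_D(A), eq. (3), under the assumptions above."""
    c1, c2 = sorted(set(labels))[:2]          # first two decision classes
    values = set(r[attr] for r in rows)
    total = sum(
        abs(sum(1 for r, l in zip(rows, labels) if r[attr] == v and l == c1)
            - sum(1 for r, l in zip(rows, labels) if r[attr] == v and l == c2))
        for v in values)
    return total / len(values)

def weighted_gain_ratios(rows, labels, attrs):
    """Eqs (1), (4), (5): gain ratio per attribute; only the attribute with the
    maximum gain ratio is rescaled by w'_A = AF_D(A) / sum_j AF_D(j)."""
    base = entropy(labels) or 1.0
    gr = {a: information_gain(rows, labels, a) / base for a in attrs}
    best = max(gr, key=gr.get)
    total_af = sum(af(rows, labels, a) for a in attrs) or 1.0
    gr[best] *= af(rows, labels, best) / total_af
    return gr
```

The attribute maximizing the returned weighted gain ratio becomes the node's test attribute.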
wID3 Algorithm:
Begin
  Create a node N;
  If (all samples are in the same class)
    Return N as a leaf labeled with the class name;
  If (attribute list is empty)
    Return N as a leaf labeled with the most common class;
  Calculate the weight of each attribute;
  Select the test attribute, i.e. the attribute with the highest gain ratio;
  Label node N with the test attribute;
  For each known value a of the test attribute, grow a branch from node N for the condition test attribute = a;
    Let S_i be the set of samples for which test attribute = a_i;
    If (S_i is empty)
      Attach a leaf labeled with the most common class in the samples;
    Else
      Attach the node returned by generate_decision_tree(S_i, attribute_list - test_attribute);
End
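A runnable sketch of this pseudocode in Python (the paper's implementation was in Java; names here are illustrative). The per-attribute weighting is abstracted into a caller-supplied `weight_fn`, defaulting to 1.0 for every attribute as in plain gain ratio:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def build_wid3(rows, labels, attrs, weight_fn=None):
    """Recursive tree builder following the wID3 pseudocode. Returns either a
    leaf label (str) or a dict mapping (attribute, value) pairs to subtrees."""
    if len(set(labels)) == 1:                 # all samples in one class -> leaf
        return labels[0]
    if not attrs:                             # attribute list empty -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    base = entropy(labels)                    # > 0 here, since labels are mixed
    def score(a):
        # Weighted gain ratio of attribute a, per eqs (1) and (5).
        n = len(rows)
        rem = sum(len(sub) / n * entropy(sub)
                  for v in set(r[a] for r in rows)
                  for sub in [[l for r, l in zip(rows, labels) if r[a] == v]])
        w = weight_fn(a) if weight_fn else 1.0
        return w * (base - rem) / base
    test = max(attrs, key=score)              # test attribute = highest score
    node = {}
    # Branches are grown only for values actually present, so S_i is non-empty.
    for v in set(r[test] for r in rows):
        sub_rows = [r for r in rows if r[test] == v]
        sub_lbls = [l for r, l in zip(rows, labels) if r[test] == v]
        node[(test, v)] = build_wid3(sub_rows, sub_lbls,
                                     [a for a in attrs if a != test], weight_fn)
    return node
```

On a toy data set where attendance alone determines the result, the builder produces a one-level tree splitting on that attribute.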
3.5 Knowledge from Decision Tree Classifier
Once a decision tree has been constructed, it is a simple matter to convert it into an equivalent set of rules. To generate rules, trace each path in the decision tree from root node to leaf node, recording the test outcomes as antecedents and the leaf-node classification as the consequent. The if-then rules extracted from the decision tree can serve as a knowledge base for further classification. Converting a decision tree to rules has three main advantages:



Converting to rules allows distinguishing among the different contexts in which a decision tree node is used, since each path through the node produces a distinct rule.
Converting to rules removes the distinction between attribute tests that occur near the root and those that occur near the leaves of the tree.
Converting to rules improves readability; rules are often easier for people to understand.
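The path-tracing procedure described above can be sketched as a short recursive generator; a minimal illustration, assuming the nested-dict tree representation (leaf = class label, internal node = dict keyed by (attribute, value)):

```python
def extract_rules(tree, antecedents=()):
    """Trace every root-to-leaf path of a nested-dict decision tree, yielding
    (antecedents, consequent) if-then rules."""
    if not isinstance(tree, dict):            # leaf: the classification
        yield list(antecedents), tree
        return
    for (attr, value), subtree in tree.items():
        # Extend the antecedent list with this branch's test outcome.
        yield from extract_rules(subtree, antecedents + ((attr, value),))
```

For example, a tree that checks CTM only when attendance is poor yields three rules, one per leaf.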
The proposed improved ID3 algorithm differs from the original ID3 in the following ways:
- An extra association function is used to overcome the shortcomings of ID3.
- More reasonable and effective rules are generated.
- Missing values can be handled without impacting the accuracy of decisions.
- More accurate rules are generated.
- The time complexity of the improved ID3 is higher, but this is acceptable given the speed of modern computers.
- More rules are generated, increasing the accuracy of decisions.
- The root node is decided not only by the information gain of an attribute but also by an additional function, the association function (AF).
3.6 Design of Prediction System
The prediction system contains a knowledge base of accumulated experience and a set of rules for applying that knowledge base to each particular situation. The knowledge base incorporates Previous Semester Marks, Class Test Mark, Assignment, Attendance, Lab Work, and their relationships with the performance obtained in the End Semester examination. The architecture of the prediction system is shown in Fig. 2.
[Fig. 2 components: knowledge base, working memory, inference mechanism, end user]

Fig. 2 Architecture of the prediction system


Knowledge Base: Contains the domain knowledge, represented declaratively, often as IF-THEN rules. These rules form the knowledge base of the system.
Working Memory: The user enters information about the current problem into the working memory. The system matches this information against the knowledge contained in the knowledge base to infer new facts, enters these new facts into the working memory, and the matching process continues. Eventually the system reaches a conclusion, which it also enters into the working memory.
Inference Mechanism: Works with the information in the knowledge base and the working memory. It searches the rules for a match between their premises and the information contained in the working memory; when it finds a match, it produces the output.
End User: The individual who consults the system to obtain the advice provided by the prediction system.
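The match-and-fire cycle between working memory and rules described above is essentially forward chaining. A minimal sketch, assuming rules are (antecedent-list, consequent) pairs and facts are hashable values (the function name is illustrative, not from the paper):

```python
def infer(rules, working_memory):
    """Forward-chaining sketch: repeatedly fire any rule whose antecedents are
    all present in working memory, adding its consequent as a new fact, until
    no rule can fire. Returns the final set of facts."""
    facts = set(working_memory)
    fired = True
    while fired:
        fired = False
        for antecedents, consequent in rules:
            if consequent not in facts and all(a in facts for a in antecedents):
                facts.add(consequent)        # new fact enters working memory
                fired = True
    return facts
```

In the prediction system, facts would be (attribute, value) pairs such as ("ATT", "good"), and a fired rule's consequent would be the predicted ESM class.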

4. RESULT
A dataset of 56 students from the B.Tech course was obtained from the CSE department of College of Engineering, Munnar. Details such as semester marks, class test marks, assignment marks, attendance, and lab work were collected from the college database. The weighted modified ID3 algorithm was implemented using this collected information, with the help of the following tools: JDK (Java Development Kit) and NetBeans IDE (Integrated Development Environment) 8.0. Table 3 shows the classification accuracy of the ID3, C4.5, CART, and modified ID3 algorithms.
Table 3 Classifiers Accuracy
ALGORITHM    | PREDICTION ACCURACY
C4.5         | 45.8333%
ID3          | 52.0833%
CART         | 56.25%
MODIFIED ID3 | 76%

5. CONCLUSION
The educational system is the backbone of the progress and development of any society. The greater the ability of the education system to improve the performance of its students, the better the chance of the society to produce successful citizens. Keeping this fact in mind, it is necessary to constantly work towards a more sophisticated education system. Data mining is a powerful concept that extracts hidden information from voluminous databases, and it can provide many solutions towards building a stronger education system. This study shows that students' past academic performance can be used to build a decision-tree model for predicting students' performance in later examinations. From Table 3, it is clear that the accuracy of the proposed prediction system is higher than that of the other methods.



This study is a stepping stone towards the integration of technology and the education system.

REFERENCES
[1] Elena Susena, "Using Data Mining Techniques in Higher Education," National Defence University "Carol I", Bucharest, pp. 68-72.
[2] A. Dinesh Kumar and V. Radhika, "A Survey on Predicting Student Performance," International Journal of Computer Science and Information Technology, Vol. 5 (5), 2014.
[3] Qasem A. Al-Radaideh, Emad M. Al-Shawakfa, and Mustafa I. Al-Najjar, "Mining Student Data Using Decision Trees," The 2006 International Arab Conference on Information Technology (ACIT 2006), Jordan, Nov. 2006.
[4] Suchita Borkar and K. Rajeswari, "Predicting Students' Academic Performance Using Education Data Mining," International Journal of Computer Science and Mobile Computing (IJCSMC), Vol. 2, Issue 7, July 2013, pp. 273-279.
[5] Er. Rimmy Chuchra, "Use of Data Mining Techniques for the Evaluation of Student Performance: A Case Study," International Journal of Computer Science and Management Research, Vol. 1, Issue 3, October 2012.
[6] Umesh Kumar Pandey and S. Pal, "Data Mining: A Prediction of Performer or Underperformer Using Classification," International Journal of Computer Science and Information Technologies, Vol. 2 (2), 2011, pp. 686-690.
[7] Kalpesh Adhatrao, Aditya Gaykar, Amiraj Dhawan, Rohit Jha, and Vipul Honrao, "Predicting Students' Performance Using ID3 and C4.5 Classification Algorithms," International Journal of Data Mining & Knowledge Management Process (IJDKP), Vol. 3, No. 5, September 2013.
[8] Ramanathan L, Saksham Dhanda, and Suresh Kumar D, "Predicting Students' Performance Using Modified ID3 Algorithm," International Journal of Engineering and Technology (IJET).

