
International Journal of Scientific Research Engineering & Technology (IJSRET), ISSN 2278 0882

Volume 4, Issue 5, May 2015

Students Performance Prediction Using Weighted Modified ID3 Algorithm


Sonia Joseph¹, Laya Devadas²
¹ Department of CSE, College of Engineering Munnar, Kerala
² Department of CSE, College of Engineering Munnar, Kerala

ABSTRACT
The success of an educational institution can be measured in terms of the quality of education it provides to its students. In an education system, the highest level of quality can be achieved by exploring knowledge regarding the prediction of students' performance. Data mining techniques play an important role in data analysis. The application of data mining to educational data allows educators to discover useful knowledge about students. This paper presents a classification model based on a decision tree approach to predict students' academic performance. This method is useful in identifying those students who are likely to fail in the semester examinations, allowing teachers to provide appropriate assistance in a timely manner. It also helps weaker students to improve their results.

Keywords: Data Mining, Decision Tree Algorithm, ID3 Algorithm, Modified ID3 Algorithm

1. INTRODUCTION
In the modern world, a large amount of data is available that can be used effectively to produce necessary information. The information obtained can be used in fields such as medicine, education, banking, and business. As huge amounts of data are collected and stored in databases, traditional statistical techniques and database management tools are no longer sufficient for analysing them. Knowledge Discovery and Data Mining (KDD) is an interdisciplinary area focusing on methodologies for discovering knowledge from large amounts of data. Within the KDD process, different means of data mining analysis allow extracting important information from the database, such as classification, clustering, association, and neural networks [1]. Educational Data Mining (EDM) is an emerging discipline concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students and the settings in which they learn [2]. Knowledge is hidden in educational data sets and is extractable through data mining techniques. The EDM process converts raw data coming from educational systems into useful information that could potentially have a great impact on educational research and practice. Knowledge discovered by EDM algorithms can be used to help teachers manage their classes, understand their students' learning processes, and reflect on their own teaching methods. Evaluation is a systematic process of collecting, analysing, and interpreting evidence of students' progress and achievement in cognitive areas of learning for the purpose of taking a variety of decisions. Evaluation thus involves gathering and processing information and decision-making. This paper tries to extract useful information from graduate student data collected from College of Engineering Munnar. The system takes into account a number of factors by gathering data about students' semester marks, term test marks, attendance, and various other factors. A weighted modified ID3 algorithm is used to evaluate students' performance. Using this algorithm, it is possible to extract knowledge that describes students' performance at the end of the semester examination.

2. RELATED WORKS

In 2006, Qasem A. Al-Radaideh et al. [3] made an attempt to use data mining techniques to analyse and evaluate students' academic data for improving the quality of the higher educational system. The paper used data mining processes, particularly classification, to predict the final grade in a course under study by evaluating student data to find the main attributes that may greatly affect student performance in courses. The WEKA toolkit was used to measure the accuracy of the classification models. Three different classification methods were tested: ID3, C4.5, and Naïve Bayes, and the evaluation results were reported as the percentage of correctly classified instances for each algorithm. The obtained results show that the classification accuracy of the three algorithms is not very high, which may indicate that the collected samples and attributes are not sufficient to generate a classification model of better quality.
In 2013, Suchita Borkar et al. [4] presented the potential use of educational data mining with an association rule mining algorithm for enhancing the quality of education and predicting students' performance. The analysis revealed that students' university performance depends on unit test marks, assignments, attendance, and graduation percentage. The results show that students' performance in university results can be improved by identifying students who are weak in unit tests, attendance, assignments, or graduation percentage, and giving them additional guidance.
In 2012, Er. Rimmy Chuchra et al. [5] introduced the concept of extracting information from large volumes of data. The authors use marks obtained by students of Sri Sai University, Palampur in their postgraduate examinations, along with other factors, and introduce various techniques to improve postgraduate students' performance and identify students with low grades. The data cover a period of one and a half years. The authors use clustering, decision trees, and neural networks to evaluate students' performance; the approach also helps in identifying dropouts and students who need special assistance.
In 2011, Umesh Kumar Pandey and S. Pal [6] applied a data mining technique, Bayes classification, to a students' database to predict students' results on the basis of the previous year's data. An advantage of the naive Bayes classifier is that it requires only a small amount of training data to estimate the parameters necessary for classification. Because the variables are assumed independent, only the variances of the variables for each class need to be determined, not the entire covariance matrix. In spite of their naive design and apparently over-simplified assumptions, naive Bayes classifiers have worked quite well in many complex real-world situations. In this study, data were gathered from different degree colleges affiliated with Dr. R. M. L. Awadh University, Faizabad.
In 2013, Kalpesh Adhatrao et al. [7] developed a system which can predict the performance of students from their previous performance using classification-based data mining techniques. They analysed a data set containing information about students such as gender, marks scored in the board examinations of classes X and XII, marks and rank in entrance examinations, and first-year results of the previous batch of students. By applying the ID3 and C4.5 classification algorithms to this data, they predicted the general and individual performance of freshly admitted students in future examinations. For a total of 182 students, the average accuracy achieved in bulk and singular evaluations was approximately 75.275% for both ID3 and C4.5.
In 2013, Ramanathan L et al. [8] proposed a modified weighted decision tree algorithm that overcomes the shortcomings of the simple ID3 algorithm. One of the main drawbacks of ID3 is that it is biased towards attributes with more values. This can lead to a wrong selection and, as a result, the generated tree may not be efficient. To remove this bias of the traditional ID3 algorithm, an improved weighted ID3 (wID3) algorithm was proposed. In it, the attribute with the highest gain ratio is multiplied by a weight, giving it a new value, and among the new values the attribute with the highest gain ratio is selected as a node of the tree. Information gain is also replaced by gain ratio, which is more normalized. The data set for the study was collected from the CSE branch of VIT University and consists of 304 instances, each with 10 attributes. wID3's performance was compared with the J48 algorithm and the Naïve Bayes classification algorithm. From the performance analysis, the authors concluded that the wID3 algorithm is more efficient than the other two, correctly classifying 93% of the records.

3. DATA MINING PROCESS

In the present-day educational system, a student's performance is determined by internal assessment and the end semester examination. The internal assessment is carried out by the teacher based on the student's performance in educational activities such as class tests, seminars, assignments, attendance, and lab work. The end semester mark is the score obtained by the student in the semester examination. Each student has to obtain minimum marks in both the internal assessment and the end semester examination to pass a semester.
The major objective of using data mining techniques is to discover knowledge from the existing data available in the college database. Before applying data mining techniques to the data set, a methodology is needed to govern this work. Fig. 1 depicts the methodology used.
[Fig. 1 workflow: students' records → student data collection → data pre-processing (data cleaning, data selection, data transformation) → decision tree → result evaluation → knowledge representation]


Fig.1 Data Mining Work Methodology


3.1 Preparation of Data
The initial data set is prepared from data obtained from the first batch of students (2000 admission) of the Dept. of CSE, College of Engineering Munnar. The initial data set contains 56 records. In this step, data stored in different records, such as the attendance record and mark sheet record, are joined to form a single table. After the joining process, errors are removed. Table 1 shows a sample of the data set.
Table 1 Data Set
Sl no | PSM    | CTM     | ASS | ATT     | LW  | ESM
1     | first  | good    | yes | good    | yes | first
2     | first  | good    | yes | good    | yes | first
3     | first  | good    | yes | good    | yes | first
4     | first  | good    | yes | good    | no  | second
5     | third  | poor    | no  | poor    | yes | first
6     | second | average | yes | average | yes | first
7     | first  | good    | yes | good    | yes | first
8     | first  | good    | yes | average | no  | first
9     | first  | good    | no  | poor    | yes | first
10    | first  | good    | yes | good    | yes | first

3.2 Data Selection and Transformation
In this step, only those fields required for data mining are selected. Most of the attributes reflect the past performance of the students. The reasons for concentrating on past performance data are:
1. The data are easily available in the administrative department of the institute.
2. If a student has performed well in the past, it is likely that he or she will perform well in subsequent exams as well.
All the predictor and response variables derived from the database are given in Table 2.
Table 2 Student Related Variables
Variable | Description            | Possible Values
PSM      | Previous Semester Mark | Distinction >75%, First >60% and <75%, Second >45% and <60%, Third >36% and <45%, Fail <36%
CTM      | Class Test Mark        | Poor, average, good
ASS      | Assignment             | Yes, No
ATT      | Attendance             | Poor, average, good
LW       | Lab Work               | Yes, No
ESM      | End Semester Mark      | Distinction >75%, First >60% and <75%, Second >45% and <60%, Third >36% and <45%, Fail <36%

PSM - Previous Semester Marks. Split into five class values: Distinction >75%, First >60% and <75%, Second >45% and <60%, Third >36% and <45%, Fail <36%.
CTM - Class Test Mark obtained. In each semester two class tests are conducted, and the average of the two is used to calculate sessional marks. CTM is split into three classes: Poor <40%, Average >40% and <60%, Good >60%.
ASS - Assignment performance. In each semester two assignments are given to students. Assignment performance is divided into two classes: Yes - student submitted the assignments, No - student did not submit the assignments.
ATT - Attendance of the student. A minimum of 70% attendance is compulsory for appearing in the End Semester Examination, though in some cases students with low attendance are also permitted to appear on genuine grounds. Attendance is divided into three classes: Poor <60%, Average >60% and <80%, Good >80%.
LW - Lab Work. Divided into two classes: Yes - student completed the lab work, No - student did not complete the lab work.
ESM - End Semester Marks obtained in the B.Tech semester examination; this is the response variable. Split into five class values: Distinction >75%, First >60% and <75%, Second >45% and <60%, Third >36% and <45%, Fail <36%.
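The discretization described above amounts to simple threshold rules. A minimal sketch in Python (the paper's implementation used Java/NetBeans; the function names below are illustrative, not from the paper):

```python
def grade_class(pct):
    """Map a percentage mark to the PSM/ESM class labels of Table 2."""
    if pct > 75:
        return "distinction"
    if pct > 60:
        return "first"
    if pct > 45:
        return "second"
    if pct > 36:
        return "third"
    return "fail"

def ctm_class(pct):
    """Class Test Mark: poor < 40%, average 40-60%, good > 60%."""
    if pct < 40:
        return "poor"
    if pct <= 60:
        return "average"
    return "good"

def att_class(pct):
    """Attendance: poor < 60%, average 60-80%, good > 80%."""
    if pct < 60:
        return "poor"
    if pct <= 80:
        return "average"
    return "good"
```

For example, a student with 62% in the previous semester, 55% in class tests, and 85% attendance would be encoded as ("first", "average", "good").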
3.3 Decision Tree
A decision tree is a tree in which each leaf node represents a decision and each branch node represents a choice between a number of alternatives. Decision trees are commonly used for gathering information for the purpose of decision-making. A decision tree starts with a root node on which users take action. From this node, the tree splits each node recursively according to the decision tree learning algorithm. The final result is a decision tree in which each branch represents a possible scenario of a decision and its outcome.
Decision trees classify examples by sorting them based on their attribute values. The attributes here are Previous Semester Marks, Class Test Mark, Assignment, Attendance, Lab Work, and End Semester Marks. Each node in the decision tree represents one of these attributes, and each branch represents a value that the node can take. Decision trees are constructed using a top-down greedy search algorithm which recursively subdivides the training examples based on the attribute that best classifies the training data. The attribute that best divides the training examples becomes the root node of the tree. The algorithm is then repeated on each partition of the divided data, creating subtrees



until the training data is divided into subsets of the same class. At each level of the partitioning process, a statistical property known as information gain is used to determine which attribute best divides the training data.
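The entropy and information gain computation used in this partitioning step can be sketched as follows; a minimal Python illustration over attribute-value records (function names are illustrative, not from the paper):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels: -sum_j p_j * log2(p_j)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Gain(S, A): entropy reduction from partitioning rows by attribute attr."""
    n = len(rows)
    base = entropy(labels)
    remainder = 0.0
    for value in set(r[attr] for r in rows):
        # Labels of the subset of rows taking this attribute value.
        subset = [lbl for r, lbl in zip(rows, labels) if r[attr] == value]
        remainder += len(subset) / n * entropy(subset)
    return base - remainder
```

The attribute maximizing `information_gain` over the training set becomes the split at the current node.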
Strengths of Decision Tree Methods:
Ability to Generate Understandable Rules: The ability of decision trees to generate rules that can be translated into comprehensible English is the greatest strength of this technique.
Ease of Calculation at Classification Time: Although a decision tree can take many forms, in practice the algorithms used to produce decision trees generally yield trees with a low branching factor and simple tests performed at each node. Typical tests include numeric comparisons, set membership, and simple conjunctions. When implemented on a computer, these tests translate into simple Boolean and integer operations that are computationally inexpensive and fast.
Ability to Handle Both Continuous and Categorical Variables: Decision-tree methods are equally adept at handling continuous and categorical variables. Categorical variables, which pose problems for neural networks and statistical techniques, come ready-made with their own splitting criteria: one branch for each category. Continuous variables are easy to split by picking a number somewhere in their range of values.
Ability to Clearly Indicate Best Fields: Decision-tree building algorithms put the field that does the best job of splitting the training records at the root node of the tree.
3.4 Weighted Modified ID3 Algorithm
The ID3 algorithm is one of the best-known algorithms for generating decision trees. However, it has a major shortcoming: it is biased towards attributes with many values. This shortcoming can be overcome by using gain ratio (instead of information gain) and by assigning weights to attributes at every decision-making point. To remove the bias of the traditional ID3 algorithm towards attributes with many values, an improved weighted ID3 (wID3) algorithm is proposed. In it, the attribute with the highest gain ratio (not information gain) is multiplied by a weight, giving it a new value, and among the new values the attribute with the highest gain ratio is selected as a node of the tree. Replacing information gain with the more normalized gain ratio, together with the weighting, overcomes the bias problem.

GainRatio(A) = Gain(S, A) / Entropy(S)          (1)

Entropy(S) = -Σ_j p_j log p_j          (2)

To calculate the weights, the relevance between condition attributes and decision attributes is checked using a correlation function. The value between a condition attribute A and the decision attribute D is given by:

AF_D(A) = Σ_{i=1}^{v} (|A_i1| - |A_i2|) / v          (3)

where v is the number of different values of A. The modified weight of condition attribute A is then:

w'_A = AF_D(A) / Σ_{j=1}^{n} AF_D(j)          (4)

if the gain ratio of A is the maximum; otherwise w'_A = 1.


The modified gain ratio of condition attribute A is then:

GainRatio'(A) = w'_A × GainRatio(A)          (5)

Note that only the weight of the attribute with the maximum gain ratio is modified; the rest keep the same gain ratio as before. In this way, the gain ratio of attributes satisfying the condition is reduced, which overcomes the bias problem.
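Equations (1)-(5) can be sketched in Python as follows. This is an illustrative reading, not the paper's implementation: in particular, eq. (3) is assumed here to take |A_i1| and |A_i2| as the counts of records with the i-th value of A falling into the first and second decision class, and the lost operator between them is assumed to be subtraction.

```python
import math
from collections import Counter

def entropy(labels):
    """-sum_j p_j * log2(p_j), eq. (2)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    n, base = len(rows), entropy(labels)
    rem = 0.0
    for v in set(r[attr] for r in rows):
        sub = [l for r, l in zip(rows, labels) if r[attr] == v]
        rem += len(sub) / n * entropy(sub)
    return base - rem

def af(rows, labels, attr):
    """Association function AF_D(A), eq. (3), under the assumptions above."""
    c1, c2 = sorted(set(labels))[:2]          # first two decision classes
    values = set(r[attr] for r in rows)
    total = sum(
        abs(sum(1 for r, l in zip(rows, labels) if r[attr] == v and l == c1)
            - sum(1 for r, l in zip(rows, labels) if r[attr] == v and l == c2))
        for v in values)
    return total / len(values)

def weighted_gain_ratios(rows, labels, attrs):
    """Eqs (1), (4), (5): gain ratio per attribute; only the attribute with the
    maximum gain ratio is rescaled by w'_A = AF_D(A) / sum_j AF_D(j)."""
    base = entropy(labels) or 1.0
    gr = {a: information_gain(rows, labels, a) / base for a in attrs}
    best = max(gr, key=gr.get)
    total_af = sum(af(rows, labels, a) for a in attrs) or 1.0
    gr[best] *= af(rows, labels, best) / total_af
    return gr
```

The attribute maximizing the returned weighted gain ratio becomes the node's test attribute.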
wID3 Algorithm:
Begin
  Create a node N;
  If (all samples are in the same class)
    Return N as a leaf labeled with the class name;
  If (attribute list is empty)
    Return N as a leaf labeled with the most common class;
  Calculate the weight of each attribute;
  Select the test attribute, i.e. the attribute with the highest gain ratio;
  Label node N with the test attribute;
  For each known value a of the test attribute, grow a branch from node N for the condition test attribute = a;
    Let S_i be the set of samples for which test attribute = a_i;
    If (S_i is empty)
      Attach a leaf labeled with the most common class in the samples;
    Else
      Attach the node returned by generate_decision_tree(S_i, attribute_list - test_attribute);
End
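A runnable sketch of this pseudocode in Python (the paper's implementation was in Java; names here are illustrative). The per-attribute weighting is abstracted into a caller-supplied `weight_fn`, defaulting to 1.0 for every attribute as in plain gain ratio:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def build_wid3(rows, labels, attrs, weight_fn=None):
    """Recursive tree builder following the wID3 pseudocode. Returns either a
    leaf label (str) or a dict mapping (attribute, value) pairs to subtrees."""
    if len(set(labels)) == 1:                 # all samples in one class -> leaf
        return labels[0]
    if not attrs:                             # attribute list empty -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    base = entropy(labels)                    # > 0 here, since labels are mixed
    def score(a):
        # Weighted gain ratio of attribute a, per eqs (1) and (5).
        n = len(rows)
        rem = sum(len(sub) / n * entropy(sub)
                  for v in set(r[a] for r in rows)
                  for sub in [[l for r, l in zip(rows, labels) if r[a] == v]])
        w = weight_fn(a) if weight_fn else 1.0
        return w * (base - rem) / base
    test = max(attrs, key=score)              # test attribute = highest score
    node = {}
    # Branches are grown only for values actually present, so S_i is non-empty.
    for v in set(r[test] for r in rows):
        sub_rows = [r for r in rows if r[test] == v]
        sub_lbls = [l for r, l in zip(rows, labels) if r[test] == v]
        node[(test, v)] = build_wid3(sub_rows, sub_lbls,
                                     [a for a in attrs if a != test], weight_fn)
    return node
```

On a toy data set where attendance alone determines the result, the builder produces a one-level tree splitting on that attribute.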
3.5 Knowledge from Decision Tree Classifier
Once a decision tree has been constructed, it is a simple matter to convert it into an equivalent set of rules. To generate rules, trace each path in the decision tree from root node to leaf node, recording the test outcomes as antecedents and the leaf-node classification as the consequent. The if-then rules extracted from the decision tree can serve as a knowledge base for further classification. Converting a decision tree to rules has three main advantages:



Converting to rules allows distinguishing among the different contexts in which a decision tree node is used, since each path through the node produces a distinct rule.
Converting to rules removes the distinction between attribute tests that occur near the root and those that occur near the leaves of the tree.
Converting to rules improves readability; rules are often easier for people to understand.
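The path-tracing procedure described above can be sketched as a short recursive generator; a minimal illustration, assuming the nested-dict tree representation (leaf = class label, internal node = dict keyed by (attribute, value)):

```python
def extract_rules(tree, antecedents=()):
    """Trace every root-to-leaf path of a nested-dict decision tree, yielding
    (antecedents, consequent) if-then rules."""
    if not isinstance(tree, dict):            # leaf: the classification
        yield list(antecedents), tree
        return
    for (attr, value), subtree in tree.items():
        # Extend the antecedent list with this branch's test outcome.
        yield from extract_rules(subtree, antecedents + ((attr, value),))
```

For example, a tree that checks CTM only when attendance is poor yields three rules, one per leaf.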
The proposed improved ID3 algorithm differs from the original ID3 in the following ways:
- An extra association function is used to overcome the shortcomings of ID3.
- More reasonable and effective rules are generated.
- Missing values can be handled without impacting the accuracy of decisions.
- More accurate rules are generated.
- The time complexity of the improved ID3 is higher, but this is acceptable given the speed of modern computers.
- More rules are generated, increasing the accuracy of decisions.
- The root node is decided not only by the information gain of an attribute but also by an additional function, the association function (AF).
3.6 Design of Prediction System
The prediction system contains a knowledge base of accumulated experience and a set of rules for applying that knowledge base to each particular situation. The knowledge base incorporates Previous Semester Marks, Class Test Mark, Assignment, Attendance, Lab Work, and their relationships with the performance obtained in the End Semester examination. The architecture of the prediction system is shown in Fig. 2.
[Fig. 2 components: knowledge base, working memory, inference mechanism, end user]

Fig. 2 Architecture of the prediction system


Knowledge Base: Contains the domain knowledge, represented declaratively, often as IF-THEN rules. These rules form the knowledge base of the system.
Working Memory: The user enters information about the current problem into the working memory. The system matches this information against the knowledge contained in the knowledge base to infer new facts, enters these new facts into the working memory, and the matching process continues. Eventually the system reaches a conclusion, which it also enters into the working memory.
Inference Mechanism: Works with the information in the knowledge base and the working memory. It searches the rules for a match between their premises and the information contained in the working memory; when it finds a match, it produces the output.
End User: The individual who consults the system to obtain the advice provided by the prediction system.
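The match-and-fire cycle between working memory and rules described above is essentially forward chaining. A minimal sketch, assuming rules are (antecedent-list, consequent) pairs and facts are hashable values (the function name is illustrative, not from the paper):

```python
def infer(rules, working_memory):
    """Forward-chaining sketch: repeatedly fire any rule whose antecedents are
    all present in working memory, adding its consequent as a new fact, until
    no rule can fire. Returns the final set of facts."""
    facts = set(working_memory)
    fired = True
    while fired:
        fired = False
        for antecedents, consequent in rules:
            if consequent not in facts and all(a in facts for a in antecedents):
                facts.add(consequent)        # new fact enters working memory
                fired = True
    return facts
```

In the prediction system, facts would be (attribute, value) pairs such as ("ATT", "good"), and a fired rule's consequent would be the predicted ESM class.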

4. RESULT
A dataset of 56 students from the B.Tech course was obtained from the CSE department of College of Engineering, Munnar. Details such as semester marks, class test marks, assignment marks, attendance, and lab work were collected from the college database. The weighted modified ID3 algorithm was implemented using this collected information, with the help of the following tools: JDK (Java Development Kit) and NetBeans IDE (Integrated Development Environment) 8.0. Table 3 shows the classification accuracy of the ID3, C4.5, CART, and modified ID3 algorithms.
Table 3 Classifiers Accuracy
ALGORITHM    | PREDICTION ACCURACY
C4.5         | 45.8333%
ID3          | 52.0833%
CART         | 56.25%
MODIFIED ID3 | 76%

5. CONCLUSION
The educational system is the backbone of the progress and development of any society. The greater the ability of the education system to improve the performance of its students, the better the chance of the society to produce successful citizens. Keeping this fact in mind, it is necessary to constantly work towards a more sophisticated education system. Data mining is a powerful concept that extracts hidden information from voluminous databases, and it can provide many solutions towards building a stronger education system. This study shows that students' past academic performance can be used to build a decision-tree model for predicting students' performance in later examinations. From Table 3, it is clear that the accuracy of the proposed prediction system is higher than that of the other methods.



This study is a stepping stone towards the integration of technology and the education system.

REFERENCES
[1] Elena Susena, "Using Data Mining Techniques in Higher Education," National Defence University "Carol I", Bucharest, pp. 68-72.
[2] A. Dinesh Kumar and V. Radhika, "A Survey on Predicting Student Performance," International Journal of Computer Science and Information Technology, Vol. 5 (5), 2014.
[3] Qasem A. Al-Radaideh, Emad M. Al-Shawakfa, and Mustafa I. Al-Najjar, "Mining Student Data Using Decision Trees," The 2006 International Arab Conference on Information Technology (ACIT 2006), Jordan, Nov. 2006.
[4] Suchita Borkar and K. Rajeswari, "Predicting Students' Academic Performance Using Education Data Mining," International Journal of Computer Science and Mobile Computing (IJCSMC), Vol. 2, Issue 7, July 2013, pp. 273-279.
[5] Er. Rimmy Chuchra, "Use of Data Mining Techniques for the Evaluation of Student Performance: A Case Study," International Journal of Computer Science and Management Research, Vol. 1, Issue 3, October 2012.
[6] Umesh Kumar Pandey and S. Pal, "Data Mining: A Prediction of Performer or Underperformer Using Classification," International Journal of Computer Science and Information Technologies, Vol. 2 (2), 2011, pp. 686-690.
[7] Kalpesh Adhatrao, Aditya Gaykar, Amiraj Dhawan, Rohit Jha, and Vipul Honrao, "Predicting Students' Performance Using ID3 and C4.5 Classification Algorithms," International Journal of Data Mining & Knowledge Management Process (IJDKP), Vol. 3, No. 5, September 2013.
[8] Ramanathan L, Saksham Dhanda, and Suresh Kumar D, "Predicting Students' Performance Using Modified ID3 Algorithm," International Journal of Engineering and Technology (IJET).

