
International Conference on Computing, Communication and Automation (ICCCA2017)

Student Academic Performance and Social Behavior Predictor using Data Mining Techniques

Suhas S Athani
Department of Information Science and Engg
BVBCET
Hubli, India
suhas2012athani@gmail.com

Sharath A Kodli
Department of Information Science and Engg
BVBCET
Hubli, India
sharath.akodli4@gmail.com

Mayur N Banavasi
Department of Information Science and Engg
BVBCET
Hubli, India
banavasimayur@gmail.com

P. G. Sunitha Hiremath
Department of Information Science and Engg
BVBCET
Hubli, India
pgshiremath@bvb.edu

Abstract— Education can be utilized as a tool to face many problems and overcome many hurdles in life. The knowledge obtained from education helps to enhance opportunities in one's employment and development. Educational Data Mining is widely used to extract useful information from this knowledge. Educational data mining applies data mining tools and techniques to analyze and visualize the data of an institution (school) and can be used to discover unique patterns in students' academic performance and behavior. The present work intends to enhance students' academic performance in secondary school using data mining techniques. Real data collected through school reports and questionnaires from Portuguese schools has been used in this paper. The Naïve Bayesian algorithm can be easily implemented to predict students' academic performance and behavior. Classification of students into two classes, pass and fail, involves a training phase and a testing phase. In the training phase, the Naïve Bayes classifier is built, and in the testing phase the classifier is used to make the prediction. The accuracy of the classifier is calculated using the WEKA tool, in which a confusion matrix is generated. The accuracy obtained is 87%, which can be further improved by the selection of appropriate attributes. Developing classification algorithms in this way helps to obtain a more efficient student performance predictor tool using other data mining algorithms, and it also helps to improve the quality of education in an educational institution.

Keywords—Educational Data Mining, Naïve Bayes Classifier, WEKA

I. INTRODUCTION

Education plays an important role in the development of a country and is a key factor for achieving long-term economic progress. Literacy statistics show the percentage of failure rates due to students dropping out of school or failing in a particular subject. For example, in the secondary examination held in Portugal in the year 2010, a total of 60% of students passed and the rest failed. There has been an increase in the literacy rate throughout the world over the past years. In particular, failure in core classes such as mathematics is very serious, since they provide fundamental knowledge for success in the remaining school subjects. An educational institution needs to maintain the records of all its students, which results in large databases. Useful information, such as how many students give equal importance to all subjects, what type of courses can be used to attract students, whether it is possible to predict students' performance, and which factors affect students' performance, can be extracted from the collected records. To mine such information, an interest in business intelligence and data mining techniques arose as information technology led to exponential growth of databases. The databases contain valuable information, such as trends and patterns, that can improve decision making and optimize the success rate. Data mining is used to extract relevant information from large databases to gain knowledge [1]. There are different data mining techniques for analysis, classification and prediction to improve student performance [2]. Classification is a supervised learning task which is used to classify a data-set based on predefined class labels. There are different data mining techniques that can be used to build a classification model (SVM, Naïve Bayes, Decision Tree, etc.) [3]. Such techniques can be used to analyze the raw data with automated tools and extract knowledge for the decision maker.

II. LITERATURE REVIEW

In several studies, an association-rule-based data mining approach has addressed input variables such as sex, age and performance over past years, and the proposed system has outperformed the traditional allocation procedure. Many approaches have been used, such as neural networks and decision trees (94% combined accuracy) and binary classification (72% accuracy) [4]. The best result was obtained by Naïve Bayes classification. The authors adopted a regression approach to predict math skills based on the scores obtained by individuals. Most of the students join public schools for free education. There are some core courses which share a common language, as in other countries.


The grading scale goes up to 20, where 0 is the lowest score and 20 is the perfect score. During the school year, students are evaluated in three periods, the last of which is the final evaluation.
In [12], Naïve Bayes classification is used to build a model in which a probability distribution function is computed to handle continuous data. In order to increase the accuracy of the model, optimal equal width binning for discretization is introduced. Furthermore, the classes are balanced to increase the accuracy of the model further.
In [13], two classifiers, namely Naïve Bayes and J48, are applied to data from the UCI Machine Learning Repository. Analysis of these algorithms is performed using the WEKA tool, and the accuracy of the models is increased by discretization of continuous features.
In [14], many classification algorithms are used, one among them being the NB classifier. The students are classified into four labelled classes: A, B, C and F. The entire data-set is used to build the classifier, and the Bootstrap method is then applied to enhance the accuracy of each classifier.
In [15], a Bayesian network is used to classify the students based on the marks scored by them. The model is built using a training data-set, and a test data-set is used to compare the relative performance. 10-fold cross-validation is used for model evaluation.
In [16], a technique is used to predict the students' performance by combining three classification algorithms, Naïve Bayes, 1-NN and WINNOW, using a voting methodology.

III. MOTIVATION

This study aims, firstly, to implement an automated system which requires only the data-set of students and then classifies the students automatically into two classes, pass and fail, reducing the human work. Secondly, building classification algorithms in educational environments helps to identify the students who need special tutoring or counseling from the school. The higher authorities of an institution can use such classification models to improve students' performance according to the data-set. The proposed system predicts students' academic performance and the factors which affect performance failure.

Building such classifiers helps an educational institution to get a picture of its educational level, compare its progress with other educational institutions and, finally, guide students towards a better future.

IV. DATA DESCRIPTION

The data-set considered consists of 395 tuples and 34 attributes [1]. Each tuple represents the attribute values of a student; that is, it provides the details of the student in terms of academic performance and social behavior.

Fig. 1 The details of a student which form the data-set

V. PROPOSED SYSTEM

Fig. 2 Steps involved in the proposed system

The data-set considered contains categorical data for a certain number of attributes. As the pre-processing step, all the categorical information is converted into binary data, with "yes" becoming "1" and "no" becoming "0". After the pre-processing step, the data-set consists of a total of 35 attributes for each student and only numerical data is present. In the next step, the classifier is built using the Naïve Bayes algorithm to classify the student as pass or fail [3]. The Naïve Bayes classifier is applied to the preprocessed training data-set. This data-set also contains class labels, with the pass class labelled as 1 and the other class, fail, labelled as 0.
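As a minimal sketch of this pre-processing step (not the authors' actual implementation), the yes/no attributes could be mapped to 1/0 and a binary pass/fail label derived from the final mathematics grade. The file name, column names and the pass threshold of 10 below are illustrative assumptions, not taken from the paper.

import pandas as pd

# Hypothetical file and column names: a CSV export of the student data-set
# with yes/no attributes and a final mathematics grade column "G3" (0-20).
df = pd.read_csv("student-mat.csv", sep=";")

# Pre-processing: convert categorical yes/no attributes to binary 1/0.
yes_no_cols = [c for c in df.columns if set(df[c].unique()) <= {"yes", "no"}]
for col in yes_no_cols:
    df[col] = (df[col] == "yes").astype(int)

# Class label: pass = 1, fail = 0 (a pass threshold of 10 is assumed here).
df["passed"] = (df["G3"] >= 10).astype(int)

print(df.shape)                        # number of students and attributes
print(df["passed"].value_counts())     # distribution of pass/fail labels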


All these steps come under the training phase. The testing data-set is given as an input to the built classifier for the prediction. The Naïve Bayes algorithm first classifies the students into two classes, pass and fail. Naïve Bayes classification is discussed in Section VI. This classification is performed on the training data-set. Training is done to predict whether the student will pass or fail and also to predict the other attributes of a student which describe the student's social behaviour. Later, by using this model, a prediction is made as to whether a student will pass or fail, and a prediction is also made for any attribute given the other remaining attributes (predicting social behavior). The prediction made by the classifier is discussed in Section VII. Furthermore, the classifier accuracy is calculated using the confusion matrix.
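The training and testing phases described above could be sketched as follows using scikit-learn's GaussianNB rather than the WEKA tool used in the paper; the data frame and "passed" label come from the hypothetical pre-processing sketch above, and the 20-student hold-out mirrors the size of the paper's testing data-set.

from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumes df from the pre-processing sketch above: numeric attributes plus
# a binary "passed" label (1 = pass, 0 = fail).
X = df.drop(columns=["passed"]).select_dtypes("number")
y = df["passed"]

# Training phase: hold out 20 of the students for testing and build the
# Naive Bayes classifier on the remaining training data-set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=20, random_state=0)
model = GaussianNB().fit(X_train, y_train)

# Testing phase: the built classifier predicts pass (1) or fail (0).
print(model.predict(X_test))
print("hold-out accuracy:", model.score(X_test, y_test))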

VI. NAIVE BAYES CLASSIFICATION

Classification is a process which involves separation of classes based on extracted features. After the classification, the classes formed are distinct from each other; different classes have different features. The patterns found in the training data-set play an important role in building the classifier. There are many classification algorithms, such as k-nearest neighbors, decision tree learning, support vector machines, naïve Bayes and neural networks, which can be used according to the requirements of an application. In the proposed system, the Naïve Bayes classifier is used. Naïve Bayes classifiers are statistical classifiers: given a tuple, the classifier can predict to which particular class the tuple belongs. A Naïve Bayesian classifier is based on Bayes' theorem [6] [8].

Bayes' theorem provides a way of calculating the posterior probability, P(c|x), from P(c), P(x) and P(x|c). The Naïve Bayes classifier assumes that the effect of the value of a predictor (x) on a given class (c) is independent of the values of the other predictors. This assumption is called class conditional independence. Bayes' theorem is given by

P(c|x) = P(x|c) P(c) / P(x)    (1)

P(c|x) = P(x1|c) × P(x2|c) × ... × P(xn|c) × P(c)

where P(c|x) is the posterior probability of the class for a given attribute, P(c) is the prior probability of the class, P(x|c) is the likelihood, i.e. the probability of the predictor given the class, and P(x) is the prior probability of the predictor.

In order to calculate the posterior probability, a table called the frequency table is constructed for each feature (attribute) against a particular class. The frequency tables obtained are then converted into probability tables, and the Naïve Bayesian formula (Eq. 1) is used to calculate the posterior probability for each class. The outcome of this computation is the class with the highest posterior probability: a student is added to the class which has the highest posterior probability, and the same criterion is followed to classify all the students. Laplacian correction (Laplacian smoothing) is carried out to avoid computing probability values of zero [7]. In the pre-processing step, categorical information is converted into binary form, and normal distributions are assumed for the attributes. Two parameters, the mean and the standard deviation, are used to find the probability density function of the normal distribution. The equations to calculate the mean, standard deviation and normal distribution are as follows:

Mean: μ = (1/n) Σ xi    (2)

where xi is an attribute value.

Standard deviation: σ = [ (1/n) Σ (xi − μ)² ]^0.5    (3)

Normal distribution:

f(x) = (1 / (σ √(2π))) e^( −(x − μ)² / (2σ²) )    (4)

where x is an attribute value, μ is the mean and σ is the standard deviation.

All these steps are performed before the testing phase. In the testing phase, a separate data-set is given as an input. Using Eq. 1, the probability density is calculated. The classifier assigns a student to the pass class (class label 1 assigned during pre-processing) or the fail class (class label 0 assigned during pre-processing) depending on the value of the posterior probability; the student is assigned to the class with the higher probability density. This part classifies and predicts the student into two classes, pass or fail, only. These classifiers also exhibit high speed when applied to large databases, and the accuracy of this classifier is found to be nearly comparable with decision trees and neural networks.
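Eqs. (2)-(4) above can be computed directly with a few small helper functions; the following is only an illustrative sketch of the formulas, not the paper's implementation.

import math

def mean(values):
    # Eq. (2): arithmetic mean of an attribute's values.
    return sum(values) / len(values)

def stdev(values):
    # Eq. (3): standard deviation of the values around their mean.
    mu = mean(values)
    return (sum((x - mu) ** 2 for x in values) / len(values)) ** 0.5

def normal_pdf(x, mu, sigma):
    # Eq. (4): normal probability density of attribute value x.
    exponent = math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
    return exponent / (sigma * math.sqrt(2 * math.pi))

# Example with made-up numbers: density of the value 14 for a class whose
# attribute values have mean 12 and standard deviation 3.
print(normal_pdf(14, 12, 3))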
VII. PREDICTION OF CLASSIFIER FOR STUDENT BEHAVIOR

Prediction of student behavior is made by the classifier on the test data-set once the model has been built in the training phase [10] [11]. For this prediction, 34 attributes are given as an input and a prediction is made for any one remaining attribute.

The accuracy for the two different types of prediction is calculated separately. The steps involved are:
1) Load the data-set.
2) Calculate the mean and standard deviation: The mean is the central tendency of the data, and it is used as the middle of the normal distribution when calculating the probabilities. The standard deviation is calculated for each attribute for a class value; it describes the variation or spread of the data and is used to characterize the expected spread of each attribute in the normal distribution when calculating probabilities.


3) Separate the data: In the proposed system, the data-set is separated by class values.
4) Summarize the data: The naïve Bayes model contains a summary of the data in the training data-set. This summary is used to make the predictions when the test data-set is given as an input. The summary obtained from the classifier consists of the mean and standard deviation values for each attribute. These values are used to calculate the probability of an attribute belonging to a particular class.
5) Make predictions: In this phase, predictions are made based on the summaries obtained from the training data. Predictions are made by calculating the normal probability density function, which uses the mean and standard deviation of the attribute from the training data (a short sketch of these steps follows).
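A compact sketch of steps 3 to 5 is given below; it reuses the mean(), stdev() and normal_pdf() helpers from the sketch in Section VI, and the toy attribute values and helper names are illustrative assumptions rather than part of the paper.

from collections import defaultdict
# Reuses mean(), stdev() and normal_pdf() from the Section VI sketch.

def separate_by_class(rows, labels):
    # Step 3: group attribute vectors by class value (0 = fail, 1 = pass).
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return groups

def summarize(rows):
    # Step 4: per-attribute (mean, standard deviation) summary for one class.
    return [(mean(col), stdev(col)) for col in zip(*rows)]

def predict(summaries, row):
    # Step 5: choose the class with the largest product of normal densities.
    best_label, best_score = None, -1.0
    for label, stats in summaries.items():
        score = 1.0
        for (mu, sigma), x in zip(stats, row):
            score *= normal_pdf(x, mu, sigma)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Steps 1-2 correspond to loading the data and the helpers above; toy data:
rows = [[12, 3], [14, 2], [6, 5], [5, 6]]    # made-up attribute values
labels = [1, 1, 0, 0]                        # 1 = pass, 0 = fail
groups = separate_by_class(rows, labels)                    # Step 3
summaries = {c: summarize(r) for c, r in groups.items()}    # Step 4
print(predict(summaries, [13, 3]))                          # Step 5 -> 1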
VIII. RESULTS AND DISCUSSION

The training data-set contains information on 395 students. Each student has 35 values or attributes. The marks of the students pertain only to the mathematics subject. The testing data-set comprises 20 students, for whom a prediction is made as to whether they will pass or fail, along with the prediction of any other attribute. The accuracy of the model is calculated from the confusion matrix using the WEKA tool.

Fig. 3 Results of Naïve Bayes Classifier using WEKA Tool

From Figure 3, out of 395 tuples, 344 tuples are correctly classified and 51 tuples are incorrectly classified. From the confusion matrix in Figure 3, 16 students are correctly classified as failed, 50 students who failed are incorrectly marked as passed, 1 student who passed is incorrectly labeled as failed, and 328 students are correctly classified as passed. The accuracy of the classifier model is found to be 87%. The accuracy of the classifier can still be increased by selecting appropriate attributes from the entire data-set, which forms part of the data pre-processing.
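The 87% figure follows directly from the confusion-matrix counts reported above, as the short check below shows (variable names are illustrative).

# Confusion-matrix counts reported in Fig. 3.
correct_fail, fail_as_pass = 16, 50
pass_as_fail, correct_pass = 1, 328

total = correct_fail + fail_as_pass + pass_as_fail + correct_pass   # 395
accuracy = (correct_fail + correct_pass) / total                    # 344 / 395
print(f"accuracy = {accuracy:.2%}")                                 # ~87.09%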

Fig. 4 Collecting input from the user for prediction

Figure 4 represents part of the testing phase, where the user is asked to provide details of the student and the classifier makes the prediction accordingly.

Fig. 5 Result of Naïve Bayes predictor

Figure 5 shows the result produced by the predictor when family size is selected by the user to be predicted. The attribute famsize has two values, GT3 and LE3. The attribute value GT3 is predicted by the classifier when the other attribute values are given by the user.

IX. CONCLUSION

Education is very important for today's generation, and methods to analyze the education system in schools and to make predictions for the advancement of the institution are very essential. The proposed automated system emphasizes making predictions of students' academic performance and social behavior. The accuracy of the model is also calculated. Further scope is to build a classifier using a Support Vector Machine (SVM) and analyze which classifier is more appropriate in carrying out the classification.

X. FUTURE WORK

In the proposed system, the Naïve Bayes algorithm is used for classification and prediction. Likewise, many such algorithms can be used for classification and prediction of student behavior and academic performance. The data-set considered has a small volume, so the future work will be to use a larger data-set and compare the results using other classification algorithms.


REFERENCES

[1] Mrinal Pandey and Vivek Kumar Sharma, "A Decision Tree Algorithm Pertaining to the Student Performance Analysis and Prediction", International Journal of Computer Applications (0975 – 8887), Volume 61, No. 13, January 2013.
[2] Shaeela Ayesha et al., "Data Mining Model for Higher Education System", European Journal of Scientific Research, Vol. 43, No. 1, pp. 24–29, 2010.
[3] Bhardwaj, "Data Mining: A prediction for performance improvement using classification", International Journal of Computer Science and Information Security, Volume 9(4), 2011.
[4] Ahmed, A. B. E. and Ibrahim S. E., "Data Mining: A prediction for Student's Performance Using Classification Method", World Journal of Computer Application and Technology, Volume 2(2), 2014.
[5] P. Cortez and A. Silva, "Using Data Mining to Predict Secondary School Student Performance", in A. Brito and J. Teixeira (Eds.), Proceedings of the 5th Future Business Technology Conference.
[6] Meghna Khatri, "A Survey of Naïve Bayesian Algorithms for Similarity in Recommendation Systems", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 2, Issue 5, May 2012.
[7] P. Domingos and M. Pazzani, "On the optimality of the simple Bayesian classifier under zero-one loss", Machine Learning, 29:103–130, 1997.
[8] Bekele, R. and Menzel, W., "A Bayesian approach to predict performance of a student (BAPPS): A case with Ethiopian students", Journal of Information Science, 2013.
[9] Sonali Agarwal, G. N. Pandey and M. D. Tiwari, "Data Mining in Education: Data Classification and Decision Tree Approach", 2012.
[10] E. Chandra and K. Nandini, "Predicting Student Performance using Classification Techniques", Proceedings of SPIT-IEEE Colloquium and International Conference, Mumbai, India, pp. 83–87.
[11] S. Huang and N. Fang, "Work in Progress - Prediction of Students' Academic Performance in an Introductory Engineering Course", in 41st ASEE/IEEE Frontiers in Education Conference, pp. 11–13, 2011.
[12] Syed Tanveer Jishan, Raisul Islam Rashu, Naheena Haque and Rashedur M. Rahman, "Improving accuracy of students' final grade prediction model using optimal equal width binning and synthetic minority over-sampling technique", SpringerOpen Journal, 2015.
[13] Kayah, F., "Discretizing Continuous Features for Naive Bayes and C4.5 Classifiers", University of Maryland publications, College Park, MD, USA.
[14] S. Taruna and Mrinal Pandey, "An empirical analysis of classification techniques for predicting academic performance", IEEE Advance Computing Conference, 2004.
[15] Varsha Namdeo, Anju Singh, Divakar Singh and R. C. Jain, "Result Analysis Using Classification Techniques", International Journal of Computer Applications (0975 – 8887), Volume 1, No. 22, 2010.
[16] S. Kotsiantis, K. Patriarcheas and M. Xenos, "A combinational incremental ensemble of classifiers as a technique for predicting students' performance in distance education", Knowledge-Based Systems, 23, Elsevier, pp. 529–535, 2010.

