INTERNATIONAL JOURNAL FOR RESEARCH IN EMERGING SCIENCE AND TECHNOLOGY, VOLUME-2, ISSUE-7, JULY-2015
E-ISSN: 2349-7610
A Detailed Analysis of Educational Data Mining
Ravi Tiwari¹ and Awadhesh Kumar Sharma²
¹ Department of Computer Science and Engineering, M.M.M.U.T, Gorakhpur, India (tiwari.ravi.cs@gmail.com)
² Department of Computer Science and Engineering, M.M.M.U.T, Gorakhpur, India (akscse@rediffmail.com)
ABSTRACT
Educational Data Mining is a discipline that applies data mining tools and techniques to explore and analyse
educational data. It supports the development of models that facilitate effective decision making based on the
analysis of historical educational data. It provides effective solutions to various problems faced by the education
sector, and it gives direction to the efforts of the various entities involved in that sector, which leads to better
utilization of those efforts. This paper reviews and describes the data mining techniques and tools used in
educational data mining.
Keywords — Educational Data Mining, Learning Analytics, Institutional Effectiveness, Educational Mining Methods,
Educational Mining Tools.
VOLUME-2, ISSUE-7, JULY-2015
COPYRIGHT © 2015 IJREST, ALL RIGHT RESERVED

1. INTRODUCTION
The educational data mining community [1] defines educational data mining as follows: “Educational Data Mining (EDM)
is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from
educational settings, and using those methods to better understand students, and the settings which they learn in”.
EDM is the application of Data Mining (DM) techniques to educational data, and its objective is to analyze these
types of data in order to resolve educational research issues. DM can be defined as the process of extracting
interesting, interpretable, useful and novel information from data [2]. At present, a large amount of data related to
the education sector is available in the form of files, text, images, video, audio, etc. Educational data mining
extracts useful information from this data to develop models that help improve existing learning methods, techniques
and environments, and so facilitate a better learning experience.
Educational data differs in nature from the data to which traditional data mining techniques are normally applied.
Therefore, new methods and techniques are developed from the traditional data mining techniques so that they can be
applied directly to educational data. The main goal of this paper is to review the various data mining techniques
used in educational data mining.

2. BACKGROUND AND RELATED WORK
Romero and Ventura [3] have done a lot of important work related to educational data mining, using data mining
techniques for the purpose of prediction in the education field. Nguyen et al. [4] compare the accuracy of decision
tree and Bayesian network algorithms for the prediction of the performance of undergraduate and postgraduate students.
Results from this work are useful for identifying weak students who may fail in the exam. Affendy and Musthpain [5]
use performance in various subjects to predict the CGPA of bachelor students. Al-Radaideh et al. [6] use a
classification technique to improve the quality of education. Cesar et al. [7] use a data mining technique to develop
a model which helps students to take academic decisions. Nghe et al. have also contributed a lot to this field.
Ramaswami and Bhaskaran also develop a predictive model to evaluate the achievement of students at higher secondary
level. N. S. Shah applies various
decision tree techniques to categorize BBA students based on performance. Pauziah Mohd et al. [8] use a neural network
technique to predict students' performance. Jui-Hsi Fu et al. [9] use a regression model to predict performance.
Tripti Mishra et al. [10] also use a classification technique to predict students' performance.

3. EDUCATIONAL DATA MINING METHODS
Various methods used for mining educational data are as follows:
• Classification
• Prediction
• Clustering
• Regression
• Association rule mining
• Text mining
• Social network analysis

3.1 Classification
Classification methods are used for constructing a model or classifier to predict categorical labels, such as “pass”
or “fail” for student exam result data, or “selected” or “not selected” for placement data.
It is a two-step process. In the first step, known as the learning phase, a classifier is built describing a
predefined set of data. In the second step, the classifier assigns unknown data to categorical labels. Classification
can be done with the Naïve Bayes algorithm, decision tree algorithms, neural network algorithms and support vector
machines. These methods can be used for classifying students based on their performance and associated risk. They can
also be used for student behaviour modelling.

3.2 Prediction
Data prediction is also a two-step process, similar to classification, but it does not use class label terminology
because the attribute whose value is predicted is continuous-valued instead of categorical. This method is useful in
predicting student success rate, dropout rate and retention rate. Prediction can be achieved with the Naïve Bayes
algorithm and regression algorithms. Popular regression methods within educational data mining include linear
regression, neural networks and support vector machine regression. Many real-world educational data mining problems
are not simply prediction. Therefore, more complex techniques may be necessary to forecast future values using a
combination of techniques (e.g. logistic regression, decision trees or neural networks). For example, the CART
(Classification and Regression Tree) decision tree algorithm can be used to build both classification trees (to
classify categorical response variables) and regression trees (to forecast continuous response variables).

3.3 Clustering
Clustering is the process of grouping objects based on their similarity, that is, binding a set of objects into a
class based on the common nature they share with each other. Clustering can be achieved through the k-means algorithm,
the support vector clustering algorithm, the random clustering algorithm, agglomerative clustering, top-down
clustering and the DBSCAN clustering algorithm. In the education sector, clustering can be used for grouping students
based on their behaviour, performance and similarities, and for grouping learning methods based on their
effectiveness. It can also be used to analyse learning.

3.4 Regression
Regression is a statistical method of data mining. It is used to model the relationship between one or more dependent
variables and independent variables. It can be used for building a model or classifier which analyses historical data
to predict future trends.
In the case of educational data, this method can be used for student behaviour modelling and for the prediction of
student performance. It can also be used for developing models providing self-awareness.

3.5 Association Rule Mining
Association rule mining is a method to find frequent item sets in a large data set. It is used to find specific
relationships within the data set. Association rule mining discovers relationships among attributes in a data set,
producing if-then statements concerning attribute values [3]. The discovery of association rules helps in decision
making; for example, in educational data it may help in deciding which teaching methods need to be encouraged and
which need
to be improved. With respect to educational data, it can be used for student behaviour analysis, student performance
analysis and learning environment analysis. It can also be used for identifying student failures, relationships,
migration and admission patterns.

3.6 Text Mining
Text mining is defined as the method of extracting useful information from text. It can be used for extracting
information from unstructured data sets such as emails, documents and files. It can be used for the development of
models which can assist students in improving their performance. It is helpful in the automatic construction of
textbooks. It can also be used for grouping books according to similarity in course and topics. Text mining is also
used in the evaluation of threaded discussions [4].

3.7 Social Network Analysis
Social network analysis (SNA) is a method to measure the interactions and relationships between individual entities
in networked information. It involves the use of network and graph theories to analyse the social network. Social
network analysis techniques and data mining techniques for information networks can be used to examine and assess
online interactions [5]. In educational data mining, SNA can be used for mining group activities by analyzing the
sociograms associated with a given group, the status of participants and the group cohesion of social interactions
[6]. Rallo et al. [15] propose to use data mining and social networks to interpret and analyse the structure and
contents of online educational communities. Social network analysis also helps in student behaviour modelling.

Name of Tool and developer | Functions/Features | Techniques | Environment
Carrot | Provides ready-to-use components for fetching search results from various sources | Clustering | Windows, Linux
See5 and C5.0 | Provides decision tree analysis; commercial version of C4.5 | Decision tree | Windows, Unix
ALPHA MINER | Provides the best cost and performance ratio for data mining applications | Versatile data mining functions | Windows, Linux, Mac
WEKA | Provides machine learning algorithms for data mining tasks | Data pre-processing, classification, regression, clustering, association rules | Windows, Linux
Table 1: Open source educational data mining tools
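
As a rough illustration of the sociogram analysis described in Section 3.7, the sketch below builds a small directed reply graph and derives two simple indicators from it. It is not the method of any cited work: the forum data, the student names and the helper functions `build_sociogram`, `in_degree` and `density` are all hypothetical, and it assumes that replies received (in-degree) can stand in for a participant's status and that graph density can stand in for group cohesion.

```python
from collections import defaultdict

# Hypothetical forum log: (author of a reply, author being replied to).
replies = [
    ("asha", "ravi"),
    ("ravi", "asha"),
    ("meena", "ravi"),
    ("kiran", "ravi"),
    ("asha", "meena"),
]


def build_sociogram(reply_pairs):
    """Return a directed graph as {student: set of students they replied to}."""
    graph = defaultdict(set)
    for source, target in reply_pairs:
        if source == target:
            continue  # ignore self-replies; they carry no interaction
        graph[source].add(target)
        graph[target]  # touching the key registers reply-only receivers as nodes
    return dict(graph)


def in_degree(graph):
    """Count how many distinct students replied to each student."""
    counts = {student: 0 for student in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


def density(graph):
    """Share of possible directed links that are actually present."""
    n = len(graph)
    if n < 2:
        return 0.0
    edges = sum(len(targets) for targets in graph.values())
    return edges / (n * (n - 1))


sociogram = build_sociogram(replies)
status = in_degree(sociogram)   # assumed proxy for participant status
cohesion = density(sociogram)   # assumed proxy for group cohesion

print(status)
print(round(cohesion, 3))
```

With real data, the reply pairs would come from a discussion forum export, and a graph library could replace the hand-written helpers.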

4. EDUCATION DATA MINING TOOLS
The tools used for the data mining process are also used in educational data mining. These tools can be classified
into two categories: the first category (Table 1) consists of open source tools, and the second category (Table 2)
consists of commercial educational data mining tools.

Name of Tool and developer | Functions/Features | Techniques | Environment
Intelligent Miner (IBM) | Provides tight integration, scalability of mining | Association mining, classification, regression, clustering, pattern analysis | Windows, Solaris
MSSQL Server (2005) | Provides DM functions | Integrates the algorithms developed by third party | Windows, Linux
Mine Set | Provides robust graphics tools | Association mining, classification | Windows, Linux
Oracle Data Mining | Multidimensional data analysis | Association mining, classification, prediction | Windows, Mac, Linux
Table 2: Commercial educational data mining tools

5. CONCLUSION
Educational data mining is a growing field and has a lot of potential. This field not only helps in improving student
performance but also leads us towards a robust education system based on quick and robust decision making. The field
still needs to be explored to achieve more accurate results. The methods used in educational data mining at present
are mainly designed for normal business data, so there is a need to develop methods suited to the nature of
educational data. In future work, intelligent systems can be developed that use these data mining methods to mine
educational data.

REFERENCES
[1] www.educationaldatamining.org.
[2] Delavari, N., Phon-Amnuaisuk, S., Beikzadeh, M. (2008). Data Mining Application in Higher Learning Institutions.
In Informatics in Education Journal, 7, 1, 31-54.
[3] Romero, C., Ventura, S. and Garcia, E., "Data mining in course management systems: Moodle case study and
tutorial". Computers & Education, Vol. 51, No. 1, pp. 368-384, 2008.
[4] Nguyen N., Paul J., and Peter H., A comparative analysis of techniques of predicting student performance. In the
proceedings of the 37th ASEE/IEEE Frontiers in Education.
[5] I.H., M.Paris, L.S.Affecndy, Improving prediction performance using voting technique in data mining, in World
Academy of Science, vol 38, 2010.
[6] Al-Radaideh, Al-Shwakfa, Mining student data using decision tree, in International Arab Conference on Information
Technology, 2006.
[7] Cesar V., Javier B., Liela S., Recommendation in higher education using data mining, Educational Data Mining
conference, 2009.
[8] Pauziah Mohd Arsad, Norlida Buniyamin, Jamalul-lail Ab Manan, A neural network students’ performance prediction
model, IEEE, 2013.
[9] Ashkan Sharabiani, Member, IEEE, Fazle Karim, Anooshiravan Sharabiani, Mariya Atanasov, Houshang Darabi, An
enhanced Bayesian network model for prediction of students’ academic performance in engineering programs, 2014.
[10] Tripti Mishra, Dr. Dharminder Kumar, Dr. Sangeeta Gupta, Mining student data for performance prediction, Fourth
International Conference on Advance Computing and Communication Technologies, 2014.
[11] Agrawal, R., Imielinski, T., & Swami, A., “Mining association rules between sets of items in large databases,”
In Proceedings of the ACM SIGMOD international conference on management of data, Washington DC, USA, 1993.
[12] A Survey and Future Vision of Data Mining in Educational Field, Barahate Sachin R. (2012).
[13] Scott, J. Social network analysis: A handbook (2nd ed.). Newberry Park, CA: Sage. 2000.
[14] Reyes, P., & Tchounikine, P., “Mining learning groups’ activities in forum-type tools,” In Proceedings of the
2005 conference on computer support for collaborative learning, Taiwan, 2005, (pp. 509–513).
[15] Rallo, R., Gisbert, M., & Salinas, J., “Using data mining and social networks to analyze the structure and
content of educative online communities,” In International conference on multimedia and ICTs in education, Caceres,
Spain, 2005, (pp. 1–10).