Using Educational Data Mining to Predict Students’ Academic Performance for Applying Early Interventions

Sarah Alturki, Nazik Alturki
Journal of Information Technology Education: Innovations in Practice  •  Volume 20  •  2021  •  pp. 121-137

One of the main objectives of higher education institutions is to provide a high-quality education to their students and reduce dropout rates. This can be achieved by predicting students’ academic achievement early using Educational Data Mining (EDM). This study aims to predict students’ final grades and identify honorary students at an early stage.

EDM has emerged as an active research area that can uncover valuable knowledge from educational databases for many purposes, such as identifying likely dropouts and students who need special attention, and discovering honorary students for allocating scholarships.

In this work, we have collected 300 undergraduate students’ records from three departments of a Computer and Information Science College at a university located in Saudi Arabia. We compared the performance of six data mining methods in predicting academic achievement. Those methods are C4.5, Simple CART, LADTree, Naïve Bayes, Bayes Net with ADTree, and Random Forest.

We tested the significance of each proposed predictor’s correlation with academic achievement using four different methods. We found that 9 of the 18 proposed features correlate significantly with students’ academic achievement after their 4th semester. Those features are student GPA during the first four semesters, the number of failed courses during the first four semesters, and the grades of three core courses, i.e., database fundamentals, programming language (1), and computer network fundamentals.
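As a rough illustration of correlation-based feature screening, the sketch below ranks candidate features by the absolute Pearson correlation with a target value. It is a stand-in under stated assumptions: the abstract does not name the four methods used, the `threshold` cut-off replaces a formal significance test, and the records, the feature names (`gpa_sem1`, `failed_courses`, `english_score`), and the target name (`final_gpa`) are invented for the example and are not the study's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def rank_features(records, target, threshold=0.5):
    """Return (feature, r) pairs with |r| >= threshold, strongest first.

    `threshold` is an arbitrary cut-off chosen for illustration; it is
    not a significance test.
    """
    target_values = [row[target] for row in records]
    features = [name for name in records[0] if name != target]
    scored = []
    for name in features:
        r = pearson([row[name] for row in records], target_values)
        if abs(r) >= threshold:
            scored.append((name, r))
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Invented toy records (hypothetical values, not taken from the study).
records = [
    {"gpa_sem1": 4.5, "failed_courses": 0, "english_score": 70, "final_gpa": 4.6},
    {"gpa_sem1": 3.0, "failed_courses": 2, "english_score": 85, "final_gpa": 3.1},
    {"gpa_sem1": 2.5, "failed_courses": 3, "english_score": 60, "final_gpa": 2.4},
    {"gpa_sem1": 4.0, "failed_courses": 1, "english_score": 65, "final_gpa": 4.1},
    {"gpa_sem1": 3.5, "failed_courses": 1, "english_score": 90, "final_gpa": 3.4},
]

selected = rank_features(records, target="final_gpa")
for name, r in selected:
    print(f"{name}: r = {r:+.3f}")
```

With real records, a significance test on each coefficient (or one of the attribute evaluators offered by a data mining toolkit) would replace the fixed cut-off.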

The empirical results show the following: (i) the main features that can predict students’ academic achievement are the student GPA during the first four semesters, the number of failed courses during the first four semesters, and the grades of three core courses; (ii) the Naïve Bayes classifier performed better than tree-based models in predicting students’ academic achievement in general, but Random Forest outperformed Naïve Bayes in predicting honorary students; (iii) English language skills do not play an essential role in students’ success at the college of Computer and Information Sciences; and (iv) studying an orientation year does not contribute to students’ success.
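Result (ii) shows that one model can lead on overall performance while another is better at a single class. A minimal sketch of how that can happen, comparing overall accuracy with the recall for one class, follows. The abstract does not state which evaluation measures the study used, and the labels, the two prediction lists, and the class names other than "honorary" are invented for illustration; they are not the study's results.

```python
def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall_for_class(y_true, y_pred, label):
    """Fraction of the true `label` cases that the model also labelled `label`."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    if not predictions_for_label:
        return 0.0  # the class does not occur in y_true
    hits = sum(p == label for p in predictions_for_label)
    return hits / len(predictions_for_label)


# Invented labels for ten hypothetical students.
y_true = ["honorary", "honorary", "pass", "pass", "pass",
          "pass", "pass", "pass", "fail", "fail"]

# Hypothetical model A: strong overall, but misses one honorary student.
model_a = ["honorary", "pass", "pass", "pass", "pass",
           "pass", "pass", "pass", "fail", "fail"]

# Hypothetical model B: weaker overall, but finds every honorary student.
model_b = ["honorary", "honorary", "pass", "pass", "pass",
           "pass", "fail", "fail", "fail", "pass"]

for name, preds in [("model A", model_a), ("model B", model_b)]:
    print(name,
          "accuracy:", accuracy(y_true, preds),
          "honorary recall:", recall_for_class(y_true, preds, "honorary"))
```

Reporting a per-class measure next to the overall one makes this kind of trade-off visible when the class of interest, such as honorary students, is a small share of the records.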

We recommend that instructors consider using EDM to predict students’ academic achievement and use those predictions to customize students’ learning experience based on their different needs.

We strongly encourage researchers to conduct more EDM studies across various universities and compare their results. For example, future research could investigate the effects of offering tutoring sessions for students who fail core courses in their first semesters, examine the role of language skills in social science programs, and examine the role of the orientation year in other programs.

The prediction of academic performance can help both teachers and students in many ways. It enables the early discovery of honorary students, so that well-deserved opportunities such as scholarships, internships, and workshops can be offered. It can also help identify students who require special attention, so that an appropriate intervention can be made at the earliest possible stage. Moreover, instructors can be aware of each student’s capability and customize the teaching tasks based on students’ needs.

For future work, the experiment can be repeated with a larger dataset. It could also be extended with more distinctive attributes to reach more accurate results that are useful for improving students’ learning outcomes. Moreover, experiments could be run with other data mining algorithms to broaden the comparison and obtain more accurate results.

Educational Data Mining (EDM), prediction of academic achievement, higher education