A Systematic Review of Machine Learning Techniques for Predicting Student Engagement in Higher Education Online Learning

KE TING CHONG, NORAINI IBRAHIM, SHARIN HAZLIN HUSPI, WAN MOHD NASIR WAN KADIR, MOHD ADHAM ISA
Journal of Information Technology Education: Research  •  Volume 24  •  2025  •  pp. 005

The purpose of this study is to review and categorize current trends in predicting student engagement and performance with machine learning techniques in higher education online learning. The goal is to gain a better understanding of student engagement prediction research, which is important for current educational planning and development. However, the application of machine learning approaches in student engagement studies is still very limited.

The rise of online learning during and after COVID-19 has created new difficulties for students’ engagement and academic achievement. Manual monitoring and support of students by lecturers is inadequate online, so disengagement and performance problems can be very difficult to notice. Machine learning has great potential to predict students’ engagement and outcomes accurately, making early intervention possible. Nevertheless, there is a lack of systematic presentation of trends and insights on the use of these approaches in higher education online learning, especially with a focus on student engagement research. This research addresses that gap by explaining and analyzing current trends in machine learning-based prediction models to enhance the quality and efficiency of online learning systems.

This research examines the existing literature on the application of machine learning, which allows computers to learn from data and improve their performance, to the early identification of student engagement and academic performance in higher education online learning. The PICOC protocol was used to guide the search process and define keywords aligned with the research questions. Following the PRISMA framework, a structured approach was adopted to identify, screen, and select relevant papers from the database. Meta-analysis was adopted for data analysis, whereby studies are combined and evaluated to provide insights into the effectiveness of machine learning techniques in student engagement and academic performance research.

This paper presents the current trends in predicting student engagement and academic achievement with machine learning approaches, with a focus on their relevance to online learning. It identifies challenges in interpreting the extent of student engagement, including the absence of consensus on levels of student engagement, which hampers the use of explainable artificial intelligence (approaches that make machine learning models more logical, understandable, and easily interpretable by lecturers). The findings indicate that prediction models enable lecturers to recognize disengaged students early and support their learning needs, providing direction toward more customized and effective online learning.

A total of 96 primary studies were identified and included in this systematic review. Classification machine learning methods are implemented in 88.60% of papers, while clustering methods are employed in only 15.19% of studies. Furthermore, most research focuses on student performance prediction (82.28%) rather than student engagement level prediction (12.66%). Student engagement datasets are nevertheless used in 92.14% of studies, emphasizing student engagement’s popularity in educational prediction research. Thus, while classification methods are prevalent in educational prediction research overall, their use in student engagement research is still limited due to challenges in constructing consistent engagement levels.

Lecturers need to occasionally assess student engagement levels during online learning to identify students who are left out, and to plan and take immediate action to encourage those students to engage. Syllabus designers should observe students’ engagement levels during online learning to plan course content that attracts and engages students. Students’ engagement during online learning can ensure their academic success and prevent them from dropping out.

Researchers should focus on reaching a consensus on how to differentiate student engagement levels and on implementing more explainable AI to enhance the interpretability and transparency of student engagement level predictive models. Researchers should also enhance the explainability, transparency, and accuracy of educational predictive models by addressing issues in feature selection, resampling techniques, and hyperparameter tuning.
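As an illustration of one of the resampling techniques mentioned above, the following is a minimal sketch of random oversampling for an imbalanced engagement dataset. It is not taken from any of the reviewed studies: the function name `random_oversample`, the toy features (weekly logins, forum posts), and the two engagement labels are all assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many examples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Candidate rows of this class to draw duplicates from.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: (logins per week, forum posts) with an engagement label.
rows = [(12, 5), (10, 4), (11, 6), (9, 3), (1, 0), (2, 0)]
labels = ["engaged", "engaged", "engaged", "engaged", "disengaged", "disengaged"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In practice, resampling of this kind is usually applied only to the training split, so that duplicated rows do not leak into the evaluation data.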

The study highlights the growing importance of understanding student engagement through digital footprints, which can support personalized learning experiences and lead to better educational outcomes. Efficient predictive models of student engagement can improve the effectiveness of higher education systems, benefiting students and institutions.

The challenges of current computational methods need to be overcome. These include the need for more consistent approaches to differentiating engagement levels and the need to enhance the explainability and accuracy of educational predictive models through better feature selection, resampling techniques, and hyperparameter tuning.

machine learning, prediction, student engagement, student performance, systematic literature review