COMPARATIVE STUDY OF RESAMPLING TECHNIQUES FOR STUDENT PERFORMANCE PREDICTION USING SMOTE-ENN AND ENSEMBLE LEARNING

Authors

  • Eni Heni Hermaliani Universitas Dian Nuswantoro
  • Ahmad Zainul Fanani
  • Heru Agus Santoso Universitas Dian Nuswantoro
  • Affandy

DOI:

https://doi.org/10.33480/jitk.v11i3.8214

Keywords:

Ensemble Learning, LMS Data, Resampling Techniques, SMOTE-ENN, Student Performance Prediction

Abstract

This study analyzes the effectiveness of resampling techniques and ensemble learning in addressing class imbalance in student performance prediction, using the xAPI-Edu-Data dataset from the Kalboard 360 LMS. The dataset's class imbalance ratio of 1:1.66 biases traditional classification models toward the majority class. The study evaluates six resampling methods, including the hybrid SMOTE-ENN, combined with nine individual classifiers and three ensemble models (bagging, voting, and stacking). Models were evaluated on accuracy, precision, recall, and F1-score using stratified 5-fold cross-validation, with hyperparameters tuned through GridSearchCV. The results indicate that combining SMOTE-ENN with voting and stacking achieved the best performance, 98.18% on all four evaluation metrics, and significantly improved minority-class recall, demonstrating its effectiveness for developing early warning systems that identify at-risk students.
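To illustrate why a 1:1.66 imbalance makes accuracy alone misleading and why the abstract stresses minority-class recall, the following is a minimal sketch in plain Python. It assumes a binary simplification of the task and uses hypothetical label counts (100 minority, 166 majority) chosen only to match the stated ratio; they are not the dataset's actual class counts, and the helper name `minority_metrics` is illustrative rather than part of the study's code.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def minority_metrics(y_true, y_pred, positive=1):
    """Return accuracy plus precision, recall, and F1 for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when the class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Hypothetical labels (assumption): 100 minority (1) vs. 166 majority (0), i.e. 1:1.66.
y_true = [1] * 100 + [0] * 166

# A degenerate model that always predicts the majority class.
y_majority_only = [0] * len(y_true)

baseline = minority_metrics(y_true, y_majority_only)
print(f"accuracy={baseline.accuracy:.3f} "
      f"recall={baseline.recall:.3f} f1={baseline.f1:.3f}")
# prints: accuracy=0.624 recall=0.000 f1=0.000
```

Under these assumed counts, a model that never predicts the minority class still reaches about 62% accuracy while its minority-class recall and F1 are zero, which is the kind of majority-class bias that resampling is meant to counter.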

Published

2026-02-28

How to Cite

[1] “COMPARATIVE STUDY OF RESAMPLING TECHNIQUES FOR STUDENT PERFORMANCE PREDICTION USING SMOTE-ENN AND ENSEMBLE LEARNING”, jitk, vol. 11, no. 3, pp. 1009–1019, Feb. 2026, doi: 10.33480/jitk.v11i3.8214.