COMPARATIVE ANALYSIS OF BAGGING AND BOOSTING MODELS IN ENSEMBLE LEARNING FOR GRADUATION PREDICTION

DOI:

https://doi.org/10.33480/jitk.v11i3.7579

Keywords:

Bagging, Boosting, Ensemble Learning, Machine Learning, Student Graduation Prediction

Abstract

Student graduation prediction is an important element of academic decision-making in higher education, yet conventional evaluation approaches cannot identify the risk of delayed graduation early. This study compares the performance of two ensemble learning approaches, Bagging with Random Forest and Boosting with XGBoost, in predicting student graduation. It uses the Predict Students' Dropout and Academic Success dataset, which comprises records for 4,424 students. Both models were trained on the same data and evaluated using the Accuracy, Precision, Recall, F1-Score, and ROC-AUC metrics. The experiments showed nearly equal accuracy: 82.6% for Random Forest and 82.5% for XGBoost. However, XGBoost performed better on Recall (0.878) and F1-Score (0.834), indicating a higher ability to detect students who actually graduated. Based on these results, the study concludes that XGBoost is more effective than Random Forest for predicting student graduation and is better suited to an Academic Early Warning System in universities.
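The comparison protocol described in the abstract, training both ensemble models on the same split and scoring them with Accuracy, Precision, Recall, F1-Score, and ROC-AUC, can be sketched as follows. This is a minimal illustration, not the authors' code: it uses scikit-learn only, with `GradientBoostingClassifier` as a stand-in for XGBoost and `make_classification` as a stand-in for the actual dataset, which is not reproduced here.

```python
# Sketch of the paper's comparison protocol (assumptions: scikit-learn,
# GradientBoostingClassifier in place of XGBoost, synthetic data in place
# of the Predict Students' Dropout and Academic Success dataset).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic binary stand-in: label 1 = graduated on time, 0 = delayed/dropout.
X, y = make_classification(n_samples=4424, n_features=20, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Bagging (Random Forest)": RandomForestClassifier(random_state=42),
    "Boosting (GB stand-in for XGBoost)": GradientBoostingClassifier(random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)  # both models see the identical training split
    pred = model.predict(X_te)
    proba = model.predict_proba(X_te)[:, 1]  # scores for ROC-AUC
    results[name] = {
        "accuracy": accuracy_score(y_te, pred),
        "precision": precision_score(y_te, pred),
        "recall": recall_score(y_te, pred),
        "f1": f1_score(y_te, pred),
        "roc_auc": roc_auc_score(y_te, proba),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Swapping in the real XGBoost model would only require replacing the stand-in with `xgboost.XGBClassifier`; the evaluation loop stays the same.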


References

[1] L. R. Pelima, Y. Sukmana, and Y. Rosmansyah, “Predicting University Student Graduation Using Academic Performance and Machine Learning: A Systematic Literature Review,” IEEE Access, vol. 12, pp. 23451–23465, 2024, doi: 10.1109/ACCESS.2024.3361479.

[2] O. Saidani, L. J. Menzli, A. Ksibi, N. Alturki, and A. S. Alluhaidan, “Predicting Student Employability Through the Internship Context Using Gradient Boosting Models,” IEEE Access, vol. 10, pp. 46472–46489, 2022, doi: 10.1109/ACCESS.2022.3170421.

[3] S. D. A. Bujang et al., “Multiclass Prediction Model for Student Grade Prediction Using Machine Learning,” IEEE Access, vol. 9, pp. 95608–95621, 2021, doi: 10.1109/ACCESS.2021.3093563.

[4] A. Rabelo, M. W. Rodrigues, C. Nobre, S. Isotani, and L. Zárate, “Educational data mining and learning analytics: a review of educational management in e-learning,” Inf. Discov. Deliv., vol. 52, no. 2, pp. 149–163, 2023, doi: 10.1108/IDD-10-2022-0099.

[5] G. ElSharkawy, Y. Helmy, and E. Yehia, “Employability Prediction of Information Technology Graduates using Machine Learning Algorithms,” Int. J. Adv. Comput. Sci. Appl., vol. 13, no. 10, pp. 359–367, 2022, doi: 10.14569/IJACSA.2022.0131043.

[6] V. Christou et al., “Performance and early drop prediction for higher education students using machine learning,” Expert Syst. Appl., vol. 225, p. 120079, 2023, doi: 10.1016/j.eswa.2023.120079.

[7] A. J. Fernández-García, J. C. Preciado, F. Melchor, R. Rodriguez-Echeverria, J. M. Conejero, and F. Sánchez-Figueroa, “A Real-Life Machine Learning Experience for Predicting University Dropout at Different Stages Using Academic Data,” IEEE Access, vol. 9, pp. 133076–133090, 2021, doi: 10.1109/ACCESS.2021.3115851.

[8] Z. Liu, “Student Dropout Prediction Using Ensemble Learning with SHAP-Based Explainable AI Analysis,” vol. 2, pp. 111–131, 2025.

[9] K. Okoye, J. T. Nganji, J. Escamilla, and S. Hosseini, “Machine learning model (RG-DMML) and ensemble algorithm for prediction of students’ retention and graduation in education,” Comput. Educ. Artif. Intell., vol. 6, no. January, p. 100205, 2024, doi: 10.1016/j.caeai.2024.100205.

[10] L. G. R. Putra, D. D. Prasetya, and M. Mayadi, “Student Dropout Prediction Using Random Forest and XGBoost Method,” INTENSIF J. Ilm. Penelit. dan Penerapan Teknol. Sist. Inf., vol. 9, no. 1, pp. 147–157, 2025, doi: 10.29407/intensif.v9i1.21191.

[11] Y. Rimal and N. Sharma, “Ensemble machine learning prediction accuracy: local vs. global precision and recall for multiclass grade performance of engineering students,” Front. Educ., vol. 10, no. April, pp. 1–16, 2025, doi: 10.3389/feduc.2025.1571133.

[12] A. Anggrawan, H. Hairani, and C. Satria, “Improving SVM Classification Performance on Unbalanced Student Graduation Time Data Using SMOTE,” Int. J. Inf. Educ. Technol., vol. 13, no. 2, pp. 289–295, 2023, doi: 10.18178/ijiet.2023.13.2.1806.

[13] G. H. Yakin, I. M. Satriya Wibawa, and I. K. Putra, “Design of Soil pH Measuring Instruments Using pH Meter Sensor Module V1.1 SEN0161 Based on Arduino Uno,” Bul. Fis., vol. 22, no. 2, 2021, doi: 10.24843/bf.2021.v22.i02.p08.

[14] I. Riadi, R. Umar, and R. Anggara, “Comparative Analysis of Naive Bayes and K-NN Approaches to Predict Timely Graduation using Academic History,” Int. J. Comput. Digit. Syst., vol. 16, no. 1, pp. 1163–1174, 2024, doi: 10.12785/ijcds/160185.

[15] A. M. Messele, “Ensemble machine learning for predicting academic performance in STEM education,” Discov. Educ., vol. 4, no. 1, 2025, doi: 10.1007/s44217-025-00710-4.

[16] Y. Zhang, J. Liu, and W. Shen, “A Review of Ensemble Learning Algorithms Used in Remote Sensing Applications,” Appl. Sci., vol. 12, no. 17, 2022, doi: 10.3390/app12178654.

[17] A. Jain, A. K. Dubey, S. Khan, A. Panwar, M. Alkhatib, and A. M. Alshahrani, “A PSO weighted ensemble framework with SMOTE balancing for student dropout prediction in smart education systems,” Sci. Rep., vol. 15, no. 1, p. 17463, 2025, doi: 10.1038/s41598-025-97506-1.

[18] H. Sun, Y. Jiang, and X. Tang, “Artificial Intelligence-Driven Academic Performance Early Warning System for Engineering Universities: Application of XGBoost-SHAP Algorithm,” Int. J. High Speed Electron. Syst., vol. 0, no. 0, p. 2540536, doi: 10.1142/S0129156425405364.

[19] R. Muniappan et al., “An optimized deep learning framework based on LEE for real time student performance prediction in educational data,” Bull. Electr. Eng. Informatics, vol. 14, no. 5, pp. 3671–3682, 2025, doi: 10.11591/eei.v14i5.9773.

[20] F. E. Arévalo-Cordovilla and M. Peña, “Evaluating ensemble models for fair and interpretable prediction in higher education using multimodal data,” Sci. Rep., vol. 15, no. 1, p. 29420, 2025, doi: 10.1038/s41598-025-15388-9.

[21] A. A. Khan, O. Chaudhari, and R. Chandra, “A review of ensemble learning and data augmentation models for class imbalanced problems: Combination, implementation and evaluation,” Expert Syst. Appl., vol. 244, p. 122778, 2024, doi: 10.1016/j.eswa.2023.122778.

[22] M. Imani, A. Beikmohammadi, and H. R. Arabnia, “Comprehensive Analysis of Random Forest and XGBoost Performance with SMOTE, ADASYN, and GNUS Under Varying Imbalance Levels,” Technologies, vol. 13, no. 3, 2025, doi: 10.3390/technologies13030088.

[23] S. Zhao, D. Zhou, H. Wang, D. Chen, and L. Yu, “Enhancing Student Academic Success Prediction Through Ensemble Learning and Image-Based Behavioral Data Transformation,” Appl. Sci., vol. 15, no. 3, 2025, doi: 10.3390/app15031231.

[24] R. Natras, B. Soja, and M. Schmidt, “Ensemble Machine Learning of Random Forest, AdaBoost and XGBoost for Vertical Total Electron Content Forecasting,” Remote Sens., vol. 14, no. 15, 2022, doi: 10.3390/rs14153547.

[25] S. Kim, E. Choi, Y.-K. Jun, and S. Lee, “Student Dropout Prediction for University with High Precision and Recall,” Appl. Sci., vol. 13, no. 10, 2023, doi: 10.3390/app13106275.

[26] T. Kavzoglu and A. Teke, “Predictive Performances of Ensemble Machine Learning Algorithms in Landslide Susceptibility Mapping Using Random Forest, Extreme Gradient Boosting (XGBoost) and Natural Gradient Boosting (NGBoost),” Arab. J. Sci. Eng., vol. 47, no. 6, pp. 7367–7385, 2022, doi: 10.1007/s13369-022-06560-8.

[27] M. Sivakumar, S. Parthasarathy, and T. Padmapriya, “Trade-off between training and testing ratio in machine learning for medical image processing.,” PeerJ. Comput. Sci., vol. 10, p. e2245, 2024, doi: 10.7717/peerj-cs.2245.

[28] L. Huang et al., “Combining Random Forest and XGBoost Methods in Detecting Early and Mid-Term Winter Wheat Stripe Rust Using Canopy Level Hyperspectral Measurements,” Agriculture, vol. 12, no. 1, 2022, doi: 10.3390/agriculture12010074.

Published

2026-02-28

How to Cite

[1]
“COMPARATIVE ANALYSIS OF BAGGING AND BOOSTING MODELS IN ENSEMBLE LEARNING FOR GRADUATION PREDICTION”, jitk, vol. 11, no. 3, pp. 979–986, Feb. 2026, doi: 10.33480/jitk.v11i3.7579.