BEYOND ALGORITHMS: AN INTEGRATED APPROACH TO FAKE NEWS DETECTION USING MACHINE LEARNING TECHNIQUES
DOI:
https://doi.org/10.33480/jitk.v10i3.6061
Keywords:
BERT, ensemble learning, SVC, XGBoost
Abstract
The internet has become a major source of information, but it also facilitates the rapid spread of fake news, which can significantly influence public opinion and social decisions. Although various techniques have been developed for detecting fake news, many studies rely on a single algorithm, which often yields suboptimal performance. This study addresses that gap by comparing Support Vector Classification (SVC), XGBoost, and a Stacking Ensemble that combines the two, to determine the most effective approach for fake news detection. Text preprocessing was performed using IndoBERT, which provides context-aware, semantically rich text representations specifically for the Indonesian language. The evaluation results show that the Stacking Ensemble outperforms the individual models, achieving an accuracy of 82%, compared with 79% for XGBoost and 78% for SVC. This superior performance is attributed to the complementary strengths of the base models: SVC excels at handling high-dimensional data, while XGBoost effectively manages imbalanced datasets and captures complex feature interactions. The use of IndoBERT further enhances model performance by improving text representation through contextual embeddings. These findings highlight the effectiveness of ensemble learning in improving predictive performance and robustness for fake news detection, and show the potential of combining different machine learning techniques with advanced preprocessing methods to achieve more reliable results.
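The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made so the example runs without external downloads: synthetic feature vectors stand in for IndoBERT embeddings, scikit-learn's GradientBoostingClassifier stands in for XGBoost, and the logistic regression meta-learner, the split ratio, and all hyperparameters are illustrative choices that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# ASSUMPTION: synthetic vectors stand in for IndoBERT sentence embeddings.
# In practice each row would be the embedding of one news article.
X, y = make_classification(
    n_samples=400, n_features=32, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners. ASSUMPTION: GradientBoostingClassifier is a stand-in for
# XGBoost; xgboost.XGBClassifier could be substituted if it is installed.
base_models = [
    ("svc", SVC(kernel="rbf", probability=True, random_state=0)),
    ("gbt", GradientBoostingClassifier(random_state=0)),
]

# Stacking: out-of-fold predictions of the base learners become the
# input features of a meta-learner (here, logistic regression).
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)


def evaluate(model):
    """Fit on the training split and return accuracy on the test split."""
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


# Compare each base model against the stacked ensemble.
scores = {name: evaluate(model) for name, model in base_models + [("stack", stack)]}
for name, acc in scores.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

On synthetic data the ranking of the three models carries no meaning; the sketch only shows how the base models and the stacked ensemble would be trained and scored on the same split.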
License
Copyright (c) 2025 Bimantyoso Hamdikatama

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.