SENTIMENT CLASSIFICATION MODEL BASED ON COMPARATIVE STUDIES USING MACHINE LEARNING TECHNOLOGY

Authors

  • J PRAYOGA Universitas Dharmawangsa
  • T. Irfan Fajri Universitas Islam Kebangsaan Indonesia
  • Febri Dristyan Politeknik Jambi

DOI:

https://doi.org/10.33480/jitk.v11i3.7105

Keywords:

Hybrid Model, Lexicon Approach, LSTM, Naive Bayes, Sentiment Analysis

Abstract

The growth of social media has generated large amounts of text data, a valuable source for sentiment analysis. This study compares sentiment classification models on Indonesian-language YouTube comments, covering a lexicon-based approach, a traditional machine learning model (Naive Bayes), and a deep learning model (LSTM). Data was collected from YouTube videos on the themes of the youth generation and the demographic bonus, totaling 9,162 comments that underwent comprehensive text preprocessing. Model performance was evaluated using accuracy, precision, recall, and F1-score. Before enhanced regularization, the LSTM model reached an accuracy of 78.78% and an average F1-score of 0.79, compared to Naive Bayes, which achieved an accuracy of only 62.08% and an F1-score of 0.54. With enhanced regularization, LSTM accuracy improved to 82.15% with an average F1-score of 0.84. Although LSTM offers higher performance, the Naive Bayes model remains relevant due to its simplicity and efficiency. Comprehensive hyperparameter tuning via grid search and expanded manual annotation (40% of the dataset, with κ=0.83) support robust model evaluation and reduce labeling bias. The study provides methodologically sound benchmarks for the selection of sentiment classification models for the Indonesian language, and it suggests developing hybrid models and using contextual features for better results.
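
The evaluation metrics named in the abstract (accuracy, precision, recall, F1-score) and the inter-annotator agreement statistic κ can be illustrated with a minimal Python sketch. This is not the authors' code. The helper names `classification_metrics` and `cohen_kappa`, the three toy labels, and the use of macro averaging are assumptions made for illustration; the abstract does not state the number of classes or the averaging scheme.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the abstract only says "average".
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this label, but it was wrong
            fn[truth] += 1  # missed the true label
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        prec = tp[label] / p_den if p_den else 0.0
        rec = tp[label] / r_den if r_den else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(
        count_a[label] * count_b[label] for label in set(rater_a) | set(rater_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy labels (hypothetical, not taken from the study's dataset).
y_true = ["pos", "pos", "neg", "neg", "neu", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]

metrics = classification_metrics(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(metrics)
print(kappa)
```

On real data the same quantities would normally be computed with an established library. The hand-written version is shown only to make the definitions explicit.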

Published

2026-02-10

How to Cite

[1] J. Prayoga, T. I. Fajri, and F. Dristyan, “SENTIMENT CLASSIFICATION MODEL BASED ON COMPARATIVE STUDIES USING MACHINE LEARNING TECHNOLOGY”, jitk, vol. 11, no. 3, pp. 755–764, Feb. 2026, doi: 10.33480/jitk.v11i3.7105.