THE IMPACT OF WORD EMBEDDING ON CYBERBULLYING DETECTION USING HYBRID DEEP LEARNING CNN-BILSTM
DOI: https://doi.org/10.33480/jitk.v10i3.6270
Keywords: BiLSTM, CNN, cyberbullying, GloVe, hybrid
Abstract
Cyberbullying can be perpetrated by anyone, child or adult, with the primary aim of belittling or attacking specific individuals. Social media platforms such as X (formerly Twitter) often serve as the main medium for cyberbullying, where interactions frequently escalate into retaliatory attacks, intimidation, and insults. Short tweets are often hard to interpret without context, which makes approaches such as word embedding important for detecting this behavior. This research applies GloVe-based feature expansion, using a corpus built from the IndoNews dataset (127,580 entries) to improve vocabulary coverage for tweets written in both formal and informal Indonesian. The tweets were then classified with a hybrid deep learning method that combines a Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM), using 30,084 tweets collected from X as the dataset. The results show that feature expansion with GloVe improves the performance of the BiLSTM-CNN hybrid model, with the highest accuracy reaching 83.88%, an increase of 3.65% over the hybrid model without GloVe. The model detected cyberbullying on X, contributing to efforts to create a safer and more positive social media environment for users.
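The abstract does not spell out how the feature expansion is carried out. The following is a minimal sketch of one plausible reading, assuming that a vocabulary term absent from a tweet is still switched on when one of its nearest embedding neighbours appears in the tweet. The toy vectors, the `top_similar` and `expand_features` helpers, and the `k` and `min_sim` parameters are all hypothetical and are not taken from the paper; real GloVe vectors trained on the news corpus would replace `VECTORS`.

```python
import math

# Toy 3-dimensional vectors standing in for trained GloVe embeddings.
# The values are invented for illustration only.
VECTORS = {
    "bodoh": [0.9, 0.1, 0.0],
    "tolol": [0.8, 0.2, 0.1],
    "goblok": [0.85, 0.15, 0.05],
    "pintar": [-0.7, 0.6, 0.1],
    "makan": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_similar(word, vectors, k=2, min_sim=0.5):
    """Return up to k neighbours of `word`, most similar first.

    Neighbours below `min_sim` are dropped (an assumed cut-off).
    """
    if word not in vectors:
        return []
    scored = []
    for other, vec in vectors.items():
        if other == word:
            continue
        sim = cosine(vectors[word], vec)
        if sim >= min_sim:
            scored.append((sim, other))
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def expand_features(tokens, vocabulary, vectors, k=2, min_sim=0.5):
    """Binary bag-of-words features with similarity-based expansion.

    A vocabulary term is set to 1 if it occurs in the tweet, or if any of
    its nearest neighbours occurs in the tweet; otherwise it stays 0.
    """
    present = set(tokens)
    features = []
    for term in vocabulary:
        if term in present:
            features.append(1)
        elif present.intersection(top_similar(term, vectors, k, min_sim)):
            features.append(1)
        else:
            features.append(0)
    return features


if __name__ == "__main__":
    vocabulary = ["bodoh", "makan"]
    tweet = ["kamu", "tolol"]  # neither vocabulary term occurs literally
    print(expand_features(tweet, vocabulary, VECTORS))
```

In this toy setup the tweet contains no vocabulary term directly, yet the feature for "bodoh" is activated because "tolol" is among its nearest neighbours. That is the kind of vocabulary-mismatch problem in short, informal tweets that the expansion step is meant to reduce.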
License
Copyright (c) 2025 Moh. Hilman Fariz, Erwin Budi Setiawan

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.