WORD2VEC OPTIMALIZATION USING TRANSFER LEARNING IN INDONESIAN LANGUAGE FOR HIGHER EDUCATION
DOI:
https://doi.org/10.33480/jitk.v11i2.6051
Keywords:
Indonesian language, NLP, Word2Vec, transfer learning, optimization
Abstract
Natural language processing (NLP) in Indonesian faces challenges due to limited linguistic resources, particularly in developing optimal word embedding models. This study optimizes the Word2Vec model for Indonesian in higher education contexts by leveraging transfer learning and lexicon expansion. Using a dataset of 4,463 higher-education-related tweets labeled with positive or negative sentiment, the proposed NewWord2Vec model combined with a Support Vector Machine (SVM) classifier achieved a 4% improvement in word detection accuracy over standard Word2Vec. This improvement indicates that the model better captures linguistic nuances and sentiment orientation in Indonesian text. However, the model's applicability remains limited to higher education terminology, and potential biases introduced by transfer learning must be addressed. Future research should expand the dataset to diverse domains and refine the transfer learning process to better capture contextual variations in Indonesian. These findings contribute to advancing NLP applications in Indonesian, particularly for automated assessment systems, recommendation tools, and academic decision-making processes.
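The abstract does not describe how NewWord2Vec is built, so the following is only a minimal sketch of the general idea of transfer learning with lexicon expansion, not the authors' method. It assumes a small table of pretrained vectors (the hypothetical `PRETRAINED` dictionary, with made-up values) and initializes each unseen domain word from the pretrained words it co-occurs with. The helper names `expand_vocabulary` and `embed_tweet` and the toy tweets are illustrative assumptions.

```python
from statistics import fmean

# Hypothetical pretrained vectors standing in for a general-purpose
# Indonesian Word2Vec model (values are invented for illustration).
PRETRAINED = {
    "dosen": [0.9, 0.1, 0.0],
    "kuliah": [0.7, 0.3, 0.0],
    "bagus": [0.0, 0.2, 0.9],
}


def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    return [fmean(component) for component in zip(*vectors)]


def expand_vocabulary(pretrained, domain_corpus):
    """Copy the pretrained vectors and add each unseen domain word as the
    mean of the pretrained words that share a tweet with it."""
    expanded = dict(pretrained)
    context = {}
    for tokens in domain_corpus:
        known = [pretrained[t] for t in tokens if t in pretrained]
        for token in tokens:
            if token not in pretrained and known:
                context.setdefault(token, []).extend(known)
    for word, vectors in context.items():
        expanded[word] = mean_vector(vectors)
    return expanded


def embed_tweet(tokens, vectors):
    """Represent a tweet as the average of its in-vocabulary word vectors.
    Returns None when no token has a vector."""
    found = [vectors[t] for t in tokens if t in vectors]
    return mean_vector(found) if found else None


# Toy tokenized tweets; "daring" (online) is missing from PRETRAINED.
DOMAIN_TWEETS = [
    ["dosen", "kuliah", "daring"],
    ["kuliah", "daring", "bagus"],
]

expanded = expand_vocabulary(PRETRAINED, DOMAIN_TWEETS)
features = embed_tweet(["daring", "bagus"], expanded)
```

In a fuller pipeline, vectors such as `features` would be computed for every labeled tweet and passed to a classifier such as the SVM named in the abstract. That step is omitted here to keep the sketch dependency-free.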
License
Copyright (c) 2025 Sri Hadianti, Dwiza Riana, Herdian Tohir, Jarwadi Jarwadi, Tjaturningsih Rosdiana, Evi Sopandi, Dinar Ajeng Kristiyanti

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

















