COMPARATIVE STUDY OF TRANSFORMER-BASED MODELS FOR AUTOMATED RESUME CLASSIFICATION
DOI: https://doi.org/10.33480/jitk.v11i2.7453

Keywords: bert, nlp, resume classification, transformer model

Abstract
This study presents a comparative evaluation of transformer-based models and traditional machine learning approaches for automated resume classification, a key task in optimizing recruitment workflows. The traditional Support Vector Machine (SVM) with TF-IDF features achieved the highest F1-score (95%, with 93.26% accuracy), while the transformer models DistilBERT and RoBERTa were competitive at 93.27% and 91.34% accuracy, respectively, and fine-tuned BERT reached 84.35% accuracy with an F1-score of 81.54%, indicating strong semantic understanding. In contrast, Word2Vec + LSTM performed poorly across all metrics, highlighting the limitations of purely sequential modelling for resume data. All models were evaluated on a curated resume dataset, available in both text and PDF formats, using accuracy, precision, recall, and F1-score; preprocessing included tokenization, stop-word removal, and lemmatization. To address class imbalance, we applied stratified sampling, macro-averaged evaluation metrics, early stopping, and simple data augmentation for underrepresented categories. Model training was conducted in a PyTorch environment using Hugging Face's Transformers library. These findings highlight the continued relevance of traditional models for specific NLP tasks and underscore the importance of selecting models based on task complexity and data characteristics.
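For readers who want a concrete starting point, the snippets below sketch the two pipeline families compared in the study. First, a minimal version of the TF-IDF + SVM baseline with a stratified split and macro-averaged F1, implemented with scikit-learn; the CSV path, column names, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of the TF-IDF + SVM baseline (scikit-learn).
# "resumes.csv" with "resume_text" and "category" columns is an assumed layout.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

df = pd.read_csv("resumes.csv")
X, y = df["resume_text"], df["category"]

# Stratified split keeps per-category proportions, mitigating class imbalance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english", max_features=20000)),
    ("svm", LinearSVC(C=1.0)),
])
model.fit(X_train, y_train)

pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred))
print("macro F1:", f1_score(y_test, pred, average="macro"))  # macro-averaged, as in the study
```

Second, a hedged sketch of the transformer fine-tuning setup named in the abstract (Hugging Face Transformers on a PyTorch backend, with early stopping); the DistilBERT checkpoint, file name, and training arguments are again assumptions for illustration rather than the reported configuration.

```python
# Illustrative DistilBERT fine-tuning with early stopping (Hugging Face Transformers).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, EarlyStoppingCallback,
                          Trainer, TrainingArguments)

# Assumed CSV layout as above; labels are encoded and split with stratification.
ds = load_dataset("csv", data_files="resumes.csv")["train"]
ds = ds.class_encode_column("category").rename_column("category", "label")
ds = ds.train_test_split(test_size=0.2, stratify_by_column="label", seed=42)

checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=ds["train"].features["label"].num_classes
)

def tokenize(batch):
    return tokenizer(batch["resume_text"], truncation=True, max_length=512)

ds = ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="distilbert-resume",
    eval_strategy="epoch",        # named "evaluation_strategy" in transformers < 4.41
    save_strategy="epoch",
    load_best_model_at_end=True,  # restore the best checkpoint when early stopping triggers
    metric_for_best_model="eval_loss",
    num_train_epochs=5,
    per_device_train_batch_size=16,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=ds["train"],
    eval_dataset=ds["test"],
    data_collator=DataCollatorWithPadding(tokenizer),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```

Equivalent runs for RoBERTa or BERT only require swapping the checkpoint name; the rest of the pipeline stays the same.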
License
Copyright (c) 2025 Nurul Firdaus, Berliana Kusuma Riasti, Muhammad Asri Safi'ie

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.





