MULTI-ARCHITECTURE DEEP LEARNING FOR SUBJECT INDEPENDENT FACIAL EXPRESSION RECOGNITION
DOI:
https://doi.org/10.33480/jitk.v11i4.7404
Keywords:
Computational Cost Analysis, Convolutional Neural Network, Deep Learning Architecture Comparison, Facial Expression Recognition, Subject-Independent Validation
Abstract
Facial Expression Recognition (FER) remains a challenging problem in computer vision, particularly under subject-independent conditions in which models must generalize to individuals not seen during training. This study reports a controlled comparative evaluation of three Convolutional Neural Network (CNN) architectures — MobileNetV3-Large, EfficientNet-B3, and ResNet50 — using the Extended Cohn-Kanade (CK+) dataset (981 apex-frame images, 118 subjects, seven emotion classes). All models were trained and tested under identical experimental conditions with a subject-disjoint partition (72/23/23 subjects for training, validation, and testing), so that observed performance differences may be attributed primarily to architectural design. The results indicate that MobileNetV3-Large attains the highest test accuracy of 95.16%, exceeding EfficientNet-B3 (93.01%) and ResNet50 (91.94%), while requiring the fewest parameters (~5.4M) and the shortest inference latency (~8.2 ms per image). A multi-dimensional evaluation covering per-class metrics and computational cost is also reported. These observations provide preliminary architectural guidance for FER deployment in resource-constrained environments; however, because they are derived from a single dataset and a single subject split, broader claims should be confirmed on more diverse benchmarks.
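The subject-disjoint partition described in the abstract can be illustrated with a minimal sketch. It assumes each image carries a subject identifier and that whole subjects are assigned at random to the three subsets. The function name `subject_disjoint_split`, the seed, and the toy identifiers are hypothetical and are not taken from the study, whose actual assignment procedure is not stated here.

```python
import random
from collections import defaultdict


def subject_disjoint_split(subject_ids, n_train=72, n_val=23, n_test=23, seed=0):
    """Split sample indices so that no subject appears in more than one subset.

    subject_ids: one subject label per image, in dataset order.
    Returns (split, groups): index lists per subset and the subject sets used.
    """
    subjects = sorted(set(subject_ids))
    if n_train + n_val + n_test != len(subjects):
        raise ValueError("subject counts must sum to the number of unique subjects")

    # Shuffle subjects (not images) so every image of a person stays together.
    rng = random.Random(seed)
    rng.shuffle(subjects)
    groups = {
        "train": set(subjects[:n_train]),
        "val": set(subjects[n_train:n_train + n_val]),
        "test": set(subjects[n_train + n_val:]),
    }

    # Route each image index to the subset that owns its subject.
    split = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        for name, members in groups.items():
            if sid in members:
                split[name].append(idx)
                break
    return dict(split), groups


# Hypothetical toy labels: 118 subjects with 1 to 5 images each
# (assumption for illustration; this is not the real CK+ metadata).
toy_subject_ids = [f"S{s:03d}" for s in range(118) for _ in range(1 + s % 5)]
split, groups = subject_disjoint_split(toy_subject_ids)
print({name: len(members) for name, members in groups.items()})
```

Splitting by subject rather than by image is what keeps the test set free of identities seen during training; a random per-image split would let the same face appear on both sides and would tend to inflate accuracy.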
License
Copyright (c) 2026 Taufiq, Muhammad, Asran, Ezwarsyah, Muchlis Abdul Muthalib

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.