EVALUATION OF REAL-TIME SPEECH RECOGNITION ACCURACY IN INTERACTIVE VIDEO MEDIA FOR DEAF STUDENTS

Hilman Nuril Hadi; Adnan Zulkarnain

doi:10.33480/jitk.v11i4.7933

Authors

Hilman Nuril Hadi, Universitas Bhinneka Nusantara
Adnan Zulkarnain, Universitas Bhinneka Nusantara

Keywords:

Communication, Deafness, Speech Recognition

Abstract

Deafness is a disability characterized by partial or complete hearing loss in one or both ears. Deaf students in higher education face several critical challenges: (1) dependence on oral communication that they cannot directly access, (2) a limited number of sign language interpreters in regular classrooms, and (3) the absence of media that converts speech into text in real time while displaying the speaker's facial expressions. As a result, deaf students struggle to follow explanations, engage in discussions, and participate actively in the learning process. Individuals with hearing impairment tend to rely on visual learning, whereas most instructional information is delivered orally. This research aims to develop interactive media based on speech recognition and real-time video to improve communication in the learning process of deaf students. The novelty of this research lies in integrating web-based speech recognition with a multi-actor interface (instructor, student, and general user) designed specifically for inclusive higher education, which distinguishes it from conventional solutions. The method is Research and Development (R&D), with stages of needs analysis, system design, implementation, functional testing, and performance testing using Word Error Rate (WER). The overall average WER was 19.70%, with values ranging from a minimum of 13.22% to a maximum of 27.27% (a range of 14.05 percentage points). The results show that all system features performed as required and that the average WER indicates a good level of accuracy for interactive educational contexts.
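The abstract reports accuracy as Word Error Rate but does not show how it is computed. The sketch below uses the common definition, WER = (substitutions + deletions + insertions) / number of reference words, obtained from a word-level edit distance. The function name `word_error_rate`, the lowercase-and-split-on-whitespace normalization, and the sample sentences are assumptions made for illustration; they are not taken from the authors' implementation or test data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction, using word-level Levenshtein distance.

    Assumption: text is lowercased and split on whitespace; the paper's
    own normalization steps are not described in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example sentences (not from the paper's test set):
# one substitution ("starts" -> "start") and one deletion ("today").
sample_wer = word_error_rate(
    "the lecture starts at nine today",
    "the lecture start at nine",
)

# Range between the minimum and maximum WER reported in the abstract.
reported_range = round(27.27 - 13.22, 2)
```

Here `sample_wer` is 2 errors over 6 reference words, and `reported_range` reproduces the 14.05 percentage-point spread stated in the abstract. A production evaluation would typically also strip punctuation and normalize numerals before scoring.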
