Analysis of the Application of Speech Recognition Technology in French Cross-Cultural Communication and Its Impact on Improving Students’ Language Proficiency

Research article
DOI: https://doi.org/10.70517/ijhsa464209

Volume 46, Issue 4
Pages: 2471-2488
Open Access

By: ^¹,², ^², ^¹

¹Xi’an International University, Xi’an, Shaanxi, 710077, China

²Xidian University, Xi’an, Shaanxi, 710126, China

Published: 10/08/2025

Abstract

This study develops high-precision French speech recognition technology and applies it to the teaching of cross-cultural communication. First, we propose an end-to-end French phoneme recognition method based on cross-modal knowledge distillation. The method uses a CTC decoder to address phoneme alignment and combines a frame-level distillation weight adaptation mechanism with sequence-level distillation. We also integrate i-vector-based speaker recognition, which uses factor analysis to extract low-dimensional speaker features and thereby improves the system’s adaptability to individual learners. On the teaching side, we propose a strategy for improving students’ language proficiency: cultivating French ways of thinking, creating authentic contexts, strengthening cross-cultural awareness, and establishing a layered interactive teaching model. Experiments on French speech datasets show that the English pre-trained model performs best, with a CER of 8.87% and a syllable error rate (SER) of 10.46% between the Latin alphabet and the French alphabet set. The CTC decoder significantly outperforms the Transformer/Conformer, with a CER 9.42 percentage points lower than the Transformer encoder’s 24.95%. After i-vectors were introduced, the maximum error rate reduction reached 61.2%, and the SER on multilingual character sets decreased from 18.60% to 7.22%. Stepwise multiple regression on 476 student questionnaires shows that language attitude is the core predictor of conversational ability (β = 0.24, explaining 13.4% of the variance), self-efficacy is the dominant predictor of French proficiency improvement (β = 0.24, ΔR² = 0.065), and learning resources contribute most to reading ability (β = 0.33, explaining 21.1% of the variance).

Keywords: French speech recognition, cross-modal knowledge distillation, CTC, i-vector, language ability
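
The error-rate figures quoted in the abstract can be illustrated with a minimal sketch. It assumes CER denotes the usual character error rate, that is, edit distance divided by reference length; the abstract does not state the exact scoring procedure, and the function names below are hypothetical. The same function applies to syllable sequences for SER when lists of syllables are passed instead of strings. The last two helpers only reproduce arithmetic on numbers reported in the abstract: the CER implied for the CTC decoder, and the relative reduction corresponding to the SER drop from 18.60% to 7.22%, which comes to roughly 61.2%. Whether that is the same 61.2% the abstract reports is an inference from the matching values, not something the abstract states.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalised by reference length (assumed CER/SER definition)."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(before, after):
    """Relative error reduction, as a fraction of the original error rate."""
    return (before - after) / before


def absolute_difference(baseline, gap):
    """Error rate that lies `gap` percentage points below `baseline`."""
    return baseline - gap


if __name__ == "__main__":
    # Toy example: one missing character in a seven-character reference.
    print(f"CER: {error_rate('bonjour', 'bonjur'):.3f}")
    # Figures taken from the abstract.
    print(f"Implied CTC CER: {absolute_difference(24.95, 9.42):.2f}%")
    print(f"Relative SER reduction: {relative_reduction(18.60, 7.22):.1%}")
```

Passing token lists rather than strings, for example `error_rate(["bon", "jour"], ["bon", "soir"])`, gives a syllable-level rate under the same assumed definition.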
