Optimizing the quality of long English text translation: Paradigm improvement driven by self-attention mechanisms

By: Xiaoyan Li 1,2, Wei Chen 2, Ruonan Wang 3, Jingjing Zhang 1,2
1 School of Education, Hefei University, Hefei, Anhui, 230616, China
2 School of Foreign Languages, Bengbu University, Bengbu, Anhui, 233030, China
3 Xidian University, Xi’an, Shaanxi, 710068, China

Abstract

Machine translation plays an important role in globalization, but traditional translation systems often produce semantic breaks and incoherent output when handling long texts. Although existing neural machine translation models perform well at the sentence level, they remain weak at cross-sentence semantic understanding and at using context. This study constructs an optimization model based on the multi-head self-attention mechanism to address the lack of semantic coherence in long English text translation. Methodologically, a context-dependent semantic coherence computation model is designed: it adopts an encoder-decoder architecture combined with multi-head attention, extracts sentence features with convolutional neural networks, and fuses document topic information with semantic matching strategies. A replication (copy) mechanism and a gating mechanism are introduced into the encoder to improve the accuracy of vocabulary generation. The results show that, after integrating the multi-head attention mechanism, the model achieves a BLEU score of 22.0885 on the Chinese-English translation task, an improvement of 0.7885 over the baseline model; in the semantic coherence analysis task, accuracy reaches 60.2485% with an F1 score of 49.4955%; and the Pearson correlation coefficient with manual scoring is 0.7498. These results indicate that the multi-head self-attention mechanism can effectively capture global semantic relations in long texts, significantly improve translation quality and semantic coherence, and provide a feasible technical path for long English text translation.
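
For readers unfamiliar with the mechanism named in the abstract, the following is a minimal sketch of standard scaled dot-product multi-head self-attention. It is not the authors' model: the weights are random rather than trained, the sizes (6 positions, model width 8, 2 heads) are arbitrary assumptions, and the helper names (`matmul`, `softmax`, `random_matrix`, `multi_head_self_attention`) are hypothetical. It only illustrates how every position attends to every other position, which is what allows such a layer to relate distant parts of a long text.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights (assumption: a trained model would learn these)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def multi_head_self_attention(x, num_heads, rng):
    """Return (output, per-head attention weights) for token vectors x.

    x has shape (seq_len, d_model); d_model must be divisible by num_heads.
    """
    d_model = len(x[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    head_outputs, head_weights = [], []
    for _ in range(num_heads):
        # Each head has its own query, key, and value projections.
        w_q = random_matrix(d_model, d_head, rng)
        w_k = random_matrix(d_model, d_head, rng)
        w_v = random_matrix(d_model, d_head, rng)
        q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)

        # Scaled dot-product scores between every pair of positions.
        k_t = [list(col) for col in zip(*k)]
        scores = [[s / math.sqrt(d_head) for s in row] for row in matmul(q, k_t)]
        weights = [softmax(row) for row in scores]

        head_outputs.append(matmul(weights, v))
        head_weights.append(weights)

    # Concatenate the heads position by position, then apply an output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(x))]
    w_o = random_matrix(d_model, d_model, rng)
    return matmul(concat, w_o), head_weights


rng = random.Random(0)
tokens = random_matrix(6, 8, rng)  # assumed toy input: 6 positions, d_model = 8
output, attn = multi_head_self_attention(tokens, num_heads=2, rng=rng)
```

Each row of `attn[h]` is a probability distribution over all input positions for head `h`, so a position near the end of a sequence can place weight directly on a position near the start without passing through intermediate steps.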