Exploring the Enhancement Path of Generative Model-Based Optimization of Natural Language Generation in Multi-Round Dialogues

By: Na Zhang1
1College of Computer and Artificial Intelligence, Henan Finance University, Zhengzhou, Henan, 450046, China

Abstract

Dialogue generation is a key research direction in natural language processing, and the generative adversarial network (GAN) has been widely used in this field. In this paper, a PPO-GAN dialogue generation model is proposed by combining a generative adversarial network with the proximal policy optimization (PPO) algorithm within a reinforcement learning framework, and the model is validated experimentally. The experimental results show that, compared with the Adver-REGS dialogue generation model, which trains the GAN with a policy gradient, the PPO-GAN model achieves the best values of the similarity metrics BLEU-1, BLEU-2, BLEU-3, and BLEU-4, at 19.7, 14.6, 10.8, and 9.5, respectively, and outperforms Adver-REGS in terms of the correctness, fluency, and relevance of the generated responses. In addition, compared with the Seq2Seq-Attention, REGS, RCDG, and PAML models, the PPO-GAN model achieves higher dialogue generation quality and better consistency in the generated dialogues. This study opens up a feasible path for optimizing multi-round dialogue generation and provides strong support for human-machine dialogue learning.
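
The abstract describes the method only at a high level. As an illustrative aside, the sketch below shows the PPO clipped surrogate loss that a generator (policy) could be trained with when the discriminator's score on a generated reply is used as the reward. The function and parameter names (ppo_clip_loss, clip_eps) are assumptions for illustration, not taken from the paper.

```python
import torch


def ppo_clip_loss(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """PPO clipped surrogate objective for updating the generator.

    new_logprobs / old_logprobs: log-probabilities of the sampled reply tokens
    under the current policy and the policy that produced the samples.
    advantages: per-token advantage estimates; in an adversarial setup these
    would be derived from the discriminator's reward (assumed here).
    """
    # Importance ratio between the current and the sampling policy.
    ratio = torch.exp(new_logprobs - old_logprobs)
    unclipped = ratio * advantages
    # Clipping keeps each update close to the sampling policy.
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Negate because optimizers minimize; PPO maximizes the surrogate.
    return -torch.min(unclipped, clipped).mean()


# Minimal usage example with dummy tensors standing in for a generated reply.
new_lp = torch.tensor([-1.2, -0.8, -2.0], requires_grad=True)
old_lp = torch.tensor([-1.5, -0.9, -1.8])
adv = torch.tensor([0.4, -0.1, 0.7])
loss = ppo_clip_loss(new_lp, old_lp, adv)
loss.backward()
```

In a full training loop the clipped update would replace the plain policy-gradient step used by Adver-REGS, which is the main algorithmic difference the paper attributes the BLEU gains to.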