A Study on Chinese Syntactic Structure Classification Method Based on Word Vector Modeling

By: Song Wang¹
¹School of Chinese Language and Literature, Xinyang College, Xinyang, Henan, 464000, China

Abstract

Chinese syntactic structure classification is a key step in improving tasks such as semantic understanding and machine translation, and research on it supports the development of natural language processing technology. In this study, the GloVe pre-trained word vector model is used to vectorize Chinese vocabulary, modeling the semantic associations between words from contextual information. A BiLSTM model then extracts the global syntactic features of sentences, and a multi-head self-attention mechanism is introduced to improve the interpretability of the model. A graph convolutional network layer is further designed, and the syntactic structure classification probability is obtained through the softmax function. On the CTB5 dataset, the CSSLSTM model achieves a syntactic structure classification precision, recall, and F1 score of 0.951, 0.947, and 0.949, respectively, substantially higher than those of the comparison methods. The model's classification performance is best on both the CTB5 and CTB7 datasets when the number of heads in the multi-head attention mechanism is 4. The confusion matrix of syntactic structure classification shows that the model has an accuracy above 0.92 for the syntactic structures “subject-verb”, “subject-verb-object”, “linked sentence”, “put word sentence”, “subject word sentence”, “compared sentence”, “existing sentence” and “concurrent sentence”, and that the average classification accuracy on the CTB5 and CTB7 datasets is 0.941 and 0.944, respectively.
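Two steps named in the abstract can be made concrete in a few lines: the softmax function that turns the final layer's class scores into probabilities, and the F1 score as the harmonic mean of precision and recall. The sketch below is illustrative only and is not the authors' implementation. The function names and the three example logits are assumptions; only the precision and recall values (0.951 and 0.947 on CTB5) come from the abstract.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical logits for three syntactic classes (illustrative values only).
probs = softmax([2.0, 0.5, -1.0])

# The predicted class is the index with the highest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)

# Precision and recall reported for CTB5 in the abstract.
f1 = f1_score(0.951, 0.947)
```

Rounded to three decimals, `f1` is 0.949, which agrees with the F1 score reported for CTB5.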