With the rapid development of artificial intelligence technology, generative modeling is increasingly applied to artistic creation. In this paper, we design Note Rank Transformer, a pop music generation model based on an improved Transformer-XL. The model combines a lyrics embedding with a memory embedding module and introduces a masking array into the multi-head attention mechanism to model music sequences effectively. On a self-constructed dataset, samples generated by Note Rank Transformer achieve a mean scale consistency of 90.46%, the closest to real music (90.22%) in terms of statistical significance. For the three metrics of polyphony, note span, and note uniqueness, most generated samples score slightly above the mean of the real samples, and the model also performs well on repetitiveness and the proportion of high-quality notes. Note Rank Transformer converges faster than Transformer-XL and its training process is more stable: the parameter distribution obtained in the music generation experiments with Note Rank Transformer lies within the interval (-0.2, 0.2), which is significantly narrower than the (-0.5, 0.5) interval obtained in the Transformer-XL experiments, demonstrating that the improvement strategy proposed in this paper is effective.
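The abstract does not specify the exact form of the masking array used to modify multi-head attention. Purely as an illustration of the general technique, the following sketch shows standard scaled dot-product attention with a causal mask, so that each position attends only to itself and earlier positions; all names and shapes here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention.

    Positions where mask == 0 receive a large negative score,
    so their attention weight collapses to ~0 after softmax.
    """
    d_k = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_k)   # (batch, T, T)
    scores = np.where(mask == 0, -1e9, scores)         # apply masking array
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

# Causal (lower-triangular) masking array: token t sees tokens 0..t only.
T, d = 4, 8
rng = np.random.default_rng(0)
q = rng.normal(size=(1, T, d))
k = rng.normal(size=(1, T, d))
v = rng.normal(size=(1, T, d))
mask = np.tril(np.ones((1, T, T)))
out, weights = masked_attention(q, k, v, mask)
```

In a full multi-head layer this computation would be repeated per head on projected queries, keys, and values; the masking array is what prevents a position from attending to future notes during autoregressive generation.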