This study proposes a multilevel modeling framework that integrates Bayesian inference, genetic algorithm optimization, and dynamic time warping (DTW), aimed at high-precision collaborative matching of music and dance rhythms. A dynamic bar-pointer model is constructed in which the bar-line position and tempo of the musical rhythm serve as hidden state variables, and posterior density estimation is combined with a sequential Monte Carlo method to achieve robust musical rhythm extraction. For the dance movement system, a genetic-algorithm-based feature optimization framework is proposed that filters optimal music-dance movement matching combinations through a fitness function and quantifies rhythm synchronization with the DTW algorithm. Experimental validation shows that the Bayesian beat tracking algorithm performs well in musical cycle extraction, with cycle peaks stabilizing in the interval from -0.8 to 1 and distinct beat divisions. The correlation coefficient between music and dance movement features reaches 0.827, and matching accuracy reaches 84.52% with 20 feature pairs. The beat points of the synthesized dance overlap closely with those of the music, and the intensity distribution trends are consistent. This study not only provides a quantifiable analysis tool for music-dance co-creation, but also lays a theoretical foundation for cross-modal interaction technology in virtual reality and intelligent choreography.
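To make the DTW-based synchronization measure concrete, the sketch below aligns a one-dimensional music beat-strength envelope with a dance motion-energy envelope and returns an accumulated alignment cost (lower means tighter rhythm synchronization). This is a minimal illustration under assumed inputs: the envelope extraction, array names, and the classic quadratic-time recurrence are assumptions for exposition, not the paper's implementation.

```python
import numpy as np

def dtw_cost(music_env, dance_env):
    """Classic dynamic time warping between two 1-D rhythm envelopes.

    D[i, j] holds the minimal cost of aligning the first i music frames
    with the first j dance frames; the final cell is the synchronization
    cost (smaller = better matched rhythms).
    """
    n, m = len(music_env), len(dance_env)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local distance between the two frames being compared.
            cost = abs(music_env[i - 1] - dance_env[j - 1])
            # Standard step pattern: diagonal match, insertion, deletion.
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[n, m]

if __name__ == "__main__":
    # Toy envelopes: a beat-strength curve and a slightly out-of-phase
    # motion-energy curve standing in for real extracted features.
    t = np.linspace(0, 4 * np.pi, 200)
    music = np.clip(np.sin(t), 0, None)
    dance = np.clip(np.sin(1.05 * t + 0.2), 0, None)
    print(f"DTW synchronization cost: {dtw_cost(music, dance):.3f}")
```

In a pipeline like the one described above, such a cost could serve directly inside the genetic algorithm's fitness function, rewarding candidate music-dance feature pairings whose warped alignment cost is low.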