Virtual character performance generation is widely used in film and television animation, where combining motion capture with deep learning can effectively improve the naturalness and fluency of a performance. This paper proposes a virtual character performance generation method that uses motion capture to acquire character movement data and trains the motion controller with deep reinforcement learning. The method introduces hierarchical policy learning based on the Actor-Critic framework and uses the PPO algorithm to optimize the motion control policy. Experimental results show that the reward obtained by the virtual character on the one-legged squat movement tends to stabilize after 6000 training rounds. For muscle-driven control, ablation experiments confirm the importance of muscle activation for the fluency and variety of the generated movements: on the one-legged squat, the maximum reward was 32.08, and it dropped to 16.39 when the muscle reward was removed. In a user study, participants rated the smoothness and naturalness of the virtual character’s movements highly, and the system’s usability and visualization received positive feedback. The proposed technique therefore has practical application value for virtual character performance generation.
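For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes when updating a policy. It is not taken from this paper's implementation: the function name `ppo_clipped_objective`, the clipping range `clip_eps = 0.2`, and the sample values are illustrative assumptions.

```python
from typing import Sequence


def ppo_clipped_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed value; the paper's setting is not stated here
) -> float:
    """Mean of the standard PPO clipped surrogate over a batch.

    ratios[i] is pi_new(a_i | s_i) / pi_old(a_i | s_i) and advantages[i] is the
    advantage estimate for the same sample (for example, from the critic).
    """
    if len(ratios) != len(advantages) or len(ratios) == 0:
        raise ValueError("ratios and advantages must be non-empty and equal in length")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms, so a
        # single update cannot profit from moving far away from the old policy.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


if __name__ == "__main__":
    # Hypothetical batch of three samples, for illustration only.
    print(ppo_clipped_objective([1.5, 0.5, 1.0], [1.0, -1.0, 2.0]))
```

In an Actor-Critic setup of the kind described above, the actor would be updated by gradient ascent on this quantity while the critic supplies the advantage estimates; the sketch only evaluates the objective and does not perform an update.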