This paper proposes an automatic MIDI composition framework that integrates a multi-track clustering algorithm with a WaveNet model. The main melody is extracted by the multi-track clustering algorithm, and an iterative prediction mechanism for pitch sequences is built on the WaveNet model. The proposed model is used to generate Yunnan ethnic minority music, and its effect in this application is examined. The mapping between pitch and physical parameters is analyzed, and short-time energy and spectral characteristics are quantified. The skyline algorithm is used as a control to test how much the multi-track clustering algorithm improves training efficiency and accuracy. The quality of the generated music is then assessed by combining user ratings with melody line visualization. The results show that, on five subjective evaluation indexes, music generated by the proposed model scores about 24% to 58% higher than that of the skyline model and about 12% to 22% higher than that of the KD3 model. This study offers a solution for the digital inheritance of ethnic music that combines technical suitability with cultural fidelity, and it is significant for the inheritance and development of Yunnan ethnic minority music.
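For readers unfamiliar with the skyline control, the following is a minimal sketch of one common variant of the skyline heuristic, not the implementation evaluated in this paper. It assumes notes from all tracks have been merged into a single list; the `Note` type and the `skyline` function name are hypothetical and introduced only for illustration. At each onset the highest pitch is kept, and each kept note is shortened so that it does not overlap the next onset.

```python
from typing import List, NamedTuple


class Note(NamedTuple):
    """Hypothetical note record: onset and duration in beats, MIDI pitch."""
    onset: float
    duration: float
    pitch: int


def skyline(notes: List[Note]) -> List[Note]:
    """Extract a monophonic melody with a simple skyline heuristic.

    Assumption: this is one common variant (highest pitch per onset,
    then clip durations), not necessarily the one used in the paper.
    """
    # Step 1: for every distinct onset, keep only the highest-pitched note.
    top = {}
    for note in notes:
        current = top.get(note.onset)
        if current is None or note.pitch > current.pitch:
            top[note.onset] = note

    # Step 2: order the survivors in time and clip each note so that it
    # ends no later than the onset of the next kept note.
    ordered = [top[t] for t in sorted(top)]
    melody = []
    for i, note in enumerate(ordered):
        end = note.onset + note.duration
        if i + 1 < len(ordered):
            end = min(end, ordered[i + 1].onset)
        melody.append(Note(note.onset, end - note.onset, note.pitch))
    return melody
```

Because the heuristic always favors the highest pitch, it can mistake a high accompaniment figure for the melody, which is the kind of error a clustering-based extractor is intended to reduce.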