A Probabilistic Framework for Robust Chorus Melody Recognition Using High-Order Cepstral Features and Key-Independent Quaternary Language Models

By: Huilin Huang 1
1School of Music, University of Sanya, Sanya, Hainan, 572000, China

Abstract

Chorus melody recognition, the automatic identification of note sequences from choral audio, is a critical front-end component of melody-based retrieval and educational tools. Traditional non-statistical approaches rely heavily on noisy fundamental-frequency estimation and ad-hoc segmentation, resulting in poor robustness across singers and acoustic conditions. In this work, we present a novel probabilistic framework adapted from continuous speech recognition. First, instead of fundamental frequency, we extract high-order cepstral coefficients within the human voice pitch range (C2–E4 for male, C3–E5 for female voices) and normalize them to fixed-length feature vectors, thereby reducing errors caused by voiced/unvoiced decisions. Second, each note (and silence) is treated as an HMM "word" whose state likelihoods are modeled by GMMs and trained jointly via the forward–backward algorithm. Third, we construct a key-independent quaternary (4-gram) language model to capture prior probabilities of note transitions, obviating the need for explicit key detection. Finally, recognition is performed by a global Viterbi search over the combined acoustic and language models. Evaluated on a corpus of multi-singer choral recordings sung both on neutral syllables ("da/ta") and with lyric content, our system achieves over 90% correct note-sequence accuracy in clean conditions and maintains 80% accuracy at 10 dB SNR, outperforming baseline fundamental-frequency-based methods by 15–20%. Moreover, integration into a chorus query prototype demonstrates a 30% improvement in top-3 retrieval precision.
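The global Viterbi search described above combines per-frame acoustic log-likelihoods (from the GMM-HMM note models) with note-transition priors (from the language model). The following is a minimal sketch of that decoding step, not the authors' implementation: for simplicity it assumes one state per note, precomputed acoustic log-likelihoods, and a bigram transition matrix standing in for the quaternary language model.

```python
import numpy as np

def viterbi(log_obs, log_trans, log_init):
    """Most-probable note sequence under combined acoustic + language scores.

    log_obs:   (T, N) acoustic log-likelihood of each of N notes per frame
    log_trans: (N, N) log prior of moving from note i to note j (language model)
    log_init:  (N,)   log prior over the first note
    """
    T, N = log_obs.shape
    delta = np.full((T, N), -np.inf)   # best path score ending in note j at frame t
    back = np.zeros((T, N), dtype=int) # backpointers for path recovery
    delta[0] = log_init + log_obs[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans  # (N, N): predecessor i -> note j
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(N)] + log_obs[t]
    # Trace back from the best final note
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(delta[-1]))
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

# Toy decode: 3 hypothetical notes, 4 frames whose acoustics clearly
# favor note 0 then note 1; transitions mildly prefer staying on a note.
log_init = np.log(np.array([0.6, 0.2, 0.2]))
log_trans = np.log(np.array([[0.8, 0.1, 0.1],
                             [0.1, 0.8, 0.1],
                             [0.1, 0.1, 0.8]]))
log_obs = np.array([[0., -10., -10.],
                    [0., -10., -10.],
                    [-10., 0., -10.],
                    [-10., 0., -10.]])
print(viterbi(log_obs, log_trans, log_init))  # → [0 0 1 1]
```

Extending this to the paper's setting would mean replacing the bigram `log_trans` lookup with a 4-gram history and expanding each note into its HMM states, which enlarges the search lattice but leaves the dynamic-programming recursion unchanged.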