To address the problem of detecting vocals in pop music, a Support Vector Machine (SVM) classifier is trained on MFCC features and used to classify audio frames. Because audio features vary continuously over time, the classification results are low-pass filtered as a post-processing step. Experimental results show that the method reaches a frame-level classification accuracy of 85.76%. The experiments also reveal that singers performing in different languages differ substantially in pronunciation, with large statistical differences in their MFCC features in particular. The song classification results can serve as one basis for music similarity measurement, and for music structure analysis, in future work.
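The post-processing step can be illustrated with a minimal sketch. The abstract does not say which low-pass filter is used, so the centered moving average below is an assumption, as are the function name `smooth_frame_labels`, the `window` and `threshold` parameters, and the sample input. The MFCC extraction and SVM stages are not shown; the input stands in for per-frame classifier decisions.

```python
def smooth_frame_labels(scores, window=5, threshold=0.5):
    """Low-pass filter per-frame scores with a centered moving average,
    then threshold the result to get binary vocal / non-vocal labels.

    The moving average is an assumed filter; the source only says that
    the classification results are low-pass filtered.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(scores)
    labels = []
    for i in range(n):
        # Clip the window at the sequence boundaries.
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        mean = sum(scores[lo:hi]) / (hi - lo)
        labels.append(1 if mean >= threshold else 0)
    return labels


# Hypothetical raw per-frame decisions (1 = vocal, 0 = non-vocal).
raw = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0]
smoothed = smooth_frame_labels(raw, window=3)
print(smoothed)
```

With a window of three frames, the isolated vocal frame near the start is suppressed and the one-frame gap inside the later vocal run is filled, which is the kind of short-term inconsistency that smoothing over continuous audio is meant to remove. A longer window smooths more aggressively at the cost of blurring segment boundaries.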