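Objective In tongue diagnosis in traditional Chinese medicine (TCM), a single tongue image corresponds to several categories, such as tongue color, coat color and coat thickness, and these categories are correlated to some degree. Traditional data mining techniques cannot use these correlations to model all categories at the same time, so this paper explores multi-label learning as a way to classify tongue images, which are multi-label data. Methods The tongue coat and tongue body are first separated and the color features of each are extracted. The tongue coat image is then divided into blocks and texture features are extracted from every block, after which classification is carried out with a multi-label learning algorithm (multi-label learning by exploiting label dependency, LEAD). Finally, the classification results of LEAD are compared with those of ML-kNN; the evaluation metrics are Hamming loss, average precision and α-evaluation. Results Compared with traditional single-label learning algorithms such as SVM, LEAD can assign multiple categories to a tongue image at the same time, and its classification performance is better than that of ML-kNN on all three metrics. Conclusion Using the multi-label LEAD algorithm for tongue image classification makes the description of a tongue image more comprehensive and accurate, and can assist TCM practitioners in tongue diagnosis.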
Objective In tongue inspection in traditional Chinese medicine (TCM), a single tongue image is associated with multiple labels, such as tongue body color, tongue coat color and coat thickness, and these labels are correlated with one another. Traditional data mining techniques cannot exploit these correlations while modeling all labels at the same time, so we explore multi-label learning for the classification of tongue images with multiple labels. Methods First, the tongue coat and tongue body are separated and color features are extracted from each. The tongue coat image is then divided into blocks and texture features are extracted from each block. Classification is subsequently performed with the multi-label learning algorithm LEAD (multi-label learning by exploiting label dependency). Finally, the classification results of LEAD are compared with those of ML-kNN, using Hamming loss, average precision and α-evaluation as the evaluation metrics. Results Compared with traditional single-label learning algorithms such as SVM, this method can assign a set of proper labels to a tongue image simultaneously. Moreover, LEAD achieves better classification results than ML-kNN on all three metrics. Conclusions LEAD makes the description of the tongue image more comprehensive and more accurate, providing an objective reference for TCM tongue diagnosis.
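
To make two of the evaluation metrics named above concrete, the following is a minimal sketch of Hamming loss and multi-label average precision in Python with NumPy. It is an illustration under stated assumptions, not the implementation used in the study: the five label names, the toy ground truth, the scores and the 0.5 decision threshold are all hypothetical, and the third metric (α-evaluation) is not covered here.

```python
import numpy as np


def hamming_loss(y_true, y_pred):
    """Fraction of instance-label pairs that are predicted incorrectly (lower is better)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    return float(np.mean(y_true != y_pred))


def average_precision(y_true, scores):
    """Multi-label average precision (higher is better).

    For each instance, average over its relevant labels the fraction of
    labels ranked at or above that label that are themselves relevant.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    per_instance = []
    for truth, s in zip(y_true, scores):
        relevant = np.flatnonzero(truth)
        if relevant.size == 0:
            continue  # instances without any relevant label are skipped
        # Rank 1 is the highest score; ties are broken by label index.
        order = np.argsort(-s, kind="stable")
        rank = np.empty_like(order)
        rank[order] = np.arange(1, len(s) + 1)
        precisions = [np.sum(rank[relevant] <= rank[j]) / rank[j] for j in relevant]
        per_instance.append(np.mean(precisions))
    return float(np.mean(per_instance))


# Hypothetical label set, for illustration only:
# [pale body, red body, white coat, yellow coat, thick coat]
Y_true = np.array([
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 1, 1, 0, 0],
])
# Hypothetical per-label confidence scores from some multi-label classifier.
scores = np.array([
    [0.9, 0.1, 0.8, 0.3, 0.2],
    [0.2, 0.7, 0.4, 0.6, 0.3],
    [0.1, 0.6, 0.5, 0.7, 0.2],
])
# Assumed decision rule: a label is predicted when its score is at least 0.5.
Y_pred = scores >= 0.5

print("Hamming loss:", hamming_loss(Y_true, Y_pred))
print("Average precision:", average_precision(Y_true, scores))
```

Hamming loss is computed on the thresholded label sets, whereas average precision depends only on how the scores rank the labels of each image, so the two metrics can disagree about which classifier is better.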