当前对中医学的怀疑关键在于其缺少科学数据的支撑,因此,把中医诊疗的过程数据化十分重要。针对该问题提出一种数据驱动的中医诊疗方法,基于对医案中病症和对应处方的隐语义分析,找出隐含病机,发现隐含病机与病症和药物间存在的关系,建立了一个基于传统中医医案挖掘的多内容隐含狄利克雷分布(LDA)模型。基于模型的结果,提出根据症状推荐药物的算法,并且建立了基于隐语义模型的中医在线辅助诊疗系统。通过实验评估推荐算法的有效性,在精度、召回率方面均好于基线方法。中医在线辅助诊疗系统能提供数据驱动的诊疗结果辅助中医师诊疗,帮助中医更准确、全面、智能地制定科学的治疗方案。
Current skepticism toward Traditional Chinese Medicine (TCM) stems mainly from its lack of supporting scientific data, so digitizing the TCM diagnosis and treatment process is important. To address this problem, a data-driven TCM diagnosis and treatment method was proposed. Through latent semantic analysis of the symptoms and corresponding prescriptions recorded in medical cases, the method identifies latent pathogenesis and uncovers the relationships among latent pathogenesis, symptoms, and herbs; on this basis, a Multi-Content Latent Dirichlet Allocation (LDA) model built on mining traditional TCM medical cases was established. Based on the model's results, an algorithm that recommends herbs according to a patient's symptoms was proposed, and an online auxiliary diagnosis and treatment system for TCM was built on the latent semantic model. Experiments evaluating the recommendation algorithm show that it outperforms the baseline methods in both precision and recall. The online system provides data-driven diagnosis and treatment results that assist TCM practitioners, helping them formulate scientific treatment plans more accurately, comprehensively, and intelligently.
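The symptom-to-herb recommendation step can be illustrated with a minimal sketch. It assumes a fitted topic model exposes a prior over latent pathogenesis topics plus per-topic symptom and herb distributions, and it ranks herbs by marginalising over those topics. This is a generic topic-model scoring rule, not necessarily the algorithm used in the paper; the function name `recommend_herbs`, the smoothing constant, and all toy probabilities and symptom/herb names below are hypothetical.

```python
from math import prod

def recommend_herbs(symptoms, topic_prior, symptom_given_topic,
                    herb_given_topic, top_k=3, eps=1e-9):
    """Rank herbs by an estimate of p(herb | symptoms).

    Assumed scoring rule (not taken from the paper):
      p(z | symptoms) is proportional to p(z) * product over s of p(s | z)
      p(herb | symptoms) = sum over z of p(herb | z) * p(z | symptoms)
    where z indexes latent pathogenesis topics.
    """
    # Unnormalised topic posterior; eps smooths symptoms unseen under a topic.
    weights = [
        topic_prior[z] * prod(symptom_given_topic[z].get(s, eps) for s in symptoms)
        for z in range(len(topic_prior))
    ]
    total = sum(weights)
    posterior = [w / total for w in weights]

    # Mix the per-topic herb distributions using the topic posterior.
    scores = {}
    for z, p_z in enumerate(posterior):
        for herb, p_h in herb_given_topic[z].items():
            scores[herb] = scores.get(herb, 0.0) + p_z * p_h

    # Highest-scoring herbs first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Hypothetical toy model with two latent topics (illustrative numbers only).
TOPIC_PRIOR = [0.5, 0.5]
SYMPTOM_GIVEN_TOPIC = [
    {"chills": 0.5, "headache": 0.3, "cough": 0.2},
    {"fever": 0.5, "sore_throat": 0.3, "cough": 0.2},
]
HERB_GIVEN_TOPIC = [
    {"ephedra": 0.6, "cinnamon_twig": 0.4},
    {"honeysuckle": 0.7, "forsythia": 0.3},
]

ranked = recommend_herbs(["chills", "headache"], TOPIC_PRIOR,
                         SYMPTOM_GIVEN_TOPIC, HERB_GIVEN_TOPIC)
for herb, score in ranked:
    print(f"{herb}: {score:.3f}")
```

In this sketch, the symptoms first select the likely latent topics, and the herbs are then scored through those topics rather than through direct symptom-herb co-occurrence, which is the role the abstract assigns to latent pathogenesis.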