To address the problem that large volumes of electronic medical record (EMR) information are currently underused, this paper studies a topic modeling and visual organization method for Chinese medical information in EMRs. First, a dataset is built from EMR data and medical question-and-answer data collected from medical community web pages, then preprocessed and converted into a plain-text corpus. Second, an LDA topic model is trained with the Mallet-based training algorithm, and its output is visually organized and presented according to the needs of topic model analysis. Finally, a visual analysis system for Chinese medical information is built. A case study shows that the system effectively helps users build and analyze topic models and supports further diagnosis.
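As a rough illustration of the topic modeling step, the sketch below trains a small LDA model by collapsed Gibbs sampling, the family of samplers on which Mallet's LDA trainer is based. It is not Mallet's implementation and not the system described here: the function names `train_lda` and `top_words`, the hyperparameter values, and the toy pre-segmented corpus `DOCS` are all assumptions made for illustration, and a real pipeline would first need a Chinese word segmenter and stop-word filtering.

```python
import random

# Hypothetical toy corpus: each document is already segmented into words.
# The tokens are invented for illustration and are not taken from any real record.
DOCS = [
    ["咳嗽", "发热", "咽痛", "咳嗽"],      # cough, fever, sore throat
    ["发热", "咳嗽", "乏力"],              # fever, cough, fatigue
    ["胸痛", "心悸", "气短", "胸痛"],      # chest pain, palpitations, shortness of breath
    ["心悸", "胸痛", "气短"],
]


def train_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling; return (vocab, phi, theta)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]             # count of topic k in doc d
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count of word v in topic k
    topic_total = [0] * num_topics                           # tokens assigned to topic k
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed topic-word (phi) and document-topic (theta) distributions.
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + vocab_size * beta)
         for v in range(vocab_size)]
        for k in range(num_topics)
    ]
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return vocab, phi, theta


def top_words(vocab, phi, n=3):
    """Return the n most probable words of each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in phi
    ]


if __name__ == "__main__":
    vocab, phi, theta = train_lda(DOCS, num_topics=2)
    for k, words in enumerate(top_words(vocab, phi)):
        print(f"topic {k}: {' '.join(words)}")
```

The per-topic word lists from `top_words` and the per-document proportions in `theta` are the kind of model output that a visual analysis front end could organize and display.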