Chronic obstructive pulmonary disease (COPD) has become a serious social problem. Airflow limitation is its most basic feature, and pulmonary function testing is important for assessing the degree of that limitation. In this paper, we use machine learning methods to analyze the pulmonary function test indicators of patients with COPD and build a more effective model for identifying and predicting the disease. First, we preprocess the data by filling in missing values and removing dirty records. Then, using factor analysis, decision tree analysis, and its optimization methods, we select 13 principal components and build a COPD prediction model with a maximum tree depth of 3, a minimum parent node size of 10, and a minimum child node size of 10. Finally, we verify the effectiveness of the model through extensive experiments: it predicts COPD with an accuracy of 83% and shows good stability.
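To make the three tree constraints concrete, the following is a minimal sketch, not the authors' implementation. It uses only the Python standard library and synthetic data. The two features, the labelling rule, the validity check in `is_clean`, median imputation, and the Gini splitting criterion are all assumptions made for illustration; the abstract specifies none of them. The factor analysis step is omitted.

```python
import random
from collections import Counter
from statistics import median

# Stopping rules quoted in the abstract.
MAX_DEPTH, MIN_PARENT, MIN_CHILD = 3, 10, 10

def is_clean(row):
    """Hypothetical validity rule: a ratio, if present, must lie in (0, 1]."""
    ratio = row[0]
    return ratio is None or 0.0 < ratio <= 1.0

def column_medians(rows):
    """Median of the observed (non-missing) values in each column."""
    return [median(v for v in col if v is not None) for col in zip(*rows)]

def impute(rows, meds):
    """Replace missing entries (None) with the given column medians."""
    return [[m if v is None else v for v, m in zip(row, meds)] for row in rows]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Best (feature, threshold) by weighted Gini, or None if nothing improves."""
    best, best_score, n = None, gini(y), len(y)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if len(left) < MIN_CHILD or len(right) < MIN_CHILD:
                continue  # each child must hold at least MIN_CHILD cases
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best

def build(X, y, depth=0):
    """Return a leaf label or a (feature, threshold, left, right) tuple."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= MAX_DEPTH or len(y) < MIN_PARENT:
        return majority  # too deep, or too few cases to act as a parent
    split = best_split(X, y)
    if split is None:
        return majority
    j, t = split
    left = [(r, lab) for r, lab in zip(X, y) if r[j] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[j] > t]
    lX, ly = zip(*left)
    rX, ry = zip(*right)
    return (j, t, build(lX, ly, depth + 1), build(rX, ry, depth + 1))

def predict(node, row):
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node

def depth_of(node):
    if not isinstance(node, tuple):
        return 0
    return 1 + max(depth_of(node[2]), depth_of(node[3]))

# Synthetic stand-in for lung function records: [ratio, percent_predicted].
random.seed(0)
data = []
for _ in range(300):
    ratio = random.uniform(0.30, 0.95)
    pct = random.uniform(30.0, 120.0)
    label = 1 if ratio < 0.70 else 0  # illustrative labelling rule only
    u = random.random()
    if u < 0.05:
        ratio = None   # simulated missing value
    elif u < 0.07:
        ratio = 9.99   # simulated dirty entry
    data.append(([ratio, pct], label))

data = [(row, lab) for row, lab in data if is_clean(row)]  # drop dirty rows
train, test = data[:200], data[200:]
meds = column_medians([row for row, _ in train])  # fitted on training rows only
X_train = impute([row for row, _ in train], meds)
y_train = [lab for _, lab in train]
X_test = impute([row for row, _ in test], meds)
y_test = [lab for _, lab in test]

tree = build(X_train, y_train)
accuracy = sum(predict(tree, r) == lab for r, lab in zip(X_test, y_test)) / len(y_test)
print(f"tree depth: {depth_of(tree)}, held-out accuracy: {accuracy:.2f}")
```

In this sketch, a node is split only if it holds at least `MIN_PARENT` cases and both resulting children hold at least `MIN_CHILD` cases, and growth stops at `MAX_DEPTH`. In scikit-learn, the closest counterparts would be the `max_depth`, `min_samples_split`, and `min_samples_leaf` parameters of `DecisionTreeClassifier`, though whether the paper's tool defines these limits identically is an assumption. The accuracy printed here reflects the synthetic data only and is unrelated to the 83% reported above.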