To address characteristics common in clinical diagnostic classification, namely class imbalance, unequal misclassification costs, large numbers of unlabeled samples, and measurement error, this paper introduces a relatively recent result from machine learning, the stacked denoising autoencoder (SDA) neural network, and improves it by combining it with a MetaCost algorithm that uses under-sampling and local (partial) updates. The resulting combined model is cost-sensitive, reduces the effect of imbalance, makes effective use of unlabeled samples, and is robust to noise. In the experiments, the improved SDA neural network is compared with SOFTMAX regression, a back propagation (BP) neural network, a support vector machine (SVM), a traditional stacked autoencoder (SAE) neural network, and a traditional SDA neural network. The results show that the improved SDA neural network outperforms the other models in accuracy (ACC), area under the ROC curve (AUC), and related measures, which improves the performance of the classification model as a diagnostic aid.
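As a rough illustration of the cost-sensitive component mentioned above, the following minimal sketch shows the relabeling step of the standard MetaCost procedure: each sample is assigned the class with the lowest expected misclassification cost under estimated class probabilities. It is not the paper's modified variant with under-sampling and local updates, which the abstract does not specify. The function name `metacost_relabel`, the cost matrix values, and the probability estimates are all hypothetical, chosen only for the example.

```python
import numpy as np


def metacost_relabel(proba, cost):
    """Return, for each sample, the class with minimum expected cost.

    proba: array of shape (n_samples, n_classes), estimates of P(j | x).
    cost:  array of shape (n_classes, n_classes), where cost[i, j] is the
           cost of predicting class i when the true class is j.
    """
    # expected[n, i] = sum_j proba[n, j] * cost[i, j]
    expected = proba @ cost.T
    return expected.argmin(axis=1)


# Hypothetical two-class setting: class 1 is the rare "diseased" class.
# Missing a diseased case (predict 0, truth 1) is assumed to cost 5;
# a false alarm (predict 1, truth 0) is assumed to cost 1.
cost = np.array([[0.0, 5.0],
                 [1.0, 0.0]])

# Hypothetical probability estimates from any base classifier.
proba = np.array([[0.70, 0.30],
                  [0.95, 0.05]])

labels = metacost_relabel(proba, cost)
print(labels)
```

In this example the first sample is relabeled as class 1 even though class 0 is more probable, because the assumed cost of a missed diagnosis outweighs the cost of a false alarm. In MetaCost the relabeled training set is then used to retrain the classifier.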