To improve the accuracy, robustness, and efficiency of cell nucleus type recognition in quantitative cytological analysis, a dimensionality reduction method for cell nucleus feature vectors is proposed. First, the statistics-based F-score algorithm is used to pre-screen the cell nucleus features, and features with clearly low F-scores are rejected. Next, the random forest (RF) algorithm is used to estimate how much information each remaining feature contributes to classification, and the features are sorted in descending order of this RF score. Finally, support vector machine (SVM) classification experiments are run with different numbers of features to determine the final feature vector for cell nucleus recognition. Experimental results show that, compared with the classifier before dimensionality reduction, the method saves about 50% of the recognition time and raises the recognition accuracy from 91.32% to 98.67%.
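The F-score pre-screening step can be illustrated with a minimal sketch. The abstract does not give the F-score formula or the rejection threshold, so the code below assumes the commonly used two-class definition (squared distances of the class means from the overall mean, divided by the sum of the per-class sample variances). The function names `f_score` and `prefilter`, the threshold value, and the toy data are all hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """Two-class F-score of one feature (assumed definition, see lead-in).

    pos / neg hold the feature values of the two classes; each needs at
    least two samples because the sample variance is used.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances, n - 1 denominator
    if within == 0:
        # Feature is constant inside both classes (handling is an assumption).
        return 0.0 if between == 0 else float("inf")
    return between / within


def prefilter(features, labels, threshold):
    """Return the column indices whose F-score is at least `threshold`."""
    kept = []
    for j in range(len(features[0])):
        pos = [row[j] for row, y in zip(features, labels) if y == 1]
        neg = [row[j] for row, y in zip(features, labels) if y == 0]
        if f_score(pos, neg) >= threshold:
            kept.append(j)
    return kept


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.0], [3.2, 5.0], [2.8, 3.0]]
y = [0, 0, 0, 1, 1, 1]
selected = prefilter(X, y, threshold=1.0)  # hypothetical threshold
```

In the pipeline described above, the surviving features would then be ranked by random forest importance and evaluated with an SVM at different feature counts; those steps are omitted here because the abstract does not specify their settings.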