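The principal curve is a new feature extraction method based on nonlinear transformation: it extracts features using a smooth curve that passes through the "middle" of the data distribution and satisfies self-consistency. To improve the accuracy of handwritten numeral string segmentation, a segmentation method based on stroke grouping is proposed. The method first uses principal curves to extract the strokes of character templates, and then groups the strokes according to the confidence supplied by a character recognizer, thereby segmenting the handwritten numeral string. The character recognizer combines two single-feature classifiers, one based on segmented numeral contour features and the other on normalized template features. Experiments show that the classifiers built on these two features are strongly complementary. Because the raw confidence of the character recognizer does not reliably reflect the recognition result, a class-conditional confidence transformation is used to correct it by estimating the classifier's posterior probabilities. Experimental results show that the method is effective for segmenting handwritten numerals.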
Principal curves are a new feature extraction method based on nonlinear transformation. They are smooth, self-consistent curves that pass through the "middle" of a data distribution, and they reflect the structural features of the data. This paper uses principal curves to extract the strokes of characters and segments numeral strings by grouping strokes according to the confidence of the classifiers. Two classifiers, one based on segmented contour features and the other on normalized template features, are combined; experimental results indicate that the correlation between these two features is small. The confidence of the combined classifier is then corrected using posterior probabilities estimated by a novel class-conditional confidence transformation approach. Experimental results indicate that the method is effective for the segmentation of numeral strings.
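The stroke-grouping step can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the strokes are already extracted and ordered left to right, that each character consists of at most `max_group` consecutive strokes, and that the best segmentation is the one maximizing the sum of log-confidences. The names `group_strokes` and `toy_classify`, the string stand-ins for strokes, and the confidence values are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Stroke = str  # hypothetical stand-in; a real stroke would be a sequence of points


def group_strokes(
    strokes: Sequence[Stroke],
    classify: Callable[[Sequence[Stroke]], Tuple[str, float]],
    max_group: int = 3,
) -> Tuple[List[str], float]:
    """Split ordered strokes into consecutive groups, one group per character.

    Dynamic programming over prefixes: best[i] holds the highest total
    log-confidence (and the labels) for segmenting strokes[:i].
    """
    n = len(strokes)
    best: List[Tuple[float, List[str]]] = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(max(0, end - max_group), end):
            prev_score, prev_labels = best[start]
            if prev_score == -math.inf:
                continue  # no valid segmentation reaches this prefix
            label, conf = classify(strokes[start:end])
            if conf <= 0.0:
                continue  # log is undefined; treat the group as rejected
            candidate = prev_score + math.log(conf)
            if candidate > best[end][0]:
                best[end] = (candidate, prev_labels + [label])
    score, labels = best[n]
    return labels, score


# Hypothetical recognizer: a lookup table of (label, confidence) per stroke group.
_TOY_TABLE = {
    ("a",): ("1", 0.9),
    ("b",): ("1", 0.3),
    ("c",): ("1", 0.6),
    ("a", "b"): ("4", 0.5),
    ("b", "c"): ("7", 0.8),
}


def toy_classify(group: Sequence[Stroke]) -> Tuple[str, float]:
    # Unknown groups get a low confidence instead of being rejected outright.
    return _TOY_TABLE.get(tuple(group), ("?", 0.01))


if __name__ == "__main__":
    labels, score = group_strokes(["a", "b", "c"], toy_classify)
    print(labels, math.exp(score))
```

In this toy run the strokes `a | b c` are grouped into the characters "1" and "7", because that split has the highest combined confidence. In the setting described above, the confidences passed to such a search would be the corrected (posterior-like) values rather than raw classifier outputs, since uncalibrated scores from different stroke groups are not directly comparable.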