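Character segmentation is a key step in CAPTCHA character recognition. To address the low segmentation success rate for CAPTCHAs composed of touching characters, this paper proposes a touching-character segmentation algorithm that combines SOM (self-organizing maps) neural network clustering with skeleton morphology analysis based on the Voronoi diagram. The algorithm first uses connected components to identify touching characters, then uses the Voronoi diagram to obtain the skeleton of the touching characters and extracts the skeleton feature points. Segmentation points are determined from the distribution of the topological neurons after SOM clustering, which completes the segmentation and restoration of the touching-character skeleton. Tests were run on a set of CAPTCHA images collected from the web, and a comparison with the drop-fall method and the connected-component extraction method shows the superiority of the proposed algorithm. The algorithm accurately segments CAPTCHAs with various types of character touching as well as slanted and distorted fonts, providing a new method for touching-character segmentation.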
Character segmentation is a key step in CAPTCHA recognition. Because connected characters in CAPTCHAs are segmented with a low success rate, this paper proposes a character segmentation algorithm based on clustering of the touching region via self-organizing maps (SOM) and skeletonization via Voronoi diagrams. First, it uses a connected-component-based method to identify connected character pairs and selects feature points through a Voronoi-based skeletonization process. It then determines the segmentation points from the neurons of the SOM, leading to the final segmentation and character restoration. Tests on online CAPTCHA collections show that this algorithm performs better than the drop-fall and connected-component-based algorithms. It can segment a variety of connected and distorted CAPTCHAs, providing a new method for the segmentation of connected characters.
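
The first stage described above, using connected components to find candidate touching characters, can be illustrated with a minimal sketch. The abstract does not state the criterion used to decide that a component contains more than one character; the width threshold `max_char_width`, the function names, and the toy binary image below are all assumptions made for illustration, not the paper's method.

```python
from collections import deque


def connected_components(img):
    """Group 8-connected foreground pixels (value 1) into components.

    img is a list of equal-length rows of 0/1 values.
    Returns a list of components, each a list of (row, col) pixels.
    """
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def flag_touching(components, max_char_width):
    """Return (left, right, is_touching) per component, sorted left to right.

    Assumption: a component wider than max_char_width is treated as a
    candidate group of touching characters. This heuristic is hypothetical.
    """
    result = []
    for pixels in components:
        xs = [x for _, x in pixels]
        left, right = min(xs), max(xs)
        result.append((left, right, right - left + 1 > max_char_width))
    return sorted(result)


# Toy image: a narrow blob (columns 0-1) and a wide blob (columns 4-9).
image = [
    [1, 1, 0, 0, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 1, 1, 1, 1],
]
flags = flag_touching(connected_components(image), max_char_width=3)
print(flags)
```

In this sketch only the wide component would be passed on to the later skeletonization and SOM clustering stages; isolated characters need no further splitting.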