In continuous handwritten Chinese, the radicals of a single character may be written far apart, while neighboring characters may touch or overlap. To address this problem, the paper proposes an evolutionary approach that extracts characters based on their recognition scores, implemented with an improved genetic algorithm. The stroke sequence of a text line is encoded in binary: each chromosome is a randomly generated binary string whose length follows the number of strokes, and the strokes under each run of consecutive 0s or 1s form one candidate character. The candidates are scored with the Hanwang handwritten character recognizer, and the optimization objective favors a recognized string with fewer characters and a larger total recognition score. Mutation and crossover probabilities are generated adaptively. Test results show that the approach is effective and robust for extracting characters from continuous handwritten Chinese.