This paper studies the extraction of handwritten numeral strings from form documents. A method based on hybrid binarization is proposed to accurately locate the characters in a form cell and extract them completely, even where strokes overlap the cell borders. Two key problems are examined in detail: locating and extracting the cell of interest (COI), and mending broken strokes. The method copes with the usual variations introduced by handwriting, so the handwritten numerals in the form are extracted intact. Experimental results demonstrate the effectiveness of the proposed method.