Handwritten documents are unstructured, which makes them difficult to edit. Text lines are a prominent structure in handwritten documents, and extracting them reliably is the foundation both for higher-level structuring of documents (e.g., separating text from graphics, extracting paragraphs, extracting words) and for editing them. Handwritten documents can be structured in two ways, off-line and on-line. The text-line extraction algorithm discussed in this paper is on-line: content strokes are first distinguished from gesture strokes, the content strokes are then grouped into clusters, and the text lines are extracted using the cluster increments. The effect of text-line extraction on gesture design is also described.
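
The pipeline outlined above (separate content strokes from gesture strokes, cluster the content strokes, grow text lines incrementally) can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Stroke` and `Cluster` types, the `is_gesture` flag (assumed to be set by a separate gesture recognizer), the vertical-overlap criterion, and the `min_overlap` threshold are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) pen coordinates


@dataclass
class Stroke:
    points: List[Point]
    is_gesture: bool = False  # assumption: decided elsewhere by a gesture recognizer

    @property
    def y_range(self) -> Tuple[float, float]:
        ys = [y for _, y in self.points]
        return min(ys), max(ys)


@dataclass
class Cluster:
    """A candidate text line: content strokes plus their vertical extent."""
    strokes: List[Stroke] = field(default_factory=list)
    y_min: float = float("inf")
    y_max: float = float("-inf")

    def add(self, stroke: Stroke) -> None:
        lo, hi = stroke.y_range
        self.strokes.append(stroke)
        self.y_min = min(self.y_min, lo)
        self.y_max = max(self.y_max, hi)


def vertical_overlap(cluster: Cluster, stroke: Stroke) -> float:
    """Fraction of the stroke's height that lies inside the cluster's band."""
    lo, hi = stroke.y_range
    if hi == lo:  # flat stroke (e.g. a dash): test containment instead
        return 1.0 if cluster.y_min <= lo <= cluster.y_max else 0.0
    overlap = min(cluster.y_max, hi) - max(cluster.y_min, lo)
    return overlap / (hi - lo)


def extract_text_lines(strokes: List[Stroke], min_overlap: float = 0.5) -> List[Cluster]:
    """Process strokes in writing order (on-line), growing clusters one stroke at a time."""
    lines: List[Cluster] = []
    for stroke in strokes:
        if stroke.is_gesture:
            continue  # gesture strokes are commands, not document content
        best = max(lines, key=lambda c: vertical_overlap(c, stroke), default=None)
        if best is not None and vertical_overlap(best, stroke) >= min_overlap:
            best.add(stroke)  # the stroke extends an existing line
        else:
            new_line = Cluster()
            new_line.add(stroke)
            lines.append(new_line)  # the stroke starts a new line
    return lines


strokes = [
    Stroke([(0, 0), (10, 10)]),
    Stroke([(12, 2), (20, 9)]),
    Stroke([(0, 0), (50, 50)], is_gesture=True),
    Stroke([(0, 30), (10, 40)]),
]
lines = extract_text_lines(strokes)
```

In this toy input the first two strokes share a vertical band and fall into one line, the gesture stroke is skipped, and the last stroke starts a second line. A practical system would likely also use horizontal distance and timing between strokes, which this sketch ignores.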