Electronic supplementary material: The online version of this article (doi:10.1007/s11390-007-9077-8) contains supplementary material, which is available to authorized users.
As historical Chinese calligraphy works are digitized, retrieving them becomes a new challenge. Currently, no OCR technique can convert calligraphy character images into text, nor do existing handwriting character recognition approaches work for them. This paper proposes a novel approach to efficiently retrieving Chinese calligraphy characters on the basis of similarity: each calligraphy character image is represented by a collection of discriminative features, and high retrieval speed with reasonable effectiveness is achieved. First, calligraphy characters that cannot be similar to the query are filtered out step by step by comparing character complexity, stroke density, and stroke protrusion. Then, similar calligraphy characters are retrieved and ranked according to the matching cost produced by approximate shape matching. To speed up retrieval, we employ a high-dimensional data structure, the PK-tree. Finally, the efficiency of the algorithm is demonstrated by a preliminary experiment with 3012 calligraphy character images.
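The filter-then-rank pipeline described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The `Glyph` fields, the tolerance values, and the L1 comparisons are all hypothetical. The `chamfer_cost` function is a simple symmetric nearest-neighbour distance used as a stand-in for the approximate shape matching. The PK-tree index is omitted, so candidates are scanned linearly.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Glyph:
    """Hypothetical feature record for one calligraphy character image."""
    name: str
    complexity: float      # assumed scalar, e.g. ratio of ink pixels
    stroke_density: tuple  # assumed: stroke crossings along a few scan lines
    protrusion: tuple      # assumed: stroke extent in four directions
    points: list           # sampled contour points as (x, y) pairs


def l1(a, b):
    """Sum of absolute differences between two equal-length feature tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chamfer_cost(p, q):
    """Symmetric mean nearest-neighbour distance between two point sets.

    A stand-in for the approximate shape matching cost, not the paper's method.
    """
    def one_way(a, b):
        return sum(min(hypot(ax - bx, ay - by) for bx, by in b)
                   for ax, ay in a) / len(a)
    return 0.5 * (one_way(p, q) + one_way(q, p))


def retrieve(query, database, tol_complexity=0.2, tol_density=3, tol_protrusion=3):
    """Filter candidates with cheap features, then rank survivors by shape cost."""
    # Stage 1: step-by-step filtering, cheapest feature first.
    survivors = [g for g in database
                 if abs(g.complexity - query.complexity) <= tol_complexity]
    survivors = [g for g in survivors
                 if l1(g.stroke_density, query.stroke_density) <= tol_density]
    survivors = [g for g in survivors
                 if l1(g.protrusion, query.protrusion) <= tol_protrusion]
    # Stage 2: the expensive shape comparison runs only on the survivors.
    return sorted((chamfer_cost(query.points, g.points), g.name) for g in survivors)
```

Ordering the filters from cheapest to most expensive means the costly shape comparison is computed only for characters that pass every earlier check, which is the motivation for filtering before matching.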