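This paper proposes a high-dimensional indexing method for calligraphic characters, based on a hybrid distance tree, to speed up retrieval. First, the n calligraphic characters are grouped into several clusters by hierarchical clustering. Next, the uniform start distance and the centroid distance of each character are computed. Finally, the two distances are combined to generate the index key. Given a query character, the query over the high-dimensional calligraphic characters is answered through the hybrid-distance-tree index. Experiments show that the method achieves high query efficiency and is particularly suitable for retrieval over massive collections of calligraphic characters.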
This paper proposes a high-dimensional indexing method based on a hybrid distance tree (HD-Tree) to speed up the retrieval of Chinese calligraphic characters. The HD-Tree is built in two steps. First, the characters, represented as points in a high-dimensional space, are grouped into T clusters by a hierarchical clustering algorithm. Then the uniform start distance and the centroid distance of every character are precomputed and indexed by a partition-based B+-tree. Comprehensive experiments indicate that the approach is efficient and is especially suitable for retrieval over large databases of Chinese calligraphic characters.
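The two-step construction can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the key formula, so the sketch assumes a key of the form `cluster * SLICES + slice + cd / max_cd`, where `slice` quantizes the start distance (normalized to [0, 1]) and `cd` is the centroid distance. It also assumes that the start point is the origin, that cluster labels have already been produced by a hierarchical clustering step, and that a sorted list searched with `bisect` stands in for the partition-based B+-tree. The names `SLICES`, `build_index`, and `range_query` are hypothetical.

```python
import bisect
import math

SLICES = 10  # assumed number of start-distance slices per cluster


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, labels):
    """Build sorted one-dimensional keys from two precomputed distances."""
    origin = [0.0] * len(points[0])  # assumed start point
    clusters = sorted(set(labels))
    cid = {c: k for k, c in enumerate(clusters)}
    centroids = []
    for c in clusters:
        members = [p for p, l in zip(points, labels) if l == c]
        centroids.append([sum(col) / len(members) for col in zip(*members)])
    max_sd = max(dist(p, origin) for p in points) or 1.0
    max_cd = max(dist(p, centroids[cid[l]]) for p, l in zip(points, labels)) or 1.0
    max_cd *= 1.0 + 1e-9  # keeps cd / max_cd strictly below 1
    entries = []
    for i, (p, l) in enumerate(zip(points, labels)):
        # Quantized, normalized start distance selects the slice.
        s = min(int(dist(p, origin) / max_sd * SLICES), SLICES - 1)
        # Assumed key layout: cluster block, then slice, then centroid distance.
        key = cid[l] * SLICES + s + dist(p, centroids[cid[l]]) / max_cd
        entries.append((key, i))
    entries.sort()
    keys = [k for k, _ in entries]
    ids = [i for _, i in entries]
    return keys, ids, centroids, origin, max_sd, max_cd


def range_query(index, points, q, r):
    """Return ids of all points within distance r of q."""
    keys, ids, centroids, origin, max_sd, max_cd = index
    sq = dist(q, origin)
    # Triangle inequality: a hit's start distance lies in [sq - r, sq + r].
    lo_s = max(0, int((sq - r) / max_sd * SLICES))
    hi_s = min(SLICES - 1, int((sq + r) / max_sd * SLICES))
    hits = []
    for c, centroid in enumerate(centroids):
        dq = dist(q, centroid)
        # Same bound for the centroid distance: [dq - r, dq + r].
        lo = max(0.0, dq - r) / max_cd
        hi = min((dq + r) / max_cd, 1.0 - 1e-12)
        if lo > hi:
            continue  # the query sphere cannot reach this cluster
        for s in range(lo_s, hi_s + 1):
            base = c * SLICES + s
            left = bisect.bisect_left(keys, base + lo)
            right = bisect.bisect_right(keys, base + hi)
            for j in range(left, right):
                # Refinement: exact distance check on each candidate.
                if dist(points[ids[j]], q) <= r:
                    hits.append(ids[j])
    return sorted(hits)
```

In this sketch a range query touches only the key intervals that the triangle inequality allows for both distances, and the surviving candidates are then checked against the exact distance. The feature vectors themselves, the clustering step, and the value of `SLICES` would all have to come from the full paper.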