Suffix Tree Clustering (STC) has been applied to web page clustering problems. To apply the STC algorithm to Uighur web pages, this paper analyzes the characteristics of the generalized suffix tree and of the Uighur language, and designs a construction algorithm for a Uighur generalized suffix tree. Experimental results show that the method constructs the Uighur suffix tree in linear time and that the resulting tree can be used to cluster Uighur web pages.
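To make the role of the suffix tree in STC concrete, the following is a minimal sketch and not the construction algorithm designed in this paper. It builds a naive word-level generalized suffix trie, which takes quadratic rather than linear time, and reads off the phrases shared by several documents (the "base clusters" of STC). The function names `build_suffix_trie` and `base_clusters`, the whitespace tokenization, and the English placeholder documents are all assumptions made for illustration; real Uighur text would need its own tokenization and stemming.

```python
def build_suffix_trie(docs):
    """Naive word-level generalized suffix trie (illustrative, O(n^2) per document).

    Each node stores its children keyed by word and the set of document
    ids whose suffixes pass through it. Tokenization by whitespace is an
    assumption, not the paper's preprocessing.
    """
    root = {"children": {}, "docs": set()}
    for doc_id, text in enumerate(docs):
        words = text.split()
        # Insert every word-level suffix of the document.
        for start in range(len(words)):
            node = root
            for word in words[start:]:
                node = node["children"].setdefault(
                    word, {"children": {}, "docs": set()}
                )
                node["docs"].add(doc_id)
    return root


def base_clusters(root, min_docs=2):
    """Return {phrase: doc ids} for phrases found in at least min_docs documents."""
    clusters = {}
    stack = [((), root)]
    while stack:
        phrase, node = stack.pop()
        for word, child in node["children"].items():
            # A longer phrase never occurs in more documents than its prefix,
            # so subtrees below the threshold can be skipped entirely.
            if len(child["docs"]) >= min_docs:
                new_phrase = phrase + (word,)
                clusters[" ".join(new_phrase)] = set(child["docs"])
                stack.append((new_phrase, child))
    return clusters


# Placeholder documents (hypothetical, English stand-ins for page snippets).
docs = ["cat ate cheese", "mouse ate cheese too", "cat ate mouse too"]
clusters = base_clusters(build_suffix_trie(docs))
for phrase in sorted(clusters):
    print(phrase, sorted(clusters[phrase]))
```

In full STC, base clusters whose document sets overlap heavily are then merged into final clusters; a linear-time construction such as the one this paper targets would replace the naive insertion loop above.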