To address information redundancy and computational complexity in text classification and information retrieval, a clustering algorithm for antonyms, synonyms, and near-synonyms is proposed based on the Hierarchical Network of Concepts (HNC). The basic idea is to map the semantics of each word onto the HNC concept symbol system, so that every word is represented as a symbol string. Semantic similarity and semantic distance are then computed over these strings, and synonyms, near-synonyms, and antonyms are clustered on an HNC symbol corpus of words.
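
The pipeline described above (map words to symbol strings, score pairs, then cluster) might be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: the symbol strings in `LEXICON` are made-up placeholders rather than real HNC codes, the shared-prefix `similarity` measure and the single-link grouping in `cluster` are stand-ins chosen for brevity, and antonym detection is left out because the abstract does not say how opposition is encoded in the symbols.

```python
from itertools import combinations

# Hypothetical word -> symbol-string lexicon. These strings are invented
# placeholders for illustration; they are NOT real HNC concept codes.
LEXICON = {
    "happy": "j7211",
    "glad": "j7211",
    "cheerful": "j7212",
    "car": "pw2201",
    "automobile": "pw2201",
    "truck": "pw2203",
}


def similarity(a: str, b: str) -> float:
    """Assumed measure: shared-prefix length over the longer string's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    prefix = 0
    for x, y in zip(a, b):
        if x != y:
            break
        prefix += 1
    return prefix / longest


def distance(a: str, b: str) -> float:
    """Assumed semantic distance: the complement of the similarity score."""
    return 1.0 - similarity(a, b)


def cluster(lexicon: dict, threshold: float) -> list:
    """Single-link grouping: merge words whose similarity reaches the threshold."""
    parent = {word: word for word in lexicon}

    def find(word: str) -> str:
        # Follow parent links to the representative, compressing the path.
        while parent[word] != word:
            parent[word] = parent[parent[word]]
            word = parent[word]
        return word

    for u, v in combinations(lexicon, 2):
        if similarity(lexicon[u], lexicon[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for word in lexicon:
        groups.setdefault(find(word), []).append(word)
    return sorted(sorted(group) for group in groups.values())


# Identical symbol strings are treated as synonyms (threshold 1.0);
# a looser, arbitrarily chosen threshold also pulls in near-synonyms.
synonyms = cluster(LEXICON, 1.0)
near_synonyms = cluster(LEXICON, 0.75)
```

With this toy lexicon, a threshold of 1.0 groups only words that share an identical symbol string, while 0.75 also merges words whose strings differ in the final symbol. Both thresholds are arbitrary; a real system would tune them against the structure of the HNC symbols.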