To supply component retrieval with more reasonable sets of candidate components, this paper proposes a fuzzy clustering approach based on tag latent semantic analysis (TLSAF). First, a tag extraction algorithm extracts tags from the description documents of components. Latent semantic analysis (LSA) is then applied to reduce the dimensionality of the tag space and to capture the latent semantic relations among tags. Finally, the components are clustered using fuzzy clustering. Unlike traditional component clustering, which imposes a hard partition, TLSAF allows a component to belong to more than one cluster, and can therefore provide better support for component retrieval. Applying TLSAF to a prototype component repository validates the feasibility and effectiveness of the approach.
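The stages described above (tags, LSA, fuzzy clustering) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the tag extraction algorithm, so the component-by-tag matrix below is a hand-made, hypothetical example. LSA is approximated with a truncated SVD, and fuzzy c-means is assumed as the fuzzy clustering step, since the abstract only says "fuzzy clustering". All function and variable names are illustrative.

```python
import numpy as np


def lsa_reduce(tag_matrix, k):
    """Project components into a k-dimensional latent space via truncated SVD."""
    u, s, _ = np.linalg.svd(tag_matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (membership matrix, cluster centers)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    u = rng.random((points.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        w = u ** m
        # Centers are membership-weighted means of the points.
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return u, centers


# Hypothetical component-by-tag counts (rows: components, columns: tags).
# Assumed tags: sort, search, list, socket, http, tcp.
tag_matrix = np.array([
    [2, 1, 2, 0, 0, 0],   # collection utility
    [1, 2, 1, 0, 0, 0],   # search utility
    [1, 1, 0, 1, 1, 0],   # mixed component
    [0, 0, 0, 2, 1, 2],   # network client
    [0, 0, 0, 1, 2, 1],   # HTTP helper
], dtype=float)

reduced = lsa_reduce(tag_matrix, k=2)
membership, centers = fuzzy_c_means(reduced, n_clusters=2)

# Each row gives one component's degree of membership in every cluster.
print(np.round(membership, 3))
```

Because each row of `membership` is a distribution over clusters rather than a single label, a retrieval step could, for example, include a component in the candidate set of every cluster where its membership exceeds a chosen threshold. That threshold is a design choice not stated in the abstract.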