Most existing clustering algorithms for high-attribute-dimensional sparse data handle only binary data and offer no method for evaluating clustering results, which severely limits their application. To address these problems, a novel high-attribute-dimensional sparse clustering algorithm based on knowledge granularity is proposed. First, a semi-fuzzy clustering algorithm designed for sparse features is presented to discretize the data, and the sparse similarity and the initial equivalence relation are defined on that basis. Then, a variable-precision secondary clustering model is established to refine the initial clustering results, which gives the algorithm strong noise resistance. Finally, an application-oriented evaluation model of clustering quality is defined. Experimental results show that the proposed algorithm provides analysis results at multiple granularities, achieves higher accuracy, and yields clusters that faithfully reflect the characteristics of the data.
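As a rough illustration of the equivalence-relation idea the abstract builds on, the following minimal sketch partitions already-discretized objects into equivalence classes and computes a knowledge granularity value. It assumes the common rough-set definition GK = Σ|X_i|² / |U|², where the X_i are the equivalence classes and U is the set of all objects. The abstract does not give the paper's own formulas, so the function names `equivalence_classes` and `knowledge_granularity`, the toy data, and the choice of "identical attribute tuples" as the equivalence relation are all hypothetical and are not the authors' sparse similarity or clustering model.

```python
from collections import defaultdict
from fractions import Fraction


def equivalence_classes(rows):
    """Group object indices whose discretized attribute tuples are identical.

    Assumption: two objects are equivalent when all attributes match exactly.
    The paper's sparse similarity is not specified in the abstract.
    """
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[tuple(row)].append(index)
    return list(groups.values())


def knowledge_granularity(classes, universe_size):
    """Return sum(|X_i|^2) / |U|^2 as an exact fraction.

    Assumed textbook definition: 1 for a single class holding every object
    (coarsest), 1/|U| when every object is its own class (finest).
    """
    return Fraction(sum(len(c) ** 2 for c in classes), universe_size ** 2)


# Toy discretized data: 4 objects, 3 attributes (hypothetical values).
data = [
    [0, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]

classes = equivalence_classes(data)
gk = knowledge_granularity(classes, len(data))
print(classes, gk)
```

Under this assumed definition, a smaller value corresponds to a finer partition, which is one way to read the "multiple granularities" the abstract mentions.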