To address the problem of clustering over uncertain data streams, this paper proposes UDSSC, a subspace clustering algorithm for uncertain data streams. The algorithm uses a sliding-window mechanism to receive newly arrived data and discard stale data. It also introduces a subspace-cluster generation strategy and a new outlier mechanism, and it maintains three buffers that store newly arrived tuples, tuples to be clustered, and outlier tuples, respectively, in order to obtain high-quality clustering results. Experiments show that, compared with algorithms of the same type, UDSSC achieves better clustering quality, lower time complexity, and better scalability.
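As a rough illustration of the sliding-window and three-buffer arrangement described in the abstract, the following is a minimal sketch and not the UDSSC implementation. The tuple layout (timestamp, attribute values, existence probability), the class name `ThreeBufferWindow`, the `window_size` parameter, and the pluggable `is_outlier` test are all assumptions made for illustration. The subspace-cluster generation step is omitted, and tuples are assumed to arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Tuple

# Assumed representation of an uncertain tuple:
# (timestamp, attribute values, existence probability).
UncertainTuple = Tuple[int, Tuple[float, ...], float]


@dataclass
class ThreeBufferWindow:
    """Hypothetical container with one buffer per role named in the abstract."""

    window_size: int                               # assumed: window length in time units
    is_outlier: Callable[[UncertainTuple], bool]   # assumed: caller-supplied outlier test
    incoming: Deque[UncertainTuple] = field(default_factory=deque)
    clustering: Deque[UncertainTuple] = field(default_factory=deque)
    outliers: Deque[UncertainTuple] = field(default_factory=deque)

    def receive(self, item: UncertainTuple) -> None:
        """Store a newly arrived tuple in the incoming buffer."""
        self.incoming.append(item)

    def advance(self, now: int) -> None:
        """Route pending tuples, then drop tuples that fell out of the window."""
        # Move each pending tuple to the outlier or the clustering buffer.
        while self.incoming:
            item = self.incoming.popleft()
            target = self.outliers if self.is_outlier(item) else self.clustering
            target.append(item)

        # Evict stale tuples; relies on arrival in timestamp order.
        cutoff = now - self.window_size
        for buf in (self.clustering, self.outliers):
            while buf and buf[0][0] <= cutoff:
                buf.popleft()
```

In this sketch a clustering routine would read only from `clustering`, while `outliers` keeps low-confidence tuples apart until they expire. Whether UDSSC later promotes outliers back into clusters is not stated in the abstract, so the sketch does not model it.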