This paper defines a large-scale complex dataset as one characterized by a very large number of sample points, a large number of indicators (attributes) describing each object, spatiotemporal dynamics, and a high level of noise. To meet the requirements of mining such datasets, and drawing on the data reduction ideas and methods of statistical analysis, rough set theory, and fuzzy set theory, an efficient reduction method for large-scale complex datasets is proposed, based on fuzzy clustering of samples and rough-set attribute reduction.
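As a rough illustration of the two reduction steps named above, the following minimal sketch first compresses the samples with a textbook fuzzy c-means and then selects attributes with a greedy, dependency-based rough-set reduct. It is not the paper's algorithm: the function names (`memberships`, `fuzzy_c_means`, `quick_reduct`), the deterministic initialization, the fuzzifier `m = 2`, the greedy search strategy, and the toy data are all assumptions made for illustration, and the attribute step assumes already-discretized attributes plus a decision label.

```python
def memberships(points, centers, m=2.0):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    U = []
    for p in points:
        # Euclidean distance to each center, clamped to avoid division by zero.
        d = [max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
             for ctr in centers]
        U.append([1.0 / sum((di / dk) ** (2.0 / (m - 1.0)) for dk in d)
                  for di in d])
    return U


def fuzzy_c_means(points, c, m=2.0, iters=50):
    """Sample reduction (assumed form): represent n points by c fuzzy centers."""
    n, dims = len(points), len(points[0])
    # Assumption: deterministic start from evenly spaced samples.
    centers = [tuple(points[i * n // c]) for i in range(c)]
    for _ in range(iters):
        U = memberships(points, centers, m)
        centers = []
        for i in range(c):
            w = [U[j][i] ** m for j in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[j] * points[j][k] for j in range(n)) / total
                for k in range(dims)))
    return centers, memberships(points, centers, m)


def dependency(rows, decisions, attrs):
    """Rough-set dependency degree: share of objects in the positive region."""
    blocks = {}
    for idx, row in enumerate(rows):
        blocks.setdefault(tuple(row[a] for a in attrs), []).append(idx)
    consistent = sum(len(b) for b in blocks.values()
                     if len({decisions[i] for i in b}) == 1)
    return consistent / len(rows)


def quick_reduct(rows, decisions):
    """Attribute reduction (assumed form): greedily add attributes until the
    dependency degree matches that of the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, decisions, all_attrs)
    reduct = []
    while dependency(rows, decisions, reduct) < target:
        best = max((a for a in all_attrs if a not in reduct),
                   key=lambda a: dependency(rows, decisions, reduct + [a]))
        reduct.append(best)
    return sorted(reduct)


# Toy data (hypothetical): two well-separated groups of samples.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, U = fuzzy_c_means(points, c=2)

# Toy decision table (hypothetical): only attribute 0 determines the decision.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1)]
decisions = [0, 0, 1, 1]
print(quick_reduct(rows, decisions))  # [0]
```

In this sketch the cluster centers stand in for the original samples, and the reduct lists the attribute indices that preserve the classification ability of the full attribute set. A real pipeline would also need a discretization step between the two stages, which is omitted here.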