Many machine learning methods require the target attributes to be discrete, whereas many attributes encountered in practice are continuous. A shortcoming of existing continuous-attribute quantization algorithms is that the original cut points may no longer be optimal once a new object is added to the decision table. Based on principal component analysis (PCA), fuzzy c-means clustering, and the concept of the degree of incompatibility, a quantization algorithm for the target continuous attributes of a decision table is proposed. The algorithm treats different condition attributes differently during quantization, uses the degree of incompatibility of the decision table as the criterion for terminating quantization, and keeps the number of intervals as small as possible while minimizing the information loss of the decision table.
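The termination criterion can be illustrated with a minimal sketch. The abstract does not define the degree of incompatibility, so the code assumes a common rough-set reading: the fraction of objects whose quantized condition values coincide with those of another object that carries a different decision. Equal-width binning stands in for the fuzzy c-means cut points, and PCA weighting of the condition attributes is omitted. The names `inconsistency_degree`, `equal_width_codes`, `quantize_until_consistent`, `max_intervals`, and `tolerance` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def inconsistency_degree(rows, decisions):
    """Fraction of objects lying in a condition class with more than one decision.

    Assumed definition; the abstract does not give the paper's own formula.
    """
    seen_decisions = defaultdict(set)
    counts = defaultdict(int)
    for cond, dec in zip(rows, decisions):
        key = tuple(cond)
        seen_decisions[key].add(dec)
        counts[key] += 1
    total = len(decisions)
    if total == 0:
        return 0.0
    conflicting = sum(n for key, n in counts.items()
                      if len(seen_decisions[key]) > 1)
    return conflicting / total


def equal_width_codes(values, k):
    """Map each value to one of k equal-width intervals (stand-in for fuzzy c-means)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / k
    # Clamp so that the maximum value falls in the last interval.
    return [min(int((v - lo) / width), k - 1) for v in values]


def quantize_until_consistent(values, decisions, max_intervals=10, tolerance=0.0):
    """Increase the interval count until the inconsistency degree is within tolerance."""
    if max_intervals < 1:
        raise ValueError("max_intervals must be at least 1")
    for k in range(1, max_intervals + 1):
        codes = equal_width_codes(values, k)
        degree = inconsistency_degree([[c] for c in codes], decisions)
        if degree <= tolerance:
            break
    return k, codes, degree


if __name__ == "__main__":
    values = [0.1, 0.2, 0.8, 0.9]
    decisions = [0, 0, 1, 1]
    print(quantize_until_consistent(values, decisions))
```

Starting from one interval and growing only when the table is still inconsistent reflects the stated goal of keeping the number of intervals small; how the paper actually searches for the interval count is not specified in the abstract.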