Research Status and Progress of IP Network Performance Measurement

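ISSN: 1000-9825
Journal: Journal of Software (《软件学报》)
Time: 0
Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliations: [1] School of Computer and Information Technology, Shanxi University, Taiyuan 030006; [2] Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education (Shanxi University), Taiyuan 030006
Funding: Key Program of the National Natural Science Foundation of China (61432011); National Natural Science Foundation of China (61573229, 61502289); Shanxi Province Science and Technology Basic Conditions Platform Construction Project (2012091002-0101); Natural Science Foundation of Shanxi Province (201601D202039); Shanxi Province Graduate Education Innovation Project (2016SY002)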

Keywords: clustering ensemble, incomplete data, mixed data, missing value imputation, K-Prototypes clustering algorithm

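Abstract (translated from the Chinese):

Ensemble clustering has attracted considerable attention from researchers because of its good generalization ability. Existing studies focus mainly on ensemble clustering of complete numerical data. In practice, however, data are often mixed, described by both numerical and categorical attributes, and usually contain missing values. To address this, an ensemble clustering algorithm for incomplete mixed data is proposed. First, three missing value imputation methods are used to complete the incomplete mixed data. Next, the K-Prototypes algorithm is run multiple times on each of the three completed data sets to produce base clusterings. Finally, the base clusterings are combined into an ensemble result. The algorithm is compared experimentally with traditional clustering algorithms on real UCI data sets, and the results show that the proposed algorithm is effective.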
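
As a rough illustration of the completion step, the sketch below fills missing entries with the column mean for numerical attributes and the column mode for categorical attributes. This is only one simple filling strategy chosen for illustration: the abstract does not name the three methods the paper uses. The function name `fill_missing`, the use of `None` to mark a missing value, and the sample rows are all assumptions.

```python
from collections import Counter
from statistics import mean


def fill_missing(rows, numeric_cols):
    """Fill None entries: column mean for numeric columns, column mode otherwise.

    Illustrative only; not one of the paper's (unnamed) filling methods.
    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        if j in numeric_cols:
            fills.append(mean(observed))
        else:
            # most_common(1) returns [(value, count)] for the most frequent value
            fills.append(Counter(observed).most_common(1)[0][0])
    return [[fills[j] if v is None else v for j, v in enumerate(r)] for r in rows]


# Hypothetical mixed data: column 0 is numerical, column 1 is categorical.
rows = [
    [1.0, "red"],
    [None, "red"],
    [3.0, None],
    [5.0, "blue"],
]
completed = fill_missing(rows, numeric_cols={0})
```

Running several different filling strategies of this kind yields several completed data sets, which is what gives the later ensemble step diverse base clusterings to combine.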

English abstract:

Cluster ensembles have recently emerged as a powerful clustering analysis technique and have attracted considerable attention from researchers because of their good generalization ability. Existing work shows great promise, but most of it produces final results only for complete data sets with numerical attributes. However, real-life data sets are often incomplete mixed data, described by both numerical and categorical attributes, and existing algorithms are not very effective on such data. To overcome this deficiency, this paper proposes a new clustering ensemble algorithm that produces a final clustering result for incomplete data with mixed numerical and categorical attributes. First, the algorithm completes the incomplete mixed data using three different missing value filling methods. Then, a set of clustering solutions is produced by running the K-Prototypes clustering algorithm multiple times on each of the three resulting complete data sets. Next, a similarity matrix is constructed from all the clustering solutions. Finally, the clustering result is obtained by applying hierarchical clustering to the similarity matrix. The effectiveness of the proposed algorithm is demonstrated empirically on several real UCI data sets using three benchmark evaluation measures. The experimental results show that the proposed algorithm achieves higher clustering quality than several traditional clustering algorithms.
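
The similarity matrix and hierarchical clustering steps can be sketched as follows, under stated assumptions. The similarity between two objects is taken to be the fraction of base clusterings that place them in the same cluster (a co-association matrix), and the hierarchical step uses average linkage. The abstract specifies neither choice, and the function names and toy base clusterings are hypothetical.

```python
def co_association(base_labelings):
    """Similarity of objects i and j = share of base clusterings that group them together."""
    n_runs = len(base_labelings)
    n = len(base_labelings[0])
    return [
        [sum(run[i] == run[j] for run in base_labelings) / n_runs for j in range(n)]
        for i in range(n)
    ]


def consensus_clusters(base_labelings, n_clusters):
    """Naive average-linkage agglomeration on the co-association matrix (assumed linkage)."""
    sim = co_association(base_labelings)
    clusters = [[i] for i in range(len(sim))]

    def avg_sim(a, b):
        # Mean pairwise similarity between two groups of object indices
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the highest average similarity
        p, q = max(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: avg_sim(clusters[pq[0]], clusters[pq[1]]),
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return sorted(sorted(c) for c in clusters)


# Three hypothetical base clusterings of six objects. Label values are arbitrary
# per run, so only co-membership matters.
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
final = consensus_clusters(base, n_clusters=2)
```

On this toy input the result groups objects 0 to 2 together and objects 3 to 5 together: the third base clustering disagrees about object 2, but the other two outvote it. Working from co-membership rather than raw labels avoids having to match cluster labels across runs.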

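Projects with papers in the same journal

Research on Spatial Data Discretization Algorithms Based on Spatial Correlation

Journal papers: 9

Research on Robust Clustering Models and Algorithms for Multi-Source Big Data

Journal papers: 1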

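Journal papers from the same project

Human Activity Recognition Method Based on a Multi-Learner Co-Training Model

Mixed Integer Programming Problems Based on an Improved Bat Algorithm

Improved Bat Algorithm Based on Secondary Flight and Random Perturbation

Hierarchical Multi-Population Adaptive Particle Swarm Optimization Algorithm

Chaotic Particle Swarm Optimization Algorithm with Dynamically Adjusted Inertia Weight

Human Activity Recognition Algorithm Based on Selective Ensemble Rotation Forest

Research on Smartphone Anti-Theft Software Based on Mobile Sensing

A Continuous Attribute Discretization Algorithm Combining Binary Ant Colony and Rough Sets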

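Journal information

Journal of Software (《软件学报》)
PKU Core Journal (2011 edition)

Supervising body: Chinese Academy of Sciences
Sponsors: Institute of Software, Chinese Academy of Sciences; China Computer Federation
Editor-in-chief: Zhao Chen (赵琛)
Address: P.O. Box 8718, Beijing, Institute of Software, Chinese Academy of Sciences
Postal code: 100190
Email: jos@iscas.ac.cn
Telephone: 010-62562563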

International standard serial number: ISSN 1000-9825
Domestic unified serial number: CN 11-2560/TP
Postal distribution code: 82-367

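Awards:
Selected as a "Double Hundred Journal" of the China Journal Array (中国期刊方阵) in 2001; awarded First Prize for Outstanding Scientific and Technical Journals of the Chinese Academy of Sciences in 2000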

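Indexed in domestic and international databases:
Referativnyi Zhurnal (Russia), Mathematical Reviews, online edition (USA), Index Copernicus (Poland), Zentralblatt MATH (Germany), Scopus (Netherlands), Engineering Index (USA), Cambridge Scientific Abstracts (USA), Science Abstracts database (UK), Japan Science and Technology Agency database (Japan), China Science and Technology Core Journals (China), PKU Core Journals (China; 2000, 2004, 2008, 2011, and 2014 editions)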

Times cited: 54609