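Research on a Clustering-Based Data Partitioning Method for Min-Max Modular Networks
  • ISSN: 0469-5097
  • Journal: Journal of Nanjing University (Natural Sciences) (南京大学学报(自然科学版))
  • Date: 2012.2.2
  • Pages: 133-139
  • Classification: TP311.132 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology]
  • Author affiliations: [1] College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003; [2] Institute of Computer Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003
  • Funding: National Natural Science Foundation of China (61073114), Natural Science Foundation of the Jiangsu Higher Education Institutions (08KJB520008), Nanjing University of Posts and Telecommunications Climbing Program (NY210010)
  • Related project: Research on Energy-Based-Learning Feature Selection Methods and Their Applications
Authors: Xie Xiaomin (解晓敏), Li Yun (李云)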
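Chinese abstract (translated):

A key problem in using min-max modular networks for pattern classification is finding an effective, low-complexity method for partitioning the training samples, so as to shorten training time and obtain relatively balanced subsets. This paper proposes a new training-set partitioning method based on bisecting K-means, which can obtain a globally optimal solution, has low time complexity, and yields relatively balanced sample partitions through hierarchical clustering. Experiments on real-world data sets show that the partitioning method effectively shortens the training time of min-max modular networks without reducing classification accuracy.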

English abstract:

For small data sets, many machine learning algorithms, such as neural networks, the naive Bayes classifier, decision trees, and support vector machines, achieve very good performance. For large-scale problems, however, the performance of these algorithms is unsatisfactory, and ensemble learning is commonly used instead. The min-max modular support vector machine (M3-SVM) is an effective ensemble learning method that has been applied successfully in many areas of pattern classification. One of the key problems of M3-SVM is finding an effective, low-complexity method for partitioning the training samples, so as to shorten the training time and obtain relatively balanced training subsets. Traditional K-means clustering has the advantages of simplicity and low time complexity, but it is sensitive to the choice of initial points. Its criterion function is generally optimized by a gradient method, and the gradient search follows the direction of decreasing energy, so the result is often a local optimum rather than the global one. This paper presents a new partitioning method based on bisecting K-means. Bisecting clustering is, strictly speaking, a form of hierarchical clustering, and hierarchical clustering builds a tree structure that contains the information of all levels as well as the similarity within and between clusters. The bisecting K-means algorithm can therefore obtain a globally optimal solution while its time complexity remains low. Furthermore, it can obtain relatively balanced training subsets by means of hierarchical clustering. Experimental results on real-world data sets show that this partitioning method achieves a good trade-off between training time and classification accuracy.
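The abstract does not give the details of the authors' algorithm, so the following is only a minimal sketch of generic bisecting K-means used to partition a training set, written under stated assumptions. The function names (`two_means`, `bisecting_kmeans`), the rule of always splitting the largest cluster (chosen here because it favors balanced subsets), and the number of trial splits are all illustrative choices, not taken from the paper. The sketch also makes no claim of reaching a global optimum: it simply keeps the trial split with the lowest sum of squared errors.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sse(cluster):
    """Sum of squared distances from each point to the cluster mean."""
    c = mean(cluster)
    return sum(sq_dist(p, c) for p in cluster)


def two_means(points, rng, n_iter=20):
    """Split `points` (at least 2) into two groups with plain 2-means."""
    c0, c1 = rng.sample(points, 2)
    left, right = [], []
    for _ in range(n_iter):
        left, right = [], []
        for p in points:
            (left if sq_dist(p, c0) <= sq_dist(p, c1) else right).append(p)
        if not left or not right:
            # Degenerate case (e.g. duplicate points): fall back to halving.
            half = len(points) // 2
            return points[:half], points[half:]
        new_c0, new_c1 = mean(left), mean(right)
        if new_c0 == c0 and new_c1 == c1:
            break  # assignments have stabilized
        c0, c1 = new_c0, new_c1
    return left, right


def bisecting_kmeans(points, k, seed=0, n_trials=5):
    """Partition `points` into up to `k` subsets by repeated bisection.

    Assumption: the largest cluster is split at each step, which tends to
    keep subset sizes balanced. Other selection rules (e.g. largest SSE)
    are equally common.
    """
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        idx = max(range(len(clusters)), key=lambda i: len(clusters[i]))
        target = clusters.pop(idx)
        if len(target) < 2:
            clusters.append(target)  # nothing left to split
            break
        best = None
        for _ in range(n_trials):
            left, right = two_means(target, rng)
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, left, right)
        clusters.extend(best[1:])
    return clusters
```

In an M3-style setup, one would call `bisecting_kmeans` on the training samples of each class and train one base classifier per pair of resulting subsets. That wiring is omitted here because the abstract does not describe it.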

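Journal information
  • Journal of Nanjing University (Natural Sciences) (《南京大学学报:自然科学版》)
  • China Science and Technology Core Journal
  • Supervising body: Ministry of Education of the People's Republic of China
  • Sponsor: Nanjing University
  • Editor-in-chief: Gong Changde (龚昌德)
  • Address: Editorial Office of the Journal of Nanjing University (Natural Sciences), 22 Hankou Road, Nanjing
  • Postal code: 210093
  • Email: xbnse@netra.nju.edu.cn
  • Telephone: 025-83592704
  • International standard serial number: ISSN 0469-5097
  • Domestic serial number: CN 32-1169/N
  • Postal distribution code: 28-25
  • Awards:
  • Chinese Core Journal of Natural Sciences; "Double-Effect" journal of the China Periodical Matrix
  • Indexed in domestic and international databases:
  • Chemical Abstracts (USA, online), Mathematical Reviews (USA, online), Zentralblatt MATH (Germany), China Science and Technology Core Journals, Peking University Core Journals of China (2000, 2004, 2008, 2011, and 2014 editions)
  • Citations: 9316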