位置:成果数据库 > 期刊 > 期刊详情页
一种改进的FastMap算法
  • ISSN号:0469-5097
  • 期刊名称:《南京大学学报:自然科学版》
  • 时间:0
  • 分类:TP181[自动化与计算机技术—控制科学与工程;自动化与计算机技术—控制理论与控制工程]
  • 作者机构:中南大学信息科学与工程学院,长沙410083
  • 相关基金:国家自然科学基金(61175064,61273314,91220301)
中文摘要:

FastMap是经典多维标度法(classical multidimensional scaling,CMDS)的一种快速算法,它包含一系列的投影.在每次投影中,两个相距较远的点被选为枢轴点,连接枢轴点得到一个枢轴;然后将各样本投影到枢轴上;最后,修改所有样本间的距离.FastMap的不足在于只能得到CMDS的近似解.对FastMap进行了深入分析,指出FastMap的本质就是把各样本投影到由枢轴确定的一组正交向量上.由于这组向量通常不同于样本集的主轴,使得FastMap只能得到CMDS的近似解;并指出FastMap算法的最大投影次数等于样本集的内在维数.在此理论分析的基础上,提出了一种改进的FastMap算法—iFastMap(improved FastMap)算法.通过对FastMap坐标进行主成分分析,iFastMap得到了与CMDS完全一致的解;此外,从样本集中选取一个内在维数等于整个样本集内在维数的子集,将枢轴点的选取限定在这个子集上,并在每次投影后只修改枢轴点与各样本间的距离,iFastMap的速度得到进一步提高.实验验证了iFastMap与CMDS解的完全一致性及其高效性.

英文摘要:

Classical multidimensional scaling(CMDS)is a very common method for dimensionality reduction,data visualization,machine learning,pattern recognition,etc.When the distance is Euclidean,the CMDS solution can be viewed as the projections of the samples onto the principal axes of the sample set.The problem of CMDS is that its time increases rapidly with the increase of data size.As one of the fast variants of CMDS,the FastMap algorithm has been widely used in many fields.It includes some recursive projections.And each projection consists of three steps.Firstly,apivot is obtained by passing through two far apart points(referred to as pivot points from now on);then the samples are projected onto the pivot and the coordinates of the samples in a low-dimensional Euclidean space are obtained;and finally,the distances between all samples are modified.The shortcoming of the FastMap algorithm is that it can only find approximate solution of CMDS.In this paper,the FastMap algorithm is analyzed in detail.It is found that the essence of the FastMap algorithm is to project the samples onto a set of mutually orthogonal directions determined by the pivots.Since these directions are usually different from the principal axes of the sample set,the Fast-Map algorithm can only get the approximate solution of CMDS.It is also found that the pivots can be selected from a subset provided that its intrinsic dimension is equal to that of the whole sample set.Last but not least,it is found that only the distances between the pivot points and the samples are used to obtain the FastMap coordinates.That is to say,it's unnecessary to modify the distances between all the samples.Based on the theoretical analysis,an improved algorithm called iFastMap(improved FastMap)is put forward in this paper.By introducing principal component analysis,the FastMap coordinates can be aligned with the CMDS coordinates.As a result,the iFastMap algorithm can find exactly the same solution as that of CMDS.In addition,by selecting a subset with the same

同期刊论文项目
同项目期刊论文
期刊信息
  • 《南京大学学报:自然科学版》
  • 中国科技核心期刊
  • 主管单位:中华人民共和国教育部
  • 主办单位:南京大学
  • 主编:龚昌德
  • 地址:南京汉口路22号南京大学(自然科学版)编辑部
  • 邮编:210093
  • 邮箱:xbnse@netra.nju.edu.cn
  • 电话:025-83592704
  • 国际标准刊号:ISSN:0469-5097
  • 国内统一刊号:ISSN:32-1169/N
  • 邮发代号:28-25
  • 获奖情况:
  • 中国自然科学核心期刊,中国期刊方阵“双效”期刊
  • 国内外数据库收录:
  • 美国化学文摘(网络版),美国数学评论(网络版),德国数学文摘,中国中国科技核心期刊,中国北大核心期刊(2004版),中国北大核心期刊(2008版),中国北大核心期刊(2011版),中国北大核心期刊(2014版),中国北大核心期刊(2000版)
  • 被引量:9316