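As a nonlinear extension of the classical MDS algorithm, ISOMAP can effectively visualize low-dimensional nonlinear manifolds embedded in a high-dimensional Euclidean space. However, ISOMAP requires that the data be well sampled and lie on a single manifold, and it also depends on a neighborhood size that is difficult to choose well, which greatly limits its practical use. An improved algorithm, GISOMAP, is therefore proposed. It uses a variant of the MDS algorithm to weaken the influence of long geodesic distances and "shortcut" edges on distance preservation. As a result, it visualizes data with a multi-cluster structure better and is no longer sensitive to the neighborhood size, so it is easier to apply in practice.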
As a nonlinear extension of the classical MDS algorithm, ISOMAP is well suited to visualizing low-dimensional nonlinear manifolds embedded in high-dimensional spaces. However, ISOMAP requires that the data belong to a single well-sampled cluster. When the data consist of multiple clusters, long geodesic distances may be poorly approximated by the corresponding shortest-path lengths, which makes the classical MDS algorithm used in ISOMAP unsuitable. In addition, the success of ISOMAP depends heavily on choosing a suitable neighborhood size, which is difficult to do efficiently. When the neighborhood size is unsuitable, shortcut edges are introduced into the neighborhood graph, so that the graph no longer represents the correct neighborhood structure of the data. To address these problems, a new variant of ISOMAP, GISOMAP, is presented. It uses a special case of MDS to reduce, to a certain extent, the influence of long geodesic distances and shortcut edges on distance preservation. Consequently, GISOMAP visualizes data consisting of multiple clusters better than ISOMAP and is less sensitive to the neighborhood size, which makes it easier to apply in practice. Experimental results confirm the feasibility of GISOMAP.
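The pipeline described above (neighborhood graph, shortest-path geodesic estimates, then an MDS step that de-emphasizes long distances) can be illustrated with a minimal sketch. The abstract does not state which special case of MDS GISOMAP uses, so this sketch assumes Sammon-style weights of 1/distance as one plausible choice. All function names, the weighting, and the simple backtracking gradient descent are illustrative assumptions, not the authors' method. It also assumes distinct points and a connected neighborhood graph.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path


def geodesic_distances(X, k):
    """Approximate geodesic distances by shortest paths in a k-NN graph."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    n = len(X)
    W = np.zeros((n, n))  # in a dense csgraph input, zero means "no edge"
    nearest = np.argsort(D, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), k)
    W[rows, nearest.ravel()] = D[rows, nearest.ravel()]
    return shortest_path(W, method="D", directed=False)


def classical_mds(G, dim=2):
    """Classical MDS, the embedding step used by plain ISOMAP."""
    n = len(G)
    J = np.eye(n) - 1.0 / n  # centering matrix
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:dim]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))


def inverse_distance_weights(G):
    """ASSUMPTION: Sammon-style weights 1/G, so long distances count less."""
    Wt = np.zeros_like(G)
    off = ~np.eye(len(G), dtype=bool)
    Wt[off] = 1.0 / G[off]
    return Wt


def weighted_stress(Y, G, Wt):
    """Weighted squared mismatch between embedded and geodesic distances."""
    d = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    return np.sum(Wt * (d - G) ** 2) / 2.0  # each pair appears twice


def refine(Y, G, Wt, iters=200, step=0.1):
    """Gradient descent on the weighted stress; a step is kept only if it helps."""
    Y = Y.copy()
    best = weighted_stress(Y, G, Wt)
    for _ in range(iters):
        diff = Y[:, None, :] - Y[None, :, :]
        d = np.maximum(np.linalg.norm(diff, axis=-1), 1e-12)  # avoid division by zero
        coef = Wt * (d - G) / d
        grad = 2.0 * np.sum(coef[:, :, None] * diff, axis=1)
        candidate = Y - step * grad
        stress = weighted_stress(candidate, G, Wt)
        if stress < best:
            Y, best = candidate, stress
        else:
            step *= 0.5  # backtrack when the step overshoots
    return Y
```

A typical call sequence would be `G = geodesic_distances(X, k)`, `Y0 = classical_mds(G)`, then `Y = refine(Y0, G, inverse_distance_weights(G))`. Starting from the classical MDS solution means the refinement can only lower the weighted stress relative to plain ISOMAP under this particular weighting. Whether that matches the behavior reported for GISOMAP depends on the actual variant used in the paper.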