This paper considers dimensionality reduction when some domain knowledge about the data is available. Here, domain knowledge means supervision information beyond the data themselves, such as class labels or, more weakly, pairwise similarity and dissimilarity constraints between samples. Pairwise constraints can be generated from class labels, but class labels cannot be recovered from constraints, so constraints are the more general form of supervision. In real-world applications such as image retrieval, pairwise constraints are also much easier to obtain than labels. The paper therefore focuses on constraint-based dimensionality reduction. It proposes a simple algorithm, constraint preserving embedding (COPE), which effectively uses pairwise constraints to obtain a better embedding. COPE is formulated within a unified spectral graph embedding framework, and its relationship to existing related methods is indicated. COPE is further extended to a semi-supervised version, which incorporates unlabeled data, and to a kernel version, which captures nonlinear structure in the data. The algorithms are evaluated in a series of experiments on face recognition, image retrieval, and semi-supervised clustering, and the results show that they are effective at learning from pairwise constraints.
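The claim that constraints follow from labels, but not the reverse, can be illustrated with a minimal sketch. This is not the COPE algorithm itself, which the abstract does not specify. The function name `constraints_from_labels` and the toy label lists are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def constraints_from_labels(labels):
    """Derive pairwise constraints from class labels (illustrative helper).

    Returns two sets of index pairs (i, j) with i < j:
    similarity constraints (same label) and dissimilarity
    constraints (different labels).
    """
    similar, dissimilar = set(), set()
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            similar.add((i, j))
        else:
            dissimilar.add((i, j))
    return similar, dissimilar


# Two labelings that group the samples identically but name the classes differently.
labels_a = ["cat", "cat", "dog", "dog"]
labels_b = ["x", "x", "y", "y"]

sim_a, dis_a = constraints_from_labels(labels_a)
sim_b, dis_b = constraints_from_labels(labels_b)

print(sorted(sim_a))  # [(0, 1), (2, 3)]
print(sorted(dis_a))  # [(0, 2), (0, 3), (1, 2), (1, 3)]

# Both labelings yield the same constraints, so the constraints alone
# cannot tell us which class names were used.
print(sim_a == sim_b and dis_a == dis_b)  # True
```

Because distinct labelings map to the same constraint sets, the mapping from labels to constraints cannot be inverted, which is the sense in which constraints are the weaker, more general form of supervision.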