Shortest path query is a classical problem in graph theory and is widely used in real-world applications. However, as graphs grow ever larger, traditional single-machine query algorithms can no longer meet the processing demands of large-scale graphs. To address this problem, a Hadoop-based shortest path query method for large-scale graphs (the D-CH method) is proposed. First, a large-scale graph stored in the Hadoop Distributed File System (HDFS) is partitioned with a classical graph partitioning algorithm (the CNM algorithm), yielding labeled partition results suited to the subsequent algorithms. Queries are then divided into queries within a subgraph and queries across subgraphs, and a corresponding parallel query processing algorithm based on the MapReduce programming model is given for each of the two cases. Experimental results show that the D-CH method achieves good execution efficiency for shortest path queries on large-scale graphs.
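The distinction between queries within a subgraph and queries across subgraphs can be illustrated with a small single-machine sketch. It is not the D-CH algorithm itself, which the abstract does not specify in detail. The helper names (`build_adjacency`, `classify_query`, `boundary_vertices`, `dijkstra`), the toy graph, and the hand-assigned partition labels are all assumptions made for illustration, and plain Dijkstra on the whole graph stands in for the MapReduce-based query processing.

```python
import heapq
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected weighted adjacency list from (u, v, weight) triples."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def classify_query(label, s, t):
    """Return 'intra' if both endpoints carry the same partition label, else 'inter'."""
    return "intra" if label[s] == label[t] else "inter"


def boundary_vertices(adj, label):
    """Vertices with at least one neighbour in a different partition."""
    return {u for u, nbrs in adj.items() for v, _ in nbrs if label[u] != label[v]}


def dijkstra(adj, source, target):
    """Reference shortest-path distance on the whole graph (single machine)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


# Toy graph; the labels stand in for the output of a partitioning step
# (hypothetical values, not produced by CNM here).
edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 1), ("d", "e", 3)]
label = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}
adj = build_adjacency(edges)

print(classify_query(label, "a", "b"))  # both endpoints in partition 0
print(classify_query(label, "a", "e"))  # endpoints in different partitions
print(sorted(boundary_vertices(adj, label)))
print(dijkstra(adj, "a", "e"))
```

In this sketch any path between partitions must pass through a boundary vertex, which is why labeling the partition results matters for the cross-subgraph case. Note also that even when both endpoints lie in the same subgraph, the globally shortest path may leave that subgraph, so a complete method has to account for this.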