To address the low efficiency of traditional serial retrieval when processing and querying massive data, this paper proposes a distributed parallel retrieval strategy based on JPPF. JPPF is a powerful Java-based parallel processing framework whose parallel environment is easy to set up and practical to use. After analyzing the framework architecture and distributed workflow of JPPF, we design and implement a JPPF-based retrieval system that exploits the framework's strengths in execution queue management and load balancing. Taking database queries as an example, a comparative experiment measures the efficiency of serial retrieval against JPPF parallel retrieval. The results show that, when the data scale is large, the JPPF parallel approach significantly improves retrieval efficiency over the serial approach.
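
The contrast between serial and partitioned parallel retrieval can be illustrated with a minimal sketch. It is not the system described here and does not use the JPPF API: a local `ExecutorService` thread pool stands in for JPPF nodes, an in-memory list of strings stands in for the database, and the names `ParallelRetrievalSketch`, `serialSearch`, and `parallelSearch` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: a local thread pool is an assumed stand-in for JPPF nodes.
class ParallelRetrievalSketch {

    // Serial baseline: scan every record in order and collect matching indices.
    static List<Integer> serialSearch(List<String> records, String keyword) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            if (records.get(i).contains(keyword)) {
                hits.add(i);
            }
        }
        return hits;
    }

    // Parallel variant: split the records into contiguous partitions,
    // search each partition in its own task, then merge in partition order.
    static List<Integer> parallelSearch(List<String> records, String keyword, int workers)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = Math.max(1, (records.size() + workers - 1) / workers);
            List<Future<List<Integer>>> futures = new ArrayList<>();
            for (int start = 0; start < records.size(); start += chunk) {
                final int from = start;
                final int to = Math.min(records.size(), start + chunk);
                Callable<List<Integer>> task = () -> {
                    List<Integer> hits = new ArrayList<>();
                    for (int i = from; i < to; i++) {
                        if (records.get(i).contains(keyword)) {
                            hits.add(i);
                        }
                    }
                    return hits;
                };
                futures.add(pool.submit(task));
            }
            // Futures are read in submission order, so the merged indices stay sorted.
            List<Integer> merged = new ArrayList<>();
            for (Future<List<Integer>> f : futures) {
                merged.addAll(f.get());
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            records.add("record-" + i);
        }
        List<Integer> serial = serialSearch(records, "999");
        List<Integer> parallel = parallelSearch(records, "999", 4);
        // Both strategies should return the same hits; only the elapsed time differs.
        System.out.println(serial.equals(parallel));
    }
}
```

In a JPPF deployment, each partition would presumably be wrapped in a task and submitted as part of a job, with the driver handling queueing and load balancing across nodes. As the abstract notes, the parallel approach pays off mainly at large data scales, where the cost of distributing tasks is small relative to the scan itself.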