Web applications now offer a user experience close to that of traditional desktop applications, and their pages have grown correspondingly complex. This poses a great challenge to the performance of Web browsers, which traditionally process a Web page in a single thread and therefore cannot exploit the computing power of modern multi-processor devices. This paper presents a parallel algorithm for Web page parsing. Unlike existing parallel algorithms for Web page processing, the proposed algorithm follows a data-parallel scheme: it partitions the input data into several parts, processes them in parallel, and merges the partial results into the final result. The algorithm can therefore reuse existing, highly optimized serial Web page processing algorithms and remains compatible with existing Web standards and technologies. Experiments on the WebKit browser engine show that the parallel algorithm makes effective use of multi-core processors and significantly speeds up Web page parsing.
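The partition, process, and merge structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm and not WebKit code: the toy regular-expression tokenizer, the rule of cutting only at a `<` character, and the names `tokenize_serial`, `partition`, and `tokenize_parallel` are all assumptions made for illustration. Cutting at `<` is only safe for simple markup in which `<` never appears inside a tag, comment, or script.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Toy tokenizer (assumption): a token is either a tag or a run of text.
TOKEN_RE = re.compile(r"<[^>]*>|[^<]+")


def tokenize_serial(markup):
    """Stand-in for an existing, optimized serial routine."""
    return TOKEN_RE.findall(markup)


def partition(markup, parts):
    """Split markup into roughly `parts` chunks, cutting only at a '<'.

    Simplification: assumes '<' always starts a tag, so no token
    straddles a cut.
    """
    if not markup:
        return []
    target = max(1, len(markup) // parts)
    chunks, start = [], 0
    while start < len(markup):
        # Look for the next tag start at least `target` characters ahead.
        cut = markup.find("<", start + target)
        if cut == -1:
            cut = len(markup)
        chunks.append(markup[start:cut])
        start = cut
    return chunks


def tokenize_parallel(markup, parts=4):
    """Partition the input, run the serial routine per chunk, merge in order."""
    chunks = partition(markup, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # pool.map returns results in the same order as the input chunks.
        partial_results = list(pool.map(tokenize_serial, chunks))
    # Merge step: concatenate the partial token lists.
    return [token for part in partial_results for token in part]


if __name__ == "__main__":
    page = "<html><body><p>Hello</p><p>World</p></body></html>"
    print(tokenize_parallel(page, parts=3) == tokenize_serial(page))
```

The serial routine is reused unchanged on each chunk, which mirrors the abstract's point about leveraging existing serial algorithms. In CPython the thread pool shows the structure rather than a real speedup, because the global interpreter lock serializes this kind of work; a real parser would also need a merge step that repairs state crossing chunk boundaries, which plain concatenation does not do.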