The execution efficiency of the linear-time selection algorithm degrades as the number of elements increases. To address this shortcoming, a parallel linear-time selection algorithm under the MapReduce model is proposed. The selection algorithm is redesigned to fit the MapReduce programming model, which takes data in key/value form as its input. Locally optimal solutions are computed in parallel, then aggregated to compute the globally optimal solution. Experimental results show that, on large data sets, the improved parallel linear-time selection algorithm under the MapReduce model achieves high execution efficiency, and that its efficiency increases with the degree of parallelism.
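The local-then-global scheme summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the selection problem is "find the k-th smallest element", that each map task emits the k smallest values of its partition under a single shared key, and that one reduce task selects the k-th smallest among the merged candidates. The names `select`, `map_partition`, `reduce_candidates`, and `parallel_select` are hypothetical, and the map and reduce phases are simulated sequentially in one process rather than run on a MapReduce cluster.

```python
from typing import Iterable, List, Tuple


def select(values: List[int], k: int) -> int:
    """Return the k-th smallest value (1-indexed) using median-of-medians."""
    if not 1 <= k <= len(values):
        raise ValueError("k out of range")
    if len(values) <= 5:
        return sorted(values)[k - 1]
    # Take the median of each group of five, then use the median of
    # those medians as the pivot.
    groups = [values[i:i + 5] for i in range(0, len(values), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    pivot = select(medians, (len(medians) + 1) // 2)
    lows = [v for v in values if v < pivot]
    pivots = [v for v in values if v == pivot]
    highs = [v for v in values if v > pivot]
    if k <= len(lows):
        return select(lows, k)
    if k <= len(lows) + len(pivots):
        return pivot
    return select(highs, k - len(lows) - len(pivots))


def map_partition(partition: List[int], k: int) -> Iterable[Tuple[str, int]]:
    """Map phase (assumed design): emit the k smallest values of a partition."""
    if not partition:
        return
    if len(partition) <= k:
        candidates = partition
    else:
        threshold = select(partition, k)
        smaller = [v for v in partition if v < threshold]
        # Pad with copies of the threshold so exactly k values are emitted.
        candidates = smaller + [threshold] * (k - len(smaller))
    for v in candidates:
        yield ("candidates", v)  # one shared key routes all values to one reducer


def reduce_candidates(pairs: Iterable[Tuple[str, int]], k: int) -> int:
    """Reduce phase (assumed design): select the k-th smallest merged candidate."""
    merged = [v for _, v in pairs]
    return select(merged, k)


def parallel_select(data: List[int], k: int, num_partitions: int) -> int:
    """Sequential simulation of the map and reduce phases."""
    partitions = [data[i::num_partitions] for i in range(num_partitions)]
    pairs = [pair for part in partitions for pair in map_partition(part, k)]
    return reduce_candidates(pairs, k)
```

For example, `parallel_select([9, 1, 8, 2, 7, 3, 6, 4, 5, 0], 3, 3)` returns `2`, the third-smallest value. The sketch relies on the fact that the global k-th smallest element always lies among the union of each partition's k smallest elements, so the reducer handles at most k values per partition instead of the full input.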