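To address the performance problems of the centralized kernel-based extreme learning machine (ELM) on massive data, the kernel-based ELM is extended to a cloud computing framework, and MR-KELM, a distributed kernel-based ELM built on MapReduce, is proposed. The algorithm performs a distributed decomposition of the kernel matrix computed by a distributed radial basis function kernel and obtains the classifier's output weights through distributed matrix-vector multiplication, which reduces the cost of network communication and data exchange. Experimental results show that MR-KELM achieves good scalability and classification training performance without altering the computational theory of the kernel-based ELM.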
As the volume of training data grows exponentially, the performance of the centralized extreme learning machine (ELM) with kernels degrades because of the large matrix operations involved. A distributed algorithm, MapReduce-based kernelized ELM (MR-KELM), is proposed, which implements ELM with kernels on MapReduce in the cloud. MR-KELM decomposes the kernel matrix generated by a distributed radial basis function computation and then calculates the output weights by distributed matrix-vector multiplication. This design reduces communication and data exchange in the distributed matrix operations and achieves good scalability. Extensive experiments on synthetic datasets were conducted to verify the training performance and scalability of MR-KELM. The results show that MR-KELM is effective and efficient for massive learning applications.
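The pipeline summarized above (kernel matrix, decomposition, output weights) can be illustrated with a small single-process sketch. It rests on several assumptions that the abstract does not confirm: the standard kernel ELM closed form β = (I/C + Ω)⁻¹T, a Cholesky factorization as the matrix decomposition, and ordinary Python lists standing in for MapReduce workers. The names `map_kernel_rows`, `train_kelm`, `predict`, and the parameters `C`, `gamma`, and `n_blocks` are hypothetical and chosen only for illustration; this is not the MR-KELM implementation.

```python
import math

def rbf(x, y, gamma):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def map_kernel_rows(block, X, gamma):
    """'Map' step: one simulated worker emits (row_index, kernel_row) pairs."""
    return [(i, [rbf(X[i], xj, gamma) for xj in X]) for i in block]

def cholesky(A):
    """Lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward and then backward substitution."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def train_kelm(X, t, C=100.0, gamma=1.0, n_blocks=2):
    """Return output weights beta solving (I/C + Omega) beta = t."""
    n = len(X)
    # Partition row indices across simulated workers (assumed scheme).
    blocks = [list(range(s, n, n_blocks)) for s in range(n_blocks)]
    emitted = [pair for blk in blocks for pair in map_kernel_rows(blk, X, gamma)]
    # 'Reduce' step: assemble the rows and add the regularization term I/C.
    A = [None] * n
    for i, row in emitted:
        row = list(row)
        row[i] += 1.0 / C
        A[i] = row
    return solve_cholesky(cholesky(A), t)

def predict(x, X, beta, gamma=1.0):
    """Classifier output f(x) = sum_i beta_i * K(x, x_i)."""
    return sum(b * rbf(x, xi, gamma) for b, xi in zip(beta, X))

# Toy XOR-style problem with labels in {-1, +1}.
X_train = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
t_train = [-1.0, 1.0, 1.0, -1.0]
beta = train_kelm(X_train, t_train)
```

In this sketch the kernel rows are independent of one another, which is what makes the kernel computation straightforward to split across workers. The factorization and the triangular solves are performed centrally here; distributing them is the part that the proposed algorithm addresses.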