Image data is an important component of big data and carries a wealth of information. Image classification has a wide range of applications, but traditional classification methods can no longer meet the demands of real-time computation. To address this problem, we propose a parallel online extreme learning machine algorithm. First, the hidden-layer output weight matrix is derived from online extreme learning machine theory. Second, this matrix is partitioned according to the characteristics of the MapReduce computing framework, so that the original large-scale matrix multiplication is replaced by computations on smaller matrix blocks that run in parallel on different worker nodes. Finally, the results from the compute nodes are merged by key to obtain the final classifier. The algorithm is extended to the MapReduce framework while preserving the accuracy of the original computation. Classification experiments on large-scale image data, using face images as an example, show that the algorithm classifies big image data quickly and accurately.
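The partition-and-merge idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a ridge-regularised extreme learning machine with a sigmoid hidden layer, it simulates the worker nodes sequentially in plain Python instead of running on MapReduce, and it does not reproduce the online update rule. All function names, the keys "U" and "V", and the `ridge` parameter are hypothetical. Each map call computes the partial sums of H^T H and H^T T for its block of samples, the reduce step adds the partial matrices that share a key, and the output weights are then obtained by solving one small linear system.

```python
import math
import random


def hidden_output(x, weights, biases):
    """Sigmoid hidden-layer response for one sample (one row of H)."""
    out = []
    for w_row, b in zip(weights, biases):
        z = sum(w * xi for w, xi in zip(w_row, x)) + b
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out


def outer(u, v):
    """Outer product of two vectors as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]


def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def map_block(block, weights, biases):
    """Map step: emit the partial H^T H ("U") and H^T T ("V") of one block."""
    n_hidden, n_out = len(biases), len(block[0][1])
    u = [[0.0] * n_hidden for _ in range(n_hidden)]
    v = [[0.0] * n_out for _ in range(n_hidden)]
    for x, t in block:
        h = hidden_output(x, weights, biases)
        u = add(u, outer(h, h))
        v = add(v, outer(h, t))
    return [("U", u), ("V", v)]


def reduce_by_key(pairs):
    """Reduce step: sum all partial matrices that share a key."""
    merged = {}
    for key, mat in pairs:
        merged[key] = add(merged[key], mat) if key in merged else mat
    return merged


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [val / p for val in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def train_partitioned(samples, weights, biases, n_blocks, ridge=1e-3):
    """Split the samples into blocks, map each block, merge, then solve."""
    blocks = [samples[i::n_blocks] for i in range(n_blocks)]
    blocks = [blk for blk in blocks if blk]  # skip empty blocks
    # Workers are simulated one after another here (assumption, no cluster).
    pairs = [kv for blk in blocks for kv in map_block(blk, weights, biases)]
    merged = reduce_by_key(pairs)
    n_hidden = len(biases)
    a = [[merged["U"][i][j] + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    return solve(a, merged["V"])


# Toy data (hypothetical): 40 random 3-d samples with one-hot labels for 2 classes.
random.seed(0)
n_in, n_hidden = 3, 5
weights = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
biases = [random.uniform(-1, 1) for _ in range(n_hidden)]
samples = []
for _ in range(40):
    x = [random.uniform(-1, 1) for _ in range(n_in)]
    label = 0 if x[0] + x[1] > 0 else 1
    samples.append((x, [1.0 if k == label else 0.0 for k in range(2)]))

beta_single = train_partitioned(samples, weights, biases, n_blocks=1)
beta_parallel = train_partitioned(samples, weights, biases, n_blocks=4)
```

Because the partial sums are additive, splitting the samples into four blocks yields the same output weights as a single block, up to floating-point rounding. This is the property that lets the large matrix product be distributed without changing the result.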