This paper studies recognition techniques for massive image collections, using the SVM algorithm as the image recognition model. Because the number of training samples rises exponentially as the volume of image training data grows, a parallel SVM method based on the Hadoop cloud platform is studied in order to shorten training time and improve image recognition efficiency. Experiments were carried out on images from the Corel image library. The results show that the recognition accuracy of the Hadoop-based SVM image recognition system differs little from that of a conventional stand-alone SVM image recognition system. When the Hadoop platform has more than 2 nodes, the speedup ratio rises markedly and the training time falls, demonstrating the efficiency advantage of using SVM on the Hadoop platform for image recognition.
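The abstract reports a speedup ratio without defining it. A minimal sketch follows, assuming the conventional definition S = T_1 / T_n (stand-alone training time divided by training time on n nodes). The function names and the timings are hypothetical, chosen only for illustration; they are not measurements from the experiments.

```python
def speedup(t_single: float, t_parallel: float) -> float:
    """Speedup ratio S = T_1 / T_n (assumed conventional definition)."""
    if t_single <= 0 or t_parallel <= 0:
        raise ValueError("training times must be positive")
    return t_single / t_parallel


def efficiency(t_single: float, t_parallel: float, nodes: int) -> float:
    """Parallel efficiency E = S / n, the speedup obtained per node."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return speedup(t_single, t_parallel) / nodes


if __name__ == "__main__":
    # Hypothetical training times in seconds, keyed by node count.
    # Illustrative values only, not results from the paper.
    timings = {1: 120.0, 2: 110.0, 4: 48.0}
    t1 = timings[1]
    for n, t in sorted(timings.items()):
        print(f"nodes={n} speedup={speedup(t1, t):.2f} "
              f"efficiency={efficiency(t1, t, n):.2f}")
```

A speedup close to 1 on a small cluster would be consistent with job scheduling and data transfer overhead offsetting the parallel gain, which is one plausible reading of why the advantage is reported only beyond 2 nodes.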