Annotating images with single words often introduces ambiguity and noise, so product images are instead annotated with sentences in order to describe product characteristics accurately. Existing sentence annotation methods for product images, however, still suffer from insufficient feature learning. To address this, three kernel features of a product image (shape, color and gradient) are extracted based on kernel descriptors and fused into a new feature within a multiple kernel learning model. The fused feature is used to classify the product image, a group of visually similar training images is then retrieved, and key text excerpted from their titles is used to annotate the product image. Finally, annotation performance is evaluated from two perspectives, information retrieval and machine translation. Experiments show that 1) classification based on the fused feature achieves the best product image classification performance, and, more importantly, image classification narrows the retrieval range and helps improve retrieval performance; 2) the MAP (Mean Average Precision) values and P-R (Precision-Recall) results of the annotation model are superior to those of the state-of-the-art baselines; 3) the sentences generated by the model are semantically relevant to the image content and are more coherent and fluent than those of the baselines.
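For readers unfamiliar with the MAP (Mean Average Precision) measure used in the retrieval-side evaluation, the following is a minimal sketch of how it is commonly computed from ranked result lists. It is not the authors' evaluation code. The function names and the toy relevance lists are hypothetical, and the sketch assumes that every relevant item appears somewhere in its ranked list, so average precision is normalized by the number of relevant items found.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """Average precision of one ranked list.

    relevance[i] is 1 if the item at rank i + 1 is relevant, else 0.
    Assumption: all relevant items are present in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[int]]) -> float:
    """Mean of the per-query average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Hypothetical relevance judgments for two queries.
toy_rankings = [[1, 0, 1], [0, 1]]
map_value = mean_average_precision(toy_rankings)
```

In this toy case the first query has relevant items at ranks 1 and 3 and the second at rank 2, so `map_value` averages the two per-query scores. A higher value means relevant items tend to be ranked earlier.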