To address the limited variety of image features and the noise that corrupts keywords in product image sentence annotation, this paper proposes a product image sentence annotation model that focuses on image feature learning and keyword extraction. Kernel descriptors are extracted from three perspectives (gradient, shape, and color) and combined by late fusion within a multiple kernel learning model to obtain more discriminative image features. The absolute rank and relative rank features of the tag-rank model are used to boost keyword weights, and a new word integration algorithm, word sequence blocks building (WSBB), is designed to assemble the keywords into N-gram word sequences. Sentences are then generated from the N-gram word sequences and predefined templates. Experimental results show that both the BLEU-1 and BLEU-2 scores of the generated sentences are superior to those of the state-of-the-art baselines.
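As an illustration of the evaluation metric named above, the following is a minimal sketch of how BLEU-1 and BLEU-2 are commonly computed: clipped n-gram precision, a brevity penalty, and a uniform-weight geometric mean. It is not the paper's evaluation code. The function names, the whitespace tokenization, the absence of smoothing, and the example product sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count observed in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n):
    """BLEU-max_n with uniform weights and no smoothing (assumed settings)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    c = len(candidate)
    # Reference length closest to the candidate length (ties pick the shorter).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Hypothetical generated sentence and reference annotation.
candidate = "red cotton dress with long sleeves".split()
reference = "a red cotton dress with short sleeves".split()
print("BLEU-1:", bleu(candidate, [reference], 1))
print("BLEU-2:", bleu(candidate, [reference], 2))
```

BLEU-1 rewards the presence of correct keywords, while BLEU-2 also rewards correct adjacent word pairs, so it is more sensitive to how well the keywords are ordered into N-gram word sequences.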