Image Recognition Techniques for Intelligent Interaction: A Survey and Outlook

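ISSN: 1000-1239
Journal: 《计算机研究与发展》 (Journal of Computer Research and Development)
Date: 0
Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190
Funding: National Natural Science Foundation of China, Key Program (61532018); National Natural Science Foundation of China, Excellent Young Scientists Fund (61322212); National Natural Science Foundation of China, Young Scientists Fund (61303160); National Basic Research Program of China ("973" Program) (2012CB316400)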

Keywords: image recognition, intelligent visual recognition, intelligent interaction, visual description, visual question answering (VQA), deep learning

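Abstract (translated from the Chinese):

Vision plays a very important role in interaction between people and in interaction between people and the natural world. Giving terminal devices intelligent visual recognition and interaction capabilities is one of the core challenges and long-term goals of artificial intelligence and computer technology. Visual recognition technology has developed rapidly in recent years: new techniques keep emerging, new research problems keep being posed, and applications oriented toward intelligent interaction show new trends that are continually reshaping the established understanding of the field. This paper surveys image recognition techniques from three perspectives: visual recognition, visual description, and visual question answering. It describes deep-learning-based image recognition and scene classification techniques in detail, analyzes and discusses the latest techniques in visual description and visual question answering, introduces visual recognition and interaction applications for mobile terminals and robots, and finally analyzes future research trends in the field.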

English abstract:

Vision plays an important role in both human-human interaction and human-nature interaction. Equipping terminal devices with intelligent visual recognition and interaction capabilities is one of the core challenges, and one of the lofty goals, of artificial intelligence and computer technology. With the rapid development of visual recognition techniques in recent years, new techniques and new problems have continually emerged. Correspondingly, applications built on intelligent interaction present new characteristics, which are changing our original understanding of visual recognition and interaction. We give a survey of image recognition techniques, covering recent advances in visual recognition, visual description, and visual question answering (VQA). Specifically, we first focus on deep learning approaches for image recognition and scene classification. Next, the latest techniques in visual description and VQA are analyzed and discussed. Then we introduce visual recognition and interaction applications on mobile devices and robots. Finally, we discuss future research directions in this field.
