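The purpose of a video semantic model is to represent and manage the semantic information contained in video, such as objects, events, and relationships, and to provide a basis for semantic queries. With the development of video technology and video-related applications, the need for effective video semantic models has become increasingly urgent. This paper presents a comprehensive survey of existing video semantic models, covering 16 models in total: 5 annotation-based models and 11 rich semantic models. Although video semantic models are crucial for a video database to provide query services and other features, there are still no well-established criteria for evaluating them. The authors therefore propose 22 evaluation criteria for rich semantic models and evaluate the 11 rich semantic models against them. The results show that these models satisfy users' basic query requirements but fall short in advanced capabilities, such as uncertainty and object history in expressive power, and reasoning and query-condition rewriting in query capability. Moreover, current models have largely not considered representing domain-specific constraints or providing auxiliary support for acquiring semantic information. Based on these results, the paper closes by suggesting future research directions for video semantic models.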
The development of video technology and video-related applications demands strong support from semantic data models. To meet this requirement, many video semantic data models have been proposed. The semantic model plays a key role in providing query capability and other features for a video database. However, to our knowledge, the criteria for a good semantic model remain an open question. As a result, there are neither rules for evaluating an existing model nor guidelines for designing a new one. To address this issue, this paper proposes twenty-two properties as criteria for video semantic models and evaluates eleven existing rich semantic models against them. The evaluation shows that these models mostly concentrate on basic expressive power and query capability, and that they fulfill users' primary requirements. However, in advanced features of expressive power, of the acquisition and analysis of semantic information, and of query capability, there is room for further improvement. The paper concludes by indicating some research directions for video semantic models.