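To facilitate video retrieval and browsing at different structural levels, a video sequence can be divided into logical units at different levels. From top to bottom, these levels are sequence, scene, shot, and frame. A scene is a set of similar shots that follow a certain temporal order. This paper proposes a scene construction algorithm based on intra-class and inter-class losses. First, the distance between shots is computed using a time constraint and color histograms; next, similar shots are clustered on the basis of the intra-class and inter-class losses to obtain shot classes; finally, scenes are constructed by analyzing the shot classes. Experiments show that the constructed scenes reflect the content of the video well.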
To process video data efficiently, the data must first be structured. A scene is a collection of shots that are semantically related and temporally consecutive. In this paper, a scene construction algorithm based on the neighbor function is proposed. We first use time-constrained color histograms to calculate the distance between shots. We then obtain shot groups by clustering similar shots according to the intra-class and inter-class losses. Finally, we analyze the shot groups to construct parallel scenes and series scenes, respectively. Experimental results show that the scenes constructed by our algorithm properly reflect the content of a video sequence.
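The first step, a time-constrained color histogram distance between shots, can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so everything specific here is an assumption made for illustration: the `Shot` record, the use of one minus the histogram intersection as the color distance, the `max_gap` threshold, and the choice to return an infinite distance once the temporal gap exceeds that threshold.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Shot:
    """Hypothetical shot record: time span in seconds plus a color histogram."""
    start: float
    end: float
    hist: List[float]


def histogram_distance(h1: List[float], h2: List[float]) -> float:
    """Return 1 minus the histogram intersection of the normalized histograms.

    The result lies in [0, 1]: 0 for identical color distributions,
    1 for distributions with no overlapping bins.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    s1, s2 = sum(h1), sum(h2)
    if s1 <= 0 or s2 <= 0:
        raise ValueError("histograms must have positive mass")
    intersection = sum(min(a / s1, b / s2) for a, b in zip(h1, h2))
    return 1.0 - intersection


def shot_distance(a: Shot, b: Shot, max_gap: float) -> float:
    """Time-constrained distance (assumed form, not the paper's formula).

    Shots separated by more than max_gap seconds are treated as
    infinitely far apart, so they can never fall into the same cluster.
    """
    # Gap between the end of the earlier shot and the start of the later one.
    gap = max(a.start, b.start) - min(a.end, b.end)
    if gap > max_gap:
        return math.inf
    return histogram_distance(a.hist, b.hist)
```

With this assumed form, two visually identical shots that lie far apart in time are never merged, which matches the requirement that the shots of a scene be temporally close as well as similar in color.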