To improve the realism of synthesized sign language videos, a method for describing video semantics for sign language synthesis is proposed, and a corresponding video database is constructed on the basis of this semantic description. Chinese sign language videos are captured for a specific research domain, and the source videos are segmented by word meaning into word units and multi-level transition units organized by human body and body part. A multi-dimensional semantic model is built for each video unit by semantically describing every frame of that unit. The model of each video unit represents the specific sign language information contained in every frame, including location, hand shape, and rhythm. During sign language synthesis, the required information can be retrieved in real time by parsing the multi-dimensional semantic models. The method provides real-time, consistent semantic understanding for sign language synthesis. When two sign language videos with different rhythms are joined, the rhythm information parsed from their semantic models can be used to adjust the interpolated positions of the transition frames appropriately, so that the rhythm changes gradually across the transition and the synthesized transition video is rhythmically consistent.
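The rhythm-aware transition step can be illustrated with a minimal sketch. Nothing below is taken from the paper's implementation: the `FrameSemantics` record, its field names, the use of hand speed (displacement per frame) as a stand-in for rhythm, and the linear blending of step sizes are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FrameSemantics:
    """Hypothetical per-frame semantic record (all field names are assumptions)."""
    hand_xy: Tuple[float, float]  # hand location in image coordinates
    hand_shape: str               # hand-shape label
    speed: float                  # rhythm proxy: hand displacement per frame


def transition_positions(end_a: FrameSemantics,
                         start_b: FrameSemantics,
                         n_frames: int) -> List[Tuple[float, float]]:
    """Place n_frames hand positions between the last frame of clip A
    and the first frame of clip B.

    Step sizes are blended linearly from A's speed to B's speed, so the
    transition starts at A's rhythm and ends at B's rhythm (an assumed
    scheme, not the paper's).
    """
    if n_frames < 1:
        return []
    if end_a.speed <= 0 or start_b.speed <= 0:
        raise ValueError("speeds must be positive")

    # n_frames in-between frames split the gap into n_frames + 1 steps.
    steps = n_frames + 1
    weights = [
        end_a.speed + (start_b.speed - end_a.speed) * i / n_frames
        for i in range(steps)
    ]
    total = sum(weights)

    (x0, y0), (x1, y1) = end_a.hand_xy, start_b.hand_xy
    positions = []
    travelled = 0.0
    for w in weights[:-1]:  # the last step lands on clip B's first frame
        travelled += w
        t = travelled / total  # fraction of the gap covered so far
        positions.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return positions


# Usage: a slow clip followed by a faster clip.
slow_end = FrameSemantics(hand_xy=(0.0, 0.0), hand_shape="flat", speed=1.0)
fast_start = FrameSemantics(hand_xy=(4.0, 0.0), hand_shape="fist", speed=3.0)
frames = transition_positions(slow_end, fast_start, n_frames=3)
```

When both clips have the same speed, this reduces to uniform linear interpolation. When the speeds differ, the early transition frames stay closer to clip A's pace and the later ones approach clip B's pace.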