Violent Audio Event Detection Based on Multi-Scale Duration Audio Features

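ISSN: 2095-2163
Journal: 智能计算机与应用 (Intelligent Computer and Applications)
Date: 2014.10.1
Pages: 72-75
Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
Funding: National Natural Science Foundation of China (61171186)
Related project: Research on the cognitive laws of high-level audio information for intelligent information processing and their applications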

Keywords: violence detection, multi-scale audio features, audio event detection, support vector machine (SVM)

Abstract (translated from the Chinese):

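Violent shot detection has been an active research topic in recent years. Early approaches relied mainly on video features; because audio information is stable and consistent across cultures and populations, attention has increasingly turned to audio. This work therefore uses audio features to detect violent audio events in movie shots and proposes a feature extraction method based on multiple time scales. In addition to short-time features such as MFCC, LPC, and energy, it extracts long-time features such as the mean and variance of the energy, the mean and variance of the sub-band energy, and inter-frame differences. Three audio events occur frequently in violent shots and are representative of them: explosions, screams, and gunshots. Taking the movie shot as the unit of recognition, the paper implements a detection system using a support vector machine classifier. Experiments on 15 Hollywood movies show that the proposed multi-scale audio features achieve good results in violent audio event detection.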

English abstract:

Violence detection has been an active research topic in recent years. Early work relied mainly on video features, but because audio information is stable and consistent across cultures and populations, increasing attention has been paid to its use. This paper studies the use of audio features to detect violent audio events in movies and presents a multi-scale feature extraction method. Besides short-term features such as MFCC, LPC, and short-term energy, the paper also extracts long-term features, such as the mean and variance of the energy and of the sub-band energy, and the difference between frames. The audio events that appear frequently in violent scenes are explosions, screams, and gunshots. The paper therefore implements a detection system that uses a support vector machine classifier to detect violent audio events in movie scenes. Experiments on 15 Hollywood movies show that the multi-scale audio features achieve good results in violent audio event detection.
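
As a rough illustration of the two time scales described in the abstract, the sketch below computes a short-term feature (per-frame energy) and then summarizes it over a whole segment with long-term statistics (energy mean, energy variance, and mean absolute difference between consecutive frames). It is not the paper's implementation: the function names, the 400-sample frame, the 200-sample hop, the 16 kHz sampling rate, and the synthetic test signal are all assumptions made for illustration. MFCC, LPC, sub-band energies, and the SVM classification step are omitted.

```python
import math
from statistics import mean, pvariance


def frame_energies(samples, frame_len=400, hop=200):
    """Short-time scale: mean squared amplitude of each frame.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def long_term_features(energies):
    """Long-time scale: statistics of the frame energies over one segment."""
    # Absolute change in energy between consecutive frames.
    diffs = [abs(b - a) for a, b in zip(energies, energies[1:])]
    return {
        "energy_mean": mean(energies),
        "energy_var": pvariance(energies),
        "mean_abs_frame_diff": mean(diffs) if diffs else 0.0,
    }


# Synthetic one-second "shot": a quiet 440 Hz tone followed by a loud one.
# The 16 kHz sampling rate is an assumption.
sr = 16000
quiet = [0.01 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr // 2)]
loud = [0.8 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr // 2)]
shot = quiet + loud

energies = frame_energies(shot)
features = long_term_features(energies)
print(len(energies), features)
```

In a full system along the lines the abstract describes, one such feature vector per shot would be passed to a classifier such as a support vector machine. A sudden loud event raises the energy variance and the inter-frame difference well above their values for a steady signal, which is the kind of cue long-term features are meant to capture.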
