A perceptual hashing algorithm for speech content authentication is proposed, based on the correlation of mel-frequency cepstral coefficients (MFCCs). The algorithm extracts the voiceprint MFCCs of the segmented speech signal as the perceptual feature. To improve security, a pseudo-random sequence generated from a key is used, and the correlation between the MFCCs and this sequence is computed. The correlation values are then quantized and scrambled for encryption to produce the perceptual hash sequence. During authentication, a similarity metric function measures the distance between hash sequences, and it is compared with the Hamming distance method. Simulation results show that the algorithm is robust against content-preserving operations such as resampling and MP3 compression, and that the similarity metric is highly sensitive in detecting and localizing speech tampering.
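
The hashing stage described in the abstract (keyed correlation, quantization, scrambling, distance measurement) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the per-segment feature vectors stand in for MFCCs, which are not computed here; the Pearson correlation is used as the correlation measure; quantization is a simple sign threshold; scrambling is a keyed permutation; and the comparison uses the normalized Hamming distance, since the abstract does not define the similarity metric function. The names `pearson`, `perceptual_hash`, and `bit_error_rate` are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def perceptual_hash(frames, key):
    """Return one hash bit per segment.

    frames: list of per-segment feature vectors (stand-ins for MFCCs).
    key:    seed of the pseudo-random generator (assumed keying scheme).
    """
    rng = random.Random(key)
    correlations = []
    for features in frames:
        # Keyed pseudo-random reference sequence, same length as the features.
        reference = [rng.gauss(0.0, 1.0) for _ in features]
        correlations.append(pearson(features, reference))
    # Assumed quantizer: the sign of the correlation gives one bit.
    bits = [1 if c > 0 else 0 for c in correlations]
    # Assumed scrambling: a permutation drawn from the same keyed generator.
    order = list(range(len(bits)))
    rng.shuffle(order)
    return [bits[i] for i in order]


def bit_error_rate(h1, h2):
    """Normalized Hamming distance between two equal-length hashes."""
    if len(h1) != len(h2):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(h1, h2)) / len(h1)
```

With the same key, two hashes of the same feature vectors are identical, so their distance is 0. In an authentication setting, a distance below a chosen threshold would be read as the content being preserved, and the threshold itself would have to be set experimentally.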