This paper proposes a steganalysis method, based on source redundancy, for detecting text steganography that uses the mimic model. The method treats the text under examination as an m-order Markov source and the words in it as source symbols, and computes the redundancy of that source. Whether the text contains hidden information is then determined from the relationship between the redundancy and the size of the text. The method was tested on 8,000 stegotexts generated by four major tools (NiceText, Texto, Stego, and Sams Big Play Maker) and on 2,400 normal texts randomly sampled from innocuous texts downloaded from the Internet; the false positive rate was 0.5% and the false negative rate was 3.9%. The experimental and analytical results indicate that the method can effectively detect stegotexts produced by mimic-model steganography based on natural language processing.
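The redundancy computation described above can be illustrated with a minimal sketch. It assumes the standard information-theoretic definition, redundancy = 1 - H_m / log2(|V|), where H_m is the empirical conditional entropy of a word given the m preceding words and |V| is the number of distinct words in the text. The abstract does not state the exact estimator, so the function name `markov_redundancy`, the whitespace tokenization, and the normalization by log2(|V|) are all assumptions made for illustration.

```python
import math
from collections import Counter


def markov_redundancy(words, m=1):
    """Empirical redundancy of a word sequence modeled as an m-order Markov source.

    Assumed definition (not taken from the paper): 1 - H_m / log2(|V|).
    """
    vocab_size = len(set(words))
    if vocab_size < 2 or len(words) <= m:
        raise ValueError("need at least two distinct words and more than m words")

    # Count every (context, next word) window and every context of length m.
    n_windows = len(words) - m
    joint = Counter(tuple(words[i:i + m + 1]) for i in range(n_windows))
    context = Counter(tuple(words[i:i + m]) for i in range(n_windows))

    # Conditional entropy H(next word | m previous words), in bits.
    entropy = 0.0
    for gram, count in joint.items():
        p_joint = count / n_windows
        p_cond = count / context[gram[:m]]
        entropy -= p_joint * math.log2(p_cond)

    # Maximum entropy of a memoryless source over the same vocabulary.
    max_entropy = math.log2(vocab_size)
    return 1.0 - entropy / max_entropy


if __name__ == "__main__":
    sample = "the cat sat on the mat and the cat sat on the rug".split()
    print(markov_redundancy(sample, m=1))
```

The sketch only produces the redundancy value. The decision step, which relates that value to the size of the text, is not specified in the abstract and is therefore left out.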