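Text information hiding methods based on synonym substitution embed hidden information by selectively replacing synonyms in the carrier text. Analysis shows that embedding hidden information in this way causes an obvious increase in the probability of synonym pairs in the carrier text. Based on this, an algorithm is proposed that detects hidden information by analyzing the synonym pair values in a text. Experiments show that the detection algorithm has a miss rate (false negative rate) of about 4% and a false alarm rate (false positive rate) of about 9.8%, demonstrating that it can effectively detect information hidden by synonym-substitution-based text information hiding methods.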
As text steganography has recently become a research hotspot in secure communication, text steganalysis, whose purpose is to detect the presence of secret messages in text, has attracted growing attention. Existing text steganography methods can be roughly divided into three categories: those based on invisible characters, those based on format, and those based on natural language processing. The main technique shared by most natural-language-processing-based steganographic methods is synonym substitution, which embeds secret information by selectively substituting synonyms. Because this technique offers good imperceptibility and robustness, hidden information embedded with it is considerably harder for steganalysis to detect. Nevertheless, it is found that synonym-substitution-based steganography leads to an obvious increase in the probability of synonym pairs in the carrier text. In light of this observation, a steganalysis algorithm is proposed that uses the number of synonym pairs to decide whether hidden information exists in a text. Experimental results show that the proposed algorithm can effectively detect text steganography based on synonym substitution. The achieved false negative rate is approximately 4% and the false positive rate is approximately 9.8%.
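The detection idea can be illustrated with a minimal sketch. It is not the algorithm evaluated in the paper: the abstract does not define how a synonym pair is counted or how the decision is made, so everything here is an assumption. The sketch assumes a synonym pair occurs when at least two distinct members of the same synonym set appear in one text, and it flags the text when the fraction of such sets exceeds a threshold. The names `SYNONYM_SETS`, `synonym_pair_ratio`, `looks_stego`, the toy dictionary, and the threshold value 0.5 are all hypothetical.

```python
import re

# Hypothetical toy synonym dictionary (assumption); a real detector would
# rely on a full synonym lexicon.
SYNONYM_SETS = [
    {"big", "large", "huge"},
    {"quick", "fast", "rapid"},
    {"begin", "start", "commence"},
]


def synonym_pair_ratio(text, synonym_sets=SYNONYM_SETS):
    """Fraction of synonym sets seen in the text for which at least two
    distinct members occur (the assumed meaning of a 'synonym pair')."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    seen = 0    # synonym sets with at least one member in the text
    paired = 0  # synonym sets with two or more distinct members in the text
    for syn_set in synonym_sets:
        present = syn_set & words
        if present:
            seen += 1
            if len(present) >= 2:
                paired += 1
    return paired / seen if seen else 0.0


def looks_stego(text, threshold=0.5, synonym_sets=SYNONYM_SETS):
    """Flag a text whose synonym pair ratio exceeds the threshold.
    The threshold is an illustrative assumption, not a value from the paper."""
    return synonym_pair_ratio(text, synonym_sets) > threshold


cover = "the big dog made a quick start and the big cat made a quick start"
stego = "the big dog made a fast start and the large cat made a quick commence"
print(synonym_pair_ratio(cover), looks_stego(cover))
print(synonym_pair_ratio(stego), looks_stego(stego))
```

In the toy cover text the author reuses the same word each time, so no synonym set contributes a pair. In the toy stego text, substitution has scattered different members of each set through the text, which raises the ratio. In practice the threshold would have to be chosen on labeled cover and stego texts, since it trades the false negative rate against the false positive rate.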