This paper proposes a discourse-level natural language watermarking method that uses spread spectrum techniques to encode the watermark. All named entities in the text are extracted to form a vector space, and a subspace is selected according to a secret key to carry the embedded information. Information is embedded by using anaphora resolution to modify the number of occurrences of the named entities within that subspace. Whether a text contains the watermark is determined by comparing the extracted information vector with the vector generated from the original watermark. Experimental results show that the algorithm is fairly robust and can resist some common active attacks.
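The pipeline summarized above (key-selected subspace, spread watermark, modified entity counts, vector comparison) can be illustrated with a minimal sketch. The abstract does not give the actual encoding, so everything concrete here is an assumption made for illustration: the function names, the use of count parity to carry each chip, the pseudo-random chip sequence, and the 0.6 detection threshold are all hypothetical. Incrementing a count stands in for the textual edit of replacing a pronoun with the entity name it refers to.

```python
import random


def select_subspace(n_entities, size, key):
    """Pick `size` distinct entity indices, determined by the secret key."""
    rng = random.Random(f"select-{key}")
    return rng.sample(range(n_entities), size)


def spread(bits, chips_per_bit, key):
    """Spread each watermark bit over several +1/-1 chips (assumed scheme)."""
    rng = random.Random(f"chips-{key}")
    chips = []
    for bit in bits:
        sign = 1 if bit else -1
        chips.extend(sign * rng.choice((-1, 1)) for _ in range(chips_per_bit))
    return chips


def embed(counts, bits, chips_per_bit, key):
    """Return new entity counts whose parities carry the spread watermark.

    Assumption: an even count encodes +1 and an odd count encodes -1.
    Adding 1 models replacing one pronoun with the entity's name.
    """
    target = spread(bits, chips_per_bit, key)
    indices = select_subspace(len(counts), len(target), key)
    marked = list(counts)
    for i, chip in zip(indices, target):
        is_even = marked[i] % 2 == 0
        if is_even != (chip == 1):
            marked[i] += 1
    return marked


def extract(counts, n_chips, key):
    """Read the +1/-1 vector back from the key-selected subspace."""
    indices = select_subspace(len(counts), n_chips, key)
    return [1 if counts[i] % 2 == 0 else -1 for i in indices]


def detect(counts, bits, chips_per_bit, key, threshold=0.6):
    """Compare the extracted vector with the one generated from the watermark.

    Returns the normalized correlation and whether it reaches the
    (hypothetical) threshold.
    """
    target = spread(bits, chips_per_bit, key)
    extracted = extract(counts, len(target), key)
    corr = sum(t * e for t, e in zip(target, extracted)) / len(target)
    return corr, corr >= threshold
```

Because each bit is spread over several entities, changing the count of a single entity only lowers the correlation slightly instead of destroying the watermark, which is the intuition behind the robustness claim. A real system would also have to check that each edit keeps the text fluent and unambiguous, which this sketch ignores.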