To protect the intellectual property of digital texts, this paper proposes a text watermarking algorithm that uses semantic roles to embed watermark information. The watermark is first preprocessed with Unicode and Huffman coding to form a watermark information string of a specific form. Natural language processing techniques are then used to label the semantic roles in the text, and the watermark information is mapped to the positions of those semantic roles, which allows the watermark to be both embedded and extracted. Because the algorithm does not modify the format or content of the text, it is highly imperceptible and robust: it resists common format transformations and attacks while still providing a large watermark capacity. Compared with other text watermarking algorithms, the proposed algorithm offers certain advantages.
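The two stages named in the abstract, preprocessing and position mapping, can be illustrated with a minimal sketch. The abstract gives neither the exact form of the watermark string nor the mapping rule, so everything below is an assumption made for illustration: the Huffman table is built over the watermark's own Unicode code points, the semantic role labeler is not shown (its output is represented by a hypothetical list `role_positions`, one integer position per labeled sentence), and the parity rule in `embed_bits` is a stand-in, not the paper's method. The names `watermark_to_bits`, `embed_bits`, and `extract_bits` are likewise hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman_codes(symbols):
    """Return a {symbol: bit string} Huffman table for the given symbols."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:  # a single distinct symbol still needs a 1-bit code
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {s: ""}) for s, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n0, _, left = heapq.heappop(heap)
        n1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n0 + n1, next(tie), merged))
    return heap[0][2]


def watermark_to_bits(watermark):
    """Preprocess: Unicode code points -> Huffman-coded bit string."""
    points = [ord(ch) for ch in watermark]
    codes = build_huffman_codes(points)
    return "".join(codes[p] for p in points), codes


def bits_to_watermark(bits, codes):
    """Invert watermark_to_bits using the same Huffman table."""
    decode = {code: point for point, code in codes.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in decode:  # Huffman codes are prefix-free
            out.append(chr(decode[buf]))
            buf = ""
    return "".join(out)


def embed_bits(bits, role_positions):
    """Assumed mapping rule: for each bit, pick the next sentence whose
    role position has matching parity. The text itself is never changed;
    the returned list of sentence indices acts as the extraction key."""
    key, i = [], 0
    for b in bits:
        while i < len(role_positions) and role_positions[i] % 2 != int(b):
            i += 1
        if i == len(role_positions):
            raise ValueError("text too short for this watermark")
        key.append(i)
        i += 1
    return key


def extract_bits(key, role_positions):
    """Recover the bit string by re-reading parities at the key indices."""
    return "".join(str(role_positions[i] % 2) for i in key)


if __name__ == "__main__":
    bits, codes = watermark_to_bits("Copyright 2010")
    positions = list(range(200))  # placeholder for real labeler output
    key = embed_bits(bits, positions)
    print(bits_to_watermark(extract_bits(key, positions), codes))
```

Under these assumptions, extraction only needs the key, the Huffman table, and a fresh labeling of the text, which is consistent with the abstract's claim that format changes do not affect the watermark as long as the labeled role positions are unchanged.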