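As network applications expand, individuals and organizations receive or store ever more information, and large volumes of text are transmitted at every moment through computer-mediated communication. Yet no effective way of identifying the deceptive messages among them has been proposed, so a method that detects deception automatically is urgently needed. Because Chinese deception corpora have rarely been reported so far, this paper proceeds in two steps: it first builds a deceptive corpus and then, by analyzing the deceptive and non-deceptive corpora, proposes a classification-based method for detecting deceptive behavior. Experimental results show that the precision, recall, and F-value of the system in open testing can reach 78.3%, 72%, and 0.75, respectively.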
Inundated with massive amounts of textual information transmitted through computer-mediated communication (CMC), people remain largely unsuccessful and inefficient at detecting messages that may be deceptive. An automated deception detection method that helps people flag potentially deceptive messages in CMC is therefore desirable. In this research, we first construct deceptive and non-deceptive Chinese text corpora, which have rarely been reported so far, and then propose a novel classification-based deception detection method for Chinese text based on the results of the corpus analysis. Experimental results show that, in the open test, the precision, recall, and F-value can reach 78.3%, 72%, and 0.75, respectively.
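The three reported figures can be checked against one another with a short calculation. The sketch below assumes that the F-value is the balanced F-measure (the harmonic mean of precision and recall, i.e. beta = 1); the abstract does not state which variant was used, and the function name `f_value` is purely illustrative.

```python
def f_value(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score for the given precision and recall.

    Assumption: beta = 1 (balanced F-measure). The abstract does not
    say which F variant the authors used.
    """
    if precision <= 0.0 or recall <= 0.0:
        # Conventionally the score is 0 when either component is 0.
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Reported open-test figures: precision 78.3%, recall 72%.
print(round(f_value(0.783, 0.72), 2))
```

Under this assumption, the rounded result agrees with the reported F-value of 0.75.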