

Research on Methods for Recognizing Chinese Irony in Web Text

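ISSN: 0253-2395
Journal: 《山西大学学报：自然科学版》 (Journal of Shanxi University, Natural Science Edition)
Date: not given
Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] Department of Information Management, Peking University, Beijing 100871
Funding: Major Project of the National Social Science Fund of China (12&ZD227)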

Keywords: irony recognition, Logistic model, sentiment polarity, semantic deviation; irony detection, logistic regression, emotion fluctuation, intentional meaning

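Abstract (translated from the Chinese):

The large volume of Internet text has opened new possibilities for sentiment analysis research. This paper studies the recognition of irony as a rhetorical device in Chinese. Working inductively, it proposes a system of features that characterize irony in Chinese and designs the corresponding algorithms. Documents are built from information crawled from the Internet and then used to train a Logistic model for irony recognition. The model's effectiveness is verified through the significance of the model itself, its recognition ability, and a comparison with manually annotated results. Significance tests show that "deviation between intended meaning and literal meaning" and "tension from emotional change" are the two most important features of irony in Chinese on the web. The model reaches a recall of 71.2% and a classification accuracy of 60.3%, which is comparable to the best results reported in recent years, in China and abroad, on similar problems for English, Italian, and other languages.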

English abstract:

The emergence of large quantities of Internet text has opened new possibilities for sentiment analysis research. To address irony recognition in Chinese, this paper proposes a set of features that characterize irony and designs corresponding algorithms. Documents are built from information crawled from the Internet and used to train a Logistic model for irony recognition. The model's output is then compared with manual tagging to verify the model's effectiveness. Tests show that "deviation of intended meaning from literal meaning" and "emotion fluctuation" are the two main features characterizing Chinese irony in Internet text. The model achieves a recall of 71.2% and a classification accuracy of 60.3%. These figures are comparable to the best recent results from similar studies on English and Italian.
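The pipeline the abstract outlines (feature values per document, a logistic model, then recall and accuracy against manual labels) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two feature names `meaning_deviation` and `emotion_fluctuation`, the toy data, the plain gradient-descent training loop, and the 0.5 decision threshold. It is not the authors' implementation and does not reproduce their reported figures.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the mean log loss."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this document: p(ironic) - label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x, threshold=0.5):
    # Label a document ironic (1) when the modeled probability reaches the threshold.
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return 1 if p >= threshold else 0


def recall_and_accuracy(y_true, y_pred):
    """Recall = TP / (TP + FN); accuracy = correct predictions / all predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return recall, correct / len(y_true)


# Made-up toy data, one row per document:
# [meaning_deviation, emotion_fluctuation]; label 1 = ironic, 0 = not ironic.
X = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic(X, y)
preds = [predict(w, b, x) for x in X]
recall, accuracy = recall_and_accuracy(y, preds)
print("weights:", w, "bias:", b)
print("recall:", recall, "accuracy:", accuracy)
```

Recall and accuracy are reported separately because they answer different questions: recall measures how many truly ironic documents the model finds, while accuracy measures the share of all documents labeled correctly. A real evaluation would score held-out documents rather than the training rows used in this toy run.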
