Research on Internet-Oriented Acquisition of Tibetan Entity Relation Templates

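ISSN: 1003-0077
Journal: Journal of Chinese Information Processing (《中文信息学报》)
Date: 0
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] Gansu Provincial Key Laboratory of Ethnic Language Intelligent Processing, Northwest Minzu University, Lanzhou, Gansu 30030
Funding: National Natural Science Foundation of China (No. 61262052, No. 61262054); Fundamental Research Funds for the Central Universities (No. 31920140064).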

Keywords: Tibetan, entity relations, templates, Internet

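Chinese abstract:

Determining the relations between entities helps in understanding text content. Entity relation templates make it possible to extract large numbers of entity relations from massive unstructured text and to give them structure. Targeting the characteristics of Tibetan text on the Internet, this paper represents Tibetan entities with templates, applies an unsupervised word-sense similarity method based on word2vec to build a near-synonym resource, implements a Tibetan word-sense similarity computation system, and finally constructs an entity relation template acquisition model based on similarity computation. Experiments were run on a corpus crawled from the Qinghai Lake Tibetan website (青海湖藏文网). The results indicate that the proposed method for extracting Tibetan entity relation templates is reasonably effective.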

English abstract:

Extracting entity relations is beneficial for understanding the meaning of text. With entity relation templates, we can obtain a large number of entity relations, as structured data, from massive unstructured text. Based on the characteristics of Tibetan text from the Internet, this paper studies Tibetan template representations, implements an unsupervised Tibetan semantic similarity system based on word2vec, and finally implements a Tibetan entity relation template extraction model based on similarity calculation. We evaluate the model on text crawled from amdotibet. The experimental results show that the model is effective and achieves good results.
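
The similarity-based template acquisition described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that each template is reduced to the context words between two entity slots, that each word has a vector (here hand-written toy vectors stand in for word2vec output), and that a candidate template is accepted when its similarity to a seed template reaches a threshold. The English placeholder tokens, the vector values, the 0.9 threshold, and the function names `cosine`, `template_similarity`, and `expand_templates` are all hypothetical.

```python
import math

# Toy vectors standing in for word2vec embeddings (hypothetical values).
# In practice these would come from a model trained on the crawled corpus.
TOY_VECTORS = {
    "born_in": [0.9, 0.1, 0.0],
    "native_of": [0.8, 0.2, 0.1],
    "works_at": [0.1, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def template_similarity(candidate, seed, vectors):
    """Average, over candidate words, of the best cosine match in the seed.

    Words without a vector contribute a score of 0.0.
    """
    if not candidate:
        return 0.0
    total = 0.0
    for word in candidate:
        if word not in vectors:
            continue
        scores = [cosine(vectors[word], vectors[s]) for s in seed if s in vectors]
        total += max(scores, default=0.0)
    return total / len(candidate)


def expand_templates(seeds, candidates, vectors, threshold=0.9):
    """Return candidates whose similarity to any seed reaches the threshold."""
    accepted = []
    for cand in candidates:
        best = max((template_similarity(cand, s, vectors) for s in seeds), default=0.0)
        if best >= threshold:
            accepted.append(cand)
    return accepted


seeds = [("born_in",)]
candidates = [("native_of",), ("works_at",)]
new_templates = expand_templates(seeds, candidates, TOY_VECTORS)
```

With these toy vectors, `("native_of",)` lies close to the seed and is accepted, while `("works_at",)` falls well below the threshold. Averaging best-match cosines is only one possible way to compare templates; the abstract does not specify the scoring function.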
