Dongli Research Big Data Discovery System (DRDS)

Protein-Protein Interaction Extraction Based on Transfer Learning

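Journal: Journal of Chinese Information Processing (中文信息学报) (accepted)
Date: 0
Pages: -
Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: School of Computer Science, Dalian University of Technology, Dalian, Liaoning 116023
Funding: National Natural Science Foundation of China (61173101, 61173100, 61272375)
Related project: Research on Protein-Protein Interaction Extraction Integrating Coreference Resolution and Transfer Learning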

Authors: Li Lishuang (李丽双) | Guo Rui (郭瑞) | Huang Degen (黄德根) | Zhou Huiwei (周惠巍)

Keywords: protein-protein interaction (PPI) extraction, transfer learning, negative transfer

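Chinese abstract (translated):

As an important branch of biomedical information extraction, protein-protein interaction (PPI) extraction is of considerable research significance. Most current work uses statistical machine learning methods, which require large-scale annotated corpora for training. Too little training data degrades the performance of a relation extraction system, while manual annotation is very costly. This paper adopts transfer learning, using a large amount of annotated source-domain (out-of-domain) data to assist a small amount of annotated target-domain (in-domain) data in PPI extraction. However, data distributions differ across domains, which can easily cause negative transfer; the paper adjusts instance weights according to each instance's relative distribution in order to avoid it. In experiments on the public AIMed corpus, both transfer learning methods clearly outperformed the baseline algorithm. When the same methods were applied to the IEPA corpus, the TrAdaboost algorithm suffered negative transfer, whereas the improved DisTrAdaboost algorithm still transferred well.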

English abstract:

As an important branch of biomedical information extraction, Protein-Protein Interaction (PPI) extraction has great research significance. Currently, research on PPI extraction focuses mainly on traditional machine learning, which requires large amounts of annotated corpora for training and makes labeling new data costly. This paper employs transfer learning to extract PPIs with a small amount of labeled target-domain (in-domain) data, drawing support from annotated source-domain (out-of-domain) data. To avoid the negative transfer caused by large differences between the distributions of different domains, we adjust the weight of each source-domain instance according to its relative distribution. Experiments on the AIMed and IEPA corpora demonstrate the effectiveness of our algorithms.
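The instance reweighting idea mentioned in the abstract can be illustrated with a minimal sketch of one reweighting round in the commonly described TrAdaBoost formulation. This is not the authors' implementation: the function name `tradaboost_update`, its parameters, and the clamping of the error rate are assumptions made for illustration, and the relative-distribution adjustment used by DisTrAdaboost is not specified in the abstract, so it is not reproduced here.

```python
import math


def tradaboost_update(w_src, w_tgt, miss_src, miss_tgt, n_rounds):
    """One TrAdaBoost-style reweighting round (illustrative sketch).

    w_src, w_tgt: current instance weights for source / target data.
    miss_src, miss_tgt: 1 if the current weak learner misclassified
    the instance, else 0.
    n_rounds: total number of boosting rounds planned.
    """
    # Weighted error of the weak learner, measured on target data only.
    eps = sum(w * m for w, m in zip(w_tgt, miss_tgt)) / sum(w_tgt)
    # Assumption: clamp eps into (0, 0.5) so beta_tgt is defined and < 1.
    eps = min(max(eps, 1e-10), 0.499)
    beta_tgt = eps / (1.0 - eps)
    # Fixed factor for source data; depends only on data size and rounds.
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(len(w_src)) / n_rounds))
    # Misclassified source instances shrink; misclassified target ones grow.
    new_src = [w * beta_src ** m for w, m in zip(w_src, miss_src)]
    new_tgt = [w * beta_tgt ** (-m) for w, m in zip(w_tgt, miss_tgt)]
    return new_src, new_tgt


if __name__ == "__main__":
    src, tgt = tradaboost_update(
        [1.0] * 4, [1.0] * 4, [1, 0, 0, 0], [1, 0, 0, 0], n_rounds=10
    )
    print(src, tgt)
```

Source instances that the weak learner gets wrong are treated as likely mismatched with the target distribution and are down-weighted, while misclassified target instances are up-weighted as in ordinary boosting. When the source data differs too much from the target data, this rule alone may not prevent negative transfer, which is the situation the abstract reports for TrAdaboost on IEPA.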

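Projects with papers in the same journal

Research on Machine Translation in Cross-Language Information Retrieval

Journal papers: 50, conference papers: 29, books: 1

Research on Protein-Protein Interaction Extraction Integrating Coreference Resolution and Transfer Learning

Journal papers: 42, conference papers: 22

Research on Chinese Hedge Detection Based on Translation Learning and Kernel Methods

Journal papers: 10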

Journal papers from the same projects

MT-Oriented English PoS Tagging and Its Application to Noun Phrase Chunking

Chinese Dependency Parsing Combining the Maximum Spanning Tree Algorithm and a Deterministic Algorithm

Implication operators on the set of V-irreducible element in the linguistic truth-valued intuitionis

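Hedge Scope Detection Based on Syntactic Structure Constraints

A Clustering Method Based on an 18-Element Linguistic-Valued Fuzzy Similarity Matrix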

A Multistage Gene Normalization System Integrating Multiple Effective Methods

A two-phase Bio-NER system based on integrated classifiers and multiagent strategy

A distributed meta-learning system for Chinese entity relation extraction

Creating Chinese-English Comparable Corpora

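Chinese Temporal Expression Recognition Based on Conditional Random Fields and a Temporal Lexicon

Protein-Protein Interaction Extraction Based on Combined Kernels

Construction of a Chinese-English Parallel Phrase Dependency Treebank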

Extracting Biomedical Event with Dual Decomposition Integrating Word Embeddings

Identifying New Sentiment Words in Microblogs Based on the Generalized Jaccard Coefficient

Co-training for detecting hedges and their scope in biomedical texts

Hedge Scope Detection in Biomedical Texts: An Effective Dependency-Based Method

Research on Chinese Prepositional Phrase Identification Based on Simple Noun Phrases

Identification of English prepositional phrases within business domain for machine translation

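Research on Term Extraction Based on Information Entropy and Changes in Word Frequency Distribution

Improving Statistical Machine Translation Performance Using Syntactic Phrases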

An Unsupervised Graph Based Continuous Word Representation Method for Biomedical Text Mining

Research on Automatic Term Extraction in the Domain of Traditional Chinese Medicine Acupuncture

Context Information and Fragments Based Cross-Domain Word Segmentation

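Automotive Domain Term Extraction Based on Conditional Random Fields

A Knowledge Representation Method Based on Ten-Element Lattice Implication Algebra

Inference Rules of the Linguistic Truth-Valued Intuitionistic Fuzzy Propositional Logic System

Linguistic Truth-Valued Intuitionistic Fuzzy Multi-Attribute Decision Making Based on TOPSIS

A Chinese-English Statistical Machine Translation Method Incorporating Syntactic Phrases

Extracting Protein Relations Using Word Representations and Deep Neural Networks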

Domain term extraction based on conditional random fields combined with active learning strategy

The Role of RON Receptor Tyrosine Kinase in Pancreatic Cancer Invasion and Metastasis

A general Protein-Protein Interaction extraction architecture based on word representation and featu

Automatic Part-Of-Speech Tagging for Oromo Language Using Maximum Entropy Markov Model (MEM

Integrating Active Learning Strategy to the Ensemble Kernel-based Method for Protein-Protein Interac

Integrating semantic information into multiple kernels for Protein-Protein Interaction extraction fr

Challenges of Diacritical Marker or Hudhaa Character in Tokenization of Oromo Text

A two-phase Bio-NER system based on integrated classifiers and multi-agent strategy

Extracting biomedical event with dual decomposition integrating word embeddings

Research on Coreference Resolution in Biomedical Text Based on SVM with Dual Cost Parameters

Augmenting performance of SMT models by deploying fine tokenization of the text and Part-of-Speech T

An approach to improve kernel-based Protein–Protein Interaction extraction by learning from large-sc

Biomedical Named Entity Recognition Based on Word Representation Methods

Boosting performance of gene mention tagging system by hybrid methods

An unsupervised graph based continuous word representation method for biomedical text mining

Research on Chinese Part-of-Speech Tagging Based on a Semi-Supervised Hidden Markov Model

Research on Chinese Entity Relation Extraction Based on Combined Kernels

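Research and Design of a Cloud-Platform-Based Smart Home Weather Station

Research on English Text Classification Combining Rough Sets under Attribute Ordering with KNN

Lexical-Semantic Constraint Rules and Characteristics in Mapping Semantic Roles to Syntactic Constituents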