The NLP community has become increasingly aware of the problems that multiword expressions (MWEs) pose. This paper studies how to select and dynamically use category-specific sensitive features to extract MWEs of the "yi (一) X jiu (就) Y" ("一X就Y") structure. It reviews the current state of MWE research and then broadly classifies the structure, drawing on the work of Chinese linguists. A feature set is extracted for the structure by word segmentation, and sensitive features are then selected in the training set for each category.